# ML advanced

This example shows how to build a Machine Learning pipeline using the Python API,
how to package the project so it can be installed with `pip install .`, and how to
test it with `pytest`.

## Setup

Make sure you are in the `ml-advanced` folder:

In [1]:
%%sh
# install the pipeline as a package
pip install .

Processing /Users/Edu/dev/projects-ploomber/ml-advanced
Building wheels for collected packages: basic-ml
  Building wheel for basic-ml (setup.py): started
  Building wheel for basic-ml (setup.py): finished with status 'done'
  Created wheel for basic-ml: filename=basic_ml-0.1.dev0-py3-none-any.whl size=2892 sha256=68f2dcb5a434ec745b1be2618c1ebba86e83c8b00de3a447c58050685e3e5e2d
  Stored in directory: /Users/Edu/Library/Caches/pip/wheels/30/07/51/cd5a985eb72b9832d67eeb3a41a3c9e3dce38f7855b4969a9d
Successfully built basic-ml
Installing collected packages: basic-ml
  Attempting uninstall: basic-ml
    Found existing installation: basic-ml 0.1.dev0
    Uninstalling basic-ml-0.1.dev0:
      Successfully uninstalled basic-ml-0.1.dev0
Successfully installed basic-ml-0.1.dev0
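
To confirm that the install succeeded, one option is to query the package metadata from Python. The sketch below assumes Python 3.8 or later, where `importlib.metadata` is part of the standard library; on the Python 3.6 environment that appears later in this example, the `importlib_metadata` backport would be needed instead. `installed_version` is a hypothetical helper name, not part of the project.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# "basic-ml" is the distribution name reported in the pip output
print(installed_version("basic-ml"))
```

With the install shown in the pip output, this should print `0.1.dev0`; it prints `None` if the package is not installed in the active environment.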


## Executing the pipeline

In [2]:
%%sh
ploomber build --entry-point basic_ml.pipeline.make

name      Ran?      Elapsed (s)    Percentage
--------  ------  -------------  ------------
get       False               0             0
features  False               0             0
join      False               0             0
fit       False               0             0


Rendering DAG "ml-pipeline": 100%|██████████| 4/4 [00:00<00:00,  7.54it/s]
4it [00:00, 18020.64it/s]
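
Every task reports `Ran? False` because the products already exist and nothing has changed since the previous build. The idea behind such incremental builds can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not Ploomber's actual implementation: `fingerprint`, `run_if_outdated`, and the `.metadata.json` file name are all hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def fingerprint(task):
    """Hash the task's bytecode and constants as a stand-in for its source code."""
    code = task.__code__
    payload = code.co_code + repr(code.co_consts).encode()
    return hashlib.sha256(payload).hexdigest()


def run_if_outdated(task, product):
    """Run task(product) unless the product exists and the task is unchanged.

    Returns True if the task ran, False if it was skipped.
    """
    metadata = product.with_name(product.name + ".metadata.json")
    current = fingerprint(task)

    if product.exists() and metadata.exists():
        stored = json.loads(metadata.read_text())
        if stored["fingerprint"] == current:
            return False  # up to date, nothing to do

    task(product)
    metadata.write_text(json.dumps({"fingerprint": current}))
    return True


def get(product):
    """Toy task: write a placeholder file instead of real data."""
    product.write_text("raw data")


with tempfile.TemporaryDirectory() as tmp:
    output = Path(tmp) / "data.txt"
    first_run = run_if_outdated(get, output)   # product missing, so the task runs
    second_run = run_if_outdated(get, output)  # nothing changed, so it is skipped

print(first_run, second_run)  # prints: True False
```

A real tool would also need to account for changes in upstream products and task parameters, which this sketch ignores.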


## Testing

```bash .noeval
pip install -r requirements.txt

# incremental (will only run the tasks that have changed)
pytest

# complete (force execution of all tasks)
pytest --force

# to start a debugging session on exceptions
pytest --pdb

# to start a debugging session at the start of every test
pytest --trace
```
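
`--pdb` and `--trace` are standard pytest options, whereas `--force` is presumably registered by the project itself. A minimal sketch of how such a flag could be added through pytest's `pytest_addoption` hook follows. The file contents are an assumption for illustration, not the project's actual `conftest.py`, and the commented test assumes that `DAG.build` accepts a `force` argument.

```python
# conftest.py (hypothetical sketch, not the project's actual file)


def pytest_addoption(parser):
    """pytest hook: register a custom --force command-line flag."""
    parser.addoption(
        "--force",
        action="store_true",
        default=False,
        help="Force execution of all tasks",
    )


# A test could then read the flag through the built-in `request` fixture:
#
#     from basic_ml.pipeline import make
#
#     def test_build(request):
#         force = request.config.getoption("--force")
#         make().build(force=force)
```

Keeping the flag off by default means a plain `pytest` run stays incremental, and only an explicit `pytest --force` re-executes every task.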

## Interacting with the pipeline

In a Python session (make sure `ml-advanced/env.yaml` is in the current working
directory):


In [3]:

from basic_ml.pipeline import make

dag = make()
dag.status()


| name | Last run | Outdated? | Product | Doc (short) | Location |
| --- | --- | --- | --- | --- | --- |
| get | 20 minutes ago (Oct 15, 20 at 16:04) | False | /Users/Edu/dev/projects-ploomber/ml-advanced/output/data.parquet | Get data | /Users/Edu/miniconda3/envs/ploomber/lib/python3.6/site-packages/basic_ml/tasks.py:12 |
| features | 20 minutes ago (Oct 15, 20 at 16:04) | False | /Users/Edu/dev/projects-ploomber/ml-advanced/output/features.parquet | Generate new features from existing columns | /Users/Edu/miniconda3/envs/ploomber/lib/python3.6/site-packages/basic_ml/tasks.py:28 |
| join | 20 minutes ago (Oct 15, 20 at 16:04) | False | /Users/Edu/dev/projects-ploomber/ml-advanced/output/join.parquet | Join raw data with generated features | /Users/Edu/miniconda3/envs/ploomber/lib/python3.6/site-packages/basic_ml/tasks.py:37 |
| fit | 20 minutes ago (Oct 15, 20 at 16:04) | False | {'report': File(/Users/Edu/dev/projects-ploomber/ml-advanced/output/report.txt), 'model': File(/Users/Edu/dev/projects-ploomber/ml-advanced/output/model.joblib)} | Fit model and generate classification report | /Users/Edu/miniconda3/envs/ploomber/lib/python3.6/site-packages/basic_ml/tasks.py:46 |
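
The four tasks reported by `dag.status()` form a directed acyclic graph, and their listing order matches a valid execution order. The sketch below reproduces that order with the standard library's `graphlib` module (Python 3.9 or later). The dependency sets are an assumption inferred from the task descriptions, not read from the project's source.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on.
# Assumed dependencies, inferred from the "Doc (short)" descriptions.
dependencies = {
    "get": set(),
    "features": {"get"},
    "join": {"get", "features"},
    "fit": {"join"},
}

# static_order() yields tasks so that every dependency comes first
execution_order = list(TopologicalSorter(dependencies).static_order())
print(execution_order)  # prints: ['get', 'features', 'join', 'fit']
```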
