# Use case 1


## Dataset: Breast cancer
## Goal: Automatically containerize a reproducible machine learning pipeline so that it can run on any platform.

![](examples/leaf_draft/workflow/usecase1.png)

### Input:

Three ML steps, a dataset, and a list of required libraries.

- gathering.py, preprocessing.py, modeling.py

- breast-cancer.csv

- requirements.txt

    #### [gathering] -> [preprocessing] -> [modeling]
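
The ordering above can be sketched in plain Python. This is a minimal illustration, not scanflow's implementation: `STEPS`, `describe`, `build_commands`, and `run_pipeline` are hypothetical names, and the sketch assumes each step is a standalone script that must finish before the next one starts.

```python
import subprocess
import sys

# Hypothetical mapping of step name to script, listed in execution order.
STEPS = {
    "gathering": "gathering.py",
    "preprocessing": "preprocessing.py",
    "modeling": "modeling.py",
}


def describe(steps):
    """Render the step order in the same notation used above."""
    return " -> ".join(f"[{name}]" for name in steps)


def build_commands(steps, python=sys.executable):
    """Return one command (as an argument list) per step, in order."""
    return [[python, script] for script in steps.values()]


def run_pipeline(steps, cwd="."):
    """Run each step in order; check=True stops at the first failing step."""
    for name, command in zip(steps, build_commands(steps)):
        print(f"Running step: {name}")
        subprocess.run(command, cwd=cwd, check=True)


print(describe(STEPS))  # [gathering] -> [preprocessing] -> [modeling]
```

Calling `run_pipeline(STEPS, cwd=...)` from the folder that holds the three scripts would execute them sequentially; it is left uncalled here so the sketch runs without those files.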

### Output:

- One Docker image. It contains the platform where the workflow runs.
- One Docker container. Started from that image, it serves a dashboard with the workflow metrics.

    #### [image] - [container]

# scanflow


## Setup

**Input**: 

- Folder with the Python files, dataset, and requirements.

**Output**: 

- Platform.
- Dashboard at http://localhost:8001 for metrics. 
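
Before building the platform, it can help to confirm that the input folder holds the expected files. The following is a minimal sketch under assumptions: `missing_files` and `EXPECTED` are hypothetical names that are not part of scanflow, and the file names are taken from the input list of the use case. The dataset is left out of the check because its name depends on the example.

```python
import tempfile
from pathlib import Path

# Assumed file names, taken from the use case's input list.
EXPECTED = ["gathering.py", "preprocessing.py", "modeling.py", "requirements.txt"]


def missing_files(app_dir, expected=EXPECTED):
    """Return the expected file names that are absent from app_dir."""
    root = Path(app_dir)
    return [name for name in expected if not (root / name).is_file()]


# Demonstration on a throwaway folder that holds only two of the files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("gathering.py", "preprocessing.py"):
        (Path(tmp) / name).touch()
    result = missing_files(tmp)

print(result)  # ['modeling.py', 'requirements.txt']
```

An empty list means every expected file is present in the folder.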

In [4]:
from scanflow.setup import setup

# App folder
app_dir = '/home/guess/Desktop/scanflow/examples/leaf'

# Workflow
workflow = {'gathering': 'gathering.py',
            'preprocessing': 'preprocessing.py',
            'modeling': 'modeling.py',
            'main': 'main.py'}

platform = setup.Setup(app_dir, workflow)

platform.build()
platform.run()

platform

28-Oct-19 11:41:34 -  INFO - Building platform, type: single.
28-Oct-19 11:41:34 -  INFO - Dockerfile was found.
28-Oct-19 11:41:34 -  INFO - MLproject was found.
28-Oct-19 11:41:34 -  INFO - Running platform, type: single.
28-Oct-19 11:41:34 -  INFO - Image app_single is running as app_single container.
28-Oct-19 11:41:34 -  INFO - MLflow server is running at 0.0.0.0:8001



Platform = (
    image: app_single,
    container: app_single,
    type=single),
    server=0.0.0.0:8001),

In [5]:
# platform.stop()

## Deploy

Run the workflow.

**Input**:

- Platform.

**Output**:

- Results of the workflow run, shown at http://localhost:8001/#/.
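
The platform reports its server as `0.0.0.0:8001`, while the dashboard is opened at `localhost:8001`. `0.0.0.0` is a bind address meaning "all interfaces", so a browser on the same machine reaches it through `localhost`. The sketch below shows that translation and an optional reachability check; `dashboard_url` and `is_reachable` are hypothetical helpers, not scanflow functions, and the input is assumed to have the form `host:port`.

```python
from urllib.error import URLError
from urllib.request import urlopen


def dashboard_url(server):
    """Turn a 'host:port' bind address into a URL a local browser can open."""
    host, _, port = server.rpartition(":")
    if host == "0.0.0.0":
        # The wildcard bind address is not a destination; use the loopback name.
        host = "localhost"
    return f"http://{host}:{port}"


def is_reachable(url, timeout=2.0):
    """Return True if the URL answers with a non-error HTTP status."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (URLError, OSError):
        return False


print(dashboard_url("0.0.0.0:8001"))  # http://localhost:8001
```

`is_reachable(dashboard_url("0.0.0.0:8001"))` could be called after the workflow has run to confirm that the dashboard is up before opening it.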

In [2]:
from scanflow.deploy import deploy

# Read the platform
deployer = deploy.Deploy(platform)

# Run the workflow
deployer.run_workflow(plat_container_name='app_single')

deployer

28-Oct-19 11:40:03 -  INFO - Running workflow: type=single .
28-Oct-19 11:40:03 -  INFO - Using platform container.
28-Oct-19 11:40:08 -  INFO -  Main file (main.py) output:  Run matched, but is not FINISHED, so skipping (run_id=a3622657bd894988ab6aef7f200e376b, status=FINISHED)
No matching run has been found.
2019/10/28 10:40:04 INFO mlflow.projects: === Created directory /tmp/tmp9j6uoria for downloading remote URIs passed to arguments of type 'path' ===
2019/10/28 10:40:04 INFO mlflow.projects: === Running command 'python gathering.py' in run with ID 'fd9646f8f9a54a1cbd25f9d9d9f309b0' === 
   species  specimen_number  eccentricity  ...  third_moment  uniformity  entropy
0        1                1       0.72694  ...      0.005232    0.000275  1.17560
1        1                2       0.74173  ...      0.002708    0.000075  0.69659
2        1                3       0.76722  ...      0.000921    0.000038  0.44348
3        1                4       0.73797  ...      0.001154    0.000066 


Platform = (
    server=0.0.0.0:8001),
    API=0.0.0.0:5001),