```                
  __  __ _       __ _                 _______             _    _                     ___  
 |  \/  | |     / _| |               |__   __|           | |  (_)                   |__ \ 
 | \  / | |    | |_| | _____      __    | |_ __ __ _  ___| | ___ _ __   __ _  __   __  ) |
 | |\/| | |    |  _| |/ _ \ \ /\ / /    | | '__/ _` |/ __| |/ / | '_ \ / _` | \ \ / / / / 
 | |  | | |____| | | | (_) \ V  V /     | | | | (_| | (__|   <| | | | | (_| |  \ V / / /_ 
 |_|  |_|______|_| |_|\___/ \_/\_/      |_|_|  \__,_|\___|_|\_\_|_| |_|\__, |   \_(_)____|
                                                                        __/ |             
                                                                       |___/              
```

# Introduction

### MLflow is an open source platform for managing the end-to-end machine learning lifecycle. 

#### It tackles three primary functions:

1) Tracking experiments to record and compare parameters and results (MLflow Tracking).
2) Packaging ML code in a reusable, reproducible form in order to share with other data scientists or transfer to production (MLflow Projects).
3) Managing and deploying models from a variety of ML libraries to a variety of model serving and inference platforms (MLflow Models).

(source https://www.mlflow.org/docs/latest/index.html#mlflow-documentation)

-------------------------------------------------------------------------------------------------------------------------------------------------------

### This is the second version of the tutorial. The code is now packaged into modules, and some of these modules are pipeline stages.

# Preparation

In [1]:
%cd /home/mlflow-1-tracking/

/home/mlflow-1-tracking


In [2]:
import yaml

# Look at the pipeline config
with open('config/pipeline_config.yml') as config_file:
    config = yaml.load(config_file, Loader=yaml.FullLoader)

config

{'base': {'project': '7labs/mlflow-1-tracking',
  'name': 'iris',
  'tags': ['solution-0-prototype', 'dev'],
  'model': {'model_name': 'model.joblib', 'models_folder': 'models'},
  'experiments': {'experiments_folder': 'experiments'},
  'random_state': 42},
 'split_train_test': {'folder': 'experiments',
  'train_csv': 'data/processed/train_iris.csv',
  'test_csv': 'data/processed/test_iris.csv',
  'test_size': 0.2},
 'featurize': {'dataset_csv': 'data/raw/iris.csv',
  'featured_dataset_csv': 'data/interim/featured_iris.csv',
  'features_columns_range': ['sepal_length', 'petal_length_to_petal_width'],
  'target_column': 'species'},
 'train': {'cv': 5,
  'estimator_name': 'LogisticRegression',
  'estimators': {'LogisticRegression': {'param_grid': {'C': [0.001, 0.01],
     'max_iter': [100],
     'solver': ['lbfgs'],
     'multi_class': ['multinomial']}},
   'SVC': {'param_grid': {'C': [0.1, 1.0],
     'kernel': ['rbf', 'linear'],
     'gamma': ['scale'],
     'degree': [3, 5]}}}},
 'eval

## Browse folder with configs

# Extract features

In [7]:
!python src/pipelines/featurize.py \
    --config=config/pipeline_config.yml
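
Each pipeline stage is run as a script that receives the shared config through `--config`. The scripts themselves are not shown in this notebook, so the following is only a minimal sketch of how such an entry point might read its own section of the YAML file, assuming `argparse` and PyYAML. The names `load_config` and `parse_args` and the shortened demo config are hypothetical.

```python
import argparse
import tempfile
from pathlib import Path

import yaml


def load_config(path):
    """Read the whole pipeline config from a YAML file into a dict."""
    with open(path) as config_file:
        return yaml.safe_load(config_file)


def parse_args(argv=None):
    """Parse the --config option in the form the stage commands pass it."""
    parser = argparse.ArgumentParser(description="Hypothetical pipeline stage")
    parser.add_argument("--config", required=True, help="path to pipeline_config.yml")
    return parser.parse_args(argv)


# Demo with a throwaway config so the sketch runs without the project files.
demo_yaml = """
featurize:
  dataset_csv: data/raw/iris.csv
  target_column: species
"""
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "pipeline_config.yml"
    demo_path.write_text(demo_yaml)
    args = parse_args([f"--config={demo_path}"])
    # Each stage only needs its own section of the shared config.
    featurize_config = load_config(args.config)["featurize"]

print(featurize_config["dataset_csv"])
```

Keeping one config file for all stages means a single edit, for example to `estimator_name`, is picked up by every stage that is rerun afterwards.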

In [8]:
# iris dataset with new features is created
!ls data/interim

featured_iris.csv


# Split train/test dataset

In [9]:
!python src/pipelines/split_train_test.py \
    --config=config/pipeline_config.yml
    

In [10]:
# train and test datasets are created
!ls data/processed/

test_iris.csv  train_iris.csv
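
The config sets `test_size: 0.2` and `random_state: 42` for this stage. Whether `split_train_test.py` uses scikit-learn's `train_test_split` is an assumption. If it does, the split sizes can be sketched as below, with a list of 150 row indices standing in for the iris rows.

```python
from sklearn.model_selection import train_test_split

# Stand-in for the 150 rows of the iris dataset (row indices only).
rows = list(range(150))

# Values taken from the config: test_size=0.2, random_state=42.
train_rows, test_rows = train_test_split(rows, test_size=0.2, random_state=42)

print(len(train_rows), len(test_rows))
```

A 30-row test set is consistent with the confusion matrix reported by the evaluate stage, whose entries sum to 30.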


# Train model

In [None]:
# train config
!cat experiments/train_config.yml

In [15]:
!python src/pipelines/train.py \
    --config=config/pipeline_config.yml

Fitting 5 folds for each of 2 candidates, totalling 10 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  10 out of  10 | elapsed:    0.1s finished
0.87024409313883
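
The message "Fitting 5 folds for each of 2 candidates, totalling 10 fits" has the form that scikit-learn's `GridSearchCV` prints in verbose mode. Assuming `train.py` runs an exhaustive grid search over the `param_grid` of the selected estimator, the number of fits follows directly from the config. The sketch below counts the combinations with the standard library only; `count_fits` is a hypothetical helper.

```python
from itertools import product

# Parameter grids copied from the `train` section of the config.
param_grids = {
    "LogisticRegression": {
        "C": [0.001, 0.01],
        "max_iter": [100],
        "solver": ["lbfgs"],
        "multi_class": ["multinomial"],
    },
    "SVC": {
        "C": [0.1, 1.0],
        "kernel": ["rbf", "linear"],
        "gamma": ["scale"],
        "degree": [3, 5],
    },
}
cv = 5  # number of cross-validation folds from the config


def count_fits(param_grid, cv):
    """Return (candidates, total fits) for an exhaustive grid search."""
    candidates = list(product(*param_grid.values()))
    return len(candidates), len(candidates) * cv


for name, grid in param_grids.items():
    n_candidates, n_fits = count_fits(grid, cv)
    print(f"{name}: {n_candidates} candidates, {n_fits} fits")
```

For `LogisticRegression` this gives 2 candidates and 10 fits, matching the log above. For `SVC` the same count gives 8 candidates and 40 fits.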


In [16]:
# model is created
!ls models/

model.joblib


# Evaluate model 

In [None]:
# evaluate config
!cat experiments/evaluate_config.yml

In [19]:
!python src/pipelines/evaluate.py \
    --config=config/pipeline_config.yml

{'f1_score': 0.9305555555555555, 'confusion_matrix': [[10, 0, 0], [0, 7, 0], [0, 2, 11]]}
[<Experiment: artifact_location='file:///home/mlflow-1-tracking/mlruns/0', experiment_id='0', lifecycle_stage='active', name='LogisticRegression'>, <Experiment: artifact_location='file:///home/mlflow-1-tracking/mlruns/1', experiment_id='1', lifecycle_stage='active', name='SVC'>]
<ActiveRun: >
<RunInfo: artifact_uri='file:///home/mlflow-1-tracking/mlruns/0/29207897c5b041f6a946ba9226ba9979/artifacts', end_time=None, experiment_id='0', lifecycle_stage='active', run_id='29207897c5b041f6a946ba9226ba9979', run_uuid='29207897c5b041f6a946ba9226ba9979', start_time=1561384732293, status='RUNNING', user_id='user'>
29207897c5b041f6a946ba9226ba9979


In [20]:
# metrics file eval.txt is created
!ls experiments

eval.txt


In [21]:
!cat experiments/eval.txt

{
  "f1_score": 0.9305555555555555,
  "confusion_matrix": [
    [
      10,
      0,
      0
    ],
    [
      0,
      7,
      0
    ],
    [
      0,
      2,
      11
    ]
  ]
}

In [22]:
# eval.txt holds JSON, which the YAML loader can also parse
with open('experiments/eval.txt') as report_file:
    evaluate_report = yaml.load(report_file, Loader=yaml.FullLoader)
evaluate_report

{'f1_score': 0.9305555555555555,
 'confusion_matrix': [[10, 0, 0], [0, 7, 0], [0, 2, 11]]}
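
The report does not say how `f1_score` is averaged over the three classes. As a quick check, the sketch below computes the macro-averaged F1 directly from the confusion matrix. It assumes the scikit-learn convention of rows as true labels and columns as predictions, although the per-class F1 is the same if the two are swapped. `macro_f1` is a hypothetical helper.

```python
confusion_matrix = [[10, 0, 0], [0, 7, 0], [0, 2, 11]]


def macro_f1(matrix):
    """Average the per-class F1 scores of a square confusion matrix."""
    n_classes = len(matrix)
    scores = []
    for k in range(n_classes):
        true_positive = matrix[k][k]
        row_total = sum(matrix[k])                 # samples whose true class is k
        col_total = sum(row[k] for row in matrix)  # samples predicted as class k
        # F1 = 2*TP / (row_total + col_total); taken as 0 for an empty class.
        denominator = row_total + col_total
        scores.append(2 * true_positive / denominator if denominator else 0.0)
    return sum(scores) / n_classes


print(round(macro_f1(confusion_matrix), 6))
```

The per-class scores are 1.0, 0.875 and about 0.9167, and their mean is about 0.930556, so the reported value is consistent with a macro average.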

# Enter MLflow ui
## http://0.0.0.0:5000

------------------------------------------------------------------------------------------------------------------------------------------------------


# Conclusion

    Here we used another way of logging.
-------------------------------------------------------------------------------------------------------------------------------

### ___Your task___

#### Train with another estimator

##### 1. Open config/pipeline_config.yml
##### 2. In section _train_ change _estimator_name_ to SVC
##### 3. Rerun the __Train__ and __Evaluate__ stages
##### 4. Go to the section __Enter MLflow ui__

