# Demo: MLflow

[MLflow](https://mlflow.org/) is a tool for tracking experiments. It is easy to set up, and it records the parameters and metrics of each run so that you can compare them.

In [1]:
import mlflow

from sklearn.datasets import load_iris
from sklearn.metrics import precision_recall_fscore_support
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

In [2]:
data = load_iris()

X, y = data['data'], data['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=123)

## Setup

To get started, set the experiment name. Save different runs of the same experiment under the same experiment name so that they can be compared.

You can log the parameters via `log_param(name, value)` for single parameters, and `log_params(Dict[name, value])` for multiple parameters.

You can log the metrics via `log_metric(name, value)`.

For more options, see the [documentation](https://mlflow.org/docs/latest/index.html).

In [3]:
mlflow.set_experiment('iris_experiment')

# Everything logged inside this block belongs to one run.
with mlflow.start_run():
    params = {
        'max_depth': 3,
        'criterion': 'entropy'
    }

    # Save the hyperparameters of this run.
    mlflow.log_params(params)

    clf = DecisionTreeClassifier(**params)
    clf.fit(X_train, y_train)

    # Evaluate on the held-out test set and save the macro-averaged scores.
    precision, recall, fscore, _ = precision_recall_fscore_support(y_test, clf.predict(X_test), average='macro')
    mlflow.log_metric('precision', precision)
    mlflow.log_metric('recall', recall)
    mlflow.log_metric('f1-score', fscore)

### Second Run

We log a second run under the same experiment, with different parameters, so that the two runs can be compared.

In [4]:
with mlflow.start_run():
    params = {
        'max_depth': 6,
        'criterion': 'gini'
    }

    mlflow.log_params(params)

    clf = DecisionTreeClassifier(**params)
    clf.fit(X_train, y_train)

    precision, recall, fscore, _ = precision_recall_fscore_support(y_test, clf.predict(X_test), average='macro')
    mlflow.log_metric('precision', precision)
    mlflow.log_metric('recall', recall)
    mlflow.log_metric('f1-score', fscore)

## Checking the results

To check the results, run `mlflow ui` from your terminal and open the address it prints (http://127.0.0.1:5000 in the output below). You can also run it from here (**warning**: the cell blocks while the server is running, so you need to stop the cell before running anything else).

In [5]:
!mlflow ui

[2021-08-18 22:14:21 -0300] [3908] [INFO] Starting gunicorn 20.1.0
[2021-08-18 22:14:21 -0300] [3908] [INFO] Listening at: http://127.0.0.1:5000 (3908)
[2021-08-18 22:14:21 -0300] [3908] [INFO] Using worker: sync
[2021-08-18 22:14:21 -0300] [3909] [INFO] Booting worker with pid: 3909
^C
[2021-08-18 22:15:28 -0300] [3908] [INFO] Handling signal: int
[2021-08-18 22:15:28 -0300] [3909] [INFO] Worker exiting (pid: 3909)
