# MLflow Example

Train an XGBoost model on the iris dataset to explore the MLflow UI.

## How to run

```
conda create --name mlflow_example python=3.6
```

```
conda activate mlflow_example
```

```
pip install matplotlib==3.2.2 mlflow==1.9.0 scikit-learn==0.23.1 xgboost==1.1.1 jupyterlab==2.1.5
```

```
jupyter lab
```

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, log_loss
import xgboost as xgb
import matplotlib as mpl

import mlflow
import mlflow.xgboost

mpl.use('Agg')
```

```python
# prepare train and test data
iris = datasets.load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

# enable auto logging
mlflow.xgboost.autolog()

with mlflow.start_run():
    # train model
    params = {
        'objective': 'multi:softprob',
        'num_class': 3,
        'learning_rate': 0.3,
        'eval_metric': 'mlogloss',
        'colsample_bytree': 0.8,
        'subsample': 0.8,
        'seed': 42,
    }
    model = xgb.train(params, dtrain, evals=[(dtrain, 'train')])

    # evaluate model
    y_proba = model.predict(dtest)
    y_pred = y_proba.argmax(axis=1)
    loss = log_loss(y_test, y_proba)
    acc = accuracy_score(y_test, y_pred)

    # log metrics
    mlflow.log_metrics({'log_loss': loss, 'accuracy': acc})
```

Training output:

```
[0]	train-mlogloss:0.75473
[1]	train-mlogloss:0.55294
[2]	train-mlogloss:0.41808
[3]	train-mlogloss:0.32272
[4]	train-mlogloss:0.25458
[5]	train-mlogloss:0.20410
[6]	train-mlogloss:0.16818
[7]	train-mlogloss:0.13936
[8]	train-mlogloss:0.11815
[9]	train-mlogloss:0.10110
```
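The evaluation step in the training cell turns each row of class probabilities into a label with `argmax` and then scores the result with `accuracy_score` and `log_loss`. As a rough, dependency-free sketch of what those two metrics measure, the snippet below recomputes them by hand. The helper names and the probability rows are made up for illustration, and the sketch skips the clipping and renormalization that scikit-learn applies to the probabilities.

```python
import math


def argmax(row):
    """Index of the largest probability in one row."""
    return max(range(len(row)), key=row.__getitem__)


def accuracy(y_true, y_proba):
    """Fraction of rows whose most likely class matches the true label."""
    hits = sum(argmax(row) == label for row, label in zip(y_proba, y_true))
    return hits / len(y_true)


def multiclass_log_loss(y_true, y_proba):
    """Mean negative log of the probability assigned to the true class."""
    total = sum(-math.log(row[label]) for row, label in zip(y_proba, y_true))
    return total / len(y_true)


# Hypothetical probabilities for three samples and three classes.
y_true = [0, 1, 2]
y_proba = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.4, 0.3],  # true class 2 is not the top prediction
]

acc = accuracy(y_true, y_proba)
loss = multiclass_log_loss(y_true, y_proba)
```

Accuracy only looks at the top prediction, while log loss also penalizes low confidence in the true class, which is why the notebook logs both.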


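## MLflow UI

Start the tracking UI from the directory that contains the `mlruns` folder created by the run above, then open http://localhost:5000 in a browser: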
```
mlflow ui
```

Change the XGBoost parameters and rerun the experiment several times, then compare the runs in the MLflow UI.
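Instead of editing the parameters by hand for every run, the reruns can be scripted. The following is a minimal sketch, assuming the `dtrain`, `dtest`, `y_test`, and `params` objects from the training cell are available. `param_grid` and `run_sweep` are hypothetical helpers, not part of MLflow or XGBoost.

```python
from itertools import product


def param_grid(base, overrides):
    """Yield one params dict per combination of the override values."""
    names = list(overrides)
    for values in product(*(overrides[name] for name in names)):
        # Override values replace the matching keys of the base params.
        yield {**base, **dict(zip(names, values))}


def run_sweep(dtrain, dtest, y_test, base, overrides):
    """Train once per combination, each inside its own MLflow run."""
    # Imported here so param_grid stays usable without the ML stack installed.
    import mlflow
    import mlflow.xgboost
    import xgboost as xgb
    from sklearn.metrics import accuracy_score, log_loss

    mlflow.xgboost.autolog()
    for params in param_grid(base, overrides):
        with mlflow.start_run():
            model = xgb.train(params, dtrain, evals=[(dtrain, 'train')])
            y_proba = model.predict(dtest)
            mlflow.log_metrics({
                'log_loss': log_loss(y_test, y_proba),
                'accuracy': accuracy_score(y_test, y_proba.argmax(axis=1)),
            })
```

For example, `run_sweep(dtrain, dtest, y_test, params, {'learning_rate': [0.1, 0.3], 'subsample': [0.6, 0.8]})` would create four runs, one per combination, which can then be selected and compared side by side in the UI.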