# Experiment Tracking and Model Management

![Status](https://img.shields.io/static/v1.svg?label=Status&message=Finished&color=green)

<!-- Place this tag where you want the button to render. -->
<a class="github-button" href="https://github.com/particle1331/steepest-ascent" data-color-scheme="no-preference: dark; light: light; dark: dark;" data-icon="octicon-star" data-size="large" data-show-count="true" aria-label="Star particle1331/steepest-ascent on GitHub">Star</a>
<!-- Place this tag in your head or just before your close body tag. -->
<script async defer src="https://buttons.github.io/buttons.js"></script> 


In this module, we look at **experiment tracking** and **model management**. A machine learning experiment is a session or process of building machine learning models. Experiment tracking is the process of keeping track of all relevant information in an experiment. This includes source code, environment, data, models, hyperparameters, and so on, all of which matter for reproducing the experiment as well as for making actual predictions.

From experience, we know that manual tracking, e.g. with spreadsheets, is error-prone, not standardized, has low visibility, and is difficult for teams to collaborate over. As an alternative, we will use an experiment tracking platform such as [MLFlow](https://mlflow.org/). MLFlow has four main components: tracking, models, model registry, and projects. In this course, we will only cover the first three.

```{margin}
⚠️ **Attribution:** These are notes for [Module 2](https://github.com/DataTalksClub/mlops-zoomcamp/tree/main/02-experiment-tracking) of the [MLOps Zoomcamp](https://github.com/DataTalksClub/mlops-zoomcamp). The MLOps Zoomcamp is a free course from [DataTalks.Club](https://github.com/DataTalksClub).
```

As we have seen with our previous prototyping, the ability to reproduce results is important, since we want the same results when deploying the model in different environments. Using experiment tracking and model management platforms gives us a better chance of reproducing our results, and also helps with organization (staging and deploying models) and optimization (finding the best models).

## Getting started: MLFlow UI

We can run the MLFlow UI with an SQLite backend as follows:


```bash
$ mlflow ui --backend-store-uri sqlite:///mlflow.db

[2022-05-26 19:35:22 +0800] [92498] [INFO] Starting gunicorn 20.1.0
[2022-05-26 19:35:22 +0800] [92498] [INFO] Listening at: http://127.0.0.1:5000 (92498)
[2022-05-26 19:35:22 +0800] [92498] [INFO] Using worker: sync
[2022-05-26 19:35:22 +0800] [92499] [INFO] Booting worker with pid: 92499
```

For our experiment, we will use our code and data from [Module 1](https://particle1331.github.io/inefficient-networks/notebooks/mlops/1-intro.html). Before doing any run, we set the **experiment**: this creates it if it does not exist yet, or connects subsequent runs to it if it does. We also set the experiment tracking backend, which is the same one that is visualized in the UI above.

```{margin}
[`experiment.py`](https://github.com/particle1331/inefficient-networks/blob/mlops/docs/notebooks/mlops/mlflow/experiment.py)
```

```python
import mlflow

mlflow.set_tracking_uri("sqlite:///mlflow.db")
mlflow.set_experiment("nyc-taxi-experiment")
```

The next section of the code executes a **single run** of the experiment. Note the logging at the end of the script. Everything that runs inside the following context is a single run with run name `lr`:

```{margin}
[`experiment_lr.py`](https://github.com/particle1331/inefficient-networks/blob/mlops/docs/notebooks/mlops/2-mlflow/experiment_lr.py)
```
```python
with mlflow.start_run(run_name='lr'):

    train_data_path = './data/green_tripdata_2021-01.parquet'
    valid_data_path = './data/green_tripdata_2021-02.parquet'

    # In-between transformations
    transforms = [add_pickup_dropoff_pair]
    categorical = ['PU_DO']
    numerical = ['trip_distance']

    train_dicts, y_train = preprocess_dataset(train_data_path, transforms, categorical, numerical)
    valid_dicts, y_valid = preprocess_dataset(valid_data_path, transforms, categorical, numerical)

    # Fit all possible categories
    dv = DictVectorizer()
    dv.fit(train_dicts + valid_dicts)

    X_train = dv.transform(train_dicts)
    X_valid = dv.transform(valid_dicts)

    # Train model
    model = LinearRegression()
    model.fit(X_train, y_train)

    # Plot predictions vs ground truth
    fig = plot_duration_distribution(model, X_train, y_train, X_valid, y_valid)
    fig.savefig('plot.svg')

    # Compute metrics (RMSE on train and validation sets)
    rmse_train = mean_squared_error(y_train, model.predict(X_train), squared=False)
    rmse_valid = mean_squared_error(y_valid, model.predict(X_valid), squared=False)

    # MLFlow logging
    mlflow.set_tag("author", "particle")
    mlflow.log_param('train_data_path', train_data_path)
    mlflow.log_param('valid_data_path', valid_data_path)
    mlflow.log_metric('rmse_train', rmse_train)
    mlflow.log_metric('rmse_valid', rmse_valid)
    mlflow.log_artifact('plot.svg')
```

Running this script shows up in the UI as a single run in the `nyc-taxi-experiment` experiment. MLFlow is able to obtain the version from `git` and the user from the system, i.e. the user that is currently logged in. The other values are obtained from the logs.

```{figure} ../../../img/single-run-mlflow.png
```

If we click on the run, we can see more of the details that were logged. Shown here are the date of the run, the user that executed it, the total run time, the source code used, as well as the git commit for this code. Status `FINISHED` indicates that the script ran successfully. These are useful metadata.

```{figure} ../../../img/single-run-mlflow-details.png
```

Regarding the details of the trained model, we have parameters for the data used (only paths, no versioning). Most importantly, we can see the logged RMSEs `5.7` (train) and `7.759` (valid). The plot of the true and predicted distributions, which we logged as a training artifact, is also conveniently displayed here. Note that we can make **queries** on the search bar, e.g. filtering runs with specific tags:

```{figure} ../../../img/mlflow-filter-tags.png
```

## Experiment tracking

In this section, we switch to a more complex XGBoost model and perform hyperparameter optimization using [Hyperopt](https://hyperopt.github.io/hyperopt/). We show how this looks in MLFlow. Note that it is easy to select the best models in an experiment by simply clicking the column header of the metrics. Finally, we look at **autologging** which makes logging simpler for frameworks with MLFlow integration.

```{margin}
[`experiment_xgb.py`](https://github.com/particle1331/inefficient-networks/blob/mlops/docs/notebooks/mlops/2-mlflow/experiment_xgb.py)
```
```python
mlflow.set_tracking_uri("sqlite:///mlflow.db")
mlflow.set_experiment("nyc-taxi-experiment")

for alpha in tqdm([100, 10, 1, 0.1, 0.01, 0.001]):
    with mlflow.start_run(run_name='lasso'):

        # ...

        model = Lasso(alpha=alpha)
        model.fit(X_train, y_train)

        # ...
```

```python
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials
from hyperopt.pyll import scope
import xgboost as xgb

import mlflow

from utils import *
from sklearn.feature_extraction import DictVectorizer
from sklearn.metrics import mean_squared_error

train_data_path = '../data/green_tripdata_2021-01.parquet'
valid_data_path = '../data/green_tripdata_2021-02.parquet'

# In-between transformations
transforms = [add_pickup_dropoff_pair]
categorical = ['PU_DO']
numerical = ['trip_distance']

train_dicts, y_train = preprocess_dataset(train_data_path, transforms, categorical, numerical)
valid_dicts, y_valid = preprocess_dataset(valid_data_path, transforms, categorical, numerical)

# Fit all possible categories
dv = DictVectorizer()
dv.fit(train_dicts + valid_dicts)

X_train = dv.transform(train_dicts)
X_valid = dv.transform(valid_dicts)

xgb_train = xgb.DMatrix(X_train, label=y_train)
xgb_valid = xgb.DMatrix(X_valid, label=y_valid)
```

```python
def objective(params):
    # Each hyperparameter configuration is tracked as its own run
    with mlflow.start_run():
        mlflow.set_tag("model", "xgboost")
        mlflow.log_params(params)

        booster = xgb.train(
            params=params,
            dtrain=xgb_train,
            num_boost_round=100,
            evals=[(xgb_valid, 'validation')],
            early_stopping_rounds=50
        )
        y_pred = booster.predict(xgb_valid)
        rmse = mean_squared_error(y_valid, y_pred, squared=False)
        mlflow.log_metric("rmse", rmse)

    # Hyperopt minimizes the value under the 'loss' key
    return {'loss': rmse, 'status': STATUS_OK}


mlflow.set_tracking_uri("sqlite:///mlflow.db")
mlflow.set_experiment("nyc-taxi-experiment")

search_space = {
    'max_depth': scope.int(hp.quniform('max_depth', 4, 100, 1)),
    'learning_rate': hp.loguniform('learning_rate', -3, 0),
    'reg_alpha': hp.loguniform('reg_alpha', -5, -1),
    'reg_lambda': hp.loguniform('reg_lambda', -6, -1),
    'min_child_weight': hp.loguniform('min_child_weight', -1, 3),
    'objective': 'reg:squarederror',  # current name of the deprecated 'reg:linear'
    'seed': 42
}

best_result = fmin(
    fn=objective,
    space=search_space,
    algo=tpe.suggest,
    max_evals=50,
    trials=Trials()
)
```


```text
[0]	validation-rmse:18.83253
[1]	validation-rmse:16.81642
[2]	validation-rmse:15.09963
[3]	validation-rmse:13.64471
[4]	validation-rmse:12.41876
[5]	validation-rmse:11.38924
[6]	validation-rmse:10.53007
[7]	validation-rmse:9.81639
[8]	validation-rmse:9.22736
[9]	validation-rmse:8.74013
[10]	validation-rmse:8.34108
[11]	validation-rmse:8.01621
[12]	validation-rmse:7.75268
[13]	validation-rmse:7.53622
[14]	validation-rmse:7.36016
[15]	validation-rmse:7.21749
[16]	validation-rmse:7.10068
[17]	validation-rmse:7.00376
[18]	valid

KeyboardInterrupt: 
```
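The bounds in the search space are easier to read once converted to actual hyperparameter values. `hp.loguniform(label, low, high)` draws `exp(u)` with `u` uniform on `[low, high]`, and `hp.quniform(label, low, high, q)` rounds a uniform draw to a multiple of `q`. The sketch below mimics both with the standard library only. The helper names are hypothetical and it does not call Hyperopt itself:

```python
import math
import random


def sample_loguniform(low, high, rng):
    """Mimic hp.loguniform: exp of a uniform draw on [low, high]."""
    return math.exp(rng.uniform(low, high))


def sample_quniform(low, high, q, rng):
    """Mimic hp.quniform: a uniform draw rounded to a multiple of q."""
    return round(rng.uniform(low, high) / q) * q


rng = random.Random(42)

# (low, high) bounds copied from the search space above
bounds = {
    'learning_rate': (-3, 0),
    'reg_alpha': (-5, -1),
    'reg_lambda': (-6, -1),
    'min_child_weight': (-1, 3),
}

samples = {}
for name, (low, high) in bounds.items():
    samples[name] = [sample_loguniform(low, high, rng) for _ in range(1000)]
    print(f"{name}: values in [{math.exp(low):.4f}, {math.exp(high):.4f}]")

# max_depth is an integer between 4 and 100
depths = [int(sample_quniform(4, 100, 1, rng)) for _ in range(1000)]
print(f"max_depth: values in [{min(depths)}, {max(depths)}]")
```

For instance, the learning rate ranges from about `0.05` (that is, `exp(-3)`) up to `1`, with draws spread evenly on a log scale rather than a linear one.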