Inspired by: https://docs.databricks.com/_static/notebooks/mlflow/mlflow-quick-start-training.html
# Useful links:
- MLflow documentation: https://www.mlflow.org/docs/latest/index.html
- MLflow guide from databricks: https://docs.databricks.com/applications/mlflow/index.html

# Notes (MLflow Tracking)
documentation > https://www.mlflow.org/docs/latest/tracking.html

- Can be used to to track experiments to record and compare parameters and results.

**Vocabulary**
- *run*: corresponds to a single execution of model training code. Each run can record different informations (model parameters, metrics, tags, artifacts, etc).
- *experiment*: the primary unit of organization and access control for MLflow runs; all MLflow runs belong to an experiment. Experiments let you visualize, search for, and compare runs, as well as download run artifacts and metadata for analysis in other tools.
- *MLflow entities*: runs, parameters, metrics, tags, notes, metadata, etc
- ...

**What can be recorded by an MLflow run?** > https://www.mlflow.org/docs/latest/tracking.html#concepts

**Where runs are recorded?** > https://www.mlflow.org/docs/latest/tracking.html#where-runs-are-recorded

They can be recorded
- to local files (by default to *mlruns* directory)
    - Launch UI: `mlflow ui`
- to SQLAlchemy compatible database
    - Setup MLflow: `mlflow.set_tracking_uri('sqlite:///mlflow.db')`
    - Launch UI: `mlflow ui --backend-store-uri sqlite:///mlflow.db`
- remotely to a tracking server

To show the current tracking uri `mlflow.get_tracking_uri()`
    
**How they are recorded** > https://www.mlflow.org/docs/latest/tracking.html#how-runs-and-artifacts-are-recorded

MLflow uses two components for storage:
- backend store: for MLflow entities (runs, parameters, metrics, tags, notes, metadata, etc)
- artifact store: for artifacts (files, models, images, in-memory objects, or model summary, etc)

You can use either manual or automatic logging
- Manual logging > https://www.mlflow.org/docs/latest/tracking.html#logging-functions
    - Log the fitted model: `mlflow.sklearn.log_model(rf, 'random-forest-model')`
    - Log the model parameters:
        - One parameter at a time: `mlflow.log_param('num_trees', n_estimators)`
        - A dict of parameters: `mlflow.log_parms({'num_trees', n_estimators, 'alpha', 0.04})`
    - Log the evaluation metrics: `mlflow.log_metric('mse', mse)`
    - Log other artifacts: `mlflow.log_artifact('predictions.csv')`

- Automatic logging with MLflow autolog
    - With MLflow's autologging capabilities, a single line of code automatically logs the resulting model, the parameters used to create the model, and a model score > https://www.mlflow.org/docs/latest/tracking.html#automatic-logging
    - Call `mlflow.<framework>.autolog()` API before running training code to log model-specific metrics, parameters, and model artifacts. Supports many ML frameworks (sklearn, tensorflow, etc).


# Hello world MLflow

In [1]:
# Install mlflow
#%pip install mlflow

In [18]:
import numpy as np
import pandas as pd
import mlflow
from datetime import datetime

from sklearn.linear_model import ElasticNet
from sklearn.ensemble import RandomForestRegressor

  and should_run_async(code)


In [14]:
#mlflow.autolog()

In [4]:
# Set up the name of the experiment
#mlflow.set_experiment('my_experiment')

In [5]:
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

def get_dataset() -> pd.DataFrame:
    db = load_diabetes()
    X, y = db.data, db.target
    return train_test_split(X, y, random_state=42)

X_train, X_test, y_train, y_test = get_dataset()
X_train.shape, X_test.shape

((331, 10), (111, 10))

In [41]:
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

def evaluate_model(model, X_test, y_test):
    y_pred = model.predict(X_test)
    
    rmse = np.sqrt(mean_squared_error(y_test, y_pred))
    mae = mean_absolute_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)
    #mlflow.log_metric('rmse', rmse)
    #mlflow.log_metric('mae', mae)
    #mlflow.log_metric('r2', r2)
    mlflow.log_metrics({'rmse': rmse, 'mae': mae, 'r2': r2})
    print(f'RMSE = {rmse:.2f}, MAE = {mae:.2f}, R2 = {r2:.2f}')
    return rmse, mae, r2

def train_model(X_train, X_test, y_train, y_test: pd.DataFrame, model_class, **model_kwargs) -> int:
    model = model_class(**model_kwargs)
    mlflow.log_params(model_kwargs)
    model.fit(X_train, y_train)
    mlflow.sklearn.log_model(model, 'model')
    evaluate_model(model, X_test, y_test)
    return model

  and should_run_async(code)


- Setup the mlflow and use a tracking server
    - Launch the server in your terminal: `mlflow server --backend-store-uri sqlite:////tmp/mlruns.db --default-artifact-root /tmp/mlruns`

In [42]:
mlflow.set_tracking_uri('http://127.0.0.1:5000')

- When launching a *run*, always use a context manager (with statement) so that the *run* will be closed. Otherwise, you need to close it manually with *mlflow.end_run()*

In [53]:
mlflow.end_run()

In [54]:
#with mlflow.start_run(run_name=f'my_model_{datetime.now()}'):
with mlflow.start_run():
    model_kwargs = {'alpha': 0.01, 'l1_ratio': 0.75}
    print(mlflow.active_run().info.run_id)
    model = train_model(X_train, X_test, y_train, y_test, ElasticNet, **model_kwargs)

b134e2c67a9444fc8299af4337f0c96e
RMSE = 55.11, MAE = 45.22, R2 = 0.45


- Load the saved model with [load_model](https://www.mlflow.org/docs/latest/python_api/mlflow.sklearn.html#mlflow.sklearn.load_model)

In [57]:
run_id = 'b134e2c67a9444fc8299af4337f0c96e'
model_uri = f'runs:/{run_id}/model'
model = mlflow.sklearn.load_model(model_uri=model_uri)
model.coef_

array([  50.16885894,  -77.4812136 ,  301.17090269,  212.15116121,
        -11.20043952,  -32.51624248, -160.34918199,  122.9815184 ,
        230.5857864 ,  101.94149439])

In [58]:
with mlflow.start_run():
    model_kwargs = {'alpha': 0.02, 'l1_ratio': 0.7}
    model = train_model(X_train, X_test, y_train, y_test, ElasticNet, **model_kwargs)

RMSE = 59.01, MAE = 49.68, R2 = 0.37


In [None]:
with mlflow.start_run():
    model_kwargs = {'alpha': 0.1, 'l1_ratio': 0.01}
    model = train_model(X_train, X_test, y_train, y_test, ElasticNet, **model_kwargs)

In [None]:
with mlflow.start_run():
    model_kwargs = {'alpha': 0.005, 'l1_ratio': 0.8}
    model = train_model(X_train, X_test, y_train, y_test, ElasticNet, **model_kwargs)

In [None]:
mlflow.search_runs(filter_string="metric.rmse < 60")

- Get the model with the best metrics

In [31]:
best_run = mlflow.search_runs(order_by=['metrics.mae DESC']).iloc[0]
print(f'MAE of Best Run: {best_run["metrics.mae"]}')
best_run

MAE of Best Run: 63.32978720832078


  and should_run_async(code)


run_id                                            9abed34254904d64adda43f07a927d47
experiment_id                                                                    0
status                                                                    FINISHED
artifact_uri                     /tmp/mlruns/0/9abed34254904d64adda43f07a927d47...
start_time                                        2021-05-27 15:45:55.245000+00:00
end_time                                          2021-05-27 15:45:55.373000+00:00
metrics.r2                                                                0.051211
metrics.mae                                                                63.3298
metrics.rmse                                                               72.4328
params.alpha                                                                   0.1
params.l1_ratio                                                               0.01
tags.mlflow.source.type                                                      LOCAL
tags

In [33]:
best_run_id = best_run.run_id
best_run_id

'9abed34254904d64adda43f07a927d47'

https://www.mlflow.org/docs/latest/python_api/mlflow.sklearn.html#mlflow.sklearn.log_model

In [None]:
def fetch_logged_data(run_id):
    client = mlflow.tracking.MlflowClient()
    data = client.get_run(run_id).data
    tags = {k: v for k, v in data.tags.items() if not k.startswith("mlflow.")}
    artifacts = [f.path for f in client.list_artifacts(run_id, "model")]
    return data.params, data.metrics, tags, artifacts