# Experiment tracking with MLflow 📈

Running experiments is a common task in data science projects, and we ran one in the previous notebook. However, it is often difficult to keep track of all the experiments that you or your colleagues have run, and to compare their results.

In this notebook, we show how to use the MLflow library for experiment tracking.

Now that we have trained a model, we will use an MLflow server to:
- Log our model configuration (features, architecture, etc.)
- Log our test score
- Log the model
- Log additional artifacts, such as figures
- *Register* a model if the model is better than the rest

> **Note** An MLflow server has been spun up for you in the background. It is hosted in the cloud, so everyone can access it. **In case the server breaks**, you can run the following in your terminal to spin up an MLflow server locally:
```bash
mlflow server --default-artifact-root ./mlruns --host 0.0.0.0
```
> In that case, make sure to use the tracking URI of your local server (e.g. `http://localhost:5000`) instead of the one provided below, throughout the rest of the tutorial.

In [1]:
import os
import pandas as pd
import plotly.express as px
import sklearn
import sklearn.model_selection
import sklearn.pipeline
import sklearn.preprocessing
import sklearn.linear_model
import sklearn.ensemble
import sklearn.tree
import mlflow
from typing import Dict

Let's start by connecting to the MLflow server.

In [2]:
tracking_uri = "http://20.67.15.42:5000"

# Fill in your name below. This makes sure that
# whatever you log to MLflow is associated with you
your_name = "< fill in your name here >"

# mlflow config
os.environ["LOGNAME"] = your_name
mlflow.set_tracking_uri(tracking_uri)
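
Before logging anything, it can help to confirm that the tracking server is reachable, so that you know early whether to fall back to a local server. The following is a minimal sketch, assuming the server exposes MLflow's `/health` endpoint. The helper name `server_is_reachable` and the default timeout are illustrative choices, not part of the workshop code.

```python
import http.client
import urllib.error
import urllib.request


def server_is_reachable(tracking_uri: str, timeout: float = 3.0) -> bool:
    """Return True if the server answers its health check with HTTP 200."""
    health_url = tracking_uri.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(health_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Covers refused connections, timeouts, DNS failures,
        # HTTP error responses and malformed URLs
        return False
```

If `server_is_reachable(tracking_uri)` returns `False`, that is a cue to start the local server described in the note above and point `tracking_uri` at it.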

And prepare a few lines of code to allow us to experiment with different models:

In [3]:
data_path = "../data/turbine-data.csv"
data = pd.read_csv(data_path).set_index("timestamp")
data.index = pd.to_datetime(data.index)

In [4]:
features = [
    "wind_speed",
    "wind_direction",
    "is_curtailed",
]

# Drop data with missing values
data_without_na = data.dropna() 

X = data_without_na[features]  # Our model's input
y = data_without_na["active_power"]  # Target values

X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
    X,
    y,
    shuffle=False,
    # use (only) 10% of all data for training
    train_size=0.1  # don't change this number
)

In [5]:
model = sklearn.pipeline.make_pipeline(
    # Preprocessing
    # sklearn.preprocessing.StandardScaler(),
    # sklearn.preprocessing.PolynomialFeatures(degree=3),

    # Model
    sklearn.linear_model.LinearRegression(),
    # sklearn.linear_model.Ridge(),
    # sklearn.tree.DecisionTreeRegressor(),
    # sklearn.ensemble.RandomForestRegressor(),
)

model.fit(X_train, y_train)

score = model.score(X_test, y_test)
score
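
For scikit-learn regressors such as the ones in this pipeline, `model.score` returns the coefficient of determination R² on the given data: 1.0 is a perfect fit, 0.0 matches always predicting the mean of the targets, and negative values are worse than that. Below is a minimal pure-Python sketch of the same formula, for illustration only. The function name `r2_score_sketch` and the toy numbers are made up.

```python
from typing import Sequence


def r2_score_sketch(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant (otherwise SS_tot would be zero).
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]  # toy targets with mean 2.5

print(r2_score_sketch(y_true, y_true))      # perfect predictions -> 1.0
print(r2_score_sketch(y_true, [2.5] * 4))   # always the mean -> 0.0
```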

In [6]:
# Create figure to log to MLflow

# Apply model to all data
data.loc[X.index, "predictions"] = model.predict(X)

fig = px.line(
    data[X_test.index.min():],
    y=["active_power", "predictions"],
    title="Predicted and true generated power"
)
fig

Now let's log the results to the MLflow server!

In [7]:
mlflow.set_experiment("mlops-workshop")

model_name = "turbine-model"   # feel free to come up with your own name

# Exercise: what does `mlflow.start_run()` do?

with mlflow.start_run():
    
    # Grab the run_id, we'll use it later
    run_id = mlflow.active_run().info.run_id
        
    mlflow.log_param("features", ", ".join(features))
    mlflow.log_param("pipeline_steps", str(model))
    mlflow.log_param("model_name", model_name)
    
    # Exercise: log the model using mlflow.sklearn.log_model
    #           and use the model_name defined above
    # ... 
    
    # Exercise: log the model test score as `test_score` using mlflow.log_metric
    # ...
    
    # Exercise: log a figure with mlflow.log_figure (tip: use a `.html` extension as file name)
    # ...
    
# Exercise: try to improve your model above by changing the pipeline and features and log the results!
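
One possible way to complete the exercises above is sketched here; it is not the reference solution. It assumes the `model`, `score`, `fig`, `features` and `model_name` objects defined earlier in this notebook. The helper name `log_experiment` and the artifact file name `predictions.html` are illustrative choices, and `mlflow` is imported inside the function only so that the helper can be defined on its own.

```python
from typing import Any, List


def log_experiment(model: Any, score: float, fig: Any,
                   features: List[str], model_name: str) -> str:
    """Log parameters, model, test score and a figure in one MLflow run.

    Returns the run_id so that the run can be looked up later.
    """
    import mlflow
    import mlflow.sklearn

    # start_run() opens a new run; used as a context manager it also ends
    # the run when the block exits, even if an exception is raised inside
    with mlflow.start_run() as run:
        mlflow.log_param("features", ", ".join(features))
        mlflow.log_param("pipeline_steps", str(model))
        mlflow.log_param("model_name", model_name)

        # Store the fitted model under an artifact path equal to model_name
        mlflow.sklearn.log_model(model, model_name)

        # `score` is the value returned by model.score(X_test, y_test)
        mlflow.log_metric("test_score", score)

        # Plotly figures need an `.html` file name
        mlflow.log_figure(fig, "predictions.html")

        return run.info.run_id
```

Calling `run_id = log_experiment(model, score, fig, features, model_name)` after each change to the pipeline logs one run per experiment.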

Visit the MLflow server:

[http://20.67.15.42:5000](http://20.67.15.42:5000)

Find your run!

🎉

## Registering a model 

Now that we have run a few experiments, we can register the best model we have found so far.

You can do so either in the user interface (UI) of MLflow, or programmatically. Check out the server to explore how you can register a model in the UI.

Below we show how to do it in code.

In [13]:
# Some functions to get all the runs and scores from MLflow

def get_runs(experiment_name: str = "Default"):
    """Gets all runs from an experiment in MLflow"""
    
    client = mlflow.tracking.MlflowClient()
    experiment = client.get_experiment_by_name(experiment_name)
    # search_runs expects a list of experiment ids
    runs = client.search_runs([experiment.experiment_id])
    return runs


def get_scores_per_run_id(experiment_name: str = "Default") -> Dict[str, float]:
    """Returns dictionary as {"<run_id>" : <test_score>} with all runs from an experiment in MLflow"""
    
    runs = get_runs(experiment_name)
    scores = {run.info.run_id: run.data.metrics["test_score"] for run in runs if "test_score" in run.data.metrics}
    return scores


def check_best_score(run_id: str, experiment_name: str = "Default") -> bool:
    """Checks if the given run_id has the best score in the experiment"""
    
    scores = get_scores_per_run_id(experiment_name)

    # if there are no scores, raise an error
    if len(scores) == 0:
        raise ValueError(f"No `test_score` scores found for experiment '{experiment_name}'")
    
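    # the given run must have logged a `test_score` itself
    if run_id not in scores:
        raise ValueError(f"No `test_score` found for run '{run_id}'")

    # score is always 'the best score' if there is only one run
    if len(scores) == 1:
        return True

    run_score = scores[run_id]
    other_scores = [score for (i, score) in scores.items() if i != run_id]
    return run_score > max(other_scores)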

scores = get_scores_per_run_id(experiment_name="mlops-workshop")

if len(scores):
    print(f"Number of runs found with a `test_score`: {len(scores)}")
    print(f"Average score: {sum(scores.values()) / len(scores)}")
    print(f"Best score: {max(scores.values())}")
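
Fetching every run and comparing scores in Python works well for a small workshop experiment. As an alternative sketch, the tracking server can do the sorting itself through the `order_by` argument of `MlflowClient.search_runs`. The helper name `get_best_run` is hypothetical, and the client is passed in as an argument so the function can be tried out with any object that offers the same two methods.

```python
from typing import Any, Optional


def get_best_run(client: Any, experiment_name: str,
                 metric: str = "test_score") -> Optional[Any]:
    """Return the run with the highest `metric`, or None if there is none."""
    experiment = client.get_experiment_by_name(experiment_name)
    if experiment is None:
        # get_experiment_by_name returns None for unknown experiments
        return None

    runs = client.search_runs(
        experiment_ids=[experiment.experiment_id],
        order_by=[f"metrics.{metric} DESC"],  # highest score first
        max_results=1,
    )

    # Assumption: runs without the metric are sorted after runs that have
    # it, so a top result lacking the metric means no run logged it
    if not runs or metric not in runs[0].data.metrics:
        return None
    return runs[0]
```

With the workshop server this could be called as `get_best_run(mlflow.tracking.MlflowClient(), "mlops-workshop")`, and the best score read from `.data.metrics["test_score"]` on the result.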

In [14]:
def register_model(run_id: str):
    """Register the model with the given run_id and model_name to MLflow"""
    
    # Get model_name from run_id
    client = mlflow.tracking.MlflowClient()
    run = client.get_run(run_id)
    model_name = run.data.params["model_name"]
    
    model_uri = f"runs:/{run_id}/{model_name}"
    mlflow.register_model(model_uri, model_name)

In [17]:
# Exercise:
# Check whether the model has the best score and
# register it to MLflow if it does. We are using the
# `run_id` we have defined before.

if check_best_score(run_id, experiment_name="mlops-workshop"):
    print("Your model has the best score!")
    register_model(run_id)
else:
    print(f"Your score of {score} is not better than the max score of {max(scores.values())}")

Registered models can be moved from staging to production directly in the MLflow UI. <br>
Check out the "Models" tab in MLflow and explore a bit.
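
The same stage change can also be made from code. Here is a hedged sketch, assuming an MLflow version that still supports model stages (recent releases deprecate them in favour of model version aliases). The helper name `promote_to_production` is hypothetical, and the client is passed in as an argument.

```python
from typing import Any


def promote_to_production(client: Any, model_name: str, version: str) -> Any:
    """Move one registered model version to the "Production" stage."""
    return client.transition_model_version_stage(
        name=model_name,
        version=version,
        stage="Production",
        # Archive whatever version currently holds the Production stage
        archive_existing_versions=True,
    )
```

With the workshop server this could be called as `promote_to_production(mlflow.tracking.MlflowClient(), model_name, "1")`, where the version is whatever the "Models" tab shows for your model.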

In the next notebook, we'll show how you can load a model from production for inference.