# Lab 03: Experiment Management

## What you will learn

- How experiment management brings observability to ML model development
- Workflows for using MLflow in experiment management, including metric logging, artifact versioning, and hyperparameter optimization


## Experiment Management with MLflow

We will be using MLflow Tracking for experiment management. MLflow Tracking is an API and UI for logging parameters, code versions, metrics, and output files when running your machine learning code, and for visualizing the results later.

There are two important concepts:

- **Runs**: Runs are executions of some piece of data science code (e.g. `python train.py`). Each run records metadata (metrics, parameters, start and end times) as well as the artifacts produced by the code (e.g. model weights).
- **Experiments**: An experiment groups together runs for a specific task.

![MLflow concepts](https://mlflow.org/docs/latest/_images/tracking-basics.png)

### Launching the MLflow tracking server

There are various deployment configurations possible for MLflow. Here we'll simply run it locally, and store everything to local files, but a production setup would usually use cloud storage for artifacts and a database for metadata.

![MLflow tracking server setups](https://mlflow.org/docs/latest/_images/tracking-setup-overview.png)

To start a local tracking server, run the following in a shell:

```shell
mlflow server --host 127.0.0.1 --port 8080
```

### Using the MLflow Client API

The `MlflowClient` is one of the primary mechanisms that you will use when training ML models. It enables you to

- create new experiments
- start runs within experiments
- document parameters and metrics for your runs
- log artifacts linked to your runs

First, import the `MlflowClient`:

In [1]:
from mlflow import MlflowClient

By default, the `MlflowClient` uses local storage instead of a tracking server. This means that your experiments, data, models, and everything else you log to MLflow will be stored within the current working directory.

To connect to a tracking server, you can set the `tracking_uri` parameter.

In [2]:
client = MlflowClient(tracking_uri="http://localhost:8080")

#### The Default Experiment

The Default Experiment is a placeholder that will be used if no explicit experiment is declared. It acts as a fallback to ensure that your valuable tracking data is not lost, even if you forget to explicitly create an experiment.

Let's see what this default experiment looks like. We can search the available experiments using `MlflowClient.search_experiments()`.

In [None]:
experiments = client.search_experiments()
experiments

As you can see, `search_experiments` returns a list of `Experiment` objects. Each `Experiment` comes with an ID (`experiment_id`), a name (`name`), a storage location for its artifacts (`artifact_location`), a couple of timestamps, and tags. Tags allow you to attach more information to an experiment, and the UI allows you to search for them. One "special" tag is `mlflow.note.content`, which you can use to attach a note to your experiment.

#### Creating an experiment

Creating an experiment is straightforward. In the following cell, we demonstrate how to create an experiment with additional metadata attached to it:

In [4]:
# Provide an Experiment description that will appear in the UI
experiment_description = (
    "This is an experiment for a coffee shop to forecast sales."
)

# Provide searchable tags that define characteristics of the Runs that
# will be in this Experiment
experiment_tags = {
    "project_name": "coffee-forecasting",
    "team": "stores-ml",
    "project_quarter": "Q1-2024",
    "mlflow.note.content": experiment_description,
}

# Create the Experiment, providing a unique name
coffee_experiment = client.create_experiment(
    name="Coffee_Models", tags=experiment_tags
)


Once you have executed the cell above, head over to your MLflow instance. You should see a new experiment in the `Experiments` menu.

![image.png](imgs/important_ui_components.png)

There are a couple of UI components that are noteworthy here:

![image.png](imgs/important_ui_concepts_annotated.png)

As you can see, some of the tags we set previously are visible in the UI. Others are not, but they can still be searched using the search field or the API. You can search experiments by tag by setting the `filter_string`:

In [None]:
coffee_experiment = client.search_experiments(filter_string="tags.`project_name` = 'coffee-forecasting'")
coffee_experiment

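If you already know an experiment's name, there is of course a simpler way to fetch it. Unlike `search_experiments`, this returns a single `Experiment` rather than a list: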

In [None]:
coffee_experiment = client.get_experiment_by_name("Coffee_Models")
coffee_experiment

### Logging to MLflow

In this section we'll be taking a closer look at the core features of MLflow Tracking:
- creating new runs using the `start_run` context manager
- an introduction to logging
- the role of model signatures
- logging a trained model

#### Keeping track of training

As an example, we will be forecasting coffee shop sales (a given, after the previous lab) using machine learning.

For our forecasting needs, we will be using [`prophet`](https://facebook.github.io/prophet/). Prophet is a "forecasting procedure" developed by Meta. It is fully automated and usually a great start for any time series forecasting project. There's no need to understand the details of Prophet for the purpose of this lab.

In [7]:
import mlflow
import pandas as pd
from prophet import Prophet

We're not importing the `MlflowClient` here. Instead, we will be using the fluent API: module-level functions such as `mlflow.start_run` and `mlflow.log_params` that operate on globally held state (the tracking URI, the active experiment, and the active run). It is a higher-level API for performing the same actions as the `MlflowClient`.

To connect to the MLflow tracking server, simply set the tracking URI as follows.

In [8]:
mlflow.set_tracking_uri("http://localhost:8080")

Next, we set the experiment, run name, and artifact path. If you do not set a run name, MLflow will generate one for you.
The artifact path is the location, relative to the run's artifact directory, that your model will be saved to.

In [9]:
coffee_experiment = mlflow.set_experiment("Coffee_Models")
run_name = "coffee_forecast_prophet"
artifact_path = "coffee_prophet"

With these definitions out of the way, we can now start training our model.

In [None]:
# We begin with some data wrangling to prepare the data for Prophet
df = pd.read_csv("data/coffee_sales.csv")
subset = df[(df["product_id"] == 32) & (df["store_id"] == 8)]

# For each day in transaction_date, sum the transaction_qty
daily_sales = subset.groupby("transaction_date").agg({"transaction_qty": "sum"}).reset_index()
daily_sales.columns = ["ds", "y"]
daily_sales["y"] = daily_sales["y"].astype(float)

# Split the last 30 days of data into a test set
train = daily_sales.iloc[:-30]
test = daily_sales.iloc[-30:].reset_index()


# Define hyperparameters for the Prophet model. Their meaning is not important. 
# We are just demonstrating how to log hyperparameters
params = {
    "seasonality_mode": "multiplicative",
    "changepoint_prior_scale": 0.05,
    "seasonality_prior_scale": 10.0,
    "holidays_prior_scale": 10.0,
    "mcmc_samples": 0,
}

# Create a Prophet model from the hyperparameters above and fit it to the training data
model = Prophet(**params)
model.fit(train)

# Make predictions on the test set
forecast = model.predict(test)

# Compare forecasted values to test set
mape = (abs(test["y"] - forecast["yhat"]) / test["y"]).mean()
rmse = ((test["y"] - forecast["yhat"]) ** 2).mean() ** 0.5
metrics = {"mape": mape, "rmse": rmse}

# Start the MLflow run
with mlflow.start_run(run_name=run_name, tags={"model": "Prophet"}) as run:
    # Log the model's hyperparameters
    mlflow.log_params(params)
    # Log the model's metrics
    mlflow.log_metrics(metrics)
    # Log the model itself
    mlflow.prophet.log_model(model, artifact_path=artifact_path, input_example=train)

Let's break down the previous cell:

1. We wrangled some data - nothing new here.
2. We created a model using the parameters defined in `params` and fit it to the training data.
3. We tested it on a test set and computed some metrics.
4. This is where it gets interesting from an MLflow perspective: we created a run using the previously defined `run_name` and then logged the `params`, `metrics`, and the `model` itself to MLflow. When logging a model, you can pass an example input. This allows MLflow to infer the signature of your model.
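
To make step 3 concrete, here is a tiny worked example of the two metrics on made-up numbers (three observations instead of the 30-day test set). It uses plain Python so that each term is visible; the pandas expressions in the cell above compute the same quantities.

```python
import math

# Made-up actuals and forecasts, purely for illustration.
actual = [10.0, 20.0, 40.0]
predicted = [12.0, 18.0, 40.0]

# MAPE: mean of |actual - predicted| / actual -> (0.2 + 0.1 + 0.0) / 3
mape = sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

# RMSE: square root of the mean squared error -> sqrt((4 + 4 + 0) / 3)
rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

print(f"mape={mape:.3f}, rmse={rmse:.3f}")  # mape=0.100, rmse=1.633
```

Keep in mind that MAPE divides by the actual value, so a day with zero sales would make the metric infinite or undefined.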

Note the `mlflow.prophet.log_model` function: MLflow supports a range of machine learning and deep learning frameworks (called ["model flavors"](https://mlflow.org/docs/latest/models.html#built-in-model-flavors)). If you use a framework that MLflow does not support, you can always log [Python functions](https://mlflow.org/docs/latest/models.html#python-function-python-function) and raw files directly. Generally, you can log almost anything to MLflow, and there are dedicated functions for a range of artifacts (e.g. matplotlib `Figure`s, images, NumPy data). Refer to the [MLflow docs](https://mlflow.org/docs/latest/python_api/mlflow.html) for a complete list.

Your `Coffee_Models` experiment should now look something like the screenshot below.

![Coffee_Models with content](imgs/Coffee_Models_with_content.png)

You can click on the run to reveal detailed information about the run you logged, including the parameters, metrics, and artifacts.

## Deep Learning with MLflow

So far, we've seen a model that is relatively quick to train. Deep learning models, however, can train for days. We'll now see how MLflow can be used to monitor the training of deep models, similar to tools like TensorBoard or Weights & Biases.

As an example, let's (try to) solve the [(in)famous XOR problem](https://en.wikipedia.org/wiki/Perceptron#Universal_approximation_theorem) using a PyTorch model.

In [16]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
%matplotlib inline
torch.manual_seed(2)

# Data
X = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=torch.float32)
y = torch.tensor([[0], [1], [1], [0]], dtype=torch.float32)

# Model
class XOR(nn.Module):
    def __init__(self, activation=F.sigmoid):
        super(XOR, self).__init__()
        self.fc1 = nn.Linear(2, 2)
        self.fc2 = nn.Linear(2, 1)
        self.activation = activation

    def forward(self, x):
        x = self.activation(self.fc1(x))
        x = self.fc2(x)
        return self.activation(x)

# We need room for improvement ;)    
activation = nn.Identity()
model = XOR(activation=activation)

loss_fn = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.02, momentum=0.9)

So far, nothing has changed: you declare your model and loss function, and select an optimizer.

The only place that requires some changes is the training loop. Here, we are going to log the training loss every 100 epochs.

In [None]:
# Implicitly create a new experiment
mlflow.set_experiment("XOR")

epochs = 10000

with mlflow.start_run() as run:
    # Log the hyperparameters
    hp = {
        "activation": activation.__class__.__name__,
        "lr": 0.02,
        "momentum": 0.9,
        "epochs": epochs,
        "loss_fn": loss_fn.__class__.__name__,
        "optimizer": optimizer.__class__.__name__,
    }
    mlflow.log_params(hp)
    # Train the model

    for epoch in range(epochs):
        # Forward pass
        outputs = model(X)
        loss = loss_fn(outputs, y)

        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if epoch % 100 == 0:
            print(f"Epoch [{epoch}/{epochs}], Loss: {loss.item()}")
            # Pass the loss as a plain float rather than a formatted string
            mlflow.log_metric("loss", loss.item(), step=epoch)

    # Save the trained model to MLflow.
    mlflow.pytorch.log_model(model, "model", input_example=X.to("cpu").numpy())

### Autologging

That was easy! It could get tedious, however, once validation and test sets are introduced. Also, we have to manually call all the logging functions every time we want to save more data to MLflow.

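Luckily, MLflow comes with `autologging`! Instead of adding all the calls yourself, simply call `mlflow.autolog` any time before `mlflow.start_run`! What gets captured depends on the framework: for PyTorch, MLflow's documentation describes autologging for PyTorch Lightning rather than for hand-written training loops like the one above. Make sure to check out MLflow's guide on ["Automatic Logging with MLflow Tracking"](https://mlflow.org/docs/latest/tracking/autolog.html) if you want to learn more.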

## Comparing Runs

When we look at our previous attempt at solving the XOR problem, we have to admit that we were not particularly successful.

Let's analyze the loss curve. In the UI, select the XOR experiment and then click on the `Chart` tab.

![XOR identity chart](imgs/xor_identity_chart.png)

It looks like the model hasn't learnt much. Let's swap out the `nn.Identity` for an `F.sigmoid` and rerun the training.
After refreshing the page, there should now be an additional run.

![XOR sigmoid chart](imgs/xor_sigmoid_chart.png)

It looks like the model is finally starting to learn something after step 9000. Maybe it needs more iterations? Increase the number of epochs to `100000`.

![XOR sigmoid chart with more iterations](imgs/xor_sigmoid_chart_with_more_iterations.png)

Ah! This looks much better!
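
In hindsight, the first run could not have succeeded: with `nn.Identity`, the two linear layers collapse into a single affine map `w1*x1 + w2*x2 + b`, and no affine map can fit XOR. The sketch below is independent of the lab's model and of MLflow. It fits such a map to the XOR data with plain gradient descent (the learning rate and step count are arbitrary choices for this illustration) and shows where the loss gets stuck.

```python
# XOR inputs and targets, as in the lab.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
targets = [0.0, 1.0, 1.0, 0.0]

w1 = w2 = b = 0.0
lr = 0.1  # arbitrary choice for this sketch
n = len(points)

for _ in range(5000):
    # Residuals of the affine model on all four points.
    residuals = [w1 * x1 + w2 * x2 + b - t for (x1, x2), t in zip(points, targets)]
    # Gradients of the mean squared error with respect to w1, w2, and b.
    grad_w1 = sum(2 * r * x1 for r, (x1, x2) in zip(residuals, points)) / n
    grad_w2 = sum(2 * r * x2 for r, (x1, x2) in zip(residuals, points)) / n
    grad_b = sum(2 * r for r in residuals) / n
    w1, w2, b = w1 - lr * grad_w1, w2 - lr * grad_w2, b - lr * grad_b

predictions = [w1 * x1 + w2 * x2 + b for x1, x2 in points]
mse = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n
print([round(p, 3) for p in predictions], round(mse, 3))
```

The best the affine model can do is predict 0.5 for every input, which gives a mean squared error of 0.25. A non-linear activation such as the sigmoid is what lets the network get below that.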

---

There are many more features to the chart view, which we invite you to explore on your own.
MLflow's comparison features really begin to shine when it comes to hyperparameter tuning. In the second part of this lab, you will be introduced to a state-of-the-art hyperparameter tuning package and get to play a game of _guess the hyperparameter_.