Lab 6: MLflow Tracking
-------------

| | |
| --------- | --------------------------- |
| Notebook  | 6_MLFlowTracking.ipynb    |
| Builds On | none |
| Time to complete | 60 minutes |

In this lab, we’ll be covering the essential basics of core MLflow functionality associated with tracking training event data.

We’ll start by learning how to start a local MLflow Tracking server, how to access and view the MLflow UI, and move on to our first interactions with the Tracking server through the use of the MLflow Client.

The lab content builds upon itself, culminating in successfully logging your first MLflow model.

The topics in this lab cover:

- Starting an MLflow Tracking Server (optional) and connecting to a Tracking Server
- Exploring the MlflowClient API (briefly)
- Understanding the Default Experiment
- Searching for Experiments with the MLflow client API
- Understanding the uses of tags and how to leverage them for model organization
- Creating an Experiment that will contain our run (and our model)
- Learning how to log metrics, parameters, and a model artifact to a run
- Viewing our Experiment and our first run within the MLflow UI


Using the MLflow Client API
===========================

In the previous section, we started an instance of the MLflow Tracking Server and the MLflow UI. For this stage, we’re going to be interfacing with the Tracking Server through one of the primary mechanisms that you will use when training ML models, the MlflowClient. For the duration of this lab, this client API will be your primary interface for MLflow’s tracking capabilities, enabling you to:

- Initiate a new Experiment.
- Start Runs within an Experiment.
- Document parameters, metrics, and tags for your Runs.
- Log artifacts linked to runs, such as models, tables, plots, and more.


Importing Dependencies
-----------------------

To use the MlflowClient API, the first step is to import the necessary
modules.

With these modules imported, you're ready to configure the client
and tell it where your tracking server is located.

In [7]:
from pprint import pprint
from sklearn.ensemble import RandomForestRegressor
from mlflow import MlflowClient

Configuring the MLflow Tracking Client
--------------------------------------

By default, barring any modifications to the
`MLFLOW_TRACKING_URI` environment
variable, initializing the MlflowClient will designate your local
storage as the tracking server. This means your experiments, data,
models, and related attributes will be stored within the active
execution directory.

For the context of this guide, we'll utilize the tracking server
initialized earlier, instead of using the client to
log to the local file system directory.

In order to connect to the tracking server that we created in the
previous section of this lab, we'll need to use the uri that we
assigned the server when we started it. The two components that we
submitted as arguments to the `mlflow server` command were the `host`
and the `port`. Combined, these form
the `tracking_uri` argument that we
will specify to start an instance of the client.

In [8]:
client = MlflowClient(tracking_uri="http://127.0.0.1:5000")


We now have a client interface to the tracking server that can both send
data to and retrieve data from the tracking server.
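
As a minimal sketch of how the two `mlflow server` arguments combine into the tracking URI, the snippet below builds the string and checks its parts. The variable names are illustrative, and the values assume the server was started on `127.0.0.1` and port `5000` as in this lab; substitute whatever you actually passed to `mlflow server`.

```python
from urllib.parse import urlparse

# Values assumed to match the arguments given to `mlflow server` in this lab.
host = "127.0.0.1"
port = 5000

# The tracking URI is the HTTP scheme followed by the host and the port.
tracking_uri = f"http://{host}:{port}"

# Sanity-check the pieces before passing the URI to MlflowClient(tracking_uri=...).
parsed = urlparse(tracking_uri)
print(parsed.scheme, parsed.hostname, parsed.port)
```
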



The Default Experiment
-----------------------

Before we get to logging anything to the Tracking Server, let's take a
look at a key feature that exists at the outset of starting any MLflow
Tracking Server: the Default Experiment.

The Default Experiment is a placeholder that is used to encapsulate all
run information if an explicit Experiment is not declared. While using
MLflow, you'll be creating new experiments in order to organize
projects, project iterations, or logically group large modeling
activities together in a grouped hierarchical collection. However, if
you manage to forget to create a new Experiment before using the MLflow
tracking capabilities, the Default Experiment is a fallback for you to
ensure that your valuable tracking data is not lost when executing a
run.

Let's see what this Default Experiment looks like by using the `mlflow.client.MlflowClient.search_experiments()` API.



Searching Experiments
----------------------

The first thing that we're going to do is to view the metadata
associated with the Experiments that are on the server. We can
accomplish this through the use of the
`mlflow.client.MlflowClient.search_experiments()` API. Let's issue a search query to see what the results are.


In [9]:
all_experiments = client.search_experiments()
print(all_experiments)

[<Experiment: artifact_location='mlflow-artifacts:/860211766170465072', creation_time=1725810205880, experiment_id='860211766170465072', last_update_time=1725810205880, lifecycle_stage='active', name='Apple_Models', tags={'mlflow.note.content': 'This is the grocery forecasting project. This '
                        'experiment contains the produce models for apples.',
 'project_name': 'grocery-forecasting',
 'project_quarter': 'Q3-2023',
 'store_dept': 'produce',
 'team': 'stores-ml'}>, <Experiment: artifact_location='mlflow-artifacts:/857513909126612285', creation_time=1725485974835, experiment_id='857513909126612285', last_update_time=1725485974835, lifecycle_stage='active', name='MLflow Quickstart', tags={}>, <Experiment: artifact_location='mlflow-artifacts:/800666557001923439', creation_time=1725481171594, experiment_id='800666557001923439', last_update_time=1725481171594, lifecycle_stage='active', name='taxi', tags={}>, <Experiment: artifact_location='mlflow-artifacts:/0', creati


It is worth noting that `search_experiments()` does not return a
basic collection of primitives. Rather, it returns a list of
`Experiment` objects. Many of MLflow's client APIs return objects that
carry metadata attributes associated with the task being performed. This
is an important aspect to remember, as it makes more complex sequences of
actions easier to perform, which will be covered in later labs.

With the returned collection, we can iterate over these objects with a
comprehension to access the specific metadata attributes of the Default
experiment.

To get familiar with accessing elements from collections returned by
MLflow APIs, let's pull the `name` and the `lifecycle_stage` from the
`search_experiments()` results and collect these attributes into a dict.


In [10]:
default_experiment = [
    {"name": experiment.name, "lifecycle_stage": experiment.lifecycle_stage}
    for experiment in all_experiments
    if experiment.name == "Default"
][0]

pprint(default_experiment)

{'lifecycle_stage': 'active', 'name': 'Default'}
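
The same pattern extends to the other metadata attributes. The sketch below is one possible way to write it: `summarize_experiments` is a hypothetical helper rather than part of MLflow, and it relies only on the `name`, `experiment_id`, `lifecycle_stage`, and `tags` attributes visible in the printed `Experiment` objects above. The stand-in objects exist only so that the snippet runs without a tracking server.

```python
from types import SimpleNamespace


def summarize_experiments(experiments, name=None):
    """Collect a few metadata attributes from Experiment-like objects.

    Hypothetical helper: if `name` is given, only experiments with that
    name are included in the result.
    """
    return [
        {
            "name": exp.name,
            "experiment_id": exp.experiment_id,
            "lifecycle_stage": exp.lifecycle_stage,
            "tags": dict(exp.tags),
        }
        for exp in experiments
        if name is None or exp.name == name
    ]


# Stand-ins for the objects returned by client.search_experiments().
fake_experiments = [
    SimpleNamespace(name="Default", experiment_id="0", lifecycle_stage="active", tags={}),
    SimpleNamespace(name="taxi", experiment_id="1", lifecycle_stage="active", tags={}),
]

# next() with a default avoids the IndexError that [0] raises on an empty result.
default_summary = next(iter(summarize_experiments(fake_experiments, name="Default")), None)
print(default_summary)
```

With the real client, passing `all_experiments` in place of `fake_experiments` should work the same way.
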


In the next step, we'll create our first experiment and dive into the
options that are available for providing metadata information that helps
to keep track of related experiments and organize our runs within
experiments so that we can effectively compare the results of different
parameters for training runs.

Creating Experiments
====================

In the previous section, we became familiar with the MLflow Client and
its `search_experiments` API. Now we'll create our first Experiment and
attach tags to it.

Notes on Tags vs Experiments
----------------------------

While MLflow does provide a default experiment, it primarily serves as a
'catch-all' safety net for runs initiated without a specified active
experiment. However, it's not recommended for regular use. Instead,
creating unique experiments for specific collections of runs offers
numerous advantages, as we'll explore below.

**Benefits of Defining Unique Experiments:**

1\. **Enhanced Organization**: Experiments allow you to group related
runs, making it easier to track and compare them. This is especially
helpful when managing numerous runs, as in large-scale projects.

2\. **Metadata Annotation**: Experiments can carry metadata that aids in
organizing and associating runs with larger projects.

Consider the scenario below: we're simulating participation in a large
demand forecasting project. This project involves building forecasting
models for various departments in a chain of grocery stores, each
housing numerous products. Our focus here is the 'produce' department,
which has several distinct items, each requiring its own forecast model.
Organizing these models becomes paramount to ensure easy navigation and
comparison.

**When Should You Define an Experiment?**

The guiding principle for creating an experiment is the consistency of
the input data. If multiple runs use the same input dataset (even if
they utilize different portions of it), they logically belong to the
same experiment. For other hierarchical categorizations, using tags is
advisable.

For the scenario above, we can create an experiment and attach several tags to it.

In [11]:
experiment_description = (
    "This is the grocery forecasting project. "
    "This experiment contains the produce models for apples."
)

experiment_tags = {
    "project_name": "grocery-forecasting",
    "store_dept": "produce",
    "team": "stores-ml",
    "project_quarter": "Q3-2023",
    "mlflow.note.content": experiment_description,
}

produce_apples_experiment = client.create_experiment(name="Apple_Models", tags=experiment_tags)

RestException: RESOURCE_ALREADY_EXISTS: Experiment 'Apple_Models' already exists.
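
The `RESOURCE_ALREADY_EXISTS` error above is what `create_experiment` raises when an experiment with that name is already on the server, for example when the cell is run a second time. One way to make the cell safe to re-run is to look the experiment up by name first. The helper below is a sketch, not part of MLflow: `get_or_create_experiment` is a hypothetical name, and it assumes a client whose `get_experiment_by_name` returns `None` for unknown names, as `MlflowClient` does.

```python
def get_or_create_experiment(client, name, tags=None):
    """Return the experiment ID for `name`, creating the experiment if needed.

    Hypothetical helper; `client` is expected to behave like MlflowClient.
    """
    existing = client.get_experiment_by_name(name)
    if existing is not None:
        # The experiment is already on the server, so reuse its ID.
        return existing.experiment_id
    # create_experiment returns the ID of the newly created experiment.
    return client.create_experiment(name=name, tags=tags)


# Usage in the lab (requires the tracking server to be running):
# produce_apples_experiment = get_or_create_experiment(
#     client, "Apple_Models", experiment_tags
# )
```

Note that tags are only applied when the experiment is created; this sketch does not update the tags of an experiment that already exists.
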

In the next section, we'll take a look at what these tags can be used
for, which of them are visible in the UI, and how we can leverage the
power of `tags` to simplify access to
experiments that are part of a larger project.



Searching Experiments
=====================


In the last section, we created our first MLflow Experiment, providing
custom tags so that we can find related Experiments that are part of
a larger project.

In this brief section, we're going to see how to perform those searches
with the MLflow Client API.

Before we perform the search, let's take a look at our
`Apple_Models` experiment in the UI.


Seeing our new Experiment in the UI
--------------------------------------

As before, we're going to connect to our running MLflow Tracking server
to view the MLflow UI. If you've closed the browser window that was
running it, simply navigate to `http://127.0.0.1:5000` in a new browser window.

### Important components to be aware of in the UI

There are some important elements in the UI to be aware of at this
point, before we start adding more exciting things like runs to our new
experiment. Note the annotated elements on the figure below. It will be
useful to know that these bits of data are there later on.


[![Important Data on the Experiment View
Page](./images//experiment-page-elements.svg)](./images//experiment-page-elements.svg)



Searching based on tags
------------------------

Now that we've seen the experiment and understand which of the tags that
we specified when creating it are visible within the UI and which are
not, let's explore the reason for defining those tags. We'll run searches
against the tracking server to find experiments whose custom tag values
match our query terms.

One of the more versatile uses of setting `tags` within Experiments is to enable searching for related
Experiments based on a common tag. The filtering capabilities within the
`search_experiments` API can be seen
below, where we are searching for experiments whose custom
`project_name` tag exactly matches
`grocery-forecasting`.

Note that the format used for the search filter has some
nuance to it. For named entities (for instance, here, the
`tags` term at the beginning of the
filter string), keys can be used directly. However, to reference custom
tags, note the particular syntax used. The custom tag names are wrapped
in backticks (\`) and our matching search condition is wrapped in
single quotes.


In [12]:
# Use search_experiments() to search on the project_name tag key

apples_experiment = client.search_experiments(
    filter_string="tags.`project_name` = 'grocery-forecasting'"
)

pprint(apples_experiment[0])

<Experiment: artifact_location='mlflow-artifacts:/860211766170465072', creation_time=1725810205880, experiment_id='860211766170465072', last_update_time=1725810205880, lifecycle_stage='active', name='Apple_Models', tags={'mlflow.note.content': 'This is the grocery forecasting project. This '
                        'experiment contains the produce models for apples.',
 'project_name': 'grocery-forecasting',
 'project_quarter': 'Q3-2023',
 'store_dept': 'produce',
 'team': 'stores-ml'}>


Note

The returned results above are formatted for legibility. This return
type is an `Experiment` object, not a `dict`.
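
To make the quoting rules concrete, here is a small sketch that assembles a filter string from tag keys and values. `tag_filter` is a hypothetical helper, not an MLflow function. Joining several conditions with `AND` is assumed to follow MLflow's search syntax, and values containing single quotes are rejected rather than escaped.

```python
def tag_filter(**tags):
    """Build a search filter that matches experiments on custom tags.

    Hypothetical helper: each tag key is wrapped in backticks and each value
    in single quotes; multiple conditions are joined with AND.
    """
    for value in tags.values():
        if "'" in str(value):
            raise ValueError("tag values containing single quotes are not handled here")
    return " AND ".join(f"tags.`{key}` = '{value}'" for key, value in tags.items())


filter_string = tag_filter(project_name="grocery-forecasting", store_dept="produce")
print(filter_string)

# Usage in the lab (requires the tracking server to be running):
# produce_experiments = client.search_experiments(filter_string=filter_string)
```
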

In the next section, we'll begin to use this experiment to log training
data to runs that are associated with this experiment, introducing
another aspect of both the MLflow APIs (the fluent API) and another part
of the MLflow UI (the run information page).



Create a dataset about apples
=============================

In order to produce some meaningful data (and a model) for us to log to
MLflow, we'll need a dataset. In the interests of sticking with our
theme of modeling demand for produce sales, this data will actually need
to be about apples.

There's a distinctly minuscule probability of finding an actual dataset
on the internet about this, so we can just roll up our sleeves and make
our own.


Defining a dataset generator
-----------------------------

For our examples to work, we're going to need something that can
actually fit, but not something that fits too well. We're going to be
training multiple iterations in order to show the effect of modifying
our model's hyperparameters, so there needs to be some amount of
unexplained variance in the feature set. However, we need some degree of
correlation between our target variable (`demand`, in the case of our apples sales data that we want to
predict) and the feature set.

We can introduce this correlation by crafting a relationship between our
features and our target. The random elements of some of the factors will
handle the unexplained variance portion.
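
Reading the generator code in this section, the relationship it builds between the features and the target can be summarized as follows. The symbols are shorthand introduced here for readability and do not appear in the code.

$$
\text{demand} = \bigl(B - 50\,p + 50\,h + 200\,\text{promo} + 300\,\text{weekend} + \varepsilon\bigr)\cdot I
$$

Here $B$ is `base_demand`, $p$ is `price_per_kg` after the harvest adjustment ($p = p_0 - 0.5\,h$, with $p_0$ drawn uniformly from $[0.5, 3]$), $h$ is `harvest_effect`, $\varepsilon \sim \mathcal{N}(0, 50^2)$ is the random noise, and $I = 1 + 0.03\,(\text{year} - \text{year}_{\min})$ is `inflation_multiplier`. The `average_temperature`, `rainfall`, and `holiday` columns do not enter the formula, so they act as noise features for the model.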

In [13]:
from datetime import datetime, timedelta

import numpy as np
import pandas as pd


def generate_apple_sales_data_with_promo_adjustment(base_demand: int = 1000, n_rows: int = 5000):
    """
    Generates a synthetic dataset for predicting apple sales demand with seasonality and inflation.

    This function creates a pandas DataFrame with features relevant to apple sales.
    The features include date, average_temperature, rainfall, weekend flag, holiday flag,
    promotional flag, price_per_kg, and the previous day's demand. The target variable,
    'demand', is generated based on a combination of these features with some added noise.

    Args:
        base_demand (int, optional): Base demand for apples. Defaults to 1000.
        n_rows (int, optional): Number of rows (days) of data to generate. Defaults to 5000.

    Returns:
        pd.DataFrame: DataFrame with features and target variable for apple sales prediction.

    Example:
        >>> df = generate_apple_sales_data_with_promo_adjustment(base_demand=1200, n_rows=6000)
        >>> df.head()
    """

    # Set seed for reproducibility
    np.random.seed(1234)

    # Create date range
    dates = [datetime.now() - timedelta(days=i) for i in range(n_rows)]
    dates.reverse()

    # Generate features
    df = pd.DataFrame(
        {
            "date": dates,
            "average_temperature": np.random.uniform(10, 35, n_rows),
            "rainfall": np.random.exponential(5, n_rows),
            "weekend": [(date.weekday() >= 5) * 1 for date in dates],
            "holiday": np.random.choice([0, 1], n_rows, p=[0.97, 0.03]),
            "price_per_kg": np.random.uniform(0.5, 3, n_rows),
            "month": [date.month for date in dates],
        }
    )

    # Introduce inflation over time (years)
    df["inflation_multiplier"] = 1 + (df["date"].dt.year - df["date"].dt.year.min()) * 0.03

    # Incorporate seasonality due to apple harvests
    df["harvest_effect"] = np.sin(2 * np.pi * (df["month"] - 3) / 12) + np.sin(
        2 * np.pi * (df["month"] - 9) / 12
    )

    # Modify the price_per_kg based on harvest effect
    df["price_per_kg"] = df["price_per_kg"] - df["harvest_effect"] * 0.5

    # Adjust promo periods to coincide with periods lagging peak harvest by 1 month
    peak_months = [4, 10]  # months following the peak availability
    df["promo"] = np.where(
        df["month"].isin(peak_months),
        1,
        np.random.choice([0, 1], n_rows, p=[0.85, 0.15]),
    )

    # Generate target variable based on features
    base_price_effect = -df["price_per_kg"] * 50
    seasonality_effect = df["harvest_effect"] * 50
    promo_effect = df["promo"] * 200

    df["demand"] = (
        base_demand
        + base_price_effect
        + seasonality_effect
        + promo_effect
        + df["weekend"] * 300
        + np.random.normal(0, 50, n_rows)
    ) * df["inflation_multiplier"]  # adding random noise

    # Add previous day's demand
    df["previous_days_demand"] = df["demand"].shift(1)
    df["previous_days_demand"] = df["previous_days_demand"].bfill()  # fill the first row

    # Drop temporary columns
    df.drop(columns=["inflation_multiplier", "harvest_effect", "month"], inplace=True)

    return df
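
One detail of the generator is worth checking before relying on its seasonality: the two sine terms in `harvest_effect` are offset by six months, which is half of the 12-month period. The snippet below is a quick sanity check using only the standard library. It reproduces the expression from the function above for each calendar month and is not a change to the lab's code.

```python
import math

# Evaluate the generator's harvest_effect expression for each calendar month.
harvest_effect = {
    month: math.sin(2 * math.pi * (month - 3) / 12)
    + math.sin(2 * math.pi * (month - 9) / 12)
    for month in range(1, 13)
}

for month, effect in harvest_effect.items():
    print(f"month {month:2d}: {effect:+.3f}")
```

Because a sine shifted by half its period is its own negative, every value comes out as zero up to floating-point rounding. As written, the harvest term therefore changes neither `price_per_kg` nor `demand`; the variation in the data comes from the price, promotion, weekend, inflation, and noise terms.
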

Now, let's generate a dataset that we can use to train our model.


In [14]:
data = generate_apple_sales_data_with_promo_adjustment(base_demand=1_000, n_rows=1_000)

data[-20:]


| | date | average_temperature | rainfall | weekend | holiday | price_per_kg | promo | demand | previous_days_demand |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 980 | 2024-08-20 11:18:05.239315 | 32.247401 | 8.172453 | 0 | 0 | 2.362199 | 0 | 971.858603 | 1165.445095 |
| 981 | 2024-08-21 11:18:05.239315 | 14.211707 | 2.721331 | 0 | 0 | 0.563754 | 0 | 1004.731896 | 971.858603 |
| 982 | 2024-08-22 11:18:05.239315 | 32.089107 | 3.127948 | 0 | 0 | 0.509169 | 1 | 1332.053165 | 1004.731896 |
| 983 | 2024-08-23 11:18:05.239315 | 27.569391 | 7.061037 | 0 | 0 | 1.440787 | 0 | 928.857826 | 1332.053165 |
| 984 | 2024-08-24 11:18:05.239315 | 20.552716 | 8.307609 | 1 | 0 | 2.584687 | 1 | 1410.495217 | 928.857826 |
| 985 | 2024-08-25 11:18:05.239315 | 31.582874 | 0.761929 | 1 | 0 | 1.624137 | 0 | 1334.913682 | 1410.495217 |
| 986 | 2024-08-26 11:18:05.239315 | 15.876081 | 0.423225 | 0 | 0 | 2.615971 | 1 | 1191.578354 | 1334.913682 |
| 987 | 2024-08-27 11:18:05.239315 | 24.093767 | 2.212338 | 0 | 0 | 0.775894 | 0 | 963.057655 | 1191.578354 |
| 988 | 2024-08-28 11:18:05.239315 | 10.664006 | 17.335405 | 0 | 0 | 0.619787 | 0 | 1153.011411 | 963.057655 |
| 989 | 2024-08-29 11:18:05.239315 | 19.501286 | 5.883864 | 0 | 0 | 2.741469 | 0 | 1098.865706 | 1153.011411 |



Logging our first runs with MLflow
==================================

In our previous segments, we worked through setting up our first MLflow
Experiment and equipped it with custom tags. These tags, as we saw, are
instrumental in retrieving related experiments that belong to a broader
project.

In the last section, we created a dataset that we'll be using to train a
series of models.

As we advance in this section, we'll delve deeper into the core features
of MLflow Tracking:

- Making use of the `start_run` context for creating and efficiently managing runs.
- An introduction to logging, covering tags, parameters, and metrics.
- Understanding the role and formation of a model signature.
- Logging a trained model, solidifying its presence in our MLflow run.

But first, let's briefly revisit the apple sales dataset from the
previous section. While it was tempting to scour the internet for one,
crafting our own dataset ensured that it aligns with our objectives.


Crafting the Apple Sales Dataset
--------------------------------

The dataset we constructed captures the dynamics of apple sales
influenced by factors like weekends, promotions, and fluctuating prices.
It serves as the bedrock upon which our predictive models will be built
and tested.

Before we start training, though, let's take a look at how the
principles we've learned so far were used when crafting this dataset
for the purposes of this lab.


### Using Experiments in early-stage project development

As the diagram below shows, I tried taking a series of shortcuts. In
order to record what I was trying, I created a new MLflow Experiment to
record the state of what I tried. Since I was using different data sets
and models, each subsequent modification that I was trying necessitated
a new Experiment.


[![Using MLflow Tracking for building this
demo](./images//dogfood-diagram.svg)](./images//dogfood-diagram.svg)



After finding a workable approach for the dataset generator, the results
can be seen in the MLflow UI.


[![Checking the results of the
test](./images//dogfood.gif)](./images//dogfood.gif)



Once I found something that actually worked, I cleaned everything up
(deleted the test experiments).


[![Tidying
up](./images//cleanup-experiments.gif)](./images//cleanup-experiments.gif)


Note

If you're following along with this lab exactly and you delete your
`Apple_Models` Experiment, recreate it
before proceeding to the next step in the lab.


Using MLflow Tracking to keep track of training
-----------------------------------------------

Now that we have our dataset and have seen a little bit of how runs are
recorded, let's dive into using MLflow to track a training
iteration.

To start with, we will need to import our required modules.

In [15]:
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

import mlflow


Notice that here we aren't importing the `MlflowClient` directly. For this portion, we're going to be
using the `fluent` API. The fluent APIs
use a globally referenced state of the MLflow tracking server's uri.
This global instance allows us to use these 'higher-level' (simpler)
APIs to perform every action that we can otherwise do with the
`MlflowClient`, with the addition of
some other useful syntax (such as the context managers that we'll be using
very shortly) to make integrating MLflow into ML workloads as simple as
possible.

In order to use the `fluent` API, we'll
need to set the global reference to the Tracking server's address. We do
this via the following command:


In [16]:

# Use the fluent API to set the tracking uri
mlflow.set_tracking_uri("http://127.0.0.1:5000")



Once this is set, we can define a few more constants that we're going to
be using when logging our training events to MLflow in the form of runs.
We'll start by defining an Experiment that will be used to log runs to.
The parent-child relationship of Experiments to Runs and its utility
will become very clear once we start iterating over some ideas and need
to compare the results of our tests.



In [17]:


# Sets the current active experiment to the "Apple_Models" experiment and returns the Experiment metadata
apple_experiment = mlflow.set_experiment("Apple_Models")

# Define a run name for this iteration of training.
# If this is not set, a unique name will be auto-generated for your run.
run_name = "apples_rf_test"

# Define an artifact path that the model will be saved to.
artifact_path = "rf_apples"

With these variables defined, we can commence with actually training a
model.

Firstly, let's look at what we're going to be running. Following the
code display, we'll look at an annotated version of the code.

In [18]:
# Split the data into features and target and drop irrelevant date field and target field
X = data.drop(columns=["date", "demand"])
y = data["demand"]

# Split the data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

params = {
    "n_estimators": 100,
    "max_depth": 6,
    "min_samples_split": 10,
    "min_samples_leaf": 4,
    "bootstrap": True,
    "oob_score": False,
    "random_state": 888,
}

# Train the RandomForestRegressor
rf = RandomForestRegressor(**params)

# Fit the model on the training data
rf.fit(X_train, y_train)

# Predict on the validation set
y_pred = rf.predict(X_val)

# Calculate error metrics
mae = mean_absolute_error(y_val, y_pred)
mse = mean_squared_error(y_val, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_val, y_pred)

# Assemble the metrics we're going to write into a collection
metrics = {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}

# Initiate the MLflow run context
with mlflow.start_run(run_name=run_name) as run:
    # Log the parameters used for the model fit
    mlflow.log_params(params)

    # Log the error metrics that were calculated during validation
    mlflow.log_metrics(metrics)

    # Log an instance of the trained model for later use
    mlflow.sklearn.log_model(sk_model=rf, input_example=X_val, artifact_path=artifact_path)
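
For reference, the four error metrics logged above follow standard definitions. The sketch below computes them by hand with only the standard library, on made-up numbers, to show how they relate to each other. `regression_metrics` is a hypothetical helper written for this illustration; the lab itself uses the scikit-learn functions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE, and R^2 from two equal-length sequences."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n

    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    ss_res = sum(r * r for r in residuals)  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares

    return {
        "mae": mae,
        "mse": mse,
        "rmse": math.sqrt(mse),  # RMSE is the square root of MSE
        "r2": 1 - ss_res / ss_tot,
    }


# Made-up values, chosen only so that the arithmetic is easy to follow.
toy_metrics = regression_metrics([100, 200, 300, 400], [110, 190, 320, 380])
print(toy_metrics)
```
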




To aid in visualizing how MLflow tracking API calls fit into an ML training code base, see the figure below.


[![Explanation of MLflow integration into ML training
code](./images//training-annotation.png)](./images//training-annotation.png)
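
Once the run has finished, the values it logged can be read back through the client as well as viewed in the UI. The helper below is a sketch under the assumption that `client` behaves like the `MlflowClient` created earlier in this lab; `summarize_run` is a hypothetical name, not an MLflow function.

```python
def summarize_run(client, run_id):
    """Fetch a finished run and return its name, parameters, and metrics.

    Hypothetical helper; `client` is expected to behave like MlflowClient.
    """
    run = client.get_run(run_id)
    return {
        "run_name": run.info.run_name,
        "params": dict(run.data.params),
        "metrics": dict(run.data.metrics),
    }


# Usage in the lab (requires the tracking server to be running):
# summary = summarize_run(client, run.info.run_id)
# pprint(summary)
```

Keep in mind that MLflow stores parameter values as strings, so a parameter logged as `100` is returned as `"100"`.
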