# MLflow + whylogs Integration

In this notebook, we will explore the [MLflow](https://mlflow.org/) integration in `whylogs`.

This example uses the data from [MLflow's tutorial](https://mlflow.org/docs/latest/tutorials-and-examples/tutorial.html) for demonstration purposes.

This tutorial showcases how you can use the whylogs integration to:
* Capture data quality metrics while training a linear regression model with MLflow
* Extract whylogs data back into an in-memory format from the MLflow backend
* Visualize this data

# Getting Started
To run this tutorial:
* Install [conda](https://conda.io/projects/conda/en/latest/user-guide/install/index.html)
* Create a new environment with conda via `conda create --name whylogs-mlflow python=3.8`
    * You'll need to activate the environment with `conda activate whylogs-mlflow`
    * You'll need to install pip into the conda environment with `conda install pip`
    * To make the environment work with Jupyter notebooks, run `pip install ipykernel` to install the kernel module
    * Install the environment as a Jupyter notebook kernel via `python -m ipykernel install --user --name=whylogs-mlflow`
* Install MLflow with scikit-learn via `pip install mlflow[extras]`
* Install whylogs with matplotlib via `pip install whylogs[viz]`
* You can also install the necessary libraries separately:
    * MLflow: `pip install mlflow`
    * whylogs: `pip install whylogs`
    * scikit-learn: `pip install scikit-learn`
    * matplotlib: `pip install matplotlib`
* In your notebook, ensure you select `whylogs-mlflow` as your kernel
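
Before running the rest of the notebook, it can help to confirm that the selected kernel actually sees the libraries listed above. The following is a minimal sketch using only the standard library. `find_missing` is a hypothetical helper name, and the module names assume each package's usual import name (for example `sklearn` for scikit-learn).

```python
import importlib.util

# Import names assumed for the packages installed above
# (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ["mlflow", "whylogs", "sklearn", "matplotlib", "pandas"]


def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If anything is reported as missing, the most likely cause is that the notebook is running on a different kernel than the `whylogs-mlflow` environment.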

# Setup
First, we filter out noisy warnings.

In [None]:
import warnings
warnings.filterwarnings("ignore")
warnings.simplefilter('ignore')


In [None]:
import random
import time

import pandas as pd
import mlflow
import whylogs
from whylogs import get_or_create_session

from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.linear_model import ElasticNet

# Enable whylogs Integration

Enable whylogs in MLflow to store a whylogs statistical profile with every run. `whylogs.enable_mlflow` returns `True` if whylogs is able to patch MLflow.

In [None]:
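from packaging.version import parse as parse_version  # packaging is a dependency of MLflow

# we need 0.1.13 or later for MLflow integration; compare numerically, not as strings
assert parse_version(whylogs.__version__) >= parse_version("0.1.13")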
session = get_or_create_session(".whylogs_mlflow.yaml")
whylogs.enable_mlflow(session)
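
Version numbers need to be compared numerically rather than as raw strings: a plain string comparison works character by character, so `"0.1.9" >= "0.1.13"` evaluates to `True`. Below is a minimal standard-library sketch of the idea, assuming plain dotted versions with integer components. `version_tuple` is a hypothetical helper, not part of whylogs.

```python
def version_tuple(version):
    """Convert a dotted version such as "0.1.13" into a tuple of ints: (0, 1, 13)."""
    return tuple(int(part) for part in version.split("."))


# String comparison is lexicographic: "9" sorts after "1", so this prints True.
print("0.1.9" >= "0.1.13")
# Tuple comparison is numeric, component by component, so this prints False.
print(version_tuple("0.1.9") >= version_tuple("0.1.13"))
```

For version strings with pre-release or build suffixes, `packaging.version.parse` covers cases this sketch does not.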

# Dataset Preparation

Download and prepare the UCI wine quality dataset. We sample the test dataset further to represent batches of data produced every second.

In [None]:
data_url = "http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
data = pd.read_csv(data_url, sep=";")
data

In [None]:
# Split the data into training and test sets
train, test = train_test_split(data)

We can quickly get a sense of the shape of the training dataset by profiling it with whylogs. Later, we can compare the baseline data metrics to the profiles of the batches as they flow through our model.

If you'd like to learn more about whylogs, check out our [introductory notebook](https://github.com/whylabs/whylogs-examples/blob/mainline/python/GettingStarted.ipynb).

In [None]:
from whylogs import get_or_create_session

session = get_or_create_session()
summary = session.profile_dataframe(train, "training-data").flat_summary()['summary']

summary

Now that we've taken a peek at our training data metrics, there's one last item on our to-do list: split the test data into batches, so we can feed them through our model later on.

In [None]:
# Relocate predicted variable "quality" to y vectors
train_x = train.drop(["quality"], axis=1).reset_index(drop=True)
test_x = test.drop(["quality"], axis=1).reset_index(drop=True)
train_y = train[["quality"]].reset_index(drop=True)
test_y = test[["quality"]].reset_index(drop=True)

subset_test_x = []
subset_test_y = []
num_batches = 20
for i in range(num_batches):
    indices = random.sample(range(len(test)), 5)
    subset_test_x.append(test_x.loc[indices, :])
    subset_test_y.append(test_y.loc[indices, :])
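
The batches above are drawn with the unseeded global `random` module, so they differ from run to run. If reproducible batches are preferred, one option is a dedicated seeded generator. This is a minimal sketch: `sample_batch_indices`, the seed value, and the row count of 400 are illustrative assumptions rather than part of the original notebook.

```python
import random


def sample_batch_indices(n_rows, num_batches, batch_size, seed=42):
    """Draw `num_batches` lists of row positions, with no repeats inside a batch."""
    rng = random.Random(seed)  # local generator; the global random state is untouched
    return [rng.sample(range(n_rows), batch_size) for _ in range(num_batches)]


# Illustrative sizes only; in the notebook n_rows would be len(test).
batches = sample_batch_indices(n_rows=400, num_batches=20, batch_size=5)
print(len(batches), len(batches[0]))
```

Each list of positions can then be passed to `test_x.loc[indices, :]` in the same way as in the cell above. Rows can still repeat across batches, since every batch is sampled independently.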

# Train the model
We'll train an ElasticNet model using scikit-learn, then run this model for each of the batches of data, logging the model parameters, mean absolute error, and whylogs data metrics.

Note that whylogs profiler data is automatically logged when `mlflow.end_run()` is called (implicitly or explicitly).

In [None]:
# Create an MLflow experiment for our demo
experiment_name = "whylogs demo"
mlflow.set_experiment(experiment_name)

model_params = {"alpha": 1.0,
                "l1_ratio": 0.7}

lr = ElasticNet(**model_params)
lr.fit(train_x, train_y)
print(f"ElasticNet model trained with params: {model_params}")

In [None]:
# run predictions on the batches of data we set up earlier and log whylogs data
for i in range(num_batches):
    with mlflow.start_run(run_name=f"Run {i+1}"):
        batch = subset_test_x[i]
        predicted_output = lr.predict(batch)

        mae = mean_absolute_error(subset_test_y[i], predicted_output)
        print("Subset %.0f, mean absolute error: %s" % (i + 1, mae))

        mlflow.log_params(model_params)
        mlflow.log_metric("mae", mae)

        # use whylogs to log data quality metrics for the current batch
        mlflow.whylogs.log_pandas(batch)

        # wait a second between runs to create a time series of prediction results
        time.sleep(1)
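
For reference, the `mae` metric logged for each run is the mean absolute error: the average of the absolute differences between true and predicted values over the rows of a batch. Here is a small pure-Python sketch of the same calculation, using made-up numbers rather than values from the wine dataset. `mean_absolute_error_py` is a hypothetical name chosen to avoid clashing with the scikit-learn function.

```python
def mean_absolute_error_py(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative values only: the absolute errors are 1, 0, 2 and 1, so the mean is 1.0
print(mean_absolute_error_py([5, 6, 5, 7], [6, 6, 7, 6]))
```

With only five rows per batch, a single poor prediction shifts this value noticeably, which is worth keeping in mind when comparing runs.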

# Accessing whylogs Data From Your Experiment
Now, let's explore our whylogs data inside the MLflow experiment.

In [None]:
client = mlflow.tracking.MlflowClient()
experiment = client.get_experiment_by_name(experiment_name)
experiment.name, experiment.experiment_id

MLflow stores the data profiles as *artifacts*. These can be retrieved in the same way you access MLflow projects, parameters, and metrics.

whylogs exposes a helper API for accessing the whylogs-specific output of an experiment.

In [None]:
whylogs.mlflow.list_whylogs_runs(experiment.experiment_id)

# Visualizing whylogs Data
Our integration allows you to quickly collect the statistical profiles produced during experimentation.

In [None]:
mlflow_profiles = whylogs.mlflow.get_experiment_profiles(experiment.experiment_id)
mlflow_profiles

You can then use `whylogs.viz` to easily produce visualizations for the whylogs profile data.

Below, you can see how the data changed over time in our batches for the column called `free sulfur dioxide`.

In [None]:
from whylogs.viz import ProfileVisualizer

viz = ProfileVisualizer()
viz.set_profiles(mlflow_profiles)

In [None]:
viz.plot_distribution("free sulfur dioxide", ts_format="%d-%b-%y %H:%M:%S")

In [None]:
viz.plot_uniqueness("free sulfur dioxide", ts_format="%d-%b-%y %H:%M:%S")

In [None]:
viz.plot_missing_values("free sulfur dioxide", ts_format="%d-%b-%y %H:%M:%S")

In [None]:
viz.plot_data_types("free sulfur dioxide", ts_format="%d-%b-%y %H:%M:%S")
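
The `ts_format` value passed to these plots is a standard `strftime` pattern, which presumably controls how timestamps are labeled on the time axis. Below is a quick sketch of what this pattern produces, using Python's `datetime` and an arbitrary example timestamp.

```python
from datetime import datetime

# Day, abbreviated month name, two-digit year, then 24-hour time.
ts_format = "%d-%b-%y %H:%M:%S"

example = datetime(2021, 3, 4, 15, 6, 7)  # arbitrary timestamp for illustration
formatted = example.strftime(ts_format)
print(formatted)  # the month abbreviation depends on the active locale
```

Because the runs in this notebook are only about a second apart, keeping seconds in the pattern is what makes the individual batches distinguishable on the axis.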

We can also plot the mean absolute error of each run for comparison.

In [None]:
import matplotlib.pyplot as plt

plt.close('all')

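# search_runs expects a list of experiment IDs
runs = mlflow.search_runs([experiment.experiment_id])
plt.figure(figsize=(10,2))
plt.plot(runs['start_time'], runs['metrics.mae'])
plt.xlabel("run start time")
plt.ylabel("mean absolute error")
plt.show()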

With whylogs, collecting and visualizing data quality metrics for your MLflow runs, at both training and inference time, takes only a few lines of code. These metrics can be invaluable when debugging model failures or optimizing model performance.

whylogs data can be visualized in more complex ways. Check out [whylogs.viz](https://whylogs.readthedocs.io/en/latest/api/whylogs.viz.html) for details on the API.

You can also check out how **WhyLabs** can help you visualize data quality metrics by visiting [our sandbox](http://try.whylabsapp.com/). Feel free to reach out on our [Slack channel](http://join.slack.whylabs.ai/) if you have any questions!