<img src="https://storage.googleapis.com/arize-assets/arize-logo-white.jpg" width="200"/>

# **Arize and MLflow Walkthrough**

Let's get started on using Arize with MLflow! ✨

**Arize** and **MLflow** are MLOps tools that improve connected but different parts of your ML pipeline and workflow.

**MLflow** is an open source platform for managing the ML lifecycle, including experimentation, reproducibility, deployment, and a model registry. **Arize** is an observability and monitoring tool that helps you validate those model experiments and versions before launch, and lets you benchmark, monitor, and visualize model performance, data drift, data quality, and explainability once the model is deployed in production.

By integrating Arize and MLflow, you can train, manage, and register the best models with reproducible results in MLflow. At the same time, lightweight integrations at different stages of your ML life cycle (training and serving) let you continuously monitor the model with Arize and ensure that the performance tracked by MLflow is preserved in production.


## ✔️ Steps for this Walkthrough
1. Examples of setting up Arize and MLflow
2. Experiment Managing with MLflow + Production Benchmarking with Arize
3. Storing and Loading model with MLflow
4. Integrating Serving End-point with Arize + MLflow
5. Key takeaways on the joint value of using the two platforms together

# Step 0. Setup and Getting the Data
We import the iris dataset and fit a multi-layer perceptron (MLP) classification model. We will show how to track experiments with MLflow.

## Install Dependencies and Import Libraries 📚

In [None]:
!pip install -q arize
!pip install -q mlflow

import uuid
import warnings

import mlflow
import numpy as np
import pandas as pd
from arize.pandas.logger import Client, Schema
from arize.utils.types import Environments, ModelTypes
from mlflow.models.signature import ModelSignature, infer_signature
from mlflow.types.schema import ColSpec, Schema
from sklearn import datasets
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

warnings.filterwarnings("ignore")

## **🌐 Download the Data**

In [None]:
iris = datasets.load_iris()
X, y = pd.DataFrame(iris["data"], columns=iris["feature_names"]), iris["target"]
# Hold out half of the data to play the role of production traffic.
# (A second split of the same X, y with the same seed would only reproduce
# this split, so a single call is enough.)
X_train, X_prod, y_train, y_prod = train_test_split(
    X, y, test_size=0.5, random_state=42
)

# Step 1.  Sending Data into Arize 💫

First, copy the Arize `API_KEY` and `SPACE_KEY` from your admin page linked below!



<img src="https://storage.googleapis.com/arize-assets/fixtures/copy-keys.jpeg" width="600">

In [None]:
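# Arize and MLflow both export a class named `Schema`, and the MLflow import above
# shadows Arize's. Import the Arize class again under its own alias.
from arize.pandas.logger import Schema as ArizeSchema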

SPACE_KEY = "SPACE_KEY"
API_KEY = "API_KEY"
arize_client = Client(space_key=SPACE_KEY, api_key=API_KEY)

if SPACE_KEY == "SPACE_KEY" or API_KEY == "API_KEY":
    raise ValueError("❌ NEED TO CHANGE SPACE AND/OR API_KEY")
else:
    print("✅ Arize setup complete!")

## Setting-up MLflow Run Record URI
For this example, we store tracking data in a local directory in the Colab environment. You can also set the tracking location to your own machine, a SQLAlchemy-compatible database, or a remote server. To read about the alternatives for your training environment, [read here](https://www.mlflow.org/docs/latest/tracking.html#where-runs-are-recorded).

In [None]:
run_uri = "file:///content/mlruns"
mlflow.set_tracking_uri(run_uri)
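
The tracking locations mentioned above differ only in the URI passed to `mlflow.set_tracking_uri`. The sketch below lists one example of each form. The helper name `tracking_uri_for`, the SQLite file name, and the remote host are hypothetical placeholders, not part of the original walkthrough.

```python
from pathlib import PurePosixPath


def tracking_uri_for(kind: str) -> str:
    """Return an example MLflow tracking URI for a backend kind.

    The helper, file names, and host are illustrative assumptions.
    """
    if kind == "local":
        # Local directory, as used in this walkthrough on Colab
        return PurePosixPath("/content/mlruns").as_uri()
    if kind == "database":
        # Any SQLAlchemy-compatible database URL, here a local SQLite file
        return "sqlite:///mlflow.db"
    if kind == "remote":
        # A remote MLflow tracking server (hypothetical host and port)
        return "http://mlflow.example.com:5000"
    raise ValueError(f"unknown tracking backend: {kind!r}")


example_uri = tracking_uri_for("local")
print(example_uri)
# Passing example_uri to mlflow.set_tracking_uri would select this location.
```

Only the URI changes between backends; the logging calls in the rest of the walkthrough stay the same.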

# Step 2. Experiment Managing with MLflow + Production Benchmarking with Arize
### MLflow and Arize can be integrated starting at the training stage!


**MLflow Tracking** provides an API for managing and logging model parameters, model version, evaluation metrics, and outputs across different training runs. The MLflow UI is useful to help you organize and find the best model before deployment.

<img src="https://storage.googleapis.com/arize-assets/fixtures/MLflow/mlflow-tracking.png" width="800">

You can easily integrate **MLflow Tracking** with **Arize** to visualize performance benchmarks and data quality across model types and versions, and to validate model readiness for production.

Let's start with a very simple example of logging training metrics and parameters using MLflow.

In [None]:
tags = {"model_id": "mlp-classifier", "version": "v1.0"}
params = {
    "hidden_layer_sizes": 3,
    "activation": "relu",
    "learning_rate_init": 0.001,
}
metrics = {
    "mse": mean_squared_error,
    "mae": mean_absolute_error,
}
# End any run left over from a previous cell before starting a new one
if mlflow.active_run() is not None:
    mlflow.end_run()

# Start a new run with MLflow tags; the run ends when the `with` block exits
with mlflow.start_run(tags=tags) as run:
    # Track the experiment with MLflow Tracking
    model = MLPClassifier(
        hidden_layer_sizes=params["hidden_layer_sizes"],
        activation=params["activation"],
        learning_rate_init=params["learning_rate_init"],
    ).fit(X_train, y_train)
    y_train_pred = model.predict(X_train)
    for key, value in params.items():
        mlflow.log_param(key, value)
    for key, metric_fn in metrics.items():
        # scikit-learn metrics take (y_true, y_pred)
        error = metric_fn(y_train, y_train_pred)
        mlflow.log_metric(key, error)
print("✅ Successfully tracked aggregate metrics and parameters to MLflow")


<img src="https://storage.googleapis.com/arize-assets/fixtures/MLflow/api-compare.png" width="900">

By using MLflow run tags (`run.data.tags`), we can create a one-to-one correspondence between the model ID and model version on MLflow and Arize. By including the `run.info.run_id` of each training run in your model `batch_id`, each training environment stays discoverable on the Arize platform later, when the model is deployed in production.
```
# Example of tracking MLflow model params to Arize
arize_client.log(
    ...
    model_id=run.data.tags['model_id'],
    model_version=run.data.tags['version'],
    batch_id=f'training {run.info.run_id}',
    ...
    environment=Environments.TRAINING,
    ...
)
```
See below for an example of matching MLflow Tracking with Arize!

In [None]:
# Logging training to Arize
logging_df = X_train.copy()
# One unique prediction ID per training row
logging_df["prediction_ids"] = [str(uuid.uuid4()) for _ in range(len(logging_df))]
logging_df["predictions"] = y_train_pred
logging_df["actuals"] = y_train

# Defining a schema to log training records
schema = ArizeSchema(
    feature_column_names=list(X.columns),
    prediction_id_column_name="prediction_ids",
    prediction_label_column_name="predictions",
    actual_label_column_name="actuals",
)

# We use the MLflow experiments to track to Arize
train_response = arize_client.log(
    dataframe=logging_df,
    model_id=run.data.tags["model_id"],
    model_version=run.data.tags["version"],
    batch_id=f"training {run.info.run_id}",
    model_type=ModelTypes.NUMERIC,
    environment=Environments.TRAINING,
    schema=schema,
)
print("✅ Successfully logged training evaluations and parameters to Arize")
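
The tag lookups in the cell above raise a bare `KeyError` if a run was started without the expected tags. Below is a minimal sketch of a guard that fails early with a clearer message. The helper name `arize_ids_from_run` is hypothetical, and it assumes the tag keys used in this walkthrough (`model_id` and `version`) and a `batch_id` built from the run ID.

```python
def arize_ids_from_run(tags: dict, run_id: str) -> dict:
    """Build the identifying keyword arguments for an Arize log call
    from MLflow run tags (hypothetical helper, not part of either SDK)."""
    required = ("model_id", "version")
    missing = [key for key in required if key not in tags]
    if missing:
        raise KeyError(f"MLflow run {run_id} is missing tags: {missing}")
    return {
        "model_id": tags["model_id"],
        "model_version": tags["version"],
        "batch_id": f"training {run_id}",
    }


# Plain values stand in for run.data.tags and run.info.run_id
ids = arize_ids_from_run({"model_id": "mlp-classifier", "version": "v1.0"}, "abc123")
print(ids)
```

The returned dictionary could be unpacked into the log call with `**ids`, which keeps the tag-to-argument mapping in one place.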

## **Takeaways**: Why track your model to both MLflow and Arize?
Logging your training and validation data to both platforms adds joint value to your ML workflow.

1. MLflow collects **aggregate metrics** (error across the entire training/validation set). Arize is an **evaluation store**, allowing you to check the error rate across slices of your data (e.g., where a feature is less than a specific value).
2. MLflow allows you to track **training benchmark** metrics in the training and validation environments, so you can tune your parameters knowing your experiments won't be lost. Arize tracks the same training/validation data, calculates the same metrics, and later uses them as the **production baseline** to ensure that they are preserved in production.
3. Matching `run_id` with `batch_id` is a best practice for reproducibility: your training and validation environments stay reproducible and are visualized on Arize.
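
To make the difference between an aggregate metric and a sliced metric concrete, here is a small pandas sketch. The records are made-up toy values and the threshold of 2.0 cm is an arbitrary assumption. It only illustrates the idea and is not how Arize computes slices internally.

```python
import pandas as pd

# Toy records standing in for logged predictions (values are made up)
records = pd.DataFrame(
    {
        "petal length (cm)": [1.4, 1.5, 4.7, 5.1, 5.9, 4.5],
        "predictions": [0, 0, 1, 1, 2, 2],
        "actuals": [0, 0, 1, 2, 2, 1],
    }
)
records["abs_error"] = (records["predictions"] - records["actuals"]).abs()

# Aggregate metric: a single number for the whole set
overall_mae = records["abs_error"].mean()

# Sliced metric: the same error, split on a feature threshold
is_short = records["petal length (cm)"] < 2.0
records["slice"] = is_short.map({True: "short petals", False: "long petals"})
mae_by_slice = records.groupby("slice")["abs_error"].mean()

print(f"overall MAE: {overall_mae:.3f}")
print(mae_by_slice)
```

In this toy data the aggregate error looks moderate, while the split shows that all of the error comes from the long-petal slice.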

# Step 3. Storing and Loading Model in Production
An MLflow Model is a standard format for packaging machine learning models that can be used in a variety of downstream tools. In this example, we show how to use an MLflow `ModelSignature` (built from `mlflow.types.schema`) to store and load our `sklearn` model.

The model directory packaging can then be used in production, whether through REST API or through another serving end-point such as SageMaker, Azure, Algorithmia, etc.

In [None]:
MLflowSchema = Schema  # Alias MLflow's Schema, since Arize's Schema has the same name

conda_env = {
    "name": "mlflow-env",
    "channels": ["conda-forge"],
    "dependencies": [
        "python=3.8.8",
        "pip",
        # pip packages are nested under "dependencies" in a conda environment spec
        {"pip": ["mlflow", "scikit-learn==0.23.2"]},
    ],
}

input_schema = MLflowSchema(
    [
        ColSpec("double", "sepal length (cm)"),
        ColSpec("double", "sepal width (cm)"),
        ColSpec("double", "petal length (cm)"),
        ColSpec("double", "petal width (cm)"),
    ]
)
output_schema = MLflowSchema([ColSpec("long")])
signature = ModelSignature(inputs=input_schema, outputs=output_schema)

# Storing model in your active run
mlflow.sklearn.log_model(
    model, "model_directory", conda_env=conda_env, signature=signature
)
print("✅ Successfully stored model in current run directory")

In [None]:
# Reloading from your run_uri directory
# This also works if you are using a remote directory
production_model = mlflow.sklearn.load_model(
    run_uri + f"/0/{mlflow.active_run().info.run_id}/artifacts/model_directory"
)
production_model

# Step 4. Integrating Serving End-Point with MLflow Signature and Arize Schema

**MLflow Signature** is used to specify the input (and output) format of your data, enforcing shapes, data types, and/or feature names. The input requirements are checked before calls are made to the underlying model (e.g., an `sklearn` model), adding an abstraction layer that guards against input mismatches in production.

**Arize Schema** is used by the Arize SDK to specify model and input data format such as feature column names and prediction column name. It can be easily integrated with the **MLflow Signature** approach, allowing your model schema to be automatically detected as you version your model.

<img src="https://storage.googleapis.com/arize-assets/fixtures/MLflow/signature_inference.png" width="900">

The `ColSpec` abstraction of feature columns allows for Arize to detect and construct the `Schema` object at the production serving end-point. 
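
The sketch below illustrates both ideas with plain pandas: checking an input frame against a list of column specs before prediction, and reusing the same specs as the feature names for monitoring. It is a simplified stand-in under stated assumptions, not MLflow's or Arize's actual implementation. `COLUMN_SPECS` and `check_against_specs` are hypothetical names, and only the `double` type is handled.

```python
import numpy as np
import pandas as pd

# (type, name) pairs mirroring the ColSpec entries used in this walkthrough
COLUMN_SPECS = [
    ("double", "sepal length (cm)"),
    ("double", "sepal width (cm)"),
    ("double", "petal length (cm)"),
    ("double", "petal width (cm)"),
]


def check_against_specs(df, specs):
    """Validate a DataFrame against (type, name) specs and return the feature names."""
    feature_names = [name for _, name in specs]
    missing = [name for name in feature_names if name not in df.columns]
    if missing:
        raise ValueError(f"input is missing columns: {missing}")
    for dtype, name in specs:
        # Only the "double" type is checked in this sketch
        if dtype == "double" and df[name].dtype != np.dtype("float64"):
            raise TypeError(f"column {name!r} must be float64, got {df[name].dtype}")
    return feature_names


sample = pd.DataFrame(
    [[5.1, 3.5, 1.4, 0.2]], columns=[name for _, name in COLUMN_SPECS]
)
checked_names = check_against_specs(sample, COLUMN_SPECS)
print(checked_names)
```

Because the feature names come from the same specs that gate the model input, the monitoring schema cannot drift away from what the model actually receives.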

## Use Case Example

Suppose that for model version 2.0 **you add or remove features** in your MLflow model. When you change your **MLflow Signature**, Arize can directly infer the change in your model schema, which makes the production monitoring pipeline painless to manage. The code below is an example of how to let Arize automatically infer the model structure from the MLflow Signature.

**Note: This also ensures an exact one-to-one mapping of feature names between MLflow and Arize.**
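
As a quick illustration of what changes between versions, the feature lists of two signatures can be compared directly. This is a minimal sketch: `diff_features` is a hypothetical helper, and the two lists stand in for the input names of an old and a new signature.

```python
def diff_features(old_names, new_names):
    """Report which feature names were added or removed between two versions."""
    old_set, new_set = set(old_names), set(new_names)
    return {
        "added": [name for name in new_names if name not in old_set],
        "removed": [name for name in old_names if name not in new_set],
    }


v1_features = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]
# Version 2.0 drops one feature, as in the example later in this step
v2_features = v1_features[:3]

changes = diff_features(v1_features, v2_features)
print(changes)
```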

In this example, we use the function `predict_and_track` to both predict and log automatically to Arize.

In [None]:
def predict_and_track(model, model_version, X_prod, prediction_ids):
    """
    Example function that shows how to integrate Arize with MLflow Signature
    """
    y_pred = model.predict(X_prod)
    # Step 1: Discover model schema from MLflow
    signature = infer_signature(X_prod, y_pred)
    # Extract the feature columns from our signature
    # (`input_names` replaces the deprecated `column_names`)
    feature_column_names = signature.inputs.input_names()
    schema = ArizeSchema(
        feature_column_names=feature_column_names,
        prediction_id_column_name="prediction_ids",
        prediction_label_column_name="predictions",
    )
    # Step 2: Log to Arize with the same schema
    logging_df = X_prod.copy()
    logging_df["prediction_ids"] = prediction_ids
    logging_df["predictions"] = y_pred
    logging_response = arize_client.log(
        dataframe=logging_df,
        model_id="mlp-classifier",
        model_version=model_version,
        model_type=ModelTypes.NUMERIC,
        environment=Environments.PRODUCTION,
        schema=schema,
    )
    # Step 3: Return production predictions
    print(
        f"✅ Successfully tracked model with {len(feature_column_names)} features: {feature_column_names}"
    )
    return y_pred

Here, we use our originally defined Schema object. 

In [None]:
# Reusing the same Schema
input_schema = MLflowSchema(
    [
        ColSpec("double", "sepal length (cm)"),
        ColSpec("double", "sepal width (cm)"),
        ColSpec("double", "petal length (cm)"),
        ColSpec("double", "petal width (cm)"),
    ]
)
output_schema = MLflowSchema([ColSpec("long")])
signature = ModelSignature(inputs=input_schema, outputs=output_schema)

# Saving and reloading model
mlflow.sklearn.log_model(
    model, "model_directory", conda_env=conda_env, signature=signature
)
loaded_model = mlflow.sklearn.load_model(
    run_uri + f"/0/{mlflow.active_run().info.run_id}/artifacts/model_directory"
)

# Simulate production environment with the reloaded 4-feature model
prediction_ids = pd.Series(
    (str(uuid.uuid4()) for _ in range(len(X_prod))), index=X_prod.index
)
y_prod_pred = predict_and_track(
    loaded_model, "production v1.0", X_prod, prediction_ids
)

### **Note:** In this section, we didn't have to modify the Arize Schema, since it is inferred directly from the MLflow Schema in production. You only need to define your MLflow Schema.

In [None]:
# Fit a new model version with only 3 features
X_train_3feature = X_train.drop(columns=["petal width (cm)"])
X_prod_3feature = X_prod.drop(columns=["petal width (cm)"])
new_model = MLPClassifier().fit(X_train_3feature, y_train)

# Redefine our Schema from MLflow and rebuild the signature from it
input_schema = MLflowSchema(
    [
        ColSpec("double", "sepal length (cm)"),
        ColSpec("double", "sepal width (cm)"),
        ColSpec("double", "petal length (cm)"),
        # ColSpec("double", "petal width (cm)") we remove this feature to simulate iterating
    ]
)
signature = ModelSignature(inputs=input_schema, outputs=output_schema)

# Saving and reloading model (the new 3-feature model with its new signature)
mlflow.sklearn.log_model(
    new_model, "model_directory", conda_env=conda_env, signature=signature
)
loaded_model = mlflow.sklearn.load_model(
    run_uri + f"/0/{mlflow.active_run().info.run_id}/artifacts/model_directory"
)

# Simulate production environment with a new v2.0 model
prediction_ids = pd.Series(
    (str(uuid.uuid4()) for _ in range(len(X_prod_3feature))),
    index=X_prod_3feature.index,
)
y_prod_pred_3feature = predict_and_track(
    new_model, "production v2.0", X_prod_3feature, prediction_ids
)

# Appendix: Integrating With Deployment Tools as Serving End-Point
If you are using one of the MLflow [Built-in Deployment Tools](https://www.mlflow.org/docs/latest/models.html#built-in-deployment-tools) such as SageMaker or Azure, you may also want to see our serving end-point integrations below.

[![Buttons_OpenOrange.png](https://storage.googleapis.com/arize-assets/fixtures/Buttons_OpenOrange.png)](https://docs.arize.com/arize/integrations/integrations)

# 📚 **Final Takeaways**: Why use Arize with MLflow?
Logging your training and validation data to both platforms adds joint value to your ML workflow.

1. MLflow collects **aggregate metrics** (error across the entire training/validation set). Arize is an **evaluation store**, allowing you to check the error rate across slices of your data (e.g., where a feature is less than a specific value).
2. MLflow allows you to track **training benchmark** metrics in the training and validation environments, so you can tune your parameters knowing your experiments won't be lost. Arize tracks the same training/validation data, calculates the same metrics, and later uses them as the **production baseline** to ensure that they are preserved in production.
3. Integrating Arize with MLflow at the production serving end-point is simple. Redefining the MLflow `Schema` object each time you rename, add, or remove features allows Arize to infer feature information about your models.
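
The production baseline idea in point 2 can be sketched in a few lines: compute a metric on training data, compute the same metric on production data, and flag a gap. The toy labels and the tolerance of 0.1 are arbitrary assumptions, and the helper names `mae` and `has_degraded` are hypothetical. This only illustrates the comparison and does not reflect how Arize evaluates baselines.

```python
import numpy as np


def mae(actuals, predictions):
    """Mean absolute error between two equal-length sequences."""
    return float(np.mean(np.abs(np.asarray(actuals) - np.asarray(predictions))))


def has_degraded(baseline, production, tolerance=0.1):
    """True when the production error exceeds the baseline by more than tolerance."""
    return production > baseline + tolerance


# Made-up labels: the training set plays the role of the baseline
train_mae = mae([0, 1, 2, 1], [0, 1, 2, 2])
prod_mae = mae([0, 1, 2, 1], [1, 1, 0, 2])

print(train_mae, prod_mae, has_degraded(train_mae, prod_mae))
```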

In [None]:
print("🎉 You are done!")

# Overview
Arize is an end-to-end ML observability and model monitoring platform. The platform is designed to help ML engineers and data science practitioners surface and fix issues with ML models in production faster with:
- Automated ML monitoring and model monitoring
- Workflows to troubleshoot model performance
- Real-time visualizations for model performance monitoring, data quality monitoring, and drift monitoring
- Model prediction cohort analysis
- Pre-deployment model validation
- Integrated model explainability

### Website
Visit Us At: https://arize.com/model-monitoring/

### Additional Resources
- [What is ML observability?](https://arize.com/what-is-ml-observability/)
- [Playbook to model monitoring in production](https://arize.com/the-playbook-to-monitor-your-models-performance-in-production/)
- [Using statistical distance metrics for ML monitoring and observability](https://arize.com/using-statistical-distance-metrics-for-machine-learning-observability/)
- [ML infrastructure tools for data preparation](https://arize.com/ml-infrastructure-tools-for-data-preparation/)
- [ML infrastructure tools for model building](https://arize.com/ml-infrastructure-tools-for-model-building/)
- [ML infrastructure tools for production](https://arize.com/ml-infrastructure-tools-for-production-part-1/)
- [ML infrastructure tools for model deployment and model serving](https://arize.com/ml-infrastructure-tools-for-production-part-2-model-deployment-and-serving/)
- [ML infrastructure tools for ML monitoring and observability](https://arize.com/ml-infrastructure-tools-ml-observability/)

Visit the [Arize Blog](https://arize.com/blog) and [Resource Center](https://arize.com/resource-hub/) for more resources on ML observability and model monitoring.
