# MLFlow Pre-packaged Model Server AB Test Deployment 
In this example we will build two models with MLFlow and deploy them as an A/B test. This is useful because it lets you deploy a new model next to the old one and route a percentage of the traffic to it. These deployment strategies are simple to set up with Seldon, and can be extended to shadow deployments, multi-armed bandits, etc.

## Tutorial Overview

This tutorial is broken down into the following sections:

1. Train the MLFlow elastic net wine example

2. Deploy your trained model leveraging our pre-packaged MLFlow model server

3. Test the deployed MLFlow model by sending requests

4. Deploy your second model as an A/B test

5. Visualise and monitor the performance of your models using Seldon Analytics

It closely follows our talk at the [Spark + AI Summit 2019 on Seldon and MLflow](https://www.youtube.com/watch?v=D6eSfd9w9eA).

## Dependencies

For this example to work you must be running Seldon 0.3.2 or above - you can follow our [getting started guide for this](https://docs.seldon.io/projects/seldon-core/en/latest/workflow/install.html).

As for other dependencies, make sure you have installed:

* Helm v3.0.0+
* kubectl v1.14+
* Python 3.6+
* MLFlow 1.1.0
* [pygmentize](https://pygments.org/docs/cmdline/)

We will also take this chance to load the Python dependencies we will use throughout the tutorial:

In [None]:
### Installation of packages
!pip install --upgrade pip
!pip install pandas
!pip install seldon-core

In [3]:
import numpy as np
import pandas as pd

from seldon_core.seldon_client import SeldonClient

Collecting pip
  Using cached pip-22.3.1-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 22.1.2
    Uninstalling pip-22.1.2:
      Successfully uninstalled pip-22.1.2
Successfully installed pip-22.3.1


ModuleNotFoundError: No module named 'seldon_core'

Let's get started! 🚀🔥

## 1. Train the first MLFlow Elastic Net Wine example

For our example, we will use the elastic net wine example from [MLflow's tutorial](https://github.com/mlflow/mlflow/tree/master/examples/sklearn_elasticnet_wine).

In [83]:
!git clone https://github.com/mlflow/mlflow.git
!ls ./mlflow/examples/sklearn_elasticnet_wine

fatal: destination path 'mlflow' already exists and is not an empty directory.
MLproject        python_env.yaml  train.py
conda.yaml       train.ipynb      wine-quality.csv


### MLproject

Like any other MLflow project, it is defined by its `MLproject` file:

In [84]:
PROJECT_DIR='./mlflow/examples/sklearn_elasticnet_wine'
!pygmentize -l yaml $PROJECT_DIR/MLproject

[94mname[39;49;00m:[37m [39;49;00mtutorial[37m[39;49;00m
[37m[39;49;00m
[94mpython_env[39;49;00m:[37m [39;49;00mpython_env.yaml[37m[39;49;00m
[37m[39;49;00m
[94mentry_points[39;49;00m:[37m[39;49;00m
[37m  [39;49;00m[94mmain[39;49;00m:[37m[39;49;00m
[37m    [39;49;00m[94mparameters[39;49;00m:[37m[39;49;00m
[37m      [39;49;00m[94malpha[39;49;00m:[37m [39;49;00m{[94mtype[39;49;00m:[37m [39;49;00m[31mfloat[39;49;00m,[94m default[39;49;00m:[37m [39;49;00m[31m0.5[39;49;00m}[37m[39;49;00m
[37m      [39;49;00m[94ml1_ratio[39;49;00m:[37m [39;49;00m{[94mtype[39;49;00m:[37m [39;49;00m[31mfloat[39;49;00m,[94m default[39;49;00m:[37m [39;49;00m[31m0.1[39;49;00m}[37m[39;49;00m
[37m    [39;49;00m[94mcommand[39;49;00m:[37m [39;49;00m[33m"[39;49;00m[33mpython[39;49;00m[31m [39;49;00m[33mtrain.py[39;49;00m[31m [39;49;00m[33m{alpha}[39;49;00m[31m [39;49;00m[33m{l1_ratio}[39;49;00m[33m"[39;49;00m[37m

We can see that this project declares its environment in `python_env.yaml`. The repository also ships a Conda definition of the environment in the `conda.yaml` file:

In [85]:
!pygmentize $PROJECT_DIR/conda.yaml

[94mname[39;49;00m:[37m [39;49;00mtutorial[37m[39;49;00m
[94mchannels[39;49;00m:[37m[39;49;00m
[37m  [39;49;00m-[37m [39;49;00mconda-forge[37m[39;49;00m
[94mdependencies[39;49;00m:[37m[39;49;00m
[37m  [39;49;00m-[37m [39;49;00mpython=3.8[37m[39;49;00m
[37m  [39;49;00m-[37m [39;49;00mpip[37m[39;49;00m
[37m  [39;49;00m-[37m [39;49;00m[94mpip[39;49;00m:[37m[39;49;00m
[37m      [39;49;00m-[37m [39;49;00mscikit-learn==0.23.2[37m[39;49;00m
[37m      [39;49;00m-[37m [39;49;00mmlflow>=1.0[37m[39;49;00m
[37m      [39;49;00m-[37m [39;49;00mpandas[37m[39;49;00m


Lastly, we can also see that the training will be performed by the `train.py` file, which receives two parameters `alpha` and `l1_ratio`:

In [86]:
!pygmentize $PROJECT_DIR/train.py

# The data set used in this example is from http://archive.ics.uci.edu/ml/datasets/Wine+Quality
# P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
# Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.

import os
import sys

import pandas as pd
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn

### Dataset

We will use the wine quality dataset.
Let's load it to see what's inside:

In [88]:
filename = "./mlflow/examples/sklearn_elasticnet_wine/wine-quality.csv"
data = pd.read_csv(filename)
data.head()

| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7.0 | 0.27 | 0.36 | 20.7 | 0.045 | 45.0 | 170.0 | 1.001 | 3.0 | 0.45 | 8.8 | 6 |
| 1 | 6.3 | 0.3 | 0.34 | 1.6 | 0.049 | 14.0 | 132.0 | 0.994 | 3.3 | 0.49 | 9.5 | 6 |
| 2 | 8.1 | 0.28 | 0.4 | 6.9 | 0.05 | 30.0 | 97.0 | 0.9951 | 3.26 | 0.44 | 10.1 | 6 |
| 3 | 7.2 | 0.23 | 0.32 | 8.5 | 0.058 | 47.0 | 186.0 | 0.9956 | 3.19 | 0.4 | 9.9 | 6 |
| 4 | 7.2 | 0.23 | 0.32 | 8.5 | 0.058 | 47.0 | 186.0 | 0.9956 | 3.19 | 0.4 | 9.9 | 6 |


### Training

We've set up our MLflow project and our dataset is ready, so we are now good to start training.
MLflow allows us to train our model with the following command:

``` bash
$ mlflow run . -P alpha=... -P l1_ratio=...
```

On each run, `mlflow` will set up the environment declared by the project (in the run below, a virtualenv built from `python_env.yaml`) and will run the training command defined in the `MLproject` file.

In [89]:
# !pip install mlflow
!mlflow run $PROJECT_DIR -P alpha=0.5 -P l1_ratio=0.5

2022/12/07 04:42:54 INFO mlflow.utils.virtualenv: Installing python 3.8.14 if it does not exist
2022/12/07 04:42:54 INFO mlflow.utils.virtualenv: Environment /Users/dileep.gadiraju/.mlflow/envs/mlflow-eee90753fcbd811c61a366651b95e9611177504d already exists
2022/12/07 04:42:55 INFO mlflow.projects.utils: === Created directory /var/folders/p9/nndbrbws0j9ghcfk3yw_0smh0000gp/T/tmpqztfkxys for downloading remote URIs passed to arguments of type 'path' ===
2022/12/07 04:42:55 INFO mlflow.projects.backend.local: === Running command 'source /Users/dileep.gadiraju/.mlflow/envs/mlflow-eee90753fcbd811c61a366651b95e9611177504d/bin/activate && python train.py 0.5 0.5' in run with ID '80c1581b2caa4ab4b42b2e1d882f2293' === 
Elasticnet model (alpha=0.500000, l1_ratio=0.500000):
  RMSE: 0.7931640229276851
  MAE: 0.6271946374319586
  R2: 0.10862644997792614
2022/12/07 04:43:10 INFO mlflow.projects: === Run (ID '80c1581b2caa4ab4b42b2e1d882f2293') succeeded ===
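
The three numbers printed at the end of the run are standard regression metrics. As a minimal sketch of how they are defined (the tutorial's `train.py` imports the scikit-learn equivalents; the helper name `regression_metrics` and the toy values are only illustrative):

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # R2 compares the residual error with the error of always predicting the mean.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot
    return math.sqrt(mse), mae, r2


# Toy quality scores, not taken from the wine dataset.
rmse, mae, r2 = regression_metrics([6, 5, 7, 6], [5.5, 5.5, 6.5, 6.0])
print(f"RMSE={rmse:.3f} MAE={mae:.3f} R2={r2:.3f}")
```

Read this way, the R2 of roughly 0.11 in the log above means the model explains about 11% of the variance in wine quality.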


Each of these commands will create a new run which can be visualised through the MLFlow dashboard as per the screenshot below.

![mlflow-dashboard](images/mlflow-dashboard.png)


Each of these models can be found in the `mlruns` folder:

In [90]:
!tree -L 1 mlruns/0

[01;34mmlruns/0[0m
├── [01;34m02024ee8d05741b5b0f71938007e3ea7[0m
├── [01;34m5072d00e159e41d08fe3b46eb678a84e[0m
├── [01;34m80c1581b2caa4ab4b42b2e1d882f2293[0m
└── [00mmeta.yaml[0m

3 directories, 1 file


### MLmodel

Inside each of these folders, MLflow stores the parameters we used to train our model, any metric we logged during training, and a snapshot of our model.
If we look into one of them, we can see the following structure:

In [91]:
!tree mlruns/0/$(ls mlruns/0 | head -1)

mlruns/0/02024ee8d05741b5b0f71938007e3ea7
├── artifacts
│   └── model
│       ├── MLmodel
│       ├── conda.yaml
│       ├── model.pkl
│       ├── python_env.yaml
│       └── requirements.txt
├── meta.yaml
├── metrics
│   ├── mae
│   ├── r2
│   └── rmse
├── params
│   ├── alpha
│   └── l1_ratio
└── tags
    ├── mlflow.gitRepoURL
    ├── mlflow.log-model.history
    ├── mlflow.project.backend
    ├── mlflow.project.entryPoint
    ├── mlflow.project.env
    ├── mlflow.runName
    ├── mlflow.source.git.commit
    ├── mlflow.source.git.repoURL
    ├── mlflow.source.name
    ├── mlflow.source.type
    └── mlflow.user

5 directories, 22 files
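
Because a run is just a folder of small text files, it can be inspected without the MLflow UI. The sketch below is an assumption-laden illustration: `read_run` is a hypothetical helper, and it assumes each metric file holds lines of the form `<timestamp> <value> <step>` and each param file holds the raw value. It is demonstrated on a throwaway folder rather than a real run.

```python
from pathlib import Path
import tempfile


def read_run(run_dir):
    """Collect params and the latest value of each metric from a run folder."""
    run_dir = Path(run_dir)
    params = {p.name: p.read_text().strip() for p in (run_dir / "params").iterdir()}
    metrics = {}
    for m in (run_dir / "metrics").iterdir():
        # Assumed line format: "<timestamp> <value> <step>"; keep the last line.
        last_line = m.read_text().strip().splitlines()[-1]
        metrics[m.name] = float(last_line.split()[1])
    return params, metrics


# Demo on a temporary folder that mimics the layout shown above.
with tempfile.TemporaryDirectory() as tmp:
    run = Path(tmp) / "example-run"
    (run / "params").mkdir(parents=True)
    (run / "metrics").mkdir()
    (run / "params" / "alpha").write_text("0.5")
    (run / "params" / "l1_ratio").write_text("0.5")
    (run / "metrics" / "rmse").write_text("1670388190000 0.79 0\n")
    params, metrics = read_run(run)

print(params, metrics)
```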


In particular, we are interested in the `MLmodel` file stored under `artifacts/model`:

In [92]:
!pygmentize -l yaml mlruns/0/$(ls mlruns/0 | head -1)/artifacts/model/MLmodel

[94martifact_path[39;49;00m:[37m [39;49;00mmodel[37m[39;49;00m
[94mflavors[39;49;00m:[37m[39;49;00m
[37m  [39;49;00m[94mpython_function[39;49;00m:[37m[39;49;00m
[37m    [39;49;00m[94menv[39;49;00m:[37m[39;49;00m
[37m      [39;49;00m[94mconda[39;49;00m:[37m [39;49;00mconda.yaml[37m[39;49;00m
[37m      [39;49;00m[94mvirtualenv[39;49;00m:[37m [39;49;00mpython_env.yaml[37m[39;49;00m
[37m    [39;49;00m[94mloader_module[39;49;00m:[37m [39;49;00mmlflow.sklearn[37m[39;49;00m
[37m    [39;49;00m[94mmodel_path[39;49;00m:[37m [39;49;00mmodel.pkl[37m[39;49;00m
[37m    [39;49;00m[94mpredict_fn[39;49;00m:[37m [39;49;00mpredict[37m[39;49;00m
[37m    [39;49;00m[94mpython_version[39;49;00m:[37m [39;49;00m3.8.14[37m[39;49;00m
[37m  [39;49;00m[94msklearn[39;49;00m:[37m[39;49;00m
[37m    [39;49;00m[94mcode[39;49;00m:[37m [39;49;00mnull[37m[39;49;00m
[37m    [39;49;00m[94mpickled_model[39;49;00m:[37m [39;49

This file stores the details of how the model was stored.
With this information (plus the other files in the folder), we are able to load the model back.
Seldon's MLflow server will use this information to serve this model.
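
To make "loading the model back" concrete, here is a minimal sketch that uses MLflow's generic `python_function` loader on a local run folder. The helper names `model_dir` and `load_and_predict` are hypothetical, and this only illustrates the idea; it is not a description of how Seldon's server is implemented internally.

```python
from pathlib import Path


def model_dir(run_dir):
    """Path of the model snapshot inside a run folder (layout shown above)."""
    return Path(run_dir) / "artifacts" / "model"


def load_and_predict(run_dir, features):
    """Load the snapshot as a generic python_function model and predict.

    `features` is expected to be a pandas DataFrame holding the 11 wine columns.
    """
    # Imported lazily so that model_dir() works even without MLflow installed.
    import mlflow.pyfunc

    model = mlflow.pyfunc.load_model(str(model_dir(run_dir)))
    return model.predict(features)


# "<run-id>" is a placeholder for one of the folders under mlruns/0.
print(model_dir("mlruns/0/<run-id>"))
```

With MLflow and the training environment available, `load_and_predict("mlruns/0/<run-id>", data.drop(["quality"], axis=1))` should return one predicted quality score per row.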

The next step is to upload our newly trained model to a public Google Cloud Storage or S3 bucket.
To keep things simple, we have already done this, and the model is available at `gs://seldon-models/mlflow/model-a`.

## 2. Deploy your model using the Pre-packaged Model Server for MLFlow

Now we can deploy our trained MLFlow model.

For this we have to create a Seldon definition of the model server, which we will break down further below.

We will be using the model we uploaded to our Google bucket (`gs://seldon-models/mlflow/model-a`), but you can use your own model if you uploaded it to a public bucket.

### Setup Seldon Core

Use the setup notebook to [Setup Cluster](https://docs.seldon.io/projects/seldon-core/en/latest/examples/seldon_core_setup.html#Setup-Cluster) with [Ambassador Ingress](https://docs.seldon.io/projects/seldon-core/en/latest/examples/seldon_core_setup.html#Ambassador) and [Install Seldon Core](https://docs.seldon.io/projects/seldon-core/en/latest/examples/seldon_core_setup.html#Install-Seldon-Core). Instructions [also online](https://docs.seldon.io/projects/seldon-core/en/latest/examples/seldon_core_setup.html).

In [93]:
!pygmentize ./seldon-core/examples/models/mlflow_server_ab_test_ambassador/mlflow-model-server-seldon-config.yaml

[04m[36m---[39;49;00m[37m[39;49;00m
[94mapiVersion[39;49;00m:[37m [39;49;00mmachinelearning.seldon.io/v1alpha2[37m[39;49;00m
[94mkind[39;49;00m:[37m [39;49;00mSeldonDeployment[37m[39;49;00m
[94mmetadata[39;49;00m:[37m[39;49;00m
[37m  [39;49;00m[94mname[39;49;00m:[37m [39;49;00mmlflow-deployment[37m[39;49;00m
[94mspec[39;49;00m:[37m[39;49;00m
[37m  [39;49;00m[94mname[39;49;00m:[37m [39;49;00mmlflow-deployment[37m[39;49;00m
[37m  [39;49;00m[94mpredictors[39;49;00m:[37m[39;49;00m
[37m    [39;49;00m-[37m [39;49;00m[94mgraph[39;49;00m:[37m[39;49;00m
[37m        [39;49;00m[94mchildren[39;49;00m:[37m [39;49;00m[][37m[39;49;00m
[37m        [39;49;00m[94mimplementation[39;49;00m:[37m [39;49;00mMLFLOW_SERVER[37m[39;49;00m
[37m        [39;49;00m[94mmodelUri[39;49;00m:[37m [39;49;00mgs://seldon-models/mlflow/model-a[37m[39;49;00m
[37m        [39;49;00m[94mname[39;49;00m:[37m [39;49;00mwines-classifier

In [94]:
### Install Ambassador Edge Stack
!helm install ambassador datawire/ambassador \
    --set image.repository=docker.io/datawire/ambassador \
    --set crds.keep=false \
    --namespace seldon-system

Error: INSTALLATION FAILED: cannot re-use a name that is still in use


Once we have written our configuration file, we can deploy it to our cluster with `kubectl apply`:

In [98]:
!cat ./seldon-core/examples/models/mlflow_server_ab_test_ambassador/mlflow-model-server-seldon-config.yaml
!kubectl delete -f ./seldon-core/examples/models/mlflow_server_ab_test_ambassador/mlflow-model-server-seldon-config.yaml
!kubectl apply -f ./seldon-core/examples/models/mlflow_server_ab_test_ambassador/mlflow-model-server-seldon-config.yaml

---
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: mlflow-deployment
spec:
  name: mlflow-deployment
  predictors:
    - graph:
        children: []
        implementation: MLFLOW_SERVER
        modelUri: gs://seldon-models/mlflow/model-a
        name: wines-classifier
      name: mlflow-deployment-dag
      replicas: 1
Error from server (NotFound): error when deleting "./seldon-core/examples/models/mlflow_server_ab_test_ambassador/mlflow-model-server-seldon-config.yaml": seldondeployments.machinelearning.seldon.io "mlflow-deployment" not found
Error from server (InternalError): error when creating "./seldon-core/examples/models/mlflow_server_ab_test_ambassador/mlflow-model-server-seldon-config.yaml": Internal error occurred: failed calling webhook "v1.vseldondeployment.kb.io": Post "https://seldon-webhook-service.seldon-system.svc:443/validate-machinelearning-seldon-io-v1-seldondeployment?timeout=10s": dial tcp 10.96.28.175:443: connect: connec

Once it's created we just wait until it's deployed. 

Seldon will download the image for the pre-packaged MLFlow model server and initialise it with the model we specified above.

You can check the status of the deployment with the following command:

In [72]:
!kubectl rollout status deployment.apps/mlflow-deployment-mlflow-deployment-dag-0-wines-classifier

Waiting for deployment "mlflow-deployment-mlflow-deployment-dag-0-wines-classifier" rollout to finish: 0 of 1 updated replicas are available...
^C


Once it's deployed, we should see a "successfully rolled out" message above. We can now test it!

## 3. Test the deployed MLFlow model by sending requests
Now that our model is deployed in Kubernetes, we can send requests to it.

We will first need the URL that is currently available through Ambassador. 

If you are running this locally, you should be able to reach it through localhost; in this case we can use port 80.

In [96]:
!kubectl get svc -n seldon-system | grep ambassador

ambassador               LoadBalancer   10.96.209.201   <pending>     80:30785/TCP,443:31534/TCP   28m
ambassador-admin         ClusterIP      10.96.11.228    <none>        8877/TCP,8005/TCP            28m
ambassador-redis         ClusterIP      10.96.152.118   <none>        6379/TCP                     28m


Now we will select the first datapoint in our dataset to send to the model.

In [80]:
x_0 = data.drop(["quality"], axis=1).values[:1]
print(list(x_0[0]))

[7.0, 0.27, 0.36, 20.7, 0.045, 45.0, 170.0, 1.001, 3.0, 0.45, 8.8]


We can try sending a request first using curl:

In [None]:
## Ensure gateway port forwarded using 
## kubectl port-forward $(kubectl get pods -l istio=ingressgateway -n istio-system -o jsonpath='{.items[0].metadata.name}') -n istio-system 8004:8080

In [81]:
!curl -X POST -H 'Content-Type: application/json' \
    -d '{"data": {"names": [], "ndarray": [[7.0, 0.27, 0.36, 20.7, 0.045, 45.0, 170.0, 1.001, 3.0, 0.45, 8.8]]}}' \
    http://localhost:8004/seldon/seldon/mlflow-deployment/api/v0.1/predictions
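
The same request body and URL can be assembled programmatically. This is a small sketch using only the standard library; `build_payload` and `prediction_url` are hypothetical helper names, and the host and port simply mirror the curl call above.

```python
import json


def build_payload(rows):
    """Wrap a list of feature rows in the ndarray payload used by the curl call."""
    return {"data": {"names": [], "ndarray": [list(map(float, r)) for r in rows]}}


def prediction_url(host, namespace, deployment):
    """Compose the prediction URL following the pattern of the curl example."""
    return f"http://{host}/seldon/{namespace}/{deployment}/api/v0.1/predictions"


row = [7.0, 0.27, 0.36, 20.7, 0.045, 45.0, 170.0, 1.001, 3.0, 0.45, 8.8]
body = json.dumps(build_payload([row]))
url = prediction_url("localhost:8004", "seldon", "mlflow-deployment")
print(url)
print(body)
```

The resulting `body` can be posted to `url` with any HTTP client, using the `Content-Type: application/json` header as in the curl command.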

We can also send the request using our Python client:

In [None]:
import math
import subprocess

import numpy as np

from seldon_core.seldon_client import SeldonClient

HOST = "localhost"  # Add the URL you found above
port = "80"  # Make sure you use the port above
batch = x_0
payload_type = "ndarray"

sc = SeldonClient(
    gateway="ambassador", namespace="seldon", gateway_endpoint=HOST + ":" + port
)

client_prediction = sc.predict(
    data=batch, deployment_name="mlflow-deployment", names=[], payload_type=payload_type
)

print(client_prediction.response)

## 4. Deploy your second model as an A/B test

Now that we have a model in production, it's possible to deploy a second model as an A/B test.
Our model will also be an Elastic Net model but using a different set of parameters.
We can easily train it by leveraging MLflow:

In [None]:
!mlflow run $PROJECT_DIR -P alpha=0.75 -P l1_ratio=0.2

As we did before, we now need to upload our model to a cloud bucket.
To speed things up, we have already done so, and the second model is available at `gs://seldon-models/mlflow/model-b`.

### A/B test

We will deploy our second model as an A/B test.
In particular, we will redirect 20% of the traffic to the new model.

This can be done by simply adding a `traffic` attribute on our `SeldonDeployment` spec:

In [None]:
!pygmentize ab-test-mlflow-model-server-seldon-config.yaml
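
As a rough sketch of the shape such a spec could take, the snippet below builds a two-predictor `SeldonDeployment` in Python and prints it as JSON, which `kubectl apply -f` accepts as well as YAML. It assumes the A/B spec mirrors the single-model config shown earlier, with one predictor per model. The 80/20 split follows from the 20% stated above, while the predictor names are hypothetical (only the "a-" and "b-" prefixes come from this tutorial).

```python
import json


def predictor(name, model_uri, traffic):
    """One predictor entry, mirroring the single-model config shown earlier."""
    return {
        "graph": {
            "children": [],
            "implementation": "MLFLOW_SERVER",
            "modelUri": model_uri,
            "name": "wines-classifier",
        },
        "name": name,
        "replicas": 1,
        "traffic": traffic,  # percentage of requests routed to this predictor
    }


ab_test = {
    "apiVersion": "machinelearning.seldon.io/v1alpha2",
    "kind": "SeldonDeployment",
    "metadata": {"name": "mlflow-deployment"},
    "spec": {
        "name": "mlflow-deployment",
        "predictors": [
            # Predictor names are assumed for illustration.
            predictor("a-mlflow-deployment-dag", "gs://seldon-models/mlflow/model-a", 80),
            predictor("b-mlflow-deployment-dag", "gs://seldon-models/mlflow/model-b", 20),
        ],
    },
}

print(json.dumps(ab_test, indent=2))
```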

And similar to the model above, we only need to run the following to deploy it:

In [None]:
!kubectl apply -f ab-test-mlflow-model-server-seldon-config.yaml

We can check that the models have been deployed and are running with the following command.

We should now see pods for both the "a-" and the "b-" models.

In [None]:
!kubectl get pods

## 5. Visualise and monitor the performance of your models using Seldon Analytics

This section is optional, but by following the instructions you will be able to visualise the performance of both models as per the chart below.

In order for this example to work you need to install and run the [Grafana Analytics package for Seldon Core](https://docs.seldon.io/projects/seldon-core/en/latest/analytics/analytics.html#helm-analytics-chart).

Grafana will ask for a username and password, which by default are set to the following:
* Username: admin
* Password: password

You can access the Grafana dashboard through the port returned by the command below:

In [None]:
!kubectl get svc grafana-prom -o jsonpath='{.spec.ports[0].nodePort}'

Now that we have both models running in our Kubernetes cluster, we can analyse their performance using Seldon Core's integration with Prometheus and Grafana.
To do so, we will iterate over the training set (which can be found in `wine-quality.csv`), making a request and sending the feedback of the prediction.

Since the `/feedback` endpoint requires a `reward` signal (i.e. the higher the better), we will simulate one as:

$$
  R(x_{n})
    = \begin{cases}
        \frac{1}{(y_{n} - f(x_{n}))^{2}} &, y_{n} \neq f(x_{n}) \\
        500 &, y_{n} = f(x_{n})
      \end{cases}
$$

where $R(x_{n})$ is the reward for input point $x_{n}$, $f(x_{n})$ is the prediction of our trained model and $y_{n}$ is the actual value.
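
To get a feel for how this reward behaves, here is a scalar sketch of the same formula evaluated on a few hand-picked values (the standalone `reward` function and the sample pairs are illustrative, not part of the tutorial's pipeline):

```python
def reward(y, y_pred):
    """Simulated reward: 500 for an exact match, else the inverse squared error."""
    if y == y_pred:
        return 500.0
    return 1.0 / (y - y_pred) ** 2


# Pairs of (actual quality, predicted quality).
for y, y_pred in [(6, 6), (6, 5), (6, 5.5), (6, 8)]:
    print(y, y_pred, reward(y, y_pred))
```

Halving the error quadruples the reward. Note also that the inverse-square branch is unbounded: an error smaller than about 0.045 already yields a reward above 500, so the constant only caps the exact-match case.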

In [None]:
sc = SeldonClient(
    gateway="ambassador", namespace="seldon", deployment_name="wines-classifier"
)


def _get_reward(y, y_pred):
    if y == y_pred:
        return 500

    return 1 / np.square(y - y_pred)


def _test_row(row):
    input_features = row[:-1]
    feature_names = input_features.index.to_list()
    X = input_features.values.reshape(1, -1)
    y = row.iloc[-1].reshape(1, -1)  # positional access to the last column (quality)

    # Note that we are re-using the SeldonClient defined previously
    r = sc.predict(deployment_name="mlflow-deployment", data=X, names=feature_names)

    y_pred = r.response["data"]["tensor"]["values"]
    reward = _get_reward(y, y_pred)
    sc.feedback(
        deployment_name="mlflow-deployment",
        prediction_request=r.request,
        prediction_response=r.response,
        reward=reward,
    )

    return reward[0]


data.apply(_test_row, axis=1)

You should now be able to see Seldon's pre-built Grafana dashboard.

![grafana-mlflow](images/grafana-mlflow.jpg)

At the bottom of the dashboard you can see the following charts:

- On the left: the requests per second, which shows the traffic breakdown we specified.
- In the center: the reward, where you can see how model `a` outperforms model `b` by a large margin.
- On the right: the latency for each one of them.

You are able to add your own custom metrics, and try out other more complex deployments by following further guides at https://docs.seldon.io/projects/seldon-core/en/latest/examples/mlflow_server_ab_test_ambassador.html