# MLflow and Seldon

End-to-end example integrating MLflow and Seldon Core, with A/B testing of the models.
The slides accompanying this demo can be [found here](https://docs.google.com/presentation/d/1QXiOZkd_XNw6PbUalhYDajljKYQjgKczzNncTyLk9uA/edit?usp=sharing).

## Pre-requisites

### Python

The training part of the example assumes that you are able to run `mlflow` on your local environment.
To set it up, you can run:

In [2]:
!pip install -r requirements.txt



### Kubernetes

The serving side of the example assumes that you have access to a Kubernetes cluster where Seldon Core is installed.
If you don't have access to a cluster, you can create a local one with [`kind`](https://kind.sigs.k8s.io/).

For instructions on how to install Seldon Core, please check their [setup docs](https://docs.seldon.io/projects/seldon-core/en/latest/workflow/install.html).

### Analytics

Additionally, after we deploy the models, we will compare their performance using Seldon Core's integration with Prometheus and Grafana.
For that part to work, we will need to install Prometheus and Grafana.

To speed things up, we can do this through the [`seldon-core-analytics` chart](https://docs.seldon.io/projects/seldon-core/en/latest/charts/seldon-core-analytics.html).

In [5]:
!helm install seldon-core-analytics \
    seldon-core-analytics \
    --namespace seldon-system \
    --repo https://storage.googleapis.com/seldon-charts \
    --set grafana.adminPassword=password \
    --set grafana.adminUser=admin

NAME: seldon-core-analytics
LAST DEPLOYED: Mon Oct 26 17:46:16 2020
NAMESPACE: seldon-system
STATUS: deployed
REVISION: 1


## Training

This first section covers how to train models using MLflow.

### MLflow Project

The MLproject file defines:
- The environment where the training runs.
- The hyperparameters that can be tweaked. In our case, these are $\{\alpha, l_{1}\}$.
- The interface to train the model.

In [6]:
%%writefile ./training/MLproject
name: mlflow-talk

conda_env: conda.yaml

entry_points:
  main:
    parameters:
      alpha: float
      l1_ratio: {type: float, default: 0.1}
    command: "python train.py {alpha} {l1_ratio}"

Overwriting ./training/MLproject


This allows us to have a single command to train the model. 

``` bash
$ mlflow run ./training -P alpha=... -P l1_ratio=...
```
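
To make the link between the `parameters` block and the `command` template concrete, here is a minimal sketch of how the `-P` overrides and the declared defaults could be merged into the final command line. It is only an illustration: `render_command` is a hypothetical helper, not MLflow's actual implementation.

```python
# Entry-point definition copied from the MLproject file above.
COMMAND_TEMPLATE = "python train.py {alpha} {l1_ratio}"
PARAMETER_DEFAULTS = {"l1_ratio": 0.1}  # alpha has no default, so it is required


def render_command(template, defaults, overrides):
    """Merge -P style overrides over the declared defaults and fill the template."""
    params = {**defaults, **overrides}
    # A missing required parameter (e.g. alpha) raises a KeyError here.
    return template.format(**params)


command = render_command(COMMAND_TEMPLATE, PARAMETER_DEFAULTS, {"alpha": 0.1})
print(command)
```

With `alpha=0.1` and the default `l1_ratio`, this produces the same `python train.py 0.1 0.1` invocation that appears in the MLflow run log.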

For our example, we will train two versions of the model, which we'll later compare using A/B testing.

- $M_{1}$ with $\alpha = 0.1$
- $M_{2}$ with $\alpha = 1.0$

In [31]:
!mlflow run ./training -P alpha=0.1

2020/10/27 12:54:06 INFO mlflow.projects.utils: === Created directory /tmp/tmpk8nx2_0w for downloading remote URIs passed to arguments of type 'path' ===
2020/10/27 12:54:06 INFO mlflow.projects.backend.local: === Running command 'source /opt/miniconda3/bin/../etc/profile.d/conda.sh && conda activate mlflow-8dcff9dffe20ef7df93f638f77ce4b6257ec7b18 1>&2 && python train.py 0.1 0.1' in run with ID 'c047eddcdc2d4a08963f08516fd18d74' === 
Elasticnet model (alpha=0.100000, l1_ratio=0.100000):
  RMSE: 0.7792546522251949
  MAE: 0.6112547988118587
  R2: 0.2157063843066196
2020/10/27 12:54:08 INFO mlflow.projects: === Run (ID 'c047eddcdc2d4a08963f08516fd18d74') succeeded ===


In [32]:
!mlflow run ./training -P alpha=1.0

2020/10/27 12:54:12 INFO mlflow.projects.utils: === Created directory /tmp/tmp82tamhfc for downloading remote URIs passed to arguments of type 'path' ===
2020/10/27 12:54:12 INFO mlflow.projects.backend.local: === Running command 'source /opt/miniconda3/bin/../etc/profile.d/conda.sh && conda activate mlflow-8dcff9dffe20ef7df93f638f77ce4b6257ec7b18 1>&2 && python train.py 1.0 0.1' in run with ID '1aab766b2d9246ed85f8dbfec4e8743d' === 
Elasticnet model (alpha=1.000000, l1_ratio=0.100000):
  RMSE: 0.8107373707184711
  MAE: 0.6241295925236751
  R2: 0.15105362812007328
2020/10/27 12:54:13 INFO mlflow.projects: === Run (ID '1aab766b2d9246ed85f8dbfec4e8743d') succeeded ===
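
As a quick offline comparison before the A/B test, the metrics printed by the two runs above can be put side by side. The snippet below is a small sketch that only uses the values copied from those logs; the dictionary layout and variable names are our own.

```python
# Metrics copied from the two run logs above.
runs = {
    "alpha=0.1": {"rmse": 0.7792546522251949, "mae": 0.6112547988118587, "r2": 0.2157063843066196},
    "alpha=1.0": {"rmse": 0.8107373707184711, "mae": 0.6241295925236751, "r2": 0.15105362812007328},
}

# Lower is better for RMSE and MAE, higher is better for R2.
best_by_rmse = min(runs, key=lambda name: runs[name]["rmse"])
best_by_mae = min(runs, key=lambda name: runs[name]["mae"])
best_by_r2 = max(runs, key=lambda name: runs[name]["r2"])

print(best_by_rmse, best_by_mae, best_by_r2)
```

On these offline metrics the run with `alpha=0.1` comes out ahead on all three, which gives a baseline expectation for the live comparison later on.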


### MLflow Tracking

The `train.py` script uses the `mlflow.log_param()` and `mlflow.log_metric()` functions to track each experiment. These are part of the MLflow Tracking API, which records the parameters and results of each run. They can be stored on a remote server and shared across the entire team. However, in our example we will store them locally in an `mlruns` folder.

In [3]:
!ls mlruns/0

06d9317c0c6945f5a0e57321acb5385b  9735daaa839b4934abeaf2768fc1be93
169119a7fe1e4b31a746e891499552b0  9cf21e1060974725936e4489ce75c640
182d08b34ff042c0b5f572dfa2e754db  ed0705646b8e40008cbfe440d74e4fb2
25396a9f87fd42bfadf98fb1802917c9  ef3d34fa48ac4ac29a85b1d6bd7236d0
373c06380a55446caa7ced6c5b5a1b96  f34e5dd1221b4d6da41268c8cf1691ff
45631cca7b8c4200aaee8b84a3c1f03d  meta.yaml
4f7b5bfa0db7404d915d6429902bd1c2  tags
5a6be5a1ef844783a50a6577745dbdc3
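
Each of the directories listed above corresponds to one run. As a rough sketch, the logged values can be read back without the UI, assuming the local file store keeps one file per parameter under `params/` and one file per metric under `metrics/` (with lines of the form `<timestamp> <value> [<step>]`). That layout is an assumption about the installed MLflow version, and `read_run` and `list_runs` are hypothetical helpers.

```python
from pathlib import Path


def read_run(run_dir):
    """Collect the params and the latest value of each metric for one run."""
    run_dir = Path(run_dir)
    params = {}
    for param_file in sorted((run_dir / "params").glob("*")):
        params[param_file.name] = param_file.read_text().strip()

    metrics = {}
    for metric_file in sorted((run_dir / "metrics").glob("*")):
        lines = metric_file.read_text().strip().splitlines()
        if lines:
            # Assumed line format: "<timestamp> <value> [<step>]"; keep the last entry.
            metrics[metric_file.name] = float(lines[-1].split()[1])

    return params, metrics


def list_runs(experiment_dir):
    """Map run ID to (params, metrics), skipping entries such as meta.yaml and tags."""
    experiment_dir = Path(experiment_dir)
    return {
        entry.name: read_run(entry)
        for entry in sorted(experiment_dir.iterdir())
        if entry.is_dir() and (entry / "params").is_dir()
    }


if __name__ == "__main__" and Path("mlruns/0").is_dir():
    for run_id, (params, metrics) in list_runs("mlruns/0").items():
        print(run_id, params, metrics)
```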


We can also run `mlflow ui` to browse these visually. This will start the MLflow server at http://localhost:5000.

```bash
$ mlflow ui
```

![MLFlow UI](./images/mlflow-ui.png)

### MLflow Model

The `MLmodel` file allows us to version and share models easily. Below we can see an example.

In [34]:
!cat ./mlruns/0/5a6be5a1ef844783a50a6577745dbdc3/artifacts/model/MLmodel

artifact_path: model
flavors:
  python_function:
    data: model.pkl
    env: conda.yaml
    loader_module: mlflow.sklearn
    python_version: 3.6.9
  sklearn:
    pickled_model: model.pkl
    serialization_format: cloudpickle
    sklearn_version: 0.19.1
run_id: 5a6be5a1ef844783a50a6577745dbdc3
utc_time_created: '2019-10-02 14:21:15.783806'


As we can see above, the `MLmodel` file keeps track of, among other things:

- The run ID, `5a6be5a1ef844783a50a6577745dbdc3`
- The creation date
- The version of `sklearn`
- How the model was serialised

As we shall see shortly, Seldon's pre-packaged MLflow model server uses this file to serve the model.

#### Upload models (optional)

As a last step, we will persist the models we have just trained with MLflow by uploading them to Google Cloud Storage. Note that to run these commands you need write access to the `gs://seldon-models` bucket and you need to have `gsutil` set up.

Note that in a production setting, MLflow would be configured to log models against a persistent data store (e.g. GCS, Minio, etc.). In that case, this manual step wouldn't be needed.

We will upload both versions of the model to:

- `gs://seldon-models/mlflow/model-a`
- `gs://seldon-models/mlflow/model-b`

In [35]:
!gsutil cp -r mlruns/0/c047eddcdc2d4a08963f08516fd18d74/artifacts/model/* gs://seldon-models/mlflow/model-a
!gsutil cp -r mlruns/0/1aab766b2d9246ed85f8dbfec4e8743d/artifacts/model/* gs://seldon-models/mlflow/model-b

Copying file://mlruns/0/c047eddcdc2d4a08963f08516fd18d74/artifacts/model/conda.yaml [Content-Type=application/octet-stream]...
Copying file://mlruns/0/c047eddcdc2d4a08963f08516fd18d74/artifacts/model/MLmodel [Content-Type=application/octet-stream]...
Copying file://mlruns/0/c047eddcdc2d4a08963f08516fd18d74/artifacts/model/model.pkl [Content-Type=application/octet-stream]...
\ [3 files][  1.1 KiB/  1.1 KiB]                                                
Operation completed over 3 objects/1.1 KiB.                                      
Copying file://mlruns/0/1aab766b2d9246ed85f8dbfec4e8743d/artifacts/model/conda.yaml [Content-Type=application/octet-stream]...
Copying file://mlruns/0/1aab766b2d9246ed85f8dbfec4e8743d/artifacts/model/MLmodel [Content-Type=application/octet-stream]...
Copying file://mlruns/0/1aab766b2d9246ed85f8dbfec4e8743d/artifacts/model/model.pkl [Content-Type=application/octet-stream]...
\ [3 files][  1.1 KiB/  1.1 KiB]                                                
Op

## Serving

To serve this model we will use Seldon.

### Deploy models

Once the cluster is set up and the models have been uploaded to a common repository, the next step is to deploy them to Kubernetes. The manifest below defines a single `SeldonDeployment` with two predictors, `model-a` and `model-b`, each of which receives 50% of the traffic.

In [39]:
%%writefile ./serving/model-a-b.yaml
---
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: wines-classifier
spec:
  annotations:
    seldon.io/executor: "false" 
  predictors:
  - graph:
      children: []
      implementation: MLFLOW_SERVER
      modelUri: gs://seldon-models/mlflow/model-a
      name: wines-classifier
    name: model-a
    replicas: 1
    traffic: 50
    componentSpecs:
    - spec:
        # We are setting high failureThreshold as installing conda dependencies
        # can take long time and we want to avoid k8s killing the container prematurely
        containers:
        - name: wines-classifier
          livenessProbe:
            initialDelaySeconds: 60
            failureThreshold: 100
            periodSeconds: 5
            successThreshold: 1
            httpGet:
              path: /health/ping
              port: http
              scheme: HTTP
          readinessProbe:
            initialDelaySeconds: 60
            failureThreshold: 100
            periodSeconds: 5
            successThreshold: 1
            httpGet:
              path: /health/ping
              port: http
              scheme: HTTP
  - graph:
      children: []
      implementation: MLFLOW_SERVER
      modelUri: gs://seldon-models/mlflow/model-b
      name: wines-classifier
    name: model-b
    replicas: 1
    traffic: 50
    componentSpecs:
    - spec:
        # We are setting high failureThreshold as installing conda dependencies
        # can take long time and we want to avoid k8s killing the container prematurely
        containers:
        - name: wines-classifier
          livenessProbe:
            initialDelaySeconds: 60
            failureThreshold: 100
            periodSeconds: 5
            successThreshold: 1
            httpGet:
              path: /health/ping
              port: http
              scheme: HTTP
          readinessProbe:
            initialDelaySeconds: 60
            failureThreshold: 100
            periodSeconds: 5
            successThreshold: 1
            httpGet:
              path: /health/ping
              port: http
              scheme: HTTP

Overwriting ./serving/model-a-b.yaml
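
To get a feel for what the `traffic` weights mean, the sketch below simulates a weighted random split over the two predictors. This is only a simulation under the assumption that each request is routed independently in proportion to the weights; it is not how Seldon Core or the ingress implements routing, and `route` is a hypothetical helper.

```python
import random
from collections import Counter

# Traffic weights copied from the two predictors in model-a-b.yaml.
TRAFFIC = {"model-a": 50, "model-b": 50}


def route(traffic, rng):
    """Pick one predictor name with probability proportional to its weight."""
    names = list(traffic)
    weights = [traffic[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so that the sketch is reproducible
counts = Counter(route(TRAFFIC, rng) for _ in range(10_000))
print(counts)
```

Over many requests each model ends up with close to half of the traffic, although any short window of requests can be noticeably uneven.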


In [37]:
!kubectl apply -f ./serving/model-a-b.yaml

seldondeployment.machinelearning.seldon.io/wines-classifier created


We can verify these have been deployed by checking the pods and `SeldonDeployment` resources in the cluster.

In [38]:
!kubectl get pods

NAME                                                           READY   STATUS    RESTARTS   AGE
wines-classifier-model-a-0-wines-classifier-676985589d-97rh5   0/2     Running   0          24s
wines-classifier-model-b-0-wines-classifier-678d9477b9-zt4k6   0/2     Running   0          24s
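
The pods report `0/2` ready at first because the model server is still installing its conda dependencies. As a back-of-the-envelope check, the probe settings in the manifest give the container roughly the following time before the liveness probe gives up. The calculation ignores probe timeouts and scheduling jitter, so treat it as an approximation.

```python
# Probe settings copied from model-a-b.yaml.
INITIAL_DELAY_SECONDS = 60
PERIOD_SECONDS = 5
FAILURE_THRESHOLD = 100


def startup_budget_seconds(initial_delay, period, failure_threshold):
    """Approximate time until consecutive probe failures exhaust the threshold."""
    return initial_delay + period * failure_threshold


budget = startup_budget_seconds(INITIAL_DELAY_SECONDS, PERIOD_SECONDS, FAILURE_THRESHOLD)
print(f"{budget} s (~{budget / 60:.1f} min)")
```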


In [33]:
!kubectl get sdep

NAME               AGE
wines-classifier   3m46s


### Test models

We will now run a sample query to test that the inference graph is working.

In [25]:
!curl \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{\
        "data": { \
            "names": ["fixed acidity","volatile acidity","citric acid","residual sugar","chlorides","free sulfur dioxide","total sulfur dioxide","density","pH","sulphates","alcohol"], \
            "ndarray": [[7,0.27,0.36,20.7,0.045,45,170,1.001,3,0.45,8.8]] \
        } \
    }' \
    http://localhost:8083/seldon/default/wines-classifier/api/v1.0/predictions

{
  "meta": {
    "puid": "2a05tajkudskg2lpav076ilsee",
    "tags": {
    },
    "routing": {
    },
    "requestPath": {
      "wines-classifier": "seldonio/mlflowserver_rest:1.3.0"
    },
    "metrics": []
  },
  "data": {
    "names": [],
    "ndarray": [5.554168965299016]
  }
}
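
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes, like the `curl` call above, that the gateway is reachable on `localhost:8083`. `build_request` and `predict` are hypothetical helper names.

```python
import json
import urllib.request

# Same endpoint as the curl call above.
URL = "http://localhost:8083/seldon/default/wines-classifier/api/v1.0/predictions"

FEATURE_NAMES = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol",
]


def build_request(url, names, rows):
    """Build a POST request carrying the payload in Seldon's ndarray format."""
    payload = {"data": {"names": names, "ndarray": rows}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(url, names, rows):
    """Send the request and return the predictions from the response body."""
    with urllib.request.urlopen(build_request(url, names, rows)) as response:
        body = json.load(response)
    return body["data"]["ndarray"]
```

With the deployment running, `predict(URL, FEATURE_NAMES, [[7, 0.27, 0.36, 20.7, 0.045, 45, 170, 1.001, 3, 0.45, 8.8]])` would send the same sample row as the `curl` command.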

## Analytics

To access Grafana, we need to forward a local port to the Grafana pod, as was done to access the Seldon Core deployment.
The credentials are `admin` // `password`.

This command needs to keep running in the background, so **please make sure you run it in a separate terminal**.

```bash
$ kubectl port-forward \
    $(kubectl get pods \
        -l app=grafana-prom-server -o jsonpath='{.items[0].metadata.name}') \
    3000:3000
```

Now that we have both models running in production, we can analyse their performance using Seldon Core's integration with Prometheus and Grafana.
To do so, we will iterate over the training set (which can be found in `./training/wine-quality.csv`), making a prediction request for each row and sending back feedback on the prediction.

Since the `/feedback` endpoint requires a `reward` signal (i.e. higher is better), we will simulate one as

$$
  R(x_{n})
    = \begin{cases}
        \frac{1}{(y_{n} - f(x_{n}))^{2}} &, y_{n} \neq f(x_{n}) \\
        500 &, y_{n} = f(x_{n})
      \end{cases}
$$

where $R(x_{n})$ is the reward for input point $x_{n}$, $f(x_{n})$ is our trained model and $y_{n}$ is the actual value.
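
As a worked example of this formula, the sketch below evaluates the reward for the prediction returned by the sample query earlier. The actual value of `6.0` is a hypothetical label chosen for illustration, and the `reward` function is our own restatement of the formula.

```python
import math


def reward(y_true, y_pred):
    """Inverse squared error, with a fixed reward of 500 for an exact match."""
    if y_true == y_pred:
        return 500.0
    return 1.0 / (y_true - y_pred) ** 2


y_pred = 5.554168965299016  # prediction returned by the sample query above
y_true = 6.0                # hypothetical actual quality, assumed for illustration

print(reward(y_true, y_pred))  # roughly 5
print(reward(6.0, 6.0))        # 500.0

# Any non-zero error smaller than this threshold yields a reward above 500.
print(1 / math.sqrt(500))      # about 0.0447
```

One consequence of this definition is that the reward is not capped at 500: a very small but non-zero error produces a larger value than an exact match.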

In [18]:
%%writefile feedback.py
import numpy as np
import pandas as pd
from seldon_core.seldon_client import SeldonClient

# Client pointing at the port-forwarded Istio gateway
sc = SeldonClient(
    gateway="istio",
    namespace="default",
    gateway_endpoint="localhost:8083",
    deployment_name="wines-classifier",
)

df = pd.read_csv("./training/wine-quality.csv")


def _get_reward(y, y_pred):
    # An exact match would divide by zero, so it gets a fixed high reward
    if y == y_pred:
        return 500.0

    return float(1 / np.square(y - y_pred))


def _test_row(row):
    # The last column holds the target; the rest are the input features
    input_features = row.iloc[:-1]
    feature_names = input_features.index.to_list()
    X = input_features.values.reshape(1, -1)
    y = float(row.iloc[-1])

    r = sc.predict(data=X, names=feature_names)

    # One row is sent per request, so the tensor holds a single prediction
    y_pred = r.response["data"]["tensor"]["values"][0]
    reward = _get_reward(y, y_pred)
    sc.feedback(
        prediction_request=r.request,
        prediction_response=r.response,
        reward=reward,
    )

    return reward


df.apply(_test_row, axis=1)

Overwriting feedback.py


In [29]:
!python feedback.py

^C
Traceback (most recent call last):
  File "feedback.py", line 39, in <module>
    df.apply(_test_row, axis=1)
  File "/home/agm/.virtualenvs/mlflow-talk/lib/python3.7/site-packages/pandas/core/frame.py", line 7548, in apply
    return op.get_result()
  File "/home/agm/.virtualenvs/mlflow-talk/lib/python3.7/site-packages/pandas/core/apply.py", line 180, in get_result
    return self.apply_standard()
  File "/home/agm/.virtualenvs/mlflow-talk/lib/python3.7/site-packages/pandas/core/apply.py", line 271, in apply_standard
    results, res_index = self.apply_series_generator()
  File "/home/agm/.virtualenvs/mlflow-talk/lib/python3.7/site-packages/pandas/core/apply.py", line 300, in apply_series_generator
    results[i] = self.f(v)
  File "feedback.py", line 27, in _test_row
    names=feature_names)
  File "/home/agm/.virtualenvs/mlflow-talk/lib/python3.7/site-packages/seldon_core/seldon_client.py", line 387, in predict
    return rest_predict_gateway(**k)
  File "/home/agm/.virtualenvs/m

We can now access the Grafana dashboard at http://localhost:3000 (credentials are `admin` // `password`). Inside the portal, we will go to the Prediction Analytics dashboard.

We can see a snapshot below.

![Seldon Analytics](./images/seldon-analytics.png)