# MLFlow and Seldon

An end-to-end example integrating MLFlow and Seldon, with A/B testing of the models.
We also cover how to explain the model's predictions with Alibi.

## Training

This first section covers how to train models using MLFlow.

### MLproject

The MLproject file defines:
- The environment where the training runs.
- The hyperparameters that can be tweaked. In our case, these are $\{\alpha, l_{1}\}$.
- The interface to train the model.

In [2]:
!ccat ./training/MLproject

name: mlflow-talk

conda_env: conda.yaml

entry_points:
  main:
    parameters:
      alpha: float
      l1_ratio: {type: float, default: 0.1}
    command: "python train.py {alpha} {l1_ratio}"


This allows us to have a single command to train the model. 

``` bash
$ mlflow run ./training -P alpha=... -P l1_ratio=...
```
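Since the same command is run once per model variant, it can be convenient to assemble it programmatically. The following is a minimal sketch under that assumption: the helper name `build_mlflow_run_command` is hypothetical (it is not part of MLflow), and launching the run is left as a comment so the snippet has no side effects.

```python
def build_mlflow_run_command(project_dir, params):
    """Return the argv list for `mlflow run`, with one `-P key=value` per parameter."""
    command = ["mlflow", "run", project_dir]
    for name, value in params.items():
        command += ["-P", f"{name}={value}"]
    return command


# One invocation per model variant trained in this example.
for alpha in (0.5, 0.75):
    argv = build_mlflow_run_command("./training", {"alpha": alpha})
    print(" ".join(argv))
    # To actually launch the run: subprocess.run(argv, check=True)
```

Parameters left out of the dictionary, such as `l1_ratio` here, fall back to the defaults declared in the `MLproject` file.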

For our example, we will train two versions of the model, which we'll later compare using A/B testing.

- $M_{1}$ with $\alpha = 0.5$
- $M_{2}$ with $\alpha = 0.75$

In [71]:
!mlflow run ./training -P alpha=0.5

2019/10/02 21:45:18 INFO mlflow.projects: === Created directory /tmp/tmpnwewijx3 for downloading remote URIs passed to arguments of type 'path' ===
2019/10/02 21:45:18 INFO mlflow.projects: === Running command 'source /opt/miniconda3/bin/../etc/profile.d/conda.sh && conda activate mlflow-1ecba04797edb7e7f7212d429debd9b664c31651 1>&2 && python train.py 0.5 0.1' in run with ID '9735daaa839b4934abeaf2768fc1be93' === 
Elasticnet model (alpha=0.500000, l1_ratio=0.100000):
  RMSE: 0.7947931019036529
  MAE: 0.6189130834228138
  R2: 0.18411668718221819
2019/10/02 21:45:19 INFO mlflow.projects: === Run (ID '9735daaa839b4934abeaf2768fc1be93') succeeded ===


In [6]:
!mlflow run ./training -P alpha=0.75

2019/10/02 15:13:04 INFO mlflow.projects: === Created directory /tmp/tmpm5k6nyxe for downloading remote URIs passed to arguments of type 'path' ===
2019/10/02 15:13:04 INFO mlflow.projects: === Running command 'source /opt/miniconda3/bin/../etc/profile.d/conda.sh && conda activate mlflow-1ecba04797edb7e7f7212d429debd9b664c31651 1>&2 && python train.py 0.75 0.1' in run with ID '272fe7c7ffb345f089cfd6285772ef69' === 
Elasticnet model (alpha=0.750000, l1_ratio=0.100000):
  RMSE: 0.8037846644965104
  MAE: 0.6221040103321569
  R2: 0.16555194969389364
2019/10/02 15:13:05 INFO mlflow.projects: === Run (ID '272fe7c7ffb345f089cfd6285772ef69') succeeded ===


### MLtrack

The `train.py` script uses the `mlflow.log_param()` and `mlflow.log_metric()` functions to track each experiment. These are part of MLflow's Tracking API (`MLtrack`), which records each experiment's parameters and results. These records can be stored on a remote server and shared across the entire team. However, in our example we will store them locally in an `mlruns` folder.
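To make the logged metrics concrete, here is a pure-Python sketch of how the three values printed by each run (RMSE, MAE and R2) can be computed. This is not the actual `train.py`, whose contents are not shown here. The function name `regression_metrics` and the toy data are assumptions made for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute RMSE, MAE and R2 for two equally long sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1 - ss_res / ss_tot,
    }


# Toy data, made up for the example.
metrics = regression_metrics([3, 5, 7], [2, 5, 8])
for name, value in metrics.items():
    print(name, round(value, 4))
    # In a training script, each value would then be passed to
    # mlflow.log_metric(name, value).
```

The dictionary keys mirror the metric file names (`rmse`, `mae`, `r2`) that appear under each run's `metrics` folder.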

In [7]:
!tree mlruns

mlruns
└── 0
    ├── 272fe7c7ffb345f089cfd6285772ef69
    │   ├── artifacts
    │   │   └── model
    │   │       ├── conda.yaml
    │   │       ├── MLmodel
    │   │       └── model.pkl
    │   ├── meta.yaml
    │   ├── metrics
    │   │   ├── mae
    │   │   ├── r2
    │   │   └── rmse
    │   ├── params
    │   │   ├── alpha
    │   │   └── l1_ratio
    │   └── tags
    │       ├── mlflow.project.backend
    │       ├── mlflow.project.entryPoint
    │       ├── mlflow.project.env
    │       ├── mlflow.source.name
    │       ├── mlflow.source.type
    │       └── mlflow.user
    ├── 29876d07ee1b48d7b460cf38366eda06
    │   ├── artifacts
    │   │   └── model
    │   │       ├── conda.yaml
    │   │       ├── MLmodel
    │   │       └── model.pkl
    │   ├── meta.yaml
    │   ├── metrics
    │   │   ├── mae
    │   │   ├── r2
   

We can also run `mlflow ui` to show these visually.

```bash
$ mlflow ui
```

![MLFlow UI](./images/mlflow-ui.png)
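Besides the UI, the local `mlruns` folder can also be inspected directly from Python. The sketch below walks the directory layout shown by `tree` and collects each run's parameters and metrics. It assumes that each file under `params` holds the raw value and that each line of a metric file starts with a timestamp followed by the value. That layout is an internal detail of the local file store, so treat the parsing as an assumption. The helper names `read_run` and `read_experiment` are hypothetical.

```python
from pathlib import Path


def read_run(run_dir):
    """Return (params, metrics) for one run directory of the local file store."""
    run_dir = Path(run_dir)
    params = {p.name: p.read_text().strip()
              for p in sorted((run_dir / "params").iterdir())}
    metrics = {}
    for metric_file in sorted((run_dir / "metrics").iterdir()):
        lines = [line for line in metric_file.read_text().splitlines() if line.strip()]
        # Assumed line format: "<timestamp> <value> [...]"; keep the latest value.
        metrics[metric_file.name] = float(lines[-1].split()[1])
    return params, metrics


def read_experiment(mlruns_dir="mlruns", experiment_id="0"):
    """Map run ID -> (params, metrics) for every run found in the experiment."""
    experiment_dir = Path(mlruns_dir) / experiment_id
    runs = {}
    if not experiment_dir.is_dir():
        return runs
    for run_dir in sorted(experiment_dir.iterdir()):
        # Skip entries such as meta.yaml that are not run directories.
        if (run_dir / "params").is_dir() and (run_dir / "metrics").is_dir():
            runs[run_dir.name] = read_run(run_dir)
    return runs


for run_id, (params, metrics) in read_experiment().items():
    print(run_id, params, metrics)
```

For anything beyond a quick look, the tracking client that ships with MLflow is likely the more robust option, since it does not depend on the on-disk format.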

### MLmodel

The `MLmodel` file allows us to version and share models easily. Below we can see an example.

In [13]:
!ccat ./mlruns/0/5a6be5a1ef844783a50a6577745dbdc3/artifacts/model/MLmodel

artifact_path: model
flavors:
  python_function:
    data: model.pkl
    env: conda.yaml
    loader_module: mlflow.sklearn
    python_version: 3.6.9
  sklearn:
    pickled_model: model.pkl
    serialization_format: cloudpickle
    sklearn_version: 0.19.1
run_id: 5a6be5a1ef844783a50a6577745dbdc3


As we can see above, the `MLmodel` file keeps track, among other things, of:

- The run ID, `5a6be5a1ef844783a50a6577745dbdc3`
- The Python version
- The version of `sklearn`
- How the model was stored

As we shall see shortly, Seldon's pre-packaged MLflow model server will use this file to serve the model.

## Serving

To serve this model we will use Seldon.

### Set up

Before anything else, we need to set up the `k8s` cluster.

#### Create k8s cluster

Firstly, we will create a cluster using [kind](https://kind.sigs.k8s.io).

In [16]:
!kind create cluster
!export KUBECONFIG="$(kind get kubeconfig-path --name=kind)"

Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.15.3) 🖼
 ✓ Preparing nodes 📦 
 ✓ Creating kubeadm config 📜 
 ✓ Starting control-plane 🕹️ 
 ✓ Installing CNI 🔌 
 ✓ Installing StorageClass 💾 
Cluster creation complete. You can now use the cluster with:

export KUBECONFIG="$(kind get kubeconfig-path --name="kind")"
kubectl cluster-info
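Note that each `!` line in a notebook runs in its own subshell, so a variable exported that way is not visible to later cells. One workaround is to set it on the kernel process from Python, since later `!` commands inherit the kernel's environment. The helper name `export_from_command` is hypothetical, and the demo uses a stand-in command with a made-up path so that it runs without `kind` installed.

```python
import os
import subprocess
import sys


def export_from_command(variable, argv):
    """Run argv, store its trimmed stdout in os.environ[variable] and return it."""
    result = subprocess.run(
        argv, check=True, stdout=subprocess.PIPE, universal_newlines=True)
    value = result.stdout.strip()
    os.environ[variable] = value
    return value


# Stand-in command so the snippet runs anywhere; it prints a made-up path.
demo_command = [sys.executable, "-c", "print('/tmp/example-kubeconfig')"]
print(export_from_command("DEMO_KUBECONFIG", demo_command))

# With kind installed, the call matching the cell above would be:
# export_from_command("KUBECONFIG", ["kind", "get", "kubeconfig-path", "--name=kind"])
```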


We then install Helm's server-side component (Tiller) and create a service account for it.

In [17]:
!helm init --history-max 200
!kubectl rollout status deploy/tiller-deploy -n kube-system
!kubectl create serviceaccount --namespace kube-system tiller
!kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
!kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'

$HELM_HOME has been configured at /home/agm/.helm.

Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.

Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy.
To prevent this, run `helm init` with the --tiller-tls-verify flag.
For more information on securing your installation see: https://docs.helm.sh/using_helm/#securing-your-helm-installation
Waiting for deployment spec update to be observed...
Waiting for deployment spec update to be observed...
Waiting for deployment "tiller-deploy" rollout to finish: 0 out of 1 new replicas have been updated...
Waiting for deployment "tiller-deploy" rollout to finish: 0 of 1 updated replicas are available...
deployment "tiller-deploy" successfully rolled out
serviceaccount/tiller created
clusterrolebinding.rbac.authorization.k8s.io/tiller-cluster-rule created
deployment.extensions/tiller-deploy patched


We can now install `seldon-core` on the new cluster using `helm`.

In [19]:
!helm install \
    seldon-core-operator \
    --name seldon-core \
    --repo https://storage.googleapis.com/seldon-charts \
    --namespace seldon-system \
    --set usagemetrics.enabled=true \
    --set ambassador.enabled=true
!kubectl rollout status statefulset.apps/seldon-operator-controller-manager -n seldon-system

NAME:   seldon-core
LAST DEPLOYED: Wed Oct  2 18:34:43 2019
NAMESPACE: seldon-system
STATUS: DEPLOYED

RESOURCES:
==> v1/ClusterRole
NAME                          AGE
seldon-operator-manager-role  0s

==> v1/ClusterRoleBinding
NAME                                 AGE
seldon-operator-manager-rolebinding  0s

==> v1/ConfigMap
NAME           DATA  AGE
seldon-config  1     0s

==> v1/Pod(related)
NAME                                  READY  STATUS             RESTARTS  AGE
seldon-operator-controller-manager-0  0/1    ContainerCreating  0         0s

==> v1/Secret
NAME                                   TYPE    DATA  AGE
seldon-operator-webhook-server-secret  Opaque  0     0s

==> v1/Service
NAME                                        TYPE       CLUSTER-IP     EXTERNAL-IP  PORT(S)  AGE
seldon-operator-controller-manager-service  ClusterIP  10.110.207.29  <none>       443/TCP  0s
webhook-server-service                      ClusterIP  10.97.95.144   <none>       443/TCP  0s

==> v1/ServiceAcco

Finally, we install `ambassador`, which will allow us to reach the Seldon engine in the cluster.

In [20]:
!helm install stable/ambassador --name ambassador --set crds.keep=false
!kubectl rollout status deployment.apps/ambassador

NAME:   ambassador
LAST DEPLOYED: Wed Oct  2 18:52:59 2019
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/Deployment
NAME        READY  UP-TO-DATE  AVAILABLE  AGE
ambassador  0/3    3           0          0s

==> v1/Pod(related)
NAME                         READY  STATUS             RESTARTS  AGE
ambassador-5c76696fcc-7rdlh  0/1    ContainerCreating  0         0s
ambassador-5c76696fcc-m5ndq  0/1    ContainerCreating  0         0s
ambassador-5c76696fcc-p6ddq  0/1    ContainerCreating  0         0s

==> v1/Service
NAME              TYPE          CLUSTER-IP     EXTERNAL-IP  PORT(S)                     AGE
ambassador        LoadBalancer  10.103.89.133  <pending>    80:31527/TCP,443:31186/TCP  0s
ambassador-admin  ClusterIP     10.111.105.26  <none>       8877/TCP                    0s

==> v1/ServiceAccount
NAME        SECRETS  AGE
ambassador  1        0s

==> v1beta1/ClusterRole
NAME             AGE
ambassador       0s
ambassador-crds  0s

==> v1beta1/ClusterRoleBinding
NAME      

#### Forward port

Once the cluster has been created, we need to allow access from the outside to the `ambassador` gateway.
One way to do this is to use the `kubectl port-forward` command.
In particular, we will forward port `8003` of our local host to the cluster's gateway.

This command needs to keep running in the background, so **please make sure you run it in a separate terminal**.

```bash
kubectl \
    port-forward \
    $(kubectl get pods \
        -l app.kubernetes.io/name=ambassador -o jsonpath='{.items[0].metadata.name}') \
    8003:8080
```
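Before sending requests, it can be worth checking that the forwarded port is actually accepting connections. The following sketch uses only the standard library. The helper name `is_port_open` is hypothetical, and the check only confirms that something is listening on the port, not that the gateway behind it is healthy.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("Gateway reachable on localhost:8003:", is_port_open("localhost", 8003))
```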

#### Install Seldon Core Analytics

Later, after we deploy the models, we will compare their performance using Seldon Core's integration with Prometheus and Grafana.
For that part to work, we first need to install the `seldon-core-analytics` chart, which deploys both Prometheus and Grafana.

In [39]:
!helm install seldon-core-analytics --name seldon-core-analytics \
     --repo https://storage.googleapis.com/seldon-charts \
     --set grafana_prom_admin_password=password \
     --set persistence.enabled=false

NAME:   seldon-core-analytics
LAST DEPLOYED: Wed Oct  2 20:19:38 2019
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/ConfigMap
NAME                       DATA  AGE
alertmanager-server-conf   1     0s
grafana-import-dashboards  11    0s
prometheus-rules           0     0s
prometheus-server-conf     1     0s

==> v1/Job
NAME                            COMPLETIONS  DURATION  AGE
grafana-prom-import-dashboards  0/1          0s        0s

==> v1/Pod(related)
NAME                                      READY  STATUS             RESTARTS  AGE
alertmanager-deployment-db58649dd-mkcjn   0/1    ContainerCreating  0         0s
grafana-prom-deployment-8564b575dd-gvpfk  0/1    ContainerCreating  0         0s
grafana-prom-import-dashboards-xx9gl      0/1    ContainerCreating  0         0s
prometheus-deployment-d57b5c748-fk5gp     0/1    Pending            0         0s
prometheus-node-exporter-bcgkr            0/1    Pending            0         0s

==> v1/Secret
NAME                 TYPE    DAT

To access Grafana, we need to forward a port to the respective pod, as we did previously to access the Seldon Core deployment.
The credentials are simply `admin` / `password`.

This command needs to keep running in the background, so **please make sure you run it in a separate terminal**.

```bash
$ kubectl port-forward \
    $(kubectl get pods \
        -l app=grafana-prom-server -o jsonpath='{.items[0].metadata.name}') \
    3000:3000
```

### Deploy models

Once the cluster is set up, the next step is to upload these models to a common repository and to deploy a `SeldonDeployment` spec with two predictors to `k8s`.

#### Upload models (optional)

To make sure our `k8s` pods have access to the models we have just trained using `MLflow`, we will upload them to Google Cloud Storage. Note that to run these commands you need write access to the `gs://seldon-models` bucket and you need to have `gsutil` set up.

We will upload both versions of the model to:

- `gs://seldon-models/mlflow/model-a`
- `gs://seldon-models/mlflow/model-b`

In [8]:
!gsutil cp -r mlruns/0/169119a7fe1e4b31a746e891499552b0/artifacts/model gs://seldon-models/mlflow/model-a
!gsutil cp -r mlruns/0/5a6be5a1ef844783a50a6577745dbdc3/artifacts/model gs://seldon-models/mlflow/model-b

Copying file://mlruns/0/169119a7fe1e4b31a746e891499552b0/artifacts/model/model.pkl [Content-Type=application/octet-stream]...
Copying file://mlruns/0/169119a7fe1e4b31a746e891499552b0/artifacts/model/conda.yaml [Content-Type=application/octet-stream]...
Copying file://mlruns/0/169119a7fe1e4b31a746e891499552b0/artifacts/model/MLmodel [Content-Type=application/octet-stream]...
- [3 files][  1.1 KiB/  1.1 KiB]                                                
Operation completed over 3 objects/1.1 KiB.                                      
Copying file://mlruns/0/5a6be5a1ef844783a50a6577745dbdc3/artifacts/model/model.pkl [Content-Type=application/octet-stream]...
Copying file://mlruns/0/5a6be5a1ef844783a50a6577745dbdc3/artifacts/model/conda.yaml [Content-Type=application/octet-stream]...
Copying file://mlruns/0/5a6be5a1ef844783a50a6577745dbdc3/artifacts/model/MLmodel [Content-Type=application/octet-stream]...
- [3 files][  1.1 KiB/  1.1 KiB]                                                
Op

#### Deploy specs

We will deploy our A/B inference graph to our `k8s` cluster. As we can see below, we will route 50% of the traffic to each of the models.

In [24]:
!pygmentize ./serving/model-a-b.yaml
!kubectl apply -f ./serving/model-a-b.yaml

---
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: wines-classifier
spec:
  name: wines-classifier
  predictors:
  - graph:
      children: []
      implementation: MLFLOW_SERVER
      modelUri: gs://seldon-models/mlflow/model-a
      name: wines-classifier
    name: model-a
    replicas: 1
    traffic: 50
  - graph:
      children: []
      implementation: MLFLOW_SERVER
      modelUri: gs://seldon-models/mlflow/model-b
      name: wines-classifier
    name: model-b
    replicas: 1
    traffic: 50
seldondeployment.machinelearning.seldon.io/wines-classifier created
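The same spec can be generated from Python, which makes it easier to experiment with different traffic splits. The sketch below mirrors the fields of the manifest shown above. The helper name `build_ab_deployment` is hypothetical, and the check that the traffic adds up to 100 is a sanity check of our own rather than a documented rule of the operator.

```python
import json


def build_ab_deployment(name, variants):
    """Build a SeldonDeployment dict from (predictor_name, model_uri, traffic) tuples."""
    total = sum(traffic for _, _, traffic in variants)
    if total != 100:
        raise ValueError(f"traffic must add up to 100, got {total}")
    predictors = [
        {
            "graph": {
                "children": [],
                "implementation": "MLFLOW_SERVER",
                "modelUri": model_uri,
                "name": name,
            },
            "name": predictor_name,
            "replicas": 1,
            "traffic": traffic,
        }
        for predictor_name, model_uri, traffic in variants
    ]
    return {
        "apiVersion": "machinelearning.seldon.io/v1alpha2",
        "kind": "SeldonDeployment",
        "metadata": {"name": name},
        "spec": {"name": name, "predictors": predictors},
    }


deployment = build_ab_deployment("wines-classifier", [
    ("model-a", "gs://seldon-models/mlflow/model-a", 50),
    ("model-b", "gs://seldon-models/mlflow/model-b", 50),
])
print(json.dumps(deployment, indent=2))
```

Since `kubectl apply -f` accepts JSON as well as YAML, the printed document could be saved to a file and applied in the same way.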

We can verify these have been deployed by checking the pods and `SeldonDeployment` resources in the cluster.

In [25]:
!kubectl get pods

NAME                                                READY   STATUS     RESTARTS   AGE
ambassador-5c76696fcc-7rdlh                         1/1     Running    0          4m32s
ambassador-5c76696fcc-m5ndq                         1/1     Running    0          4m32s
ambassador-5c76696fcc-p6ddq                         1/1     Running    0          4m32s
wines-classifier-model-a-77efeb1-76b468f4dc-969jv   0/2     Init:0/1   0          4s
wines-classifier-model-b-77efeb1-64f6d4ddc-7rfcj    0/2     Init:0/1   0          4s


In [26]:
!kubectl get sdep

NAME               AGE
wines-classifier   4s


#### Test models

We will now run a sample query to test that the inference graph is working.

In [52]:
!http \
    --print b \
    localhost:8003/seldon/default/wines-classifier/api/v0.1/predictions \
    data:='{\
        "names": ["fixed acidity","volatile acidity","citric acid","residual sugar","chlorides","free sulfur dioxide","total sulfur dioxide","density","pH","sulphates","alcohol"], \
        "ndarray": [[7,0.27,0.36,20.7,0.045,45,170,1.001,3,0.45,8.8]] \
    }'

{
    "data": {
        "names": [],
        "ndarray": [
            5.554168965299016
        ]
    },
    "meta": {
        "metrics": [],
        "puid": "oqotru9j2ga7h7gfcs8csqqhq6",
        "requestPath": {
            "wines-classifier": "seldonio/mlflowserver_rest:0.2"
        },
        "routing": {},
        "tags": {}
    }
}
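The same request can be sent from Python using only the standard library. The sketch below builds the JSON body used in the query above. The helper names `build_payload` and `predict` are hypothetical, and the network call is left as a comment because it needs the port-forward from the set-up section to be running.

```python
import json
import urllib.request

FEATURE_NAMES = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol",
]


def build_payload(names, rows):
    """Wrap feature names and rows in the `data` envelope used by the query above."""
    return {"data": {"names": list(names), "ndarray": [list(row) for row in rows]}}


def predict(url, payload, timeout=5.0):
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_payload(
    FEATURE_NAMES,
    [[7, 0.27, 0.36, 20.7, 0.045, 45, 170, 1.001, 3, 0.45, 8.8]],
)
print(json.dumps(payload))

# With the port-forward running:
# predict("http://localhost:8003/seldon/default/wines-classifier/api/v0.1/predictions", payload)
```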



### Analytics

Now that we have both models running in production, we can analyse their performance using Seldon Core's integration with Prometheus and Grafana.
To do so, we will iterate over the training set (which can be found in `./training/wine-quality.csv`), making a prediction request for each row and sending feedback on that prediction.

Since the `/feedback` endpoint requires a `reward` signal (i.e. higher is better), we will simulate one as

$$
  R(x_{n})
    = \begin{cases}
        \frac{1}{(y_{n} - f(x_{n}))^{2}} &, y_{n} \neq f(x_{n}) \\
        500 &, y_{n} = f(x_{n})
      \end{cases}
$$

where $R(x_{n})$ is the reward for input point $x_{n}$, $f(x_{n})$ is the prediction of our trained model and $y_{n}$ is the actual value.
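To get a feel for the scale of this reward, here is a scalar sketch of the same formula evaluated for a few predictions. The function name `reward` and the sample values (a true quality of 6 and four made-up predictions) are chosen purely for illustration.

```python
def reward(y_true, y_pred):
    """Inverse squared error, with a fixed value of 500 for an exact match."""
    if y_true == y_pred:
        return 500.0
    return 1.0 / (y_true - y_pred) ** 2


# Made-up predictions for a sample whose true quality is 6.
for prediction in (5.0, 5.5, 5.9, 6.0):
    print(prediction, reward(6.0, prediction))
```

One property worth keeping in mind is that errors smaller than $1/\sqrt{500} \approx 0.045$ in absolute value yield a reward above 500, so the exact-match value is not the largest reward this formula can produce.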

In [70]:
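import pandas as pd
import numpy as np
from seldon_core.seldon_client import SeldonClient

# Client that reaches the deployment through the Ambassador gateway
sc = SeldonClient(
    gateway="ambassador",
    namespace="default",
    deployment_name='wines-classifier')

df = pd.read_csv("./training/wine-quality.csv")

def _get_reward(y, y_pred):
    # An exact match gets a fixed reward (this also avoids dividing by zero).
    # It is returned as an array so that it can be indexed like the other case.
    if y == y_pred:
        return np.full_like(y, 500, dtype=float)

    # Otherwise, the reward is the inverse of the squared error
    return 1 / np.square(y - y_pred)

def _test_row(row):
    # The last column holds the target; the rest are the input features
    input_features = row.iloc[:-1]
    feature_names = input_features.index.to_list()
    X = input_features.values.reshape(1, -1)
    y = row.iloc[-1].reshape(1, -1)

    # Ask the inference graph for a prediction
    r = sc.predict(
        data=X,
        names=feature_names)

    # Send the simulated reward back through the feedback endpoint
    y_pred = r.response.data.tensor.values
    reward = _get_reward(y, y_pred)
    sc.feedback(
        prediction_request=r.request,
        prediction_response=r.response,
        reward=reward)

    return reward[0]

df.apply(_test_row, axis=1)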

0        [5.031058953096899]
1        [3.570741580101588]
2        [70.55147231720159]
3       [10.762015969288063]
4       [10.762015969288063]
                ...         
4893     [375.9028525034419]
4894    [1.8348875422756648]
4895     [9.053003755655473]
4896    [3.8999412500944777]
4897     [22.25374886390087]
Length: 4898, dtype: object

We can visualise the Grafana dashboard below.

![Seldon Analytics](./images/seldon-analytics.png)