# MLFlow and Seldon

### End-to-end example integrating MLFlow and Seldon, with A/B testing of the models.

![MLFlow](../images/mlflow_framework.png)

## Training

This first section covers how to train models using MLFlow.

### MLproject

The MLproject file defines:
- The environment where the training runs.
- The hyperparameters that can be tweaked. In our case, these are $\{\alpha, l_{1}\}$.
- The interface to train the model.

In [1]:
!ccat ./training/MLproject

[34mname[39;49;00m[31m:[39;49;00m [34mmlflow[39;49;00m[31m-[39;49;00m[34mtalk[39;49;00m

[34mconda_env[39;49;00m[31m:[39;49;00m [34mconda[39;49;00m[31m.[39;49;00m[34myaml[39;49;00m

[34mentry_points[39;49;00m[31m:[39;49;00m
  [34mmain[39;49;00m[31m:[39;49;00m
    [34mparameters[39;49;00m[31m:[39;49;00m
      [34malpha[39;49;00m[31m:[39;49;00m [34mfloat[39;49;00m
      [34ml1_ratio[39;49;00m[31m:[39;49;00m [31m{[39;49;00m[34mtype[39;49;00m[31m:[39;49;00m [34mfloat[39;49;00m[31m,[39;49;00m [34mdefault[39;49;00m[31m:[39;49;00m [34m0.1[39;49;00m[31m}[39;49;00m
    [34mcommand[39;49;00m[31m:[39;49;00m [33m"python train.py {alpha} {l1_ratio}"[39;49;00m


This allows us to have a single command to train the model. 

``` bash
$ mlflow run ./training -P alpha=... -P l1_ratio=...
```
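
To make the link between the `-P` flags and the entry point's `command` template concrete, here is a minimal sketch of the substitution step. It mimics the behaviour visible in the run logs (a missing `l1_ratio` falls back to its default of `0.1`) and is not MLflow's own implementation; `render_command` is a hypothetical helper introduced only for illustration.

```python
import shlex

# Parameter declarations copied from the MLproject file above.
# A bare type means "required"; a dict may carry a default value.
PARAMETERS = {
    "alpha": "float",
    "l1_ratio": {"type": "float", "default": 0.1},
}
COMMAND = "python train.py {alpha} {l1_ratio}"


def render_command(command, declared, overrides):
    """Fill the command template with overrides and declared defaults.

    Hypothetical helper: an illustration, not part of the MLflow API.
    """
    values = {}
    for name, spec in declared.items():
        if name in overrides:
            values[name] = overrides[name]
        elif isinstance(spec, dict) and "default" in spec:
            values[name] = spec["default"]
        else:
            raise ValueError(f"missing required parameter: {name}")
    # Quote each value so the rendered string is safe to hand to a shell.
    quoted = {key: shlex.quote(str(value)) for key, value in values.items()}
    return command.format(**quoted)


print(render_command(COMMAND, PARAMETERS, {"alpha": 0.5}))
```

Passing only `alpha` leaves `l1_ratio` at its declared default, which is what happens in the training runs of this notebook.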

For our example, we will train two versions of the model, which we'll later compare using A/B testing.

- $M_{1}$ with $\alpha = 0.5$
- $M_{2}$ with $\alpha = 1.0$

In [2]:
!mlflow run ./training -P alpha=0.5

2019/11/13 16:06:20 INFO mlflow.projects: === Created directory /tmp/tmp65kd3by1 for downloading remote URIs passed to arguments of type 'path' ===
2019/11/13 16:06:20 INFO mlflow.projects: === Running command 'source /home/akash/miniconda3/bin/../etc/profile.d/conda.sh && conda activate mlflow-1ecba04797edb7e7f7212d429debd9b664c31651 1>&2 && python train.py 0.5 0.1' in run with ID '7d6024ced4fa4a23958e769993084a59' === 
Elasticnet model (alpha=0.500000, l1_ratio=0.100000):
  RMSE: 0.7947931019036529
  MAE: 0.6189130834228138
  R2: 0.18411668718221819
2019/11/13 16:06:22 INFO mlflow.projects: === Run (ID '7d6024ced4fa4a23958e769993084a59') succeeded ===


In [15]:
!mlflow run ./training -P alpha=1.0

2019/11/04 18:52:21 INFO mlflow.projects: === Created directory /tmp/tmpv4thjgnr for downloading remote URIs passed to arguments of type 'path' ===
2019/11/04 18:52:21 INFO mlflow.projects: === Running command 'source /home/akash/miniconda3/bin/../etc/profile.d/conda.sh && conda activate mlflow-1ecba04797edb7e7f7212d429debd9b664c31651 1>&2 && python train.py 1.0 0.1' in run with ID 'a3072d27f2cc40b990b9ff633c2c4131' === 
Elasticnet model (alpha=1.000000, l1_ratio=0.100000):
  RMSE: 0.8107373707184711
  MAE: 0.6241295925236751
  R2: 0.15105362812007328
2019/11/04 18:52:22 INFO mlflow.projects: === Run (ID 'a3072d27f2cc40b990b9ff633c2c4131') succeeded ===
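
Each run prints three regression metrics: RMSE, MAE and R2. The `train.py` source is not shown here, so the sketch below assumes the standard textbook definitions and uses plain Python on made-up numbers; `regression_metrics` is a hypothetical helper, not a function from the training script.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    rmse = math.sqrt(ss_res / n)                  # root mean squared error
    mae = sum(abs(e) for e in errors) / n         # mean absolute error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R2 is undefined when all targets are identical (ss_tot == 0).
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return rmse, mae, r2


# Illustrative values only; these are not taken from the wine dataset.
rmse, mae, r2 = regression_metrics([5, 6, 7, 5], [5.5, 5.5, 6.0, 5.0])
print(f"RMSE: {rmse:.4f}  MAE: {mae:.4f}  R2: {r2:.4f}")
```

Lower RMSE and MAE and a higher R2 indicate a better fit, which is how the two runs above can be compared.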


### MLtrack

The `train.py` script uses the `mlflow.log_param()` and `mlflow.log_metric()` functions to track each experiment. These are part of the `MLtrack` API, which records the parameters and results of each experiment. The records can be stored on a remote server and shared across the entire team. In our example, however, we will store them locally in an `mlruns` folder.

In [3]:
!tree mlruns

[01;34mmlruns[00m
└── [01;34m0[00m
    ├── [01;34m1ff63a6b537444df94458059bce313a7[00m
    │   ├── [01;34martifacts[00m
    │   │   └── [01;34mmodel[00m
    │   │       ├── conda.yaml
    │   │       ├── MLmodel
    │   │       └── model.pkl
    │   ├── meta.yaml
    │   ├── [01;34mmetrics[00m
    │   │   ├── mae
    │   │   ├── r2
    │   │   └── rmse
    │   ├── [01;34mparams[00m
    │   │   ├── alpha
    │   │   └── l1_ratio
    │   └── [01;34mtags[00m
    │       ├── mlflow.project.backend
    │       ├── mlflow.project.entryPoint
    │       ├── mlflow.project.env
    │       ├── mlflow.source.git.commit
    │       ├── mlflow.source.name
    │       ├── mlflow.source.type
    │       └── mlflow.user
    ├── [01;34m7d6024ced4fa4a23958e769993084a59[00m
    │   ├── [01;34martifacts[00m
    │   │   └── [01;34mmodel[00m
    │   │       ├── conda.yaml
    │   │       ├── MLmodel
    │   │       └── model.pkl
    │   ├── meta.yaml
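
Because the tracking data is stored as plain files, it can also be read back with a few lines of Python. The sketch below walks the folder layout shown in the tree above. It assumes that each file under `params` holds just the value and that each line of a file under `metrics` has the form `<timestamp> <value> [<step>]`; that format is an assumption about the local file store, and `read_run` and `read_experiment` are hypothetical helpers.

```python
from pathlib import Path


def read_run(run_dir):
    """Collect the parameters and latest metric values of one run folder."""
    run_dir = Path(run_dir)
    params = {
        p.name: p.read_text().strip()
        for p in sorted((run_dir / "params").iterdir())
    }
    metrics = {}
    for metric_file in sorted((run_dir / "metrics").iterdir()):
        lines = metric_file.read_text().strip().splitlines()
        # Assumed line format: "<timestamp> <value> [<step>]"; keep the last entry.
        metrics[metric_file.name] = float(lines[-1].split()[1])
    return {"run_id": run_dir.name, "params": params, "metrics": metrics}


def read_experiment(experiment_dir):
    """Read every run folder, skipping plain files such as meta.yaml."""
    return [
        read_run(d)
        for d in sorted(Path(experiment_dir).iterdir())
        if d.is_dir() and (d / "params").is_dir() and (d / "metrics").is_dir()
    ]


# Only runs when the local tracking folder from this notebook is present.
if Path("mlruns/0").is_dir():
    for run in read_experiment("mlruns/0"):
        print(run["run_id"], run["params"], run["metrics"])
```

This gives a quick textual comparison of the runs without starting any server.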
   

We can also run `mlflow ui` to show these visually. This will start the MLflow server at http://localhost:5000.

```bash
$ mlflow ui
```

In [18]:
# !mlflow ui

[2019-11-04 18:52:34 +0530] [11409] [INFO] Starting gunicorn 19.9.0
[2019-11-04 18:52:34 +0530] [11409] [INFO] Listening at: http://127.0.0.1:5000 (11409)
[2019-11-04 18:52:34 +0530] [11409] [INFO] Using worker: sync
[2019-11-04 18:52:34 +0530] [11412] [INFO] Booting worker with pid: 11412
^C
[2019-11-04 18:52:45 +0530] [11409] [INFO] Handling signal: int
[2019-11-04 18:52:46 +0530] [11412] [INFO] Worker exiting (pid: 11412)


![MLFlow UI](../images/mlflow-ui.png)

### MLmodel

The `MLmodel` file allows us to version and share models easily. Below we can see an example.

In [4]:
!ccat ./mlruns/0/a3072d27f2cc40b990b9ff633c2c4131/artifacts/model/MLmodel

[34martifact_path[39;49;00m[31m:[39;49;00m [34mmodel[39;49;00m
[34mflavors[39;49;00m[31m:[39;49;00m
  [34mpython_function[39;49;00m[31m:[39;49;00m
    [34mdata[39;49;00m[31m:[39;49;00m [34mmodel[39;49;00m[31m.[39;49;00m[34mpkl[39;49;00m
    [34menv[39;49;00m[31m:[39;49;00m [34mconda[39;49;00m[31m.[39;49;00m[34myaml[39;49;00m
    [34mloader_module[39;49;00m[31m:[39;49;00m [34mmlflow[39;49;00m[31m.[39;49;00m[34msklearn[39;49;00m
    [34mpython_version[39;49;00m[31m:[39;49;00m [34m3.6[39;49;00m[34m.9[39;49;00m
  [34msklearn[39;49;00m[31m:[39;49;00m
    [34mpickled_model[39;49;00m[31m:[39;49;00m [34mmodel[39;49;00m[31m.[39;49;00m[34mpkl[39;49;00m
    [34mserialization_format[39;49;00m[31m:[39;49;00m [34mcloudpickle[39;49;00m
    [34msklearn_version[39;49;00m[31m:[39;49;00m [34m0.19[39;49;00m[34m.1[39;49;00m
[34mrun_id[39;49;00m[31m:[39;49;00m [34ma3072d27f2cc40b990b9ff633c2c4131[39;49;00m
[34

As we can see above, the `MLmodel` file keeps track of, among other things:

- The run ID, `a3072d27f2cc40b990b9ff633c2c4131`
- The creation date
- The version of `sklearn`
- How the model was stored

As we shall see shortly, Seldon's pre-packaged MLflow model server will use this file to serve the model.

## Serving

### To serve this model we will use Seldon.
### Seldon Core is an open source platform for deploying machine learning models on a Kubernetes cluster.

![Seldon](../images/seldon.png)

### Why do we need this?

...

### Set up

Before anything, we will first set up the `k8s` cluster.

#### Create k8s cluster

We will create a local cluster using [kind](https://kind.sigs.k8s.io).

In [80]:
!kind create cluster
# !export KUBECONFIG="$(kind get kubeconfig-path --name=kind)"
# !kind delete cluster

Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.15.3) 🖼
 ✓ Preparing nodes 📦 
 ✓ Creating kubeadm config 📜 
 ✓ Starting control-plane 🕹️ 
 ✓ Installing CNI 🔌 
 ✓ Installing StorageClass 💾 
Cluster creation complete. You can now use the cluster with:

export KUBECONFIG="$(kind get kubeconfig-path --name="kind")"
kubectl cluster-info


In [81]:
!kubectl create namespace seldon
!kubectl config set-context $(kubectl config current-context) --namespace=seldon

namespace/seldon created
Context "kubernetes-admin@kind" modified.


In [82]:
!kubectl create clusterrolebinding kube-system-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:default

clusterrolebinding.rbac.authorization.k8s.io/kube-system-cluster-admin created


We then install Helm and a corresponding service account.

In [83]:
!kubectl -n kube-system create sa tiller
!kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
!helm init --service-account tiller

serviceaccount/tiller created
clusterrolebinding.rbac.authorization.k8s.io/tiller created
$HELM_HOME has been configured at /home/akash/.helm.

Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.

Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy.
To prevent this, run `helm init` with the --tiller-tls-verify flag.
For more information on securing your installation see: https://docs.helm.sh/using_helm/#securing-your-helm-installation
Happy Helming!


In [84]:
!kubectl rollout status deploy/tiller-deploy -n kube-system

Waiting for deployment "tiller-deploy" rollout to finish: 0 of 1 updated replicas are available...
deployment "tiller-deploy" successfully rolled out


In [85]:
!kind get clusters
!echo $KUBECONFIG
!kubectl cluster-info
!helm init --history-max 200
# !kubectl rollout status deploy/tiller-deploy -n kube-system
# !kubectl create serviceaccount --namespace kube-system tiller
# !kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
# !kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'

kind
/home/akash/.kube/kind-config-kind
Kubernetes master is running at https://127.0.0.1:41081
KubeDNS is running at https://127.0.0.1:41081/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
$HELM_HOME has been configured at /home/akash/.helm.
(Use --client-only to suppress this message, or --upgrade to upgrade Tiller to the current version.)
Happy Helming!


We can now install `seldon-core` on the new cluster using `helm`.

In [86]:
!helm install \
    seldon-core-operator \
    --name seldon-core \
    --repo https://storage.googleapis.com/seldon-charts \
    --namespace seldon-system \
    --set usagemetrics.enabled=true \
    --set ambassador.enabled=true

NAME:   seldon-core
LAST DEPLOYED: Wed Nov 13 18:42:47 2019
NAMESPACE: seldon-system
STATUS: DEPLOYED

RESOURCES:
==> v1/Role
NAME                         AGE
seldon-leader-election-role  0s
seldon-manager-cm-role       0s

==> v1/Deployment
NAME                       DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
seldon-controller-manager  1        1        1           0          0s

==> v1/ServiceAccount
NAME            SECRETS  AGE
seldon-manager  1        0s

==> v1beta1/CustomResourceDefinition
NAME                                         AGE
seldondeployments.machinelearning.seldon.io  0s

==> v1/ClusterRole
seldon-manager-css-role  0s
seldon-manager-role      0s
seldon-manager-sas-role  0s
seldon-proxy-role        0s

==> v1/ClusterRoleBinding
NAME                            AGE
seldon-manager-css-rolebinding  0s
seldon-manager-rolebinding      0s
seldon-manager-sas-rolebinding  0s
seldon-proxy-rolebinding        0s

==> v1/RoleBinding
NAME                                AGE
seldo

In [87]:
!kubectl rollout status deploy/seldon-controller-manager -n seldon-system

Waiting for deployment "seldon-controller-manager" rollout to finish: 0 of 1 updated replicas are available...
deployment "seldon-controller-manager" successfully rolled out


Finally, we install `ambassador`, which will allow us to reach the Seldon engine in the cluster.

In [88]:
!helm install stable/ambassador --name ambassador --set crds.keep=false

NAME:   ambassador
LAST DEPLOYED: Wed Nov 13 18:43:41 2019
NAMESPACE: seldon
STATUS: DEPLOYED

RESOURCES:
==> v1beta1/ClusterRoleBinding
NAME        AGE
ambassador  0s

==> v1/Service
NAME               TYPE          CLUSTER-IP     EXTERNAL-IP  PORT(S)                     AGE
ambassador-admins  ClusterIP     10.101.122.99  <none>       8877/TCP                    0s
ambassador         LoadBalancer  10.102.36.166  <pending>    80:30649/TCP,443:31205/TCP  0s

==> v1/Deployment
NAME        DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
ambassador  3        3        3           0          0s

==> v1/Pod(related)
NAME                         READY  STATUS             RESTARTS  AGE
ambassador-79744f49fd-9w5pt  0/1    ContainerCreating  0         0s
ambassador-79744f49fd-wnnvz  0/1    ContainerCreating  0         0s
ambassador-79744f49fd-z4dcl  0/1    ContainerCreating  0         0s

==> v1/ServiceAccount
NAME        SECRETS  AGE
ambassador  1        1s

==> v1beta1/CustomResourceDefinition
NAM

In [89]:
!kubectl rollout status deployment.apps/ambassador

Waiting for deployment "ambassador" rollout to finish: 0 of 3 updated replicas are available...
Waiting for deployment "ambassador" rollout to finish: 1 of 3 updated replicas are available...
Waiting for deployment "ambassador" rollout to finish: 2 of 3 updated replicas are available...
deployment "ambassador" successfully rolled out


#### Forward port

Once the cluster has been created, we need to allow access to the `ambassador` gateway from outside the cluster.
One way to do this is to use the `kubectl port-forward` command.
In particular, we will forward port `8003` on our local host to port `8080` of the cluster's gateway.

This command needs to keep running in the background, so **please make sure you run it in a separate terminal**.

```bash
kubectl \
    port-forward \
    $(kubectl get pods \
        -l app.kubernetes.io/name=ambassador -o jsonpath='{.items[0].metadata.name}') \
    8003:8080
```
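
Since the port-forward runs in another terminal, it is easy to forget to start it. As a small sanity check, the sketch below tests whether anything is listening on the forwarded port. It only checks that a TCP connection can be opened, not that the gateway behind it is healthy; `port_is_open` is a hypothetical helper.

```python
import socket


def port_is_open(host="localhost", port=8003, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if port_is_open():
    print("Port 8003 is reachable; the port-forward appears to be running.")
else:
    print("Port 8003 is closed; start the port-forward in another terminal.")
```

The same helper can be pointed at any other forwarded port by changing the `port` argument.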

#### Install Seldon Core Analytics

Later, after we deploy the models, we will compare their performance using Seldon Core's integration with Prometheus and Grafana.
For that part to work, we first need to install the `seldon-core-analytics` chart, which deploys both Prometheus and Grafana.

In [100]:
!helm install seldon-core-analytics --name seldon-core-analytics \
     --repo https://storage.googleapis.com/seldon-charts \
     --set grafana_prom_admin_password=password \
     --set persistence.enabled=false

NAME:   seldon-core-analytics
LAST DEPLOYED: Wed Nov 13 19:16:27 2019
NAMESPACE: seldon
STATUS: DEPLOYED

RESOURCES:
==> v1/Pod(related)
NAME                                      READY  STATUS             RESTARTS  AGE
grafana-prom-import-dashboards-d867r      0/1    ContainerCreating  0         1s
alertmanager-deployment-56c4cb6977-86rmk  0/1    ContainerCreating  0         1s
grafana-prom-deployment-8564b575dd-kgk6g  0/1    ContainerCreating  0         0s
prometheus-node-exporter-tkw6t            0/1    ContainerCreating  0         0s
prometheus-deployment-847fdcf987-lcrvp    0/1    Pending            0         0s

==> v1beta1/ClusterRole
NAME        AGE
prometheus  1s

==> v1/Job
NAME                            DESIRED  SUCCESSFUL  AGE
grafana-prom-import-dashboards  1        0           1s

==> v1beta1/Deployment
NAME                     DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
alertmanager-deployment  1        1        1           0          1s
grafana-prom-deployment  1      

To access Grafana, we need to forward a port to the respective pod, as we did previously to access the Seldon Core deployment.
The credentials are `admin` / `password`.

This command needs to keep running in the background, so **please make sure you run it in a separate terminal**.

```bash
$ kubectl port-forward \
    $(kubectl get pods \
        -l app=grafana-prom-server -o jsonpath='{.items[0].metadata.name}') \
    3000:3000
```

### Deploy models

Once the cluster is set up, the next step will be to upload these models into a common repository and to deploy two `SeldonDeployment` specs to `k8s`.

#### Upload models (optional)

To make sure our `k8s` pods have access to the models we have just trained using `MLflow`, we will upload them to Google Cloud Storage. Note that to run these commands you need write access to the `gs://seldon-models-dhs` bucket and you need to have `gsutil` set up.

We will upload both versions of the model to:

- `gs://seldon-models-dhs/mlflow/model-a`
- `gs://seldon-models-dhs/mlflow/model-b`

In [31]:
# !gsutil cp -r mlruns/0/a3072d27f2cc40b990b9ff633c2c4131/artifacts/model gs://seldon-models-dhs/mlflow/model-a
!gsutil ls

gs://seldon-models-dhs/


#### Deploy specs

We will deploy our A/B inference graph to our `k8s` cluster, with the aim of routing 50% of the traffic to each of the models. Note that the spec shown below covers only `model-a`.

In [90]:
!pygmentize ./serving/model-a.yaml
!kubectl apply -f ./serving/model-a.yaml

[04m[36m---[39;49;00m
[94mapiVersion[39;49;00m: machinelearning.seldon.io/v1alpha2
[94mkind[39;49;00m: SeldonDeployment
[94mmetadata[39;49;00m:
  [94mname[39;49;00m: model-a
[94mspec[39;49;00m:
  [94mname[39;49;00m: model-a
  [94mpredictors[39;49;00m:
  - [94mgraph[39;49;00m:
      [94mchildren[39;49;00m: []
      [94mimplementation[39;49;00m: MLFLOW_SERVER
      [94mmodelUri[39;49;00m: gs://seldon-models-dhs/mlflow/model-a
      [94mname[39;49;00m: wines-classifier
    [94mname[39;49;00m: default
    [94mreplicas[39;49;00m: 1
seldondeployment.machinelearning.seldon.io/model-a created
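
Since the two models differ only in their name and `modelUri`, the manifests can be generated rather than written by hand. The sketch below builds the same structure as the YAML above as a Python dictionary and prints it as JSON, which `kubectl apply -f` also accepts. The `model-b` URI is an assumption that follows the `model-a` naming pattern, and `make_seldon_deployment` is a hypothetical helper.

```python
import json


def make_seldon_deployment(name, model_uri, graph_name="wines-classifier"):
    """Build a SeldonDeployment manifest mirroring the YAML shown above."""
    return {
        "apiVersion": "machinelearning.seldon.io/v1alpha2",
        "kind": "SeldonDeployment",
        "metadata": {"name": name},
        "spec": {
            "name": name,
            "predictors": [
                {
                    "graph": {
                        "children": [],
                        "implementation": "MLFLOW_SERVER",
                        "modelUri": model_uri,
                        "name": graph_name,
                    },
                    "name": "default",
                    "replicas": 1,
                }
            ],
        },
    }


# The model-b URI is assumed; only model-a appears in the spec above.
specs = [
    make_seldon_deployment("model-a", "gs://seldon-models-dhs/mlflow/model-a"),
    make_seldon_deployment("model-b", "gs://seldon-models-dhs/mlflow/model-b"),
]
for spec in specs:
    print(json.dumps(spec, indent=2))
```

Each printed document would need to be saved to its own file before being applied to the cluster.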


We can verify these have been deployed by checking the pods and `SeldonDeployment` resources in the cluster.

In [94]:
!kubectl get pods

NAME                                       READY   STATUS    RESTARTS   AGE
ambassador-79744f49fd-9w5pt                1/1     Running   0          9m12s
ambassador-79744f49fd-wnnvz                1/1     Running   0          9m12s
ambassador-79744f49fd-z4dcl                1/1     Running   0          9m12s
model-a-default-77efeb1-7f687c5b8b-hjts2   2/2     Running   0          6m8s


In [92]:
!kubectl get deploy

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
ambassador                3/3     3            3           4m8s
model-a-default-77efeb1   0/1     1            0           64s


In [93]:
!kubectl rollout status deploy/model-a-default-77efeb1

Waiting for deployment "model-a-default-77efeb1" rollout to finish: 0 of 1 updated replicas are available...
deployment "model-a-default-77efeb1" successfully rolled out


In [95]:
!kubectl get sdep

NAME      AGE
model-a   6m9s


#### Test models

We will now run a sample query to test that the inference graph is working.

In [96]:
from seldon_core.seldon_client import SeldonClient
sc = SeldonClient(deployment_name="model-a",namespace="seldon")

In [113]:
r = sc.predict(gateway="ambassador",transport="rest",shape=(1,11))
print(r)

Success:True message:
Request:
data {
  tensor {
    shape: 1
    shape: 11
    values: 0.679423648454573
    values: 0.40174071357426766
    values: 0.4907872244252788
    values: 0.9652199625809575
    values: 0.35011405617402425
    values: 0.5314154532123815
    values: 0.9085835710372856
    values: 0.14433139025482644
    values: 0.2971222340126688
    values: 0.09946862715362115
    values: 0.8433473381558532
  }
}

Response:
meta {
  puid: "36f0h6m1tss6j5aumep1i4019f"
  requestPath {
    key: "wines-classifier"
    value: "seldonio/mlflowserver_rest:0.2"
  }
}
data {
  tensor {
    shape: 1
    values: 4.792492792022103
  }
}
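
The request printed above shows that, when no data is passed, the client sends random values with the requested shape. As a rough illustration of what such a payload looks like as JSON, the sketch below builds an equivalent structure with the standard library. The field names mirror the printed request; `random_tensor_payload` is a hypothetical helper and not part of `seldon_core`.

```python
import json
import random


def random_tensor_payload(shape, seed=None):
    """Build a tensor payload filled with uniform random values in [0, 1)."""
    rng = random.Random(seed)
    size = 1
    for dim in shape:
        size *= dim  # total number of values is the product of the dimensions
    return {
        "data": {
            "tensor": {
                "shape": list(shape),
                "values": [rng.random() for _ in range(size)],
            }
        }
    }


# One row with 11 features, matching shape=(1, 11) in the call above.
payload = random_tensor_payload((1, 11), seed=0)
print(json.dumps(payload, indent=2))
```

Random inputs are enough for a smoke test like this one, although the predicted value is only meaningful for realistic feature values.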



In [99]:
    
!http \
    --print b \
    localhost:8003/seldon/seldon/model-a/api/v0.1/predictions \
    data:='{\
        "names": ["fixed acidity","volatile acidity","citric acid","residual sugar","chlorides","free sulfur dioxide","total sulfur dioxide","density","pH","sulphates","alcohol"], \
        "ndarray": [[7,0.27,0.36,20.7,0.045,45,170,1.001,3,0.45,8.8]] \
    }'
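
The same request can be sent from Python without any extra dependencies. The sketch below builds the JSON body used in the `http` call and posts it with `urllib`. The URL is passed in by the caller; the helper names `build_request` and `predict` are hypothetical.

```python
import json
import urllib.request

FEATURE_NAMES = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol",
]


def build_request(url, names, rows):
    """Create a POST request carrying an 'ndarray' payload as JSON."""
    body = json.dumps({"data": {"names": names, "ndarray": rows}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(url, names, rows, timeout=10):
    """Send the request and return the decoded JSON response."""
    request = build_request(url, names, rows)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the port-forward running, calling `predict(url, FEATURE_NAMES, [[7, 0.27, 0.36, 20.7, 0.045, 45, 170, 1.001, 3, 0.45, 8.8]])` would send the same wine sample as above, where `url` is assumed to follow the pattern `http://localhost:8003/seldon/<namespace>/<deployment-name>/api/v0.1/predictions`.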





### Analytics

Now that we have both models running in production, we can analyse their performance using Seldon Core's integration with Prometheus and Grafana.
To do so, we will iterate over the training set (which can be found in `../data/wine-quality.csv`), making a prediction request for each row and sending back feedback on that prediction.

Since the `/feedback` endpoint requires a `reward` signal (i.e. higher better), we will simulate one as

$$
  R(x_{n})
    = \begin{cases}
        \frac{1}{(y_{n} - f(x_{n}))^{2}} &, y_{n} \neq f(x_{n}) \\
        500 &, y_{n} = f(x_{n})
      \end{cases}
$$

where $R(x_{n})$ is the reward for input point $x_{n}$, $f(x_{n})$ is our trained model and $y_{n}$ is the actual value.

In [115]:
import pandas as pd
import numpy as np
from seldon_core.seldon_client import SeldonClient

sc = SeldonClient(
    gateway="ambassador",
    namespace="seldon",
    deployment_name="model-a")

df = pd.read_csv("../data/wine-quality.csv")

def _get_reward(y, y_pred):
    # An exact match would divide by zero, so it gets a fixed reward.
    # Return an array so that callers can index the result either way.
    if y == y_pred:
        return np.full(np.shape(y), 500.0)

    # Otherwise the reward is the inverse of the squared error.
    return 1 / np.square(y - y_pred)

def _test_row(row):
    # The last column is the target; the rest are the input features.
    input_features = row[:-1]
    feature_names = input_features.index.to_list()
    X = input_features.values.reshape(1, -1)
    y = row[-1].reshape(1, -1)

    r = sc.predict(
        data=X,
        names=feature_names)

    # Send the reward for this prediction to the feedback endpoint.
    y_pred = r.response.data.tensor.values
    reward = _get_reward(y, y_pred)
    sc.feedback(
        prediction_request=r.request,
        prediction_response=r.response,
        reward=reward)

    return reward[0]

df.apply(_test_row, axis=1)

0        [6.292674122292678]
1       [5.3391608345408565]
2       [264.15460719776115]
3       [11.500813676302768]
4       [11.500813676302768]
                ...         
4893    [339.87306395634363]
4894     [1.454915963262557]
4895    [21.788327812948797]
4896    [1.4281815246963911]
4897    [114.20670836019673]
Length: 4898, dtype: object
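
The raw rewards are hard to read on their own, so it can help to map them back to prediction errors. For $y_{n} \neq f(x_{n})$ the reward definition inverts to $|y_{n} - f(x_{n})| = 1 / \sqrt{R(x_{n})}$. The sketch below applies this to the first five rewards printed above, using scalar helpers that are hypothetical and separate from the notebook code.

```python
import math
import statistics


def reward(y, y_pred, exact_match_reward=500.0):
    """Scalar version of the reward R(x_n) defined above."""
    if y == y_pred:
        return exact_match_reward
    return 1.0 / (y - y_pred) ** 2


def abs_error_from_reward(r):
    """Invert the reward: |y - f(x)| = 1 / sqrt(R), valid when y != f(x)."""
    return 1.0 / math.sqrt(r)


# First five rewards copied from the output above.
sample = [6.292674122292678, 5.3391608345408565, 264.15460719776115,
          11.500813676302768, 11.500813676302768]
errors = [abs_error_from_reward(r) for r in sample]
print("absolute errors:", [round(e, 3) for e in errors])
print("median reward:", statistics.median(sample))
```

Because the reward grows with the inverse square of the error, a few very accurate predictions can dominate an average, so a median may be a more robust summary when comparing models.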

We can now access the Grafana dashboard at http://localhost:3000 (credentials are `admin` / `password`). Inside the portal, we will go to the Prediction Analytics dashboard.

We can see a snapshot below.

![Seldon Analytics](../images/seldon-analytics.png)