# Model Explainer Example

![architecture](architecture.png)

In this example we will:

  * [Describe the project structure](#Project-Structure)
  * [Train some models](#Train-Models)
  * [Create Tempo artifacts](#Create-Tempo-Artifacts)
  * [Run unit tests](#Unit-Tests)
  * [Save Python environment for our classifier](#Save-Classifier-Environment)
  * [Test Locally on Docker](#Test-Locally-on-Docker)
  * [Production on Kubernetes via Tempo](#Production-Option-1-(Deploy-to-Kubernetes-with-Tempo))
  * [Production on Kubernetes via GitOps](#Production-Option-2-(Gitops))

## Prerequisites

This notebook needs to be run in the `tempo-examples` conda environment defined below. Create it from the project root folder:

```bash
conda env create --name tempo-examples --file conda/tempo-examples.yaml
```

## Project Structure

In [1]:
!tree -P "*.py"  -I "__init__.py|__pycache__" -L 2

.
├── artifacts
│   ├── explainer
│   └── model
├── k8s
│   └── rbac
└── src
    ├── constants.py
    ├── data.py
    ├── explainer.py
    ├── model.py
    └── tempo.py

6 directories, 5 files


## Train Models

 * This section is where you, as a data scientist, do the work of training models and creating artifacts.
 * For this example we train an sklearn classification model on the Adult (income) dataset, together with an Anchor Tabular explainer for that model.

In [2]:
import os
from tempo.utils import logger
import logging
import numpy as np
import json
logger.setLevel(logging.ERROR)
logging.basicConfig(level=logging.ERROR)
ARTIFACTS_FOLDER = os.getcwd()+"/artifacts"

In [3]:
from src.data import AdultData
data = AdultData()

In [4]:
from src.model import train_model
adult_model = train_model(ARTIFACTS_FOLDER, data)

Train accuracy:  0.9656333333333333
Test accuracy:  0.854296875


In [5]:
from src.explainer import train_explainer
train_explainer(ARTIFACTS_FOLDER, data, adult_model)

AnchorTabular(meta={
  'name': 'AnchorTabular',
  'type': ['blackbox'],
  'explanations': ['local'],
  'params': {'disc_perc': (25, 50, 75), 'seed': 1}}
)

## Create Tempo Artifacts


In [6]:
from src.tempo import create_tempo_artifacts
adult_model, explainer = create_tempo_artifacts(ARTIFACTS_FOLDER)

In [None]:
# %load src/tempo.py
import os
from typing import Any, Tuple

import dill
import numpy as np
from alibi.utils.wrappers import ArgmaxTransformer
from src.constants import EXPLAINER_FOLDER, MODEL_FOLDER

from tempo.serve.metadata import ModelFramework
from tempo.serve.model import Model
from tempo.serve.pipeline import PipelineModels
from tempo.serve.utils import pipeline, predictmethod


def create_tempo_artifacts(artifacts_folder: str) -> Tuple[Model, Any]:
    sklearn_model = Model(
        name="income-sklearn",
        platform=ModelFramework.SKLearn,
        local_folder=f"{artifacts_folder}/{MODEL_FOLDER}",
        uri="gs://seldon-models/test/income/model",
    )

    @pipeline(
        name="income-explainer",
        uri="s3://tempo/explainer/pipeline",
        local_folder=f"{artifacts_folder}/{EXPLAINER_FOLDER}",
        models=PipelineModels(sklearn=sklearn_model),
    )
    class ExplainerPipeline(object):
        def __init__(self):
            if "MLSERVER_MODELS_DIR" in os.environ:
                models_folder = ""
            else:
                models_folder = f"{artifacts_folder}/{EXPLAINER_FOLDER}"
            with open(models_folder + "/explainer.dill", "rb") as f:
                self.explainer = dill.load(f)
            self.ran_init = True

        def update_predict_fn(self, x):
            if np.argmax(self.models.sklearn(x).shape) == 0:
                self.explainer.predictor = self.models.sklearn
                self.explainer.samplers[0].predictor = self.models.sklearn
            else:
                self.explainer.predictor = ArgmaxTransformer(self.models.sklearn)
                self.explainer.samplers[0].predictor = ArgmaxTransformer(self.models.sklearn)

        @predictmethod
        def explain(self, payload: np.ndarray, parameters: dict) -> str:
            print("Explain called with ", parameters)
            if not self.ran_init:
                print("Loading explainer")
                self.__init__()
            self.update_predict_fn(payload)
            explanation = self.explainer.explain(payload, **parameters)
            return explanation.to_json()

    explainer = ExplainerPipeline()
    return sklearn_model, explainer
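
`update_predict_fn` either hands the model to the explainer directly or wraps it in `ArgmaxTransformer`, depending on the shape of the model output. The underlying idea, turning a predictor that returns per-class scores into one that returns class labels, can be sketched in plain Python. This is an illustrative stand-in and not Alibi's implementation; `argmax_predictor` and `fake_predict_proba` are hypothetical names introduced only for this sketch.

```python
from typing import Callable, List, Sequence

Scores = Sequence[Sequence[float]]


def argmax_predictor(
    predict_scores: Callable[[Scores], Scores]
) -> Callable[[Scores], List[int]]:
    """Wrap a score-returning predictor so that it returns class labels instead."""

    def predict_labels(rows: Scores) -> List[int]:
        # For each row of per-class scores, pick the index of the largest score.
        return [
            max(range(len(scores)), key=scores.__getitem__)
            for scores in predict_scores(rows)
        ]

    return predict_labels


def fake_predict_proba(rows: Scores) -> Scores:
    """Hypothetical model: fixed two-class probabilities, one row per input."""
    return [[0.8, 0.2] if row[0] < 0.5 else [0.1, 0.9] for row in rows]


predict = argmax_predictor(fake_predict_proba)
labels = predict([[0.3], [0.7]])
print(labels)  # [0, 1]
```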


## Save Classifier Environment


In [8]:
!cat artifacts/explainer/conda.yaml

name: tempo
channels:
  - defaults
dependencies:
  - python=3.7.9
  - pip:
    - alibi
    - dill
    - mlops-tempo @ file:///home/clive/work/mlops/fork-tempo
    - mlserver==0.3.1.dev7


In [9]:
from tempo.serve.loader import save
save(explainer)

Collecting packages...
Packing environment at '/home/clive/anaconda3/envs/tempo-79a239fd-9e5f-4984-ad1a-24bc618b9d4a' to '/home/clive/work/mlops/fork-tempo/docs/examples/explainer/artifacts/explainer/environment.tar.gz'
[########################################] | 100% Completed |  1min  6.6s
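
`save` packs the conda environment into `environment.tar.gz` next to the pipeline artifacts. Before deploying, it can be useful to confirm that the archive exists and is readable. The following is a minimal sketch using only the standard library; `check_packed_env` is a hypothetical helper and not part of Tempo.

```python
import os
import tarfile


def check_packed_env(folder: str, archive_name: str = "environment.tar.gz") -> int:
    """Return the number of entries in the packed environment archive.

    Raises FileNotFoundError if the archive is missing and tarfile.ReadError
    if it is not a readable gzip-compressed tar archive.
    """
    archive_path = os.path.join(folder, archive_name)
    if not os.path.isfile(archive_path):
        raise FileNotFoundError(f"No packed environment at {archive_path}")
    with tarfile.open(archive_path, "r:gz") as archive:
        # Iterating reads the member headers without extracting any files.
        return sum(1 for _ in archive)
```

For this example it could be called as `check_packed_env(ARTIFACTS_FOLDER + "/explainer")`.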


## Test Locally on Docker

Here we test our models using the production images, but running locally on Docker. This lets us check that the deployed model will behave as expected before it reaches production.

In [10]:
from tempo.seldon.docker import SeldonDockerRuntime
docker_runtime = SeldonDockerRuntime()
docker_runtime.deploy(explainer)
docker_runtime.wait_ready(explainer)

In [11]:
r = json.loads(explainer(payload=data.X_test[0:1], parameters={"threshold":0.99}))
print(r["data"]["anchor"])

Explain called with  {'threshold': 0.99}
['Marital Status = Separated', 'Sex = Female', 'Capital Gain <= 0.00', 'Education = Associates']


In [13]:
r = json.loads(explainer.remote(payload=data.X_test[0:1], parameters={"threshold":0.99}))
print(r["data"]["anchor"])

['Marital Status = Separated', 'Sex = Female', 'Capital Gain <= 0.00', 'Education = Associates', 'Age > 28.00']
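
Both calls return the explanation as a JSON string, with the anchor stored as a list of predicates under `data` and then `anchor`. A minimal sketch of turning that list into a single readable rule follows. The helper name `anchor_rule` is hypothetical, and the sample payload contains only the `anchor` field with the predicates printed by the local call above; a real explanation carries more fields.

```python
import json


def anchor_rule(explanation_json: str) -> str:
    """Join the anchor predicates of a JSON explanation into one rule string."""
    explanation = json.loads(explanation_json)
    predicates = explanation["data"]["anchor"]
    # An empty list has no predicates to join, so report that explicitly.
    return " AND ".join(predicates) if predicates else "(empty anchor)"


# Illustrative payload built from the predicates printed by the local call.
sample = json.dumps(
    {
        "data": {
            "anchor": [
                "Marital Status = Separated",
                "Sex = Female",
                "Capital Gain <= 0.00",
                "Education = Associates",
            ]
        }
    }
)
rule = anchor_rule(sample)
print(rule)
```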


In [14]:
docker_runtime.undeploy(explainer)

## Production Option 1 (Deploy to Kubernetes with Tempo)

 * Here we illustrate how to run the final models in "production" on Kubernetes, using Tempo to deploy them.

### Prerequisites

Create a Kind Kubernetes cluster with Minio and Seldon Core installed, using the Ansible playbook from the Tempo project:

```bash
ansible-playbook ansible/playbooks/default.yaml
```

In [15]:
!kubectl apply -f k8s/rbac -n production

secret/minio-secret configured
serviceaccount/tempo-pipeline unchanged
role.rbac.authorization.k8s.io/tempo-pipeline unchanged
rolebinding.rbac.authorization.k8s.io/tempo-pipeline-rolebinding unchanged


In [16]:
from tempo.examples.minio import create_minio_rclone
import os
create_minio_rclone(os.getcwd()+"/rclone-minio.conf")

In [17]:
from tempo.serve.loader import upload
upload(adult_model)
upload(explainer)

In [18]:
from tempo.serve.metadata import RuntimeOptions, KubernetesOptions
runtime_options = RuntimeOptions(
        k8s_options=KubernetesOptions(
            namespace="production",
            authSecretName="minio-secret"
        )
    )

In [19]:
from tempo.seldon.k8s import SeldonKubernetesRuntime
k8s_runtime = SeldonKubernetesRuntime(runtime_options)
k8s_runtime.deploy(explainer)
k8s_runtime.wait_ready(explainer)

In [21]:
r = json.loads(explainer.remote(payload=data.X_test[0:1], parameters={"threshold":0.95}))
print(r["data"]["anchor"])

['Relationship = Unmarried', 'Sex = Female']


In [22]:
k8s_runtime.undeploy(explainer)

## Production Option 2 (Gitops)

 * We create YAML to hand to our DevOps team for deployment to a production cluster.
 * We add Kustomize patches to modify the base Kubernetes YAML created by Tempo.

In [23]:
from tempo.seldon.k8s import SeldonKubernetesRuntime
k8s_runtime = SeldonKubernetesRuntime(runtime_options)
yaml_str = k8s_runtime.to_k8s_yaml(explainer)
with open(os.getcwd()+"/k8s/tempo.yaml","w") as f:
    f.write(yaml_str)
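
The cell above builds the output path by string concatenation and assumes the `k8s` folder already exists. A slightly more defensive variant, sketched with `pathlib`, is shown below. `write_manifest` is a hypothetical helper, and its `yaml_str` argument stands for whatever string `to_k8s_yaml` returned.

```python
from pathlib import Path


def write_manifest(yaml_str: str, folder: str, filename: str = "tempo.yaml") -> Path:
    """Write a Kubernetes manifest string to folder/filename and return the path."""
    target_dir = Path(folder)
    # Create the folder if needed so a fresh checkout does not fail on a missing directory.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    target.write_text(yaml_str, encoding="utf-8")
    return target
```

In this notebook it could be called as `write_manifest(yaml_str, os.getcwd() + "/k8s")`.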

In [24]:
!kustomize build k8s

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: income-explainer
  namespace: production
spec:
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - args: []
          env:
          - name: MLSERVER_HTTP_PORT
            value: "9000"
          - name: MLSERVER_GRPC_PORT
            value: "9500"
          - name: MLSERVER_MODEL_IMPLEMENTATION
            value: tempo.mlserver.InferenceRuntime
          - name: MLSERVER_MODEL_NAME
            value: income-explainer
          - name: MLSERVER_MODEL_URI
            value: /mnt/models
          - name: TEMPO_RUNTIME_OPTIONS
            value: '{"runtime": "tempo.seldon.SeldonKubernetesRuntime", "docker_options":
              {"defaultRuntime": "tempo.seldon.SeldonDockerRuntime"}, "k8s_options":
              {"replicas": 1, "minReplicas": null, "maxReplicas": null, "authSecretName":
              "minio-secret", "serviceAccountName": null, "defaultRuntime": "tempo.seldon.SeldonKub