## Explainer for income classifier model with poetry-defined environment 

### Prerequisites

- poetry 
- wget 
- curl 
- conda
- mc (minio client)
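
Before starting, it can help to confirm that each of these tools is on your `PATH`. The following is a minimal sketch, assuming the executables use the names listed above; `find_tools` is a hypothetical helper, not part of any of these tools.

```python
import shutil

# Executable names assumed from the prerequisites list above.
REQUIRED_TOOLS = ["poetry", "wget", "curl", "conda", "mc"]


def find_tools(names):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for tool, path in find_tools(REQUIRED_TOOLS).items():
        print(f"{tool}: {path or 'NOT FOUND'}")
```

Note that on some systems `mc` resolves to Midnight Commander rather than the MinIO client, so a found path is not proof that the right tool is installed.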

### Poetry

We will use `poetry.lock` to fully define the explainer environment. Install Poetry by following the official documentation. On Linux, you can download the installer with curl and then add Poetry to your `PATH`:

In [2]:
!curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python3 -

Retrieving Poetry metadata

This installer is deprecated. Poetry versions installed using this script will not be able to use 'self update' command to upgrade to 1.2.0a1 or later.
Latest version already installed.


## Train Explainer 

### Prepare Training Environment 

Download the following files, which define the dependencies:

In [3]:
!wget https://raw.githubusercontent.com/SeldonIO/seldon-core/master/components/alibi-explain-server/pyproject.toml

--2021-10-26 14:47:12--  https://raw.githubusercontent.com/SeldonIO/seldon-core/master/components/alibi-explain-server/pyproject.toml
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.111.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 734 [text/plain]
Saving to: ‘pyproject.toml’


2021-10-26 14:47:12 (15.7 MB/s) - ‘pyproject.toml’ saved [734/734]



In [4]:
!wget https://raw.githubusercontent.com/SeldonIO/seldon-core/master/components/alibi-explain-server/poetry.lock

--2021-10-26 14:47:16--  https://raw.githubusercontent.com/SeldonIO/seldon-core/master/components/alibi-explain-server/poetry.lock
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.111.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 238201 (233K) [text/plain]
Saving to: ‘poetry.lock’


2021-10-26 14:47:16 (5.80 MB/s) - ‘poetry.lock’ saved [238201/238201]
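
The lock file pins an exact version for every package, which is what makes the environment reproducible. Below is a rough sketch for inspecting those pins. It assumes the lock file uses `[[package]]` tables with `name` and `version` keys. `pinned_versions` is a hypothetical helper, and the sample text is illustrative rather than copied from the downloaded file.

```python
import re


def pinned_versions(lock_text):
    """Return {package name: pinned version} parsed from poetry.lock text."""
    pins = {}
    # Each locked package starts with a [[package]] table header.
    for block in lock_text.split("[[package]]")[1:]:
        # Only look at the keys before the first sub-table such as
        # [package.dependencies], so dependency entries are not misread.
        header = block.split("\n[", 1)[0]
        name = re.search(r'^name = "([^"]+)"', header, re.MULTILINE)
        version = re.search(r'^version = "([^"]+)"', header, re.MULTILINE)
        if name and version:
            pins[name.group(1)] = version.group(1)
    return pins


# Illustrative sample only; the versions here are made up.
SAMPLE_LOCK = '''
[[package]]
name = "alibi"
version = "0.6.0"

[package.dependencies]
numpy = ">=1.16.2,<2.0.0"

[[package]]
name = "numpy"
version = "1.19.5"

[metadata]
lock-version = "1.1"
'''

print(pinned_versions(SAMPLE_LOCK))
```

To run it against the real file, pass `open("poetry.lock").read()` instead of the sample. A proper TOML parser would be more robust than this regex approach.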



In [5]:
!conda create --yes --prefix ./venv python=3.7.10

Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/sean/Developer/seldon-examples/seldon-deploy-examples/alibi-poetry/venv

  added / updated specs:
    - python=3.7.10


The following NEW packages will be INSTALLED:

  _libgcc_mutex      pkgs/main/linux-64::_libgcc_mutex-0.1-main
  _openmp_mutex      pkgs/main/linux-64::_openmp_mutex-4.5-1_gnu
  ca-certificates    pkgs/main/linux-64::ca-certificates-2021.9.30-h06a4308_1
  certifi            pkgs/main/linux-64::certifi-2021.10.8-py37h06a4308_0
  ld_impl_linux-64   pkgs/main/linux-64::ld_impl_linux-64-2.35.1-h7274673_9
  libffi             pkgs/main/linux-64::libffi-3.3-he6710b0_2
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-9.3.0-h5101ec6_17
  libgomp            pkgs/main/linux-64::libgomp-9.3.0-h5101ec6_17
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-9.3.0-hd4cf53a_17
  ncurses            pkgs/main/linux-64::ncurses-6.2-he6710b0_1
  open

You now need to open a command prompt in your working directory, activate the virtual environment you just created, and install all Alibi dependencies. Run the following commands from your working directory:

```conda activate ./venv```

```poetry install```

Now that the virtual environment exists with all dependencies installed, we can train our explainer in it. You can close the command prompt and continue using this notebook.
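
Before training, you can check from the notebook that the environment's interpreter is able to import Alibi. This is a minimal sketch, assuming the environment lives at `./venv` as created above; `can_import` is a hypothetical helper.

```python
import subprocess


def can_import(python_path, module):
    """Return True if `module` imports cleanly under the given interpreter."""
    try:
        result = subprocess.run(
            [python_path, "-c", f"import {module}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The interpreter path does not exist or is not executable.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("alibi importable:", can_import("./venv/bin/python3", "alibi"))
```

If this prints `False`, the most likely cause is that `poetry install` ran outside the activated environment.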

### Train explainer

Create the following training file to train and save an explainer on the adult dataset:

In [6]:
%%writefile train.py
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from alibi.explainers import AnchorTabular
from alibi.datasets import fetch_adult

adult = fetch_adult()

data = adult.data
target = adult.target
feature_names = adult.feature_names
category_map = adult.category_map

np.random.seed(0)
data_perm = np.random.permutation(np.c_[data, target])
data = data_perm[:,:-1]
target = data_perm[:,-1]

idx = 30000
X_train,Y_train = data[:idx,:], target[:idx]
X_test, Y_test = data[idx+1:,:], target[idx+1:]

ordinal_features = [x for x in range(len(feature_names)) if x not in list(category_map.keys())]
ordinal_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='median')),
                                      ('scaler', StandardScaler())])

categorical_features = list(category_map.keys())
categorical_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='median')),
                                          ('onehot', OneHotEncoder(handle_unknown='ignore'))])

preprocessor = ColumnTransformer(transformers=[('num', ordinal_transformer, ordinal_features),
                                               ('cat', categorical_transformer, categorical_features)])
preprocessor.fit(X_train)

np.random.seed(0)
clf = RandomForestClassifier(n_estimators=50)
clf.fit(preprocessor.transform(X_train), Y_train)

predict_fn = lambda x: clf.predict(preprocessor.transform(x))
print('Train accuracy: ', accuracy_score(Y_train, predict_fn(X_train)))
print('Test accuracy: ', accuracy_score(Y_test, predict_fn(X_test)))

explainer = AnchorTabular(predict_fn, feature_names, categorical_names=category_map, seed=1)

explainer.fit(X_train, disc_perc=[25, 50, 75])

idx = 0
class_names = adult.target_names
print('Prediction: ', class_names[explainer.predictor(X_test[idx].reshape(1, -1))[0]])

explanation = explainer.explain(X_test[idx], threshold=0.95)
print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('Coverage: %.2f' % explanation.coverage)

idx = 6
class_names = adult.target_names
print('Prediction: ', class_names[explainer.predictor(X_test[idx].reshape(1, -1))[0]])

explanation = explainer.explain(X_test[idx], threshold=0.95)
print('Anchor: %s' % (' AND '.join(explanation.anchor)))
print('Precision: %.2f' % explanation.precision)
print('Coverage: %.2f' % explanation.coverage)

explainer.save("./explainer/")

Writing train.py


Train and save:

In [7]:
!./venv/bin/python3 train.py

2021-10-26 14:50:44.013459: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-10-26 14:50:44.013495: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
matplotlib could not be loaded!
matplotlib could not be loaded!
matplotlib could not be loaded!
matplotlib could not be loaded!
matplotlib could not be loaded!
matplotlib could not be loaded!
matplotlib could not be loaded!
IPython could not be loaded!
matplotlib could not be loaded!
matplotlib could not be loaded!
matplotlib could not be loaded!
matplotlib could not be loaded!
matplotlib could not be loaded!
Train accuracy:  0.9655333333333334
Test accuracy:  0.855859375
Prediction:  <=50K
Anchor: Marital Status = Separated AND Sex = Female
Precision: 0.95
Coverage: 0.11
Prediction:  >50K
Could not find 
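
The precision and coverage printed for each anchor can be read as follows. Coverage is roughly the share of instances to which the anchor's conditions apply. Precision is the share of those instances that receive the same prediction as the explained instance. The sketch below computes both on hypothetical toy data. Alibi estimates these quantities by sampling perturbed instances, so this illustrates the definitions rather than its implementation.

```python
def anchor_stats(rows, predictions, anchor, target_prediction):
    """Empirical precision and coverage of an anchor over a set of rows.

    rows: list of dicts mapping feature name to value
    predictions: model output for each row, in the same order
    anchor: dict mapping feature name to the value the anchor requires
    """
    # Predictions for the rows that satisfy every condition in the anchor.
    matches = [
        pred
        for row, pred in zip(rows, predictions)
        if all(row.get(key) == value for key, value in anchor.items())
    ]
    coverage = len(matches) / len(rows)
    precision = (
        sum(pred == target_prediction for pred in matches) / len(matches)
        if matches
        else 0.0
    )
    return precision, coverage


# Hypothetical toy data, not drawn from the adult dataset.
rows = [
    {"Marital Status": "Separated", "Sex": "Female"},
    {"Marital Status": "Separated", "Sex": "Female"},
    {"Marital Status": "Married", "Sex": "Female"},
    {"Marital Status": "Separated", "Sex": "Male"},
]
predictions = ["<=50K", "<=50K", ">50K", "<=50K"]
anchor = {"Marital Status": "Separated", "Sex": "Female"}

precision, coverage = anchor_stats(rows, predictions, anchor, "<=50K")
print(f"Precision: {precision:.2f}")
print(f"Coverage: {coverage:.2f}")
```

Here two of the four rows satisfy the anchor and both are predicted `<=50K`, so the toy precision is 1.00 and the coverage is 0.50.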

### Copy model to MinIO

We will now copy our model to the MinIO instance installed in the K8s cluster running Seldon. If you have MinIO installed, open a separate terminal and port-forward to MinIO with the following command:

```kubectl port-forward -n minio-system svc/minio 8090:9000```
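
Once the port-forward is running, you can confirm from the notebook that something is listening on the forwarded port before calling `mc`. This is a minimal sketch that only tests TCP reachability, not MinIO credentials; `is_port_open` is a hypothetical helper.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 8090 is the local port used by the port-forward command above.
    print("MinIO port reachable:", is_port_open("localhost", 8090))
```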

Add MinIO host and push the trained model artefact:

In [10]:
!mc config host add minio http://localhost:8090 admin@seldon.io 12341234

[m[32mAdded `minio` successfully.[0m
[0m

Copy the `explainer` folder to MinIO:

In [14]:
!mc cp -r explainer minio/models/

...meta.dill:  170.26 MiB / 170.26 MiB ┃▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓┃ 1.39 MiB/s 2m2s
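
`mc` addresses objects as `alias/bucket/key`, so after this copy the explainer sits in the `models` bucket under the `explainer` prefix. The deployment refers to the same location as `s3://models/explainer`. The sketch below shows that mapping; `mc_path_to_s3_uri` is a hypothetical helper, not an `mc` feature.

```python
def mc_path_to_s3_uri(mc_path):
    """Convert an mc path like 'alias/bucket/key' into 's3://bucket/key'."""
    # The first segment is the mc alias; everything after it is bucket and key.
    parts = mc_path.strip("/").split("/", 1)
    if len(parts) < 2 or not parts[1]:
        raise ValueError(f"expected 'alias/bucket[/key]', got {mc_path!r}")
    return f"s3://{parts[1]}"


print(mc_path_to_s3_uri("minio/models/explainer"))
```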

### Deploy

In [None]:
!pip install seldon_deploy_sdk==1.4.0

In [15]:
import seldon_deploy_sdk

Make sure to set `SD_IP` to the correct IP address for your cluster:

In [42]:
from seldon_deploy_sdk import Configuration, ApiClient, SeldonDeploymentsApi
from seldon_deploy_sdk.auth import OIDCAuthenticator

SD_IP = ""

config = Configuration()
config.host = f"http://{SD_IP}/seldon-deploy/api/v1alpha1"
config.oidc_client_id = "sd-api"
config.oidc_client_secret = "sd-api-secret"
config.oidc_server = f"http://{SD_IP}/auth/realms/deploy-realm"
config.auth_method = "client_credentials"

def auth():
    # Build an API client authenticated through the OIDC client-credentials flow.
    authenticator = OIDCAuthenticator(config)
    api_client = ApiClient(configuration=config, authenticator=authenticator)
    return api_client
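
Because `SD_IP` is left empty in the cell above, the resulting URLs are invalid until you fill it in. One way to fail early is sketched below. It assumes you supply the address yourself, for example through an environment variable. `seldon_deploy_urls` is a hypothetical helper, and `192.0.2.10` is a reserved documentation address, not a real cluster.

```python
def seldon_deploy_urls(sd_ip):
    """Build the API and OIDC URLs used in the configuration from SD_IP."""
    if not sd_ip:
        raise ValueError("SD_IP is empty; set it to your cluster's address")
    return {
        "host": f"http://{sd_ip}/seldon-deploy/api/v1alpha1",
        "oidc_server": f"http://{sd_ip}/auth/realms/deploy-realm",
    }


# Placeholder address for illustration; in practice it could be read with
# os.environ["SD_IP"], which is an assumption of this sketch.
urls = seldon_deploy_urls("192.0.2.10")
print(urls["host"])
print(urls["oidc_server"])
```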

In [43]:
DEPLOYMENT_NAME = "income"
NAMESPACE = "default"
PREPACKAGED_SERVER = "SKLEARN_SERVER"
MODEL_LOCATION = "gs://seldon-models/sklearn/income/model-0.23.2"

CPU_REQUESTS = "1"
MEMORY_REQUESTS = "1Gi"

CPU_LIMITS = "1"
MEMORY_LIMITS = "1Gi"

EXPLAINER_TYPE = "AnchorTabular"
EXPLAINER_URI = "s3://models/explainer"

mldeployment = {
    "kind": "SeldonDeployment",
    "metadata": {
        "name": DEPLOYMENT_NAME,
        "namespace": NAMESPACE,
        "labels": {
            "fluentd": "true"
        }
    },
    "apiVersion": "machinelearning.seldon.io/v1alpha2",
    "spec": {
        "name": DEPLOYMENT_NAME,
        "annotations": {
            "seldon.io/engine-seldon-log-messages-externally": "true"
        },
        "protocol": "seldon",
        "transport": "rest",
        "predictors": [
            {
                "componentSpecs": [
                    {
                        "spec": {
                            "containers": [
                                {
                                    "name": f"{DEPLOYMENT_NAME}-container",
                                    "resources": {
                                        "requests": {
                                            "cpu": CPU_REQUESTS,
                                            "memory": MEMORY_REQUESTS
                                        },
                                        "limits": {
                                            "cpu": CPU_LIMITS,
                                            "memory": MEMORY_LIMITS
                                        }
                                    }
                                }
                            ]
                        }
                    }
                ],
                "name": "default",
                "replicas": 1,
                "traffic": 100,
                "graph": {
                    "implementation": PREPACKAGED_SERVER,
                    "modelUri": MODEL_LOCATION,
                    "name": f"{DEPLOYMENT_NAME}-container",
                    "endpoint": {
                        "type": "REST"
                    },
                    "parameters": [],
                    "children": [],
                    "logger": {
                        "mode": "all"
                    }
                }
            }
        ]
    },
    "status": {}
}

In [44]:
explainer_spec = {
    "type": EXPLAINER_TYPE,
    "modelUri": EXPLAINER_URI,
    "envSecretRefName": "seldon-rclone-secret",
    "containerSpec": {
        "name": "",
        "resources": {}
    }
}

In [45]:
mldeployment['spec']['predictors'][0]['explainer'] = explainer_spec
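
The assignment above changes `mldeployment` in place, which can be surprising when cells are re-run out of order. A non-mutating alternative is sketched below. `with_explainer` is a hypothetical helper, not part of the Seldon Deploy SDK, and the small dictionaries stand in for the full specification.

```python
import copy


def with_explainer(deployment, explainer, predictor_index=0):
    """Return a copy of `deployment` with `explainer` set on one predictor."""
    predictors = deployment.get("spec", {}).get("predictors", [])
    if not 0 <= predictor_index < len(predictors):
        raise IndexError(f"no predictor at index {predictor_index}")
    updated = copy.deepcopy(deployment)
    updated["spec"]["predictors"][predictor_index]["explainer"] = copy.deepcopy(explainer)
    return updated


# Minimal stand-ins for the full deployment and explainer dictionaries.
base = {"spec": {"predictors": [{"name": "default"}]}}
spec = {"type": "AnchorTabular", "modelUri": "s3://models/explainer"}

deployed = with_explainer(base, spec)
print("explainer" in deployed["spec"]["predictors"][0])
print("explainer" in base["spec"]["predictors"][0])
```

The original dictionary stays untouched, so the same base specification can be reused with or without an explainer.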

In [46]:
deployment_api = SeldonDeploymentsApi(auth())
deployment_api.create_seldon_deployment(namespace=NAMESPACE, mldeployment=mldeployment)

{'api_version': 'machinelearning.seldon.io/v1alpha2',
 'kind': 'SeldonDeployment',
 'metadata': {'annotations': None,
              'cluster_name': None,
              'creation_timestamp': None,
              'deletion_grace_period_seconds': None,
              'deletion_timestamp': None,
              'finalizers': None,
              'generate_name': None,
              'generation': None,
              'labels': {'fluentd': 'true'},
              'managed_fields': None,
              'name': 'income',
              'namespace': 'default',
              'owner_references': None,
              'resource_version': None,
              'self_link': None,
              'uid': None},
 'spec': {'annotations': {'seldon.io/engine-seldon-log-messages-externally': 'true'},
          'name': 'income',
          'oauth_key': None,
          'oauth_secret': None,
          'predictors': [{'annotations': None,
                          'component_specs': [{'hpa_spec': None,
                       