## Seldon Core Income Classifier Deployment

Seldon Core containerises `trained machine learning models` and deploys them into Kubernetes environments. It can also compose `complex inference pipelines` that orchestrate multiple components, including models, data transformers, combiners, drift detectors, outlier detectors, and explainers for advanced monitoring.

### Trained ML model

First we need a trained ML model to deploy. It must be stored in a bucket that Seldon can read from, such as Google Cloud Storage (the `gs://` URI used here) or an S3-compatible store:

In [None]:
modelUri = "gs://seldon-models/sklearn/income/model"

### Inference Server (container)

We now need a container that loads our trained model and runs predictions. Seldon provides pre-packaged servers (TensorFlow, XGBoost, scikit-learn and MLflow); for anything else we need to build our own.

For the scikit-learn use case, the following file provides the logic for loading the trained model and running predictions with it. We can build a container from this `SklearnServer.py` file by:
- creating a requirements file
- creating an s2i environment file OR a Dockerfile
- building a container from the Seldon base image with s2i OR Docker

`SklearnServer.py`

In [None]:
import joblib
import numpy as np
import seldon_core
from seldon_core.user_model import SeldonComponent
from typing import Dict, List, Union, Iterable
import os
import logging
import yaml

logger = logging.getLogger(__name__)

JOBLIB_FILE = "model.joblib"


class SKLearnServer(SeldonComponent):
    def __init__(self, model_uri: str = None, method: str = "predict_proba"):
        super().__init__()
        self.model_uri = model_uri
        self.method = method
        self.ready = False
        logger.info(f"Model uri: {self.model_uri}")
        logger.info(f"method: {self.method}")
        self.load()

    def load(self):
        logger.info("load")
        model_file = os.path.join(
            seldon_core.Storage.download(self.model_uri), JOBLIB_FILE
        )
        logger.info(f"model file: {model_file}")
        self._joblib = joblib.load(model_file)
        self.ready = True

    def predict(
        self, X: np.ndarray, names: Iterable[str], meta: Dict = None
    ) -> Union[np.ndarray, List, str, bytes]:
        try:
            if not self.ready:
                self.load()
            if self.method == "predict_proba":
                logger.info("Calling predict_proba")
                result = self._joblib.predict_proba(X)
            elif self.method == "decision_function":
                logger.info("Calling decision_function")
                result = self._joblib.decision_function(X)
            else:
                logger.info("Calling predict")
                result = self._joblib.predict(X)
            return result
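        except Exception:
            # Log the full traceback, then re-raise so the caller sees the failure
            # instead of silently receiving None.
            logger.exception("Exception during predict")
            raise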
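
The branching inside `predict` can be exercised without a cluster or a real model. The following is a minimal sketch of that dispatch logic only; `StubModel` and `resolve_method` are hypothetical names introduced for illustration and are not part of Seldon Core or scikit-learn.

```python
from typing import Callable, List, Sequence


class StubModel:
    """Hypothetical stand-in for a fitted scikit-learn estimator."""

    def predict(self, X: Sequence[Sequence[float]]) -> List[int]:
        # Label a row 1 if its feature sum is positive, else 0.
        return [1 if sum(row) > 0 else 0 for row in X]

    def predict_proba(self, X: Sequence[Sequence[float]]) -> List[List[float]]:
        # Return a hard [P(class 0), P(class 1)] pair per row.
        return [[0.0, 1.0] if label else [1.0, 0.0] for label in self.predict(X)]


def resolve_method(model: object, method: str = "predict_proba") -> Callable:
    """Mirror the server's branching: unrecognised names fall back to predict."""
    if method in ("predict_proba", "decision_function"):
        return getattr(model, method)
    return model.predict


model = StubModel()
rows = [[1.0, 2.0], [-3.0, 1.0]]
print(resolve_method(model, "predict_proba")(rows))
print(resolve_method(model, "predict")(rows))
```

As in the server above, any `method` value other than `predict_proba` or `decision_function` ends up calling `predict`.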

### Simple inference graph

In Seldon we use Custom Resource Definitions (CRDs) to define inference logic. CRDs extend the Kubernetes API, allowing us to declare a combination of Kubernetes objects that are orchestrated as a single resource.

In [None]:
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: income
spec:
  name: income
  annotations:
    seldon.io/rest-timeout: "100000"
  predictors:
  - graph:
      children: []
      implementation: SKLEARN_SERVER
      modelUri: gs://seldon-models/sklearn/income/model-0.23.2
      name: classifier
    name: default
    replicas: 1
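
When several similar deployments are needed, the same manifest can be generated programmatically. The sketch below rebuilds the structure shown above as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The helper name `build_seldon_deployment` and its parameters are assumptions made for illustration, not a Seldon API.

```python
import json


def build_seldon_deployment(name: str, model_uri: str, replicas: int = 1) -> dict:
    """Return a single-model SeldonDeployment manifest (hypothetical helper)."""
    return {
        "apiVersion": "machinelearning.seldon.io/v1",
        "kind": "SeldonDeployment",
        "metadata": {"name": name},
        "spec": {
            "name": name,
            "annotations": {"seldon.io/rest-timeout": "100000"},
            "predictors": [
                {
                    # One-node graph: the pre-packaged scikit-learn server.
                    "graph": {
                        "children": [],
                        "implementation": "SKLEARN_SERVER",
                        "modelUri": model_uri,
                        "name": "classifier",
                    },
                    "name": "default",
                    "replicas": replicas,
                }
            ],
        },
    }


manifest = build_seldon_deployment(
    "income", "gs://seldon-models/sklearn/income/model-0.23.2"
)
print(json.dumps(manifest, indent=2))
```

Redirecting the printed output to a file gives a manifest that can be applied in the same way as the YAML version.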

In [None]:
!kubectl apply -f deployment.yaml -n test

In [None]:
!kubectl get pods -n test

You now have the income classifier model running as a production-ready REST/gRPC microservice.

In [None]:
!curl -s -d '{"data": {"names": [], "ndarray": [[39, 7, 1, 1, 1, 1, 4, 1, 2174, 0, 40, 9]]}}' \
   -X POST http://34.141.246.254/seldon/test/income/api/v1.0/predictions \
   -H "Content-Type: application/json"
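
The same request can be prepared from Python using only the standard library. This is a sketch under the assumption that the ingress address and the `/seldon/<namespace>/<deployment>/api/v1.0/predictions` path match the `curl` call above; `build_prediction_request` is a hypothetical helper name. The request is only constructed here, since sending it requires network access to the cluster.

```python
import json
import urllib.request
from typing import List


def build_prediction_request(
    host: str, namespace: str, deployment: str, rows: List[List[float]]
) -> urllib.request.Request:
    """Build a POST request carrying a Seldon `ndarray` payload."""
    url = f"http://{host}/seldon/{namespace}/{deployment}/api/v1.0/predictions"
    payload = {"data": {"names": [], "ndarray": rows}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_prediction_request(
    "34.141.246.254", "test", "income",
    [[39, 7, 1, 1, 1, 1, 4, 1, 2174, 0, 40, 9]],
)
print(request.full_url)

# To send it (needs access to the cluster ingress):
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(json.loads(response.read()))
```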

#### Further Seldon Core capabilities

- complex inference pipelines with predictors, transformers, combiners, outlier detectors, drift detectors and explainers
- metadata
- custom metrics
- infrastructure and performance monitoring with Prometheus
- visualisation with Grafana
- logging of requests, responses and container logs with Elasticsearch
- latency tracing with Jaeger

![title](seldoncore.png)

#### Challenges

The main challenges with adopting Seldon Core at enterprise scale are:

1. Kubernetes skills are required to interface with the platform.
2. Integrations with open source logging and monitoring tools such as Prometheus, Grafana and Elasticsearch need to be built.
3. Additional features need to be built out for auth, permissioning, model artefact registration, and visualisation of features, predictions, outliers, drift and explainers.