## Seldon Core MLflow Deployment

Seldon is used to containerise trained machine learning models and deploy them into Kubernetes environments. It can also compose complex inference pipelines that orchestrate multiple components, including models, data transformers, combiners, drift detectors, outlier detectors, and explainers for advanced monitoring.

### Trained ML model

First we need a trained ML model to deploy. It must be stored somewhere Seldon can download it from, such as an S3-compatible bucket or, as in this example, a Google Cloud Storage bucket:

In [None]:
modelUri = "gs://seldon-models/mlflow/elasticnet_wine_1.8.0"

### Inference Server (container)

We now need a container that loads our trained model and serves predictions. Seldon provides some pre-packaged servers (TensorFlow, XGBoost, scikit-learn and MLflow); for anything else we need to build our own.

For the MLflow use case, the following file contains the logic for loading the trained model and running predictions with it. We can build a container from this `MLFlowServer.py` file by:
- creating a requirements file
- creating an environment file (for s2i) or a Dockerfile
- building a container from the Seldon base image with s2i or Docker

`MLFlowServer.py`

In [None]:
import numpy as np
import logging
import requests
from mlflow import pyfunc
from seldon_core import Storage
from seldon_core.user_model import SeldonComponent
from typing import Dict, List, Union, Iterable

log = logging.getLogger()

MLFLOW_SERVER = "model"


class MLFlowServer(SeldonComponent):
    """Seldon component that serves an MLflow model through the pyfunc flavour."""

    def __init__(self, model_uri: str):
        super().__init__()
        log.info(f"Creating MLFLow server with URI {model_uri}")
        self.model_uri = model_uri
        # Flipped to True by load(); predict() refuses requests until then.
        self.ready = False

    def load(self):
        # Fetch the model artefact to a local folder, then load it with MLflow.
        log.info(f"Downloading model from {self.model_uri}")
        model_folder = Storage.download(self.model_uri)
        self._model = pyfunc.load_model(model_folder)
        self.ready = True

    def predict(
        self, X: np.ndarray, feature_names: Iterable[str] = [], meta: Dict = None
    ) -> Union[np.ndarray, List, Dict, str, bytes]:
        log.info(f"Requesting prediction with: {X}")

        # Guard against requests that arrive before load() has finished.
        if not self.ready:
            raise requests.HTTPError("Model not loaded yet")

        result = self._model.predict(X)
        log.info(f"Prediction result: {result}")
        return result
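
The load-then-predict lifecycle of the class above can be tried out locally without Seldon or MLflow installed. The following is only a sketch under stated assumptions: `StubModel` and `StubServer` are hypothetical stand-ins invented for illustration, the "prediction" is just the mean of each row, and a plain `RuntimeError` replaces `requests.HTTPError`. It assumes the serving wrapper calls `load()` once at start-up and `predict()` once per request.

```python
class StubModel:
    """Hypothetical stand-in for the object returned by pyfunc.load_model."""

    def predict(self, rows):
        # Pretend prediction: the mean of each input row.
        return [sum(row) / len(row) for row in rows]


class StubServer:
    """Same shape as MLFlowServer, minus the Seldon and MLflow dependencies."""

    def __init__(self, model_uri):
        self.model_uri = model_uri
        self.ready = False

    def load(self):
        # The real server downloads and loads the model artefact here.
        self._model = StubModel()
        self.ready = True

    def predict(self, X, feature_names=(), meta=None):
        if not self.ready:
            raise RuntimeError("Model not loaded yet")
        return self._model.predict(X)


server = StubServer("gs://seldon-models/mlflow/elasticnet_wine_1.8.0")

# Calling predict() before load() should be rejected by the ready guard.
try:
    server.predict([[1.0, 2.0, 3.0]])
    failed_before_load = False
except RuntimeError:
    failed_before_load = True

# After load() the same call succeeds.
server.load()
result = server.predict([[1.0, 2.0, 3.0], [4.0, 6.0]])
print(failed_before_load, result)
```

The point of the `ready` flag is that a request arriving while the model is still downloading fails fast instead of touching a model object that does not exist yet.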

### Simple inference graph

In [None]:
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: mlflow
spec:
  name: wines
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - name: classifier
          livenessProbe:
            initialDelaySeconds: 80
            failureThreshold: 200
            periodSeconds: 5
            successThreshold: 1
            httpGet:
              path: /health/ping
              port: http
              scheme: HTTP
          readinessProbe:
            initialDelaySeconds: 80
            failureThreshold: 200
            periodSeconds: 5
            successThreshold: 1
            httpGet:
              path: /health/ping
              port: http
              scheme: HTTP
    graph:
      children: []
      implementation: MLFLOW_SERVER
      modelUri: gs://seldon-models/mlflow/elasticnet_wine_1.8.0
      name: classifier
    name: default
    replicas: 1
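
Because `kubectl apply -f` accepts JSON as well as YAML, the same manifest could also be generated from Python, which may help when the deployment name or model URI varies between models. This is a minimal sketch, not an official Seldon helper: `make_deployment` is a hypothetical function name, and all field values are copied from the manifest above.

```python
import copy
import json


def make_deployment(name, model_uri, replicas=1):
    """Build a SeldonDeployment manifest mirroring the YAML above."""
    # Liveness and readiness probes use identical settings in the original.
    probe = {
        "initialDelaySeconds": 80,
        "failureThreshold": 200,
        "periodSeconds": 5,
        "successThreshold": 1,
        "httpGet": {"path": "/health/ping", "port": "http", "scheme": "HTTP"},
    }
    container = {
        "name": "classifier",
        "livenessProbe": copy.deepcopy(probe),
        "readinessProbe": copy.deepcopy(probe),
    }
    return {
        "apiVersion": "machinelearning.seldon.io/v1alpha2",
        "kind": "SeldonDeployment",
        "metadata": {"name": name},
        "spec": {
            "name": "wines",
            "predictors": [
                {
                    "componentSpecs": [{"spec": {"containers": [container]}}],
                    "graph": {
                        "children": [],
                        "implementation": "MLFLOW_SERVER",
                        "modelUri": model_uri,
                        "name": "classifier",
                    },
                    "name": "default",
                    "replicas": replicas,
                }
            ],
        },
    }


manifest = make_deployment(
    "mlflow", "gs://seldon-models/mlflow/elasticnet_wine_1.8.0"
)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

The printed JSON can be saved to a file and applied with `kubectl apply -f` in the same way as a YAML manifest.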

In [4]:
!kubectl apply -f deployment.yaml 

seldondeployment.machinelearning.seldon.io/mlflow unchanged


In [5]:
!kubectl get pods -n default 

NAME                                           READY   STATUS    RESTARTS   AGE
mlflow-default-0-classifier-5c9dcfc855-647hn   2/2     Running   0          128m


In [7]:
!curl -s -d '{"data": {"names": [], "ndarray": [[6.2, 0.270, 0.43, 7.80, 0.056, 48.0, 244.0, 0.99560, 3.10, 0.51, 10.00]]}}' \
   -X POST http://34.90.29.195/seldon/default/mlflow/api/v1.0/predictions \
   -H "Content-Type: application/json"

{"data":{"names":[],"ndarray":[5.477889635651638]},"meta":{"requestPath":{"classifier":"seldonio/mlflowserver:1.9.1"}}}
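
The same request can be made from Python. The sketch below uses only the standard library; `build_payload`, `parse_predictions` and `predict` are hypothetical helper names, and the endpoint is the one from the curl call above, so it will only respond while that cluster is reachable. To keep the example runnable offline, it does not call `predict` and instead parses the response shown above.

```python
import json
import urllib.request

# Endpoint copied from the curl call above; the IP belongs to that cluster.
PREDICT_URL = "http://34.90.29.195/seldon/default/mlflow/api/v1.0/predictions"


def build_payload(rows, names=None):
    """Wrap feature rows in the Seldon "data/ndarray" request format."""
    return {"data": {"names": list(names or []), "ndarray": rows}}


def parse_predictions(body):
    """Extract the prediction list from a Seldon JSON response body."""
    return json.loads(body)["data"]["ndarray"]


def predict(rows, url=PREDICT_URL, timeout=10):
    """POST the rows to the deployment and return the predictions."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(rows)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_predictions(response.read().decode("utf-8"))


# The same wine sample that the curl call sends.
sample_row = [6.2, 0.270, 0.43, 7.80, 0.056, 48.0, 244.0, 0.99560, 3.10, 0.51, 10.00]
payload = build_payload([sample_row])

# Parse the response shown above instead of calling the live endpoint.
sample_response = (
    '{"data":{"names":[],"ndarray":[5.477889635651638]},'
    '"meta":{"requestPath":{"classifier":"seldonio/mlflowserver:1.9.1"}}}'
)
predictions = parse_predictions(sample_response)
print(predictions)
```

With a reachable deployment, `predict([sample_row])` would send the request and return the parsed predictions.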


#### Further Seldon Core capabilities 
    
- metadata    
- custom metrics 
- infrastructure and performance monitoring with Prometheus
- visualisation with Grafana
- logging of requests, responses and container logs with Elasticsearch
- latency tracing with Jaeger

![title](seldoncore.png)
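
As an illustration of the custom metrics item, a Seldon Python component can expose a `metrics()` method that returns a list of dictionaries with `type`, `key` and `value` fields. The sketch below is not part of the `MLFlowServer` shown earlier: `WineModelWithMetrics` and both metric keys are hypothetical, the prediction is a placeholder, and it assumes the wrapper collects `metrics()` after each `predict()` call and treats a `COUNTER` value as an increment.

```python
class WineModelWithMetrics:
    """Hypothetical component showing the shape of custom metrics."""

    def __init__(self):
        self.last_batch_size = 0

    def predict(self, X, feature_names=None, meta=None):
        # Remember the batch size so metrics() can report it.
        self.last_batch_size = len(X)
        # Placeholder prediction so the sketch runs without a trained model.
        return [0.0 for _ in X]

    def metrics(self):
        return [
            # COUNTER: the value is added to the running total.
            {"type": "COUNTER", "key": "wine_requests_total", "value": 1},
            # GAUGE: the value replaces the previous reading.
            {
                "type": "GAUGE",
                "key": "wine_last_batch_size",
                "value": self.last_batch_size,
            },
        ]


component = WineModelWithMetrics()
component.predict([[6.2, 0.27], [7.0, 0.30]])
reported = component.metrics()
print(reported)
```

Metrics reported this way are the kind of values that would then be scraped by Prometheus and charted in Grafana.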

#### Challenges 

The main challenges with adopting Seldon Core at enterprise scale are:

1. It requires Kubernetes skills to interface with the platform.
2. It requires building integrations with open source tools for logging and monitoring, such as Prometheus, Grafana and Elasticsearch.
3. It requires building out additional features for auth, permissioning, model artefact registration, and visualisation of outliers, drift and explainers.