# Chassis.ml demo

## Easily build MLflow models into {KFServing, Modzy} Docker images

This demo shows how to train a model, define custom pre- and post-processing steps, save the model in MLflow format, and then build it into a container image and push it to Docker Hub with a single command.

By easily connecting MLflow models to Docker images with a simple Python SDK for data scientists & ML engineers, Chassis is the missing link between MLflow and DevOps.

This demo can be run locally using minikube and a local installation of Chassis.

## Prerequisites

* [Docker Hub](https://hub.docker.com/) account (free one is fine)
* The browser you're reading this in :-)
* Existing local installation of Chassis

In [1]:
import chassisml
import sklearn
import mlflow.pyfunc
from joblib import dump, load

### Train the model

This trains a scikit-learn model and saves it as a joblib file, `model.joblib`, in the current directory.

The goal of the Chassis service is to create an image that exposes this model.

In [2]:
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
data = digits.images.reshape((len(digits.images), -1))

# Create a classifier: a support vector classifier
clf = svm.SVC(gamma=0.001)

# Split data into 50% train and 50% test subsets
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False)

# Learn the digits on the train subset
clf.fit(X_train, y_train)
dump(clf, './model.joblib')

['./model.joblib']

In [3]:
# Wrap your model in a pyfunc and provide auxiliary functionality through extension of the
# mlflow PythonModel class with methods pre_process, post_process, and explain

class CustomModel(mlflow.pyfunc.PythonModel):
    _model = load('./model.joblib')
    
    def load_context(self, context):
        self.model = self._model

    def predict(self, context, inputs):
        processed_inputs = self.pre_process(inputs)
        inference_results = self.model.predict(processed_inputs)
        return self.post_process(inference_results)

    def pre_process(self, inputs):
        return inputs / 2

    def post_process(self, inference_results):
        structured_results = []
        for inference_result in inference_results:
            inference_result = {
                "classPredictions": [
                    {"class": str(inference_result), "score": str(1)}
                ]
            }
            structured_output = {
                "data": {
                    "result": inference_result,
                    "explanation": None,
                    "drift": None,
                }
            }
            structured_results.append(structured_output)
        return structured_results

    def explain(self, images):
        pass

In [4]:
# Define conda environment with all required dependencies for your model

conda_env = {
    "channels": ["defaults", "conda-forge", "pytorch"],
    "dependencies": [
        "python=3.8.5",
        "pytorch",
        "torchvision",
        "pip",
        {
            "pip": [
                "mlflow",
                "lime",
                "scikit-learn"
            ],
        },
    ],
    "name": "linear_env"
}

### Save the model in MLflow format

Transform the model into MLflow format.

In [5]:
!rm -rf mlflow_custom_pyfunc_svm
model_save_path = "mlflow_custom_pyfunc_svm"
mlflow.pyfunc.save_model(path=model_save_path, python_model=CustomModel(), conda_env=conda_env)

Load the MLflow model and test it.

In [6]:
import json

classifier = mlflow.pyfunc.load_model(model_save_path)
predictions = classifier.predict(X_test)
print(json.dumps(predictions[0], indent=4))

{
    "data": {
        "result": {
            "classPredictions": [
                {
                    "class": "8",
                    "score": "1"
                }
            ]
        },
        "explanation": null,
        "drift": null
    }
}
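Each prediction comes back as a nested dictionary, so downstream code has to dig for the label. The following is a minimal sketch of how that could be done; `top_classes` is a hypothetical helper (not part of MLflow or Chassis), and the sample data is copied from the output above.

```python
# Sample copied from the output above; in the notebook this would be `predictions`.
sample_predictions = [
    {
        "data": {
            "result": {
                "classPredictions": [{"class": "8", "score": "1"}]
            },
            "explanation": None,
            "drift": None,
        }
    }
]


def top_classes(predictions):
    """Return the first predicted class label of each structured prediction."""
    labels = []
    for prediction in predictions:
        class_predictions = prediction["data"]["result"]["classPredictions"]
        labels.append(class_predictions[0]["class"])
    return labels


print(top_classes(sample_predictions))  # ['8']
```

Note that the labels are strings, because `post_process` converts them with `str`.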


We check that the model has been correctly saved inside the `mlflow_custom_pyfunc_svm` directory.

In [7]:
!ls ./mlflow_custom_pyfunc_svm

conda.yaml  MLmodel  python_model.pkl
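The same check can be done from Python before launching a build. This is only a sketch: `missing_model_files` is a hypothetical helper, and the expected file names are simply the ones listed in the output above (other MLflow versions may write additional files).

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_FILES = ("MLmodel", "conda.yaml", "python_model.pkl")


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in `model_dir`."""
    directory = Path(model_dir)
    return [name for name in expected if not (directory / name).is_file()]


missing = missing_model_files("./mlflow_custom_pyfunc_svm")
if missing:
    print("Missing files:", missing)
else:
    print("Model directory looks complete.")
```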


### Get Docker Hub credentials securely

Now we prompt the user (you!) for your Docker Hub username and password in such a way that the values themselves don't get written into the notebook, which is a sensible security practice.

In [8]:
import getpass
import base64
username = getpass.getpass("docker hub username")
password = getpass.getpass("docker hub password")

docker hub username········
docker hub password········


Now we can construct the metadata that the Chassis service needs to build and publish the container to Docker Hub:

In [9]:
image_data = {
    'name': f'{username}/chassisml-sklearn-demo:latest',
    'version': '0.0.1',
    'model_name': 'digits',
    'model_path': './mlflow_custom_pyfunc_svm',
    'registry_auth': base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("utf-8")
}
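The `registry_auth` value is just `username:password` encoded with Base64. The sketch below isolates that step and shows that it can be reversed; the two helper functions and the placeholder credentials are ours, for illustration only.

```python
import base64


def encode_registry_auth(username, password):
    """Base64-encode 'username:password', as done for `registry_auth` above."""
    raw = f"{username}:{password}".encode("utf-8")
    return base64.b64encode(raw).decode("utf-8")


def decode_registry_auth(token):
    """Reverse the encoding; split on the first colon only."""
    username, password = base64.b64decode(token).decode("utf-8").split(":", 1)
    return username, password


# Placeholder credentials for illustration only.
token = encode_registry_auth("demo-user", "not-a-real-password")
print(decode_registry_auth(token))  # ('demo-user', 'not-a-real-password')
```

Because Base64 is an encoding and not encryption, the token should be treated as carefully as the password itself.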

In [10]:
modzy_data = {
    'metadata_path': './modzy/model.yaml'
}

### Forward ports to access service and registry

This assumes that you run these commands in your own terminal to forward the service (port 5000) and the registry (port 5001) to localhost.

In [None]:
! # kubectl port-forward service/chassis 5000:5000
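Before launching the job, it can help to confirm that the forwarded port actually accepts connections. The following is a small sketch using only the standard library; `is_port_open` is a hypothetical helper, and port 5000 on localhost is the value assumed throughout this demo.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Port 5000 is where the Chassis service is forwarded in this demo.
print("Chassis service reachable:", is_port_open("localhost", 5000))
```

This only tells us that something is listening on the port, not that it is the Chassis service.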

### Launch the job

The important fields to fill in here are:

* `module`: the library that was used to create the model
* `image_data`: the values defined above
* `image_type`: needed when the model works with images, so that the proxy knows afterwards how to interpret the data
* `base_url`: the URL of the Chassis service

In [11]:
res = chassisml.publish(
    image_data=image_data,
    modzy_data=modzy_data,
    deploy=True,
    base_url='http://localhost:5000'
)

error = res.get('error')
job_id = res.get('job_id')

if error:
    print('Error:', error)
else:
    print('Job ID:', job_id)

Publishing container... Ok!
Job ID: chassis-builder-job-6ade161a-1486-4b01-907b-6e36d039c0f4


After the request is made, Chassis launches a job that runs Kaniko and builds the Docker image based on the values provided.

The result of the request contains the ID of the job that was created. This ID can be used to query the status of the job.

This is an example of the data that is shown when the job has not finished yet.

In [12]:
chassisml.get_job_status(job_id)

{'active': 1,
 'completion_time': None,
 'conditions': None,
 'failed': None,
 'start_time': 'Thu, 29 Jul 2021 17:03:44 GMT',
 'succeeded': None}

And this is an example of the data that is shown when the job has already finished.

In [13]:
chassisml.get_job_status(job_id)

{'active': None,
 'completion_time': 'Thu, 29 Jul 2021 17:15:56 GMT',
 'conditions': [{'last_probe_time': 'Thu, 29 Jul 2021 17:15:56 GMT',
   'last_transition_time': 'Thu, 29 Jul 2021 17:15:56 GMT',
   'message': None,
   'reason': None,
   'status': 'True',
   'type': 'Complete'}],
 'failed': None,
 'start_time': 'Thu, 29 Jul 2021 17:03:44 GMT',
 'succeeded': 1}
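Instead of re-running the status cell by hand, we could poll until the job finishes. This is a minimal sketch: `wait_for_job` is a hypothetical helper, and it assumes only that the status dictionary has the `succeeded` and `failed` keys shown in the outputs above. A stub stands in for the real service so the example runs on its own.

```python
import time


def wait_for_job(get_status, job_id, poll_seconds=10.0, timeout_seconds=1800.0):
    """Poll `get_status(job_id)` until the job succeeds, fails, or times out.

    `get_status` is any callable returning a dict shaped like the outputs
    above, for example `chassisml.get_job_status`.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status(job_id)
        if status.get("succeeded"):
            return status
        if status.get("failed"):
            raise RuntimeError(f"Job {job_id} failed: {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} did not finish in {timeout_seconds}s")
        time.sleep(poll_seconds)


# Stub that mimics a job which is active on the first call and done on the second.
_responses = iter([
    {"active": 1, "succeeded": None, "failed": None},
    {"active": None, "succeeded": 1, "failed": None},
])


def fake_get_status(job_id):
    return next(_responses)


final_status = wait_for_job(fake_get_status, "example-job", poll_seconds=0.01)
print(final_status["succeeded"])  # 1
```

In the notebook, the call would presumably look like `wait_for_job(chassisml.get_job_status, job_id)`.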

### Pull the Docker image

Now that the job has finished, we can pull and load the Docker image that has been generated.

In [14]:
!docker pull {username}/chassisml-sklearn-demo:latest

latest: Pulling from carmilso/chassisml-sklearn-demo

7f680f63: Pulling fs layer
53afe8a0: Pulling fs layer
7a6faf2d: Pulling fs layer
e0a6cfe3: Pulling fs layer
2e6602b3: Pulling fs layer
bf5f9440: Pulling fs layer
14f5e83b: Pulling fs layer
5593bf23: Pulling fs layer
036cba4d: Pulling fs layer
5e8bc614: Pulling fs layer
b5369a3a: Pull complete

In [15]:
!docker images {username}/chassisml-sklearn-demo:latest

REPOSITORY                        TAG       IMAGE ID       CREATED         SIZE
carmilso/chassisml-sklearn-demo   latest    56b10aa0f075   4 minutes ago   2.23GB
