# Containerizer service

Import the libraries that were installed earlier.

In [1]:
import sklearn
import modzymodel

### Train the model

This trains a scikit-learn model and saves it as a joblib file inside the `model` directory.

The goal of the containerizer service is to create an image that exposes this model.

In [2]:
from sklearn import datasets, svm, metrics
from sklearn.model_selection import train_test_split
from joblib import dump, load

digits = datasets.load_digits()

data = digits.images.reshape((len(digits.images), -1))

# Create a classifier: a support vector classifier
clf = svm.SVC(gamma=0.001)

# Split data into 50% train and 50% test subsets
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False)

# Learn the digits on the train subset
clf.fit(X_train, y_train)

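# dump() does not create missing directories, so make sure ./model exists
import os
os.makedirs('./model', exist_ok=True)
dump(clf, './model/model.joblib')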

['./model/model.joblib']
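
The split above sets half of the digits aside but never uses them. As a minimal sketch, assuming the same dataset and hyperparameters as the training cell, the held-out half could be used to check the classifier before it is containerized:

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

# Rebuild the same data and split as in the training cell
digits = datasets.load_digits()
data = digits.images.reshape((len(digits.images), -1))
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False)

clf = svm.SVC(gamma=0.001)
clf.fit(X_train, y_train)

# Score the classifier on the half it has not seen
predicted = clf.predict(X_test)
accuracy = metrics.accuracy_score(y_test, predicted)
print(f'Held-out accuracy: {accuracy:.3f}')
```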

### Convert the model

Convert the model to MLflow format.

In [3]:
import joblib
clf = joblib.load('./model/model.joblib')

import mlflow.sklearn
mlflow.sklearn.save_model(clf, './model/MLFlowModel')
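
As far as we know, `mlflow.sklearn.save_model` raises an error when the target directory already exists and has content, so running the cell above a second time fails. A minimal sketch of one way around this, using a hypothetical helper `fresh_dir` (not part of MLflow or `modzymodel`) that removes the old directory first:

```python
import shutil
from pathlib import Path


def fresh_dir(path):
    """Delete the directory at `path` if it exists and return the path as a string."""
    target = Path(path)
    if target.is_dir():
        # Remove the previous export so the next save starts from a clean path
        shutil.rmtree(target)
    return str(target)
```

With this helper, the save call could be written as `mlflow.sklearn.save_model(clf, fresh_dir('./model/MLFlowModel'))`. Note that it deletes whatever is in that directory, so it should only be pointed at a path reserved for the export.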

Check that the model has been saved correctly inside the `model` directory.

In [4]:
!ls ./model/MLFlowModel

conda.yaml  MLmodel  model.pkl


### Define the values needed

The fields `name`, `version` and `api_key` are used when uploading the image to the Modzy platform.

Since for now we are only creating and downloading the docker image, the only fields that the containerizer service actually needs are:

* `model_name`: name for the model inside the image
* `model_path`: directory that contains our model file

In [5]:
api_key = 'XXXX'

image_data = {
    'name': 'docker-registry:5000/mlflow-digits-containerized',
    'version': '0.0.1',
    'model_name': 'digits',
    'model_path': './model/MLFlowModel',
}
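
Since the build runs as a remote job, it can help to catch an incomplete `image_data` locally before submitting it. The following is only a sketch: `check_image_data` is a hypothetical helper, and the two required fields are taken from the list above rather than from the `modzymodel` source.

```python
import os

# The two fields the text above names as required by the containerizer service
REQUIRED_FIELDS = ('model_name', 'model_path')


def check_image_data(image_data):
    """Return a list of problems found; an empty list means nothing obvious is wrong."""
    problems = [f'missing field: {key}'
                for key in REQUIRED_FIELDS if not image_data.get(key)]
    model_path = image_data.get('model_path')
    # model_path is described as a directory, so a missing directory is reported too
    if model_path and not os.path.isdir(model_path):
        problems.append(f'model_path is not a directory: {model_path}')
    return problems
```

Calling `check_image_data(image_data)` before launching the job should return `[]` when both fields are set and `./model/MLFlowModel` exists.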

### Forward ports to access service and registry

This assumes that you run these commands in your own terminal to forward the service (port 5000) and the registry (port 5001) to localhost.

In [6]:
! # kubectl port-forward service/containerizer 5000:5000
! # kubectl port-forward service/docker-registry 5001:5000
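
Because of the second port-forward, an image that the cluster knows as `docker-registry:5000/...` is reachable from your machine as `localhost:5001/...`. A small sketch of that mapping, where `local_image_ref` is a hypothetical helper and the default `localhost:5001` assumes the forward shown above:

```python
def local_image_ref(name, local_registry='localhost:5001'):
    """Swap the registry host in `name` for the locally forwarded one."""
    # Everything before the first '/' is treated as the registry host
    _, _, repository = name.partition('/')
    if not repository:
        raise ValueError(f'no registry prefix in image name: {name!r}')
    return f'{local_registry}/{repository}'


print(local_image_ref('docker-registry:5000/mlflow-digits-containerized'))
# localhost:5001/mlflow-digits-containerized
```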

### Launch the job

The important fields to fill in here are:

* `module`: library that has been used to create the model
* `image_data`: the values defined above
* `image_type`: needed when the model has been trained on images, so that the proxy knows how to interpret the input data
* `base_url`: the URL of the service that runs the containerizer

In [7]:
res = modzymodel.publish(
    api_key=api_key,
    module=mlflow,
    image_data=image_data,
    deploy=True,
    image_type=modzymodel.Constants.IMAGE_GREY,
    base_url='http://localhost:5000'
)

error = res.get('error')
job_id = res.get('job_id')

if error:
    print('Error:', error)
else:
    print('Job ID:', job_id)

Publishing container... Ok!
Job ID: containerizer-builder-job-e43d6830-0d64-4d84-ba2f-2413bcda1abb


After the request is made, the containerizer launches a job that runs Kaniko and builds the docker image from the values provided.

The result of the request contains the id of the job that was created. This id can be used to ask for the status of the job.

In [8]:
modzymodel.get_job_status(job_id)

{'result': None,
 'status': {'active': None,
  'completion_time': 'Fri, 18 Jun 2021 10:07:50 GMT',
  'conditions': [{'last_probe_time': 'Fri, 18 Jun 2021 10:07:50 GMT',
    'last_transition_time': 'Fri, 18 Jun 2021 10:07:50 GMT',
    'message': None,
    'reason': None,
    'status': 'True',
    'type': 'Complete'}],
  'failed': None,
  'start_time': 'Fri, 18 Jun 2021 10:04:53 GMT',
  'succeeded': 1}}
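
The build takes a few minutes, so in practice the status has to be polled until the job finishes. The sketch below assumes the response always has the shape shown above, and that a non-empty `failed` field means the build did not succeed; neither assumption is confirmed by the `modzymodel` documentation. The helper names are hypothetical, and the status function is passed in as an argument so that the helpers do not depend on `modzymodel` directly.

```python
import time
from email.utils import parsedate_to_datetime


def job_state(response):
    """Classify a status response as 'succeeded', 'failed' or 'running'."""
    status = response.get('status') or {}
    if status.get('succeeded'):
        return 'succeeded'
    if status.get('failed'):
        return 'failed'
    return 'running'


def build_seconds(response):
    """Seconds between start_time and completion_time, or None if unfinished."""
    status = response.get('status') or {}
    start, end = status.get('start_time'), status.get('completion_time')
    if not (start and end):
        return None
    # The timestamps use the HTTP date format, e.g. 'Fri, 18 Jun 2021 10:04:53 GMT'
    delta = parsedate_to_datetime(end) - parsedate_to_datetime(start)
    return delta.total_seconds()


def wait_for_job(get_status, job_id, interval=5.0, timeout=600.0):
    """Poll get_status(job_id) until the job leaves the 'running' state."""
    deadline = time.monotonic() + timeout
    while True:
        response = get_status(job_id)
        state = job_state(response)
        if state != 'running':
            return state, response
        if time.monotonic() >= deadline:
            raise TimeoutError(f'job {job_id} still running after {timeout}s')
        time.sleep(interval)
```

In this notebook the call would be `wait_for_job(modzymodel.get_job_status, job_id)`. For the response shown above, `build_seconds` gives 177.0, that is, just under three minutes.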

Now, we should be able to see the created image listed in the registry. This means that the service has correctly created the image and uploaded it.

In [9]:
!wget -qO- http://localhost:5001/v2/_catalog

{"repositories":["mlflow-digits-containerized"]}
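
The same check can be done from Python instead of `wget`. This is a minimal sketch using only the standard library; the function names are hypothetical, and the default URL assumes the registry port-forward from earlier is still active.

```python
import json
from urllib.request import urlopen


def parse_catalog(body):
    """Extract the repository names from a /v2/_catalog response body."""
    return json.loads(body).get('repositories', [])


def list_repositories(registry_url='http://localhost:5001'):
    """Fetch and parse the catalog of a registry (needs network access to it)."""
    with urlopen(f'{registry_url}/v2/_catalog') as response:
        return parse_catalog(response.read().decode('utf-8'))


# Parsing the response shown above, without touching the network
body = '{"repositories":["mlflow-digits-containerized"]}'
print('mlflow-digits-containerized' in parse_catalog(body))  # True
```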


### Pull the docker image

Now that the job has finished, we can pull and load the docker image that has been generated.

In [10]:
!docker pull localhost:5001/mlflow-digits-containerized

Using default tag: latest
latest: Pulling from mlflow-digits-containerized

2152171a: Already exists
043ff07a: Pulling fs layer
01523801: Pulling fs layer
db539500: Pulling fs layer
16605b55: Pulling fs layer
903809c0: Pull complete
Digest: sha256:3db68a482375ed0a5c46894825a9030bba0537f05ea6768b073594b9fe6ecaa9
Status: Downloaded newer image for localhost:5001/mlflow-digits-containerized:latest
localhost:5001/mlflow-digits-containerized:latest


In [11]:
!docker images localhost:5001/mlflow-digits-containerized

REPOSITORY                                   TAG       IMAGE ID       CREATED          SIZE
localhost:5001/mlflow-digits-containerized   latest    af270737e355   21 seconds ago   691MB
