# Machine Learning Pipeline with KubeDirector - Lab 4
## Move the trained model to production: Register the trained model and deploy it to a deployment engine service to serve predictions

### **Lab workflow**

After you have trained your ML model and saved it to a file in the central repository, it is time to move it to production. You do this by creating a service endpoint: an API service that client applications can call to obtain predictions from the model.

Delivering an ML model to production is a two-step process. We will cover both of these steps in this lab.

1. As a tenant user, you will first register the trained model by creating a ConfigMap resource in the Kubernetes cluster. The ConfigMap object stores metadata about the trained model that will be used to make predictions: the model name, a description, the version, the path to the trained model file (XGB.pickle.dat), and the scoring path. The scoring path locates a Python script (XGB_Scoring.py) that is used to generate predictions from new data.

2. You will then deploy a deployment engine cluster using KubeDirector and attach the registered model to the deployment engine. The deployment engine cluster loads information about the registered model from the ConfigMap object. The deployment engine exposes a set of microservices with a secure RESTful API that allows clients to consume the registered model and draw predictions on new input data.

**Definitions:**

- *Model registry:* The trained model to be used is identified and characterized in the Kubernetes cluster by a ConfigMap resource. The integrated model registry enables version tracking and seamless updates to models in production.

- *Model predictions:* The trained model is deployed to a target _"deployment engine"_ KubeDirector cluster in the Kubernetes cluster, which answers prediction queries using the trained model(s) you registered.


### **1- Initialize the environment**

Let's first define the environment variables needed to execute this part of the lab.

In [None]:
#
# environment variables
#
studentId="student{{ STDID }}" # your Jupyter Notebook student Identifier (i.e.: student<xx>)

gateway_host="{{ HPEECPGWNAME }}"
Internet_access="{{ JPHOSTEXT }}"

JupyterNotebookApp="cr-cluster-jupyter-notebook.yaml" # the Jupyter Notebook KD App manifest you will deploy to build your model
DeploymentEngineApp="cr-cluster-endpoint-wrapper.yaml" # the Deployment engine KD App manifest you will deploy to query your model for answers 
PipelineConfigMap="ml-pipeline-configmap.yaml" # ConfigMap manifest used to register the trained model version 1 
TrainingModel="model-${studentId}"

echo "Your studentId is: $studentId"

### **2- Register your trained model**

You will need to register the trained model in the Kubernetes cluster model registry by creating a ConfigMap resource. The ConfigMap object will be used later in a **_Connections_** stanza to attach the model from the model registry to a deployment engine cluster. The ConfigMap object stores metadata about the trained model to be used to make predictions. It contains information such as:
* the model name, 
* a label: **kubedirector.hpe.com/cmType: "model"**
* a description, 
* a version (for example, 1 for the first version of the model),
* the full path to the trained model (serialized) file (XGB.pickle.dat),
* the full path to the scoring (prediction) script (XGB_Scoring.py), which the deployment engine uses to load (deserialize) the model and generate predictions from new data. This process is also known as **_scoring_**, hence the name of this Python script.
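
For orientation, a registration ConfigMap along these lines could be rendered by a small shell helper. This is only a sketch: the function name `render_model_configmap`, the `data` key names (`name`, `description`, `model-version`, `path`, `scoring-path`), and the placeholder paths are assumptions made for illustration. The manifest supplied with the lab (`$PipelineConfigMap`) remains the reference.

```shell
# Hypothetical helper: print a model-registry ConfigMap manifest to stdout.
# The data keys and paths are illustrative assumptions, not the lab's actual manifest.
render_model_configmap() {
  local cm_name="$1"       # ConfigMap name, e.g. model-student1
  local version="$2"       # model version, e.g. 1
  local model_path="$3"    # full path to the serialized model file
  local scoring_path="$4"  # full path to the scoring script
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: ${cm_name}
  labels:
    kubedirector.hpe.com/cmType: "model"
data:
  name: "${cm_name}"
  description: "Example trained model"
  model-version: "${version}"
  path: "${model_path}"
  scoring-path: "${scoring_path}"
EOF
}

# Example with placeholder paths:
# render_model_configmap "model-student1" 1 /path/to/XGB.pickle.dat /path/to/XGB_Scoring.py
```

The `kubedirector.hpe.com/cmType: "model"` label is what marks the ConfigMap as a model-registry entry; the remaining fields are plain string metadata.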

#### Create the ConfigMap resource using a YAML manifest file:
The `kubectl apply -f ManifestAppFile` command is used to create the ConfigMap resource. The application manifest is a YAML file that describes the registry information for the trained model. 

In [None]:
cat $PipelineConfigMap

In [None]:
kubectl apply -f $PipelineConfigMap

In [None]:
kubectl get configmap $TrainingModel
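
Because a registry entry is simply a labelled ConfigMap, a label selector can list every registered model in the namespace, and a JSONPath query can print the stored metadata. The sketch below assumes the `kubedirector.hpe.com/cmType: "model"` label described above is set on the ConfigMap; the function names are hypothetical.

```shell
# Hypothetical helper: list ConfigMaps that carry the model-registry label.
list_registered_models() {
  kubectl get configmap -l "kubedirector.hpe.com/cmType=model" --show-labels
}

# Hypothetical helper: print the metadata (data section) stored in one ConfigMap.
show_model_metadata() {
  local cm_name="$1"
  kubectl get configmap "$cm_name" -o jsonpath='{.data}'
}

# Example:
# list_registered_models
# show_model_metadata "$TrainingModel"
```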

### **3- Deploy your model to a deployment engine environment to serve predictions**

#### Create the manifest file and deploy an instance of the _deployment-engine_ KubeDirector application:
You will now deploy an instance of the _**deployment-engine**_ KubeDirector application (kdapp) by creating a KubeDirector virtual cluster (kdcluster). The deployment engine cluster environment is used to stand up services that will allow clients to draw predictions from the model you have just registered with the ConfigMap resource in the Kubernetes cluster.

The deployment engine cluster is a set of microservices with a REST API to serve online predictions. The deployment engine cluster exposes network service endpoints such as a **LoadBalancer and a RESTServer** with token-based authorization. 

Like any other containerized application deployment on Kubernetes, the `kubectl apply -f ManifestAppFile` command is used to deploy the kdcluster. 

> **Note:** _As with the Jupyter Notebook kdcluster YAML file in Lab 2, this kdcluster manifest file includes the **Connections** stanza. Here, the Connections stanza attaches your model from the model registry (that is, the ConfigMap object) to the deployment engine cluster. The deployment engine cluster loads information about the registered model from the ConfigMap object into a JSON file (**/etc/guestconfig/configmeta.json**) within the deployment engine cluster containers._

In [None]:
cat $DeploymentEngineApp
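
To focus on the part of the manifest that attaches the model, the Connections stanza can be extracted on its own. The helper below is a sketch that assumes the stanza begins with a `connections:` key and that its entries are indented with spaces more deeply than the key; the function name `show_connections` is hypothetical.

```shell
# Hypothetical helper: print only the connections stanza of a kdcluster manifest.
# Assumes a "connections:" key whose child lines are indented deeper than the key.
show_connections() {
  local manifest="$1"
  awk '
    /^ *connections:/ { indent = match($0, /[^ ]/); inside = 1; print; next }
    inside {
      if ($0 ~ /^ *$/) next                                  # skip blank lines
      if (match($0, /[^ ]/) <= indent) { inside = 0; next }  # stanza ended
      print
    }
  ' "$manifest"
}

# Example: show_connections "$DeploymentEngineApp"
```

The ConfigMap name listed in the stanza should match the name of the ConfigMap you registered in the previous step.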

In [None]:
kubectl apply -f $DeploymentEngineApp

After a few seconds, you should get the response message: *kubedirectorcluster/Your-instance-name created*.  

### **4- Inspect the deployed KubeDirector application instance** 
Your application will be represented in the Kubernetes cluster by a custom resource of type **KubeDirectorCluster (kdcluster)**, with the name that was indicated inside the YAML file used to create it. 

In [None]:
clusterName="inference-server-${studentId}"
kubectl get kdcluster $clusterName

After creating the instance of the KubeDirector application, you can use the `kubectl describe kdcluster` command below to observe its status and any events logged against it.

The virtual cluster status indicates its overall "state" (top-level property of the status object). It should have a value of **"configured"**. 

> **Note:** _The first time a virtual cluster of a given KubeDirector application type is created, it may take several minutes to reach its **"configured"** state, as the relevant Docker image must be downloaded and imported._ 

**>Run the `kubectl describe` command below and scroll down to the `Events` section to check the overall state of your kdcluster.**

**>Repeat the command below every minute or so until the kdcluster reaches the "configured" state.**

In [None]:
kubectl describe kdcluster $clusterName
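
Instead of re-running the command by hand, the check can be scripted. The sketch below polls the top-level `state` property of the status object with a JSONPath query until it reads "configured" or a timeout expires. The function name `wait_for_configured` and the default timeout and polling interval are assumptions chosen for illustration.

```shell
# Hypothetical helper: poll a kdcluster until its status.state is "configured".
# Usage: wait_for_configured <kdcluster-name> [timeout-seconds] [interval-seconds]
wait_for_configured() {
  local name="$1"
  local timeout="${2:-600}"
  local interval="${3:-30}"
  local waited=0
  local state=""
  while [ "$waited" -le "$timeout" ]; do
    # .status.state is the top-level "state" property of the status object
    state="$(kubectl get kdcluster "$name" -o jsonpath='{.status.state}' 2>/dev/null)"
    echo "[${waited}s] state: ${state:-unknown}"
    if [ "$state" = "configured" ]; then
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "Timed out after ${timeout}s waiting for ${name}" >&2
  return 1
}

# Example: wait_for_configured "$clusterName" 900 60
```

If the function times out, `kubectl describe kdcluster $clusterName` is still the place to look for events that explain why the cluster has not reached the "configured" state.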

You can use the `kubectl get pod,service,statefulset` command with a label selector matching **kubedirector.hpe.com/kdcluster=YourClusterApplicationName** to observe the standard Kubernetes resources that compose the application virtual cluster:

In [None]:
kubectl get pod,service,statefulset -l kubedirector.hpe.com/kdcluster=$clusterName

Your instance of the KubeDirector application virtual cluster is made up of a **StatefulSet**, a **Pod** (a cluster node), and a **NodePort Service** for each service role member (LoadBalancer, RESTServer), plus a **headless service** for the application cluster.

* The ClusterIP service is the headless service that a Kubernetes StatefulSet requires. It gives the Pods a stable network identity (that is, Pod hostnames persist across rescheduling).
* The NodePort services expose the LoadBalancer and RESTServer application services, which use token-based authorization, outside the Kubernetes cluster.
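
To tell the two kinds of service apart at a glance, a JSONPath query can print each service's name, type, cluster IP, and node port(s). This is a sketch with a hypothetical function name, `list_kdcluster_services`. A headless service reports a cluster IP of `None` and has no node port, while the NodePort services show the port on which they are reachable from outside the cluster.

```shell
# Hypothetical helper: print name, type, cluster IP and node port(s) of each
# service that belongs to the given kdcluster (tab-separated, one per line).
list_kdcluster_services() {
  local name="$1"
  kubectl get service \
    -l "kubedirector.hpe.com/kdcluster=${name}" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.type}{"\t"}{.spec.clusterIP}{"\t"}{.spec.ports[*].nodePort}{"\n"}{end}'
}

# Example: list_kdcluster_services "$clusterName"
```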

Now, follow the instructions in Lab 5 to serve prediction queries.

* [Lab 5 Model Serving](5-WKSHP-K8s-ML-Pipeline-Model-Serving.ipynb)

## Summary

In this lab, you learned how you can deliver a trained model to production and make it available for answering prediction queries. You first registered the trained model in the Kubernetes cluster with relevant model information in a ConfigMap resource. You then deployed the registered model to a target deployment engine environment that exposes a REST API service endpoint to serve predictions.