# Machine Learning Pipeline with KubeDirector - Lab 4
## Register the trained model and deploy it to a deployment engine service to serve predictions

### **Lab workflow:**

In this lab:

1. As a tenant user, you will register the trained model by creating a ConfigMap resource in the Kubernetes cluster. The ConfigMap object stores metadata about the trained model used to make predictions: the model name, a description, a version, the path to the serialized trained model file (XGB.pickle.dat), and the scoring path, which locates a Python script (XGB_Scoring.py). The inference deployment engine uses that script to deserialize the model and generate predictions from new input data.

2. You will then deploy an inference deployment engine cluster using KubeDirector. The inference deployment engine cluster loads information about the registered model from the ConfigMap object. It is used to stand up services that allow clients to draw predictions from the model you have registered.

**Definitions:**

- *Model registry:* The trained model to be used is identified and characterized in the Kubernetes cluster by a ConfigMap resource. The integrated model registry enables version tracking and seamless updates to models in production.

- *Model inference:* The trained model is deployed to a target "inference deployment engine" KubeDirector cluster in the Kubernetes cluster, which answers prediction queries against the trained model(s) you registered.


#### Initialize the environment:

Let's first define the environment variables needed to execute this lab part.

In [1]:
#
# environment variables to be verified by the student
#
studentId="student74" # your Jupyter Notebook student identifier (i.e.: student<xx>)

# fixed environment variables setup by the HPE ECP lab administrator - Please DO NOT MODIFY!!

gateway_host="hpecpgw1.hp.local"
Internet_access="notebooks2.hpedev.io"

JupyterNotebookApp="cr-cluster-jupyter-notebook.yaml" # the Jupyter Notebook KD App manifest you will deploy to build your model
DeploymentEngineApp="cr-cluster-endpoint-wrapper.yaml" # the Deployment engine KD App manifest you will deploy to query your model for answers 
PipelineConfigMap="ml-pipeline-configmap.yaml" # ConfigMap manifest used to register the trained model version 1 
TrainingModel="model-${studentId}"

echo "Your studentId is: $studentId"

Your studentId is: student74


## Register your trained model

You need to register the trained model in the Kubernetes cluster by creating a ConfigMap resource. The ConfigMap object is used later in a **Connections** stanza to attach the trained model to the inference deployment engine cluster. It stores metadata about the trained model used to make predictions, such as:
* the model name,
* a label: **kubedirector.hpe.com/cmType: "model"**,
* a description,
* a version (for example, 1 for the first version of the model),
* the full path to the serialized trained model file (XGB.pickle.dat),
* the full path to the scoring (prediction) script (XGB_Scoring.py) that the inference deployment engine uses to load (deserialize) the model and make predictions from new data. This process is also known as **_scoring_**, hence the name of the Python script file.

Let's make sure your model registry entry is unique among the tenant users. Here we replace the string "example" with your "studentId" in the ConfigMap resource manifest file.

In [2]:
sed -i "s/example/${studentId}/g" $PipelineConfigMap
cat $PipelineConfigMap

apiVersion: v1
kind: ConfigMap
metadata:
  name: model-student74
  labels:
    kubedirector.hpe.com/cmType: "model"
data:
  name: model-student74
  description: "student74 model"
  model-version: "1"
  path: /bd-fs-mnt/TenantShare/repo/models/NYCTaxi/student74/XGB.pickle.dat
  scoring-path: /bd-fs-mnt/TenantShare/repo/code/NYCTaxi/student74/XGB_Scoring.py
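
Because `sed -i` rewrites the manifest in place, it can be useful to preview a substitution before committing to it. The following is a minimal sketch, not part of the lab: `preview_substitution` is a hypothetical helper name, and the two-line sample file only mimics what the manifest template is assumed to look like before the rename.

```shell
# Preview what "s/example/<id>/g" would change, without modifying the file.
# preview_substitution is a hypothetical helper name used for illustration only.
preview_substitution() {
  local manifest="$1" id="$2"
  # Without -i, sed writes the result to stdout; diff then shows only the changed lines.
  # diff exits with 1 when the inputs differ, so "|| true" keeps the helper from failing.
  sed "s/example/${id}/g" "$manifest" | diff "$manifest" - || true
}

# Demo on a throwaway file (assumed template content, not the real manifest).
sample=$(mktemp)
printf 'name: model-example\ndescription: "example model"\n' > "$sample"
preview_substitution "$sample" "student74"
rm -f "$sample"
```

Lines prefixed with `<` are the current content and lines prefixed with `>` are what the file would contain after the substitution. Running the same `sed` expression a second time is harmless, since the string "example" no longer appears once it has been replaced.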

In [3]:
kubectl apply -f $PipelineConfigMap

configmap/model-student74 created


In [4]:
kubectl get configmap $TrainingModel

NAME              DATA   AGE
model-student74   5      2s
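
The `DATA` column only counts the keys stored in the ConfigMap. To read back a single value, such as the model file path, kubectl's `jsonpath` output format can select one key under `.data`. The sketch below is illustrative only: `get_model_field` is a hypothetical helper name, and the fallback name `model-student74` is taken from the output above.

```shell
# Print one key from the data section of a ConfigMap.
# get_model_field is a hypothetical helper name used for illustration only.
get_model_field() {
  local configmap="$1" key="$2"
  kubectl get configmap "$configmap" -o "jsonpath={.data.${key}}"
}

# Only query the cluster when kubectl is available in this session.
if command -v kubectl >/dev/null 2>&1; then
  get_model_field "${TrainingModel:-model-student74}" path \
    || echo "could not read the ConfigMap"
  echo
fi
```

The same helper can be pointed at `description` or `name` to check that the rename from "example" to your studentId took effect in the registered object.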


## Deploying your model to a deployment engine environment to serve predictions

#### Deploy an instance of the deployment-engine KubeDirector application:
Next, you will deploy an instance of the **deployment-engine** KubeDirector application (kdapp) by creating a KubeDirector virtual cluster (kdcluster). The deployment engine is used to stand up services that will allow clients to draw predictions from the model you have just registered with the ConfigMap resource in the Kubernetes cluster.

The inference deployment engine cluster is a set of microservices with a REST API to serve online predictions. The deployment engine cluster exposes network service endpoints such as a **LoadBalancer and a RESTServer** with token-based authorization. 

First, check that the **deployment-engine** kdapp is registered in the cluster. Then make sure your application deployment name is unique among the tenant users by replacing the string "example" with your "studentId" in the application manifest file.

In [5]:
kubectl get kdapp deployment-engine

NAME                AGE
deployment-engine   22h


In [6]:
sed -i "s/example/${studentId}/g" $DeploymentEngineApp
cat $DeploymentEngineApp

apiVersion: "kubedirector.hpe.com/v1beta1"
kind: "KubeDirectorCluster"
metadata: 
  name: "inference-server-student74"

spec:
  app: deployment-engine
  appCatalog: "local"
  connections: 
    #secrets: 
      #- 
        #"some secrets"
    configmaps: 
      - "model-student74" 
        #"some configmaps"
    #clusters: 
      #-
        #"some clusters"
  roles:
  - id: RESTServer
    members: 1
    resources:
      requests:
        memory: "2Gi"
        cpu: "1"
      limits:
        memory: "2Gi"
        cpu: "1"
  - id: LoadBalancer
    members: 1
    resources:
      requests:
        memory: "2Gi"
        cpu: "1"
      limits:
        memory: "2Gi"
        cpu: "1"    
        


> **Note:** _As with the Jupyter Notebook kdcluster YAML file in Lab 2, this kdcluster manifest file includes a **Connections** stanza. Here the stanza attaches your model from the model registry (that is, the ConfigMap object) to the inference deployment engine cluster. The cluster loads the information about the registered model from the ConfigMap object into a JSON file (**/etc/guestconfig/configmeta.json**) within the deployment engine cluster containers._

In [7]:
kubectl apply -f $DeploymentEngineApp

kubedirectorcluster.kubedirector.hpe.com/inference-server-student74 created


After a few seconds, you should get the response message to your K8s API request: *kubedirectorcluster.kubedirector.hpe.com/Your-instance-name created*.  

#### Inspect the deployed KubeDirector application instance: 
Your application will be represented in the Kubernetes cluster by a custom resource of type **KubeDirectorCluster (kdcluster)**, with the name that was indicated inside the YAML file used to create it. 

In [8]:
clusterName="inference-server-${studentId}"
kubectl get kdcluster $clusterName

NAME                         AGE
inference-server-student74   7s


After creating the instance of the KubeDirector Application, you can use the `kubectl describe kdcluster` command below to observe its status and the standard Kubernetes resources that compose the application virtual cluster (statefulsets, pods, services, persistent volume claim if any), as well as any events logged against it.

The virtual cluster status indicates its overall "state" (top-level property of the status object). It should have a value of **"configured"**. 

> **Note:** _The first time a virtual cluster of a given KubeDirector Application type is created, it may take some minutes to reach its **"configured"** state, as the relevant Docker image must be downloaded and imported._

**Repeat the command below until the kdcluster is in state "configured"**

In [16]:
kubectl describe kdcluster $clusterName

Name:         inference-server-student74
Namespace:    k8smltenant
Labels:       <none>
Annotations:  <none>
API Version:  kubedirector.hpe.com/v1beta1
Kind:         KubeDirectorCluster
Metadata:
  Creation Timestamp:  2020-12-16T18:35:15Z
  Finalizers:
    kubedirector.hpe.com/cleanup
  Generation:        1
  Resource Version:  310354
  Self Link:         /apis/kubedirector.hpe.com/v1beta1/namespaces/k8smltenant/kubedirectorclusters/inference-server-student74
  UID:               9ba7c2ba-077d-4ead-b535-697c2f8252c3
Spec:
  App:          deployment-engine
  App Catalog:  local
  Connections:
    Configmaps:
      model-student74
  Naming Scheme:  UID
  Roles:
    Id:       RESTServer
    Members:  1
    Resources:
      Limits:
        Cpu:     1
        Memory:  2Gi
      Requests:
        Cpu:     1
        Memory:  2Gi
    Id:          LoadBalancer
    Members:     1
    Resources:
      Limits:
        Cpu:     1
        Memory:  2Gi
      Requests:
        Cpu:     1
        Memo
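
Instead of re-running `kubectl describe` by hand, the wait can be scripted as a polling loop. The sketch below is a minimal illustration under stated assumptions: `get_kdcluster_state` and `wait_for_state` are hypothetical helper names, the `.status.state` path follows the description above of "state" as a top-level property of the status object, and the retry count and delay are arbitrary defaults.

```shell
# Read the overall state of a kdcluster (assumes the value lives at .status.state).
# get_kdcluster_state and wait_for_state are hypothetical helper names.
get_kdcluster_state() {
  kubectl get kdcluster "$1" -o 'jsonpath={.status.state}'
}

# Poll until the kdcluster reports the wanted state, or give up after a number of tries.
wait_for_state() {
  local name="$1" wanted="$2" tries="${3:-30}" delay="${4:-10}"
  local i=0 state
  while [ "$i" -lt "$tries" ]; do
    state=$(get_kdcluster_state "$name" 2>/dev/null) || state=""
    echo "attempt $((i + 1)): state=${state:-unknown}"
    if [ "$state" = "$wanted" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Only poll when kubectl is available in this session.
if command -v kubectl >/dev/null 2>&1; then
  wait_for_state "${clusterName:-inference-server-student74}" configured \
    || echo "kdcluster did not reach the configured state in time"
fi
```

With the defaults, the loop checks every 10 seconds for up to 30 attempts, which leaves room for the image download mentioned in the note above.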

In [15]:
kubectl get all -l kubedirector.hpe.com/kdcluster=$clusterName

NAME               READY   STATUS    RESTARTS   AGE
pod/kdss-2xg9c-0   1/1     Running   0          109s
pod/kdss-dqscg-0   1/1     Running   0          109s

NAME                     TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                          AGE
service/kdhs-lpxwg       ClusterIP   None             <none>        8888/TCP                                         111s
service/s-kdss-2xg9c-0   NodePort    10.102.185.101   <none>        8081:32640/TCP,10001:32472/TCP,32700:31713/TCP   111s
service/s-kdss-dqscg-0   NodePort    10.110.81.198    <none>        22:32254/TCP,10001:32596/TCP                     111s

NAME                          READY   AGE
statefulset.apps/kdss-2xg9c   1/1     111s
statefulset.apps/kdss-dqscg   1/1     111s


Your instance of the KubeDirector Application virtual cluster is made up of a **StatefulSet**, a **POD** (a cluster node) and a **NodePort Service** per service role member (LoadBalancer, RESTServer), and a **headless service** for the application cluster.   

* The ClusterIP service is the headless service that a Kubernetes StatefulSet requires to work. It maintains a stable POD network identity (i.e.: the hostname of each POD persists across POD rescheduling).
* The NodePort services expose the LoadBalancer and RESTServer application services with token-based authorization outside the Kubernetes cluster. 

HPE Ezmeral Container Platform automatically maps the NodePort Service endpoints to the HPE Ezmeral Container Platform gateway (haproxy) host.
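
For a NodePort service, each entry in the `PORT(S)` column of the `kubectl get all` output has the form `port:nodePort/protocol`. The sketch below splits such a value into one line per mapping, which makes it easier to see which node ports are exposed. `list_port_mappings` is a hypothetical helper name, and the sample value is copied from the output shown earlier.

```shell
# Split a PORT(S) value such as "8081:32640/TCP,10001:32472/TCP" into one line per mapping.
# list_port_mappings is a hypothetical helper name used for illustration only.
list_port_mappings() {
  echo "$1" | tr ',' '\n' | while IFS=':/' read -r port node_port proto; do
    echo "service port ${port} -> node port ${node_port} (${proto})"
  done
}

list_port_mappings "8081:32640/TCP,10001:32472/TCP,32700:31713/TCP"
# service port 8081 -> node port 32640 (TCP)
# service port 10001 -> node port 32472 (TCP)
# service port 32700 -> node port 31713 (TCP)
```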

Now, follow the instructions in Lab 5 to serve prediction queries:

* [Lab 5 Model Serving](5-WKSHP-K8s-ML-Pipeline-Model-Serving.ipynb)

## Summary

In this lab, we have shown how you, **as a tenant user**, can register a trained model in the Kubernetes cluster with the relevant model information and attach the model from the model registry to a deployment engine cluster.