# Machine Learning Pipeline with KubeDirector - Lab 5
## Serving prediction queries

### **Lab workflow:**

You have built and trained a model from a dataset, registered the trained model, and deployed it in a deployment-engine kdcluster. Now it is time to make predictions (e.g., _how long will my taxi ride take?_) with new data. The **LoadBalancer** service port on the inference deployment engine can now serve REST API queries that use the trained model to make predictions.

In this lab, you will use kubectl commands in the context of your tenant user account to get the LoadBalancer network service endpoint and the authentication token of your inference deployment engine.

You will then use **_cURL_** to send queries to your prediction service.

Finally, you will experience the **dynamic** aspect of your ML pipeline.

**Definitions:**

- *Model inference:* The trained model is deployed to a target "inference deployment engine" KubeDirector cluster in the Kubernetes cluster, where it answers prediction queries using the trained model you registered.

- *Scoring:* Scoring denotes the process of generating predicted values from new data.


#### Initialize the environment:

Let's first define the environment variables needed for this part of the lab.

In [1]:
#
# environment variables to be verified by the student
#
studentId="student74" # your Jupyter Notebook student identifier (e.g., student<xx>)

# fixed environment variables set up by the HPE ECP lab administrator - Please DO NOT MODIFY!!

gateway_host="hpecpgw1.hp.local"
Internet_access="notebooks2.hpedev.io"

JupyterNotebookApp="cr-cluster-jupyter-notebook.yaml" # the Jupyter Notebook KD App manifest you will deploy to build your model
DeploymentEngineApp="cr-cluster-endpoint-wrapper.yaml" # the Deployment engine KD App manifest you will deploy to query your model for answers 
PipelineConfigMap="ml-pipeline-configmap.yaml" # ConfigMap manifest used to register the trained model version 1 
PipelineConfigMapv2="ml-pipeline-configmap-v2.yaml" # ConfigMap manifest used to register the trained model version 2 
#
#
#
clusterName="inference-server-${studentId}"
#
# Model registry information
#
TrainingModel="model-${studentId}"
modelVersion="1"
#
echo "Your studentId is: "$studentId 

Your studentId is: student74
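
Every later cell depends on these variables, so it can help to confirm they are set before moving on. The following is a minimal sketch only: `require_vars` is a hypothetical helper that is not part of the lab, and the default values are the sample values shown above, used only when a variable is not already defined.

```shell
# Hypothetical helper (not part of the lab): report any required variable that is empty.
require_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Missing variable: $name"
            missing=1
        fi
    done
    return $missing
}

# Keep values already defined in your session; fall back to the sample values otherwise
studentId="${studentId:-student74}"
gateway_host="${gateway_host:-hpecpgw1.hp.local}"
clusterName="${clusterName:-inference-server-${studentId}}"
TrainingModel="${TrainingModel:-model-${studentId}}"
modelVersion="${modelVersion:-1}"

require_vars studentId gateway_host clusterName TrainingModel modelVersion && echo "All variables are set"
```

If a name is reported as missing, re-run the initialization cell before continuing.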


## Serving queries through the Load Balancer service of the deployment engine cluster

### Get the gateway mapped application service endpoint and the Authentication token of the Load Balancer of the deployment engine kdcluster:
To get a report on all services related to a specific virtual cluster, you can use a form of **kubectl describe** that selects services by the **kubedirector.hpe.com/kdcluster=YourClusterApplicationName,kubedirector.hpe.com/role=LoadBalancer** labels.

In [2]:
#
# Getting the access point for the HAPROXY service of the LoadBalancer (role: LoadBalancer, internal port: 32700)
#
LoadBalancerURL=$(kubectl describe service -l kubedirector.hpe.com/kdcluster=${clusterName},kubedirector.hpe.com/role=LoadBalancer | grep gateway/32700 | awk '{print $2}')
LoadBalancerPort=$(echo $LoadBalancerURL | cut -d':' -f 2) # extract the gateway re-mapped port value.
LoadBalancer_endpoint="https://$gateway_host:$LoadBalancerPort"
echo "Your deployment-engine's LoadBalancer service endpoint re-mapped port is: "$LoadBalancerPort
echo "Your deployment-engine's LoadBalancer service endpoint is: "$LoadBalancer_endpoint
#echo "The LoadBalancer service endpoint URL is: https://"$Internet_access:$LoadBalancerPort
#
# Getting the auth-token:
#
LoadBalancerAuthToken=$(kubectl describe service -l kubedirector.hpe.com/kdcluster=${clusterName},kubedirector.hpe.com/role=LoadBalancer | grep kd-auth-token  | awk '{print $2}' | tr -d '\r')
echo "The deployment-engine's Load Balancer service authentication token is: "$LoadBalancerAuthToken

Your deployment-engine's LoadBalancer service endpoint re-mapped port is: 10151
Your deployment-engine's LoadBalancer service endpoint is: https://hpecpgw1.hp.local:10151
The deployment-engine's Load Balancer service authentication token is: 88f08f4023855961a2d40e8b12abd888
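
To see what the `grep | awk | cut` chain in the cell above does, here is a minimal offline sketch. The sample line is an assumption about what the matched line of the `kubectl describe` output looks like; the real annotation name and layout may differ in your environment. The `case` guard is an addition for illustration and is not part of the lab.

```shell
# Hypothetical sample of the line matched by "grep gateway/32700"; real output may differ.
sample_line="hpecp-internal-gateway/32700:  hpecpgw1.hp.local:10151"

# Second whitespace-separated field is host:port
url=$(echo "$sample_line" | grep gateway/32700 | awk '{print $2}')

# Second colon-separated field is the gateway re-mapped port
port=$(echo "$url" | cut -d':' -f 2)

# Guard against an empty or non-numeric port before building the endpoint
case "$port" in
    ''|*[!0-9]*) echo "Could not extract a port; is the kdcluster ready?" ;;
    *)           echo "Re-mapped port: $port" ;;
esac
```

An empty port usually means the selector matched no service, so the endpoint variable would be unusable in the cURL calls that follow.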


### Making predictions on new data
To make a prediction, send an authenticated "POST" API call formulated as follows:  
https://loadbalancer_endpoint/registeredModel/modelVersion/predict

The query below predicts how long a taxi ride in New York City with the following attributes will take:
* pickup location: West 23rd street
* dropoff location: Centre Market place 
* on a weekday 
* at 09:00 am 
* in February

>Note: _It may take a few seconds to get the result of the REST API call_

In [3]:
curl --location -k -s --request POST "${LoadBalancer_endpoint}/${TrainingModel}/${modelVersion}/predict" \
--header "X-Auth-Token: ${LoadBalancerAuthToken}" \
--header 'Content-Type: application/json' \
--data-raw '{
    "use_scoring": true,
    "scoring_args": {
        "work": 0,
        "start_latitude": 40.57689727,
        "start_longitude": -73.99047356,
        "end_latitude": 40.72058154,
        "end_longitude": -73.99740673,
        "distance": 8,
        "weekday": 1,
        "hour": 9,
        "month_1": 0,
        "month_2": 1,
        "month_3": 0,
        "month_4": 0,
        "month_5": 0,
        "month_6": 0
    }
}' | python -m json.tool | grep output | cut -d'\' -f 1

    "output": "The ride duration prediction is 3211.7134 seconds.
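
The prediction is returned in seconds inside a sentence. As a minimal sketch (not part of the lab), the number can be pulled out of the output line shown above and converted to minutes:

```shell
# Output line copied from the cell above
line='    "output": "The ride duration prediction is 3211.7134 seconds.'

# Pull out the first decimal number on the line
seconds=$(echo "$line" | grep -o '[0-9][0-9]*\.[0-9][0-9]*' | head -n 1)

# Convert to minutes, rounded to one decimal place
minutes=$(awk -v s="$seconds" 'BEGIN { printf "%.1f", s / 60 }')
echo "Predicted duration: ${minutes} minutes"
```

For the sample line this gives roughly 53.5 minutes.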


> Note: 
> * work is a boolean for work hours (1 if the ride occurs Mon-Fri 8am-5pm, 0 otherwise)
> * start_latitude is the pickup location latitude
> * start_longitude is the pickup location longitude
> * end_latitude is the dropoff location latitude
> * end_longitude is the dropoff location longitude
> * distance is the trip distance in miles
> * weekday is a boolean for weekday (1 if the ride occurs on Mon-Fri, 0 otherwise)
> * hour is the hour of day (0 to 23)
> * month_1 is a boolean for January (1 if the ride is in January, 0 otherwise)
> * month_2 is a boolean for February (1 if the ride is in February, 0 otherwise)
> * month_3 is a boolean for March (1 if the ride is in March, 0 otherwise)
> * month_4 is a boolean for April (1 if the ride is in April, 0 otherwise)
> * month_5 is a boolean for May (1 if the ride is in May, 0 otherwise)
> * month_6 is a boolean for June (1 if the ride is in June, 0 otherwise)
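
The six `month_*` fields form a one-hot encoding: exactly one of them is 1. As a minimal sketch, the hypothetical helper below (not part of the lab) generates these fields from a month number so that they do not have to be edited by hand. It assumes a month between 1 and 6, since the payload only has fields for January to June.

```shell
# Hypothetical helper (not part of the lab): print the month_1..month_6 fields
# for a month number between 1 and 6, following the one-hot layout described above.
month_fields() {
    sep=""
    for i in 1 2 3 4 5 6; do
        if [ "$i" -eq "$1" ]; then flag=1; else flag=0; fi
        printf '%s"month_%d": %d' "$sep" "$i" "$flag"
        sep=", "
    done
    printf '\n'
}

month_fields 2   # February
```

The printed fragment can be pasted into the `scoring_args` object of the request body.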

## Serving queries through the RESTServer service:
You might want to make queries directly through the RESTServer of the deployment-engine.

To get a report on all services related to a specific virtual cluster, you can use a form of **kubectl describe** that selects services by the **kubedirector.hpe.com/kdcluster=YourClusterApplicationName,kubedirector.hpe.com/role=RESTServer** labels.

In [4]:
#
# Getting the RESTServer service endpoint URL (role: RESTServer, internal port: 10001):
#
RESTServerURL=$(kubectl describe service -l kubedirector.hpe.com/kdcluster=${clusterName},kubedirector.hpe.com/role=RESTServer | grep gateway/10001 | awk '{print $3}')
RESTServerPort=$(echo $RESTServerURL | cut -d':' -f 2) # extract the gateway re-mapped port value.
RESTServer_endpoint="https://$gateway_host:$RESTServerPort"
echo "The RESTServer service endpoint re-mapped port is: "$RESTServerPort
echo "Your RESTServer service endpoint is: "$RESTServer_endpoint
#echo "The RESTServer service endpoint URL is: https://"$Internet_access:$RESTServerPort
#
# Getting the auth-token:
#
RESTServerAuthToken=$(kubectl describe service -l kubedirector.hpe.com/kdcluster=${clusterName},kubedirector.hpe.com/role=RESTServer | grep kd-auth-token  | awk '{print $2}' | tr -d '\r')
echo "The RESTServer service authentication token is: "$RESTServerAuthToken

The RESTServer service endpoint re-mapped port is: 10149
Your RESTServer service endpoint is: https://hpecpgw1.hp.local:10149
The RESTServer service authentication token is: 20941723f2cac5d91cf56a8d72ab6ed1


In [5]:
curl --location -k -s --request POST "${RESTServer_endpoint}/${TrainingModel}/${modelVersion}/predict" \
--header "X-Auth-Token: ${RESTServerAuthToken}" \
--header 'Content-Type: application/json' \
--data-raw '{
    "use_scoring": true,
    "scoring_args": {
        "work": 0,
        "start_latitude": 40.57689727,
        "start_longitude": -73.99047356,
        "end_latitude": 40.72058154,
        "end_longitude": -73.99740673,
        "distance": 8,
        "weekday": 1,
        "hour": 9,
        "month_1": 0,
        "month_2": 1,
        "month_3": 0,
        "month_4": 0,
        "month_5": 0,
        "month_6": 0
    }
}' | python -m json.tool | grep output | cut -d'\' -f 1

    "output": "The ride duration prediction is 3211.7134 seconds.
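
The same cURL call is used for both the LoadBalancer and the RESTServer; only the endpoint and the token change. A minimal sketch of how the call could be wrapped for reuse is shown below. `predict_url` and `predict` are hypothetical helper names, not part of the lab, and `predict` relies on the `TrainingModel` and `modelVersion` variables defined earlier.

```shell
# Hypothetical helpers (not part of the lab) wrapping the repeated cURL call.
predict_url() {
    # $1 = service endpoint, $2 = registered model name, $3 = model version
    echo "$1/$2/$3/predict"
}

predict() {
    # $1 = service endpoint, $2 = auth token, $3 = JSON request body
    curl --location -k -s --request POST "$(predict_url "$1" "$TrainingModel" "$modelVersion")" \
        --header "X-Auth-Token: $2" \
        --header 'Content-Type: application/json' \
        --data-raw "$3"
}

# Show the URL that would be called, using the sample values from the output above
predict_url "https://hpecpgw1.hp.local:10149" "model-student74" "1"
```

With the JSON body stored in a variable such as `payload`, a call would then look like `predict "$RESTServer_endpoint" "$RESTServerAuthToken" "$payload"`.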


# Dynamic? Did someone say Dynamic ML pipeline?

Over time, the accuracy of the predictions will drop, your dataset will change, your model will improve, and your scoring script may change as well. The ML pipeline needs to adapt to constantly changing datasets and enhanced ML models.

In this part of the lab, let's imagine you want to improve the prediction accuracy of your deployed model. You will _retrain_ your model by tuning the model parameters to get better predictive performance and save your retrained model into a new file. You will then update the model registry information in the ConfigMap Kubernetes resource with the file paths of the new trained model file and the new scoring script file.

#### Go back to your **local Jupyter Notebook**, and run the code cells from the section **"Retrain the model to improve model accuracy"**. Once your model is retrained, come back here and continue with the code cells below to adjust the model registry information accordingly.

## Adjust the model registry information

In [6]:
sed -i "s/example/${studentId}/g" $PipelineConfigMapv2
cat $PipelineConfigMapv2

apiVersion: v1
kind: ConfigMap
metadata:
  name: model-student74
  labels:
    kubedirector.hpe.com/cmType: "model"
data:
  name: model-student74
  description: "student74 model"
  model-version: "1"
  path: /bd-fs-mnt/TenantShare/repo/models/NYCTaxi/student74/XGB.picklev2.dat
  scoring-path: /bd-fs-mnt/TenantShare/repo/code/NYCTaxi/student74/XGB_Scoringv2.py
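
`sed -i` rewrites the manifest in place. As a minimal sketch (not part of the lab), the same substitution can be previewed first by dropping `-i`, shown here on a throwaway file with made-up content:

```shell
# Work on a throwaway file so nothing in the lab directory is touched
tmp=$(mktemp)
printf 'metadata:\n  name: model-example\n' > "$tmp"

# Without -i, sed prints the result and leaves the file unchanged
preview=$(sed "s/example/student74/g" "$tmp")
echo "$preview"

rm -f "$tmp"
```

Applied to the lab file, the preview form would be `sed "s/example/${studentId}/g" $PipelineConfigMapv2`.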

In [7]:
kubectl apply -f $PipelineConfigMapv2

configmap/model-student74 configured


**Run the command below and check the events logged against the ConfigMap**

In the events section at the bottom of the command output, you should notice the message:  
_"Connected to cluster {your deployment-engine cluster name}; updating it."_ 

KubeDirector is updating the model registry metadata on the pods of your deployment-engine cluster.

In [8]:
kubectl describe configmap $TrainingModel

Name:         model-student74
Namespace:    k8smltenant
Labels:       kubedirector.hpe.com/cmType=model
Annotations:  <none>

Data
====
description:
----
student74 model
model-version:
----
1
name:
----
model-student74
path:
----
/bd-fs-mnt/TenantShare/repo/models/NYCTaxi/student74/XGB.picklev2.dat
scoring-path:
----
/bd-fs-mnt/TenantShare/repo/code/NYCTaxi/student74/XGB_Scoringv2.py
Events:
  Type    Reason   Age   From          Message
  ----    ------   ----  ----          -------
  Normal  Cluster  3s    kubedirector  connected to cluster {inference-server-student74}; updating it
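
Rather than reading the events by eye, the message can be checked with `grep`. This is a minimal sketch: `has_update_event` is a hypothetical helper that is not part of the lab, and the sample line is copied from the output above.

```shell
# Hypothetical helper: succeed when the text on stdin contains the KubeDirector update event.
has_update_event() {
    grep -q "connected to cluster .*; updating it"
}

# Event line copied from the output above
sample='  Normal  Cluster  3s    kubedirector  connected to cluster {inference-server-student74}; updating it'

if echo "$sample" | has_update_event; then
    echo "ConfigMap change was picked up"
else
    echo "No update event found yet"
fi
```

Against the live resource, the check would read `kubectl describe configmap $TrainingModel | has_update_event`.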


**Run the command below and check the events logged against the kdcluster**
  
In the events section at the bottom of the command output, you should notice the message:  

_"connected configmap has changes, updated context for the PODs of your instance of the deployment engine cluster"._ 

#### When you change the ConfigMap, the KubeDirector operator reconciles your entire ML pipeline for you, while the containers of your deployment-engine cluster instance remain running. **This is what makes your ML pipeline created with KubeDirector very dynamic.**

In [9]:
kubectl describe kdcluster $clusterName

Name:         inference-server-student74
Namespace:    k8smltenant
Labels:       <none>
Annotations:  kubedirector.hpe.com/connUpdateCounter: 1
              kubedirector.hpe.com/hashChangeCounter: 1
API Version:  kubedirector.hpe.com/v1beta1
Kind:         KubeDirectorCluster
Metadata:
  Creation Timestamp:  2020-12-16T18:35:15Z
  Finalizers:
    kubedirector.hpe.com/cleanup
  Generation:        1
  Resource Version:  311614
  Self Link:         /apis/kubedirector.hpe.com/v1beta1/namespaces/k8smltenant/kubedirectorclusters/inference-server-student74
  UID:               9ba7c2ba-077d-4ead-b535-697c2f8252c3
Spec:
  App:          deployment-engine
  App Catalog:  local
  Connections:
    Configmaps:
      model-student74
  Naming Scheme:  UID
  Roles:
    Id:       RESTServer
    Members:  1
    Resources:
      Limits:
        Cpu:     1
        Memory:  2Gi
      Requests:
        Cpu:     1
        Memory:  2Gi
    Id:          LoadBalancer
    Members:     1
    Resources:
      Limi

Let's make another query using your retrained model:

In [10]:
curl --location -k -s --request POST "${LoadBalancer_endpoint}/${TrainingModel}/${modelVersion}/predict" \
--header "X-Auth-Token: ${LoadBalancerAuthToken}" \
--header 'Content-Type: application/json' \
--data-raw '{
    "use_scoring": true,
    "scoring_args": {
        "work": 0,
        "start_latitude": 40.57689727,
        "start_longitude": -73.99047356,
        "end_latitude": 40.72058154,
        "end_longitude": -73.99740673,
        "distance": 8,
        "weekday": 1,
        "hour": 9,
        "month_1": 0,
        "month_2": 1,
        "month_3": 0,
        "month_4": 0,
        "month_5": 0,
        "month_6": 0
    }
}' | python -m json.tool | grep output | cut -d'\' -f 1

    "output": "The ride duration prediction is 3412.6396 seconds.
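
The same query now returns a different value, which shows that the retrained model is the one being served. A minimal sketch of the comparison, using the two predictions printed in this notebook:

```shell
before=3211.7134   # prediction from the original model
after=3412.6396    # prediction from the retrained model

# Difference in seconds, rounded to one decimal place
delta=$(awk -v a="$after" -v b="$before" 'BEGIN { printf "%.1f", a - b }')
echo "Prediction changed by ${delta} seconds"
```

A different prediction does not by itself show that accuracy improved; that can only be judged against known ride durations.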


# Time to go through some cleanup

### Delete your deployment engine and your local Jupyter Notebook

In [11]:
kubectl delete -f $DeploymentEngineApp

kubedirectorcluster.kubedirector.hpe.com "inference-server-student74" deleted


In [12]:
kubectl delete -f $PipelineConfigMap

configmap "model-student74" deleted


In [13]:
kubectl delete -f $JupyterNotebookApp

kubedirectorcluster.kubedirector.hpe.com "jupyter-notebook-student74" deleted


### Reset the application files

In [14]:
#reset the application deployment name in the YAML file
sed -i "s/${studentId}/example/g" $JupyterNotebookApp
sed -i "s/${studentId}/example/g" $DeploymentEngineApp
sed -i "s/${studentId}/example/g" $PipelineConfigMap
sed -i "s/${studentId}/example/g" $PipelineConfigMapv2
cat $JupyterNotebookApp
cat $DeploymentEngineApp
cat $PipelineConfigMap
cat $PipelineConfigMapv2

apiVersion: "kubedirector.hpe.com/v1beta1"
kind: "KubeDirectorCluster"
metadata:
  name: "jupyter-notebook-example"
spec:
  app: "jupyter-notebook-v1"
  appCatalog: "local"
  connections: 
    #secrets: 
      #- 
        #"some secrets"
    #configmaps: 
      #- 
        #"some configmaps"
    clusters: 
      - "training-engine-shared"
        #"some clusters"
  roles:
  - id: controller
    resources:
      requests:
        memory: "2Gi"
        cpu: "1"
      limits:
        memory: "2Gi"
        cpu: "1"
apiVersion: "kubedirector.hpe.com/v1beta1"
kind: "KubeDirectorCluster"
metadata: 
  name: "inference-server-example"

spec:
  app: deployment-engine
  appCatalog: "local"
  connections: 
    #secrets: 
      #- 
        #"some secrets"
    configmaps: 
      - "model-example" 
        #"some configmaps"
    #clusters: 
      #-
        #"some clusters"
  roles:
  - id: RESTServer
    members: 1
    resources:
      requests:
        memory: "2Gi"
        cpu: "1"
      limits:
 

## Summary

In this lab, we have shown how you, **as a tenant user**, can make predictions with a trained model registered in a Kubernetes cluster. You also learned that the combination of KubeDirector Applications, Clusters and Connections is what makes your ML pipeline very dynamic.


* [Conclusion](6-Conclusion.ipynb)