# Machine Learning Pipeline with KubeDirector - Lab 5
## Serving prediction queries

### **Lab workflow**

You have built and trained a model from a dataset, registered the trained model, and deployed it to a deployment engine kdcluster. Now it is time to make predictions on new data (e.g.: _how long will my taxi ride take?_). The **LoadBalancer** service port on the deployment engine can now be used to serve REST API queries against the trained model and return predictions.

In this lab:

1. You will use kubectl commands in the context of your tenant user account to get the LoadBalancer network service endpoint and the authentication token of your inference deployment engine.

2. You will then use a **_cURL_** command to send queries to your prediction service.


**Definitions:**

- *Model predictions:* The trained model is deployed to a target _"deployment engine"_ KubeDirector cluster running in the Kubernetes cluster. This deployment engine answers prediction queries using the trained model you registered.

- *Scoring:* Scoring denotes the process of generating predicted values from new data.


### **1- Initialize the environment**

Let's first define the environment variables needed to execute this part of the lab.

In [1]:
#
# environment variables
#
studentId="student75" # your Jupyter Notebook student Identifier (i.e.: student<xx>)

gateway_host="haecpgtw.etc.fr.comm.hpecorp.net"
Internet_access="notebooks.hpedev.io"

JupyterNotebookApp="cr-cluster-jupyter-notebook.yaml" # the Jupyter Notebook KD App manifest you will deploy to build your model
DeploymentEngineApp="cr-cluster-endpoint-wrapper.yaml" # the Deployment engine KD App manifest you will deploy to query your model for answers 
PipelineConfigMap="ml-pipeline-configmap.yaml" # ConfigMap manifest used to register the trained model version 1 
PipelineConfigMapv2="ml-pipeline-configmap-v2.yaml" # ConfigMap manifest used to register the trained model version 2 
#
clusterName="inference-server-${studentId}"
#
# Model registry information
#
TrainingModel="model-${studentId}"
modelVersion="1"
#
echo "Your studentId is: "$studentId

Your studentId is: student75


### **2- Serving queries through the Load Balancer service of the deployment engine cluster**

#### Get the service endpoint and the Authentication token of the Load Balancer service of the deployment engine kdcluster:
To get a report on all services related to a specific virtual cluster, you can use a form of **kubectl describe** that selects services by label. Here the label selector is **kubedirector.hpe.com/kdcluster=YourClusterApplicationName,kubedirector.hpe.com/role=LoadBalancer**.

In [2]:
#
# Getting the access point for the HAPROXY service of the LoadBalancer (role: LoadBalancer, internal port: 32700)
#
LoadBalancerURL=$(kubectl describe service -l kubedirector.hpe.com/kdcluster=${clusterName},kubedirector.hpe.com/role=LoadBalancer | grep gateway/32700 | awk '{print $2}')
LoadBalancerPort=$(echo $LoadBalancerURL | cut -d':' -f 2) # extract the gateway re-mapped port value.
LoadBalancer_endpoint="https://$gateway_host:$LoadBalancerPort"
echo "Your deployment-engine's LoadBalancer service endpoint re-mapped port is: "$LoadBalancerPort
echo "Your deployment-engine's LoadBalancer service endpoint is: "$LoadBalancer_endpoint
#echo "The LoadBalancer service endpoint URL is: https://"$Internet_access:$LoadBalancerPort
#
# Getting the auth-token:
#
LoadBalancerAuthToken=$(kubectl describe service -l kubedirector.hpe.com/kdcluster=${clusterName},kubedirector.hpe.com/role=LoadBalancer | grep kd-auth-token  | awk '{print $2}' | tr -d '\r')
echo "The deployment-engine's Load Balancer service authentication token is: "$LoadBalancerAuthToken

Your deployment-engine's LoadBalancer service endpoint re-mapped port is: 10056
Your deployment-engine's LoadBalancer service endpoint is: https://haecpgtw.etc.fr.comm.hpecorp.net:10056
The deployment-engine's Load Balancer service authentication token is: f89a4f9489a023631b63d203a2aa57c7
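
Both values above are extracted by filtering the `kubectl describe` output with `grep`, so a variable can silently end up empty, for example if the label selector matches no service. The following is a minimal sketch of a guard you could run before querying the model. `require_vars` is a hypothetical helper introduced here for illustration; it is not part of the lab material.

```shell
# Hypothetical helper (not part of the lab): succeed only if every
# named shell variable is set and non-empty.
require_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable called $name; the names are
        # hard-coded by the caller, so eval is acceptable in this sketch.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "Missing value for: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Warn instead of sending a request with an empty port or token.
require_vars LoadBalancerPort LoadBalancerAuthToken \
    || echo "Re-run the previous cell once the LoadBalancer service is available."
```

If the function reports a missing value, checking the service with the same `kubectl describe` command and label selector is a reasonable first step.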


### **3- Making predictions on new data**
To make a prediction, you create an authenticated "POST" API call that is formulated as follows:  
`<LoadBalancer_endpoint>/<registeredModel>/<modelVersion>/predict`

The query below predicts how long a taxi ride in New York City with the following attributes will take:
* pickup location: West 23rd street
* dropoff location: Centre Market place 
* on a weekday 
* at 09:00 am 
* in February

>Note: _It may take a few seconds to get the result of the REST API call_

In [3]:
curl --location -k -s --request POST "${LoadBalancer_endpoint}/${TrainingModel}/${modelVersion}/predict" \
--header "X-Auth-Token: ${LoadBalancerAuthToken}" \
--header 'Content-Type: application/json' \
--data-raw '{
    "use_scoring": true,
    "scoring_args": {
        "work": 0,
        "start_latitude": 40.57689727,
        "start_longitude": -73.99047356,
        "end_latitude": 40.72058154,
        "end_longitude": -73.99740673,
        "distance": 8,
        "weekday": 1,
        "hour": 9,
        "month_1": 0,
        "month_2": 1,
        "month_3": 0,
        "month_4": 0,
        "month_5": 0,
        "month_6": 0
    }
}' | python -m json.tool | grep output | cut -d'\' -f 1

    "output": "The ride duration prediction is 3211.7134 seconds.

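The prediction is expressed in seconds. As a small illustration, the sketch below pulls the number out of an output line like the one above and converts it to minutes. `duration_minutes` is a hypothetical helper, and the sample text is copied from the result shown above rather than read from a live response.

```shell
# Hypothetical helper (not part of the lab): extract the first number from
# the prediction text and convert it from seconds to minutes.
duration_minutes() {
    # $1: text such as "The ride duration prediction is 3211.7134 seconds."
    seconds=$(printf '%s\n' "$1" | grep -oE '[0-9]+(\.[0-9]+)?' | head -n 1)
    # Fail if the text contains no number at all.
    [ -n "$seconds" ] || return 1
    awk -v s="$seconds" 'BEGIN { printf "%.1f\n", s / 60 }'
}

duration_minutes "The ride duration prediction is 3211.7134 seconds."
```

For the sample line this prints `53.5`, that is, a ride of roughly 53 and a half minutes.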

The new data that you provide as input must have the same columns that were used to train the model, minus the outcome column (the ride duration).
> Field descriptions:
> * work is a boolean for work hours (1 if the ride occurs Mon-Fri 8am-5pm, 0 otherwise)
> * start_latitude is the pickup location latitude
> * start_longitude is the pickup location longitude
> * end_latitude is the dropoff location latitude
> * end_longitude is the dropoff location longitude
> * distance is the trip distance in miles
> * weekday is a boolean for weekday (1 if the ride occurs on Mon-Fri, 0 otherwise)
> * hour is the hour of day (0 to 23)
> * month_1 is a boolean for January (1 if the ride is in January, 0 otherwise)
> * month_2 is a boolean for February (1 if the ride is in February, 0 otherwise)
> * month_3 is a boolean for March (1 if the ride is in March, 0 otherwise)
> * month_4 is a boolean for April (1 if the ride is in April, 0 otherwise)
> * month_5 is a boolean for May (1 if the ride is in May, 0 otherwise)
> * month_6 is a boolean for June (1 if the ride is in June, 0 otherwise)
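
The six `month_N` fields act as a one-hot encoding of the month: the field for the ride's month is 1 and the others are 0. The sketch below generates those fields from a month number so they do not have to be typed by hand. `month_flags` is a hypothetical helper, and it assumes that January to June (the only months listed above) are the only months the model accepts.

```shell
# Hypothetical helper (not part of the lab): print the six one-hot month
# fields, as JSON members, for a month number between 1 and 6.
month_flags() {
    month="$1"
    case "$month" in
        [1-6]) ;;
        *) echo "month must be between 1 and 6" >&2; return 1 ;;
    esac
    fields=""
    for m in 1 2 3 4 5 6; do
        if [ "$m" -eq "$month" ]; then flag=1; else flag=0; fi
        # Append the field, adding ", " only when fields is already non-empty.
        fields="${fields}${fields:+, }\"month_${m}\": ${flag}"
    done
    printf '%s\n' "$fields"
}

# February, as in the query above.
month_flags 2
```

For February this prints `"month_1": 0, "month_2": 1, "month_3": 0, "month_4": 0, "month_5": 0, "month_6": 0`, which matches the month fields in the request body of the query.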

## Summary

In this lab, we have shown you how you can make prediction queries, using REST API calls, to a target deployment engine kdcluster environment that serves your model.

Now, follow the instructions in Lab 6 to explore the dynamic aspect of the ML pipeline you have just constructed with KubeDirector. 

* [Lab 6 Dynamic ML Pipeline](6-WKSHP-K8s-ML-Pipeline-Dynamic-Aspect.ipynb)