# Deploying end-to-end machine learning workflows with HPE Ezmeral ML Ops - Lab 2
## Deploying a local Jupyter Notebook sandbox for model development and training

### **Lab workflow:**

In this lab:

1. As a tenant user, you will first deploy a local ML Ops Jupyter Notebook application cluster to develop your model. In this workshop, the data science notebooks and code scripts are source controlled in GitHub. You will therefore attach a Kubernetes **Secret** to your local Jupyter Notebook cluster for seamless integration with the _GitHub Version Control System_, which lets data scientists version their model notebooks and code scripts directly from their local Jupyter Notebook. You will also connect the local Jupyter Notebook cluster to a **remote tenant-shared training cluster** so that you can train your model from within the notebook. The shared training cluster includes the open source ML toolkits, libraries and frameworks for developing and training models, and has already been deployed by the Operations team for your tenant. Because it has more compute and memory resources than your local Jupyter Notebook cluster, it lets you train your model faster.

2. You will then access your local Jupyter Notebook web UI to develop your model, and train it on the remote tenant-shared training cluster.


**Definitions:**

- *Training:* Input datasets are processed to create a machine learning model. Data scientists can use a local Jupyter Notebook to build and train their models. They can also use a remote, larger-capacity training cluster to train their models faster on larger datasets.

- *KubeDirector:* also known as [Kubernetes Director](https://kubedirector.io/). HPE Ezmeral ML Ops runs ML Ops applications with KubeDirector, an **open-source project from HPE** that addresses the deployment of stateful scale-out applications in Kubernetes clusters. In the context of HPE Ezmeral ML Ops, each ML Ops application runs as a distributed, single-node or multi-node application **virtual cluster**.

### **1- Initialize the environment**

Let's first define the environment variables needed to execute this part of the lab.

In [None]:
#
# environment variables
#
username="student{{ STDID }}" 
password="{{ PASSSTU }}"
studentId=$(grep hpecp-user $HOME/.kube/config | cut -d= -f2)
#
gateway_host="{{ HPEECPGWNAME }}"
Internet_access="{{ JPHOSTEXT }}"

sc_secret="sc-secret-mlops-students.yaml" # The Secret object holding the GitHub VCS information. The notebook uses it to know how to connect to the VCS.
JupyterNotebookApp="cr-cluster-mlops-jupyter-notebook.yaml" # the Jupyter Notebook ML Ops app manifest you will deploy to build your model

echo "Your studentId is: "$studentId
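
Every later step depends on these variables, and `studentId` in particular comes from a `grep` that prints nothing if the expected entry is missing from the kube config. A minimal sketch of a guard is shown below. The helper name `check_set` and the sample value `student42` are hypothetical and are not part of the lab tooling.

```shell
# check_set is a hypothetical helper, not part of the lab tooling.
# $1 = variable name (used only in the message), $2 = the variable's value.
check_set() {
  if [ -z "$2" ]; then
    echo "ERROR: $1 is empty" >&2
    return 1
  fi
  echo "OK: $1 is set"
}

# Demonstration with a made-up value so the sketch runs on its own.
check_set studentId "student42"
```

In the lab cell itself, the equivalent call would be `check_set studentId "$studentId"`, run right after the variables are defined.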

### **2- List the ML Ops applications registered for your tenant**

**HPE Ezmeral ML Ops runs ML Ops applications with [KubeDirector](https://kubedirector.io/), an open-source project from HPE.** 

You can get the list of KubeDirector ML Ops applications registered with the Kubernetes cluster for your tenant using the `kubectl get kdapp` command. A KubeDirector application (kdapp) is a _template or a blueprint_ for the application. It describes an application's **metadata** (service roles, Docker images, configuration packages, service ports, persistent storage). A KubeDirector cluster (kdcluster) is a running instance of a KubeDirector application.  

In this workshop, you will be using the ML Ops application **_jupyter-notebook_** to create your local Jupyter Notebook cluster with Git code versioning integrated into notebooks.

In [None]:
kubectl get kdapp

### **3- Deploying your local Jupyter Notebook cluster with _Connection_ to a remote tenant-shared training cluster and to a GitHub VCS repository**

You will deploy an instance of the **jupyter-notebook** ML Ops application with code versioning integrated into notebooks by creating a KubeDirector virtual cluster (kdcluster). A kdcluster identifies the desired application template and specifies runtime configuration parameters, such as the size and resource requirements (CPU, GPU, Memory, storage) of the application virtual cluster. 

> **Note:** _The Jupyter Notebook cluster includes the open source machine learning toolkits, software libraries and frameworks for developing and training models, such as TensorFlow, scikit-learn, Keras, XGBoost, matplotlib, Jupyter Notebook, NumPy, SciPy, Pandas, etc._  
> _The Jupyter Notebook cluster also includes a **Git** extension that lets data scientists use Git to version their model notebooks and code scripts directly from their local Jupyter Notebook._

#### Create the manifest file and deploy an instance of the Jupyter Notebook application virtual cluster:
Like any other containerized application deployment on Kubernetes, the `kubectl apply -f ManifestAppFile` command is used to deploy your local Jupyter Notebook cluster. The application manifest is a YAML file that describes the attributes of the application virtual cluster.  
> **Important note:** _One of the most interesting parts of the specification of the ML Ops application virtual cluster is the **Connections** stanza (a related group of attributes), which identifies other resources of interest to that ML Ops application virtual cluster._   
> - _Here, you connect your local Jupyter Notebook cluster to the tenant-shared training cluster **training-engine-shared** already deployed by the Operations team for your tenant._     
> - _You also attach a Kubernetes **Secret** for seamless integration with the GitHub Version Control System. The Kubernetes Secret tells the Jupyter Notebook cluster how to connect to the GitHub source control repository. At the first user login, the Jupyter Notebook cluster pulls the content (for example, the notebooks and the code scripts) from the GitHub repository branch specified in the Kubernetes Secret. For this workshop, the Operations team has created a separate branch (named after your studentId) for each participant in the {{ BRANDING }} Community GitHub account. The branch contains the model code scripts you will use in lab part 3 to train and test your model._

Let's first display the source control secret manifest, then create the secret:

In [None]:
cat $sc_secret

This Kubernetes Secret is an object that contains a small amount of sensitive data, here the authentication data (username, personal access token or password, email, and branch) needed to access the GitHub repository branch content from your local Jupyter Notebook cluster. From within the local Jupyter Notebook, data scientists can run the usual _git status_, _git add_, _git commit_ and _git push_ commands to push their notebooks to their GitHub repository branch, version their notebooks and code, and collaborate across projects.
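
Kubernetes expects values under a Secret's `data` field to be base64-encoded, while values under `stringData` are plain text. If the manifest displayed above uses `data`, the values will look like opaque strings. The sketch below shows the encode/decode round trip with a made-up value (`student42` is hypothetical and is not taken from the workshop secret).

```shell
# Made-up value for illustration; it is not taken from the workshop secret.
plain='student42'

# Encode the value the way Kubernetes expects under a Secret's data: field.
# printf avoids the trailing newline that echo would add to the encoded value.
encoded=$(printf '%s' "$plain" | base64)

# Decode it again to confirm the round trip.
decoded=$(printf '%s' "$encoded" | base64 -d)

echo "encoded: $encoded"
echo "decoded: $decoded"
```

Note that base64 is an encoding, not encryption: anyone who can read the Secret can decode its values.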

In [None]:
kubectl apply -f $sc_secret

Then deploy the local Jupyter Notebook application cluster: 

In [None]:
cat $JupyterNotebookApp

In [None]:
kubectl apply -f $JupyterNotebookApp

After a few seconds, you should get the response message: *kubedirectorcluster/Your-instance-name created*.  

### **4- Inspect the deployed ML Ops application instance**
Your ML Ops application will be represented in the Kubernetes cluster by a custom resource of type **KubeDirectorCluster (kdcluster)**, with the name that was indicated inside the YAML file used to create it. Use the command `kubectl get kdcluster YourClustername` to list your ML Ops application virtual cluster.

In [None]:
clusterName="mlops-jupyter-notebook-${studentId}"
kubectl get kdcluster $clusterName

After creating the instance of the Jupyter Notebook application, you can use the `kubectl describe kdcluster` command below to observe its status and any events logged against it.

The application virtual cluster status indicates its overall "state" (top-level property of the status object). It should have a value of **"configured"**. 

> **Note:** _The first time a virtual cluster of a given ML Ops application type is created, it may take several minutes to reach its **"configured"** state, as the relevant Docker image must be downloaded and imported._ 

**>Run the `kubectl describe` command below and scroll down to the `Events` section to check the overall state of your application virtual cluster.**

**>Regularly repeat (every minute or so) the command below until the virtual cluster is in the state "_configured_".**

In [None]:
kubectl describe kdcluster $clusterName
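
Instead of re-running the command by hand, the wait can be scripted. The following is a minimal sketch of a polling loop, not part of the lab tooling: `wait_for_state` and `fake_state` are hypothetical names, and `fake_state` stands in for the real query so the sketch runs without a cluster.

```shell
# Poll a command until it prints the wanted state, or give up after N attempts.
# Arguments: wanted state, max attempts, delay in seconds, then the command to run.
wait_for_state() {
  wanted=$1
  attempts=$2
  delay=$3
  shift 3
  i=1
  while [ "$i" -le "$attempts" ]; do
    current=$("$@" 2>/dev/null)
    echo "attempt $i: state=${current:-unknown}"
    if [ "$current" = "$wanted" ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Hypothetical stand-in for the real query so the sketch runs without a cluster.
fake_state() { echo "configured"; }

wait_for_state configured 3 0 fake_state
```

Against the real cluster, and assuming the state is exposed at `.status.state` as the description of the status object above suggests, the call could look like `wait_for_state configured 20 30 kubectl get kdcluster "$clusterName" -o jsonpath='{.status.state}'`.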

You can use a form of the `kubectl get pod,service,statefulset` command that matches against a value of the **kubedirector.hpe.com/kdcluster=YourClusterApplicationName** label to observe the standard Kubernetes resources that compose the ML Ops application virtual cluster:

In [None]:
kubectl get pod,service,statefulset -l kubedirector.hpe.com/kdcluster=$clusterName

Your instance of the application virtual cluster is made up of a **StatefulSet**, a **Pod** (a cluster node) and a **NodePort Service** per service role member (Controller), plus a **headless service** for the application cluster.  

* The ClusterIP service is the headless service that a Kubernetes StatefulSet requires. It gives each Pod a stable network identity (that is, the Pod hostname persists across rescheduling).
* The NodePort service exposes the Notebook application service with token-based authorization outside the Kubernetes cluster. 

### **5- Get your local Jupyter Notebook's service endpoint to connect to it**
To get a report on all services related to a specific application virtual cluster, you can use a form of **kubectl describe** that matches against a value of the **kubedirector.hpe.com/kdcluster=YourClusterApplicationName** label.

In [None]:
#
# Getting the service endpoint URL:
#
JupyterAppURL=$(kubectl describe service -l  kubedirector.hpe.com/kdcluster=${clusterName} | grep gateway/8000 | awk '{print $2}')
JupyterAppPort=$(echo $JupyterAppURL | cut -d':' -f 2) # extract the gateway re-mapped port value.
myJupyterApp_endpoint="https://$gateway_host:$JupyterAppPort"
echo "Your application service endpoint re-mapped port is: "$JupyterAppPort
#echo "Your Intranet application service endpoint is: "$myJupyterApp_endpoint
echo "Your Jupyter Notebook service endpoint URL is: "https://$Internet_access:$JupyterAppPort
echo "Your local Jupyter Notebook web UI login credentials are: $username / $password"
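
To see what the `grep`, `awk` and `cut` steps in the cell above each contribute, the sketch below applies the same pipeline to a single sample line. The sample line, its host name and its port are hypothetical; the real line comes from the `kubectl describe service` output and may be shaped differently.

```shell
# Hypothetical line, shaped like the one the grep in the lab cell is meant to match.
sample_line='hpecp-internal-gateway/8000: gateway.example.com:10035'

# Same pipeline as the lab cell, applied to the sample line instead of kubectl output:
# grep keeps the line for port 8000, awk takes the second whitespace-separated field.
sample_url=$(echo "$sample_line" | grep gateway/8000 | awk '{print $2}')

# cut splits host:port on the colon and keeps the port.
sample_port=$(echo "$sample_url" | cut -d':' -f 2)

echo "URL field: $sample_url"
echo "Port: $sample_port"
```

If the port printed by the lab cell is empty, running the `kubectl describe service` part on its own shows whether the expected `gateway/8000` line is present.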

### **6- Connect to your local Jupyter Notebook web UI**

Click the **_service endpoint URL_** from Step 5 above to connect to your Jupyter Notebook sandbox. This opens a Jupyter Notebook login screen in a new browser tab. **Use the login credentials above to authenticate.** Upon the first login to the Jupyter Notebook server, the notebooks and model code scripts are all pulled from the GitHub source control repository branch set up for you by the Operations team.

> <font color="red"> **Note:** On a Windows PC, ***Firefox*** is the recommended browser to connect to your local Jupyter Notebook UI. With Chrome, you may see the message _"Server Not Running"_, in which case just click the Restart button.</font>

> <font color="red"> **Note:** If you see a security warning about the certificate while connecting to the local Jupyter Notebook web UI, accept the risk and continue.</font>


![Jupyter-Notebook-Login](Pictures/Jupyter-Notebook-Login.png)

### <font color="red">Now, from your local Jupyter Notebook, open the notebook **3-WKSHP-MLOps-K8s-Model-Development.ipynb** and follow the instructions from the notebook to develop, train and test the model.</font>

Once your model is trained and saved to a file, follow the instructions in Lab 4 to deploy your trained model:

* [Lab 4 Model Registry and Deployment](4-WKSHP-MLOps-K8s-Register-Model-Deployment.ipynb)

## Summary

In this lab, we have shown you how to deploy a local Jupyter Notebook virtual cluster with integrated code versioning capabilities. In the next lab, you will use your local Jupyter Notebook to develop your model and to submit model training jobs to the remote tenant-shared training cluster.