### Deploy Web App on Azure Kubernetes Service (AKS)
In this notebook, we will set up an Azure Kubernetes Service (AKS) cluster. We will then take the Docker image we created earlier, which contains our app, and deploy it to the AKS cluster. Finally, we will check that everything is working by sending an image to the service and getting it scored.
    
The process is split into the following steps:
* [Define our resource names](#section1)
* [Login to Azure](#section2)
* [Create resource group and create AKS](#section3)
* [Connect to AKS](#section4)
* [Deploy our app](#section5)

This guide is designed to be run on Linux and requires that the Azure CLI is installed.

In [1]:
import json
from testing_utilities import write_json_to_file
from dotenv import set_key, get_key, find_dotenv

<a id='section1'></a>
## Setup
Below are the name definitions for the resources needed to set up AKS.

In [2]:
env_path = find_dotenv(raise_error_if_not_found=True)

In [3]:
set_key(env_path, 'selected_subscription', '<YOUR_SUBSCRIPTION>') # Replace <YOUR_SUBSCRIPTION> with your subscription name or ID
set_key(env_path, 'resource_group', 'msaksrg')
set_key(env_path, 'aks_name', 'msaks')
set_key(env_path, 'location', 'eastus')

(True, 'location', 'eastus')

In [4]:
image_name = get_key(env_path, 'docker_login') + '/' + get_key(env_path, 'image_repo')
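
If either key is absent from the `.env` file, `get_key` is expected to return `None`, and the concatenation above then fails with a `TypeError` that does not name the missing key. Below is a minimal sketch of a more explicit check. `build_image_name` is a hypothetical helper introduced here for illustration; it is not part of `testing_utilities` or `dotenv`.

```python
def build_image_name(docker_login, image_repo):
    """Join the Docker login and repository into '<login>/<repo>'.

    Hypothetical helper: raises a readable error when a value is missing.
    """
    values = {"docker_login": docker_login, "image_repo": image_repo}
    missing = [key for key, value in values.items() if not value]
    if missing:
        raise ValueError("Missing .env entries: " + ", ".join(missing))
    return f"{docker_login}/{image_repo}"


# Example values taken from the manifest output shown later in this notebook.
print(build_image_name("masalvar", "pytorch-gpu"))
```

In the notebook, this could be called as `build_image_name(get_key(env_path, 'docker_login'), get_key(env_path, 'image_repo'))`.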

<a id='section2'></a>
## Azure account login
If you are not already logged in to an Azure account, the command below will initiate a login. It will pop up a browser where you can select an Azure account.

In [5]:
%%bash
list=`az account list -o table`
if [ "$list" == '[]' ] || [ "$list" == '' ]; then 
  az login -o table
else
  az account list -o table 
fi
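
The same check can be sketched in Python by asking the CLI for JSON instead of a table, which is easier to test reliably. The helper name `has_accounts` and the sample strings are assumptions made for this illustration; the input is meant to be the text printed by `az account list -o json`.

```python
import json


def has_accounts(account_list_json):
    """Return True if the JSON output of `az account list` has any entries."""
    text = account_list_json.strip()
    if not text:
        # Nothing was printed at all, so treat it as "not logged in".
        return False
    return len(json.loads(text)) > 0


# Illustrative inputs only; real output contains more fields per account.
print(has_accounts("[]"))
print(has_accounts('[{"name": "example-subscription", "isDefault": true}]'))
```

In a notebook cell, the input could be captured with `accounts = !az account list -o json` and passed in as `''.join(accounts)`.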

In [6]:
!az account set --subscription "{get_key(env_path, 'selected_subscription')}"


In [7]:
!az account show

In [8]:
!az provider register -n Microsoft.ContainerService


In [9]:
!az provider show -n Microsoft.ContainerService
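
Provider registration is asynchronous, so it can be useful to confirm that it has finished before creating the cluster. The sketch below reads the `registrationState` field from the JSON printed by `az provider show -n Microsoft.ContainerService -o json`. The function name and the sample payload are assumptions made for this example.

```python
import json


def is_registered(provider_json):
    """Return True when the provider's registrationState is 'Registered'."""
    provider = json.loads(provider_json)
    return provider.get("registrationState") == "Registered"


# Trimmed, illustrative payload; the real output has many more fields.
sample = '{"namespace": "Microsoft.ContainerService", "registrationState": "Registering"}'
print(is_registered(sample))
```

If this returns `False`, wait a little and run `az provider show` again before moving on.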

<a id='section3'></a>
## Create resource group and create AKS

### Create resource group
Azure encourages the use of resource groups to organise all the Azure components you deploy. That way they are easier to find, and we can also delete a number of resources at once simply by deleting the group.

In [10]:
!az group create --name {get_key(env_path, 'resource_group')} \
                 --location {get_key(env_path, 'location')}

Below, we create the AKS cluster in the resource group we created earlier. This can take up to 15 minutes.

In [28]:
!az aks create --resource-group {get_key(env_path, 'resource_group')}  \
               --name {get_key(env_path, 'aks_name')} \
               --node-count 1 \
               --generate-ssh-keys \
               -s Standard_NC6 \
               --kubernetes-version 1.11.2
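
Because cluster creation can take a while, we may want to confirm the result before continuing. The sketch below summarises the JSON printed by `az aks show --resource-group <group> --name <cluster> -o json`, assuming the top-level `provisioningState` and `kubernetesVersion` fields. The helper name and the sample payload are hypothetical.

```python
import json


def cluster_summary(aks_show_json):
    """Return (is_ready, kubernetes_version) from `az aks show` JSON output."""
    cluster = json.loads(aks_show_json)
    is_ready = cluster.get("provisioningState") == "Succeeded"
    return is_ready, cluster.get("kubernetesVersion")


# Trimmed, illustrative payload using the values from this notebook.
sample = '{"name": "msaks", "provisioningState": "Succeeded", "kubernetesVersion": "1.11.2"}'
print(cluster_summary(sample))
```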

### Install kubectl CLI
To connect to the Kubernetes cluster, we will use kubectl, the Kubernetes command-line client. To install, run the following:

In [29]:
!sudo az aks install-cli

Downloading client to /usr/local/bin/kubectl from https://storage.googleapis.com/kubernetes-release/release/v1.12.2/bin/linux/amd64/kubectl
Please ensure that /usr/local/bin is in your search PATH, so the `kubectl` command can be found.


<a id='section4'></a>
## Connect to AKS cluster

To configure kubectl to connect to the Kubernetes cluster, run the following command:


**NOTE: If you get an error below, try deleting the .kube/config file in your home directory and running the command again.**

In [44]:
!az aks get-credentials --resource-group {get_key(env_path, 'resource_group')}\
                        --name {get_key(env_path, 'aks_name')}

Let's verify the connection by listing the nodes.


In [45]:
!kubectl get nodes

NAME                       STATUS   ROLES   AGE   VERSION
aks-nodepool1-24684105-0   Ready    agent   9m    v1.11.2
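
Rather than reading the table by eye, the readiness check can be sketched in Python from the output of `kubectl get nodes -o json`. A node counts as ready when its `Ready` condition has status `"True"`. The helper name and the trimmed sample document are assumptions made for this illustration.

```python
import json


def ready_nodes(nodes_json):
    """Return the names of nodes whose 'Ready' condition has status 'True'."""
    ready = []
    for node in json.loads(nodes_json)["items"]:
        conditions = node.get("status", {}).get("conditions", [])
        if any(c["type"] == "Ready" and c["status"] == "True" for c in conditions):
            ready.append(node["metadata"]["name"])
    return ready


# Trimmed, illustrative document using the node name from the output above.
sample = json.dumps({
    "items": [{
        "metadata": {"name": "aks-nodepool1-24684105-0"},
        "status": {"conditions": [{"type": "Ready", "status": "True"}]},
    }]
})
print(ready_nodes(sample))
```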


Let's check the pods on our cluster.

In [49]:
!kubectl get pods --all-namespaces

NAMESPACE     NAME                                    READY   STATUS    RESTARTS   AGE
kube-system   heapster-5f8d5688-45jfp                 2/2     Running   0          14m
kube-system   kube-dns-v20-54f74f4458-fbl9r           3/3     Running   0          14m
kube-system   kube-dns-v20-54f74f4458-zrt5b           3/3     Running   0          14m
kube-system   kube-proxy-wzh9g                        1/1     Running   0          10m
kube-system   kube-svc-redirect-8d6hl                 2/2     Running   0          10m
kube-system   kubernetes-dashboard-85c9c5944d-g954s   1/1     Running   2          14m
kube-system   metrics-server-76f76c6bfd-kpkfw         1/1     Running   2          14m
kube-system   nvidia-device-plugin-daemonset-mf2kg    1/1     Running   0          24s
kube-system   tunnelfront-7679cfdd45-nrl8b            1/1     Running   0          14m


To use the GPU, we need to install the GPU device plugin from NVIDIA. For more information, see [https://github.com/nvidia/k8s-device-plugin/](https://github.com/nvidia/k8s-device-plugin/). The pod listing above already includes the plugin because that cell was re-run after the install cell below.

In [47]:
!kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.11/nvidia-device-plugin.yml

daemonset.extensions/nvidia-device-plugin-daemonset created
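
Once the plugin pod is running, each GPU node should advertise an `nvidia.com/gpu` resource under `status.allocatable`. The sketch below reads that value from the output of `kubectl get nodes -o json`; the helper name and the sample document are assumptions made for this example.

```python
import json


def gpus_per_node(nodes_json):
    """Map node name -> number of allocatable NVIDIA GPUs (0 if none advertised)."""
    counts = {}
    for node in json.loads(nodes_json)["items"]:
        allocatable = node.get("status", {}).get("allocatable", {})
        # Kubernetes reports resource quantities as strings, e.g. "1".
        counts[node["metadata"]["name"]] = int(allocatable.get("nvidia.com/gpu", "0"))
    return counts


# Trimmed, illustrative document for a single-GPU node.
sample = json.dumps({
    "items": [{
        "metadata": {"name": "aks-nodepool1-24684105-0"},
        "status": {"allocatable": {"cpu": "6", "nvidia.com/gpu": "1"}},
    }]
})
print(gpus_per_node(sample))
```

A count of 0 shortly after installation usually just means the plugin has not registered yet, so it is worth checking again after a minute.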


<a id='section5'></a>
## Deploy application

Below we define our Kubernetes manifests for the deployment and for the load balancer service. Note that we have to mount the NVIDIA driver directory, which is located on the node, as a volume in the container.


In [50]:
app_template = {
  "apiVersion": "apps/v1beta1",
  "kind": "Deployment",
  "metadata": {
      "name": "azure-dl"
  },
  "spec":{
      "replicas":1,
      "template":{
          "metadata":{
              "labels":{
                  "app":"azure-dl"
              }
          },
          "spec":{
              "containers":[
                  {
                      "name": "azure-dl",
                      "image": image_name,
                      "env":[
                          {
                              "name": "LD_LIBRARY_PATH",
                              "value": "$LD_LIBRARY_PATH:/usr/local/nvidia/lib64:/opt/conda/envs/py3.6/lib"
                          }
                      ],
                      "ports":[
                          {
                              "containerPort":80,
                              "name":"model"
                          }
                      ],
                      "volumeMounts":[
                          {
                            "mountPath": "/usr/local/nvidia",
                            "name": "nvidia"
                          }
                      ],
                      "resources":{
                           "requests":{
                               "nvidia.com/gpu": 1
                           },
                           "limits":{
                               "nvidia.com/gpu": 1
                           }
                       }  
                  }
              ],
              "volumes":[
                  {
                      "name": "nvidia",
                      "hostPath":{
                          "path":"/usr/local/nvidia"
                      },
                  },
              ]
          }
      }
  }
}

service_temp = {
  "apiVersion": "v1",
  "kind": "Service",
  "metadata": {
      "name": "azure-dl"
  },
  "spec":{
      "type": "LoadBalancer",
      "ports":[
          {
              "port":80
          }
      ],
      "selector":{
            "app":"azure-dl"
      }
   }
}
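
The Service finds the pods through its `selector`, which has to match the labels in the Deployment's pod template (`app: azure-dl` in both dicts above). A small sanity check for this can be sketched as follows; `selector_matches` is a hypothetical helper, and the two dicts are trimmed stand-ins for `app_template` and `service_temp`.

```python
def selector_matches(deployment, service):
    """Return True if every selector entry appears in the pod template labels."""
    labels = deployment["spec"]["template"]["metadata"].get("labels", {})
    selector = service["spec"].get("selector", {})
    return bool(selector) and all(labels.get(k) == v for k, v in selector.items())


# Trimmed stand-ins for the two templates defined above.
deployment = {"spec": {"template": {"metadata": {"labels": {"app": "azure-dl"}}}}}
service = {"spec": {"selector": {"app": "azure-dl"}}}
print(selector_matches(deployment, service))
```

In the notebook, `selector_matches(app_template, service_temp)` would perform the same check on the full templates.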

In [51]:
write_json_to_file(app_template, 'az-dl.json') # Write the deployment template to the JSON file

In [52]:
write_json_to_file(service_temp, 'az-dl.json', mode='a') # Append the load balancer service template to the same file
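
The source of `write_json_to_file` is not shown in this notebook. Judging by the output further down (four-space indentation, alphabetically sorted keys), it may behave roughly like the sketch below. This is an assumption, not the actual implementation, and the names `write_json_document` and `read_json_documents` are hypothetical. The reader function shows that two JSON documents stored back to back in one file can still be parsed one after the other.

```python
import json
import os
import tempfile


def write_json_document(json_dict, filename, mode="w"):
    """Dump a dict as indented JSON; mode='a' appends a further document."""
    with open(filename, mode) as outfile:
        json.dump(json_dict, outfile, indent=4, sort_keys=True)
        outfile.write("\n\n")


def read_json_documents(filename):
    """Parse every JSON document stored back to back in one file."""
    decoder = json.JSONDecoder()
    with open(filename) as infile:
        text = infile.read()
    documents, position = [], 0
    while position < len(text):
        # Skip whitespace between documents before decoding the next one.
        if text[position].isspace():
            position += 1
            continue
        document, position = decoder.raw_decode(text, position)
        documents.append(document)
    return documents


path = os.path.join(tempfile.mkdtemp(), "az-dl.json")
write_json_document({"kind": "Deployment"}, path)
write_json_document({"kind": "Service"}, path, mode="a")
print([doc["kind"] for doc in read_json_documents(path)])
```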

Let's check the manifest created.

In [53]:
!cat az-dl.json

{
    "apiVersion": "apps/v1beta1",
    "kind": "Deployment",
    "metadata": {
        "name": "azure-dl"
    },
    "spec": {
        "replicas": 1,
        "template": {
            "metadata": {
                "labels": {
                    "app": "azure-dl"
                }
            },
            "spec": {
                "containers": [
                    {
                        "env": [
                            {
                                "name": "LD_LIBRARY_PATH",
                                "value": "$LD_LIBRARY_PATH:/usr/local/nvidia/lib64:/opt/conda/envs/py3.6/lib"
                            }
                        ],
                        "image": "masalvar/pytorch-gpu",
                        "name": "azure-dl",
                        "ports": [
                            {
                                "containerPort": 80,
                                "name": "model"
                            }
          

Next, we will use the `kubectl create` command to deploy our application.

In [54]:
!kubectl create -f az-dl.json

deployment.apps/azure-dl created
service/azure-dl created


Let's check if the pod is deployed.

In [60]:
!kubectl get pods --all-namespaces

NAMESPACE     NAME                                    READY   STATUS    RESTARTS   AGE
default       azure-dl-5778795bdc-s6ps8               1/1     Running   0          6m
kube-system   heapster-5f8d5688-45jfp                 2/2     Running   0          21m
kube-system   kube-dns-v20-54f74f4458-fbl9r           3/3     Running   0          21m
kube-system   kube-dns-v20-54f74f4458-zrt5b           3/3     Running   0          21m
kube-system   kube-proxy-wzh9g                        1/1     Running   0          17m
kube-system   kube-svc-redirect-8d6hl                 2/2     Running   0          17m
kube-system   kubernetes-dashboard-85c9c5944d-g954s   1/1     Running   2          21m
kube-system   metrics-server-76f76c6bfd-kpkfw         1/1     Running   2          21m
kube-system   nvidia-device-plugin-daemonset-mf2kg    1/1     Running   0          7m
kube-system   tunnelfront-7679cfdd45-nrl8b            1/1     Running   0          20m


If anything goes wrong you can use the commands below to observe the events on the node as well as review the logs.

In [61]:
!kubectl get events

LAST SEEN   FIRST SEEN   COUNT   NAME                                         KIND         SUBOBJECT                   TYPE      REASON                      SOURCE                                 MESSAGE
17m         17m          2       aks-nodepool1-24684105-0.15662dcc6f8ff4f4    Node                                     Normal    NodeHasSufficientDisk       kubelet, aks-nodepool1-24684105-0      Node aks-nodepool1-24684105-0 status is now: NodeHasSufficientDisk
17m         17m          2       aks-nodepool1-24684105-0.15662dcc6f904a80    Node                                     Normal    NodeHasSufficientMemory     kubelet, aks-nodepool1-24684105-0      Node aks-nodepool1-24684105-0 status is now: NodeHasSufficientMemory
17m         17m          2       aks-nodepool1-24684105-0.15662dcc6f906aeb    Node                                     Normal    NodeHasNoDiskPressure       kubelet, aks-nodepool1-24684105-0      Node aks-nodepool1-24684105-0 status is now: NodeHasNoDiskPressure

In [62]:
pod_json = !kubectl get pods -o json
pod_dict = json.loads(''.join(pod_json))
!kubectl logs {pod_dict['items'][0]['metadata']['name']}

2018-11-11 21:20:04,092 CRIT Supervisor running as root (no user in config file)
2018-11-11 21:20:04,094 INFO supervisord started with pid 1
2018-11-11 21:20:05,096 INFO spawned: 'program_exit' with pid 9
2018-11-11 21:20:05,097 INFO spawned: 'nginx' with pid 10
2018-11-11 21:20:05,098 INFO spawned: 'gunicorn' with pid 11
2018-11-11 21:20:06,123 INFO success: program_exit entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
Downloading: "https://download.pytorch.org/models/resnet152-b121ed2d.pth" to /root/.torch/models/resnet152-b121ed2d.pth
0.0%0.0%0.0%0.0%0.0%0.0%0.0%0.0%0.0%0.0%0.0%0.0%0.0%0.0%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.1%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.2%0.3%0.3%0.3%0.3%0.3%0.3%0.3%0.3%0.3%0.3%0.3%

33.4%33.4%33.4%33.4%33.4%33.4%33.4%33.4%33.4%33.4%33.4%33.4%33.4%33.4%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.5%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.6%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.7%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.8%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%33.9%34.0%34.0%34.0%34.0%34.0%34.

It can take a few minutes for the service to populate the EXTERNAL-IP field. This will be the IP you use to call the service. You can also specify an IP to use; please see the AKS documentation for further details.

In [63]:
!kubectl get service azure-dl

NAME       TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE
azure-dl   LoadBalancer   10.0.253.42   40.76.41.82   80:32230/TCP   6m
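
While the address is still being assigned, the table shows `<pending>` in the EXTERNAL-IP column. A sketch of reading the address programmatically from `kubectl get service azure-dl -o json` is shown below, assuming the IP appears under `status.loadBalancer.ingress`. The helper name and the sample payloads are hypothetical.

```python
import json


def external_ip(service_json):
    """Return the load balancer IP, or None while it is still pending."""
    service = json.loads(service_json)
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    return ingress[0].get("ip") if ingress else None


# Trimmed, illustrative payloads: one pending, one with the IP from the output above.
pending = '{"status": {"loadBalancer": {}}}'
assigned = '{"status": {"loadBalancer": {"ingress": [{"ip": "40.76.41.82"}]}}}'
print(external_ip(pending))
print(external_ip(assigned))
```

Calling this in a loop with a short sleep between attempts is one way to wait until the service becomes reachable.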


Next, we will [test our web application](05_TestWebApp.ipynb) deployed on AKS. 