# Azure Kubernetes Service (AKS)

In this notebook, we will provision an AKS cluster and then install blobfuse on it.

In [14]:
from dotenv import set_key, get_key, find_dotenv
from pathlib import Path


from environs import Env

## Setup

Let's first define the names and configurations of the resources that will be provisioned on Azure.

In [15]:
ENV = Env()
ENV.read_env()

subscription_id = ENV("AZURE_SUBSCRIPTION_ID")
resource_group = ENV("AZURE_RESOURCE_GROUP")  # e.g. 'kuberg'
location = ENV("AZURE_RESOURCE_LOCATION")  # e.g. 'eastus'
agent_size = "Standard_NC6"  # VM size, e.g. 'Standard_NC6' (GPU) or 'Standard_D1_v2'
aks_name = ENV("AKS_NAME")  # e.g. 'kubeaks'
agent_count = 1  # number of VMs (nodes) to provision in the cluster; pick any number
storage_account = "fenocerakubestorage"  # hard-coded here; alternatively ENV("STORAGE_ACCOUNT_NAME"), e.g. 'kubest'
storage_container = ENV("AKS_CONTAINER")  # e.g. 'blobfuse'
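
The configuration cell reads most of its settings from environment variables (or a `.env` file) through `environs`. As a minimal sketch using only the standard library, the variables can be checked up front so that a missing one is reported before any Azure resource is touched. The helper name `missing_settings` is hypothetical and not part of `environs`.

```python
import os

# Variable names read by the configuration cell above.
REQUIRED_SETTINGS = [
    "AZURE_SUBSCRIPTION_ID",
    "AZURE_RESOURCE_GROUP",
    "AZURE_RESOURCE_LOCATION",
    "AKS_NAME",
    "AKS_CONTAINER",
]


def missing_settings(environ, required=REQUIRED_SETTINGS):
    """Return the required variable names that are unset or empty."""
    return [name for name in required if not environ.get(name)]


# Example with a plain dict standing in for os.environ (placeholder values).
example_env = {"AZURE_SUBSCRIPTION_ID": "0000", "AKS_NAME": "kubeaks"}
print(missing_settings(example_env))
```

In the notebook itself, the same function could be called as `missing_settings(os.environ)` once the `.env` file has been loaded.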

Create and initialize a dotenv file for storing parameters used in multiple notebooks.

Note: this step probably does not belong in this notebook, so the two cells below are left commented out.

In [16]:
# env_path = find_dotenv()
# if env_path == "":
#     Path(".env").touch()
#     env_path = find_dotenv()

In [17]:
# set_key(env_path, 'subscription_id', subscription_id) 
# set_key(env_path, 'resource_group', resource_group)
# set_key(env_path, 'storage_account', storage_account)
# set_key(env_path, 'storage_container', storage_container)

## Create resource group and AKS cluster

In [18]:
!az account set -s {subscription_id}

In [19]:
# Not needed if the resource group already exists.
# !az group create --name {resource_group} --location {location} 

In [20]:
# !az aks create --node-vm-size {agent_size} --resource-group {resource_group} --name {aks_name} --node-count {agent_count} --kubernetes-version 1.11.6  --generate-ssh-keys --query 'provisioningState'

In [21]:
!az aks get-versions --location eastus --output table


KubernetesVersion    Upgrades
-------------------  ------------------------
1.13.5               None available
1.12.8               1.13.5
1.12.7               1.12.8, 1.13.5
1.11.9               1.12.7, 1.12.8
1.11.8               1.11.9, 1.12.7, 1.12.8
1.10.13              1.11.8, 1.11.9
1.10.12              1.10.13, 1.11.8, 1.11.9
1.9.11               1.10.12, 1.10.13
1.9.10               1.9.11, 1.10.12, 1.10.13


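- Check which Kubernetes versions are available in your region with `!az aks get-versions --location eastus --output table` before creating the cluster.
- `--query 'provisioningState'` is not listed among the `az aks create` parameters because `--query` is a global Azure CLI argument (a JMESPath query applied to the command's output), not something specific to `aks create`. It is optional and is omitted in the command below.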
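
As a sketch of the version check, the table printed by `az aks get-versions --output table` can be parsed in Python to pick the newest version. The function names are hypothetical and the sample rows are copied from the output above. Versions are compared numerically, because plain string comparison would rank `1.9.11` above `1.13.5`.

```python
def parse_versions(table_text):
    """Extract the KubernetesVersion column from the table text."""
    versions = []
    for line in table_text.splitlines():
        fields = line.split()
        # Keep only rows whose first field looks like a dotted version number.
        if fields and all(part.isdigit() for part in fields[0].split(".")):
            versions.append(fields[0])
    return versions


def latest_version(versions):
    """Pick the highest version by numeric comparison rather than string order."""
    return max(versions, key=lambda v: tuple(int(p) for p in v.split(".")))


sample = """\
KubernetesVersion    Upgrades
-------------------  ------------------------
1.13.5               None available
1.12.8               1.13.5
1.9.11               1.10.12, 1.10.13
"""
print(latest_version(parse_versions(sample)))
```

The selected value could then be passed to `az aks create` through `--kubernetes-version`, as in the commented-out command earlier in this section.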

In [22]:
!az aks create --resource-group {resource_group} --node-vm-size {agent_size} --name {aks_name} --node-count {agent_count} --generate-ssh-keys

{
  "aadProfile": null,
  "addonProfiles": null,
  "agentPoolProfiles": [
    {
      "availabilityZones": null,
      "count": 1,
      "enableAutoScaling": null,
      "maxCount": null,
      "maxPods": 110,
      "minCount": null,
      "name": "nodepool1",
      "orchestratorVersion": "1.12.8",
      "osDiskSizeGb": 100,
      "osType": "Linux",
      "provisioningState": "Succeeded",
      "type": "AvailabilitySet",
      "vmSize": "Standard_NC6",
      "vnetSubnetId": null
    }
  ],
  "apiServerAuthorizedIpRanges": null,
  "dnsPrefix": "kubeflowte-fenocerarg-fb45cb",
  "enablePodSecurityPolicy": null,
  "enableRbac": true,
  "fqdn": "kubeflowte-fenocerarg-fb45cb-e391bb27.hcp.eastus.azmk8s.io",
  "id": "/subscriptions/fb45cb39-23ee-447d-a047-4c8ba0a5d527/resourcegroups/fenocera_rg/providers/Microsoft.ContainerService/managedClusters/kubeflowtest",
  "kubernetesVersion": "1.12.8",
  "linuxProfile": {
    "adminUsername": "azureuser",
    "ssh": {
      "publicKeys"

Install kubectl to connect to the Kubernetes cluster.

In [10]:
#!sudo az aks install-cli

Now, let's connect to the AKS cluster and get the nodes.

In [23]:
!az aks get-credentials --resource-group {resource_group} --name {aks_name}

A different object named kubeflowtest already exists in your kubeconfig file.
Overwrite? (y/n): ^C
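
The interactive prompt above blocks a notebook cell, since there is no terminal to answer it. A possible workaround, sketched here under the assumption that replacing the existing kubeconfig entry is acceptable, is to pass the `--overwrite-existing` flag of `az aks get-credentials`. The helper name `get_credentials_command` and the resource names are placeholders.

```python
import shlex


def get_credentials_command(resource_group, aks_name, overwrite=True):
    """Build the `az aks get-credentials` argument list."""
    args = [
        "az", "aks", "get-credentials",
        "--resource-group", resource_group,
        "--name", aks_name,
    ]
    if overwrite:
        # Replace an existing kubeconfig entry instead of prompting.
        args.append("--overwrite-existing")
    return args


# Placeholder names for illustration only.
print(shlex.join(get_credentials_command("kuberg", "kubeaks")))
```

The returned list can be handed to `subprocess.run(args, check=True)`, or the flag can simply be appended to the `!az aks get-credentials` line in the cell.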


In [24]:
!kubectl get nodes

NAME                       STATUS    ROLES     AGE       VERSION
aks-nodepool1-34660992-0   Ready     agent     4m        v1.12.8


Let's check the first node.

In [25]:
node_names = !kubectl get nodes -o name
!kubectl describe node {node_names[0].split('/')[-1]}

Name:               aks-nodepool1-34660992-0
Roles:              agent
Labels:             accelerator=nvidia
                    agentpool=nodepool1
                    beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=Standard_NC6
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=eastus
                    failure-domain.beta.kubernetes.io/zone=0
                    kubernetes.azure.com/cluster=MC_fenocera_rg_kubeflowtest_eastus
                    kubernetes.io/hostname=aks-nodepool1-34660992-0
                    kubernetes.io/role=agent
                    node-role.kubernetes.io/agent=
                    storageprofile=managed
                    storagetier=Standard_LRS
Annotations:        node.alpha.kubernetes.io/ttl=0
                    volumes.kubernetes.io/controller-managed-attach-detach=true
CreationTimestamp:  Tue, 21 May 2019 21:27:38 -0400
Taints:             
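
`kubectl get nodes -o name` returns names of the form `node/<name>`, so the prefix has to be dropped before the name is passed to `kubectl describe node`. Note that `str.strip` takes a set of characters rather than a prefix, so it only works when the node name happens not to start or end with any of `n`, `o`, `d`, `e` or `/`. A minimal sketch of a safer helper follows; `node_basename` and the node name `edge-node` are hypothetical.

```python
def node_basename(resource_name, prefix="node/"):
    """Drop a leading 'node/' from a name such as 'node/aks-nodepool1-34660992-0'."""
    if resource_name.startswith(prefix):
        return resource_name[len(prefix):]
    return resource_name


print(node_basename("node/aks-nodepool1-34660992-0"))

# str.strip removes any of the characters n, o, d, e, / from both ends,
# so a hypothetical node called 'edge-node' would be mangled:
print("node/edge-node".strip("node/"))
print(node_basename("node/edge-node"))
```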

Deploy the following DaemonSet to enable GPU support in Kubernetes.

In [26]:
!kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v1.11/nvidia-device-plugin.yml

daemonset.extensions "nvidia-device-plugin-daemonset" created


In [27]:
!kubectl get pods --all-namespaces

NAMESPACE     NAME                                    READY     STATUS              RESTARTS   AGE
kube-system   coredns-66cb57b9db-4thcv                1/1       Running             0          4m
kube-system   coredns-66cb57b9db-zbmrl                1/1       Running             0          9m
kube-system   coredns-autoscaler-7fd449d848-rjszm     1/1       Running             0          9m
kube-system   heapster-7677c744b8-jqrnp               2/2       Running             0          9m
kube-system   kube-proxy-864x4                        1/1       Running             0          5m
kube-system   kube-svc-redirect-n6tqz                 2/2       Running             0          5m
kube-system   kubernetes-dashboard-7b55c6f7b9-xpg8k   1/1       Running             0          9m
kube-system   metrics-server-67c75dbf7-8g9zk          1/1       Running             0          9m
kube-system   nvidia-device-plugin-daemonset-njsm6    0/1       ContainerCreating   0          1s
kube-syst

## Attach blobfuse on AKS

We will use [blobfuse](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-how-to-mount-container-linux) through the [blobfuse volume driver for Kubernetes](https://github.com/Azure/kubernetes-volume-drivers/tree/master/flexvolume/blobfuse) to store the model servables that the Kubeflow TensorFlow Serving component will serve. The driver requires a storage account and a container created in the same region as the Kubernetes cluster.

### Create storage account and copy model servable

Let's first create that storage account. If you already have a storage account in the same region, you may be able to use it instead and skip this cell.


In [28]:
!az storage account create -n {storage_account} -g {resource_group} --query 'provisioningState'

"Succeeded"

Let's get the first storage account key.

In [29]:
key = !az storage account keys list --account-name {storage_account} -g {resource_group} --query '[0].value'
storage_account_key = str(key[0][1:-1])  # strip the surrounding double quotes from the JSON output
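
The CLI prints the key as a JSON string, which is why it arrives wrapped in double quotes. Instead of slicing the quotes off by position, the output can be decoded with `json.loads`, as in the sketch below. The helper name `parse_cli_string` is hypothetical and the key is a made-up placeholder, not a real account key.

```python
import json


def parse_cli_string(output_lines):
    """Decode a JSON string printed by the Azure CLI, given as a list of output lines."""
    return json.loads("".join(output_lines))


# Made-up value standing in for the captured output of the `az` command.
fake_output = ['"bm90LWEtcmVhbC1rZXk="']
print(parse_cli_string(fake_output))
```

Decoding fails loudly if the command printed something other than a JSON string (for example an error message), whereas slicing would silently produce a wrong key. Another option is to add `--output tsv` to the `az` command, which prints the value without quotes.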

In [30]:
storage_container = "kubetestcont"

Create the container to be used by the blobfuse driver.

In [25]:
!az storage container create --name {storage_container} --account-key {storage_account_key} --account-name {storage_account}

Missing credentials to access storage service. The following variations are accepted:
    (1) account name and key (--account-name and --account-key options or
        set AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY environment variables)
    (2) account name and SAS token (--sas-token option used with either the --account-name
        option or AZURE_STORAGE_ACCOUNT environment variable)
    (3) account name (--account-name option or AZURE_STORAGE_ACCOUNT environment variable;
        this will make calls to query for a storage account key using login credentials)
    (4) connection string (--connection-string option or
        set AZURE_STORAGE_CONNECTION_STRING environment variable); some shells will require
        quoting to preserve literal character interpretation.


Now, we can upload the model servables to the container. This step requires that you have azcopy installed.

In [31]:
destination = 'https://{}.blob.core.windows.net/{}'.format(storage_account, storage_container)
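
The destination URL is only valid if both names follow Azure's naming rules. Storage account names use 3 to 24 lowercase letters and digits. Container names use 3 to 63 lowercase letters, digits and single hyphens, and start and end with a letter or digit. A minimal sketch that checks the names before building the URL follows; the function name `container_url` is hypothetical.

```python
import re

# Storage account: 3-24 lowercase letters and digits.
ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")
# Container: 3-63 characters, lowercase letters, digits and single hyphens,
# starting and ending with a letter or digit.
CONTAINER_RE = re.compile(r"(?=.{3,63}$)[a-z0-9]+(-[a-z0-9]+)*")


def container_url(account, container):
    """Validate both names and return the blob container URL."""
    if not ACCOUNT_RE.fullmatch(account):
        raise ValueError(f"invalid storage account name: {account!r}")
    if not CONTAINER_RE.fullmatch(container):
        raise ValueError(f"invalid container name: {container!r}")
    return f"https://{account}.blob.core.windows.net/{container}"


print(container_url("fenocerakubestorage", "kubetestcont"))
```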

In [32]:
# On macOS, the models folder was uploaded to the blob container manually instead of with azcopy.

#!azcopy --source models --destination {destination} --dest-key {storage_account_key} --recursive

### Install blobfuse driver on AKS

We will deploy the following DaemonSet to enable blobfuse on every node of the cluster.

In [34]:
# Skipping the blobfuse installer for now, as a test.

#! kubectl create -f https://raw.githubusercontent.com/Azure/kubernetes-volume-drivers/master/flexvolume/blobfuse/deployment/blobfuse-flexvol-installer-1.9.yaml

In [33]:
!kubectl get pods --all-namespaces

NAMESPACE     NAME                                    READY     STATUS    RESTARTS   AGE
kube-system   coredns-66cb57b9db-4thcv                1/1       Running   0          5m
kube-system   coredns-66cb57b9db-zbmrl                1/1       Running   0          10m
kube-system   coredns-autoscaler-7fd449d848-rjszm     1/1       Running   0          10m
kube-system   heapster-7677c744b8-jqrnp               2/2       Running   0          10m
kube-system   kube-proxy-864x4                        1/1       Running   0          5m
kube-system   kube-svc-redirect-n6tqz                 2/2       Running   0          5m
kube-system   kubernetes-dashboard-7b55c6f7b9-xpg8k   1/1       Running   0          10m
kube-system   metrics-server-67c75dbf7-8g9zk          1/1       Running   0          10m
kube-system   nvidia-device-plugin-daemonset-njsm6    1/1       Running   0          50s
kube-system   tunnelfront-7bc54c5887-k28v9            1/1       Running   0          10m
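
Scanning the pod table by eye works for a small cluster. As a sketch, the same check can be done in Python by parsing the text that `kubectl get pods --all-namespaces` prints. The function name `pods_not_running` is hypothetical, and the sample rows are copied from the listings in this notebook.

```python
def pods_not_running(table_text):
    """Return (namespace, name, status) for pods whose STATUS is not 'Running'."""
    lines = [line for line in table_text.splitlines() if line.strip()]
    status_col = lines[0].split().index("STATUS")
    pending = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= status_col:
            continue  # skip truncated rows
        if fields[status_col] != "Running":
            pending.append((fields[0], fields[1], fields[status_col]))
    return pending


sample = """\
NAMESPACE     NAME                                    READY     STATUS              RESTARTS   AGE
kube-system   kube-proxy-864x4                        1/1       Running             0          5m
kube-system   nvidia-device-plugin-daemonset-njsm6    0/1       ContainerCreating   0          1s
"""
print(pods_not_running(sample))
```

An empty list means every listed pod reports `Running`. Pods that have finished normally (status `Completed`) would also be reported by this simple check.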


In [None]:
!kubectl describe daemonset blobfuse-flexvol-installer --namespace=kube-system # NOTE: a wrong namespace here previously caused an error

Now, we can move on to [installing Kubeflow and serving the model on AKS with the Kubeflow TensorFlow Serving component](03_ServeWithKubeflow.ipynb).