# Create the AKS cluster

In this notebook we'll set up the AKS cluster. To do so, we'll do the following:
1. check that there is enough quota to provision our desired cluster
2. provision the cluster using the Azure CLI (`az`)
3. set up blobfuse on the nodes so the pods in our Kubernetes cluster can access our blob storage container

---

### Import packages and load .env

In [None]:
from dotenv import set_key, get_key, find_dotenv, load_dotenv
from pathlib import Path
import subprocess
import json
import os
import time

In [None]:
env_path = find_dotenv(raise_error_if_not_found=True)
load_dotenv(env_path)

### Provision AKS cluster and set up blobfuse

Set how many nodes you want to provision.

In [None]:
node_count = 3

set_key(env_path, "NODE_COUNT", str(node_count))
print("Done.")

Check that the region has enough free cores for "Standard_NC6s_v3". If not, fall back to "Standard_NC6", and then to "Standard_D2s_v3". If none of these families has enough free cores, raise an exception.

In [None]:
vm_dict = {
    "NCSv3": {
        "size": "Standard_NC6s_v3",
        "cores": 6
    },
    "NC": {
        "size": "Standard_NC6",
        "cores": 6
    },
    "DSv3": {
        "size": "Standard_D2s_v3",
        "cores": 2
    }
}

node_count = int(get_key(env_path, "NODE_COUNT"))

print("Checking quota for family size NCSv3...")
vm_family = "NCSv3"
requested_cores = node_count * vm_dict[vm_family]["cores"]

def check_quota(vm_family):
    """
    Return the number of unused cores (quota limit minus current usage)
    for the given VM family in the configured region.
    """
    results = subprocess.run([
        "az", "vm", "list-usage", 
        "--location", get_key(env_path, "REGION"), 
        "--query", "[?contains(localName, '%s')].{max:limit, current:currentValue}" % (vm_family)
    ], stdout=subprocess.PIPE)
    quota = json.loads(''.join(results.stdout.decode('utf-8')))
    print(quota)
    return int(quota[0]['max']) - int(quota[0]['current'])

diff = check_quota(vm_family)
# The request fails only when fewer cores are free than we need.
if diff < requested_cores:
    print("Not enough cores of NCSv3 in region, asking for {} but have {}".format(requested_cores, diff))
    
    print("Retrying with family size NC6...")
    vm_family = "NC"
    requested_cores = node_count * vm_dict[vm_family]["cores"]
    
    diff = check_quota(vm_family)
    if diff < requested_cores:
        print("Not enough cores of NC6 in region, asking for {} but have {}".format(requested_cores, diff))
    
        print("Retrying with family size DSv3...")
        vm_family = "DSv3"
        requested_cores = node_count * vm_dict[vm_family]["cores"]

        diff = check_quota(vm_family)
        if diff < requested_cores:
            print("Not enough cores of DSv3 in region, asking for {} but have {}".format(requested_cores, diff))
            raise Exception("Core Limit", "Not enough cores to satisfy request")

print("There are enough cores, you may continue...") 
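
The fallback order used above (NCSv3, then NC, then DSv3) can also be written as a loop over a preference list. The following is a minimal sketch, not the notebook's own implementation: `pick_vm_family` and the `fake_quota` numbers are hypothetical and exist only to illustrate the selection logic without calling `az`.

```python
from typing import Callable, Dict, Optional, Sequence


def pick_vm_family(
    node_count: int,
    cores_per_node: Dict[str, int],
    available_cores: Callable[[str], int],
    preference: Sequence[str],
) -> Optional[str]:
    """Return the first family in `preference` with enough free cores, else None."""
    for family in preference:
        requested = node_count * cores_per_node[family]
        if available_cores(family) >= requested:
            return family
    return None


# Hypothetical free-core counts, standing in for the result of check_quota().
fake_quota = {"NCSv3": 12, "NC": 24, "DSv3": 100}
cores = {"NCSv3": 6, "NC": 6, "DSv3": 2}

chosen = pick_vm_family(3, cores, fake_quota.__getitem__, ["NCSv3", "NC", "DSv3"])
print(chosen)  # NCSv3 has 12 free cores but 18 are needed, so this prints "NC"
```

In the notebook itself, `available_cores` would be the `check_quota` function, and a `None` result would correspond to raising the "Core Limit" exception.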

Create the AKS cluster. This step may take a while. Note that it also creates a second resource group in your subscription, which contains the actual compute resources of the AKS cluster.

*The `az aks create` command will generate service principal credentials (unless you explicitly specify them). So, if you have run this notebook before or have created an AKS cluster using the Azure CLI, you may need to clear the service principal credentials stored on your machine's disk by running `rm ~/.azure/aksServicePrincipal.json`.*
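
If you prefer to clear the cached credentials from Python rather than with `rm`, a small helper along these lines could work. This is only a sketch: `clear_cached_service_principal` is a hypothetical name, and the demonstration below runs against a throwaway file in a temporary directory instead of your real `~/.azure` folder.

```python
import tempfile
from pathlib import Path


def clear_cached_service_principal(path: Path) -> bool:
    """Delete the cached credentials file if it exists; return True if a file was removed."""
    if path.is_file():
        path.unlink()
        return True
    return False


# Demonstration on a temporary file, so no real credentials are touched.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "aksServicePrincipal.json"
    demo.write_text("{}")
    removed_first = clear_cached_service_principal(demo)   # file existed
    removed_second = clear_cached_service_principal(demo)  # already gone
    print(removed_first, removed_second)
```

To apply it to the file mentioned above, you would pass `Path.home() / ".azure" / "aksServicePrincipal.json"`.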

In [None]:
%%time
!az aks create \
    --resource-group {get_key(env_path, "RESOURCE_GROUP")} \
    --name {get_key(env_path, "AKS_CLUSTER")} \
    --node-count {get_key(env_path, "NODE_COUNT")} \
    --node-vm-size {vm_dict[vm_family]["size"]} \
    --generate-ssh-keys

Install `kubectl`, the command-line tool used to manage the Kubernetes cluster.

In [None]:
!sudo az aks install-cli

In [None]:
!az aks get-credentials \
    --resource-group {get_key(env_path, 'RESOURCE_GROUP')}\
    --name {get_key(env_path, 'AKS_CLUSTER')}

Also check that the nodes are up and ready using the following command. You may choose to run it in a new cell.
```bash
!kubectl get nodes
```
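
To check readiness programmatically instead of reading the table by eye, you could parse the JSON form of the same command. The sketch below assumes the standard Kubernetes node schema, in which each node carries a `Ready` condition under `status.conditions`. The helper names and the sample data are hypothetical.

```python
import json
import subprocess
from typing import Dict, List


def not_ready_nodes(node_list: Dict) -> List[str]:
    """Return the names of nodes whose 'Ready' condition is not 'True'."""
    names = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            names.append(node["metadata"]["name"])
    return names


def fetch_node_list() -> Dict:
    """Run `kubectl get nodes -o json` and return the parsed result."""
    out = subprocess.run(
        ["kubectl", "get", "nodes", "-o", "json"],
        stdout=subprocess.PIPE,
        check=True,
    )
    return json.loads(out.stdout.decode("utf-8"))


# Hypothetical sample data, shaped like kubectl's JSON output.
sample = {
    "items": [
        {"metadata": {"name": "node-0"},
         "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "node-1"},
         "status": {"conditions": [{"type": "Ready", "status": "False"}]}},
    ]
}
pending = not_ready_nodes(sample)
print(pending)
```

Against the real cluster you would call `not_ready_nodes(fetch_node_list())` and expect an empty list once every node is up.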

### Blobfuse on AKS

Now we set up our AKS cluster so that blob storage is mounted onto the nodes using blobfuse. More info [here](https://github.com/Azure/kubernetes-volume-drivers/tree/master/flexvolume/blobfuse).

Install the blobfuse driver on every agent VM.

In [None]:
!kubectl create -f https://raw.githubusercontent.com/ewyuanzhang/kubernetes-volume-drivers/master/flexvolume/blobfuse/deployment/blobfuse-flexvol-installer-1.9.yaml

Check daemonset status.

In [None]:
!kubectl describe daemonset blobfuse-flexvol-installer --namespace=kube-system
!kubectl get po --namespace=kube-system -o wide

Set up credentials for blobfuse.

In [None]:
!kubectl create secret generic blobfusecreds \
    --from-literal accountname={get_key(env_path, 'STORAGE_ACCOUNT_NAME')} \
    --from-literal accountkey={get_key(env_path, 'STORAGE_ACCOUNT_KEY')} \
    --type="azure/blobfuse"

Set the mount directory on our AKS cluster as a dotenv variable.

In [None]:
set_key(env_path, "MOUNT_DIR", "/data")
print("Done.")
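
For orientation, here is roughly how a pod could later reference the `blobfusecreds` secret and the mount directory. This is a sketch under assumptions: the field names follow the blobfuse flexvolume driver's documented pod example as best recalled, so check them against the driver README linked above. The function names, the volume name, the container name `my-container`, and the `tmppath` and `mountoptions` values are all hypothetical.

```python
from typing import Dict


def blobfuse_volume(name: str, container: str, secret_name: str = "blobfusecreds") -> Dict:
    """Build a pod-spec volume entry that mounts a blob container via the flexvolume driver."""
    return {
        "name": name,
        "flexVolume": {
            "driver": "azure/blobfuse",          # matches the secret's --type above
            "readOnly": False,
            "secretRef": {"name": secret_name},  # the secret created earlier
            "options": {
                "container": container,          # blob container name (assumed)
                "tmppath": "/tmp/blobfuse",      # local cache path (assumed value)
                "mountoptions": "--file-cache-timeout-in-seconds=120",  # assumed value
            },
        },
    }


def volume_mount(name: str, mount_dir: str) -> Dict:
    """Build the matching container-level volumeMounts entry."""
    return {"name": name, "mountPath": mount_dir}


volume = blobfuse_volume("blob", "my-container")
mount = volume_mount("blob", "/data")  # "/data" is the MOUNT_DIR set above
print(volume["flexVolume"]["driver"], mount["mountPath"])
```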

### Install NVIDIA drivers

Before the GPUs in the nodes can be used, you must deploy a DaemonSet for the NVIDIA device plugin. This DaemonSet runs a pod on each node to provide the required drivers for the GPUs.

First, create a namespace using the `kubectl create namespace` command:

In [None]:
!kubectl create namespace gpu-resources

Now use the `kubectl apply` command to create the DaemonSet. The cell output should confirm that the NVIDIA device plugin was created successfully:

In [None]:
!kubectl apply -f nvidia-device-plugin-ds.yaml

### Confirm that GPUs are schedulable

With your AKS cluster created, confirm that GPUs are schedulable in Kubernetes. First, list the nodes in your cluster using the `kubectl get nodes` command:

In [None]:
!kubectl get nodes

Now use the `kubectl describe node <node_name>` command to confirm that the GPUs are schedulable. Under the `Capacity` section, the GPU should list as `nvidia.com/gpu: 1`.
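
The same check can be done for all nodes at once by reading `status.capacity` from the JSON output of `kubectl get nodes -o json`. The following is a minimal sketch: `gpu_capacity` and the sample data are hypothetical, and it assumes the device plugin reports GPUs under the `nvidia.com/gpu` key, as described above.

```python
from typing import Dict


def gpu_capacity(node_list: Dict) -> Dict[str, int]:
    """Map each node name to the number of GPUs it advertises (0 if none)."""
    result = {}
    for node in node_list.get("items", []):
        capacity = node.get("status", {}).get("capacity", {})
        # Kubernetes reports capacity values as strings, e.g. "1".
        result[node["metadata"]["name"]] = int(capacity.get("nvidia.com/gpu", 0))
    return result


# Hypothetical sample data, shaped like the output of `kubectl get nodes -o json`.
sample_nodes = {
    "items": [
        {"metadata": {"name": "gpu-node"},
         "status": {"capacity": {"cpu": "6", "nvidia.com/gpu": "1"}}},
        {"metadata": {"name": "cpu-node"},
         "status": {"capacity": {"cpu": "2"}}},
    ]
}
gpus = gpu_capacity(sample_nodes)
print(gpus)
```

A node that shows `0` here either has no GPU or has not yet registered one through the device plugin.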

---

Continue to the next [notebook](./04_vehicle_detection_on_aks.ipynb).