# 4. Neural Style Transfer on AKS

Now that the AKS cluster is up, we need to deploy our __flask app__ and __scoring app__ onto it.

To do so, we'll do the following:
1. Build our __flask app__ and __scoring app__ images and push them to Docker Hub
2. Create a deployment file for each of these apps (these files need the proper configuration for the pods to use blobfuse to access our blob storage container). We should end up creating: `flask_app_deployment.json` and `scoring_app_deployment.json`
3. Use `kubectl` to make these deployments to our AKS cluster
4. Expose the __flask app__ REST endpoint so that it can be accessed externally

### Kubernetes Deployment
In this notebook, we will deploy our __flask app__ and __scoring app__ on the kubernetes cluster. Since the __flask app__ does not require heavy computation, we will deploy it on one node and reserve the remaining nodes for the __scoring app__ as it will perform the parallel computation.

---

### Import packages and load .env

In [1]:
from dotenv import set_key, get_key, find_dotenv, load_dotenv
from pathlib import Path
import subprocess
import json
import os

In [2]:
env_path = find_dotenv(raise_error_if_not_found=True)
load_dotenv(env_path)

True
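
Later cells read several keys from the `.env` file (for example `DOCKER_LOGIN`, `SCORING_IMAGE`, `FLASK_IMAGE`, `MOUNT_DIR` and `NODE_COUNT`). A minimal sketch of an up-front check is shown below. `missing_keys` is a hypothetical helper, not part of `python-dotenv`, and the sample values are placeholders rather than values from this project.

```python
def missing_keys(values, required):
    """Return the required keys that are absent or empty in `values`."""
    return [key for key in required if not values.get(key)]


# A sample of the keys this notebook reads; extend the list as needed.
REQUIRED_KEYS = ["DOCKER_LOGIN", "SCORING_IMAGE", "FLASK_IMAGE", "MOUNT_DIR", "NODE_COUNT"]

# Placeholder stand-in for the parsed .env contents.
example_values = {"DOCKER_LOGIN": "someuser", "SCORING_IMAGE": "", "MOUNT_DIR": "/data"}

print(missing_keys(example_values, REQUIRED_KEYS))
```

In the notebook, the dictionary could come from `dotenv.dotenv_values(env_path)` instead of the placeholder above.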

### Build Scoring App Docker Image

In [3]:
%%writefile scoring_app/requirements.txt
azure==4.0.0
torch==0.4.1
torchvision==0.2.1

Writing scoring_app/requirements.txt


In [4]:
%%writefile scoring_app/Dockerfile

FROM nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04

RUN echo "deb http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 /" > /etc/apt/sources.list.d/nvidia-ml.list

RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
        ca-certificates \
        cmake \
        curl \
        git \
        nginx \
        supervisor \
        wget && \
        rm -rf /var/lib/apt/lists/*

ENV PYTHON_VERSION=3.6
RUN curl -o ~/miniconda.sh -O  https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh  && \
    chmod +x ~/miniconda.sh && \
    ~/miniconda.sh -b -p /opt/conda && \
    rm ~/miniconda.sh && \
    /opt/conda/bin/conda create -y --name py$PYTHON_VERSION python=$PYTHON_VERSION && \
    /opt/conda/bin/conda clean -ya
ENV PATH /opt/conda/envs/py$PYTHON_VERSION/bin:$PATH
ENV LD_LIBRARY_PATH /opt/conda/envs/py$PYTHON_VERSION/lib:/usr/local/cuda/lib64/:$LD_LIBRARY_PATH
ENV PYTHONPATH /code/:$PYTHONPATH

RUN mkdir /app
WORKDIR /app
ADD process_images_from_queue.py /app
ADD style_transfer.py /app
ADD main.py /app
ADD util.py /app
ADD requirements.txt /app

RUN pip install --no-cache-dir -r requirements.txt

CMD ["python", "main.py"]

Writing scoring_app/Dockerfile


In [5]:
!sudo docker build -t {get_key(env_path, "SCORING_IMAGE")} scoring_app

Sending build context to Docker daemon  35.84kB
Step 1/17 : FROM nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04
 ---> f4f6aaaaa057
Step 2/17 : RUN echo "deb http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64 /" > /etc/apt/sources.list.d/nvidia-ml.list
 ---> Using cache
 ---> 4196af2ba86e
Step 3/17 : RUN apt-get update && apt-get install -y --no-install-recommends         build-essential         ca-certificates         cmake         curl         git         nginx         supervisor         wget &&         rm -rf /var/lib/apt/lists/*
 ---> Using cache
 ---> 8ddcde9d280a
Step 4/17 : ENV PYTHON_VERSION=3.6
 ---> Using cache
 ---> 5a047de1f83a
Step 5/17 : RUN curl -o ~/miniconda.sh -O  https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh  &&     chmod +x ~/miniconda.sh &&     ~/miniconda.sh -b -p /opt/conda &&     rm ~/miniconda.sh &&     /opt/conda/bin/conda create -y --name py$PYTHON_VERSION python=$PYTHON_VERSION &&     /opt/conda/bin/conda c

Collecting cryptography (from azure-cosmosdb-table~=1.0->azure==4.0.0->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/98/71/e632e222f34632e0527dd41799f7847305e701f38f512d81bdf96009bca4/cryptography-2.5-cp34-abi3-manylinux1_x86_64.whl (2.4MB)
Collecting python-dateutil (from azure-cosmosdb-table~=1.0->azure==4.0.0->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/74/68/d87d9b36af36f44254a8d512cbfc48369103a3b9e474be9bdfe536abfc45/python_dateutil-2.7.5-py2.py3-none-any.whl (225kB)
Collecting azure-cosmosdb-nspkg>=2.0.0 (from azure-cosmosdb-table~=1.0->azure==4.0.0->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/58/0d/1329b47e5386b0acf4e42ada2284851eff60ef3337a87e5b2dfedabbfcb1/azure_cosmosdb_nspkg-2.0.2-py2.py3-none-any.whl
Collecting msrest<2.0.0,>=0.5.4 (from azure-applicationinsights~=0.1.0->azure==4.0.0->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org

Collecting azure-mgmt-subscription~=0.2.0 (from azure-mgmt~=4.0->azure==4.0.0->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/27/72/f63f72d9c27659f96aae287f7ba67c9ba2877213c2c10d8b797e0dee5059/azure_mgmt_subscription-0.2.0-py2.py3-none-any.whl (40kB)
Collecting azure-mgmt-powerbiembedded~=2.0 (from azure-mgmt~=4.0->azure==4.0.0->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/64/2b/5a69d118626a83055ce577fb6ba35c114e72f0273d0c4d943b909ee8d51e/azure_mgmt_powerbiembedded-2.0.0-py2.py3-none-any.whl
Collecting azure-mgmt-consumption~=2.0 (from azure-mgmt~=4.0->azure==4.0.0->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/11/f4/2db9557494dfb17ff3edeae5726981143a7baace17df3712b189e343bd8c/azure_mgmt_consumption-2.0.0-py2.py3-none-any.whl (46kB)
Collecting azure-mgmt-machinelearningcompute~=0.4.1 (from azure-mgmt~=4.0->azure==4.0.0->-r requirements.txt (line 1))
  Downloading https://fi

Collecting azure-mgmt-reservations~=0.2.1 (from azure-mgmt~=4.0->azure==4.0.0->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/a9/2b/e0edb8dbe85e447005cde508b869e564f58270f631219f036d8ab71184f9/azure_mgmt_reservations-0.2.1-py2.py3-none-any.whl (50kB)
Collecting azure-mgmt-containerregistry~=2.1 (from azure-mgmt~=4.0->azure==4.0.0->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/4e/93/982bdd8f38379405beab963ee2abf1ab42a826a57cc2157228216cfc62e4/azure_mgmt_containerregistry-2.6.0-py2.py3-none-any.whl (501kB)
Collecting azure-mgmt-notificationhubs~=2.0 (from azure-mgmt~=4.0->azure==4.0.0->-r requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/b1/7f/283067cf30dd72707bbdfa608838509663328e533d7e362f7630c785679c/azure_mgmt_notificationhubs-2.0.0-py2.py3-none-any.whl (71kB)
Collecting azure-mgmt-datafactory~=0.6.0 (from azure-mgmt~=4.0->azure==4.0.0->-r requirements.txt (line 1))
  Downloading

Removing intermediate container 31676e0db18a
 ---> eb2150d4a8d1
Step 17/17 : CMD ["python", "main.py"]
 ---> Running in 6b34aec254e0
Removing intermediate container 6b34aec254e0
 ---> 044133dd4a7b
Successfully built 044133dd4a7b
Successfully tagged batchscoringdl_scoring_app:latest


Tag and push docker image

In [6]:
repo = "{}/{}".format(get_key(env_path, "DOCKER_LOGIN"), get_key(env_path, "SCORING_IMAGE"))

In [7]:
!sudo docker tag {get_key(env_path, "SCORING_IMAGE")} {repo}

In [8]:
!sudo docker push {repo}

The push refers to repository [docker.io/jiata/batchscoringdl_scoring_app]

d55a0991: Preparing
3a48a106: Preparing
27fe773a: Preparing
9cf2b9f2: Preparing
6063fec7: Preparing
2632d196: Preparing
8452f77e: Preparing
aad0d176: Preparing
ff05626e: Preparing
9048222b: Preparing
f7dc85a1: Preparing
2df89268: Preparing
d8f0884d: Preparing
87fdb58c: Preparing
8fb03d12: Preparing
843615e2: Preparing
a8049aa6: Preparing
9c0f8a0b: Preparing
8ccd260b: Preparing
d55a0991: Pushed

### Build Flask App Docker Image

Create our Dockerfile and save it to the directory, `flask_app/`.

In [9]:
%%writefile flask_app/Dockerfile

FROM continuumio/miniconda3

RUN mkdir /app
WORKDIR /app
ADD add_images_to_queue.py /app
ADD preprocess.py /app
ADD postprocess.py /app
ADD util.py /app
ADD main.py /app

RUN conda install -c conda-forge -y ffmpeg
RUN pip install azure
RUN pip install flask

CMD ["python", "main.py"]

Writing flask_app/Dockerfile


Build the Docker image

In [10]:
!sudo docker build -t {get_key(env_path, "FLASK_IMAGE")} flask_app

Sending build context to Docker daemon  24.06kB
Step 1/12 : FROM continuumio/miniconda3
 ---> d3c252f8727b
Step 2/12 : RUN mkdir /app
 ---> Using cache
 ---> 3d09918b3c53
Step 3/12 : WORKDIR /app
 ---> Using cache
 ---> cdb65a76c343
Step 4/12 : ADD add_images_to_queue.py /app
 ---> Using cache
 ---> 7c1787f853f4
Step 5/12 : ADD preprocess.py /app
 ---> Using cache
 ---> 71901a55c05e
Step 6/12 : ADD postprocess.py /app
 ---> Using cache
 ---> 3d38f2424ba4
Step 7/12 : ADD util.py /app
 ---> Using cache
 ---> a8454a26df6d
Step 8/12 : ADD main.py /app
 ---> Using cache
 ---> 8f7fb47b3a27
Step 9/12 : RUN conda install -c conda-forge -y ffmpeg
 ---> Using cache
 ---> ea75d4d4cb83
Step 10/12 : RUN pip install azure
 ---> Using cache
 ---> 2cb7be8f73e8
Step 11/12 : RUN pip install flask
 ---> Using cache
 ---> b29b139e3081
Step 12/12 : CMD ["python", "main.py"]
 ---> Using cache
 ---> ccc092782af4
Successfully built ccc092782af4
Successfully tagged batchscoringdl_flask_app:latest


Tag and push.

In [11]:
repo = "{}/{}".format(get_key(env_path, "DOCKER_LOGIN"), get_key(env_path, "FLASK_IMAGE"))

In [12]:
!sudo docker tag {get_key(env_path, "FLASK_IMAGE")} {repo}

In [13]:
!sudo docker push {repo}

The push refers to repository [docker.io/jiata/batchscoringdl_flask_app]

4e17c344: Preparing
4b743574: Preparing
5b7e80f6: Preparing
3289740b: Preparing
b3c10e7f: Preparing
2fb3d239: Preparing
5a465adc: Preparing
0c665bf2: Preparing
59854246: Preparing
b5c98f73: Preparing
31f7d329: Preparing
cc4bbc9d: Preparing
ae060f2d: Preparing
0c665bf2: Layer already exists
latest: digest: sha256:594a861ebbe3db25b661173aec23762eb0440666ac9b95b993fdc80a1985cf01 size: 3248


### Create our Flask App and Scoring App deployments on AKS

We need to deploy both our flask app and scoring app Docker images to the AKS cluster. Since both deployments need the GPU drivers and the blobfuse mount point, we'll set these up first:

In [14]:
volume_mounts = [
    {"name": "nvidia", "mountPath": "/usr/local/nvidia"},
    {"name": "blob", "mountPath": get_key(env_path, "MOUNT_DIR")},
]

resources = {
    "requests": {"alpha.kubernetes.io/nvidia-gpu": 1},
    "limits": {"alpha.kubernetes.io/nvidia-gpu": 1},
}

volumes = [
    {"name": "nvidia", "hostPath": {"path": "/usr/local/nvidia"}},
    {
        "name": "blob",
        "flexVolume": {
            "driver": "azure/blobfuse",
            "readOnly": False,
            "secretRef": {"name": "blobfusecreds"},
            "options": {
                "container": get_key(env_path, "STORAGE_CONTAINER_NAME"),
                "tmppath": "/tmp/blobfuse",
                "mountoptions": "--file-cache-timeout-in-seconds=120 --use-https=true",
            },
        },
    },
]

env = [
    {
        "name": "MOUNT_DIR", 
        "value": get_key(env_path, "MOUNT_DIR")
    },
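    {
        # Kubernetes does not expand shell-style $VAR references in env values,
        # so the library paths are listed explicitly.
        "name": "LD_LIBRARY_PATH",
        "value": "/opt/conda/envs/py3.6/lib:/usr/local/cuda/lib64/:/usr/local/nvidia/lib64",
    },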
    {
        "name": "DP_DISABLE_HEALTHCHECKS", 
        "value": "xids"
    },
    {
        "name": "STORAGE_MODEL_DIR",
        "value": get_key(env_path, "STORAGE_MODEL_DIR")
    },
    {
        "name": "SUBSCRIPTION_ID",
        "value": get_key(env_path, "SUBSCRIPTION_ID")
    },
    {
        "name": "RESOURCE_GROUP",
        "value": get_key(env_path, "RESOURCE_GROUP")
    },
    {
        "name": "REGION",
        "value": get_key(env_path, "REGION")
    },
    {
        "name": "SB_SHARED_ACCESS_KEY_NAME",
        "value": get_key(env_path, "SB_SHARED_ACCESS_KEY_NAME")
    },
    {
        "name": "SB_SHARED_ACCESS_KEY_VALUE",
        "value": get_key(env_path, "SB_SHARED_ACCESS_KEY_VALUE")
    },
    {
        "name": "SB_NAMESPACE",
        "value": get_key(env_path, "SB_NAMESPACE")
    },
    {
        "name": "SB_QUEUE", 
        "value": get_key(env_path, "SB_QUEUE")
    },
]
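
Most entries in the `env` list above follow the same pattern: the variable name is also the key looked up in the `.env` file. As a sketch of how that repetition could be factored out, the hypothetical helper below builds the entries from a list of names. `env_entries` and the sample values are illustrative assumptions, not part of the original notebook.

```python
def env_entries(names, lookup):
    """Build Kubernetes-style env entries, using `lookup(name)` for each value."""
    return [{"name": name, "value": lookup(name)} for name in names]


# Placeholder stand-in for `lambda name: get_key(env_path, name)`.
fake_env = {"SB_NAMESPACE": "my-namespace", "SB_QUEUE": "my-queue"}

entries = env_entries(["SB_NAMESPACE", "SB_QUEUE"], fake_env.get)
print(entries)
```

Entries whose value is not read from the `.env` file, such as `DP_DISABLE_HEALTHCHECKS`, would still be written out by hand.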

Define the scoring app deployment and save it to a `scoring_app_deployment.json` file using the variables set above.

In [15]:
scoring_app_deployment_json = {
    "apiVersion": "apps/v1beta1",
    "kind": "Deployment",
    "metadata": {
        "name": "scoring-app", 
        "labels": {
            "purpose": "dequeue_messages_and_apply_style_transfer"
        }
    },
    "spec": {
        "replicas": int(get_key(env_path, "NODE_COUNT")) - 1,
        "template": {
            "metadata": {
                "labels": {
                    "app": "scoring-app"
                }
            },
            "spec": {
                "containers": [
                    {
                        "name": "scoring-app",
                        "image": "{}/{}:latest".format(get_key(env_path, "DOCKER_LOGIN"), get_key(env_path, "SCORING_IMAGE")),
                        "volumeMounts": volume_mounts,
                        "resources": resources,
                        "ports": [{
                            "containerPort": 433
                        }],
                        "env": env,
                    }
                ],
                "volumes": volumes
            },
        },
    },
}

with open("scoring_app_deployment.json", "w") as outfile:
    json.dump(scoring_app_deployment_json, outfile, indent=4, sort_keys=True)
    outfile.write('\n\n')
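
Before submitting a manifest, a quick structural check can catch a common mistake: a `volumeMounts` entry whose name has no matching entry in `volumes`. The sketch below is one possible check, not something `kubectl` requires. `unmatched_volume_mounts` is a hypothetical helper and the example manifest is hand-written with placeholder paths.

```python
def unmatched_volume_mounts(deployment):
    """Return volumeMount names that have no matching entry in the pod's volumes."""
    pod_spec = deployment["spec"]["template"]["spec"]
    volume_names = {volume["name"] for volume in pod_spec.get("volumes", [])}
    missing = []
    for container in pod_spec.get("containers", []):
        for mount in container.get("volumeMounts", []):
            if mount["name"] not in volume_names:
                missing.append(mount["name"])
    return missing


# Small hand-written manifest for illustration; "/data" is a placeholder path.
example = {
    "spec": {"template": {"spec": {
        "containers": [{
            "name": "scoring-app",
            "volumeMounts": [
                {"name": "nvidia", "mountPath": "/usr/local/nvidia"},
                {"name": "blob", "mountPath": "/data"},
            ],
        }],
        # The "blob" volume is deliberately left out to trigger the check.
        "volumes": [{"name": "nvidia", "hostPath": {"path": "/usr/local/nvidia"}}],
    }}}
}

print(unmatched_volume_mounts(example))
```

In the notebook, the same function could be called on `scoring_app_deployment_json` or `flask_app_deployment_json`, where an empty list means every mount has a volume.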

Using the `scoring_app_deployment.json` we created, create our deployment on AKS. This can take a few minutes...

In [16]:
!kubectl create -f scoring_app_deployment.json

deployment.apps/scoring-app created


Define the flask app deployment and save it to a `flask_app_deployment.json` file using the variables set above.

In [17]:
flask_app_deployment_json = {
    "apiVersion": "apps/v1beta1",
    "kind": "Deployment",
    "metadata": {
        "name": "flask-app", 
        "labels": {
            "purpose": "pre_and_post_processing_and_queue_images"
        }
    },
    "spec": {
        "replicas": 1,
        "template": {
            "metadata": {
                "labels": {
                    "app": "flask-app"
                }
            },
            "spec": {
                "containers": [
                    {
                        "name": "flask-app",
                        "image": "{}/{}:latest".format(get_key(env_path, "DOCKER_LOGIN"), get_key(env_path, "FLASK_IMAGE")),
                        "volumeMounts": volume_mounts,
                        "resources": resources,
                        "ports": [{
                            "containerPort": 8080
                        }],
                        "env": env,
                    }
                ],
                "volumes": volumes
            },
        },
    },
}

with open("flask_app_deployment.json", "w") as outfile:
    json.dump(flask_app_deployment_json, outfile, indent=4, sort_keys=True)
    outfile.write('\n\n')

Using the `flask_app_deployment.json` we created, create our flask app deployment on AKS. This can take a few minutes...

In [18]:
!kubectl create -f flask_app_deployment.json

deployment.apps/flask-app created


These deployments may take a few minutes. You can inspect the state of the pods by running the command: `kubectl get pods`. When the deployment is done, the results may look as follows:
```bash
NAME                           READY   STATUS              RESTARTS   AGE
flask-app-6db66c97ff-x8rq4     1/1     Running             0          78s
scoring-app-846dd6bc79-5nm5b   1/1     Running             0          73s
scoring-app-846dd6bc79-6qc6k   1/1     Running             0          73s
scoring-app-846dd6bc79-8gtsv   1/1     Running             0          73s
scoring-app-846dd6bc79-hjsfc   1/1     Running             0          73s
```
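
Instead of reading the table by eye, the same information can be checked from Python. The sketch below parses the JSON produced by `kubectl get pods -o json` and lists pods whose phase is not `Running`. `pods_not_running` is a hypothetical helper, and the sample JSON is a trimmed, hand-written stand-in for real output.

```python
import json


def pods_not_running(pods_json):
    """Return names of pods whose phase is not 'Running'."""
    pods = json.loads(pods_json)["items"]
    return [
        pod["metadata"]["name"]
        for pod in pods
        if pod.get("status", {}).get("phase") != "Running"
    ]


# Trimmed sample of the JSON shape; real output has many more fields.
sample = json.dumps({
    "items": [
        {"metadata": {"name": "flask-app-6db66c97ff-x8rq4"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "scoring-app-846dd6bc79-5nm5b"}, "status": {"phase": "Pending"}},
    ]
})

print(pods_not_running(sample))
```

In the notebook, the input could be obtained with `subprocess.check_output(["kubectl", "get", "pods", "-o", "json"]).decode()`.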

Expose the flask-app in the kubernetes cluster. This will open a public endpoint.

In [19]:
!kubectl expose deployment flask-app --type="LoadBalancer"

service/flask-app exposed


Run `!kubectl get services` (or `watch kubectl get services` in a terminal) and wait until the `EXTERNAL-IP` of the `flask-app` service changes from `<pending>` to an actual address. This can take a few minutes.

NOTE: If the following cell is run before the external IP has been assigned, it will fail.
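
One way to avoid that failure is to poll until the address appears. The sketch below is an assumption-laden example rather than part of the original notebook: `kubectl_external_ip` and `wait_for_external_ip` are hypothetical helpers, the service name `flask-app` comes from the `kubectl expose` call above, and the retry count and delay are arbitrary.

```python
import subprocess
import time


def kubectl_external_ip(service="flask-app"):
    """Ask kubectl for the service's load balancer IP.

    Returns an empty string while the IP is still pending (or if kubectl fails).
    """
    result = subprocess.run(
        ["kubectl", "get", "service", service,
         "-o", "jsonpath={.status.loadBalancer.ingress[0].ip}"],
        stdout=subprocess.PIPE, universal_newlines=True,
    )
    return result.stdout.strip()


def wait_for_external_ip(fetch=kubectl_external_ip, attempts=30, delay=10.0):
    """Poll `fetch` until it returns a non-empty IP, or raise after `attempts` tries."""
    for _ in range(attempts):
        ip = fetch()
        if ip:
            return ip
        time.sleep(delay)
    raise TimeoutError("external IP was not assigned in time")
```

Calling `external_ip = wait_for_external_ip()` would then block until the address is available. The `fetch` parameter exists so the polling logic can be exercised without a cluster.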

In [20]:
external_ip = !kubectl get services -o=jsonpath={.items[*].status.loadBalancer.ingress[0].ip}
external_ip = external_ip[0]

Since we'll use the `external_ip` later on, save it to the `.env` file.

In [21]:
set_key(env_path, "AKS_EXTERNAL_IP", external_ip)

### Test that the deployment works end-to-end

Set the name of the new test video.

In [22]:
new_video_name = "aks_test_orangutan.mp4"

Make a copy of the old `orangutan.mp4` video, named `<new_video_name>`.

In [23]:
!cp data/orangutan.mp4 data/{new_video_name}

Use `curl` to hit the endpoint of the kubernetes cluster we just deployed.

In [24]:
!curl {external_ip}":8080/process?video_name="{new_video_name}

Processing aks_test_orangutan.mp4 in background...
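
The same request can also be built in Python, which URL-encodes the video name in case it contains spaces or other special characters. This is a minimal sketch: `build_process_url` is a hypothetical helper, the `/process` path, `video_name` parameter and port 8080 are taken from the `curl` call above, and the IP address is a documentation-only placeholder.

```python
from urllib.parse import urlencode


def build_process_url(host, video_name, port=8080):
    """Build the URL for the flask app's /process endpoint."""
    query = urlencode({"video_name": video_name})
    return "http://{}:{}/process?{}".format(host, port, query)


# 203.0.113.10 is a placeholder address, not a real endpoint.
url = build_process_url("203.0.113.10", "aks_test_orangutan.mp4")
print(url)
```

In the notebook, `urllib.request.urlopen(build_process_url(external_ip, new_video_name))` would send the request.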


Inspect your Kubernetes cluster to see that the process is running, using the `kubectl` commands listed under "Basic Kubectl usage" below. You can also inspect the blob storage container to see the images being created.

When the video completes, you can play the video file directly from your mounted blob container:

In [26]:
%%HTML
<video width="320" height="240" controls>
  <source src="data/aks_test_orangutan/aks_test_orangutan_processed.mp4" type="video/mp4">
</video>

### Basic Kubectl usage
You can use kubectl to perform basic monitoring. Use the following commands:
```bash
# monitor pods
!kubectl get pods

# print logs from a pod (<pod-name> can be found when calling 'get pods')
!kubectl logs <pod-name>

# check all services running on the cluster
!kubectl get services

# delete a service
!kubectl delete services <service-name>

# delete a deployment
!kubectl delete -f scoring_app_deployment.json
!kubectl delete -f flask_app_deployment.json
```

### Monitor in kubernetes dashboard
You can use the Kubernetes dashboard to monitor the cluster using the following commands:

```bash
# use the kube_dashboard_access.yaml to create a deployment
!kubectl create -f kube_dashboard_access.yaml

# use this command to browse
!az aks browse -n {get_key(env_path, "AKS_CLUSTER")} -g {get_key(env_path, "RESOURCE_GROUP")}
```

If you're not able to access the dashboard, follow the instructions [here](https://blog.tekspace.io/kubernetes-dashboard-remote-access/).

### Additional commands for AKS

Scale your AKS cluster:

```bash 
!az aks scale \
    --name {get_key(env_path, "AKS_CLUSTER")} \
    --resource-group {get_key(env_path, "RESOURCE_GROUP")} \
    --node-count 10
```

Scale your deployment:
```bash
!kubectl scale deployment.apps/scoring-app --replicas=10
```

---

Continue to the next [notebook](/notebooks/05_deploy_logic_app.ipynb).