# Deploying CNTK deep learning models as real-time micro-services

Inspired by https://github.com/ilkarman/Blog/blob/master/rndm/ACS%20Deploy.ipynb , https://github.com/Azure/Spark-Operationalization-On-Azure/blob/master/samples/cntk/tutorials/realtime/image_classification.md and https://github.com/Azure-Samples/hdinsight-pyspark-cntk-integration

## Scenario

Deep learning models are trained for a variety of tasks, from image classification to translation. Often, unseen data then needs to be scored in real time.

The Azure Machine Learning (AML) CLI wraps the Azure APIs to deploy a VM scale set backed by Marathon, and lets you deploy deep learning models in Docker containers directly from the command line.

In this tutorial we demonstrate the deployment of an image classification service in Docker containers using a pre-trained CNTK ResNet_152 model.

## Installing the AML CLI dependencies

In [None]:
!pip install azuremlcli asyncio aiohttp

In [None]:
!pip install azure-cli -I --upgrade

In [None]:
# Creating ssh key pair and saving it in the .library for re-use between containers
import os
if not os.path.exists('/home/nbuser/.ssh/id_rsa'):
    !ssh-keygen -t rsa -b 2048 -N "" -f ~/.ssh/id_rsa
print('Private key id_rsa:')
!cat ~/.ssh/id_rsa
print('Public key id_rsa.pub:')
!cat ~/.ssh/id_rsa.pub

Save the private key and the public key: you will need them to access your cluster if you plan to keep it beyond the length of this tutorial. Azure notebooks run in a container that can be restarted after a period of inactivity.

## Setting up the ACS environment

Log in to your Azure account

In [None]:
!az login -o table

If you want to use a non-default subscription, replace the value of `subscription` with the name of the subscription you want to use, copied from the output of the previous command.

In [None]:
subscription = "<YOUR_SUBSCRIPTION_NAME_HERE>"
subscription = "'" + subscription + "'"
!az account set --subscription $subscription
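
The cell above wraps the subscription name in single quotes by hand, which breaks if the name itself contains a single quote. As a minimal sketch of a more defensive approach, `shlex.quote` from the standard library escapes arbitrary text for a POSIX shell. The helper name `az_set_subscription_command` is hypothetical; only the `az account set --subscription` command comes from this tutorial.

```python
import shlex


def az_set_subscription_command(subscription):
    """Build the `az account set` command line with a safely quoted name.

    Hypothetical helper: it only builds the string, it does not run it.
    """
    return "az account set --subscription " + shlex.quote(subscription)


if __name__ == "__main__":
    # Names with spaces or quotes are escaped; simple names are left as-is.
    print(az_set_subscription_command("My Subscription"))
    print(az_set_subscription_command("Bob's Subscription"))
```

In a notebook cell, the resulting string could be stored in a variable and executed with `!$cmd`.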

Create the AML environment

In [None]:
import uuid

name = "aiimmersion{}".format(str(uuid.uuid4())[:8])

# Creating the environment
!aml env setup --name $name

Copy the deployment key shown at the end of the command in the following output into **ACS_deployment_key**:

```
Started ACS deployment. Please note that it can take up to 15 minutes to complete the deployment.
You can continue to work with aml in local mode while the ACS is being provisioned.
To check the status of the deployment, run the following command:
aml env setup -s XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
```

In [None]:
ACS_deployment_key = "<YOUR_ACS_DEPLOYMENT_KEY>"

In [None]:
if "YOUR_ACS_DEPLOYMENT_KEY" in ACS_deployment_key:
    print("/!\\ STOP /!\\ You need to modify the value of ACS_deployment_key, please follow the above instructions")
else:
    print("You are good to go :)")

Check the cluster deployment status. At this point it should be **Running**, which means the deployment is still in progress and the cluster is **NOT YET** deployed.

In [None]:
!aml env setup -s $ACS_deployment_key

Showing the AML environment variables

In [None]:
!cat ~/.amlenvrc

The Azure Notebook environment does not persist environment variables beyond the current shell execution.
The following workaround sources the environment variables before each run of the AML CLI.

In [None]:
import os
# Moving the original aml to aml_orig
aml_path = !which aml
aml_path = aml_path[0]
aml_path_orig = aml_path + "_orig"
if not os.path.exists(aml_path_orig):
    !mv $aml_path $aml_path_orig

# Writing a new script to source the env variables
# before running the aml CLI
# The shebang must be the very first line of the file,
# so the string starts with it directly.
script = """#!/bin/sh
touch ~/.amlenvrc
. ~/.amlenvrc
export no_proxy=127.0.0.1
{} "$@"
""".format(aml_path_orig)
with open(aml_path, 'w') as f:
    f.write(script)

# Setting the permission to executable
!chmod 755 $aml_path

!aml
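
As an alternative to the wrapper script, the variables can also be loaded into the Python process itself, so that later `!` commands inherit them. The sketch below assumes that `~/.amlenvrc` consists of shell lines of the form `export NAME=value`; that format is an assumption, and the function name `load_env_file` is hypothetical. It is not a full shell parser: variable expansion and multi-line values are not handled.

```python
import os
import shlex


def load_env_file(path, environ=None):
    """Copy `export NAME=value` assignments from a shell rc file into environ.

    Assumes simple assignments only; anything else on a line is ignored.
    Returns the dictionary of variables that were loaded.
    """
    environ = os.environ if environ is None else environ
    loaded = {}
    with open(os.path.expanduser(path)) as f:
        for line in f:
            # shlex handles quoting and drops '#' comments.
            tokens = shlex.split(line, comments=True)
            if tokens and tokens[0] == "export":
                tokens = tokens[1:]
            for token in tokens:
                name, sep, value = token.partition("=")
                if sep and name.isidentifier():
                    loaded[name] = value
    environ.update(loaded)
    return loaded
```

Calling `load_env_file("~/.amlenvrc")` once per notebook session would then make variables such as `AML_ACS_MASTER` visible to subsequent cells.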

## Downloading the pre-trained model

Downloading the **ImageNet** CNTK pre-trained model using the ResNet_152 architecture and the labels

In [None]:
!wget "https://migonzastorage.blob.core.windows.net/deep-learning/models/cntk/imagenet/ResNet_152.model"

In [None]:
!wget "https://ikcompuvision.blob.core.windows.net/acs/synset.txt"

## Writing the driver file

The driver file needs to implement two functions, **init()** and **run(inputString)**.
In **init()** we load the model into memory.
In **run(inputString)** we parse the input image and process it through the DNN.
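
Before looking at the full driver, the contract can be illustrated with a dependency-free sketch. It is not the tutorial's driver: the "model" here is a stand-in callable that only reports the size of each decoded image, and the input format (a JSON list of base64 strings) mirrors what the real driver below parses.

```python
import base64
import json

model = None  # populated once by init()


def init():
    """Load expensive state once, when the service starts."""
    global model
    # Stand-in for load_model(...): a callable that "scores" raw bytes.
    model = lambda raw: {"num_bytes": len(raw)}


def run(inputString):
    """Score one request: inputString is a JSON list of base64 strings."""
    if model is None:
        raise RuntimeError("init() must be called before run()")
    results = []
    for b64 in json.loads(inputString):
        raw = base64.b64decode(b64)
        results.append(model(raw))
    return results


if __name__ == "__main__":
    init()
    request = json.dumps([base64.b64encode(b"hello").decode("utf-8")])
    print(run(request))
```

Calling `init()` and `run(...)` by hand like this is also a convenient way to smoke-test a driver locally before deploying it.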

In [None]:
%%writefile driver.py
import numpy as np
import logging
import sys
import json
import timeit as t
import urllib.request
import base64
from cntk import load_model, combine
from PIL import Image, ImageOps
from io import BytesIO

logger = logging.getLogger("cntk_svc_logger")
ch = logging.StreamHandler(sys.stdout)
logger.addHandler(ch)

trainedModel = None
mem_after_init = None
labelLookup = None
topResult = 3

def aml_cli_get_sample_request():
    return '{"input": ["base64Image"]}'

def init():
    global trainedModel, labelLookup, mem_after_init

    # Load the model from disk and perform evals
    # Load labels txt
    with open('synset.txt', 'r') as f:
        labelLookup = [l.rstrip() for l in f]
    
    # The pre-trained model was trained using BrainScript, so loading it
    # is not enough: we also need to select the right output node.
    # See https://github.com/Microsoft/CNTK/wiki/How-do-I-Evaluate-models-in-Python
    # Load the model, then keep only the output at index 3
    trainedModel = load_model('ResNet_152.model')
    trainedModel = combine([trainedModel.outputs[3].owner])

def run(inputString):

    start = t.default_timer()

    images = json.loads(inputString)
    result = []
    totalPreprocessTime = 0
    totalEvalTime = 0
    totalResultPrepTime = 0

    for base64ImgString in images:

        if base64ImgString.startswith('b\''):
            base64ImgString = base64ImgString[2:-1]
        base64Img = base64ImgString.encode('utf-8')

        # Preprocess the input data
        startPreprocess = t.default_timer()
        decoded_img = base64.b64decode(base64Img)
        img_buffer = BytesIO(decoded_img)
        # Load image with PIL (RGB)
        pil_img = Image.open(img_buffer).convert('RGB')
        pil_img = ImageOps.fit(pil_img, (224, 224), Image.ANTIALIAS)
        rgb_image = np.array(pil_img, dtype=np.float32)
        # Resnet trained with BGR
        bgr_image = rgb_image[..., [2, 1, 0]]
        imageData = np.ascontiguousarray(np.rollaxis(bgr_image, 2))

        endPreprocess = t.default_timer()
        totalPreprocessTime += endPreprocess - startPreprocess

        # Evaluate the model using the input data
        startEval = t.default_timer()
        imgPredictions = np.squeeze(trainedModel.eval(
            {trainedModel.arguments[0]: [imageData]}))
        endEval = t.default_timer()
        totalEvalTime += endEval - startEval

        # Only return top 3 predictions
        startResultPrep = t.default_timer()
        resultIndices = (-np.array(imgPredictions)).argsort()[:topResult]
        imgTopPredictions = []
        for i in range(topResult):
            imgTopPredictions.append(
                (labelLookup[resultIndices[i]], imgPredictions[resultIndices[i]] * 100))
        endResultPrep = t.default_timer()
        result.append(imgTopPredictions)

        totalResultPrepTime += endResultPrep - startResultPrep

    end = t.default_timer()

    logger.info("Predictions: {0}".format(result))
    logger.info("Predictions took {0} ms".format(
        round((end - start) * 1000, 2)))
    logger.info("Time distribution: preprocess={0} ms, eval={1} ms, resultPrep = {2} ms".format(round(
        totalPreprocessTime * 1000, 2), round(totalEvalTime * 1000, 2), round(totalResultPrepTime * 1000, 2)))

    actualWorkTime = round(
        (totalPreprocessTime + totalEvalTime + totalResultPrepTime) * 1000, 2)
    return (result, 'Computed in {0} ms'.format(actualWorkTime))
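
The ranking step in `run` (sort the scores in descending order, keep the first `topResult` indices, pair each with its label and scale it to a percentage) can be sketched in isolation. This version uses plain Python lists rather than NumPy so that it runs without dependencies; the function name `top_k` and the sample labels are made up for illustration.

```python
def top_k(scores, labels, k=3):
    """Return the k highest-scoring (label, percentage) pairs, best first."""
    # Indices sorted by descending score, truncated to the first k.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return [(labels[i], scores[i] * 100) for i in order]


if __name__ == "__main__":
    scores = [0.125, 0.5, 0.25, 0.0625]
    labels = ["cat", "dog", "fox", "owl"]
    print(top_k(scores, labels, k=2))
```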

## Deploying the realtime service

Run the command below until the deployment status is no longer **Running**.
Provisioning the cluster can take up to 15 minutes.

If the status is still **Running**, re-run the command until it changes.
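
Re-running the status check by hand can be replaced by a small polling loop. The sketch below is generic: `get_status` is any callable that returns the status text, for example a hypothetical wrapper that captures the output of `aml env setup -s`. The function name, the default interval and timeout, and the `"Succeeded"` value used in the demo are assumptions, not documented CLI output.

```python
import time


def wait_while_running(get_status, interval_s=30.0, timeout_s=1200.0,
                       sleep=time.sleep):
    """Poll get_status() until its text no longer contains 'running'.

    Returns the final status text, or raises TimeoutError once at least
    timeout_s seconds have been spent sleeping.
    """
    waited = 0.0
    while True:
        status = get_status()
        if "running" not in status.lower():
            return status
        if waited >= timeout_s:
            raise TimeoutError("still running after {} s".format(waited))
        sleep(interval_s)
        waited += interval_s


if __name__ == "__main__":
    # Demo with canned statuses instead of a real cluster.
    fake = iter(["Running", "Running", "Succeeded"])
    print(wait_while_running(lambda: next(fake), interval_s=0.01))
```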

In [None]:
!aml env setup -s $ACS_deployment_key

If the deployment of the cluster is completed, switch to cluster mode, **otherwise, wait**.

Make sure that the **AML_ACS_MASTER** and **AML_ACS_AGENT** environment variables are specified in the configuration file.
If they are missing, **it probably means your cluster deployment is still running**.

In [None]:
!cat ~/.amlenvrc

Adding the fingerprint of the master using the env variable AML_ACS_MASTER in the list of known_hosts

In [None]:
! . ~/.amlenvrc && ssh-keyscan -p 2200 $AML_ACS_MASTER >> ~/.ssh/known_hosts

Switching the environment to cluster

In [None]:
!cat ~/.amlenvrc

In [None]:
!echo '\n' | aml env cluster

Creating the realtime service

In [None]:
service_name = 'cntkservice'
!aml service create realtime -r cntk-py -f driver.py -m ResNet_152.model -d synset.txt -n $service_name

In [None]:
!aml service view realtime $service_name -v

### **/!\** **/!\** **/!\** Update the **CLUSTER_SCORING_URL** with the URL you obtained above

In [None]:
CLUSTER_SCORING_URL = "http://YOUR_SCORING_URL:9091/score"

In [None]:
if "YOUR_SCORING_URL" in CLUSTER_SCORING_URL:
    print("/!\\ STOP /!\\ You need to modify the value above to contain your scoring url")
else:
    print("You are good to go! :)")

## Score images against the network

In [None]:
import base64
import urllib.request
import requests
import json
import matplotlib.pyplot as plt
from PIL import Image, ImageOps
from io import BytesIO
%matplotlib inline

In [None]:
def url_img_to_json_img(url):
    bytfile = BytesIO(urllib.request.urlopen(url).read())
    img = Image.open(bytfile).convert('RGB')  # 3 Channels
    img = ImageOps.fit(img, (224, 224), Image.ANTIALIAS)  # Fixed size 
    plt.imshow(img)
    imgio = BytesIO()
    img.save(imgio, 'PNG')
    imgio.seek(0)
    dataimg = base64.b64encode(imgio.read())
    return json.dumps(
        {'input':'[\"{0}\"]'.format(dataimg.decode('utf-8'))})
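
Note that the payload is nested: the value of `input` is itself a JSON-encoded string containing a list of base64 images. The sketch below reproduces that encoding for arbitrary bytes and decodes it again, assuming (as the driver's `json.loads(inputString)` suggests) that the service hands the value of `input` to `run`. The helper names are hypothetical.

```python
import base64
import json


def bytes_to_payload(raw):
    """Encode raw image bytes the same way url_img_to_json_img does."""
    b64 = base64.b64encode(raw).decode("utf-8")
    # The inner json.dumps produces the string '["<base64>"]'.
    return json.dumps({"input": json.dumps([b64])})


def payload_to_bytes(payload):
    """Reverse the encoding: outer JSON object, inner JSON list, base64."""
    inner = json.loads(payload)["input"]  # still a JSON string at this point
    return [base64.b64decode(item) for item in json.loads(inner)]


if __name__ == "__main__":
    payload = bytes_to_payload(b"hello")
    print(payload)
    print(payload_to_bytes(payload))
```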

Set the headers

In [None]:
HEADERS = {'content-type': 'application/json',
           'X-Marathon-App-Id': '/{}'.format(service_name)}

You can use your own image by uploading the image using the `Data` button in the notebook toolbar then using the name of the image, for example `image.png` instead of the URL below

In [None]:
image_url = 'http://thomasdelteillondon.blob.core.windows.net/public/shuttle.jpg'
jsondata = url_img_to_json_img(image_url)

Posting the actual request to the cluster

In [None]:
res = requests.post(CLUSTER_SCORING_URL, data=jsondata, headers=HEADERS)

Scoring results

In [None]:
print(json.dumps(res.json(), indent=4))

## Load testing
Let's see how fast the service can process requests in parallel.

In [None]:
import random
import asyncio
from aiohttp import ClientSession
import json

In [None]:
async def fetch(url, session):
    async with session.post(url, headers={
        "content-type":"application/json",
        "X-Marathon-App-Id":"/{}".format(service_name)
    }, data=jsondata) as response:
        date = response.headers.get("DATE")
        #print("{}:{}".format(date, response.url))
        return await response.read()


async def bound_fetch(sem, url, session):
    # Getter function with semaphore.
    async with sem:
        await fetch(url, session)


async def run(r):
    url = CLUSTER_SCORING_URL
    tasks = []
    # create instance of Semaphore
    sem = asyncio.Semaphore(1000)

    # Create a client session so that we don't open a new connection
    # for each request.
    async with ClientSession() as session:
        for i in range(r):
            # pass the semaphore and session to every POST request
            task = asyncio.ensure_future(bound_fetch(sem, url, session))
            tasks.append(task)

        responses = asyncio.gather(*tasks)
        await responses

Let's run the load test

In [None]:
%%time
number = 30
loop = asyncio.get_event_loop()

future = asyncio.ensure_future(run(number))
loop.run_until_complete(future)
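
With a semaphore of 1000 and only 30 requests, the limit above is never reached, so all requests are in flight at once. To see how the semaphore bounds concurrency, here is a self-contained sketch that swaps the HTTP call for `asyncio.sleep` and records the peak number of simultaneous "requests". The names `fake_request`, `bounded` and `load_test` are illustrative only.

```python
import asyncio
import time


async def fake_request(delay_s, stats):
    """Stand-in for session.post(...): tracks how many calls overlap."""
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(delay_s)
    stats["in_flight"] -= 1


async def bounded(sem, delay_s, stats):
    # At most `limit` coroutines get past this point at the same time.
    async with sem:
        await fake_request(delay_s, stats)


async def load_test(total, limit, delay_s=0.01):
    """Run `total` fake requests with at most `limit` in flight."""
    sem = asyncio.Semaphore(limit)
    stats = {"in_flight": 0, "peak": 0}
    start = time.perf_counter()
    await asyncio.gather(*(bounded(sem, delay_s, stats) for _ in range(total)))
    return stats["peak"], time.perf_counter() - start


if __name__ == "__main__":
    peak, elapsed = asyncio.run(load_test(total=30, limit=5))
    print("peak concurrency:", peak, "elapsed: {:.3f} s".format(elapsed))
```

In a notebook whose kernel already runs an event loop, `asyncio.run` raises a `RuntimeError`; there, `await load_test(total=30, limit=5)` in a cell is the equivalent call.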

## Cleanup

Delete the resource group. This can take several minutes and shows no output while it runs.

In [None]:
resource_group = name+"rg"
!az group delete --yes -n $resource_group

In [None]:
!ps aux | grep ssh

In [None]:
!cat ~/.ssh/acs_id_rsa