[![Azure Notebooks](https://notebooks.azure.com/launch.png)](https://notebooks.azure.com/import/gh/microsoft/AI-Utilities)
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/microsoft/AI-Utilities/deep_learning_2?filepath=notebooks%2Fai-deep-realtime-score.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](http://colab.research.google.com/github/microsoft/AI-Utilities/blob/deep_learning_2/notebooks/ai-deep-realtime-score.ipynb)

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Deploy-Solution" data-toc-modified-id="Deploy-Solution-1">Deploy Solution</a></span><ul class="toc-item"><li><span><a href="#Create-Configuration" data-toc-modified-id="Create-Configuration-1.1">Create Configuration</a></span></li><li><span><a href="#Create-Train.py" data-toc-modified-id="Create-Train.py-1.2">Create Train.py</a></span></li><li><span><a href="#Create-Score.py" data-toc-modified-id="Create-Score.py-1.3">Create Score.py</a></span></li><li><span><a href="#Deploy-to-Azure-Kubernetes-Service-with-Azure-ML" data-toc-modified-id="Deploy-to-Azure-Kubernetes-Service-with-Azure-ML-1.4">Deploy to Azure Kubernetes Service with Azure ML</a></span></li></ul></li><li><span><a href="#Deploy-Services" data-toc-modified-id="Deploy-Services-2">Deploy Services</a></span><ul class="toc-item"><li><span><a href="#Machine-Learning-Studio" data-toc-modified-id="Machine-Learning-Studio-2.1">Machine Learning Studio</a></span></li><li><span><a href="#Kubernetes" data-toc-modified-id="Kubernetes-2.2">Kubernetes</a></span></li><li><span><a href="#Application-Insights" data-toc-modified-id="Application-Insights-2.3">Application Insights</a></span><ul class="toc-item"><li><span><a href="#Main" data-toc-modified-id="Main-2.3.1">Main</a></span></li><li><span><a href="#Availability" data-toc-modified-id="Availability-2.3.2">Availability</a></span></li><li><span><a href="#Performance-Dashboard" data-toc-modified-id="Performance-Dashboard-2.3.3">Performance Dashboard</a></span></li><li><span><a href="#Load-Test" data-toc-modified-id="Load-Test-2.3.4">Load Test</a></span></li></ul></li></ul></li></ul></div>

## Overview
The [az-deep-realtime-score repository](https://github.com/microsoft/az-deep-realtime-score) contains Jupyter notebook tutorials with step-by-step instructions for deploying a pretrained deep learning model on a GPU-enabled Kubernetes cluster through Azure Machine Learning (AzureML). The tutorials cover deploying models built with the Keras deep learning framework (TensorFlow backend).

![Example classification](https://happypathspublic.blob.core.windows.net/aksdeploymenttutorialaml/example.png "Example Classification")
 
We go through the following steps:
 * Create an AzureML Workspace
 * Develop the model: load the pretrained model and test it by scoring images
 * Develop the API that will call our model
 * Deploy on Azure Kubernetes Service (AKS)
     * Create our Kubernetes cluster and deploy our application to it
     * Test the deployed model
     * Test the throughput of our model
     * Clean up resources

## Design
![](https://docs.microsoft.com/en-us/azure/architecture/reference-architectures/ai/_images/python-model-architecture.png)

As described on the associated [Azure Reference Architecture page](https://docs.microsoft.com/en-us/azure/architecture/reference-architectures/ai/realtime-scoring-python), the application we will develop is a simple image classification service, where we will submit an image and get back what class the image belongs to. The application flow for the deep learning model is as follows:
1.	The deep learning model is registered in the AzureML model registry.
1.	AzureML creates a Docker image that includes the model and the scoring script.
1.	AzureML deploys the scoring image on the chosen deployment compute target (AKS) as a web service.
1.	The client sends an HTTP POST request with the encoded image data.
1.	The web service created by AzureML preprocesses the image data and sends it to the model for scoring.
1.	The predicted categories and their scores are returned to the client.
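
Steps 5 and 6 can be illustrated with a small standalone sketch. It assumes the model emits one probability per class and that the service returns the highest-scoring categories. The function name `top_categories`, the label list, the probabilities, and the response shape are all hypothetical and are not taken from the AzureML web service.

```python
import json


def top_categories(probabilities, labels, k=3):
    """Return the k highest-scoring (label, score) pairs, best first."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical model output for a single image (made-up values).
labels = ["lynx", "tabby", "tiger cat", "Egyptian cat"]
probabilities = [0.91, 0.04, 0.03, 0.02]

# Hypothetical response body: one entry per submitted image.
response_body = json.dumps({"image": top_categories(probabilities, labels, k=2)})
print(response_body)
```

Sorting the full list is fine for a thousand ImageNet classes; a heap-based selection would only matter for much larger label sets.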


**NOTE**: This tutorial walks step by step through deploying a deep learning model on Azure; it **does not** include enterprise best practices such as securing the endpoints or setting up remote logging.

**Deploying with GPUs:** For a detailed comparison of deployments of various deep learning models, see [this blog post](https://azure.microsoft.com/en-us/blog/gpus-vs-cpus-for-deployment-of-deep-learning-models/), which provides evidence that, at least in the scenarios tested, GPUs provide better throughput and stability at a lower cost.

## Quick Start
Run this notebook to quickly set up the entire solution and explore the custom train.py and score.py used in this example.

Run all the cells to get started. 
> _**Action Required:** You will be prompted to log in to Azure before the widget is displayed and the notebook can continue._

### Sample Configuration Widget
Run the following code to display the configuration widget, which collects the settings for the Azure Machine Learning deployment. Enter the configuration settings, or upload an existing project.yml.


### Create Configuration Widget


In [1]:
import os

from azure_utils.notebook_widgets.notebook_configuration_widget import get_configuration_widget, test_train_py_button, test_score_py_button, deploy_button

project_configuration = "notebook_project.yml"
os.makedirs("script", exist_ok=True)
os.makedirs("source", exist_ok=True)
configuration_widget = get_configuration_widget(project_configuration)
configuration_widget

VBox(children=(Output(), FileUpload(value={}, accept='.yml', description='Upload'), Text(value='0ca618d2-22a8-…

### Create Train.py

The following code builds a ResNet152 model with pretrained ImageNet weights and saves the weights to the `outputs` directory. For more details, see this [sample notebook](https://github.com/microsoft/az-deep-realtime-score/blob/master/%7B%7Bcookiecutter.project_name%7D%7D/Keras_Tensorflow/01_DevelopModel.ipynb).

In [2]:
%%writefile script/train_dl.py
import os

from azure_utils.samples.deep_rts_samples import MakeResNet152

if __name__ == "__main__":
    import warnings

    # Silence FutureWarning noise raised while TensorFlow is imported.
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", category=FutureWarning)
        import tensorflow as tf
        # tf.logging is the TensorFlow 1.x logging API.
        tf.logging.set_verbosity(tf.logging.ERROR)

        # Build ResNet152 with pretrained ImageNet weights and save the weights.
        os.makedirs("outputs", exist_ok=True)
        model = MakeResNet152(include_top=False, input_shape=(200, 200, 3), pooling="avg", weights="imagenet")
        model.save_weights("outputs/model.pkl")


Overwriting script/train_dl.py


> _**Optional:** Test the training code locally before deployment. This is particularly useful when authoring a train.py file._

In [3]:
test_train_py_button(train_py="script/train_dl.py")

Button(description='Test train.py', style=ButtonStyle())

Output()
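
One way to check a training script outside the notebook widget is to run it in a subprocess and confirm that it wrote something to its output directory. The following is a minimal sketch under that assumption, not a description of how `test_train_py_button` works. The helper `run_training_script`, its parameters, and the stand-in script are hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_training_script(script_path, output_dir="outputs", cwd="."):
    """Run a script with the current interpreter.

    Returns (ok, stderr), where ok is True only if the script exited with
    code 0 and left at least one entry in output_dir (relative to cwd).
    """
    result = subprocess.run(
        [sys.executable, str(script_path)],
        cwd=str(cwd),
        capture_output=True,
        text=True,
    )
    out = Path(cwd) / output_dir
    produced = out.is_dir() and any(out.iterdir())
    return result.returncode == 0 and produced, result.stderr


# Demo with a stand-in script, so the sketch runs without TensorFlow.
with tempfile.TemporaryDirectory() as workdir:
    stand_in = Path(workdir) / "train_stub.py"
    stand_in.write_text(
        "import os\n"
        "os.makedirs('outputs', exist_ok=True)\n"
        "open('outputs/model.bin', 'w').close()\n"
    )
    ok, stderr = run_training_script("train_stub.py", cwd=workdir)
    print("training script succeeded:", ok)
```

For the script in this notebook, the equivalent call would be `run_training_script("script/train_dl.py")`, which requires TensorFlow and the `azure_utils` package to be installed.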

### Create Score.py
The scoring script is used to create a REST service. It loads the model once and then uses it to make predictions on incoming requests. For more details, see this [sample notebook](https://github.com/microsoft/az-deep-realtime-score/blob/master/%7B%7Bcookiecutter.project_name%7D%7D/Keras_Tensorflow/02_DevelopModelDriver.ipynb).

In [4]:
%%writefile source/score.py

import sys
sys.setrecursionlimit(3000)

from azureml.contrib.services.aml_request import rawhttp
from azureml.contrib.services.aml_response import AMLResponse

def init():
    """ Initialise the model and scoring function
    """
    global process_and_score
    from azure_utils.samples.deep_rts_samples import get_model_api

    process_and_score = get_model_api()


@rawhttp
def run(request):
    """ Make a prediction based on the data passed in using the preloaded model
    """
    from azure_utils.machine_learning.realtime import default_response
    if request.method == 'POST':
        return process_and_score(request.files)
    return default_response(request)

Overwriting source/score.py
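
The `init()`/`run()` contract used by the scoring script can be tried without any Azure packages. The sketch below mirrors the structure of score.py with stand-ins: a stub replaces the real model, `SimpleNamespace` objects replace the HTTP request, and plain bytes stand in for uploaded files. None of these stand-ins come from the AzureML SDK, and the fallback message is invented for illustration.

```python
from types import SimpleNamespace

process_and_score = None  # set by init(), mirroring the global in score.py


def init():
    """Load the 'model' once; here a stub that reports the size of each upload."""
    global process_and_score

    def _stub_model(files):
        return {name: len(data) for name, data in files.items()}

    process_and_score = _stub_model


def run(request):
    """Score POST requests; answer anything else with a short status message."""
    if request.method == "POST":
        return process_and_score(request.files)
    return "Send a POST request with an image file."


init()  # the hosting service calls this once at startup
post = SimpleNamespace(method="POST", files={"image": b"\x89PNG fake bytes"})
get = SimpleNamespace(method="GET", files={})
print(run(post))
print(run(get))
```

Keeping the model in a module-level global means the expensive load happens once in `init()` rather than on every request.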


> _**Optional:** Test the scoring code locally before deployment. This is particularly useful when authoring a score.py file._

In [5]:
test_score_py_button(score_py="source/score.py")

Button(description='Test score.py', style=ButtonStyle())

Output()

### Deploy to Azure Kubernetes Service with Azure ML

Train the model locally, and then deploy the web service to an Azure Kubernetes Service (AKS) cluster managed by an Azure Machine Learning Workspace. Use the output widget to explore the deployed resources.
For more details, see this [sample notebook](https://github.com/microsoft/az-deep-realtime-score/blob/master/%7B%7Bcookiecutter.project_name%7D%7D/Keras_Tensorflow/aks/04_DeployOnAKS.ipynb).

> _**Action Required:** Press the button to deploy the scripts to Azure Machine Learning._

In [6]:
deploy_button(project_configuration, train_py="train_dl.py", score_py="score.py")

Button(description='Deploy Service', style=ButtonStyle())

Output()

For more details on how to test the endpoint see this [sample notebook](https://github.com/microsoft/az-deep-realtime-score/blob/master/%7B%7Bcookiecutter.project_name%7D%7D/Keras_Tensorflow/aks/05_TestWebApp.ipynb).
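
Because `run()` reads `request.files`, a client would typically send the image as a multipart form upload. The sketch below builds such a request with the standard library only and does not send it. The form field name `image`, the placeholder URL, and the placeholder key are assumptions for illustration; the bearer-token header applies when key-based authentication is enabled on the service.

```python
import urllib.request
import uuid


def build_image_request(url, image_bytes, filename="image.jpg", api_key=None):
    """Build (but do not send) a multipart/form-data POST carrying one image."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        image_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    if api_key:
        # Key-authenticated services expect the key as a bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_image_request(
    "http://example.invalid/score",  # placeholder: replace with the real scoring URI
    b"fake image bytes",             # placeholder: replace with the image file contents
    api_key="<service-key>",         # placeholder: replace with the service key
)
print(request.get_method(), request.get_header("Content-type"))
# To send it once the placeholders are replaced:
# urllib.request.urlopen(request).read()
```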
 
For more details on how to load test the endpoint see this [sample notebook](https://github.com/microsoft/az-deep-realtime-score/blob/master/%7B%7Bcookiecutter.project_name%7D%7D/Keras_Tensorflow/aks/06_SpeedTestWebApp.ipynb).
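
A throughput measurement can be approximated by issuing many calls concurrently and timing them. The following sketch uses a thread pool and a stand-in function that sleeps instead of calling the endpoint. `measure_throughput`, `fake_scoring_call`, and the request counts are hypothetical and are not taken from the linked notebook.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_throughput(call, n_requests=20, concurrency=4):
    """Invoke `call` n_requests times across worker threads and time it."""

    def timed_call(_):
        start = time.perf_counter()
        call()
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(timed_call, range(n_requests)))
    elapsed = time.perf_counter() - wall_start

    return {
        "requests_per_second": n_requests / elapsed,
        "mean_latency_seconds": sum(latencies) / len(latencies),
        "max_latency_seconds": max(latencies),
    }


def fake_scoring_call():
    """Stand-in for one HTTP request to the scoring endpoint."""
    time.sleep(0.01)


stats = measure_throughput(fake_scoring_call, n_requests=20, concurrency=4)
print(stats)
```

Replacing `fake_scoring_call` with a function that posts an image to the deployed service turns this into a rough load test; threads are adequate here because the work is dominated by waiting on the network.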

Notebook Finished.