Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# Creating and Updating a Docker Image before Deployment as a Webservice

This notebook demonstrates how to make changes to an existing Docker image before deploying it as a webservice.

Knowing how to do this can be helpful, for example, if you need to debug the execution script of a webservice you're developing and debugging involves several iterations of code changes. In that case, redeploying the webservice at every iteration is impractical, because each deployment takes long enough to slow you down significantly. In some cases it may be easier to simply run the execution script on the command line, but this is not an option if your script accumulates data across individual calls.

We describe the following process:

1. Configure your Azure Workspace.
2. Create a Docker image, using the Azure ML SDK.
3. Test your Application by running the Docker container locally.
4. Update the execution script inside your running Docker container.
5. Commit the changes in your Docker container to the Docker image and update the Azure Container Registry (ACR).
6. Deploy your Docker image as an Azure Container Instance ([ACI](https://azure.microsoft.com/en-us/services/container-instances/)) Webservice.

### Prerequisites
- You need to have an [Azure](https://azure.microsoft.com) subscription.

**Note:** This code was tested on a Data Science Virtual Machine ([DSVM](https://azure.microsoft.com/en-us/services/virtual-machines/data-science-virtual-machines/)), running Ubuntu Linux 16.04 (Xenial).

## Configure your Azure Workspace

We need to set up your workspace, and make sure we have access to it from here.

This requires that you have downloaded the ***config.json*** configuration file for your Azure workspace.

Follow this [Quickstart](https://docs.microsoft.com/en-us/azure/machine-learning/service/quickstart-get-started) to set up your workspace and to download the config.json file, which contains information about the workspace you just created. Save the file in the same directory as this notebook.
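
As a quick sanity check before connecting, the downloaded file can be inspected with plain Python. The sketch below assumes that config.json holds the keys `subscription_id`, `resource_group`, and `workspace_name`; the helper name `check_workspace_config` is hypothetical and not part of the Azure ML SDK.

```python
import json

# Keys we assume config.json contains
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def check_workspace_config(config_text):
    """Parse config.json text and return the list of missing or empty keys."""
    config = json.loads(config_text)
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Example with placeholder values (not real identifiers)
example = json.dumps({
    "subscription_id": "<subscription-id>",
    "resource_group": "<resource-group>",
    "workspace_name": "",
})
print(check_workspace_config(example))
```

To check the real file, pass the text of config.json instead of the placeholder example, for instance `open("config.json").read()`.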

In [None]:
# %matplotlib inline  
import azureml.core


# Check core SDK version number - based on build number of preview/master.
print("SDK version:", azureml.core.VERSION)

Let's make sure that you have the correct version of the Azure ML SDK installed on your workstation or VM. If you don't have the right version, please follow these [Installation Instructions](https://docs.microsoft.com/en-us/azure/machine-learning/service/quickstart-create-workspace-with-python#install-the-sdk).

In [None]:
subscription_id = ""
resource_group = ""
workspace_name = ""
location = ""

# Import the Workspace class and create the workspace.
# With exist_ok=True, an existing workspace of the same name is reused instead of raising an error.

from azureml.core import Workspace

ws = Workspace.create(name = workspace_name,
                      subscription_id = subscription_id,
                      resource_group = resource_group, 
                      location = location,
                      exist_ok=True)

ws.get_details()
ws.write_config()

## Create a Docker image using the Azure ML SDK

### Create a template execution script for your application

We are going to start with just a barebones execution script for your webservice. This script is only for testing: it does nothing but keep a running average of the values sent to it.

In [None]:
%%writefile score.py

import json  # requests and responses are exchanged as JSON via the RESTful API

# The init function is only run once, when the webservice (or Docker container) is started
def init():
    global running_avg, curr_n
    
    running_avg = 0.0
    curr_n = 0

# the run function is run every time we interact with the service
def run(raw_data):
    """
    Calculates a running average using the incremental mean update from Welford's online algorithm.
    https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online

    Once more than n_arg values have been seen, n is capped at n_arg,
    so older values are down-weighted rather than dropped.

    :param raw_data: raw_data should be a json query containing a dictionary with the key 'value'
    :return: running_avg (float, json response)
    """
    global running_avg, curr_n
    
    value = json.loads(raw_data)['value']
    n_arg = 5  # effective window size: each new value gets a weight of at least 1/n_arg
    
    curr_n += 1
    n = min(curr_n, n_arg) # in case we don't have "n" measures yet
    
    running_avg += (value - running_avg) / n
    
    return json.dumps(running_avg)
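
Because the service keeps state between calls, it helps to replay the update rule locally before building an image. The following sketch copies the arithmetic from `run()` into a hypothetical helper, `replay_running_avg`, and applies it to a short sequence; it does not import score.py or touch the webservice.

```python
def replay_running_avg(values, n_arg=5):
    """Apply the same update as run() to each value and return every intermediate result."""
    running_avg = 0.0
    curr_n = 0
    history = []
    for value in values:
        curr_n += 1
        n = min(curr_n, n_arg)  # n stops growing once n_arg values have been seen
        running_avg += (value - running_avg) / n
        history.append(running_avg)
    return history


history = replay_running_avg([1, 2, 3, 4, 5, 6])
print(history)
```

For the first `n_arg` values the result equals the plain cumulative mean. After that, each new value is blended in with weight `1 / n_arg`, so the result is a smoothed average rather than an exact mean of the last `n_arg` values.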

### Create environment file for your Conda environment

Next, create an environment file (environment.yml) that specifies all the Python dependencies of your script. This file is used to ensure that all of those dependencies are installed in the Docker image. Let's assume your webservice requires ``scikit-learn``, ``pynacl``, ``pyculiarity``, ``pandas``, and ``numpy``.

In [None]:
from azureml.core.conda_dependencies import CondaDependencies 

myenv = CondaDependencies()
myenv.add_conda_package("scikit-learn")
myenv.add_pip_package("pynacl==1.2.1")
myenv.add_pip_package("pyculiarity")
myenv.add_pip_package("pandas")
myenv.add_pip_package("numpy")

with open("environment.yml","w") as f:
    f.write(myenv.serialize_to_string())

Review the content of the `environment.yml` file.

In [None]:
with open("environment.yml","r") as f:
    print(f.read())

### Create the initial Docker image

We use the ``environment.yml`` and ``score.py`` files from above, to create an initial Docker image.

In [None]:
%%time

from azureml.core.image import ContainerImage

# configure the image; the execution script is the score.py file written above
image_config = ContainerImage.image_configuration(execution_script = "score.py", 
                                                  runtime = "python",
                                                  conda_file = "environment.yml")

# create the docker image. this should take less than 5 minutes
image = ContainerImage.create(name = "my-docker-image",
                              image_config = image_config,
                              models = [],
                              workspace = ws)

# we wait until the image has been created
image.wait_for_creation(show_output=True)

# let's save the image location
imageLocation = image.serialize()['imageLocation']
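
Steps 3 to 5 of the process outlined at the top (run locally, update the script, commit and push) rely on the Docker CLI rather than the SDK. The sketch below only assembles the commands as argument lists and prints them; nothing is executed. The container name, the host port, the container port 5001, and the in-container path of the execution script are assumptions that should be checked against your own image, and `docker_update_commands` is a hypothetical helper. Pulling from and pushing to ACR also requires a prior `docker login` against your registry.

```python
import shlex


def docker_update_commands(image_location,
                           container_name="my-test-container",
                           host_port=8080,
                           container_port=5001,
                           script_path="/var/azureml-app/score.py"):
    """Return the Docker CLI calls for a local edit-test-commit cycle.

    All defaults are assumptions: container_name is a hypothetical name,
    host_port is any free port on the host, container_port is the port the
    scoring server is assumed to listen on, and script_path is the assumed
    location of the execution script inside the image.
    """
    return [
        # 3. start a container locally from the image created above
        ["docker", "run", "-d", "--name", container_name,
         "-p", f"{host_port}:{container_port}", image_location],
        # 4. copy the edited execution script into the running container, then restart it
        ["docker", "cp", "score.py", f"{container_name}:{script_path}"],
        ["docker", "restart", container_name],
        # 5. commit the container state to the image tag and push it to the registry
        ["docker", "commit", container_name, image_location],
        ["docker", "push", image_location],
    ]


# Placeholder image location; in this notebook it would be the imageLocation variable
for cmd in docker_update_commands("<registry>.azurecr.io/my-docker-image:1"):
    print(shlex.join(cmd))
```

Once the assumed values have been verified, each list can be passed to `subprocess.run(cmd, check=True)`.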

## Deploy container

Let's try to deploy the container to ACI, just to make sure everything behaves as expected.

In [None]:
%%time
from azureml.core.image import ContainerImage
from azureml.core.webservice import AciWebservice, Webservice

# create configuration for ACI
aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, 
                                               memory_gb=1, 
                                               tags={"data": "some data",  "method" : "machine learning"}, 
                                               description="Does machine learning on some data")
# retrieve the registered image by name (the image above was created without tags)
image = ContainerImage(ws, name='my-docker-image')

# deploy webservice
service_name = 'my-web-service'
service = Webservice.deploy_from_image(deployment_config = aciconfig,
                                            image = image,
                                            name = service_name,
                                            workspace = ws)
service.wait_for_deployment(show_output = True)
print(service.state)


In [None]:
import json
import requests
import numpy as np
import matplotlib.pyplot as plt

n = 10  # sent with each request; note that score.py above ignores it and uses its own n_arg = 5
values = np.random.normal(0,1,100)
values = np.cumsum(values)


running_avgs = []

for value in values:
    raw_data = json.dumps({"value": value, "n": n})
    raw_data = bytes(raw_data, encoding = 'utf8')
    
    # predict using the deployed model
    result = json.loads(service.run(input_data=raw_data))

    running_avgs.append(result)

plt.plot(values)
plt.plot(running_avgs)
plt.show()

### Load telemetry data

In [None]:
import pandas as pd 
# import os 

print("Reading data ... ", end="")
telemetry = pd.read_csv('/dbfs/FileStore/tables/telemetry.csv')  # , parse_dates=['datetime'])
print("Done.")

print("Parsing datetime...", end="")
telemetry['datetime'] = pd.to_datetime(telemetry['datetime'], format="%m/%d/%Y %I:%M:%S %p")
telemetry.columns = ['timestamp', 'machineID', 'volt', 'rotate', 'pressure', 'vibration']
print("Done.")

## Clean up resources

To keep the resource group and workspace for other tutorials and exploration, you can delete only the ACI deployment using this API call:

In [None]:
service.delete()

# The end
