Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# Model deployment to ACI

We finished the last Notebook by finding the best-fitting model using automated ML and registering it to our Azure ML account. In this Notebook, we deploy this model to an ACI instance and test it by scoring data against it. Scoring here happens in near-realtime, meaning that the data we score is pre-computed for us (for example, by a nightly batch job). Scoring can also happen in realtime, but as we will explore in a later Notebook, this requires more work. For predictive maintenance, realtime scoring is usually not needed, because models are used to predict when a machine is *about to* fail, which gives us some time to run unscheduled maintenance and replace parts.

## Create Experiment

As part of the setup we have already created an AML workspace. Let's load the workspace and create an experiment.

In [None]:
import json
import logging
import os
import random
import warnings
warnings.filterwarnings('ignore')

import numpy as np
import pandas as pd
from sklearn import datasets

import azureml.core
from azureml.core.experiment import Experiment
from azureml.core.workspace import Workspace
from azureml.train.automl import AutoMLConfig
from azureml.train.automl.run import AutoMLRun

We load the workspace directly from the config file we created in the early part of the course.

In [None]:
home_dir = os.path.expanduser('~')
config_path = os.path.join(home_dir, 'aml_config')
ws = Workspace.from_config(path = config_path)

experiment_name =  'pred-maint-automl' # choose a name for experiment
project_folder = '.' # project folder

experiment = Experiment(ws, experiment_name)

output = {}
output['SDK version'] = azureml.core.VERSION
output['Subscription ID'] = ws.subscription_id
output['Workspace'] = ws.name
output['Resource Group'] = ws.resource_group
output['Location'] = ws.location
output['Project Directory'] = project_folder
output['Experiment Name'] = experiment.name
pd.set_option('display.max_colwidth', -1)
pd.DataFrame(data = output, index = ['']).T

In [None]:
import azureml.core

print("SDK Version:", azureml.core.VERSION)

Next we load the test data, not to evaluate the model but only to use it to get some predictions from our deployed model.

In [None]:
%store -r X_test
%store -r y_test

## Create a scoring script

The first part of the deployment consists of pointing to the model we want to deploy. We can simply provide the model name, which was given to us at the time we registered the model. We can also go to the Azure portal to look up the model name.

Here's a quick sanity check to ensure that the model exists and can be loaded (loading the model in the current session is not required for deployment).

**Note**: You have to update the `model_name` below. You can find the model name in your workspace in the Azure portal.

In [None]:
from azureml.core.model import Model
model_name = "ENTER_REGISTERED_MODEL_NAME"

model = Model(workspace = ws, name = model_name)
print(model.id)

We now create a scoring script that will run every time we make a call to the deployed model. The scoring script consists of an `init` function that will load the model and a `run` function that will load the data we provide at score time and use the model to obtain predictions.

**Note**: You have to update `model_name` below by replacing the placeholder with the name of your registered model. You can find the model name in your workspace in the Azure portal.

In [None]:
%%writefile score.py
import pickle
import json
import numpy
from sklearn.externals import joblib
from azureml.core.model import Model
import azureml.train.automl

model_name = "ENTER_REGISTERED_MODEL_NAME"

def init():
    global model
    model_path = Model.get_model_path(model_name = model_name) # the name of the registered model that we want to deploy
    # model_path = Model.get_model_path('model.pkl') # select this if deploying model from file
    # deserialize the model file back into a sklearn model
    model = joblib.load(model_path)

def run(rawdata):
    try:
        data = json.loads(rawdata)['data']
        data = numpy.array(data)
        result = model.predict(data)
    except Exception as e:
        result = str(e)
        return json.dumps({"error": result})
    return json.dumps({"result":result.tolist()})
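
Before building an image, it can help to exercise the `run` logic locally. The following is a minimal sketch, not part of the deployed script: `StubModel` is a hypothetical stand-in for the deserialized AutoML model, and plain lists replace `numpy` arrays so that the block has no dependencies. It shows the payload shape that `run` expects and the two response shapes it can return.

```python
import json


class StubModel:
    """Hypothetical stand-in for the deserialized model (assumption)."""

    def predict(self, rows):
        # Pretend a row is a failure (1) when its feature sum exceeds 10.
        return [1 if sum(row) > 10 else 0 for row in rows]


model = StubModel()


def run(rawdata):
    """Mirror of the scoring logic: JSON string in, JSON string out."""
    try:
        data = json.loads(rawdata)['data']
        result = model.predict(data)
    except Exception as e:
        return json.dumps({"error": str(e)})
    return json.dumps({"result": list(result)})


good = run(json.dumps({"data": [[1, 2, 3], [5, 6, 7]]}))
bad = run(json.dumps({"rows": [[1, 2, 3]]}))  # payload without the "data" key
print(good)
print(bad)
```

A payload without the `data` key does not raise; it comes back as a JSON document with an `error` field, which is also how a caller of the deployed service would see the failure.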

## Create a conda environment file

We begin by retrieving the run ID for the automl experiment we ran in the last Notebook and pasting it in for the `run_id` argument in the `AutoMLRun` function below.

**Note**: You have to update `run_id` below. You can find the `run_id` under the experiments in the Azure portal.

In [None]:
ml_run = AutoMLRun(experiment = experiment, run_id = 'ENTER_AUTOML_RUN_ID')

Next we create a `yml` file for the conda environment that will be used to run the scoring script above. To ensure consistency of the scored results with the training results, the SDK dependencies need to mirror the development environment (used for model training). In the code snippet below we need to set `best_iter_num` to the number of the best iteration from the AutoML run.

In [None]:
best_iter_num = 6 # change this to the desired iteration run
dependencies = ml_run.get_run_sdk_dependencies(iteration = best_iter_num)

In [None]:
dependencies

In [None]:
for p in ['azureml-train-automl', 'azureml-sdk', 'azureml-core']:
    print('{}\t{}'.format(p, dependencies[p]))

We can automatically generate a Conda dependencies YAML file. This can serve as a baseline, which we can then modify to create the Conda environment we want to use in production. Note that there may be a lot of packages that were required during training but aren't needed in production for scoring.

In [None]:
from azureml.core.conda_dependencies import CondaDependencies 

myenv = CondaDependencies.create(conda_packages=['scikit-learn'])
# myenv.add_pip_package("azureml-train")
# myenv.add_pip_package("azureml-train-automl")
# myenv.add_pip_package("azureml-core")

with open("myenv.yml","w") as f:
    f.write(myenv.serialize_to_string())

In [None]:
!cat myenv.yml

We make modifications to the above file to get the following Conda dependencies file. 

In [None]:
%%writefile myenv.yml
name: myenv
channels:
  - defaults
dependencies:
  - pip:
    - scikit-learn==0.19.1
    - azureml-sdk[automl]==0.1.74

In [None]:
!cat myenv.yml

## Create a docker image

Using the scoring script and conda environment file, we can now create a docker image that will host the scoring script and a Python executable that meets the conda requirement dependencies laid out in the YAML file.

In [None]:
from azureml.core.image import Image, ContainerImage

image_config = ContainerImage.image_configuration(runtime = "python",
                                 execution_script = "score.py",
                                 conda_file = "myenv.yml",
                                 tags = {'area': "digits", 'type': "automl_classification"},
                                 description = "Image for automl classification sample")

From the image config file above we now create a Docker image.

In [None]:
%%time
image_name = experiment_name + "-img"

image = Image.create(name = image_name,
                     models = [model], 
                     image_config = image_config, 
                     workspace = ws)

image.wait_for_creation(show_output = True)

If the image creation fails, this is how we can access the log file and examine what went wrong.

In [None]:
print(image.image_build_log_uri)

This is the image location that will be used when the image is pulled down from the container registry.

In [None]:
print(image.image_location)

Note that if the image was created in another session and we just wanted to point to it in this session, then we can just pass the image name and workspace to the `Image` function as follows:

In [None]:
image = Image(name = experiment_name + "-img", workspace = ws, version=1)
print(image.image_location)

## Deploy Image as web service on ACI

We are now ready to deploy our image as a web service on ACI. To do so, we first create a config file and then pass it to `deploy_from_image` along with a name for the service, the image we created in the last step, and our workspace.

In [None]:
from azureml.core.webservice import AciWebservice

aciconfig = AciWebservice.deploy_configuration(cpu_cores = 1, 
                                               memory_gb = 1, 
                                               tags = {"method" : "automl"}, 
                                               description = 'Predictive maintenance using auto-ml')

If a service with the same name already exists, we can delete it by calling the `delete` method.

In [None]:
%%time
from azureml.core.webservice import Webservice

aci_service_name = experiment_name + "-aci"
print(aci_service_name)
# aci_service.delete()
aci_service = Webservice.deploy_from_image(deployment_config = aciconfig,
                                           image = image,
                                           name = aci_service_name,
                                           workspace = ws)
aci_service.wait_for_deployment(True)
print(aci_service.state)

## Debugging issues

Here's how we can get logs from the deployed service, which can help us debug any issues that cause the deployment to fail.

In [None]:
logs = aci_service.get_logs()

In [None]:
import re
import json

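# pull out every {...} fragment; each one may be a JSON log record
candidates = re.findall(r"\{.*\}", logs)

records = []
for candidate in candidates:
    try:
        records.append(json.loads(candidate))
    except ValueError:
        pass # skip fragments that are not valid JSON

for record in records:
    if not isinstance(record, dict) or 'level' not in record:
        continue
    try:
        # some messages are themselves JSON with a nested 'message' field
        message = json.loads(record['message'])['message']
    except (ValueError, TypeError, KeyError):
        message = record.get('message') # fall back to the raw message
    if record['level'] == 'INFO':
        print(message)
    if record['level'] == 'ERROR':
        print('=' * 80)
        print(message)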
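
To see what the log-parsing cell above does without a live service, here is a small self-contained sketch that applies the same idea to a hand-written log string. The sample log lines and the `extract_messages` helper are hypothetical; real service logs may be structured differently.

```python
import json
import re

# Hypothetical log text: JSON records mixed with plain lines (assumption).
sample_logs = "\n".join([
    'container started',
    '{"level": "INFO", "message": "{\\"message\\": \\"model loaded\\"}"}',
    '{"level": "INFO", "message": "plain text info"}',
    '{"level": "ERROR", "message": "{\\"message\\": \\"bad input shape\\"}"}',
    '{not valid json}',
])


def extract_messages(logs):
    """Return (level, message) pairs for every JSON record in the log text."""
    records = []
    for candidate in re.findall(r"\{.*\}", logs):
        try:
            entry = json.loads(candidate)
        except ValueError:
            continue  # not a JSON record, skip it
        if not isinstance(entry, dict) or 'level' not in entry:
            continue
        message = entry.get('message', '')
        try:
            # Some messages are themselves JSON with a nested "message" field.
            message = json.loads(message)['message']
        except (ValueError, TypeError, KeyError):
            pass  # keep the raw message
        records.append((entry['level'], message))
    return records


messages = extract_messages(sample_logs)
for level, message in messages:
    print(level, message)
```

Returning the pairs instead of printing them directly makes it easy to filter, for example keeping only the `ERROR` entries when a deployment fails.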

## Alternative deployments (optional)

There are two other ways that we could have launched our ACI deployment. The first one is by deploying directly from the image config file and the registered model. In this scenario, the deployment will first create the image from the registered model, and then deploy the docker container from the base image. So we combine two steps (image creation, service creation) into a single step. However, behind the scenes the steps still run individually and create corresponding resources.

In [None]:
# aci_service.delete()

In [None]:
%%time
from azureml.core.webservice import Webservice

aci_service_name = experiment_name + "-aci"
print(aci_service_name)

# aci_service = Webservice.deploy_from_model(deployment_config = aciconfig,
#                                        image_config = image_config,
#                                        models = [model], # this is the registered model object
#                                        name = aci_service_name,
#                                        workspace = ws)
# aci_service.wait_for_deployment(show_output = True)
# print(aci_service.state)

In the above example, we launched the ACI docker container from the image config file and the registered model. But we can take one further step back and simply provide the model pickle file and let it register the model, then create a docker image from it and finally launch an ACI docker container from the image. In this case, we are combining three steps into one.

In [None]:
# aci_service.delete()

In [None]:
from azureml.core.webservice import Webservice

aci_service_name = experiment_name + "-aci"
print(aci_service_name)

# aci_service = Webservice.deploy(deployment_config = aciconfig,
#                                 image_config = image_config,
#                                 model_paths = ['model.pkl'],
#                                 name = aci_service_name,
#                                 workspace = ws)

# aci_service.wait_for_deployment(show_output = True)
# print(aci_service.state)

In [None]:
print(aci_service.scoring_uri)

Combining many steps into one may save us a few lines of code, but it has the disadvantage of hiding the individual steps of the workflow. So it is probably best to avoid doing it, especially for production systems.

## Test Web Service

It is time to test our web service. To begin with, we will point to our service using `Webservice`. Note that we've already done this in the last step, so in the current session this is not a necessary step, but since we want to be able to test the service from any Python session, we will point to the service again here. There is next to no overhead in doing so.

In [None]:
from azureml.core.image import Image, ContainerImage
from azureml.core.webservice import Webservice

aci_service_name = experiment_name + "-aci"

aci_service = Webservice(workspace = ws, name = aci_service_name)

We can now proceed to testing the service. To do so, we will take a few random samples from `X_test` and dump their content into a JSON string (with UTF-8 encoding). This will act as the data that we intend to score. We can pass this data to the service using the `run` method, and it will return the predictions to us.

In [None]:
n = 5
sample_indices = np.random.permutation(X_test.shape[0])[0:n]

test_samples = json.dumps({"data": X_test.iloc[sample_indices, :].values.tolist()})
test_samples = bytes(test_samples, encoding = 'utf8')
print(test_samples)

# predict using the deployed model
prediction = aci_service.run(input_data = test_samples)
print('**********************************************')
print(prediction)
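
Because the scoring script returns either a `result` key or an `error` key, it is worth checking which one came back before using the predictions. The sketch below is one way to do that. `parse_prediction` is a hypothetical helper, and it assumes that the response may arrive either as a JSON string or as an already decoded dictionary.

```python
import json


def parse_prediction(response):
    """Return the list of predictions, or raise if the service reported an error."""
    # Assumption: the response is a JSON string/bytes or an already decoded dict.
    if isinstance(response, (str, bytes)):
        response = json.loads(response)
    if "error" in response:
        raise RuntimeError("Scoring failed: {}".format(response["error"]))
    return response["result"]


# Made-up response in the shape produced by the scoring script
print(parse_prediction('{"result": [0, 0, 1, 0, 0]}'))
```

In the notebook this could be called as `parse_prediction(prediction)`, so that a malformed payload surfaces as an exception rather than being mistaken for a prediction.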

### Testing against a REST client (optional)

We were able to successfully test the scoring script from Python by calling the `run` method on it. But what happened in the background? We executed a REST call to the web application. Let's now re-do this from a REST client to confirm that everything works.

1. Install a REST client for your browser such as *RESTClient, a debugger for RESTful web services.* (Firefox)
2. In the section called **Request Headers** set Name to **Content-Type** and Attribute Value to **application/json**.
3. Copy the content of the data snippet above to https://jsonformatter.org/ and validate it to make sure it's clean.
4. Paste the clean JSON into the section of the REST client called **Body**, set the method to **POST** and add the API address (what we get when we run `aci_service.scoring_uri`) to the section called **URL**.
5. Finally, hit **SEND** and scroll down to see your results in the tab called **Response**.
6. Note how you can see the original POST call, which looks like `curl -X POST -H 'Content-Type: application/json' -i 'http://<SCORING_URI>:80/score' --data ...`.
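
The same request can also be built from Python with only the standard library. The following sketch constructs the POST request without sending it. `SCORING_URI` is a placeholder (substitute the value of `aci_service.scoring_uri`), `build_score_request` is a hypothetical helper, and the two feature rows are made-up values rather than real telemetry.

```python
import json
import urllib.request

# Placeholder: replace with the value of aci_service.scoring_uri.
SCORING_URI = "http://example.invalid:80/score"


def build_score_request(scoring_uri, rows):
    """Build (but do not send) a JSON POST request for the scoring endpoint."""
    body = json.dumps({"data": rows}).encode("utf8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_score_request(SCORING_URI, [[0.1, 0.2], [0.3, 0.4]])
print(request.get_method(), request.full_url)
print(request.get_header("Content-type"))

# To send it against a live service, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf8"))
```

Building the request separately from sending it makes it possible to inspect the method, headers and body first, which mirrors what the REST client shows before **SEND** is pressed.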

### Final project

<div class="alert alert-info">
In the above example, we took the data to be scored directly from `X_test`. But this data had already been pre-processed for us and was ready for scoring. A more realistic scenario involves getting raw data, pre-processing it and then feeding it to the deployed model for scoring. In this lab, we will implement this.
</div>

In [None]:
# generate the telemetry data
# compute moving average telemetries (you will need a way to compute moving averages at test time)
# append maintenance history and failure history to it
# finally, score the data to obtain predictions
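
As a possible starting point for the moving-average step, here is a minimal sketch of a trailing moving average that is updated one reading at a time, which is what pre-processing at score time needs. The `RollingMean` class, the window size of 3 and the sample readings are all hypothetical; the window has to match whatever was used to build the training features.

```python
from collections import deque


class RollingMean:
    """Trailing moving average over the last `window` readings."""

    def __init__(self, window):
        self.values = deque(maxlen=window)

    def update(self, value):
        # Append the new reading; the deque drops the oldest one when full.
        self.values.append(value)
        return sum(self.values) / len(self.values)


rolling = RollingMean(window=3)
readings = [10.0, 20.0, 30.0, 40.0]  # made-up telemetry values
averages = [rolling.update(r) for r in readings]
print(averages)
```

Until the window is full, the average is taken over the readings seen so far. Whether that is acceptable depends on how the training features handled the first few rows of each machine.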

### End of lab

# The end
