# Tutorial #2: Deploy a model as an Azure Container Instances (ACI) webservice

In [Tutorial #1](predict-emailservice-part1.ipynb), you trained machine learning models and registered a model in your workspace in the cloud.
This tutorial walks through deploying that model as an [Azure Container Instances](https://docs.microsoft.com/azure/container-instances/) (ACI) webservice: a Docker image that encapsulates the scoring logic and the model itself.

The code here was tested with the following Azure ML SDK versions:
- 1.6.0
- 1.3.0
- 1.0.72 on Microsoft Azure Notebooks with a Python 3.6 kernel
        
In this tutorial, you use Azure Machine Learning service to:
* Retrieve the model from your workspace
* Create a scoring script
* Deploy the model as an ACI webservice
* Test the deployed model

ACI is a great solution for testing and understanding the workflow. For scalable production deployments, consider using Azure Kubernetes Service. For more information, see [how to deploy and where](https://docs.microsoft.com/azure/machine-learning/service/how-to-deploy-and-where).



                                                                

## Connect Azure Machine Learning Workspace

Create a workspace object from the existing workspace. `Workspace.from_config()` reads the file **config.json** and loads the details into an object named `workspace`.

If you see this message:
"Performing interactive authentication. Please follow the instructions on the terminal.
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code &lt;token&gt; to authenticate."

Open the link and enter the &lt;token&gt; shown to authenticate. After authenticating, run this cell again to load the workspace.

In [1]:
# Load workspace configuration from the config.json file in the current folder.
from azureml.core import Workspace
workspace = Workspace.from_config()
# print(workspace.name, workspace.location, workspace.resource_group, workspace.location, sep='\t')
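
If `Workspace.from_config()` fails, it can help to inspect **config.json** first. The following is a minimal sketch, not part of the SDK: the helper names `missing_config_keys` and `check_config_file` are hypothetical, and the sketch assumes the file uses the three keys found in the config.json downloaded from the Azure portal (`subscription_id`, `resource_group`, `workspace_name`). It only checks the path you give it.

```python
import json
from pathlib import Path

# Keys assumed to be present in a workspace config.json (assumption of this sketch).
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def missing_config_keys(config_text):
    """Return the required keys that are absent or empty in the JSON text."""
    config = json.loads(config_text)
    return [key for key in REQUIRED_KEYS if not config.get(key)]


def check_config_file(path="config.json"):
    """Return a list of human-readable problems; an empty list means the file looks usable."""
    config_path = Path(path)
    if not config_path.is_file():
        return ["{} not found".format(path)]
    missing = missing_config_keys(config_path.read_text())
    return ["missing or empty key: {}".format(key) for key in missing]


# Report problems for the config.json in the current folder, if any.
for problem in check_config_file("config.json"):
    print(problem)
```

An empty result only means the file is well-formed; authentication and access rights are still checked by the SDK when the workspace is loaded.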

## Import Azure Machine Learning SDK for Python 

This step verifies that the Azure Machine Learning SDK for Python is installed. Most of the code in this tutorial requires the Azure ML SDK.

Display the Azure Machine Learning SDK version.

In [2]:
import azureml.core

# check core SDK version number (need Python 3.6 kernel if you run this in Microsoft Azure Notebooks)
print("Azure ML SDK Version: ", azureml.core.VERSION)

Azure ML SDK Version:  1.24.0


## Test model locally

Before deploying, make sure your model is working locally by:
* Loading test data
* Predicting test data
* Comparing the NDCG of the model's ranking with the NDCG of the original search ranking

### Load test data
 
You can create your own test data, but for simplicity this tutorial reuses the dataset from Tutorial #1.

In [3]:
from azure_search_client import azure_search_client as azs_client 
from pandas.io.json import json_normalize
import pandas as pd
import json
import concurrent
import datetime
from itertools import chain
import random
import numpy as np
from random import sample
import warnings
warnings.filterwarnings("ignore")
import xgboost as xgb

In [4]:
def get_search_results(service, query):
    search_request_body = {
        "search": query,
        "featuresMode": "enabled",
        "scoringStatistics": "global",
        "count": "true"
    }
    
    return service.search(search_request_body)

def get_features(service, query):
    
    search_results = get_search_results(service, query.lower())

    # this will flatten the search json response into a panda dataframe
    azs_features = json_normalize(search_results)

    return azs_features

def retrieve_from_search(query, sessionid, azs_service):
    
    ## Call the api service to retrieve json format data
    json_search_results = get_search_results(azs_service, query)
    
    ## Flatten the json format data into pandas dataframe
    search_results = json_normalize(json_search_results).fillna(0)
    search_results = search_results.fillna(0).sort_values(['@search.score'], ascending=False)
    search_results['query'] = query.lower()
    search_results['sessionid'] = sessionid
    print('{} rows for query : {}'.format(search_results.shape[0], query))
    
    return search_results

def dcg_score(y_true, y_score, k=50, gains="exponential"):
    """Discounted cumulative gain (DCG) at rank k
    Parameters
    ----------
    y_true : array-like, shape = [n_samples]
        Ground truth (true relevance labels).
    y_score : array-like, shape = [n_samples]
        Predicted scores.
    k : int
        Rank.
    gains : str
        Whether gains should be "exponential" (default) or "linear".
    Returns
    -------
    DCG @k : float
    """
    order = np.argsort(y_score)[::-1]
    y_true = np.take(y_true, order[:k])
    if gains == "exponential":
        gains = 2 ** y_true - 1
    elif gains == "linear":
        gains = y_true
    else:
        raise ValueError("Invalid gains option.")

    # highest rank is 1 so +2 instead of +1
    discounts = np.log2(np.arange(len(y_true)) + 2)
    return np.sum(gains / discounts)

def ndcg_score(y_true, y_score, k=50, gains="exponential"):
    """Normalized discounted cumulative gain (NDCG) at rank k
    Parameters
    ----------
    y_true : array-like, shape = [n_samples]
        Ground truth (true relevance labels).
    y_score : array-like, shape = [n_samples]
        Predicted scores.
    k : int
        Rank.
    gains : str
        Whether gains should be "exponential" (default) or "linear".
    Returns
    -------
    NDCG @k : float
    """
    best = dcg_score(y_true, y_true, k, gains)
    actual = dcg_score(y_true, y_score, k, gains)
    return actual / best
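
To make the metric concrete, here is a minimal pure-Python sketch of the exponential-gain case. Unlike `ndcg_score` above, the hypothetical helpers `dcg` and `ndcg` take grades that are already listed in ranked order, and `ndcg` returns 0.0 when the ideal DCG is zero (that guard is an addition of this sketch, not part of the tutorial's function).

```python
import math


def dcg(ranked_grades, k=50):
    """DCG at rank k with exponential gains; position 0 is the top result."""
    return sum(
        (2 ** grade - 1) / math.log2(position + 2)
        for position, grade in enumerate(ranked_grades[:k])
    )


def ndcg(ranked_grades, k=50):
    """DCG of the given order divided by the DCG of the ideal (descending) order."""
    best = dcg(sorted(ranked_grades, reverse=True), k)
    return dcg(ranked_grades, k) / best if best > 0 else 0.0


print(ndcg([3, 2, 1]))  # 1.0, the order is already ideal
print(ndcg([1, 2, 3]))  # below 1.0, the best document is ranked last
```

A ranking that puts the highest grades first scores 1.0, and the score drops as relevant documents move down the list, which is why the comparison in this tutorial reports one NDCG value per ranking.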

The following code connects to the search API service using the settings in `api_config.json`, then builds a small test dataset from the query `powershell`.

In [5]:
azs_service = azs_client.from_json('api_config.json')
azs_service

# Create the necessary queries to create dataset
query_input = ['powershell']

demo_query_dataset = pd.DataFrame()
sessionid =1
for query in query_input:
    demo_query_dataset = pd.concat([demo_query_dataset, retrieve_from_search(query, sessionid, azs_service)])
    sessionid+=1
    
demo_query_dataset['grade'] = demo_query_dataset['grade'].astype(int)

10 rows for query : powershell


### Retrieve the model

You registered a model in your workspace in Tutorial #1. Now load that workspace and download the model to your local directory.

In [6]:
from azureml.core import Workspace
from azureml.core.model import Model
import os 

workspace = Workspace.from_config()

model=Model(workspace,'predict-link-xgboostmodel') # Default will get the latest version.

model.download(target_dir=os.getcwd(), exist_ok=True)
print(model)

# # Get the model file path.
file_path = os.path.join(os.getcwd(), "predict-link-xgboostmodel.pkl")

Model(workspace=Workspace.create(name='csidmlws', subscription_id='ebe8d9fa-67d0-4af1-bce2-4a5b07e50a42', resource_group='cmt-202011001'), name=predict-link-xgboostmodel, id=predict-link-xgboostmodel:5, version=5, tags={'Setting up xgb params': "{'objective': 'rank:ndcg', 'learning_rate': 0.5, 'min_child_weight': 0.1, 'max_depth': 10, 'n_estimators': 200}", 'XGboost NDCG is': '0.9783009542123144', 'Azure Search NDCG is': '0.7557246194876953'}, properties={})


### Predict test data

Feed the test dataset to the model to get predictions.

Use `joblib` to load the model from the downloaded file path.


In [7]:
import joblib # Use this to load the model that was created on local in Tutorial #1
xgbmodel = joblib.load(file_path)
print(xgbmodel)
features = ['@search.features.keyphrases.similarityScore',
   '@search.features.keyphrases.termFrequency',
   '@search.features.keyphrases.uniqueTokenMatches',
   '@search.features.query.similarityScore',
   '@search.features.query.termFrequency',
   '@search.features.query.uniqueTokenMatches',
   '@search.features.url.similarityScore',
   '@search.features.url.termFrequency',
   '@search.features.url.uniqueTokenMatches', '@search.score']
    
y_hat = xgbmodel.predict(demo_query_dataset[features])

output_score = pd.DataFrame(y_hat)
output_score.columns=['score']
output_score_sorted = output_score.sort_values('score', ascending = True).reset_index()
num_rows = output_score_sorted.shape[0]
demo_query_dataset['index'] = demo_query_dataset.index
check_acc = demo_query_dataset.merge(output_score_sorted, on = ['index'])

azs_search = check_acc.sort_values(['@search.score'], ascending = False)\
[['url', 'grade']].reset_index(drop = True)
xgboost = check_acc.sort_values(['sessionid','score'], ascending = False)\
[['url','grade']].reset_index(drop = True)

xgboost_ndcg = ndcg_score(check_acc.sort_values(['grade'], ascending = False).grade
           , xgboost.grade)

azs_search_ndcg = ndcg_score(check_acc.sort_values(['grade'], ascending = False).grade
           , azs_search.grade)

print('Score')
print(azs_search_ndcg, xgboost_ndcg)

XGBRanker(base_score=0.5, booster='gbtree', colsample_bylevel=1,
          colsample_bynode=1, colsample_bytree=1, gamma=0, gpu_id=-1,
          importance_type='gain', interaction_constraints='', learning_rate=0.5,
          max_delta_step=0, max_depth=10, min_child_weight=0.1, missing=nan,
          monotone_constraints='()', n_estimators=200, n_jobs=2,
          num_parallel_tree=1, objective='rank:ndcg', random_state=0,
          reg_alpha=0, reg_lambda=1, scale_pos_weight=None, subsample=1,
          tree_method='exact', validate_parameters=1, verbosity=None)
Score
0.5692044172607783 0.988999019408289


Convert `demo_query_dataset` to JSON, because the ACI web service reads its input data as JSON.

In [8]:
json_data = demo_query_dataset[features].to_json(orient='records')
data = pd.read_json(json_data, orient='records')

prediction_score = xgbmodel.predict(data)
output_score = pd.DataFrame(prediction_score)
output_score.columns=['score']
output_score_sorted = output_score.sort_values('score', ascending = False).reset_index()

num_rows = output_score_sorted.shape[0]
demo_query_dataset['index'] = demo_query_dataset.index
check_acc = demo_query_dataset.merge(output_score_sorted, on = ['index'])

azs_search = check_acc.sort_values(['@search.score'], ascending = False)\
[['url', 'grade']].reset_index(drop = True)
xgboost = check_acc.sort_values(['sessionid','score'], ascending = False)\
[['url','grade']].reset_index(drop = True)

xgboost_ndcg = ndcg_score(check_acc.sort_values(['grade'], ascending = False).grade
           , xgboost.grade)

azs_search_ndcg = ndcg_score(check_acc.sort_values(['grade'], ascending = False).grade
           , azs_search.grade)

print('Score')
print(azs_search_ndcg, xgboost_ndcg)

Score
0.5692044172607783 0.988999019408289


## Deploy model

Once you've tested the model and are satisfied with the results, deploy the model as a web service hosted in ACI. 

To build the correct environment for ACI, provide the following:
* A scoring script to show how to use the model
* An environment file to show what packages need to be installed
* A configuration file to build the ACI
* The model you trained before

Note: the deployed web service can be found in your Workspace &gt; Deployments.
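
Because the model is unpickled inside the container, the packages listed in the environment should be compatible with the ones used to save it. The following is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`), that compares locally installed versions with the versions you intend to pin. The helper name `find_version_mismatches` is hypothetical, and the pins shown mirror the ones used in the deployment cell later in this tutorial.

```python
from importlib import metadata


def find_version_mismatches(pins):
    """Map each package whose installed version differs from its pin to (pinned, installed).

    installed is None when the package is not installed locally.
    """
    mismatches = {}
    for package, pinned in pins.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches[package] = (pinned, installed)
    return mismatches


# Pins taken from this tutorial's deployment cell.
pins = {"xgboost": "1.3.3", "joblib": "1.0.1"}
for package, (pinned, installed) in find_version_mismatches(pins).items():
    print("{}: pinned {}, installed locally {}".format(package, pinned, installed))
```

A mismatch is not necessarily an error, but it is a reasonable first thing to check if the service fails to load the model.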

### Create a scoring script

The scoring script must define two functions:
* The `init()` function, which typically loads the model into a global object. This function is run only once when the Docker container is started. 

* The `run(input_data)` function uses the model to predict a value based on the input data. Inputs and outputs to the run typically use JSON for serialization and de-serialization, but other formats are supported.
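
The contract between these two functions can be exercised locally before anything is deployed. The sketch below is not the tutorial's scoring script: it swaps in a hypothetical `StubModel` that scores a row by summing its feature values, so that the `init()` / `run()` flow and the JSON in, JSON out shape can be seen without Azure or a trained model.

```python
import json


class StubModel:
    """Hypothetical stand-in for the real ranker: a row's score is the sum of its features."""

    def predict(self, rows):
        return [sum(row.values()) for row in rows]


model = None


def init():
    """Runs once at container start; the real script loads the registered model here."""
    global model
    model = StubModel()


def run(raw_data):
    """Parse a JSON array of feature records and return scores sorted from best to worst."""
    rows = json.loads(raw_data)
    scores = model.predict(rows)
    ranked = sorted(
        ({"index": position, "score": score} for position, score in enumerate(scores)),
        key=lambda record: record["score"],
        reverse=True,
    )
    return json.dumps(ranked)


init()
payload = json.dumps([{"f1": 1.0}, {"f1": 3.0}, {"f1": 2.0}])
print(run(payload))
```

Each returned record keeps the zero-based position of the input row in `index`, which is what lets the caller map scores back to the original documents.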



In [9]:
%%writefile score_link_xgboost.py
import json
import numpy as np
import os
import pickle

import pandas as pd
import joblib
import xgboost as xgb
from azureml.core.model import Model


def init():
    global model
    # retrieve the path to the model file using the model name
    model_path = Model.get_model_path('predict-link-xgboostmodel')
    model = joblib.load(model_path)

def run(raw_data):

    data = pd.read_json(raw_data, orient='records')
    prediction_score = model.predict(data)
    output_score = pd.DataFrame(prediction_score)
    output_score.columns=['score']
    output_score_sorted = output_score.sort_values('score', ascending = False).reset_index()
    
    return output_score_sorted.to_json(orient='records')

Overwriting score_link_xgboost.py


### Deploy model as an ACI webservice
The deployment goes through these steps:
1. Build an image using:
   * The scoring file (`score_link_xgboost.py`)
   * The environment and resources required
   * The model file
1. Register that image under the workspace. 
1. Send the image to the ACI container.
1. Start up a container in ACI using the image.
1. Get the web service HTTP endpoint.

Note:
If you see "ERROR - Error, there is already a service with name xgboost-link-svc found in workspace <your workspace name>", 
go to your **Azure ML Workspace &gt; Deployments**, where you can delete it if you need to recreate the service.


This step may take a while to start after you run the cell. The message "Running" appears when it starts, and the deployment then takes a few minutes to complete.


In [10]:
%%time
from azureml.core import Environment
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice, Webservice
from azureml.core.conda_dependencies import CondaDependencies

# Configure the environment to run the scoring script.
# You definitely need azureml-sdk, azureml-defaults and joblib
env = Environment('my_env_xgboost')
cd = CondaDependencies.create(pip_packages=['azureml-sdk','azureml-defaults','xgboost==1.3.3','joblib==1.0.1'])
env.python.conda_dependencies = cd

# Combine scoring script & environment in Inference configuration
inference_config = InferenceConfig(entry_script="score_link_xgboost.py", environment=env)

# Set deployment configuration. While it depends on your model, the default of 1 core and 1 gigabyte of RAM 
# is usually sufficient for many models. If you feel you need more later, you would have to recreate the 
# image and redeploy the service.
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, 
                                               memory_gb=1, 
                                               tags={"data": "Link Ranking",  "method" : "xgboost"}, 
                                               description='Predict link with xgboost')


# Define the model, inference, & deployment configuration and web service name and location to deploy
service = Model.deploy(
    workspace = workspace,
    name = "xgboost-link-svc",
    models = [model],
    inference_config = inference_config,
    deployment_config = deployment_config)

service.wait_for_deployment(show_output=True)

Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2021-04-26 01:22:04+00:00 Creating Container Registry if not exists.
2021-04-26 01:22:04+00:00 Registering the environment.
2021-04-26 01:22:05+00:00 Use the existing image.
2021-04-26 01:22:06+00:00 Generating deployment configuration.
2021-04-26 01:22:06+00:00 Submitting deployment to compute..
2021-04-26 01:22:12+00:00 Checking the status of deployment xgboost-link-svc..
2021-04-26 01:24:43+00:00 Checking the status of inference endpoint xgboost-link-svc.
Succeeded
ACI service creation operation finished, operation "Succeeded"
CPU times: user 422 ms, sys: 24.3 ms, total: 446 ms
Wall time: 2min 43s


Run this cell immediately after the deployment cell above to capture the service logs.

In [11]:
print(service.get_logs())

2021-04-26T01:24:33,388038700+00:00 - iot-server/run 
2021-04-26T01:24:33,389407800+00:00 - rsyslog/run 
2021-04-26T01:24:33,397639600+00:00 - gunicorn/run 
2021-04-26T01:24:33,425529800+00:00 - nginx/run 
/usr/sbin/nginx: /azureml-envs/azureml_5a2612773d8ca70946c62488132c4632/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_5a2612773d8ca70946c62488132c4632/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_5a2612773d8ca70946c62488132c4632/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_5a2612773d8ca70946c62488132c4632/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_5a2612773d8ca70946c62488132c4632/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
EdgeHubC

Get the scoring web service's HTTP endpoint, which accepts REST client calls. This endpoint can be shared with anyone who wants to test the web service or integrate it into an application.

In [12]:
service = Webservice(workspace=workspace, name='xgboost-link-svc')
# print(service.scoring_uri)

## Test deployed model

Test the deployed model. You could send data from a different query, but for simplicity the same
dataset is reused here.

The following code goes through these steps:
1. Send the data as a JSON array to the web service hosted in ACI. 

1. Use the SDK's `run` API to invoke the service. You can also make raw calls using any HTTP tool such as curl.

Run the code cell below to see the predictions.

In [13]:
import json
import numpy as np

from azure_search_client import azure_search_client as azs_client 
from pandas.io.json import json_normalize
import pandas as pd

# Converting user query into JSON in order to push to ACI
json_data = demo_query_dataset[features].to_json(orient='records')

# Push JSON data into ACI to rerank the result which will return a json. "service" is the endpoint
result = service.run(input_data=json_data)
result = pd.read_json(result, orient='records')
result

Unnamed: 0,index,score
0,6,5.597492
1,2,3.208377
2,7,0.903777
3,8,0.903777
4,9,-0.201843
5,0,-0.783272
6,1,-0.783272
7,3,-1.290542
8,4,-1.290542
9,5,-1.290542


After merging the scores back into `demo_query_dataset`, you can pass the re-ranked URLs back to your user interface.

In [14]:
demo_query_dataset = demo_query_dataset.merge(result, on = ['index']) 
azs_search = demo_query_dataset.sort_values(['@search.score'], ascending = False)\
[['url', 'grade']].reset_index(drop = True)
xgboost = demo_query_dataset.sort_values(['score'], ascending = False)\
[['url','grade']].reset_index(drop = True)

compare_df = pd.concat([azs_search[['url', 'grade']], xgboost[['url', 'grade']]], axis = 1)
compare_df.columns = ['azs_search_url','azs_search_grade','xgboost_url','xgboost_grade']
compare_df.head()

Unnamed: 0,azs_search_url,azs_search_grade,xgboost_url,xgboost_grade
0,https://docs.microsoft.com/en-us/powershell/sc...,7,https://docs.microsoft.com/en-us/powershell/sc...,10
1,https://docs.microsoft.com/en-us/powershell/sc...,6,https://docs.microsoft.com/en-us/powershell/,9
2,https://docs.microsoft.com/en-us/powershell/,9,https://docs.microsoft.com/en-us/windows-serve...,8
3,https://docs.microsoft.com/en-us/powershell/sc...,5,https://docs.microsoft.com/en-us/powershell/az...,3
4,https://docs.microsoft.com/en-us/powershell/sc...,4,https://docs.microsoft.com/en-us/virtualizatio...,1
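
If the caller does not use pandas, the same re-ranking can be done with plain lists. This is a minimal sketch under two assumptions: the service returns records with `index` and `score` already sorted from best to worst, and `index` is the zero-based row position in the payload that was sent. The helper `rerank_urls` and the example URLs are hypothetical.

```python
def rerank_urls(urls, ranked_records):
    """Return urls reordered to follow the ranking returned by the service."""
    return [urls[record["index"]] for record in ranked_records]


# Placeholder data: URLs in the order they were sent, and a ranking as the service returns it.
urls = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
ranked = [
    {"index": 2, "score": 1.7},
    {"index": 0, "score": 0.4},
    {"index": 1, "score": -0.9},
]

print(rerank_urls(urls, ranked))
```

Note that row positions repeat if several queries are concatenated into one payload, so this mapping is only safe when one query is sent per request.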


You can also send a raw HTTP request to test the web service. Run the code cell below to see the predictions.

In [15]:
import requests

input_data = json_data 

headers = {'Content-Type':'application/json'}

resp = requests.post(service.scoring_uri, input_data, headers=headers)

print("POST to url", service.scoring_uri)
# print("input data:", input_data)
print("prediction:", resp.text)

POST to url http://5a8f0051-0cbf-4522-ab66-97fae0522a11.southeastasia.azurecontainer.io/score
prediction: "[{\"index\":6,\"score\":5.597492218},{\"index\":2,\"score\":3.2083768845},{\"index\":7,\"score\":0.9037772417},{\"index\":8,\"score\":0.9037772417},{\"index\":9,\"score\":-0.2018429041},{\"index\":0,\"score\":-0.783272326},{\"index\":1,\"score\":-0.783272326},{\"index\":3,\"score\":-1.2905415297},{\"index\":4,\"score\":-1.2905415297},{\"index\":5,\"score\":-1.2905415297}]"
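
As the output above shows, the response body is a JSON string that itself contains JSON, because `run()` returns text that is serialized again by the service. A minimal sketch of decoding it with the standard library follows; the helper name `decode_prediction` is hypothetical, and the sample body is shortened from the output above.

```python
import json


def decode_prediction(body):
    """Decode a response body that may be JSON wrapped in a JSON string."""
    decoded = json.loads(body)
    if isinstance(decoded, str):
        # The first pass only removed the outer string quoting; parse the inner JSON.
        decoded = json.loads(decoded)
    return decoded


# Shortened sample in the same shape as the service response.
sample_body = json.dumps('[{"index":6,"score":5.597492218},{"index":2,"score":3.2083768845}]')

records = decode_prediction(sample_body)
print(records[0]["index"], records[0]["score"])
```

With the real response, `decode_prediction(resp.text)` would yield a list of dictionaries that can be used directly or passed to pandas.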


## Clean up resources

To keep the resource group and workspace for other tutorials and exploration, you can delete only the ACI deployment using this API call:

In [16]:
service.delete()

You can also manually delete the deployed web service which can be found in your **Azure ML Workspace &gt; Deployments**.

If you're not going to use what you've created here, delete the resources you just created so you don't incur any charges. 