# Creating a Real-Time Inferencing Service

You've spent a lot of time in this course training and registering machine learning models. Now it's time to deploy a model as a real-time service that clients can use to get predictions from new data.

## Connect to Your Workspace

The first thing you need to do is to connect to your workspace using the Azure ML SDK.

> **Note**: If the authenticated session with your Azure subscription has expired since you completed the previous exercise, you'll be prompted to reauthenticate.

In [1]:
import azureml.core
from azureml.core import Workspace

# Load the workspace from the saved config file
ws = Workspace.from_config()
print('Ready to use Azure ML {} to work with {}'.format(azureml.core.VERSION, ws.name))

Ready to use Azure ML 1.2.0 to work with myaml


## Train and Register a Model

Now let's train and register a model.

In [2]:
from azureml.core import Experiment
from azureml.core import Model
import pandas as pd
import numpy as np
import joblib
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score
from sklearn.metrics import roc_curve

# Create an Azure ML experiment in your workspace
experiment = Experiment(workspace = ws, name = "diabetes-training")
run = experiment.start_logging()
print("Starting experiment:", experiment.name)

# load the diabetes dataset
print("Loading Data...")
diabetes = pd.read_csv('data/diabetes.csv')

# Separate features and labels
X, y = diabetes[['Pregnancies','PlasmaGlucose','DiastolicBloodPressure','TricepsThickness','SerumInsulin','BMI','DiabetesPedigree','Age']].values, diabetes['Diabetic'].values

# Split data into training set and test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=0)

# Train a decision tree model
print('Training a decision tree model')
model = DecisionTreeClassifier().fit(X_train, y_train)

# calculate accuracy
y_hat = model.predict(X_test)
acc = np.average(y_hat == y_test)
print('Accuracy:', acc)
run.log('Accuracy', float(acc))

# calculate AUC
y_scores = model.predict_proba(X_test)
auc = roc_auc_score(y_test,y_scores[:,1])
print('AUC: ' + str(auc))
run.log('AUC', float(auc))

# Save the trained model
model_file = 'diabetes_model.pkl'
joblib.dump(value=model, filename=model_file)
run.upload_file(name = 'outputs/' + model_file, path_or_stream = './' + model_file)

# Complete the run
run.complete()

# Register the model
run.register_model(model_path='outputs/diabetes_model.pkl', model_name='diabetes_model',
                   tags={'Training context':'Inline Training'},
                   properties={'AUC': run.get_metrics()['AUC'], 'Accuracy': run.get_metrics()['Accuracy']})

print('Model trained and registered.')

Starting experiment: diabetes-training
Loading Data...
Training a decision tree model
Accuracy: 0.8936666666666667
AUC: 0.8808423304642021
Model trained and registered.


## Deploy a Model as a Web Service

You have trained and registered a machine learning model that classifies patients based on the likelihood of them having diabetes. This model could be used in a production environment such as a doctor's surgery where only patients deemed to be at risk need to be subjected to a clinical test for diabetes. To support this scenario, you will deploy the model as a web service.

First, let's determine what models you have registered in the workspace.

In [3]:
from azureml.core import Model

for model in Model.list(ws):
    print(model.name, 'version:', model.version)
    for tag_name in model.tags:
        tag = model.tags[tag_name]
        print ('\t',tag_name, ':', tag)
    for prop_name in model.properties:
        prop = model.properties[prop_name]
        print ('\t',prop_name, ':', prop)
    print('\n')

diabetes_model version: 6
	 Training context : Inline Training
	 AUC : 0.8808423304642021
	 Accuracy : 0.8936666666666667


diabetes_model version: 5
	 Training context : Pipeline


diabetes_model version: 4
	 Training context : Azure ML compute
	 AUC : 0.8869152478187508
	 Accuracy : 0.9017777777777778


diabetes_model version: 3
	 Training context : Estimator + Environment (Decision Tree)
	 AUC : 0.8814544934927504
	 Accuracy : 0.8975555555555556


diabetes_model version: 2
	 Training context : SKLearn Estimator (using Datastore)
	 AUC : 0.8568655044545174
	 Accuracy : 0.7893333333333333


diabetes_model version: 1
	 Training context : Estimator
	 AUC : 0.8484929598487486
	 Accuracy : 0.774




Right, now let's get the model that we want to deploy. By default, if we specify a model name, the latest version will be returned.

In [4]:
model = ws.models['diabetes_model']
print(model.name, 'version', model.version)

diabetes_model version 6


We're going to create a web service to host this model, which requires some code and configuration files, so let's create a folder for those.

In [5]:
import os

folder_name = 'diabetes_service'

# Create a folder for the web service files
os.makedirs(folder_name, exist_ok=True)

print(folder_name, 'folder created.')

diabetes_service folder created.


The web service where we deploy the model will need some Python code to load the input data, get the model from the workspace, and generate and return predictions. We'll save this code in an *entry script* that will be deployed to the web service:

In [6]:
%%writefile $folder_name/score_diabetes.py
import json
import joblib
import numpy as np
from azureml.core.model import Model

# Called when the service is loaded
def init():
    global model
    # Get the path to the deployed model file and load it
    model_path = Model.get_model_path('diabetes_model')
    model = joblib.load(model_path)

# Called when a request is received
def run(raw_data):
    # Get the input data as a numpy array
    data = np.array(json.loads(raw_data)['data'])
    # Get a prediction from the model
    predictions = model.predict(data)
    # Get the corresponding classname for each prediction (0 or 1)
    classnames = ['not-diabetic', 'diabetic']
    predicted_classes = []
    for prediction in predictions:
        predicted_classes.append(classnames[prediction])
    # Return the predictions as JSON
    return json.dumps(predicted_classes)

Writing diabetes_service/score_diabetes.py
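
Before deploying, it can help to check the request and response contract of the entry script without a container. The following is a minimal sketch, not the deployed script: `StubModel` and its glucose threshold are hypothetical stand-ins for the registered model, and plain lists replace the NumPy array so that the example has no dependencies.

```python
import json

class StubModel:
    """Hypothetical stand-in for the trained classifier (an assumption for
    illustration): predicts 1 when PlasmaGlucose, the second feature, exceeds 150."""
    def predict(self, rows):
        return [1 if row[1] > 150 else 0 for row in rows]

# In the real entry script, init() loads the registered model into this global
model = StubModel()

def run(raw_data):
    # Parse the JSON request body and pull out the list of observations
    data = json.loads(raw_data)['data']
    # Get a 0/1 prediction for each observation
    predictions = model.predict(data)
    # Map each prediction to its class name and return the result as JSON
    classnames = ['not-diabetic', 'diabetic']
    return json.dumps([classnames[p] for p in predictions])

# Build a request body in the same shape that the service expects
request_body = json.dumps({"data": [[2, 180, 74, 24, 21, 23.9091702, 1.488172308, 22],
                                    [0, 148, 58, 11, 179, 39.19207553, 0.160829008, 45]]})
response = run(request_body)
print(response)
```

The point of the sketch is the shape of the exchange: a JSON document with a `data` key goes in, and a JSON-encoded list of class names comes out, one per observation.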


The web service will be hosted in a container, and the container will need to install any required Python dependencies when it gets initialized. In this case, our scoring code requires **scikit-learn**, so we'll create a .yml file that tells the container host to install this into the environment.

In [7]:
from azureml.core.conda_dependencies import CondaDependencies 

# Add the dependencies for our model (azureml-defaults is already included)
myenv = CondaDependencies()
myenv.add_conda_package('scikit-learn')

# Save the environment config as a .yml file
env_file = folder_name + "/diabetes_env.yml"
with open(env_file,"w") as f:
    f.write(myenv.serialize_to_string())
print("Saved dependency info in", env_file)

# Print the .yml file
with open(env_file,"r") as f:
    print(f.read())

Saved dependency info in diabetes_service/diabetes_env.yml
# Conda environment specification. The dependencies defined in this file will
# be automatically provisioned for runs with userManagedDependencies=False.

# Details about the Conda environment file format:
# https://conda.io/docs/user-guide/tasks/manage-environments.html#create-env-file-manually

name: project_environment
dependencies:
  # The python interpreter version.
  # Currently Azure ML only supports 3.5.2 and later.
- python=3.6.2

- pip:
    # Required packages for AzureML execution, history, and data preparation.
  - azureml-defaults

- scikit-learn
channels:
- anaconda
- conda-forge



Now you're ready to deploy. We'll deploy the container as a service named **diabetes-service**. The deployment process includes the following steps:

1. Define an inference configuration, which includes the scoring and environment files required to load and use the model.
2. Define a deployment configuration that defines the execution environment in which the service will be hosted. In this case, an Azure Container Instance.
3. Deploy the model as a web service.
4. Verify the status of the deployed service.

> **More Information**: For more details about model deployment, and options for target execution environments, see the [documentation](https://docs.microsoft.com/azure/machine-learning/how-to-deploy-and-where).

Deployment will take some time as it first runs a process to create a container image, and then runs a process to create a web service based on the image. When deployment has completed successfully, you'll see a status of **Healthy**.

In [8]:
from azureml.core.webservice import AciWebservice
from azureml.core.model import InferenceConfig

# Configure the scoring environment
inference_config = InferenceConfig(runtime= "python",
                                   source_directory = folder_name,
                                   entry_script="score_diabetes.py",
                                   conda_file="diabetes_env.yml")

deployment_config = AciWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1)

service_name = "diabetes-service"

service = Model.deploy(ws, service_name, [model], inference_config, deployment_config)

service.wait_for_deployment(True)
print(service.state)

Running..................................................................................................
Succeeded
ACI service creation operation finished, operation "Succeeded"
Healthy


Hopefully, the deployment has been successful and you can see a status of **Healthy**. If not, you can use the following code to check the status and get the service logs to help you troubleshoot.

In [9]:
print(service.state)
print(service.get_logs())

# If you need to make a change and redeploy, you may need to delete the unhealthy service using the following code:
#service.delete()

Healthy
2020-04-13T21:56:33,069563338+00:00 - rsyslog/run 
2020-04-13T21:56:33,070566743+00:00 - gunicorn/run 
2020-04-13T21:56:33,072464551+00:00 - iot-server/run 
2020-04-13T21:56:33,074916163+00:00 - nginx/run 
/usr/sbin/nginx: /azureml-envs/azureml_4b824bcb98517d791c41923f24d65461/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_4b824bcb98517d791c41923f24d65461/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_4b824bcb98517d791c41923f24d65461/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_4b824bcb98517d791c41923f24d65461/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_4b824bcb98517d791c41923f24d65461/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)


Take a look at your workspace in [Azure ML Studio](https://ml.azure.com) and view the **Endpoints** page, which shows the deployed services in your workspace.

You can also retrieve the names of web services in your workspace by running the following code:

In [10]:
for webservice_name in ws.webservices:
    print(webservice_name)

diabetes-service


## Use the Web Service

With the service deployed, now you can consume it from a client application.

In [15]:
import json

x_new = [[2,180,74,24,21,23.9091702,1.488172308,22]]
print ('Patient: {}'.format(x_new[0]))

# Convert the array to a serializable list in a JSON document
input_json = json.dumps({"data": x_new})

# Call the web service, passing the input data (the web service will also accept the data in binary format)
predictions = service.run(input_data = input_json)

# Get the predicted class - it'll be the first (and only) one.
predicted_classes = json.loads(predictions)
print(predicted_classes[0])

Patient: [2, 180, 74, 24, 21, 23.9091702, 1.488172308, 22]
diabetic


In [16]:
json.dumps({"data": x_new})

'{"data": [[2, 180, 74, 24, 21, 23.9091702, 1.488172308, 22]]}'

You can also send multiple patient observations to the service, and get back a prediction for each one.

In [12]:
import json

# This time our input is an array of two feature arrays
x_new = [[2,180,74,24,21,23.9091702,1.488172308,22],
         [0,148,58,11,179,39.19207553,0.160829008,45]]

# Convert the array of arrays to a serializable list in a JSON document
input_json = json.dumps({"data": x_new})

# Call the web service, passing the input data
predictions = service.run(input_data = input_json)

# Get the predicted classes.
predicted_classes = json.loads(predictions)
   
for i in range(len(x_new)):
    print ("Patient {}".format(x_new[i]), predicted_classes[i] )

Patient [2, 180, 74, 24, 21, 23.9091702, 1.488172308, 22] diabetic
Patient [0, 148, 58, 11, 179, 39.19207553, 0.160829008, 45] not-diabetic
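
Because the service receives bare lists of numbers, a malformed observation only surfaces as an error once the request reaches the model. As a hedged sketch, a client could check each observation before sending it. The feature names come from the training code earlier in this notebook; `build_payload` is a hypothetical helper, not part of the Azure ML SDK.

```python
import json

# Feature order used when the model was trained
FEATURE_NAMES = ['Pregnancies', 'PlasmaGlucose', 'DiastolicBloodPressure', 'TricepsThickness',
                 'SerumInsulin', 'BMI', 'DiabetesPedigree', 'Age']

def build_payload(rows):
    """Return the JSON request body, or raise ValueError if any row is malformed."""
    for index, row in enumerate(rows):
        # Every observation needs exactly one value per feature
        if len(row) != len(FEATURE_NAMES):
            raise ValueError("Row {} has {} values; expected {}".format(
                index, len(row), len(FEATURE_NAMES)))
        # Every value must be a number (bool is excluded even though it subclasses int)
        for name, value in zip(FEATURE_NAMES, row):
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError("Row {}: {} must be numeric".format(index, name))
    return json.dumps({"data": rows})

x_new = [[2, 180, 74, 24, 21, 23.9091702, 1.488172308, 22],
         [0, 148, 58, 11, 179, 39.19207553, 0.160829008, 45]]

input_json = build_payload(x_new)
print(input_json)
```

The resulting `input_json` has the same shape as the one built by hand above, so it could be passed to `service.run` unchanged.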


The code above uses the Azure ML SDK to connect to the containerized web service and use it to generate predictions from your diabetes classification model. In production, a model is likely to be consumed by business applications that do not use the Azure ML SDK, but simply make HTTP requests to the web service.

Let's determine the URL to which these applications must submit their requests:

In [13]:
endpoint = service.scoring_uri
print(endpoint)

http://0b4964f1-e727-4a27-a7a2-ec6d98d4d3f4.northeurope.azurecontainer.io/score


Now that you know the endpoint URI, an application can simply make an HTTP request, sending the patient data in JSON (or binary) format, and receive back the predicted class(es).

In [14]:
import requests
import json

x_new = [[2,180,74,24,21,23.9091702,1.488172308,22],
         [0,148,58,11,179,39.19207553,0.160829008,45]]

# Convert the array to a serializable list in a JSON document
input_json = json.dumps({"data": x_new})

# Set the content type
headers = { 'Content-Type':'application/json' }

predictions = requests.post(endpoint, input_json, headers = headers)
predicted_classes = json.loads(predictions.json())

for i in range(len(x_new)):
    print ("Patient {}".format(x_new[i]), predicted_classes[i] )

Patient [2, 180, 74, 24, 21, 23.9091702, 1.488172308, 22] diabetic
Patient [0, 148, 58, 11, 179, 39.19207553, 0.160829008, 45] not-diabetic
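
The call to `json.loads(predictions.json())` above decodes the response twice. A plausible reading, consistent with the code in this notebook, is that the entry script returns a JSON string from `json.dumps`, and the service then JSON-encodes that return value again in the HTTP body. The sketch below imitates that behavior with the standard library only; the second `json.dumps` is an assumption that stands in for the service.

```python
import json

# What the entry script's run() function returns: a JSON string
script_result = json.dumps(["diabetic", "not-diabetic"])

# Assumption: the service JSON-encodes the returned string again for the HTTP body
http_body = json.dumps(script_result)

# First decode (the equivalent of predictions.json()): yields a str, not a list
first_decode = json.loads(http_body)
print(type(first_decode).__name__)

# Second decode (the explicit json.loads): yields the list of class names
predicted_classes = json.loads(first_decode)
print(predicted_classes)
```

If the entry script returned a list instead of a JSON string, a single decode would presumably be enough, but that variation is not shown in this notebook.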


You've deployed your web service as an Azure Container Instance (ACI) service that requires no authentication. This is fine for development and testing, but for production you should consider deploying to an Azure Kubernetes Service (AKS) cluster and enabling authentication. This would require REST requests to include an **Authorization** header.
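
As a rough sketch of what such a request might look like, the example below builds (but does not send) a request that carries an **Authorization** header. It assumes key-based authentication in which the service expects the key as a bearer token. The endpoint URL and key are placeholders, and `build_scoring_request` is a hypothetical helper, not part of the Azure ML SDK.

```python
import json
import urllib.request

def build_scoring_request(endpoint, rows, key):
    """Build a POST request with JSON data and a bearer-key Authorization header."""
    body = json.dumps({"data": rows}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumption: the service accepts the key as a bearer token
        "Authorization": "Bearer " + key,
    }
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")

# Placeholder values for illustration only
request = build_scoring_request("http://example.invalid/score",
                                [[2, 180, 74, 24, 21, 23.9091702, 1.488172308, 22]],
                                "placeholder-key")

print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

Passing the request to `urllib.request.urlopen` would send it. The same headers dictionary could equally be passed to `requests.post`, as in the unauthenticated example above.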

## Delete the Service

When you no longer need your service, you should delete it to avoid incurring unnecessary charges.

In [17]:
service.delete()
print ('Service deleted.')

Service deleted.


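## Deploy to a Local Container

You can also deploy the model to a Docker container on the local machine, which is useful for testing the entry script and environment before deploying to a hosted target.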

In [1]:
import azureml.core
from azureml.core import Workspace
from azureml.core import Model

# Load the workspace from the saved config file
ws = Workspace.from_config()

In [3]:
from azureml.core import Model

for model in Model.list(ws):
    print(model.name, 'version:', model.version)
    for tag_name in model.tags:
        tag = model.tags[tag_name]
        print ('\t',tag_name, ':', tag)
    for prop_name in model.properties:
        prop = model.properties[prop_name]
        print ('\t',prop_name, ':', prop)
    print('\n')

diabetes_model version: 6
	 Training context : Inline Training
	 AUC : 0.8808423304642021
	 Accuracy : 0.8936666666666667


diabetes_model version: 5
	 Training context : Pipeline


diabetes_model version: 4
	 Training context : Azure ML compute
	 AUC : 0.8869152478187508
	 Accuracy : 0.9017777777777778


diabetes_model version: 3
	 Training context : Estimator + Environment (Decision Tree)
	 AUC : 0.8814544934927504
	 Accuracy : 0.8975555555555556


diabetes_model version: 2
	 Training context : SKLearn Estimator (using Datastore)
	 AUC : 0.8568655044545174
	 Accuracy : 0.7893333333333333


diabetes_model version: 1
	 Training context : Estimator
	 AUC : 0.8484929598487486
	 Accuracy : 0.774




In [4]:
model = Model(ws, 'diabetes_model')

In [5]:
model.serialize()

{'createdTime': '2020-04-13T21:11:56.471008+00:00',
 'createdBy': {'userObjectId': 'f64e8bcf-37e5-4336-8308-b2bbc84751e3',
  'userPuId': '100320003F695FEE',
  'userIdp': 'live.com',
  'userAltSecId': '1:live.com:00030000E22C9612',
  'userIss': 'https://sts.windows.net/b114a8b5-5fa7-4e53-ab5b-fadfbe7ac46f/',
  'userTenantId': 'b114a8b5-5fa7-4e53-ab5b-fadfbe7ac46f',
  'userName': 'Parinaz Bassampour'},
 'description': None,
 'id': 'diabetes_model:6',
 'mimeType': 'application/json',
 'name': 'diabetes_model',
 'framework': 'Custom',
 'frameworkVersion': None,
 'tags': {'Training context': 'Inline Training'},
 'properties': {'AUC': '0.8808423304642021', 'Accuracy': '0.8936666666666667'},
 'unpack': False,
 'url': 'aml://asset/ae4a532cca7e46358158c26ec2085762',
 'version': 6,
 'experimentName': 'diabetes-training',
 'runId': '2bfbe196-13a6-41a2-bba5-65acff49fdc4',
 'runDetails': 'Run(Experiment: diabetes-training,\nId: 2bfbe196-13a6-41a2-bba5-65acff49fdc4,\nType: None,\nStatus: Completed)'

In [8]:
from azureml.core.webservice import LocalWebservice 
import json
from azureml.core.model import InferenceConfig

folder_name = 'diabetes_service'

x_new = [[2,180,74,24,21,23.9091702,1.488172308,22],
         [0,148,58,11,179,39.19207553,0.160829008,45]]


# Configure the scoring environment
inference_config = InferenceConfig(runtime= "python",
                                   source_directory = folder_name,
                                   entry_script="score_diabetes.py",
                                   conda_file="diabetes_env.yml")

# Host the service in a local Docker container, listening on port 8890
deployment_config = LocalWebservice.deploy_configuration(port=8890)
service = Model.deploy(ws, 'test-local', [model], inference_config, deployment_config)

Downloading model diabetes_model:6 to /tmp/azureml_54ljvzwa/diabetes_model/6
Generating Docker build context.
Package creation Succeeded
Logging into Docker registry myaml4968649f.azurecr.io
Logging into Docker registry myaml4968649f.azurecr.io
Building Docker image from Dockerfile...
Step 1/5 : FROM myaml4968649f.azurecr.io/azureml/azureml_2700b94ff0947541a5ba92bcd69cf07c
 ---> 7d669b01a6a4
Step 2/5 : COPY azureml-app /var/azureml-app
 ---> 40634f73077f
Step 3/5 : RUN mkdir -p '/var/azureml-app' && echo eyJhY2NvdW50Q29udGV4dCI6eyJzdWJzY3JpcHRpb25JZCI6IjQ2OTI2YmZmLWZlN2QtNDI4NC1iYzYyLWVhZmRkYThkOGYyYyIsInJlc291cmNlR3JvdXBOYW1lIjoiZGF0YXNpZW5jZXNvbHV0aW9uYXp1cmUiLCJhY2NvdW50TmFtZSI6Im15YW1sIiwid29ya3NwYWNlSWQiOiI1MTg3ZGJlYS0yNjgxLTQ3ZDEtOWUxMS1mZmJkZjg0YTI5MzgifSwibW9kZWxzIjp7fSwibW9kZWxzSW5mbyI6e319 | base64 --decode > /var/azureml-app/model_config_map.json
 ---> Running in 375cab2e30ec
 ---> 3aff67a570d7
Step 4/5 : RUN mv '/var/azureml-app/tmpltj_3wjg.py' /var/azureml-app/main.py
 ---

In [13]:
x_new = [[2,180,74,24,21,23.9091702,1.488172308,22],
         [0,148,58,11,179,39.19207553,0.160829008,45]]

# Convert the array to a serializable list in a JSON document
input_json = json.dumps({"data": x_new})

print(service.run(input_data = input_json)) 

["diabetic", "not-diabetic"]


In [14]:
service.scoring_uri

'http://localhost:8890/score'

In [16]:
service.delete()
print ('Service deleted.')

Container has been successfully cleaned up.
Service deleted.


In [18]:
from azureml.core.webservice import LocalWebservice 
LocalWebservice.list(ws)

[]

For more information about publishing a model as a service, see the [documentation](https://docs.microsoft.com/azure/machine-learning/how-to-deploy-and-where).