# Creating a Real-Time Inferencing Service

You've spent a lot of time in this course training and registering machine learning models. Now it's time to deploy a model as a real-time service that clients can use to get predictions from new data.

## Before You Start

Before you start this lab, ensure that you have completed the *Create an Azure Machine Learning Workspace* and *Create a Compute Instance* tasks in [Lab 1: Getting Started with Azure Machine Learning](./labdocs/Lab01.md). Then open this notebook in Jupyter on your Compute Instance.

## Connect to Your Workspace

The first thing you need to do is to connect to your workspace using the Azure ML SDK.

> **Note**: If you do not have a current authenticated session with your Azure subscription, you'll be prompted to authenticate. Follow the instructions to authenticate using the code provided.

In [16]:
import joblib
from azureml import core
from sklearn import model_selection, tree, metrics
import pandas as pd
import numpy as np
from azureml.core import conda_dependencies, webservice
import json
import requests


In [2]:
ws = core.Workspace.from_config()
print(f'Ready to use Azure ML {core.VERSION} to work with {ws.name}')


Ready to use Azure ML 1.11.0 to work with workspace


## Train and Register a Model

You'll need a trained model to deploy. Run the cell below to train and register a model that predicts the likelihood of a clinic patient being diabetic.

In [3]:
experiment = core.Experiment(workspace=ws, name = 'diabetes-training')
run = experiment.start_logging()
print("Starting experiment:", experiment.name)

print("Loading Data...")
diabetes: pd.DataFrame = pd.read_csv('data/diabetes.csv')

X = diabetes[
        [
            'Pregnancies', 'PlasmaGlucose', 'DiastolicBloodPressure',
            'TricepsThickness', 'SerumInsulin', 'BMI', 'DiabetesPedigree',
            'Age',
        ]
    ].to_numpy()
y = diabetes['Diabetic'].to_numpy()

X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, test_size=0.30, random_state=0
)

print('Training a decision tree model')
model = tree.DecisionTreeClassifier().fit(X_train, y_train)

y_hat = model.predict(X_test)
acc = np.average(y_hat == y_test)
print('Accuracy:', acc)
run.log('Accuracy', acc)

y_scores = model.predict_proba(X_test)
auc = metrics.roc_auc_score(y_test,y_scores[:,1])
print('AUC:', auc)
run.log('AUC', auc)

model_file = 'diabetes_model.pkl'
joblib.dump(value=model, filename=model_file)
run.upload_file(
    name = 'outputs/' + model_file, path_or_stream = './' + model_file
)

run.complete()

run.register_model(
    model_path='outputs/diabetes_model.pkl', model_name='diabetes_model',
    tags={'Training context':'Inline Training'},
    properties={
        'AUC': run.get_metrics()['AUC'],
        'Accuracy': run.get_metrics()['Accuracy']
    }
)

print('Model trained and registered.')


Starting experiment: diabetes-training
Loading Data...
Training a decision tree model
Accuracy: 0.8903333333333333
AUC: 0.8778421812030448
Model trained and registered.


## Deploy a Model as a Web Service

Now you have trained and registered a machine learning model that classifies patients based on the likelihood of them having diabetes. This model could be used in a production environment such as a doctor's surgery where only patients deemed to be at risk need to be subjected to a clinical test for diabetes. To support this scenario, you will deploy the model as a web service.

First, let's determine what models you have registered in the workspace.

In [4]:
for model in core.Model.list(ws):
    print(model.name, 'version:', model.version)
    for tag_name in model.tags:
        tag = model.tags[tag_name]
        print ('\t',tag_name, ':', tag)
    for prop_name in model.properties:
        prop = model.properties[prop_name]
        print ('\t',prop_name, ':', prop)
    print('\n')


diabetes_model version: 9
	 Training context : Inline Training
	 AUC : 0.8778421812030448
	 Accuracy : 0.8903333333333333


diabetes_model version: 8
	 Training context : Pipeline


diabetes_model version: 7
	 Training context : Pipeline


diabetes_model version: 6
	 Training context : Pipeline


diabetes_model version: 5
	 Training context : Parameterized SKLearn Estimator
	 AUC : 0.8483904671874223
	 Accuracy : 0.7736666666666666


diabetes_model version: 4
	 Training context : Parameterized SKLearn Estimator
	 AUC : 0.8483904671874223
	 Accuracy : 0.7736666666666666


diabetes_model version: 3
	 Training context : Estimator
	 AUC : 0.8484929598487486
	 Accuracy : 0.774


diabetes_model version: 2
	 Training context : Estimator
	 AUC : 0.8483377282451863
	 Accuracy : 0.774


diabetes_model version: 1
	 Training context : Estimator
	 AUC : 0.8483377282451863
	 Accuracy : 0.774




Right, now let's get the model that we want to deploy. By default, if we specify a model name, the latest version will be returned.

In [5]:
model = ws.models['diabetes_model']
print(model.name, 'version', model.version)

diabetes_model version 9


Now you're ready to deploy. We'll deploy the container as a service named **diabetes-service**. The deployment process includes the following steps:

1. Define an inference configuration, which includes the scoring and environment files required to load and use the model.
2. Define a deployment configuration that defines the execution environment in which the service will be hosted. In this case, an Azure Container Instance.
3. Deploy the model as a web service.
4. Verify the status of the deployed service.

> **More Information**: For more details about model deployment, and options for target execution environments, see the [documentation](https://docs.microsoft.com/en-gb/azure/machine-learning/service/how-to-deploy-and-where).
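
The entry script named in the inference configuration is not shown in this notebook. As a rough sketch, assuming the usual Azure ML convention of an `init()` function that loads the model once and a `run(raw_data)` function that scores each request, it might look something like the following. The `StubModel` class and its glucose threshold are hypothetical stand-ins so the sketch runs without the SDK; a real script would load the registered `diabetes_model.pkl` file instead.

```python
import json

# Label order assumed from the predictions shown in this lab.
CLASS_NAMES = ['not-diabetic', 'diabetic']


class StubModel:
    """Hypothetical stand-in for the trained classifier (illustration only)."""

    def predict(self, rows):
        # Made-up rule: flag a patient when PlasmaGlucose (column 1) exceeds 150.
        return [1 if row[1] > 150 else 0 for row in rows]


model = None


def init():
    # Called once when the service starts. A real entry script would load
    # the registered model file here (for example with joblib.load).
    global model
    model = StubModel()


def run(raw_data):
    # Called for every request: parse the JSON payload, score each row,
    # and return the class names as a JSON string.
    data = json.loads(raw_data)['data']
    predictions = model.predict(data)
    return json.dumps([CLASS_NAMES[p] for p in predictions])


init()
print(run(json.dumps({'data': [[2, 180, 74, 24, 21, 23.9, 1.49, 22]]})))
```

Loading the model in `init()` rather than in `run()` means the cost of loading the model is paid once per container start, not once per request.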

Deployment will take some time as it first runs a process to create a container image, and then runs a process to create a web service based on the image. When deployment has completed successfully, you'll see a status of **Healthy**.

In [9]:
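# NOTE: folder_name is not defined in the cells shown here; it must be set
# beforehand to the folder that contains score_diabetes.py and diabetes_env.yml.
inference_config = core.model.InferenceConfig(
    runtime='python',
    source_directory=folder_name,
    entry_script='score_diabetes.py',
    conda_file='diabetes_env.yml'
)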

deployment_config = webservice.AciWebservice.deploy_configuration(
    cpu_cores=1, memory_gb=1
)

service_name = 'diabetes-service'

service = core.model.Model.deploy(
    workspace=ws, name=service_name, models=[model], 
    inference_config=inference_config, deployment_config=deployment_config
)

service.wait_for_deployment(True)
print(service.state)


Running..............................................................................................................
Succeeded
ACI service creation operation finished, operation "Succeeded"
Healthy


Hopefully, the deployment has been successful and you can see a status of **Healthy**. If not, you can use the following code to retrieve the service logs to help you troubleshoot.

In [10]:
print(service.get_logs())

# If you need to make a change and redeploy, you may need to delete the unhealthy service using the following code:
#service.delete()

2020-08-15T22:37:28,394589217+00:00 - gunicorn/run 
2020-08-15T22:37:28,394915217+00:00 - rsyslog/run 
2020-08-15T22:37:28,396764817+00:00 - iot-server/run 
2020-08-15T22:37:28,567318743+00:00 - nginx/run 
/usr/sbin/nginx: /azureml-envs/azureml_4b824bcb98517d791c41923f24d65461/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_4b824bcb98517d791c41923f24d65461/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_4b824bcb98517d791c41923f24d65461/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_4b824bcb98517d791c41923f24d65461/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_4b824bcb98517d791c41923f24d65461/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
Starting

Take a look at your workspace in the [Azure web interface](https://ml.azure.com) and view the **Endpoints** page, which shows the deployed services in your workspace.

You can also retrieve the names of web services in your workspace by running the following code:

In [11]:
for webservice_name in ws.webservices:
    print(webservice_name)

diabetes-service


## Use the Web Service

With the service deployed, you can now consume it from a client application.

In [13]:
x_new = [[2,180,74,24,21,23.9091702,1.488172308,22]]
print (f'Patient: {x_new[0]}')

input_json = json.dumps({"data": x_new})

predictions = service.run(input_data=input_json)

predicted_classes = json.loads(predictions)
print(predicted_classes[0])


Patient: [2, 180, 74, 24, 21, 23.9091702, 1.488172308, 22]
diabetic
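
Because the model was trained on a plain NumPy array, each observation in the request must list the eight features in the same order used during training. A small helper could check this before the payload is sent. This is only a sketch: `make_payload` and `FEATURES` are hypothetical names introduced for illustration, and the feature order is copied from the training cell earlier in this notebook.

```python
import json

# Feature order taken from the training cell earlier in this notebook.
FEATURES = [
    'Pregnancies', 'PlasmaGlucose', 'DiastolicBloodPressure',
    'TricepsThickness', 'SerumInsulin', 'BMI', 'DiabetesPedigree', 'Age',
]


def make_payload(rows):
    """Check each row's length and return the JSON request body (hypothetical helper)."""
    for i, row in enumerate(rows):
        if len(row) != len(FEATURES):
            raise ValueError(
                f'Row {i} has {len(row)} values; expected {len(FEATURES)}'
            )
    return json.dumps({'data': rows})


input_json = make_payload([[2, 180, 74, 24, 21, 23.9091702, 1.488172308, 22]])
print(input_json)
```

The helper checks only the number of values per row. It cannot detect values supplied in the wrong order, so the caller is still responsible for matching the training column order.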


You can also send multiple patient observations to the service, and get back a prediction for each one.

In [14]:
x_new = [
    [2,180,74,24,21,23.9091702,1.488172308,22],
    [0,148,58,11,179,39.19207553,0.160829008,45]
]

input_json = json.dumps({"data": x_new})

predictions = service.run(input_data=input_json)

predicted_classes = json.loads(predictions)
   
for i in range(len(x_new)):
    print (f"Patient {x_new[i]} {predicted_classes[i]}")


Patient [2, 180, 74, 24, 21, 23.9091702, 1.488172308, 22] diabetic
Patient [0, 148, 58, 11, 179, 39.19207553, 0.160829008, 45] not-diabetic


The code above uses the Azure ML SDK to connect to the containerized web service and use it to generate predictions from your diabetes classification model. In production, a model is likely to be consumed by business applications that do not use the Azure ML SDK, but simply make HTTP requests to the web service.

Let's determine the URL to which these applications must submit their requests:

In [15]:
endpoint = service.scoring_uri
print(endpoint)

http://06907fa4-7154-4421-b14b-9817d7ec617f.westus.azurecontainer.io/score


Now that you know the endpoint URI, an application can simply make an HTTP request, sending the patient data in JSON (or binary) format, and receive back the predicted class(es).

In [17]:
headers = {'Content-Type':'application/json'}

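predictions = requests.post(endpoint, input_json, headers=headers)
# The response body is a JSON-encoded string that itself contains JSON,
# so it is decoded twice: once by .json() and once by json.loads().
predicted_classes = json.loads(predictions.json())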

for i in range(len(x_new)):
    print (f"Patient {x_new[i]} {predicted_classes[i]}")


Patient [2, 180, 74, 24, 21, 23.9091702, 1.488172308, 22] diabetic
Patient [0, 148, 58, 11, 179, 39.19207553, 0.160829008, 45] not-diabetic
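
The REST call above decodes the response twice (`predictions.json()` followed by `json.loads`). That is needed when the scoring script returns a JSON string, which is then JSON-encoded a second time in the HTTP response body. The following minimal sketch reproduces that round trip without calling the service. It assumes the entry script returns `json.dumps(...)` of a list of class names; the script is not shown in this notebook, so treat that as an assumption.

```python
import json

# What the entry script is assumed to return: a JSON-formatted string.
script_result = json.dumps(['diabetic', 'not-diabetic'])

# The HTTP response body carries that string as a JSON value,
# so it ends up encoded a second time.
response_body = json.dumps(script_result)

# Client side: the first decode yields a str, the second yields the list.
first_pass = json.loads(response_body)      # plays the role of response.json()
predicted_classes = json.loads(first_pass)

print(type(first_pass).__name__, predicted_classes)
```

If a scoring script returned a list or dictionary directly instead of a string, a single decode would likely be enough, so it is worth checking the type of the first decoded value before decoding again.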


You've deployed your web service as an Azure Container Instance (ACI) service that requires no authentication. This is fine for development and testing, but for production you should consider deploying to an Azure Kubernetes Service (AKS) cluster and enabling authentication. This would require REST requests to include an **Authorization** header.
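
As a sketch of what that might look like from a client, the helper below builds the request headers with and without a key. `build_headers` and the placeholder key are hypothetical and used only for illustration; the sketch assumes key-based authentication, where the key is sent as a bearer token.

```python
def build_headers(api_key=None):
    """Return HTTP headers for a scoring request (hypothetical helper)."""
    headers = {'Content-Type': 'application/json'}
    if api_key:
        # With authentication enabled, the key is sent as a bearer token.
        headers['Authorization'] = f'Bearer {api_key}'
    return headers


print(build_headers())                      # unauthenticated service
print(build_headers('<your-service-key>'))  # service with authentication enabled
```

The resulting dictionary would be passed as the `headers` argument of `requests.post`, in place of the `headers` dictionary used in the earlier cell.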

### More Information

For more information about publishing a model as a service, see the [documentation](https://docs.microsoft.com/azure/machine-learning/how-to-deploy-and-where).

## Clean Up

If you've finished exploring, you can delete your service by running the cell below. Then close this notebook and shut down your Compute Instance.

In [18]:
service.delete()