# Exercise 5 - Deploying a Model into Production

In [the previous exercise](./03%20-%20Compute%20Contexts.ipynb), you explored options for running experiments on local and remote compute to train machine learning models. In this exercise, you'll build on that work and deploy a model as a production web service.

> **Important**: This exercise assumes you have completed the previous exercises in this series - specifically, you must have:
>
> - Created an Azure ML Workspace.
> - Uploaded the diabetes.csv data file to the workspace's default datastore.
> - Registered a **Diabetes Dataset** dataset in the workspace.
> - Trained and registered at least one **diabetes_model** model in the workspace.
>
> If you haven't done that, what are you waiting for?

## Task 1: Connect to Your Workspace

The first thing you need to do is connect to your workspace using the Azure ML SDK. Let's start by ensuring you still have the latest version installed (if you ended and restarted your Azure Notebooks session, the environment may have been reset).

In [None]:
!pip install --upgrade azureml-sdk[notebooks]

import azureml.core
print("Ready to use Azure ML", azureml.core.VERSION)

Now you're ready to connect to your workspace. When you created it in the previous exercise, you saved its configuration; so now you can simply load the workspace from its configuration file.

> **Note**: If the authenticated session with your Azure subscription has expired since you completed the previous exercise, you'll be prompted to reauthenticate.

In [None]:
from azureml.core import Workspace

# Load the workspace from the saved config file
ws = Workspace.from_config()
print('Ready to work with', ws.name)

## Task 2: Deploy a Model as a Web Service

In the previous exercise, you trained and registered a machine learning model that classifies patients based on the likelihood of them having diabetes. This model could be used in a production environment such as a doctor's surgery, where only patients deemed to be at risk need to be subjected to a clinical test for diabetes. To support this scenario, you will deploy the model as a web service.

First, let's determine what models you have registered in the workspace.

In [None]:
from azureml.core import Model

for model in Model.list(ws):
    print(model.name, 'version:', model.version)
    for tag_name in model.tags:
        tag = model.tags[tag_name]
        print ('\t',tag_name, ':', tag)
    for prop_name in model.properties:
        prop = model.properties[prop_name]
        print ('\t',prop_name, ':', prop)
    print('\n')

Right, now let's get the model that we want to deploy. By default, if we specify a model name, the latest version will be returned.

In [None]:
model = ws.models['diabetes_model']
print(model.name, 'version', model.version)

We're going to create a web service to host this model, and this will require some code and configuration files; so let's create a folder for those.

In [None]:
import os

folder_name = 'diabetes_service'

# Create a folder for the web service files
experiment_folder = './' + folder_name
os.makedirs(experiment_folder, exist_ok=True)

print(folder_name, 'folder created.')

The web service where we deploy the model will need some Python code to load the input data, get the model from the workspace, and generate and return predictions. We'll save this code in a *scoring* file that will be deployed to the web service:

In [None]:
%%writefile $folder_name/score_diabetes.py
import json
import numpy as np
import os
import joblib
from azureml.core.model import Model
import azureml.train.automl # Required for AutoML models

# Called when the service is loaded
def init():
    global model
    # Get the path to the deployed model file and load it
    model_path = Model.get_model_path('diabetes_model')
    model = joblib.load(model_path)

# Called when a request is received
def run(raw_data):
    # Get the input data - the features of patients to be classified.
    data = json.loads(raw_data)['data']
    # Get a prediction from the model
    predictions = model.predict(data)
    # Get the corresponding classname for each prediction (0 or 1)
    classnames = ['not-diabetic', 'diabetic']
    predicted_classes = []
    for prediction in predictions:
        predicted_classes.append(classnames[prediction])
    # Return the predictions as JSON
    return json.dumps(predicted_classes)
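
Before deploying, it can help to exercise the scoring logic locally. The following is a minimal sketch that is not part of the original exercise: `StubModel` is a hypothetical stand-in for the registered model (it labels a patient as diabetic when the second feature exceeds 140, an arbitrary rule chosen purely for illustration), and this `run` mirrors the steps in the scoring file but takes the model as an argument instead of using a global.

```python
import json

class StubModel:
    """Hypothetical stand-in for the trained model (illustration only)."""
    def predict(self, rows):
        # Arbitrary rule: class 1 when the second feature exceeds 140
        return [1 if row[1] > 140 else 0 for row in rows]

def run(raw_data, model):
    # Same steps as the scoring file: parse the JSON, predict, map to class names
    data = json.loads(raw_data)['data']
    predictions = model.predict(data)
    classnames = ['not-diabetic', 'diabetic']
    predicted_classes = [classnames[int(p)] for p in predictions]
    return json.dumps(predicted_classes)

# Two made-up patients, each with eight feature values
sample = json.dumps({"data": [[2, 180, 74, 24, 21, 23.9, 1.49, 22],
                              [0, 100, 58, 11, 179, 39.2, 0.16, 45]]})
result = run(sample, StubModel())
print(result)
```

A check like this only confirms that the JSON handling and class-name mapping behave as intended; it says nothing about the quality of the real model's predictions.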

The web service will be hosted in a container, and the container will need to install any required Python dependencies when it gets initialized. In this case, our scoring code requires **scikit-learn**, so we'll create a .yml file that tells the container host to install this into the environment.

In [None]:
from azureml.core.conda_dependencies import CondaDependencies 

# Add the dependencies for our model (AzureML defaults are already included)
myenv = CondaDependencies()
myenv.add_conda_package("scikit-learn")
myenv.add_pip_package("azureml-sdk[automl]") # Required for AutoML models

# Save the environment config as a .yml file
env_file = folder_name + "/diabetes_env.yml"
with open(env_file,"w") as f:
    f.write(myenv.serialize_to_string())
print("Saved dependency info in", env_file)

# Print the .yml file
with open(env_file,"r") as f:
    print(f.read())

Now you're ready to deploy. We'll deploy the container as a service named **diabetes-service**. The deployment process includes the following steps:

1. Define an inference configuration, which includes the scoring and environment files required to load and use the model.
2. Define a deployment configuration that defines the execution environment in which the service will be hosted (in this case, an Azure Container Instance).
3. Deploy the model as a web service.
4. Verify the status of the deployed service.

> **More Information**: For more details about model deployment, and options for target execution environments, see the [documentation](https://docs.microsoft.com/en-gb/azure/machine-learning/service/how-to-deploy-and-where).

Deployment will take some time as it first runs a process to create a container image, and then runs a process to create a web service based on the image. When deployment has completed successfully, you'll see a status of **Healthy**.

In [None]:
from azureml.core.webservice import AciWebservice
from azureml.core.model import InferenceConfig

# Configure the scoring environment
inference_config = InferenceConfig(runtime= "python",
                                   source_directory = folder_name,
                                   entry_script="score_diabetes.py",
                                   conda_file="diabetes_env.yml")

deployment_config = AciWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1)

service_name = "diabetes-service"

service = Model.deploy(ws, service_name, [model], inference_config, deployment_config)

service.wait_for_deployment(True)
print(service.state)

Take a look at your workspace in the [Azure portal](https://portal.azure.com) and view the **Deployments** tab, which shows the deployed services in your workspace.

You can also enumerate the web services using the following code:

In [None]:
for webservice_name in ws.webservices:
    webservice = ws.webservices[webservice_name]
    print(webservice.name)

## Task 3: Use the Web Service

With the service deployed, you can now consume it from a client application.

In [None]:
import json

x_new = [[2,180,74,24,21,23.9091702,1.488172308,22]]
print ('Patient: {}'.format(x_new[0]))

# Convert the array to a serializable list in a JSON document
input_json = json.dumps({"data": x_new})

# Call the web service, passing the input data (the web service will also accept the data in binary format)
predictions = service.run(input_data = input_json)

# Get the predicted class - it'll be the first (and only) one.
predicted_classes = json.loads(predictions)
print(predicted_classes[0])
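
The service expects a JSON document with a `data` key holding a list of feature rows. As a hedged sketch (not part of the original exercise), a client could validate rows before sending them. `build_payload` is a hypothetical helper, and `EXPECTED_FEATURES = 8` is an assumption based on the eight values per patient used in this exercise.

```python
import json

# Assumption: each patient row has eight values, as in the examples in this exercise
EXPECTED_FEATURES = 8

def build_payload(rows):
    """Return the JSON document for the service, or raise ValueError on malformed rows."""
    for i, row in enumerate(rows):
        if len(row) != EXPECTED_FEATURES:
            raise ValueError(
                "Row {} has {} values, expected {}".format(i, len(row), EXPECTED_FEATURES))
        # bool is a subclass of int in Python, so exclude it explicitly
        if not all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in row):
            raise ValueError("Row {} contains a non-numeric value".format(i))
    return json.dumps({"data": rows})

payload = build_payload([[2, 180, 74, 24, 21, 23.9091702, 1.488172308, 22]])
print(payload)
```

The resulting string has the same shape as `input_json` above, so it could be passed to `service.run` in its place.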

You can also send multiple patient observations to the service, and get back a prediction for each one.

In [None]:
import json

# This time our input is an array of two feature arrays
x_new = [[2,180,74,24,21,23.9091702,1.488172308,22],
         [0,148,58,11,179,39.19207553,0.160829008,45]]

# Convert the array of arrays to a serializable list in a JSON document
input_json = json.dumps({"data": x_new})

# Call the web service, passing the input data
predictions = service.run(input_data = input_json)

# Get the predicted classes.
predicted_classes = json.loads(predictions)
   
for i in range(len(x_new)):
    print ("Patient {}".format(x_new[i]), predicted_classes[i] )

The code above uses the Azure ML SDK to connect to the containerized web service and use it to generate predictions from your diabetes classification model. In production, a model is likely to be consumed by business applications that do not use the Azure ML SDK, but simply make HTTP requests to the web service.

Let's determine the URL to which these applications must submit their requests:

In [None]:
endpoint = service.scoring_uri
print(endpoint)

Now that you know the endpoint URI, an application can simply make an HTTP request, sending the patient data in JSON (or binary) format, and receive back the predicted class(es).

In [None]:
import requests
import json

x_new = [[2,180,74,24,21,23.9091702,1.488172308,22],
         [0,148,58,11,179,39.19207553,0.160829008,45]]

# Convert the array to a serializable list in a JSON document
input_json = json.dumps({"data": x_new})

# Set the content type
headers = { 'Content-Type':'application/json' }

predictions = requests.post(endpoint, input_json, headers = headers)
predicted_classes = json.loads(predictions.json())

for i in range(len(x_new)):
    print ("Patient {}".format(x_new[i]), predicted_classes[i] )
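
The `json.loads(predictions.json())` call above decodes the response twice. The code suggests why: the scoring function returns a string produced by `json.dumps`, and the response body then carries that string as a JSON value, so the first decode yields a string and the second yields the list. The sketch below simulates this without any network call. `decode_predictions` is a hypothetical helper, and its tolerance for a body that is already a plain JSON list is an assumption added for illustration.

```python
import json

def decode_predictions(body_text):
    """Decode a response body that is either JSON or a JSON-encoded JSON string."""
    decoded = json.loads(body_text)
    # If the first pass yields a string, that string is itself a JSON document
    if isinstance(decoded, str):
        decoded = json.loads(decoded)
    return decoded

# Simulate what the scoring function returns, then the extra encoding layer
inner = json.dumps(['diabetic', 'not-diabetic'])
body = json.dumps(inner)

classes = decode_predictions(body)
print(classes)
```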