# Creating a Real-Time Inferencing Service

After training a predictive model, you can deploy it as a real-time service that clients can use to get predictions from new data.

I've adapted the code from our documentation for [creating a real-time inferencing service](https://github.com/MicrosoftDocs/mslearn-aml-labs/blob/master/06-Deploying_a_model.ipynb).

In [11]:
import azureml.core
from azureml.core import Workspace
from azureml.core import Model

# Load the workspace from the saved config file
ws = Workspace.from_config()
print('Ready to use Azure ML {} to work with {}'.format(azureml.core.VERSION, ws.name))

Ready to use Azure ML 1.27.0 to work with mlops


In [12]:
# Set the folder for the experiment files used
training_folder = 'driver-training'

In [13]:
# List the models registered in the workspace

for model in Model.list(ws):
    print(model.name, 'version:', model.version)
    for tag_name in model.tags:
        tag = model.tags[tag_name]
        print ('\t',tag_name, ':', tag)
    for prop_name in model.properties:
        prop = model.properties[prop_name]
        print ('\t',prop_name, ':', prop)
    print('\n')

driver_model version: 3
	 Training context : Pipeline


driver_model.pkl version: 4


driver_model.pkl version: 3


driver_model version: 2
	 Training context : Pipeline


driver_model.pkl version: 2


driver_model version: 1
	 Training context : Pipeline


driver_model.pkl version: 1


compliance-classifier version: 17
	 type : classification
	 run_id : 0767a613-e8c2-4811-ae6b-64369e340725
	 build_number : 20201217.1


BikeBuyer.mml version: 4


AutoMLb9be0a22f28 version: 1


compliance-classifier version: 16
	 type : classification
	 run_id : a7eb32c9-3207-4a2c-865a-d65fba7946ba
	 build_number : 20201119.1


compliance-classifier version: 15
	 type : classification
	 run_id : 8e84d5f5-54c9-48c9-b733-9bbbcd86558b
	 build_number : 20201118.1


IBM_attrition_model version: 1
	 area : HR
	 type : attrition


AutoML015fd913221 version: 1


diabetes_model version: 1
	 Training context : Inline Training
	 AUC : 0.8857431111811085
	 Accuracy : 0.9002222222222223


glove-text-classifier versi

In [14]:
# Get the latest registered version of the model
model = ws.models['driver_model']
print(model.name, 'version', model.version)

driver_model version 3


In [15]:
# you need to figure out where you want to put the service folder
# ideally this should be at the same level as the driver-training folder
!pwd

/mnt/batch/tasks/shared/LS_root/mounts/clusters/davew202105/code/git/MLOps-E2E-sdkv2/Lab12


In [17]:
import os

# changeme
service_path = 'inf-svc'

# Create a folder for the web service files
os.makedirs(service_path, exist_ok=True)

print(service_path, 'folder created.')

inf-svc folder created.


The web service that hosts the model needs some Python code to load the input data, get the model from the workspace, and generate and return predictions. We'll save this code in an entry script (often called a scoring script and, by convention, named `score.py`) that is deployed to the web service:

Steps:

* Open the `score.py` file in the `Lab12` folder.
* Get a feel for what it is doing.
* You might need to change the model name in the script.
* This is the code our inferencing web service will call.
* Copy this file into the `service_path` folder now (you can simply copy and paste it from `Lab12` to `service_path`).
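
For orientation, here is a minimal sketch of the shape an entry script usually takes. It is not the lab's actual `score.py`: the file name `driver_model.pkl`, the use of `pickle`, and a model object with a `predict` method that accepts a list of rows are all assumptions made for illustration. Azure ML calls `init()` once when the container starts and `run()` for each request.

```python
import json
import os
import pickle

model = None  # populated once by init()


def init():
    """Called once at container start-up: load the model into memory."""
    global model
    # AZUREML_MODEL_DIR is set by Azure ML inside the deployed container.
    model_dir = os.getenv("AZUREML_MODEL_DIR", ".")
    # Assumed file name; change it to match your registered model.
    model_path = os.path.join(model_dir, "driver_model.pkl")
    with open(model_path, "rb") as f:
        model = pickle.load(f)


def run(raw_data):
    """Called per request: parse the JSON body, score it, return the result."""
    try:
        rows = json.loads(raw_data)["data"]
        predictions = model.predict(rows)
        # Convert to plain floats so the result is JSON serializable.
        return {"result": [float(p) for p in predictions]}
    except Exception as exc:
        # Returning the error text makes failed requests easier to debug.
        return {"error": str(exc)}
```

Compare this with the real `score.py` in `Lab12`; the model-loading line is the part most likely to differ.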

In [18]:
# now we need to add our model dependencies (azureml-defaults is already included)
# this is similar to what we did in previous labs but now we are going to save the environment file to a yaml file
from azureml.core.conda_dependencies import CondaDependencies 

# we can get these from our training script
myenv = CondaDependencies()
myenv.add_conda_package("scikit-learn")
myenv.add_conda_package("pandas")
myenv.add_pip_package("lightgbm")

# Save the environment config as a .yml file
env_file = service_path + "/driver_env.yml"
with open(env_file,"w") as f:
    f.write(myenv.serialize_to_string())
print("Saved dependency info in", env_file)

# Print the .yml file
with open(env_file,"r") as f:
    print(f.read())

# The yml file and score.py should be kept under source control, ideally in the
# driver-service folder.
# Examine the generated driver_env.yml so you understand it.

Saved dependency info in inf-svc/driver_env.yml
# Conda environment specification. The dependencies defined in this file will
# be automatically provisioned for runs with userManagedDependencies=False.

# Details about the Conda environment file format:
# https://conda.io/docs/user-guide/tasks/manage-environments.html#create-env-file-manually

name: project_environment
dependencies:
  # The python interpreter version.
  # Currently Azure ML only supports 3.5.2 and later.
- python=3.6.2

- pip:
    # Required packages for AzureML execution, history, and data preparation.
  - azureml-defaults

  - lightgbm
- scikit-learn
- pandas
channels:
- anaconda
- conda-forge



In [19]:
# we should have score.py and driver_env.yml in this folder
# we will use those next
!ls $service_path

driver_env.yml	score.py


The deployment process includes the following steps:

1. Define an inference configuration, which includes the scoring and environment files required to load and use the model.
2. Define a deployment configuration that defines the execution environment in which the service will be hosted. In this case, an Azure Container Instance.
3. Deploy the model as a web service.
4. Verify the status of the deployed service.

> **More Information**: For more details about model deployment, and options for target execution environments, see the [documentation](https://docs.microsoft.com/azure/machine-learning/how-to-deploy-and-where).

Deployment will take some time as it first runs a process to create a container image, and then runs a process to create a web service based on the image. When deployment has completed successfully, you'll see a status of **Healthy**.

## Scoring/Inferencing Configuration

* Now we need to configure the scoring environment.
* We'll use an ACI environment for this. AKS is another option that works well but requires a few more steps; we'll cover it in a later lab.
* You can view the status of the deployment in the AMLS portal under `Endpoints`.
* The first deployment always takes a little while.

In [22]:
from azureml.core.webservice import AciWebservice
from azureml.core.model import InferenceConfig

# Configure the scoring environment
inference_config = InferenceConfig(runtime= "python",
                                   source_directory = service_path,
                                   entry_script="score.py",
                                   conda_file="driver_env.yml")

deployment_config = AciWebservice.deploy_configuration(cpu_cores = 1, memory_gb = 1)

service_name = "driver-service-sdk2"

service = Model.deploy(ws, service_name, [model], inference_config, deployment_config)

service.wait_for_deployment(True)
print(service.state)

Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2021-06-01 14:20:09+00:00 Creating Container Registry if not exists.
2021-06-01 14:20:09+00:00 Registering the environment.
2021-06-01 14:20:12+00:00 Use the existing image.
2021-06-01 14:20:12+00:00 Generating deployment configuration.
2021-06-01 14:20:13+00:00 Submitting deployment to compute..
2021-06-01 14:20:41+00:00 Checking the status of deployment driver-service-sdk2..
2021-06-01 14:22:28+00:00 Checking the status of inference endpoint driver-service-sdk2.
Succeeded
ACI service creation operation finished, operation "Succeeded"
Healthy


Hopefully, the deployment has been successful and you can see a status of **Healthy**. If not, you can use the following code to check the status and get the service logs to help you troubleshoot.

Take a look at your workspace in [Azure ML Studio](https://ml.azure.com) and view the **Endpoints** page, which shows the deployed services in your workspace.

In [21]:
# If you need to make a change and redeploy, you may need to delete the old, unhealthy service first. Use the following code:
# service.delete()
# But don't delete the service once you have a good deployment. We will test the service next.
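
If you redeploy often, a small guard can save you from checking by hand whether an old service is still there. The helper below is a hypothetical sketch, not part of the SDK. It only assumes that the object passed in behaves like `ws.webservices`, that is, a dictionary of service names to objects with a `delete()` method.

```python
def delete_service_if_exists(webservices, name):
    """Delete the named service if present; return True if one was deleted.

    `webservices` is expected to behave like `ws.webservices`: a dict that
    maps service names to objects exposing a `delete()` method.
    """
    service = webservices.get(name)
    if service is None:
        return False
    service.delete()
    return True
```

In this notebook you would call it as `delete_service_if_exists(ws.webservices, service_name)` just before running `Model.deploy` again.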

In [23]:
# print the webservice endpoints
for webservice_name in ws.webservices:
    print(webservice_name)

driver-service-sdk2
driver-service
bikebuyeraciws
ar-factoring-2class
predict-attrition-svc1
support-ticket-duration
bikebuyer2-aks-service
bikebuyer-aci
compliance-classifier-service
bb-aks-service


In [24]:
# these are helpful commands
#print(service.state)
print(service.get_logs())
#service.delete()

2021-06-01T14:22:15,046221600+00:00 - iot-server/run 
2021-06-01T14:22:15,047277400+00:00 - rsyslog/run 
2021-06-01T14:22:15,047277300+00:00 - gunicorn/run 
2021-06-01T14:22:15,113525700+00:00 - nginx/run 
/usr/sbin/nginx: /azureml-envs/azureml_deade6e796a908fdec1522cb257b3da0/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_deade6e796a908fdec1522cb257b3da0/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_deade6e796a908fdec1522cb257b3da0/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_deade6e796a908fdec1522cb257b3da0/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_deade6e796a908fdec1522cb257b3da0/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
EdgeHubC

## Use the Web Service

With the service deployed, you can now consume it from a client application. Let's simulate that with Python, using the SDK initially.

In [28]:
import json

# this is our test data, it's actually 2 observations
TEST_ROW = [[0,1,8,1,0,0,1,0,0,0,0,0,0,0,12,1,0,0,0.5,0.3,0.610327781,7,1,-1,0,-1,1,1,1,2,1,65,1,0.316227766,0.669556409,0.352136337,3.464101615,0.1,0.8,0.6,1,1,6,3,6,2,9,1,1,1,12,0,1,1,0,0,1],
            [4,2,5,1,0,0,0,0,1,0,0,0,0,0,5,1,0,0,0.9,0.5,0.771362431,4,1,-1,0,0,11,1,1,0,1,103,1,0.316227766,0.60632002,0.358329457,2.828427125,0.4,0.5,0.4,3,3,8,4,10,2,7,2,0,3,10,0,0,1,1,0,1]
           ]
# convert the array to a serializable list in a JSON document
input_json = json.dumps({"data": TEST_ROW})

# call the web service, passing the input data (it will also accept the data in binary format)
predictions = service.run (input_data = input_json)
print("predictions: ", predictions)

predictions:  {'result': [0.027311045841034335, 0.0261231327307869]}


The code above uses the Azure ML SDK to connect to the containerized web service and use it to generate predictions from your model. In production, a model is likely to be consumed by business applications that do not use the Azure ML SDK, but simply make HTTP requests to the web service.

Let's determine the URL to which these applications must submit their requests:

In [29]:
endpoint = service.scoring_uri
print(endpoint)

http://07f0d01a-35fc-4398-9234-c2e936c593e9.eastus.azurecontainer.io/score


Now that you know the endpoint URI, an application can simply make an HTTP request, sending the input data in JSON (or binary) format, and receive the predictions in the response.

In [40]:
# we'll recycle the same inputs and vars as above, but add some scaffolding to simulate an API call
import requests

# Set the content type
headers = { 'Content-Type':'application/json' }

predictions = requests.post(endpoint, input_json, headers = headers)
print (predictions)
print (predictions.text)

<Response [200]>
{"result": [0.027311045841034335, 0.0261231327307869]}
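
A client application will usually want the numbers rather than the raw response text. The sketch below shows one way to unpack the body. `extract_predictions` is a hypothetical helper, and it assumes the service replies with a JSON object that has a `result` key, as the output above does. It also tolerates a double-encoded body, which can occur when an entry script returns a JSON string instead of an object.

```python
import json


def extract_predictions(response_text):
    """Return the list stored under "result", or raise if it is missing."""
    payload = json.loads(response_text)
    # A script that returns json.dumps(...) yields a string on the first
    # decode, so decode once more in that case.
    if isinstance(payload, str):
        payload = json.loads(payload)
    if not isinstance(payload, dict) or "result" not in payload:
        raise ValueError(f"Unexpected response: {payload!r}")
    return payload["result"]


# The response body shown above, used here as sample input.
scores = extract_predictions(
    '{"result": [0.027311045841034335, 0.0261231327307869]}'
)
print(scores)
```

With the `requests` call above, this would be `extract_predictions(predictions.text)`.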


You've deployed your web service as an Azure Container Instance (ACI) service that requires no authentication. This is fine for development and testing, but for production you should consider deploying to an Azure Kubernetes Service (AKS) cluster and enabling authentication. This would require REST requests to include an **Authorization** header.
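
As a rough illustration of what that header looks like, the sketch below builds the request headers with and without a key. `build_headers` is a hypothetical helper and the key is a placeholder. It assumes key-based authentication is enabled on the service, in which case the key is sent as a bearer token.

```python
def build_headers(api_key=None):
    """Return HTTP headers for a scoring request (hypothetical helper)."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # The service key (or token) is sent as a bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    return headers


# Placeholder key for illustration only; no request is sent here.
print(build_headers("<your-service-key>"))
```

With the SDK, the key would typically come from `service.get_keys()`, which returns the primary and secondary keys, and the resulting dictionary would be passed as `headers` to `requests.post`.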

## Delete the Service

When you no longer need your service, you should delete it to avoid incurring unnecessary charges.

In [None]:
service.delete()
print ('Service deleted.')

For more information about publishing a model as a service, see the [documentation](https://docs.microsoft.com/azure/machine-learning/how-to-deploy-and-where).