# Deploying a web service to Azure Container Instances (ACI)

This notebook shows the steps for deploying a model as a web service to ACI. The workflow is similar no matter where you deploy your model:

1. Register the model.
2. Prepare to deploy. (Specify assets, usage, compute target.)
3. Deploy the model to the compute target.
4. Test the deployed model, also called a web service.
5. Consume the model using Power BI.

In [1]:
from azureml.core import Workspace
from azureml.core.webservice import Webservice, AciWebservice
from azureml.core.model import Model

In [2]:
import azureml.core
print(azureml.core.VERSION)

1.17.0


# Get workspace
Load the existing workspace from the saved configuration file (config.json).

In [3]:
from azureml.core.workspace import Workspace

ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\n')

demo-ent-ws
demo
westeurope
bcbf34a7-1936-4783-8840-8f324c37f354
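
Workspace.from_config() reads the subscription ID, resource group, and workspace name from a config.json file. If loading fails, it can help to check that file first. The following is a minimal sketch that uses only the standard library; the helper name missing_workspace_keys and the placeholder values are assumptions made for illustration and are not part of the Azure ML SDK.

```python
import json
import tempfile
from pathlib import Path

# Keys that Workspace.from_config() expects to find in config.json.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def missing_workspace_keys(config_path):
    """Return the required keys that are absent or empty in a config.json file.

    Hypothetical helper for illustration; it is not part of the Azure ML SDK.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Demo with placeholder values written to a temporary file.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "config.json"
    demo_path.write_text(
        json.dumps({"subscription_id": "<subscription-id>", "resource_group": "demo"}),
        encoding="utf-8",
    )
    missing = missing_workspace_keys(demo_path)

print(missing)
```

An empty list means that all three keys are present; the demo file above deliberately omits the workspace name.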


# Get or Register the model
If not already done, register an existing trained model and add a description and tags.

This is the model you've already trained using manual training or using [Automated Machine Learning](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-create-portal-experiments).

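The code cell below retrieves a model that is already registered in the workspace under the name "aml-wrkshp-classif-empl-attrition". If your model isn't registered yet, use the commented-out block instead: it registers the trained model file original_model.pkl, saved in the folder that contains this notebook, under the name "IBM_attrition_model". The scoring script later loads the model by its file name, so keep the file name used there in sync with the registered model file.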

In [4]:
from azureml.core.model import Model

# If the model was already registered as part of training, retrieve it by its registered name
model = Model(ws, 'aml-wrkshp-classif-empl-attrition')

# If the model is not registered yet, comment out the line above and uncomment the block below
# to register the original_model.pkl file provided in the same folder as this notebook
# model = Model.register(model_path = "original_model.pkl", # this points to a local file
#                        model_name = "IBM_attrition_model", # this is the name the model is registered as
#                        tags = {'area': "HR", 'type': "attrition"},
#                        description = "Attrition model to understand attrition risk",
#                        workspace = ws)

print('Model name: ', model.name, '\n', 'Model description: ', model.description, '\n', 'Model version: ', model.version, sep='')

Model name: aml-wrkshp-classif-empl-attrition
Model description: Binary classification model for employees attrition
Model version: 3


In [5]:
# Names and SAS URLs of the stored model artifacts
model.get_sas_urls()

{'classif-empl-attrition.pkl': 'https://demoentws5367325393.blob.core.windows.net/azureml/ExperimentRun/dcid.HD_b644d7b0-1449-4a4e-8b77-f5a45df2828e_7/outputs/classif-empl-attrition.pkl?sv=2019-02-02&sr=b&sig=xHNJS6N0IaOQO5qwxVlOHskUJb5lIKVwX3StQduROqE%3D&st=2020-11-12T09%3A03%3A45Z&se=2020-11-12T17%3A13%3A45Z&sp=r'}

# Prepare to deploy

To deploy the model, you need the following items:

- **An entry script.** This script accepts requests, scores them by using the model, and returns the results.
- **Dependencies**, like helper scripts or Python/Conda packages required to run the entry script or model.
- **The deployment configuration** for the compute target that hosts the deployed model. This configuration describes things like memory and CPU requirements needed to run the model.

## 1. Define your entry script and dependencies

### Entry script

We will first write the entry script as shown below. Note a few points in the entry script.

The script contains two functions that load and run the model:

**init()**: Typically, this function loads the model into a global object. This function is run only once, when the Docker container for your web service is started.

When you register a model, you provide a model name that's used for managing the model in the registry. You use this name with the Model.get_model_path() method to retrieve the path of the model file or files on the local file system. If you register a folder or a collection of files, this API returns the path of the directory that contains those files.

**run(input_data)**: This function uses the model to predict a value based on the input data. Inputs and outputs of the run typically use JSON for serialization and deserialization. You can also work with raw binary data. You can transform the data before sending it to the model or before returning it to the client.
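
The contract between init() and run() can be tried out without Azure or a trained model. The following is a minimal sketch under stated assumptions: ThresholdModel is a hypothetical stand-in for the real pickled model, and its single feature is chosen only for illustration. It shows that the model is loaded once into a global object and that run() reports failures as JSON instead of raising.

```python
import json

model = None  # populated once by init()


class ThresholdModel:
    """Stand-in for a trained model (hypothetical; replaces the real pickle)."""

    def predict(self, rows):
        # Flag a row when the (illustrative) overtime feature is set.
        return [1 if row["OverTime"] else 0 for row in rows]


def init():
    # Runs once when the container starts: load the model into a global.
    global model
    model = ThresholdModel()


def run(data):
    # Runs for every request: score the input and return a JSON string.
    try:
        return json.dumps({"result": model.predict(data)})
    except Exception as exc:
        # Report failures to the caller as JSON instead of raising.
        return json.dumps({"error": str(exc)})


init()
ok = run([{"OverTime": 1}, {"OverTime": 0}])
bad = run([{"Age": 41}])  # the missing feature triggers the error path
print(ok)
print(bad)
```

Returning the error as JSON keeps the web service responsive, but the client then has to check for the "error" key in every reply.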

In [30]:
%%writefile score.py
import os
import json
import numpy as np
import pandas as pd

from sklearn.linear_model import LogisticRegression

# sklearn.externals.joblib is deprecated in scikit-learn 0.21 and removed in 0.23,
# so import joblib directly (it can be installed with: pip install joblib).
import joblib

from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType


input_sample = pd.DataFrame(data=[{'Age': 41, 'BusinessTravel': 'Travel_Rarely', 'DailyRate': 1102, 'Department': 'Sales', 'DistanceFromHome': 1, 'Education': 2, 'EducationField': 'Life Sciences', 'EnvironmentSatisfaction': 2, 'Gender': 'Female', 'HourlyRate': 94, 'JobInvolvement': 3, 'JobLevel': 2, 'JobRole': 'Sales Executive', 'JobSatisfaction': 4, 'MaritalStatus': 'Single', 'MonthlyIncome': 5993, 'MonthlyRate': 19479, 'NumCompaniesWorked': 8, 'OverTime': 0, 'PercentSalaryHike': 11, 'PerformanceRating': 3, 'RelationshipSatisfaction': 1, 'StockOptionLevel': 0, 'TotalWorkingYears': 8, 'TrainingTimesLastYear': 0, 'WorkLifeBalance': 1, 'YearsAtCompany': 6, 'YearsInCurrentRole': 4, 'YearsSinceLastPromotion': 0, 'YearsWithCurrManager': 5}])
output_sample = np.array([0])


def init():
    # AZUREML_MODEL_DIR is an environment variable created during deployment.
    # It holds the path to the directory that contains the deployed model (./azureml-models/$MODEL_NAME/$VERSION).
    # If there are multiple models, this value is the path to the directory containing all deployed models (./azureml-models).
    global model
    
    model_path = os.getenv('AZUREML_MODEL_DIR')
    if model_path is None:
        # Fall back to the current directory when the script runs locally
        model_path = '.'
    
    # Join this path with the filename of the model file
    model_path = os.path.join(model_path, 'classif-empl-attrition.pkl')
    print(model_path)
    
    # Deserialize the model file back into a sklearn model
    model = joblib.load(model_path)


@input_schema('data', PandasParameterType(input_sample))
@output_schema(NumpyParameterType(output_sample))
def run(data):
    try:
        result = model.predict(data)
        return json.dumps({"result": result.tolist()})
    except Exception as e:
        result = str(e)
        return json.dumps({"error": result})
    
# Test the functions if run locally
if __name__ == "__main__":
    init()
    
    prediction = run(input_sample)

    print(prediction)

Overwriting score.py


### Automatic schema generation
To automatically generate a schema for your web service, provide a sample of the input and/or output in the constructor for one of the defined type objects. The type and sample are used to automatically create the schema. Azure Machine Learning then creates an OpenAPI (Swagger) specification for the web service during deployment.
To use schema generation, include the _inference-schema_ package in your Conda environment file.
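
To illustrate the idea of deriving a schema from a sample, the sketch below maps the fields of one sample record to JSON-schema style type names. It is a simplified illustration and not the inference-schema implementation, which produces a richer specification; the function name infer_schema is an assumption, and the record reuses three fields from the input sample in the entry script.

```python
def infer_schema(sample_record):
    """Map each field of a sample record to a JSON-schema style type name.

    Simplified illustration; inference-schema derives a richer specification.
    """
    type_names = {bool: "boolean", int: "integer", float: "number", str: "string"}
    properties = {}
    for name, value in sample_record.items():
        properties[name] = {
            "type": type_names.get(type(value), "object"),
            "example": value,
        }
    return {"type": "object", "properties": properties}


sample_record = {"Age": 41, "Department": "Sales", "OverTime": 0}
schema = infer_schema(sample_record)
print(schema["properties"]["Age"])
```

The sample does double duty here: it fixes the expected type of each field and serves as the example value shown to API consumers.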

### Define dependencies

The following YAML is the Conda dependencies file we will use for inference. If you want to use automatic schema generation, your entry script must import the inference-schema packages.

In [52]:
%%writefile myenv.yml

name: project_environment

dependencies:
- python=3.6.2
- pip:
  - azureml-core==1.17.0
  - azureml-defaults==1.17.0
  - scikit-learn==0.22.2.post1
  - sklearn-pandas
  - inference-schema[numpy-support]
- pandas
- numpy


Overwriting myenv.yml


In [53]:
from azureml.core import Environment

# Instantiate environment
myenv = Environment.from_conda_specification(name = "myenv",
                                             file_path = "myenv.yml")

## 2. Define your inference configuration

The inference configuration describes how to configure the model to make predictions. This configuration isn't part of your entry script. It references your entry script and is used to locate all the resources required by the deployment. It's used later, when you deploy the model.

In [54]:
from azureml.core.model import InferenceConfig

inference_config = InferenceConfig(entry_script='score.py', environment=myenv)

## 3. Define your deployment configuration

Before deploying your model, you must define the deployment configuration. The deployment configuration is specific to the compute target that will host the web service. The deployment configuration isn't part of your entry script. It's used to define the characteristics of the compute target that will host the model and entry script.

In [55]:
from azureml.core.webservice import AciWebservice

aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, 
                                               memory_gb=1, 
                                               tags = {'area': "HR", 'type': "attrition"}, 
                                               description='Explain predictions on employee attrition')


# Deploy Model as Webservice on Azure Container Instances

Deployment uses the inference configuration and the deployment configuration to deploy the models. The deployment process is similar regardless of the compute target.

In summary, a deployed service is created from a model, script, and associated files. The resulting web service is a load-balanced, HTTP endpoint with a REST API. You can send data to this API and receive the prediction returned by the model.

In [56]:
# Delete web service if already exists
webservice_name = 'predict-attrition'

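try:
    service = Webservice(name=webservice_name, workspace=ws)
    service.delete()
    
    print("The web service '", webservice_name, "' has been deleted.", sep='')
except Exception as e:
    # A missing service is reported with a message that starts with 'WebserviceNotFound'
    if e.args and str(e.args[0]).split(':', 1)[0] == 'WebserviceNotFound':
        print("The web service '", webservice_name, "' doesn't exist.", sep='')
    else:
        # Re-raise unexpected errors instead of silently ignoring them
        raise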

The web service 'predict-attrition' has been deleted.


In [57]:
service = Model.deploy(ws,
                       name=webservice_name,
                       models=[model],
                       inference_config=inference_config,
                       deployment_config=aciconfig)

service.wait_for_deployment(show_output=True)

Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running...................................................................................
Succeeded
ACI service creation operation finished, operation "Succeeded"


In [58]:
print(service.state)

Healthy


In [59]:
# If the deployment fails, use the logs to debug it
print(service.get_logs())

2020-11-12T13:28:13.8748471Z stdout F 2020-11-12T13:28:13,867933000+00:00 - gunicorn/run 
2020-11-12T13:28:13.8748471Z stdout F 2020-11-12T13:28:13,866875800+00:00 - iot-server/run 
2020-11-12T13:28:13.8798434Z stdout F 2020-11-12T13:28:13,878948800+00:00 - nginx/run 
2020-11-12T13:28:13.893853Z stdout F 2020-11-12T13:28:13,876473000+00:00 - rsyslog/run 
2020-11-12T13:28:13.9328664Z stderr F /usr/sbin/nginx: /azureml-envs/azureml_372ee22c7114cd6a865677362c842316/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
2020-11-12T13:28:13.9328664Z stderr F /usr/sbin/nginx: /azureml-envs/azureml_372ee22c7114cd6a865677362c842316/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
2020-11-12T13:28:13.9388589Z stderr F /usr/sbin/nginx: /azureml-envs/azureml_372ee22c7114cd6a865677362c842316/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
2020-11-12T13:28:13.9388589Z stderr F /usr/sbin/nginx: 

## Web service schema

If you used automatic schema generation with your deployment, you can get the address of the OpenAPI specification for the service by using the swagger_uri property. (For example, print(service.swagger_uri).) Use a GET request or open the URI in a browser to retrieve the specification.

In [60]:
print(service.swagger_uri)

http://783095e2-403b-4910-83d0-8575207ec279.westeurope.azurecontainer.io/swagger.json
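
Once the specification has been downloaded, it can be inspected programmatically. The sketch below lists the operations in an OpenAPI (Swagger) document. The function names fetch_spec and list_operations are assumptions, and example_spec is a hand-written stand-in, so the specification generated for a real service may contain different paths and more detail.

```python
import json
from urllib.request import urlopen


def fetch_spec(swagger_uri, timeout=10):
    """Download and parse the OpenAPI document (needs network access)."""
    with urlopen(swagger_uri, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def list_operations(spec):
    """Return sorted (METHOD, path) pairs described by an OpenAPI document."""
    http_methods = {"get", "put", "post", "delete", "options", "head", "patch"}
    return sorted(
        (method.upper(), path)
        for path, item in spec.get("paths", {}).items()
        for method in item
        if method in http_methods
    )


# Hand-written stand-in for a downloaded specification (illustrative only).
example_spec = {
    "swagger": "2.0",
    "paths": {
        "/": {"get": {"operationId": "health"}},
        "/score": {"post": {"operationId": "score"}},
    },
}
operations = list_operations(example_spec)
print(operations)
```

Against a running service, the same helper could be called as list_operations(fetch_spec(service.swagger_uri)).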


# Test the deployed model

Every deployed web service provides a REST API, so you can create client applications in a variety of programming languages. If you've enabled key authentication for your service, you need to provide a service key as a token in your request header. If you've enabled token authentication for your service, you need to provide an Azure Machine Learning JWT token as a bearer token in your request header.
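
Outside the SDK, a client calls the service with a plain HTTP POST. The sketch below builds such a request with the standard library and, when a credential is supplied, adds it as a bearer token in the Authorization header. The function name build_scoring_request, the URI, the key, and the shortened payload are placeholders chosen for illustration; a real payload must match what the entry script's run() function expects.

```python
import json
from urllib.request import Request, urlopen


def build_scoring_request(scoring_uri, payload, credential=None):
    """Create a POST request for a scoring endpoint.

    `credential` is a service key or a JWT token; both are sent as a bearer token.
    """
    headers = {"Content-Type": "application/json"}
    if credential is not None:
        headers["Authorization"] = f"Bearer {credential}"
    body = json.dumps(payload).encode("utf-8")
    return Request(scoring_uri, data=body, headers=headers, method="POST")


# Placeholder URI and key; in the notebook these would come from the deployed service.
request = build_scoring_request(
    "http://example.invalid/score",
    {"data": [{"Age": 49, "OverTime": 1}]},
    credential="<service-key>",
)
print(request.get_method(), request.get_header("Authorization"))

# Sending the request requires a live endpoint:
# with urlopen(request, timeout=30) as response:
#     print(response.read().decode("utf-8"))
```

For a deployed service, the address to pass as scoring_uri is available from the service.scoring_uri property.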

In [61]:
import json
import pandas as pd

# the sample below contains the data for an employee that is not an attrition risk
sample = pd.DataFrame(data=[{'Age': 49, 'BusinessTravel': 'Travel_Rarely', 'DailyRate': 1098, 'Department': 'Research & Development', 'DistanceFromHome': 4, 'Education': 2, 'EducationField': 'Medical', 'EnvironmentSatisfaction': 4, 'Gender': 'Female', 'HourlyRate': 21, 'JobInvolvement': 3, 'JobLevel': 2, 'JobRole': 'Laboratory Technician', 'JobSatisfaction': 3, 'MaritalStatus': 'Single', 'MonthlyIncome': 711, 'MonthlyRate': 2124, 'NumCompaniesWorked': 8, 'OverTime': 1, 'PercentSalaryHike': 8, 'PerformanceRating': 4, 'RelationshipSatisfaction': 3, 'StockOptionLevel': 0, 'TotalWorkingYears': 2, 'TrainingTimesLastYear': 0, 'WorkLifeBalance': 3, 'YearsAtCompany': 2, 'YearsInCurrentRole': 1, 'YearsSinceLastPromotion': 0, 'YearsWithCurrManager': 1}])

# the sample below contains the data for an employee that is an attrition risk
# sample = pd.DataFrame(data=[{'Age': 49, 'BusinessTravel': 'Travel_Rarely', 'DailyRate': 1098, 'Department': 'Research & Development', 'DistanceFromHome': 4, 'Education': 2, 'EducationField': 'Medical', 'EnvironmentSatisfaction': 4, 'Gender': 'Female', 'HourlyRate': 21, 'JobInvolvement': 3, 'JobLevel': 2, 'JobRole': 'Laboratory Technician', 'JobSatisfaction': 3, 'MaritalStatus': 'Single', 'MonthlyIncome': 711, 'MonthlyRate': 2124, 'NumCompaniesWorked': 8, 'OverTime': 'Yes', 'PercentSalaryHike': 8, 'PerformanceRating': 4, 'RelationshipSatisfaction': 3, 'StockOptionLevel': 0, 'TotalWorkingYears': 2, 'TrainingTimesLastYear': 0, 'WorkLifeBalance': 3, 'YearsAtCompany': 2, 'YearsInCurrentRole': 1, 'YearsSinceLastPromotion': 0, 'YearsWithCurrManager': 1}])


# serialize the sample DataFrame to a JSON string and parse it back into a Python object
sample = json.loads(sample.to_json())

# wrap the object under the "data" key and serialize it to the JSON string expected by the scoring script
sample = json.dumps({"data": sample})

prediction = service.run(sample)

print(prediction)

{"result": [1]}
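
The entry script in this notebook returns a JSON string that contains either a "result" key or an "error" key, so a client should check for both. The following is a minimal sketch of that check; the helper name parse_prediction is an assumption made for illustration.

```python
import json


def parse_prediction(raw_response):
    """Return the list of predictions from the entry script's JSON reply.

    Raises RuntimeError if the entry script reported an error instead.
    """
    reply = json.loads(raw_response) if isinstance(raw_response, str) else raw_response
    if "error" in reply:
        raise RuntimeError(f"Scoring failed: {reply['error']}")
    return reply["result"]


labels = parse_prediction('{"result": [1]}')
print(labels)
```

Raising on the "error" key surfaces scoring problems immediately instead of letting an error message pass as a prediction.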


# Consume the model using Power BI
You can also consume the model from Power BI. See details [here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-consume-web-service#consume-the-service-from-power-bi).
