# Deploying a Model using Azure Machine Learning

The steps to deploy any model are:

1. Register the model
2. Prepare an entry script
3. Prepare an inference configuration and a deployment configuration
4. Deploy the model locally to ensure everything works
5. Choose a compute target
6. Re-deploy the model to the cloud
7. Test the resulting web service


[You can learn more by reading these official docs](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-and-where)

## 0. Connect to AML

In [1]:
import azureml.core
from azureml.core import Workspace, Dataset, Model, Environment
from azureml.core.model import InferenceConfig

subscription_id = 'ad2a181b-b804-4179-904a-012445b7d1f5'
resource_group = 'analyticsf7fa8b0c'
workspace_name = 'SignalBoxMLDev'

ws = Workspace(subscription_id, resource_group, workspace_name)

print('Connected to workspace:', ws.name)

model_version = 'parameter-set-1'
tags = {
    'source': 'tutorial',
    'production': False,
    'version': model_version
}

If you run your code in unattended mode, i.e., where you can't give a user input, then we recommend to use ServicePrincipalAuthentication or MsiAuthentication.
Please refer to aka.ms/aml-notebook-auth for different authentication mechanisms in azureml-sdk.


Connected to workspace: SignalBoxMLDev
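
The SDK message above recommends `ServicePrincipalAuthentication` for unattended runs. The following is a minimal sketch of that approach, assuming the service principal's credentials are supplied through environment variables. The variable names (`AML_TENANT_ID`, `AML_CLIENT_ID`, `AML_CLIENT_SECRET`) and both helper functions are hypothetical, and `azureml-core` is imported lazily so the settings helper can be exercised without the SDK installed.

```python
import os


def load_sp_settings(env=os.environ):
    """Collect service principal settings from environment variables.

    The variable names are hypothetical; use whatever your CI system provides.
    """
    required = ('AML_TENANT_ID', 'AML_CLIENT_ID', 'AML_CLIENT_SECRET')
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f'missing environment variables: {missing}')
    return {
        'tenant_id': env['AML_TENANT_ID'],
        'service_principal_id': env['AML_CLIENT_ID'],
        'service_principal_password': env['AML_CLIENT_SECRET'],
    }


def connect_unattended(subscription_id, resource_group, workspace_name):
    """Connect to the workspace without an interactive login prompt."""
    # Imported here so this module still loads where azureml-core is absent.
    from azureml.core import Workspace
    from azureml.core.authentication import ServicePrincipalAuthentication

    auth = ServicePrincipalAuthentication(**load_sp_settings())
    return Workspace(subscription_id, resource_group, workspace_name, auth=auth)
```

In the notebook, `connect_unattended(subscription_id, resource_group, workspace_name)` would stand in for the plain `Workspace(...)` call.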


## 1. Register a model

A typical situation for a deployed machine learning service is that you need the following components:

* resources representing the specific model that you want deployed (for example, a PyTorch model file)
* code that you will be running in the service, which executes the model on a given input

Azure Machine Learning lets you split the deployment into these two components, so that you can keep the same code and update only the model. Uploading a model separately from your code is called "registering the model".

You can register a model by providing the local path of the model. You can provide the path of either a folder or a single file on your local machine.

In [2]:
# define a simple file as the specific 'model' we're going to use.
import pickle

model_parameters = {'name': 'tutorial-model',
        'parameters': {
            'weights': [0.6, 0.3, 0.1]
        }
    }
model_filename = 'tutorial_model.pkl'

with open(model_filename, 'wb') as handle:
    pickle.dump(model_parameters, handle, protocol=pickle.HIGHEST_PROTOCOL)
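
Before registering the file, it can be worth confirming that it loads back with the structure the entry script will expect. This is a small sketch of such a check; it writes to a temporary directory (an assumption made so it does not touch the working directory) and uses only the standard library.

```python
import os
import pickle
import tempfile

model_parameters = {
    'name': 'tutorial-model',
    'parameters': {'weights': [0.6, 0.3, 0.1]},
}

# Write the pickle to a throwaway directory and read it straight back.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'tutorial_model.pkl')
    with open(path, 'wb') as handle:
        pickle.dump(model_parameters, handle, protocol=pickle.HIGHEST_PROTOCOL)
    with open(path, 'rb') as handle:
        restored = pickle.load(handle)

# The entry script indexes into 'parameters', so check that key survives.
weights = restored['parameters']['weights']
print('round trip ok:', restored == model_parameters, 'weights:', weights)
```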

In [76]:
# register the model
model_properties = {
    'source': 'tutorial',
    'version': model_version
}
model = Model.register(ws, 
                       model_name=model_parameters['name'], 
                       model_path=model_filename, 
                       tags=tags
                      )

Registering model tutorial-model


In [4]:
print(model)

Model(workspace=Workspace.create(name='SignalBoxMLDev', subscription_id='ad2a181b-b804-4179-904a-012445b7d1f5', resource_group='analyticsf7fa8b0c'), name=tutorial-model, id=tutorial-model:1, version=1, tags={'source': 'tutorial', 'production': 'False'}, properties={})


## 2. Prepare an entry script

The entry script receives data submitted to a deployed web service and passes it to the model. It then returns the model's response to the client. The script is specific to your model. The entry script must understand the data that the model expects and returns.

> You can use the environment variable `AZUREML_MODEL_DIR` to locate your model that you registered earlier
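
As a rough illustration of the note above, the sketch below builds the model path from `AZUREML_MODEL_DIR` and falls back to a local directory when the variable is not set. The helper name `resolve_model_path` and the fallback behaviour are assumptions made for local testing; they are not part of the SDK.

```python
import os


def resolve_model_path(filename, fallback_dir='.'):
    """Return the path to a registered model file.

    Uses AZUREML_MODEL_DIR when it is set, otherwise a local directory.
    The fallback is an assumption for local testing, not SDK behaviour.
    """
    model_dir = os.getenv('AZUREML_MODEL_DIR', fallback_dir)
    return os.path.join(model_dir, filename)


print(resolve_model_path('tutorial_model.pkl'))
```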

In [46]:
%%writefile source_dir/entry.py 

# this is a very basic entry script
import json

def init():
    print('This is init')

def run(data):
    test = json.loads(data)
    print(f'received data {test}')
    return(f'test is {test}')

Overwriting source_dir/entry.py


In [102]:
%%writefile source_dir/entry.py 


# this is an actual entry script
import json
import pickle
import random
import os

possible_outcomes = [
    {
        'max': 5,
        'min': 3,
        'total': 7
    },
    {
        'max': 10,
        'min': 5,
        'total': 55
    },
    {
        'max': 20,
        'min': 10,
        'total': 15
    }
]
def init():
    global model_parameters
    with open(os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'tutorial_model.pkl'), 'rb') as handle:
        model_parameters = pickle.load(handle)['parameters']

# request should have: payload: {},  version: str
def run(request):
    print(request)
    body = json.loads(request)
    version = body['version']
    # check we can handle the version (it must be a string)
    if isinstance(version, str):
        print(f'Can handle version {version}')

    # Run inference
    recommendation = recommend(body['payload'])
    print('recommendation:', recommendation)
    wrapper = {
        'recommendedParameters': recommendation
    }
    return wrapper

# assuming this is a parameter-set recommender
# payload is JSON with 'arguments' (object) and 'parameterBounds' (list)
def recommend(r):
    # read the parameter bounds and arguments from the payload
    parameterBounds = r['parameterBounds']
    print(parameterBounds)
    arguments = r['arguments']
    print(arguments)
    # you can load parameters from the model if you need to.
    print('model params:', model_parameters['weights'])
    # random.choices() returns a list of length k
    return random.choices(possible_outcomes, model_parameters['weights'], k=1)[0] 



Overwriting source_dir/entry.py
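
The comment in the entry script says a request should carry a `version` string and a `payload` object. One way to enforce that before running inference is sketched below. The helper `parse_request` and the `SUPPORTED_VERSIONS` set are hypothetical additions, not part of the original script.

```python
import json

# Hypothetical: the versions this entry script knows how to handle.
SUPPORTED_VERSIONS = {'parameter-set-1'}


def parse_request(raw):
    """Decode a JSON request and check it has the fields run() relies on."""
    body = json.loads(raw)
    if not isinstance(body, dict):
        raise ValueError('request body must be a JSON object')
    version = body.get('version')
    if not isinstance(version, str):
        raise ValueError('"version" must be a string')
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f'unsupported version: {version}')
    payload = body.get('payload')
    if not isinstance(payload, dict):
        raise ValueError('"payload" must be a JSON object')
    for key in ('arguments', 'parameterBounds'):
        if key not in payload:
            raise ValueError(f'payload is missing "{key}"')
    return version, payload


example = json.dumps({
    'version': 'parameter-set-1',
    'payload': {'arguments': {'arg1': 5}, 'parameterBounds': []},
})
version, payload = parse_request(example)
print(version, sorted(payload))
```

Inside `run()`, a `ValueError` raised here could be caught and turned into an error response instead of letting a `KeyError` surface from deeper in the code.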


## 3. Prepare an inference configuration and a deployment configuration

### Inference

An inference configuration describes the Docker container and files to use when initializing your web service. All of the files within your source directory, including subdirectories, will be zipped up and uploaded to the cloud when you deploy your web service.

### Deployment

A deployment configuration specifies the amount of memory and the number of cores your web service requires in order to run, as well as configuration details of the underlying web service. For example, a deployment configuration lets you specify that your service needs 2 gigabytes of memory, 2 CPU cores, and 1 GPU core, and that you want to enable autoscaling.
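
The example in the paragraph above (2 GB of memory, 2 CPU cores, 1 GPU core, autoscaling) could be expressed roughly as follows. This is a sketch that assumes an Azure Kubernetes Service target, which this tutorial does not otherwise use. The helper name is hypothetical, and the SDK import is deferred so the settings can be inspected without `azureml-core` installed.

```python
# Resource settings taken from the example in the text above.
resource_settings = {
    'memory_gb': 2,
    'cpu_cores': 2,
    'gpu_cores': 1,
    'autoscale_enabled': True,
}


def make_aks_deployment_config(settings=resource_settings):
    """Build an AKS deployment configuration from the settings above."""
    # Imported lazily so the settings can be inspected without azureml-core.
    from azureml.core.webservice import AksWebservice
    return AksWebservice.deploy_configuration(**settings)


print(resource_settings)
```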

In [87]:
env = Environment.from_pip_requirements(name='tutorial_environment', file_path="./model_requirements.txt")
inference_config = InferenceConfig(environment=env, source_directory='source_dir', entry_script='./entry.py')

In [88]:
# this creates a local webservice
from azureml.core.webservice import LocalWebservice
deployment_config = LocalWebservice.deploy_configuration(port=6789)

## 4. Deploy the model locally to ensure everything works

This part needs Docker installed and running.

In [62]:
service = Model.deploy(ws, "tutorial-service-local", [model], inference_config, deployment_config)
service.wait_for_deployment(show_output=True)
print(service.get_logs())

Downloading model tutorial-model:1 to /var/folders/r9/zky7_kgn5955p246_2p84brh0000gn/T/azureml_ae4zqmaa/tutorial-model/1
Generating Docker build context.
Package creation Succeeded
Logging into Docker registry mlresgistryfe091fe9.azurecr.io
Logging into Docker registry mlresgistryfe091fe9.azurecr.io
Building Docker image from Dockerfile...
Step 1/5 : FROM mlresgistryfe091fe9.azurecr.io/azureml/azureml_984ae42c92ab7e15d7808a45a9ac459f
 ---> 2f5da5597264
Step 2/5 : COPY azureml-app /var/azureml-app
 ---> 10f0c8dc7065
Step 3/5 : RUN mkdir -p '/var/azureml-app' && echo eyJhY2NvdW50Q29udGV4dCI6eyJzdWJzY3JpcHRpb25JZCI6ImFkMmExODFiLWI4MDQtNDE3OS05MDRhLTAxMjQ0NWI3ZDFmNSIsInJlc291cmNlR3JvdXBOYW1lIjoiYW5hbHl0aWNzZjdmYThiMGMiLCJhY2NvdW50TmFtZSI6InNpZ25hbGJveG1sZGV2Iiwid29ya3NwYWNlSWQiOiIyNTJmNTM5Ni03NmYwLTRjZDYtOGE3OS0wYjQ5ZGI3NmM1MGIifSwibW9kZWxzIjp7fSwibW9kZWxzSW5mbyI6e319 | base64 --decode > /var/azureml-app/model_config_map.json
 ---> Running in 2f50297a3607
 ---> de20f7087267
Step 4/5 : RUN 

Call the local Docker container to check that it works.

In [95]:
data = {
    "version": "parameter-set-1", # check this version matches
    "payload": {
        "arguments": {
            "arg1": 5,
            "arg2": 7
        },
         "parameterBounds": [
            {
                "commonId": "max",
                "parameterType": "numeric",
                "numericBounds": {
                    "min": 10,
                    "max": 100
                }
            },
            {
                "commonId": "min",
                "parameterType": "numeric",
                "numericBounds": {
                    "min": 0,
                    "max": 50
                }
            },
        ]}
}

In [90]:
# check the model works
import requests
import json

uri = service.scoring_uri
requests.get('http://localhost:6789')
headers = {'Content-Type': 'application/json'}
data_stringified = json.dumps(data)
response = requests.post(uri, data=data_stringified, headers=headers)
print('model responded with:')
print(response.json())

ConnectionError: HTTPConnectionPool(host='localhost', port=6789): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f8ee849fb50>: Failed to establish a new connection: [Errno 61] Connection refused'))
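
The `ConnectionError` above reports that the connection to `localhost:6789` was refused, which typically means nothing was listening on that port when the request was sent. A sketch of a helper that waits for a port to accept connections before any requests are made is shown below. `wait_for_port` is a hypothetical helper, not part of the Azure ML SDK, and it uses only the standard library.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Demonstration against a throwaway listener on a free local port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)
demo_port = listener.getsockname()[1]
ready = wait_for_port('127.0.0.1', demo_port, timeout=2.0)
listener.close()
print('listener ready:', ready)
```

In the notebook, this would be called as `wait_for_port('localhost', 6789)` before posting to `service.scoring_uri`.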

## 5. Choose a compute target.

[Learn more about choosing a target](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-and-where?tabs=python#choose-a-compute-target)

Options are:
* Local web service: for testing and debugging
* Azure Kubernetes Service (AKS): high-scale production (probably don't do this without a bigger discussion)
* Azure Container Instances (ACI): low scale, less than 48 GB RAM
* Azure Machine Learning compute cluster: best for batch inferencing


## 6. Re-deploy the model to the cloud

Deploy to an Azure Container Instance.

In [99]:
from azureml.core.webservice import AciWebservice
aci_deployment_config = AciWebservice.deploy_configuration(cpu_cores = 0.5, 
                                                           memory_gb = 1, 
                                                           tags=tags, 
                                                           auth_enabled=True)

In [103]:
# this cell creates a new ACI deployed into Azure.
service = Model.deploy(ws, 
                       "tutorial-service-aci", 
                       [model], 
                       inference_config, 
                       aci_deployment_config)

service.wait_for_deployment(show_output=True)
print(service.get_logs())

Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2021-06-24 13:23:32+10:00 Creating Container Registry if not exists.
2021-06-24 13:23:32+10:00 Registering the environment.
2021-06-24 13:23:33+10:00 Use the existing image.
2021-06-24 13:23:34+10:00 Generating deployment configuration.
2021-06-24 13:23:35+10:00 Submitting deployment to compute..
2021-06-24 13:23:38+10:00 Checking the status of deployment tutorial-service-aci..
2021-06-24 13:25:24+10:00 Checking the status of inference endpoint tutorial-service-aci.
Succeeded
ACI service creation operation finished, operation "Succeeded"
2021-06-24T03:25:05,401231500+00:00 - rsyslog/run 
2021-06-24T03:25:05,402201200+00:00 - nginx/run 
/usr/sbin/nginx: /azureml-envs/azureml_abe832532d68e7634ad48db5ac241baf/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin

## 7. Test the resulting web service.

When you deploy remotely, you may have key authentication enabled. The example below shows how to get your service key with Python in order to make an inference request.

In [107]:
import requests
import json
from azureml.core import Webservice

service = Webservice(workspace=ws, name='tutorial-service-aci')
scoring_uri = service.scoring_uri
print('scoring uri: ', scoring_uri)

# If the service is authenticated, set the key or token
primary_key, _ = service.get_keys()

# Set the appropriate headers
headers = {'Content-Type': 'application/json'}
headers['Authorization'] = f'Bearer {primary_key}'

data_stringified = json.dumps(data)
response = requests.post(scoring_uri, data=data_stringified, headers=headers)
print('the model responded:')
print(response.text)
print(response.json())
print('time taken:', response.elapsed.total_seconds())


scoring uri:  http://9a084848-7f0b-4a8e-85f1-84719493df60.westus2.azurecontainer.io/score
the model responded:
{"recommendedParameters": {"max": 5, "min": 3, "total": 7}}
{'recommendedParameters': {'max': 5, 'min': 3, 'total': 7}}
time taken: 0.354373
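
The header construction in the cell above can be wrapped in a small function so the same client code works whether or not authentication is enabled on the service. This is a sketch only: `build_headers` is a hypothetical helper, and the key shown is a placeholder.

```python
def build_headers(key_or_token=None):
    """Return JSON request headers, adding a Bearer credential if one is given."""
    headers = {'Content-Type': 'application/json'}
    if key_or_token:
        headers['Authorization'] = f'Bearer {key_or_token}'
    return headers


print(build_headers())
print(build_headers('<your-service-key>'))
```

With the service above, `build_headers(primary_key)` would produce the same headers as the cell.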


You can get the logs from the remote ACI service:

In [106]:
print(service.get_logs())

2021-06-24T03:25:05,401231500+00:00 - rsyslog/run 
2021-06-24T03:25:05,402201200+00:00 - nginx/run 
/usr/sbin/nginx: /azureml-envs/azureml_abe832532d68e7634ad48db5ac241baf/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_abe832532d68e7634ad48db5ac241baf/lib/libcrypto.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_abe832532d68e7634ad48db5ac241baf/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_abe832532d68e7634ad48db5ac241baf/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
/usr/sbin/nginx: /azureml-envs/azureml_abe832532d68e7634ad48db5ac241baf/lib/libssl.so.1.0.0: no version information available (required by /usr/sbin/nginx)
2021-06-24T03:25:05,408099900+00:00 - iot-server/run 
2021-06-24T03:25:05,434992200+00:00 - gunicorn/run 
EdgeHubC

## Required information in Four2

In [105]:
print('scoring url:', scoring_uri)
print('key', primary_key)

scoring url: http://9a084848-7f0b-4a8e-85f1-84719493df60.westus2.azurecontainer.io/score
key cWAz6IhpSMeNesNnNCsQhglQlG6NdGn5


## WELL DONE!

You deployed your first model!

The next part is making it intelligent.

But first, clean up the resources you created.

In [69]:
# delete the ACI service you created
service.delete()

Container has been successfully cleaned up.


In [136]:
# delete all the models we created
models = Model.list(ws, name=model_parameters['name'])
for m in models:
    print('deleting', m.name, 'version:', m.version)
    m.delete()