# Build and deploy a model with custom Docker Images

In these examples, we will deploy inference servers on customized Docker images using Azure Container Registry. We will extend a pre-built image from Azure's curated image library and build an image from base Ubuntu 18.04.

## Prerequisites

* To use Azure Machine Learning, you must have an Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning](https://azure.microsoft.com/free/).

* Install and configure the [Python SDK v2](sdk/setup.sh).

* You must have an Azure resource group, and you (or the service principal you use) must have Contributor access to it.

* You must have an Azure Machine Learning workspace. 

* You must have an Azure Container Registry. One is created automatically on first use for a workspace that lacks one. In this example, however, we explicitly reference the container registry by name, so you need it beforehand. You can create one through the Azure portal.

## Initial set up

We will first get a handle to the workspace, which will be reused later as we deploy images. You must already have an Azure Container Registry associated with the workspace.

In [1]:
subscription_id = '<YOUR_SUBSCRIPTION_ID>'
resource_group = '<YOUR_RESOURCE_GROUP>'
workspace = '<YOUR_WORKSPACE>'
container_registry_name = '<YOUR_CONTAINER_REGISTRY>'

In [2]:
subscription_id = '6fe1c377-b645-4e8e-b588-52e57cc856b2'
resource_group = 'v-alwallace-test'
workspace = 'valwallace'
container_registry_name = 'valwallaceskr'

In [3]:
import os 
from azure.ml import MLClient
from azure.ml.entities import ManagedOnlineDeployment, ManagedOnlineEndpoint
from azure.identity import DefaultAzureCredential
from random import randint

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)

## Basic Docker image deployment

### Define deployment and container registry details

The deployment name, container registry name, and container name are all required. We will create a new container image under the container name given here. The endpoint name is optional; the code below generates a random name that is likely to be unique within the region.

In [5]:
# Required
deployment_name = 'docker-basic'
container_name = 'docker-basic'
# Optional
endpoint_name = f'docker-basic-{randint(1000, 10_000_000)}'  # randint needs integer bounds
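
The random-suffix naming described above can be wrapped in a small helper. This is a minimal sketch, not part of the SDK: `make_endpoint_name` and its default bounds are hypothetical names chosen for illustration. It passes integer bounds to `random.randint`, because recent Python versions reject float bounds such as `1e3`.

```python
import random


def make_endpoint_name(prefix: str, low: int = 1_000, high: int = 10_000_000) -> str:
    """Return `prefix` followed by a random integer suffix.

    Hypothetical helper for illustration; the suffix only makes a name
    collision unlikely, it does not guarantee uniqueness in the region.
    """
    # random.randint expects integers; float bounds raise TypeError on Python 3.12+.
    return f"{prefix}-{random.randint(low, high)}"


endpoint_name = make_endpoint_name("docker-basic")
print(endpoint_name)
```

If a deployment fails because the name is already taken, calling the helper again yields a new candidate.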

The first image we will build is the OpenMPI3.1.2 Ubuntu 18.04 image from Azure. This image contains all of the dependencies required to score the model as well as an inference server. Our Dockerfile for this basic example is below: 

```Dockerfile 
FROM mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:20210727.v1
```

To begin, we will build the image locally and test a local deployment. If you're rebuilding, pass the `--no-cache` flag. 

In [None]:
!docker build -t {container_name} docker_basic/. 

The image is now among your local images, which you can list by running `docker image list` or `docker image ls`. It is ready to be included in a deployment, but first let's run it and watch the AML Inference Server load. The server comes preloaded in most of the Azure-curated images. Since no model or scoring script has been provided to it yet, the container will exit quickly.

In [None]:
!docker run -t {container_name}

With the image verified locally, we will log in to Azure Container Registry and build the image in the registry with `az acr build`.

In [None]:
!az acr login -n {container_registry_name}

In [6]:
!az acr build --image {container_name} --registry {container_registry_name} {deployment_name}/.

Packing source code into tar to upload...
Uploading archived source code from '/tmp/build_archive_f0b9143ae2e24538afa73116a7075691.tar.gz'...
Sending context (2.056 KiB) to registry: valwallaceskr...
Queued a build with ID: ch1f
Waiting for an agent...
2022/04/20 21:30:43 Downloading source code...
2022/04/20 21:30:44 Finished downloading source code
2022/04/20 21:30:44 Using acb_vol_1f20f727-cde5-43a8-ad51-cd654e61c575 as the home volume
2022/04/20 21:30:44 Setting up Docker configuration...
2022/04/20 21:30:45 Successfully set up Docker configuration
2022/04/20 21:30:45 Logging in to registry: valwallaceskr.azurecr.io
2022/04/20 21:30:46 Successfully logged into valwallaceskr.azurecr.io
2022/04/20 21:30:46 Executing step ID: build. Timeout(sec): 28800, Working directory: '', Network: ''
2022/04/20 21:30:46 Scanning for dependencies...
2022/04/20 21:30:47 Successfully scanned dependencies
2022/04/20 21:30:47 Launching container with name

### Local Deployment

To deploy the inference server locally, we will provide the inference server with resources by setting our `Model`, `CodeConfiguration`, and `Environment` in the ManagedOnlineDeployment YAML file. This file specifies the trained model `sklearn_regression_model.pkl`, the scoring script `score.py`, and the registry and repository of the image we built above.

```yaml 
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: deployment_name
endpoint_name: endpoint_name
model:
  path: sklearn_regression_model.pkl
code_configuration: 
  code: "."
  scoring_script: score.py
environment:
  image: container_registry_name.azurecr.io/docker-basic:latest
instance_type: Standard_F2s_v2
instance_count: 1
```

We will load the YAML file and update its variables here. In your own workloads, however, the file can be loaded directly by passing the file path to the `.load` method of `ManagedOnlineDeployment`.

In [8]:
import yaml
with open(os.path.join(deployment_name,'deployment.yml'),'r') as f:
    deployment_yaml = yaml.safe_load(f)

In [None]:
deployment_yaml['name'] = deployment_name
deployment_yaml['endpoint_name'] = endpoint_name
deployment_yaml['environment']['image'] = f'{container_registry_name}.azurecr.io/{container_name}:latest'
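
The in-place updates above can also be expressed as a function that leaves the loaded template untouched. This is a sketch under assumptions: `customize_deployment` is a hypothetical helper, and the `template` dictionary simply mirrors the YAML shown earlier rather than being read from disk.

```python
import copy


def customize_deployment(config, deployment_name, endpoint_name,
                         registry, repository, tag="latest"):
    """Return a copy of `config` with name, endpoint, and image filled in.

    Hypothetical helper; the keys match the deployment YAML shown above.
    """
    updated = copy.deepcopy(config)  # keep the original template unchanged
    updated["name"] = deployment_name
    updated["endpoint_name"] = endpoint_name
    # Images in Azure Container Registry are addressed as <registry>.azurecr.io/<repository>:<tag>
    image = f"{registry}.azurecr.io/{repository}:{tag}"
    updated.setdefault("environment", {})["image"] = image
    return updated


# Stand-in for the parsed deployment YAML (placeholder values).
template = {
    "name": "deployment_name",
    "endpoint_name": "endpoint_name",
    "model": {"path": "sklearn_regression_model.pkl"},
    "code_configuration": {"code": ".", "scoring_script": "score.py"},
    "environment": {"image": "container_registry_name.azurecr.io/docker-basic:latest"},
    "instance_type": "Standard_F2s_v2",
    "instance_count": 1,
}

config = customize_deployment(template, "docker-basic", "docker-basic-1234",
                              "myregistry", "docker-basic")
print(config["environment"]["image"])
```

Returning a copy makes it easy to reuse one template for both the local and the cloud deployment.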

Now we can deploy. First we create an endpoint and then a deployment. The code below shows two ways of configuring Azure Machine Learning entities using the Python SDK v2. We can provide configuration parameters either through arguments in the constructor, or through loading a YAML file. If you do not need to preprocess a YAML file, the `.load()` method enables you to pass a file path directly. 

In [None]:
endpoint = ManagedOnlineEndpoint(name=endpoint_name)
ml_client.online_endpoints.begin_create_or_update(endpoint, local=True)
deployment = ManagedOnlineDeployment.load_from_dict(deployment_yaml)
deployment = ml_client.online_deployments.begin_create_or_update(deployment, local=True)

Run the command below to see the deployment logs.

In [None]:
!az ml online-deployment get-logs -n docker-basic -e {endpoint_name} --local

### Test the local endpoint

To test an endpoint, we need the scoring URI and an authentication key. When we called `.begin_create_or_update` above, the ml_client returned the endpoint object to us with metadata about the deployment, including the attribute `scoring_uri`. If we didn't have a reference to the endpoint, we would call `ml_client.online_endpoints.get(name=<ENDPOINT_NAME>)`. 

In [None]:
auth_token = ml_client.online_endpoints.list_keys(endpoint_name).primary_key
endpoint = ml_client.online_endpoints.get(endpoint_name,local=True)
scoring_uri = endpoint.scoring_uri

Online endpoints' scoring URIs end with `/score`. To check the liveness of the endpoint without scoring data, make a GET request to the base URI.

In [None]:
import requests

# Drop the trailing '/score' (6 characters) to get the base URI for the liveness check
response = requests.get(scoring_uri[:-6])
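
Slicing off a fixed number of characters silently produces a wrong URI if the scoring URI ever lacks the `/score` suffix. A slightly more defensive variant is sketched below; `base_uri` is a hypothetical helper and the localhost address is a made-up example, not a value returned by the service.

```python
def base_uri(scoring_uri: str, suffix: str = "/score") -> str:
    """Return `scoring_uri` without a trailing `suffix`, if it has one."""
    if scoring_uri.endswith(suffix):
        return scoring_uri[: -len(suffix)]
    # Leave the URI unchanged rather than cutting arbitrary characters.
    return scoring_uri


# Example address is an assumption for illustration only.
print(base_uri("http://localhost:5001/score"))
```

The result can be passed to `requests.get` in place of `scoring_uri[:-6]`.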

To score data using REST, insert the auth token in the header, load the sample JSON file, and make a POST request to the scoring URI, which ends with `/score`. 

In [None]:
import json

with open('sample-request.json') as f:
    data = json.load(f)
# Send the auth token as a bearer token and declare the JSON body
headers = {'Authorization': f'Bearer {auth_token}', 'Content-Type': 'application/json'}
response = requests.post(url=scoring_uri,
                         headers=headers,
                         data=json.dumps(data))
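
To make the shape of the scoring request explicit, the sketch below builds the same POST request without sending it. It uses the standard library's `urllib.request` instead of `requests` so that it runs with no extra packages. `build_scoring_request`, the localhost URI, the token placeholder, and the payload are all assumptions for illustration.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, auth_token, payload):
    """Build (but do not send) a POST request carrying a JSON payload."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {auth_token}",  # endpoint key or token
        "Content-Type": "application/json",       # body is JSON
    }
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


# Placeholder values; substitute the real scoring URI, key, and sample request.
request = build_scoring_request(
    "http://localhost:5001/score", "<AUTH_TOKEN>", {"data": [[1, 2, 3]]}
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```

Inspecting the request before sending it is a convenient way to debug authorization or content-type errors.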

### Online Deployment

The scoring server can be deployed in the cloud with few configuration changes. Our Docker image is already built and available in Azure Container Registry, and our deployment YAML file requires no changes. We first generate a new online endpoint name and then proceed with steps similar to those above. Note the removal of the `local=True` argument from the `ml_client` method calls.

In [None]:
endpoint_name = f'docker-basic-{randint(1000, 10_000_000)}'

import yaml
with open('deployment_local.yml','r') as f:
    deployment_yaml = yaml.safe_load(f)

deployment_yaml['endpoint_name'] = endpoint_name
deployment_yaml['environment']['image'] = f'{container_registry_name}.azurecr.io/{container_name}:latest'

In [None]:
endpoint = ManagedOnlineEndpoint(name=endpoint_name)
ml_client.online_endpoints.begin_create_or_update(endpoint)
deployment = ManagedOnlineDeployment.load_from_dict(deployment_yaml)
deployment = ml_client.online_deployments.begin_create_or_update(deployment)

This time, we will score the model using the `.invoke` method.

In [None]:
with open('sample-request.json') as f:
    data = json.loads(f.read())
ml_client.online_endpoints.invoke(endpoint.name,data)

## Preinstall a requirements.txt file using the AML Inference Server

First, we will extend the no-framework inference Docker image from [Azure's curated image library](/azure/machine-learning/concept-prebuilt-docker-images-inference). This image is built from a minimal Ubuntu 18.04 base image and does not include any frameworks such as TensorFlow or PyTorch. It does, however, include the Azure Machine Learning Inference Server, which enables the rapid deployment of inference servers through a single `score.py` file that calls the model. Our working directory looks like this:

Each of these files will be copied into the image in the Dockerfile. The model directory contains the trained model object we will call to score each request. This path will be passed to the inference server, and may contain nested subdirectory trees corresponding to different models and versions. The `score.py` file is located in the code directory. The inference server will call the `score.py` file from the relevant subdirectory depending on the model version, so there is no need for `score.py` to keep track of this tree. The `requirements.txt` file contains the additional Python packages we will install in the image. It looks like this:

```
numpy==1.21.2
pip==21.2.4
scikit-learn==0.24.2
scipy==1.7.1
azureml-defaults==1.38.0
inference-schema[numpy-support]==1.3.0
joblib==1.0.1
```
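
Because these packages are baked into the image at build time, it can help to confirm that every entry is pinned to an exact version before building. The check below is a sketch only: `parse_pins` is a hypothetical helper that understands simple `name==version` lines and comments, not the full requirements file syntax.

```python
def parse_pins(requirements_text):
    """Split requirement lines into pinned {name: version} and unpinned entries."""
    pins = {}
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
        else:
            unpinned.append(line)
    return pins, unpinned


requirements = """\
numpy==1.21.2
pip==21.2.4
scikit-learn==0.24.2
scipy==1.7.1
azureml-defaults==1.38.0
inference-schema[numpy-support]==1.3.0
joblib==1.0.1
"""

pins, unpinned = parse_pins(requirements)
print(len(pins), "pinned;", len(unpinned), "unpinned")
```

An empty `unpinned` list means rebuilding the image later should resolve the same package versions.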

For this deployment, we will install the requirements while building the image. After image creation, requirements files can also be loaded dynamically by the inference server, or additional dependencies can be specified through an `Environment`. See the Environment and ManagedOnlineDeployment schemas for more details.

```dockerfile
FROM mcr.microsoft.com/azureml/minimal-ubuntu18.04-py37-cpu-inference:latest
USER root:root
COPY environment /var/environment
RUN pip install -r /var/environment/requirements.txt
```

In [None]:
!az acr login --name {container_registry_name}

In [None]:
!az acr build --image custom_container --registry {container_registry_name} --file Dockerfile .

### Create a managed online deployment

First, we deploy an online endpoint.

In [None]:
# Required
deployment_name = 'docker-pip'
container_name = 'docker-pip'
# Optional
endpoint_name = f'docker-pip-{randint(1000, 10_000_000)}'
ml_client.online_endpoints.begin_create_or_update(ManagedOnlineEndpoint(name=endpoint_name))
deployment = ManagedOnlineDeployment.load(os.path.join(deployment_name,'deployment.yml'))

In [None]:
ml_client.online_deployments.begin_create_or_update(deployment)

In [None]:
auth_token = ml_client.online_endpoints.list_keys(endpoint_name).primary_key
endpoint = ml_client.online_endpoints.get(endpoint_name)
scoring_uri = endpoint.scoring_uri

In [None]:
import requests
import json

with open(os.path.join('.', 'sample-request.json')) as f:
    data = json.load(f)
# Send the auth token as a bearer token and declare the JSON body
headers = {'Authorization': f'Bearer {auth_token}', 'Content-Type': 'application/json'}
response = requests.post(url=scoring_uri,
                         headers=headers,
                         data=json.dumps(data))