# Build custom containers from Dockerfiles locally, on Azure, and with Azure Container Registry, then deploy them to an online endpoint

Learn more about deploying custom containers as online endpoints in Azure Machine Learning. Custom container deployments can use web servers other than the default Python Flask server used by Azure Machine Learning. Users of these deployments can still take advantage of Azure Machine Learning's built-in monitoring, scaling, alerting, and authentication.

For an introduction to deploying custom containers in online endpoints, see [Deploy a TensorFlow model served with TF Serving using a custom container in an online endpoint](). In this example, we go further into custom container development by building an image from a Dockerfile locally and pushing it to a registry, and by building from a Dockerfile on Azure.

## Prerequisites

* To use Azure Machine Learning, you must have an Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning](https://azure.microsoft.com/free/).

* Install and configure the [Python SDK v2](sdk/setup.sh).

* You must have an Azure resource group, and you (or the service principal you use) must have Contributor access to it.

* You must have an Azure Machine Learning workspace. 

* To deploy locally, you must install [Docker Engine](https://docs.docker.com/engine/install/) on your local computer. We highly recommend this option because it makes issues easier to debug.

* You must have an Azure Container Registry. One is created automatically on first use for a workspace that lacks one. However, this example references the container registry explicitly by name, so you need to create it beforehand. You can create one through the Azure portal.

# 1. Setup

## 1.1 Import the required libraries

In [65]:
from azure.ml import MLClient
from azure.containerregistry import ContainerRegistryClient
from azure.ml.entities import ManagedOnlineDeployment, ManagedOnlineEndpoint, Environment, Model, BuildContext
from azure.identity import DefaultAzureCredential

## 1.2 Configure workspace details and get a handle to the workspace

In [1]:
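subscription_id = '<YOUR_SUBSCRIPTION_ID>'
resource_group = '<YOUR_RESOURCE_GROUP>'
workspace = '<YOUR_WORKSPACE>'
container_registry_name = '<YOUR_CONTAINER_REGISTRY>'
# Names used by the build and deployment cells later in this notebook
container_name = '<YOUR_CONTAINER_NAME>'
deployment_name = '<YOUR_DEPLOYMENT_NAME>'
endpoint_name = '<YOUR_ENDPOINT_NAME>'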

In [22]:
ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)

# 2. Test and deploy locally

## 2.1 Deploy a local online endpoint

In [27]:
# Create a unique endpoint name from the current datetime to avoid conflicts
import datetime

online_endpoint_name = "endpoint-" + datetime.datetime.now().strftime("%m%d%H%M%f")

# Create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description='this is a sample online endpoint',
    auth_mode='key',
    tags={'foo': 'bar'},
)

ml_client.online_endpoints.begin_create_or_update(endpoint, local=True)

Creating local endpoint (endpoint-04221733495378) Done (0m 0s)


ManagedOnlineEndpoint({'provisioning_state': None, 'scoring_uri': None, 'swagger_uri': None, 'name': 'endpoint-04221733495378', 'description': 'this is a sample online endpoint', 'tags': {'foo': 'bar'}, 'properties': {}, 'id': None, 'base_path': './', 'creation_context': None, 'serialize': <msrest.serialization.Serializer object at 0x7fc2fa5b8ad0>, 'auth_mode': 'key', 'location': None, 'identity': None, 'traffic': {}, 'mirror_traffic': {}, 'kind': None})
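
The timestamp-based name above can be wrapped in a small helper that also checks the result before it is sent to the service. This is a minimal sketch: `make_endpoint_name` and `NAME_PATTERN` are hypothetical names, and the pattern (a leading letter, then letters, digits, or hyphens, 3 to 32 characters in total) is an assumption about the naming rules that you should confirm against the service documentation.

```python
import datetime
import re

# Assumed naming rule (verify against the service documentation):
# starts with a letter, then letters, digits, or hyphens, 3-32 characters total.
NAME_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9-]{2,31}$")


def make_endpoint_name(prefix: str = "endpoint-") -> str:
    """Return prefix + month/day/hour/minute/microsecond, checked against NAME_PATTERN."""
    suffix = datetime.datetime.now().strftime("%m%d%H%M%f")
    name = prefix + suffix
    if not NAME_PATTERN.match(name):
        raise ValueError(f"invalid endpoint name: {name!r}")
    return name


print(make_endpoint_name())
```

Failing early on a malformed name is cheaper than waiting for the service to reject the request.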

## 2.2 Test the image build with Docker

In [58]:
!docker image ls 

REPOSITORY                                                                 TAG       IMAGE ID       CREATED          SIZE
local_basic                                                                latest    df5067042df7   34 seconds ago   828MB
mcr.microsoft.com/azureml/sklearn-0.24.1-ubuntu18.04-py37-cpu-inference    latest    861ece85e7df   3 days ago       823MB
viennaglobal.azurecr.io/azureml/azureml_23e1bb4ca54a555eb02ea38dd537c165   latest    7d633f3fb312   2 weeks ago      1.59GB


In [56]:
!docker build -t local_basic ./docker/local_basic/

Sending build context to Docker daemon  3.072kB
Step 1/3 : FROM mcr.microsoft.com/azureml/sklearn-0.24.1-ubuntu18.04-py37-cpu-inference:latest
 ---> 861ece85e7df
Step 2/3 : COPY requirements.txt requirements.txt
 ---> Using cache
 ---> 0cc15fc6903b
Step 3/3 : RUN pip install -r requirements.txt
 ---> Running in 90a29514f91f
Collecting azure-ml
  Downloading azure_ml-0.0.1-py3-none-any.whl (2.4 kB)
Installing collected packages: azure-ml
Successfully installed azure-ml-0.0.1
Removing intermediate container 90a29514f91f
 ---> df5067042df7
Successfully built df5067042df7
Successfully tagged local_basic:latest


In [70]:
build_context = BuildContext(
    path=".",
    dockerfile_path='dockerfiles/local_basic')

env = Environment(
    build=build_context)

deployment = ManagedOnlineDeployment(
    name='local_deployment',
    endpoint_name=online_endpoint_name,
    environment=env)
ml_client.online_deployments.begin_create_or_update(deployment,local=True)

Creating local deployment (endpoint-04221733495378 / local_deployment) Done (0m 0s)


RequiredLocalArtifactsNotFoundError: Local endpoints only support local artifacts. Local deployment (endpoint-04221733495378 / local_deployment) did not contain required local artifact 'model.path' of type '<class 'str'>'.
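
The error above says that a local deployment needs its artifacts on the local file system, starting with `model.path`. A pre-flight check can catch this before the deployment call. The sketch below is an illustration only: `missing_local_artifacts` is a hypothetical helper, not part of the SDK, and it assumes the deployment configuration is available as a plain dictionary that uses the `model.path` and `code_configuration` keys of the deployment YAML.

```python
from pathlib import Path


def missing_local_artifacts(config: dict, base_dir: str = ".") -> list:
    """Return descriptions of required local artifacts that are unset or not on disk."""
    base = Path(base_dir)
    problems = []

    # The model must be referenced by a path that exists locally
    model_path = (config.get("model") or {}).get("path")
    if not model_path:
        problems.append("model.path is not set")
    elif not (base / model_path).exists():
        problems.append(f"model.path does not exist: {model_path}")

    # If a scoring script is configured, it must exist inside the code directory
    code_cfg = config.get("code_configuration") or {}
    code_dir = code_cfg.get("code")
    script = code_cfg.get("scoring_script")
    if code_dir and script and not (base / code_dir / script).is_file():
        problems.append(f"scoring script does not exist: {script}")

    return problems


# A configuration with only an environment, like the failing deployment above
print(missing_local_artifacts({"environment": {"build": {"path": "."}}}))
```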

## 2.3 Build a container image from a Dockerfile

In [35]:
build_context = BuildContext(
    path=".",
    dockerfile_path='dockerfiles/local_basic')

env = Environment(
    build=build_context)


Uploading 1-custom-containers-with-local-managed-and-ACR-... (0.02 MBs): 100%|██████████| 24198/24198 [00:00<00:00, 250949.51i

Environment({'is_anonymous': False, 'auto_increment_version': False, 'name': 'local_build', 'description': None, 'tags': {}, 'properties': {}, 'id': '/subscriptions/6fe1c377-b645-4e8e-b588-52e57cc856b2/resourceGroups/v-alwallace-test/providers/Microsoft.MachineLearningServices/workspaces/valwallace/environments/local_build/versions/2022-04-22-17-35-38-5877467', 'base_path': './', 'creation_context': <azure.ml._restclient.v2022_02_01_preview.models._models_py3.SystemData object at 0x7fc2f851dcd0>, 'serialize': <msrest.serialization.Serializer object at 0x7fc2fa5b8f90>, 'version': '2022-04-22-17-35-38-5877467', 'latest_version': None, 'conda_file': None, 'image': None, 'build': <azure.ml.entities._assets.environment.BuildContext object at 0x7fc2fa5b8c90>, 'inference_config': None, 'os_type': 'Linux', 'arm_type': 'environment_version', 'conda_file_path': None, 'path': None, 'upload_hash': None, 'translated_conda_file': None})

The first image we build is based on the scikit-learn 0.24.1 Ubuntu 18.04 inference image from Azure. This image contains all of the dependencies required to score the model as well as an inference server. Our Dockerfile for this basic example is below:

```Dockerfile 
FROM mcr.microsoft.com/azureml/sklearn-0.24.1-ubuntu18.04-py37-cpu-inference:latest
```

To begin, we build the image and test a local deployment. If you're rebuilding, pass the `--no-cache` flag. We can build and test the image using Docker itself; however, with no scoring script or trained model, the container fails immediately.
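
The rebuild step can also be scripted. The sketch below only assembles the `docker build` command line and prints it; running it is left to the caller. `docker_build_command` is a hypothetical helper, and the tag and context directory are the ones used in the earlier build cell.

```python
import shlex


def docker_build_command(tag: str, context: str, no_cache: bool = False) -> list:
    """Assemble the argument list for `docker build`."""
    cmd = ["docker", "build", "-t", tag]
    if no_cache:
        # Ignore cached layers so every Dockerfile step runs again
        cmd.append("--no-cache")
    cmd.append(context)
    return cmd


cmd = docker_build_command("local_basic", "./docker/local_basic/", no_cache=True)
print(shlex.join(cmd))
# To execute: subprocess.run(cmd, check=True)
```

Keeping the command as a list avoids shell quoting problems when it is later passed to `subprocess.run`.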

### Log in to ACR and build an image locally

In [None]:
!az login
!az acr login -n {container_registry_name}
!docker build -t {container_name} docker_basic/.
!docker image ls

### Push the locally built image to ACR

In [None]:
!docker login {container_registry_name}.azurecr.io
!docker tag {container_name} {container_registry_name}.azurecr.io/{container_name}
!docker push {container_registry_name}.azurecr.io/{container_name}
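
The tag and push commands above depend on a fully qualified image reference of the form `<registry>.azurecr.io/<repository>:<tag>`. A small helper keeps that string consistent between the push step and the deployment configuration. This is only a sketch: `acr_image_reference` is a hypothetical name and `myregistry` is a placeholder registry.

```python
def acr_image_reference(registry: str, repository: str, tag: str = "latest") -> str:
    """Build '<registry>.azurecr.io/<repository>:<tag>' for an image in a container registry."""
    if not registry or not repository or not tag:
        raise ValueError("registry, repository, and tag must be non-empty")
    return f"{registry}.azurecr.io/{repository}:{tag}"


print(acr_image_reference("myregistry", "docker-basic"))
```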

### Build directly with the ACR CLI

In [None]:
!az acr login -n {container_registry_name}
!az acr build --image {container_name} --registry {container_registry_name}  ./{deployment_name}

## Local Deployment

To deploy the inference server locally, we provide it with resources by setting `Model`, `CodeConfiguration`, and `Environment` in the ManagedOnlineDeployment YAML file. This file specifies the trained model `sklearn_regression_model.pkl`, the scoring script `score.py`, and the registry and repository of the image we built above.

```yaml 
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: deployment_name
endpoint_name: endpoint_name
model:
  path: sklearn_regression_model.pkl
code_configuration: 
  code: "."
  scoring_script: score.py
environment:
  image: container_registry_name.azurecr.io/docker-basic:latest
instance_type: Standard_F2s_v2
instance_count: 1
```

### Create an online endpoint

In [None]:
endpoint = ManagedOnlineEndpoint(name=endpoint_name)
ml_client.online_endpoints.begin_create_or_update(endpoint, local=True)

### Import deployment YAML

We import the YAML file for the deployment and update a few variables. In your own workloads, you can load the file directly by passing its path to the `.load` method of `ManagedOnlineDeployment`.

In [None]:
import yaml
with open(f'{deployment_name}/deployment.yml','r') as f:
    deployment_yaml = yaml.safe_load(f)
deployment_yaml['name'] = deployment_name
deployment_yaml['endpoint_name'] = endpoint_name
deployment_yaml['environment']['image'] = f'{container_registry_name}.azurecr.io/{container_name}:latest'
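
The override step above mutates the loaded dictionary in place. When the same YAML file feeds both the local and the cloud deployment, it can help to apply overrides to a copy instead. A minimal sketch, using a plain dictionary in place of the loaded YAML; `with_overrides` is a hypothetical helper and the override values are placeholders:

```python
import copy


def with_overrides(config: dict, overrides: dict) -> dict:
    """Return a deep copy of config with overrides merged in; nested dicts merge key by key."""
    merged = copy.deepcopy(config)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = with_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Stand-in for the dictionary returned by yaml.safe_load
base = {
    "name": "deployment_name",
    "endpoint_name": "endpoint_name",
    "environment": {"image": "container_registry_name.azurecr.io/docker-basic:latest"},
    "instance_count": 1,
}
local = with_overrides(base, {
    "name": "docker-basic",
    "environment": {"image": "myregistry.azurecr.io/docker-basic:latest"},
})
print(local["environment"]["image"])
```

Because `base` is left untouched, it can be reused with a different set of overrides for another deployment.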

Now we can deploy. With the endpoint created, we create the deployment. Azure Machine Learning entities in the Python SDK v2 can be configured in two ways: through arguments to the constructor, or by loading a YAML file. If you do not need to preprocess the YAML file, the `.load()` method lets you pass a file path directly.

In [None]:
deployment = ManagedOnlineDeployment.load_from_dict(deployment_yaml)
deployment = ml_client.online_deployments.begin_create_or_update(deployment, local=True)

### Check deployment logs

In [None]:
!az ml online-deployment get-logs -n {deployment_name} -e {endpoint_name} --local

## Test the local endpoint

### Get token and scoring URL 

To test an endpoint, we need the scoring URI and an authentication key. When we called `.begin_create_or_update` above, the ml_client returned the endpoint object to us with metadata about the deployment, including the attribute `scoring_uri`. If we didn't have a reference to the endpoint, we would call `ml_client.online_endpoints.get(name=<ENDPOINT_NAME>)`. 

In [None]:
auth_token = ml_client.online_endpoints.list_keys(endpoint_name).primary_key
endpoint = ml_client.online_endpoints.get(endpoint_name,local=True)
scoring_uri = endpoint.scoring_uri

### Score with REST

Online endpoints' scoring URIs end with `/score`. To check the liveness of the endpoint without scoring data, make a GET request to the base URI.

In [None]:
import requests

# Drop the trailing "/score" (six characters) to get the base URI
response = requests.get(scoring_uri[:-6])
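
Slicing off the last six characters works only when the URI ends in exactly `/score`. A slightly more defensive way to derive the base URI is to parse it and strip the path suffix. This is a sketch: `liveness_url` is a hypothetical helper and the URLs are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit


def liveness_url(scoring_uri: str) -> str:
    """Return the scoring URI with a trailing '/score' path segment removed."""
    parts = urlsplit(scoring_uri)
    path = parts.path.rstrip("/")
    if path.endswith("/score"):
        path = path[: -len("/score")]
    # An empty path becomes "/" so the result is still a valid request target
    return urlunsplit((parts.scheme, parts.netloc, path or "/", parts.query, parts.fragment))


print(liveness_url("http://localhost:5001/score"))
```

The result can be passed to `requests.get` in place of the sliced string.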

To score data using REST, insert the auth token in the header, load the sample JSON file, and make a POST request to the scoring URI, which ends with `/score`. 

In [None]:
import json

with open('sample-request.json') as f:
    data = json.load(f)

# Send the key as a bearer token and the payload as a JSON body
headers = {'Authorization': f'Bearer {auth_token}'}
response = requests.post(url=scoring_uri, headers=headers, json=data)
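
The same request can be assembled with only the standard library, which makes the headers explicit. The sketch below builds the request object but does not send it. `build_scoring_request` is a hypothetical helper, and the URI, key, and payload are placeholders. Sending a `Content-Type: application/json` header is an assumption about what the scoring server expects.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri: str, auth_token: str, payload: dict) -> urllib.request.Request:
    """Build a POST request with a bearer token and a JSON-encoded body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={
            "Authorization": f"Bearer {auth_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_scoring_request("http://localhost:5001/score", "<KEY>", {"data": [[1, 2, 3]]})
print(request.get_method(), request.full_url)
# To send: urllib.request.urlopen(request)
```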

## Deploy to the Cloud
The scoring server can be deployed in the cloud with few configuration changes. Our Docker image is already built and available in the Azure Container Registry, and our deployment YAML file requires no changes. We first generate a new online endpoint name and proceed with similar steps as above. Note the removal of the `local=True` argument in `ml_client` methods.

### Prepare a new endpoint name and import YAML
There is no change to our YAML.

In [None]:
import os
from random import randint

import yaml

# randint requires integer bounds
endpoint_name = f'docker-basic-{randint(1000, 10_000_000)}'

with open(os.path.join(deployment_name, 'deployment.yml'), 'r') as f:
    deployment_yaml = yaml.safe_load(f)

# Replace the placeholder values from the YAML file
deployment_yaml['name'] = deployment_name
deployment_yaml['endpoint_name'] = endpoint_name
deployment_yaml['environment']['image'] = f'{container_registry_name}.azurecr.io/{container_name}:latest'

### Create the endpoint and deployment

In [None]:
endpoint = ManagedOnlineEndpoint(name=endpoint_name)
ml_client.online_endpoints.begin_create_or_update(endpoint)
deployment = ManagedOnlineDeployment.load_from_dict(deployment_yaml)
deployment = ml_client.online_deployments.begin_create_or_update(deployment)

## Test and score the model

This time, we will score the model using the `.invoke` method.

In [None]:
from pathlib import PurePath
json_file_path = PurePath(os.path.join(deployment_name, 'sample-request.json'))
ml_client.online_endpoints.invoke(endpoint_name=endpoint_name, deployment_name=deployment_name, request_file=json_file_path)

## YAML Files vs AML Python Objects and Entities

Azure Machine Learning YAML files are powerful, concise tools for declaring configuration. The Azure Machine Learning Python SDK v2 offers a complementary, programmatic interface to the same entities.

## Pre-loading dependencies in Docker images
Azure Machine Learning's flexible environment specifications make it easy to deploy with pip, conda, or both, as well as with custom code configurations. However, in many circumstances it is necessary or advantageous to preload as much as possible. Moving dependency installation to build time can reduce deployment time or the size of the codebase, a prime use case for Docker.

```dockerfile
FROM mcr.microsoft.com/azureml/minimal-ubuntu18.04-py37-cpu-inference:latest

RUN apt-get update && apt-get install -y python3 python3-pip
COPY requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt
```