# Deployment to AKS

In this example, we'll deploy an already trained model to Azure Kubernetes Service (AKS). We'll enable authentication, automatic Swagger-interface creation, Application Insights monitoring and model data collection.

In [None]:
!pip install azureml-sdk --upgrade

In [None]:
import os
import azureml.core

from azureml.core import Workspace, Environment
from azureml.core.webservice import AksWebservice, Webservice
from azureml.core.model import Model, InferenceConfig
from azureml.core.compute import AksCompute


print("Azure ML SDK version:", azureml.core.VERSION)

In [None]:
ws = Workspace.from_config()
print(f'WS name: {ws.name}\nRegion: {ws.location}\nSubscription id: {ws.subscription_id}\nResource group: {ws.resource_group}')

First, let's register an already trained model. Details on how to train a model in AzureML can be found in the other examples in this repo.

In [None]:
model = Model.register(model_path="model.pkl",
                       model_name="demo_model_for_aks_deployment",
                       description="Demo model for showcasing AKS deployment",
                       workspace=ws)

# AKS Cluster Creation

For the sake of this tutorial, we'll create a small `DevTest` cluster (takes ~5 minutes). For real production workloads, however, do not use the `DEV_TEST` flag. In many cases, you would create a cluster externally and then attach it to AzureML. If you want to provision your AKS cluster via the AzureML SDK, follow [this link](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-attach-kubernetes?tabs=python) for more details (VNET integration, private load balancer, etc.).

In [None]:
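from azureml.core.compute import AksCompute, ComputeTarget
from azureml.core.compute_target import ComputeTargetException

cluster_name = 'aks-workshop'

try:
    # Reuse the cluster if it already exists, so this cell can be rerun safely
    aks_target = ComputeTarget(workspace=ws, name=cluster_name)
    print(f'Found existing cluster: {cluster_name}')
except ComputeTargetException:
    # DEV_TEST provisions a small cluster; do not use it for production workloads
    prov_config = AksCompute.provisioning_configuration(cluster_purpose=AksCompute.ClusterPurpose.DEV_TEST)
    aks_target = ComputeTarget.create(workspace=ws,
                                      name=cluster_name,
                                      provisioning_configuration=prov_config)
    aks_target.wait_for_completion(show_output=True)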
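If the cluster was created outside of AzureML, it can be attached instead of provisioned. The following is a minimal sketch of that path, not a definitive implementation: the function name `attach_existing_aks` and its parameters are made up for illustration, and the resource group and cluster name are values you would supply yourself. The import sits inside the function so that defining it does not require the SDK to be installed.

```python
def attach_existing_aks(ws, resource_group, cluster_name, target_name):
    """Attach an AKS cluster that already exists in Azure to the workspace.

    `resource_group` and `cluster_name` identify the existing AKS cluster;
    `target_name` is the name the compute target gets inside AzureML.
    All four parameters are illustrative and must be provided by the caller.
    """
    from azureml.core.compute import AksCompute, ComputeTarget

    attach_config = AksCompute.attach_configuration(resource_group=resource_group,
                                                    cluster_name=cluster_name)
    aks_target = ComputeTarget.attach(ws, target_name, attach_config)
    aks_target.wait_for_completion(show_output=True)
    return aks_target
```

A call would look like `attach_existing_aks(ws, "<resource-group>", "<aks-cluster-name>", "aks-workshop")`, with the placeholders replaced by your own values.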

# Deploy model

First, we need to tell AzureML which libraries our model requires. In this case, we create a new environment to use for deployment:

In [None]:
env = Environment.from_conda_specification(name='deployment-env', file_path='config/conda.yml')
env.register(workspace=ws)
build = env.build(workspace=ws)
build.wait_for_completion(show_output=True)
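The `config/conda.yml` file used above is not shown in this notebook. As a rough sketch, it could be generated as follows. The package list and Python version are assumptions and must match whatever the model was trained with; `azureml-defaults` is included because AzureML needs it to host the model as a web service. The helper name `write_conda_spec` is made up for illustration.

```python
from pathlib import Path

# Assumed contents: adjust the Python version and packages to your model.
CONDA_SPEC = """\
name: deployment-env
channels:
  - conda-forge
dependencies:
  - python=3.8
  - pip
  - pip:
      - azureml-defaults
      - scikit-learn
"""


def write_conda_spec(path="config/conda.yml"):
    """Write the conda specification to `path`, creating parent folders."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(CONDA_SPEC, encoding="utf-8")
    return target
```

Calling `write_conda_spec()` before creating the environment would place the file where `Environment.from_conda_specification` expects it.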

Next, we can create the `InferenceConfig`, which describes which code we want to use for deployment, which libraries we need, and whether any special runtimes are required to optimize our model.

In [None]:
environment = Environment.get(workspace=ws, name='deployment-env')
inference_config = InferenceConfig(source_directory='./',
                                   entry_script='score.py',
                                   environment=environment)
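The `score.py` entry script referenced above is not part of this notebook. AzureML expects it to define an `init()` function, called once when the container starts, and a `run()` function, called for every request. Below is a minimal sketch under several assumptions: that `model.pkl` was serialized with `pickle`, that the model's `predict()` accepts the list of records directly (a real model will likely need them converted to its training feature format first), and that returning a dictionary is an acceptable response shape. It also omits the schema decorators that Swagger generation typically relies on.

```python
import json
import os
import pickle

model = None  # populated once by init()


def init():
    """Load the registered model when the service container starts."""
    global model
    # AzureML sets AZUREML_MODEL_DIR inside the deployed container.
    model_dir = os.environ.get("AZUREML_MODEL_DIR", ".")
    with open(os.path.join(model_dir, "model.pkl"), "rb") as f:
        model = pickle.load(f)  # assumption: the model was saved with pickle


def run(raw_data):
    """Score one request; `raw_data` is the raw JSON request body."""
    try:
        records = json.loads(raw_data)["data"]
        predictions = model.predict(records)
        # numpy arrays are not JSON-serializable, so convert them if needed
        if hasattr(predictions, "tolist"):
            predictions = predictions.tolist()
        return {"predictions": list(predictions)}
    except Exception as exc:
        # Return the error to the caller instead of crashing the service
        return {"error": str(exc)}
```

Catching every exception in `run()` keeps the sketch short; in practice you may prefer to let unexpected errors surface in the service logs.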

Lastly, we can deploy the model:

In [None]:
aks_target = AksCompute(ws, "aks-workshop")

model = Model(ws, 'demo_model_for_aks_deployment')

deployment_config = AksWebservice.deploy_configuration(cpu_cores=1,
                                                       memory_gb=1,
                                                       auth_enabled=True,
                                                       enable_app_insights=True,
                                                       collect_model_data=True)
service = Model.deploy(workspace=ws,
                       name="myservice",
                       models=[model],
                       inference_config=inference_config,
                       deployment_config=deployment_config,
                       deployment_target=aks_target)
service.wait_for_deployment(show_output=True)
print(service.state)
print(service.get_logs())

Let's print out some important fields from the model service:

In [None]:
print(f"Service state: {service.state}")
print(f"Service URL: {service.scoring_uri}")
print(f"Service Swagger URL: {service.swagger_uri}")

# Test Model

We can simply send a `requests` call against the model's endpoint URL, passing the authentication key in the `Authorization` header:

In [None]:
import requests

url = service.scoring_uri
key1 = service.get_keys()[0]

test_data = {
  'data': [{
    "Age": 20,
    "Sex": "male",
    "Job": 0,
    "Housing": "own",
    "Saving accounts": "little",
    "Checking account": "little",
    "Credit amount": 100,
    "Duration": 48,
    "Purpose": "radio/TV"
  }]
}

headers = {'Content-Type': 'application/json',
           'Authorization': 'Bearer ' + key1}
response = requests.post(url, json=test_data, headers=headers)

print("Prediction (good, bad):", response.text)
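If `requests` is not available on the client, the same call can be made with the standard library. The sketch below shows how the key ends up in the `Authorization` header. The function names `build_scoring_request` and `score` are made up for illustration, and the code assumes the service answers with a JSON body.

```python
import json
import urllib.request


def build_scoring_request(url, key, records):
    """Build a POST request carrying the records and the service key."""
    body = json.dumps({"data": records}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {key}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def score(url, key, records, timeout=30):
    """Send the request; urlopen raises urllib.error.HTTPError on 4xx/5xx."""
    request = build_scoring_request(url, key, records)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumption: the service returns a JSON document
        return json.loads(response.read().decode("utf-8"))
```

With the variables from the cell above, `score(url, key1, test_data['data'])` would send the same payload.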