## Download model

Triton Inference Server can load a model only if it is stored in the directory structure that Triton expects. [Read more about the directory structure that Triton expects](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/model_repository.html).
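Triton's model repository documentation describes the layout as `<repository>/<model-name>/<version>/<model-file>`, with a numeric version folder. The following is a minimal sketch for sanity-checking a local folder against that layout. `find_triton_models` is a hypothetical helper written for illustration; it is not part of `src.model_utils`, and it does not validate the model files themselves.

```python
from pathlib import Path


def find_triton_models(repo_root):
    """Return {model_name: [version, ...]} for folders that look like Triton models.

    A folder counts as a model if it has at least one numeric
    subdirectory (the version) that contains at least one file.
    """
    found = {}
    for model_dir in sorted(Path(repo_root).iterdir()):
        if not model_dir.is_dir():
            continue
        versions = sorted(
            int(v.name)
            for v in model_dir.iterdir()
            if v.is_dir()
            and v.name.isdecimal()
            and any(f.is_file() for f in v.iterdir())
        )
        if versions:
            found[model_dir.name] = versions
    return found
```

After the download step, calling `find_triton_models` on the folder that holds the model directories (the exact path depends on where `download_triton_models` writes, which is an assumption here) should list each model with its versions.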

In [None]:
!pip install azure-storage-blob

In [None]:
import os
import sys
from pathlib import Path
from src.model_utils import download_triton_models, delete_triton_models

prefix = Path(".")
download_triton_models(prefix)

## Register model

The registered model must follow the model folder structure that Triton specifies so that Triton Inference Server can load it.

In [None]:
subscription = "subscription_id"
resource_group = "resource_group"
workspace = "workspace"
model_name = "densenet-onnx"
endpoint_name = "single2"

In [None]:
!az account set --subscription $subscription
!az configure --defaults workspace=$workspace group=$resource_group

In [None]:
!az ml model create -n $model_name -v 1 -l models -g $resource_group -w $workspace --subscription $subscription

In [None]:
!az ml model show -n $model_name -v 1 -g $resource_group -w $workspace --subscription $subscription

## Create endpoint

Deploy to a pre-created [AKS compute](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.compute.aks.akscompute?view=azure-ml-py#provisioning-configuration-agent-count-none--vm-size-none--ssl-cname-none--ssl-cert-pem-file-none--ssl-key-pem-file-none--location-none--vnet-resourcegroup-name-none--vnet-name-none--subnet-name-none--service-cidr-none--dns-service-ip-none--docker-bridge-cidr-none--cluster-purpose-none--load-balancer-type-none-) target named `aks-gpu-deploy`. For other options, see [our documentation](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-and-where?tabs=azcli).

Note: for an AKS deployment, set the YAML file name in the command below to `endpoint-aks.yml`; for a managed deployment, set it to `endpoint-mir-gpu.yml`.

In [None]:
!az ml endpoint create -g $resource_group -w $workspace -n $endpoint_name -f endpoint-aks.yml

In [None]:
!az ml endpoint show -g $resource_group -w $workspace -n $endpoint_name

## Test the Webservice

#### Get scoring URI 

In [None]:
url = !az ml endpoint show -g $resource_group -w $workspace -n $endpoint_name --query "scoring_uri"
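# Drop the trailing "/score" path; str.rstrip would strip characters, not the suffix.
service_url = url[1].strip('!"')
if service_url.endswith("/score"):
    service_url = service_url[: -len("/score")]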
print(service_url)

#### Get Auth token

In [None]:
import re
key = !az ml endpoint get-credentials -n $endpoint_name -g $resource_group -w $workspace
Service_key = re.split(": |,",key[2])[1]
print(Service_key)

#### Check the state of server

In [None]:
import requests

service_key = Service_key.strip('!"')
headers = {}
headers["Authorization"] = f"Bearer {service_key}"

requests.get(f"{service_url}/v2/health/ready", headers=headers)
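A freshly created endpoint may take a while before the readiness check succeeds. The following is a minimal polling sketch, assuming that an HTTP 200 from `/v2/health/ready` means the server is ready. `wait_until_ready` is a hypothetical helper, and the timeout and interval defaults are arbitrary illustrative values.

```python
import time


def wait_until_ready(check, timeout_s=300.0, interval_s=5.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or timeout_s elapses.

    Returns True if check() succeeded, False on timeout.
    sleep and clock are injectable so the loop can be tested offline.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Usage with the variables defined in this notebook (requires network access):
# ready = wait_until_ready(
#     lambda: requests.get(f"{service_url}/v2/health/ready", headers=headers).status_code == 200
# )
```

Passing the check as a callable keeps the retry logic separate from the HTTP call, so the same loop can wrap the model status check as well.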

#### Check the status of model

In [None]:
resp = requests.post(f"{service_url}/v2/repository/index", headers=headers)
print(resp.text)
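The repository index response is a JSON list with one entry per model. The sketch below filters it for loaded models, assuming each entry carries `name`, `version`, and `state` fields as described in Triton's model repository extension. `ready_models` is a hypothetical helper, and the sample payload is illustrative rather than captured from a real server.

```python
import json


def ready_models(index_text):
    """Return (name, version) pairs whose state is READY in a repository index response."""
    entries = json.loads(index_text)
    return [
        (entry["name"], entry.get("version", ""))
        for entry in entries
        if entry.get("state") == "READY"
    ]


# Illustrative payload, not captured from a real server.
sample_index = json.dumps([
    {"name": "densenet_onnx", "version": "1", "state": "READY"},
    {"name": "other_model", "version": "1", "state": "UNAVAILABLE", "reason": "example"},
])
print(ready_models(sample_index))
```

With a live endpoint, `ready_models(resp.text)` would take the place of the sample payload.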

#### Check metadata of model for inference 

In [None]:
resp = requests.get(f"{service_url}/v2/models/densenet_onnx", headers=headers)
print(resp.text)
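The metadata response tells you which tensor names, datatypes, and shapes the inference request must use. The sketch below summarizes it, assuming the v2 metadata format with `inputs` and `outputs` lists of `name`, `datatype`, and `shape`. `describe_io` is a hypothetical helper, and the sample values (including the output name) are illustrative only.

```python
import json


def describe_io(metadata_text):
    """Map each input/output tensor name to (datatype, shape) from a v2 model metadata response."""
    meta = json.loads(metadata_text)

    def table(key):
        return {t["name"]: (t["datatype"], t["shape"]) for t in meta.get(key, [])}

    return {"inputs": table("inputs"), "outputs": table("outputs")}


# Illustrative payload, not captured from a real server.
sample_metadata = json.dumps({
    "name": "densenet_onnx",
    "inputs": [{"name": "data_0", "datatype": "FP32", "shape": [3, 224, 224]}],
    "outputs": [{"name": "example_output", "datatype": "FP32", "shape": [1000]}],
})
print(describe_io(sample_metadata))
```

With a live endpoint, `describe_io(resp.text)` gives the names to use in the `inputs` list of the inference request.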

#### Install pillow for densenet preprocessing

In [None]:
!pip install pillow

#### Test the model

In [None]:
import json
import numpy as np
from src.densenet_utils import preprocess, postprocess
from pathlib import Path

img_content = requests.get("https://aka.ms/peacock-pic").content
img_data = preprocess(img_content, scaling="INCEPTION")

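# Build the v2 inference request body with json.dumps rather than string concatenation.
score_input = json.dumps({"inputs": [{"name": "data_0", "data": img_data.flatten().tolist(), "datatype": "FP32", "shape": [1, 3, 224, 224]}]})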
resp = requests.post(f"{service_url}/v2/models/densenet_onnx/infer", data=score_input, headers=headers)
data = json.loads(resp.text)

max_label = np.array(data["outputs"][0]["data"]).argmax()

label_path = Path(".").joinpath("src","densenet_labels.txt")
result = postprocess(max_label, label_path)

result
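The request above sends one input tensor as flattened data plus its shape, then takes the index of the largest output score. The sketch below separates those two steps into small helpers, using only the standard library. `build_infer_body` and `top_k` are hypothetical names, not part of `src.densenet_utils`, and `top_k` assumes that the output data is a flat list of per-class scores.

```python
import json
from math import prod


def build_infer_body(name, flat_values, shape, datatype="FP32"):
    """Serialize one input tensor as a v2 inference request body."""
    values = list(flat_values)
    expected = prod(shape)
    if len(values) != expected:
        raise ValueError(f"expected {expected} values for shape {list(shape)}, got {len(values)}")
    return json.dumps(
        {"inputs": [{"name": name, "shape": list(shape), "datatype": datatype, "data": values}]}
    )


def top_k(scores, k=5):
    """Return the k (index, score) pairs with the highest scores, best first."""
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)[:k]
```

In this notebook, the body could be built with `build_infer_body("data_0", img_data.flatten().tolist(), [1, 3, 224, 224])`, and `top_k(data["outputs"][0]["data"])` would list the five highest-scoring class indices instead of only the best one.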

## Delete the webservice and the model

In [None]:
!az ml endpoint delete -n $endpoint_name -g $resource_group -w $workspace
!az ml model delete -n $model_name -v 1

## Next steps

Read [our documentation](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-with-triton?tabs=python) to learn how to use Triton with your own models, or check out the other notebooks in this folder for ways to do pre- and post-processing on the server.