## Download model

For Triton Inference Server to load the model, the downloaded files must follow the directory structure that Triton expects. [Read more about the directory structure that Triton expects](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/model_repository.html).
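As a quick sanity check on the layout, the sketch below walks a model repository and lists every model that has a non-empty, numerically named version directory (for example `densenet_onnx/1/`). The helper name `find_model_versions` is hypothetical and not part of this sample's `src` package, and the check is looser than Triton's own validation.

```python
from pathlib import Path


def find_model_versions(repo_root):
    """Return sorted (model_name, version) pairs found under repo_root.

    A version directory is any numerically named subdirectory of a model
    directory that contains at least one entry. This is an illustrative
    check only; Triton applies its own, stricter rules when loading.
    """
    found = []
    for model_dir in Path(repo_root).iterdir():
        if not model_dir.is_dir():
            continue
        for version_dir in model_dir.iterdir():
            is_version = version_dir.is_dir() and version_dir.name.isdigit()
            if is_version and any(version_dir.iterdir()):
                found.append((model_dir.name, int(version_dir.name)))
    return sorted(found)
```

Calling `find_model_versions("models")` after the download step should list at least one model and version if the files landed in the expected place.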

In [None]:
!pip install azure-storage-blob

In [None]:
import os
import sys
from pathlib import Path
from src.model_utils import download_triton_models, delete_triton_models

prefix = Path(".")
download_triton_models(prefix)

## Register model

The registered model must follow the Triton-specified model folder structure so that Triton Inference Server can load it.

In [33]:
subscription = "subscription_id"
resource_group = "resource_group"
workspace = "workspace"
model_name = "densenet-onnx"
endpoint_name = "single2"

In [None]:
!az ml model create -n $model_name -v 1 -l models -g $resource_group -w $workspace --subscription $subscription

In [None]:
!az ml model show -n $model_name -v 1 -g $resource_group -w $workspace --subscription $subscription

## Create endpoint

Deploy to a [pre-created AKS compute](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-attach-kubernetes?tabs=python#create-a-new-aks-cluster) named `aks-test`.

In [None]:
!az ml endpoint create -g $resource_group -w $workspace -n $endpoint_name -f .\endpoint-aks.yml

In [None]:
!az ml endpoint show -g $resource_group -w $workspace -n $endpoint_name

## Test Webservice

Get the scoring URI and auth token.

In [27]:
!az ml endpoint show -g $resource_group -w $workspace -n $endpoint_name --query "scoring_uri"



In [None]:
!az ml endpoint list-keys -g $resource_group -w $workspace -n $endpoint_name

In [1]:
import requests

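# Placeholder: replace with a key returned by `az ml endpoint list-keys` above.
service_key = "service_key"
headers = {"Authorization": f"Bearer {service_key}"}

# Placeholder: replace with the URL obtained from the scoring URI above,
# then check that the server is ready.
service_url = "service_url"
requests.get(f"{service_url}/v2/health/ready", headers=headers)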

<Response [200]>
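If the endpoint is still starting, a single readiness request may not return 200 yet. The sketch below retries a probe until it succeeds. `wait_until_ready`, its parameters, and the default retry values are hypothetical and chosen only for illustration; the probe is passed in as a callable so the loop itself makes no assumptions about the HTTP client.

```python
import time


def wait_until_ready(probe, attempts=10, delay_seconds=5.0, sleep=time.sleep):
    """Call probe() until it returns HTTP status 200 or attempts run out.

    Returns the number of the attempt that succeeded and raises
    TimeoutError if none did.
    """
    for attempt in range(1, attempts + 1):
        if probe() == 200:
            return attempt
        if attempt < attempts:
            sleep(delay_seconds)  # wait before the next try
    raise TimeoutError(f"server not ready after {attempts} attempts")


# Example with the variables defined in the cell above:
# wait_until_ready(
#     lambda: requests.get(f"{service_url}/v2/health/ready", headers=headers).status_code
# )
```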

In [2]:
# Check the status of the models in the repository.
resp = requests.post(f"{service_url}/v2/repository/index", headers=headers)
print(resp.text)

[{"name":"densenet_onnx","version":"1","state":"READY"}]


In [14]:
# Check the metadata of the model used for inference.
resp = requests.get(f"{service_url}/v2/models/densenet_onnx", headers=headers)
print(resp.text)

{"name":"densenet_onnx","versions":["1"],"platform":"onnxruntime_onnx","inputs":[{"name":"data_0","datatype":"FP32","shape":[1,3,224,224]}],"outputs":[{"name":"fc6_1","datatype":"FP32","shape":[1,1000,1,1]}]}
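The metadata response carries the tensor names, datatypes, and shapes that an inference request has to match. A minimal sketch of pulling them out instead of reading them by eye; `describe_io` is a hypothetical helper, not part of this sample.

```python
import json


def describe_io(metadata_text):
    """Extract (name, datatype, shape) for each input and output tensor."""
    metadata = json.loads(metadata_text)

    def summarize(tensors):
        return [(t["name"], t["datatype"], t["shape"]) for t in tensors]

    return summarize(metadata["inputs"]), summarize(metadata["outputs"])
```

With the response above, `describe_io(resp.text)` gives one input, `data_0` (FP32, shape `[1, 3, 224, 224]`), and one output, `fc6_1` (FP32, shape `[1, 1000, 1, 1]`).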


In [46]:
# Install pillow for densenet preprocessing
!pip install pillow

Collecting pillow
  Downloading Pillow-8.2.0-cp38-cp38-win_amd64.whl (2.2 MB)
Installing collected packages: pillow
Successfully installed pillow-8.2.0


In [4]:
import json
import numpy as np
from src.densenet_utils import preprocess, postprocess
from pathlib import Path

img_content = requests.get("https://aka.ms/peacock-pic").content
img_data = preprocess(img_content, scaling="INCEPTION")

# Build the request body as a dict and serialize it with json.dumps
# rather than concatenating strings by hand.
score_input = json.dumps({
    "inputs": [{
        "name": "data_0",
        "datatype": "FP32",
        "shape": [1, 3, 224, 224],
        "data": img_data.flatten().tolist(),
    }]
})
resp = requests.post(f"{service_url}/v2/models/densenet_onnx/infer", data=score_input, headers=headers)
data = json.loads(resp.text)

max_label = np.array(data["outputs"][0]["data"]).argmax()

label_path = Path(".").joinpath("src","densenet_labels.txt")
result = postprocess(max_label, label_path)

result

'84 : PEACOCK'
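The cell above reports only the single highest-scoring class. To see the runners-up as well, the flat score list can be ranked directly. This is a small sketch in plain Python; `top_k` is a hypothetical helper, and it returns class indices, which would still need to be mapped to names through the labels file.

```python
def top_k(scores, k=5):
    """Return the k (index, score) pairs with the highest scores, best first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]
```

For example, `top_k(data["outputs"][0]["data"])` returns the five best class indices with their raw scores; the first index should match `max_label`.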