# Many Models

This example shows how to serve several models from a single Triton Inference Server deployment and query each of them. The models can run on different frameworks.

![multi](multimodel.png)

## Download models

Triton Inference Server can only load models that follow its expected model repository directory structure. [Read more about the directory structure that Triton expects](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/model_repository.html).
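As a rough illustration of that layout, the sketch below checks that every model directory contains at least one numerically named version directory with something inside it. The function name `find_layout_problems` is hypothetical and not part of this repository, and the check is deliberately shallow: it does not validate `config.pbtxt` or the model files themselves.

```python
from pathlib import Path


def find_layout_problems(repo_root):
    """Return a list of human-readable layout problems (hypothetical helper)."""
    problems = []
    root = Path(repo_root)
    # Each visible subdirectory of the repository root is treated as one model.
    model_dirs = sorted(
        p for p in root.iterdir() if p.is_dir() and not p.name.startswith(".")
    )
    if not model_dirs:
        problems.append(f"{root}: no model directories found")
    for model_dir in model_dirs:
        # Version directories are named with digits only, for example "1".
        versions = sorted(
            p for p in model_dir.iterdir() if p.is_dir() and p.name.isdigit()
        )
        if not versions:
            problems.append(f"{model_dir.name}: no numeric version directory")
        for version_dir in versions:
            if not any(version_dir.iterdir()):
                problems.append(
                    f"{model_dir.name}/{version_dir.name}: version directory is empty"
                )
    return problems
```

Assuming the download step places the models in a `models` folder under `prefix`, calling `find_layout_problems(prefix / "models")` after the download should return an empty list when the layout is intact.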

In [1]:
!pip install azure-storage-blob



In [2]:
import os
import sys
from pathlib import Path
from src.model_utils import download_triton_models, delete_triton_models

prefix = Path(".")
download_triton_models(prefix)

successfully downloaded model: densenet_onnx
successfully downloaded model: bidaf-9


## Register models

Register the downloaded `models` folder, which contains both models, as a single model in the workspace. The registered folder must follow the Triton model repository structure for Triton Inference Server to be able to load it.

In [16]:
subscripton = "92c76a2f-0e1c-4216-b65e-abf7a3f34c1e"
resource_group = "gsc-aml-lite-rg2"
workspace = "gsc-aml-ws3"
model_name = "densenet-onnx"
service_name = "multi5"

In [4]:
!az ml model create -n $model_name -v 2 -l models -g $resource_group -w $workspace --subscription $subscripton

Command group 'ml model' is experimental and under development. Reference and support levels: https://aka.ms/CLI_refstatus
Uploading models: 100%|███████████████████████████| 2/2 [00:39<00:00, 19.99s/it]
{
  "creation_context": {
    "created_at": "2021-05-20T09:27:48.870589+00:00",
    "created_by": "Arzoo Aneja (ZEN3 INFOSOLUTIONS AMERICA INC)",
    "created_by_type": "User",
    "last_modified_at": "2021-05-20T09:27:48.870589+00:00",
    "last_modified_by": "Arzoo Aneja (ZEN3 INFOSOLUTIONS AMERICA INC)",
    "last_modified_by_type": "User"
  },
  "datastore": "azureml:/subscriptions/92c76a2f-0e1c-4216-b65e-abf7a3f34c1e/resourceGroups/gsc-aml-lite-rg2/providers/Microsoft.MachineLearningServices/workspaces/gsc-aml-ws3/datastores/workspaceblobstore",
  "flavors": {},
  "id": "azureml:/subscriptions/92c76a2f-0e1c-4216-b65e-abf7a3f34c1e/resourceGroups/gsc-aml-lite-rg2/providers/Microsoft.MachineLearningServices/workspaces/gsc-aml-ws3/models/densenet-onnx/versions/2",
  "name": "

In [5]:
!az ml model show -n $model_name -v 2 -g $resource_group -w $workspace --subscription $subscripton

Command group 'ml model' is experimental and under development. Reference and support levels: https://aka.ms/CLI_refstatus
{
  "creation_context": {
    "created_at": "2021-05-20T09:27:48.870589+00:00",
    "created_by": "Arzoo Aneja (ZEN3 INFOSOLUTIONS AMERICA INC)",
    "created_by_type": "User",
    "last_modified_at": "2021-05-20T09:27:48.870589+00:00",
    "last_modified_by": "Arzoo Aneja (ZEN3 INFOSOLUTIONS AMERICA INC)",
    "last_modified_by_type": "User"
  },
  "datastore": "azureml:/subscriptions/92c76a2f-0e1c-4216-b65e-abf7a3f34c1e/resourceGroups/gsc-aml-lite-rg2/providers/Microsoft.MachineLearningServices/workspaces/gsc-aml-ws3/datastores/workspaceblobstore",
  "flavors": {},
  "id": "azureml:/subscriptions/92c76a2f-0e1c-4216-b65e-abf7a3f34c1e/resourceGroups/gsc-aml-lite-rg2/providers/Microsoft.MachineLearningServices/workspaces/gsc-aml-ws3/models/densenet-onnx/versions/2",
  "name": "densenet-onnx",
  "path": "az-ml-artifacts/f2c796d0d329fd5d0865f7a2681243ae/model

## Create endpoint

Deploy the endpoint to a pre-created AKS compute target.

In [17]:
!az ml endpoint create -g $resource_group -w $workspace --name $service_name -f deployment.yml

Command group 'ml endpoint' is experimental and under development. Reference and support levels: https://aka.ms/CLI_refstatus

The deployment request gsc-aml-ws3-multi5-2780481 was accepted,  status can be found in the link below: 
https://ms.portal.azure.com/#blade/HubsExtension/DeploymentDetailsBlade/overview/id/%2Fsubscriptions%2F92c76a2f-0e1c-4216-b65e-abf7a3f34c1e%2FresourceGroups%2Fgsc-aml-lite-rg2%2Fproviders%2FMicrosoft.Resources%2Fdeployments%2Fgsc-aml-ws3-multi5-2780481

Registering code version (a2cd405d-9214-4f41-902e-956c6bff421d:1)  Done (5s)
Creating endpoint multi5 ..  Done (33s)
Creating deployment etblue: Failed with operation id= 6F66019DD4C88F3B, service request id=45b5cb9c-338f-440e-b816-b616e3098f44, status=NotFound, error message = {'additional_properties': {}, 'code': None, 'message': None, 'target': None, 'details': None, 'additional_info': None}.
More details: None
Polling hit the exception (DeploymentFailed) At least one resource deployment operation

In [None]:
!az ml endpoint show -g $resource_group -w $workspace --name $service_name

## Test Webservice

Get the scoring URI and the authentication key for the endpoint.

In [None]:
!az ml endpoint list-keys -g $resource_group -w $workspace --name $service_name

In [None]:
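import requests

# Placeholder: replace with a key returned by `az ml endpoint list-keys` above.
service_key = "service_key"
headers = {"Authorization": f"Bearer {service_key}"}

# Placeholder: replace with the scoring URI of the endpoint.
service_url = "service_url"

# Check whether the server is ready to accept requests.
requests.get(f"{service_url}/v2/health/ready", headers=headers)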
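A freshly created deployment may need some time before the readiness check succeeds, so it can help to retry rather than probe once. The following is a minimal sketch, assuming the readiness endpoint answers with HTTP 200 once the server is ready; `wait_until_ready` and its parameters are hypothetical names, and the probe is passed in as a callable so the retry logic stays independent of `requests`.

```python
import time


def wait_until_ready(probe, attempts=10, delay_seconds=5.0, sleep=time.sleep):
    """Call probe() until it returns 200 or attempts run out (hypothetical helper).

    probe is any zero-argument callable that returns an HTTP status code.
    """
    for attempt in range(attempts):
        if probe() == 200:
            return True
        # Do not wait after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

With the variables from the cell above, the probe could be `lambda: requests.get(f"{service_url}/v2/health/ready", headers=headers).status_code`.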

In [None]:
# Check the status of every model in the repository.
resp = requests.post(f"{service_url}/v2/repository/index", headers=headers)
print(resp.text)
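Rather than reading the printed index by eye, the response can be checked programmatically. The sketch below assumes the index is a JSON array whose entries carry `name` and `state` fields, with `READY` marking a loaded model; `models_not_ready` is a hypothetical helper, and if a model has several versions only the last entry listed for it is considered.

```python
import json


def models_not_ready(index_json, expected_models):
    """Map each expected model that is not READY to its state (hypothetical helper)."""
    entries = json.loads(index_json)
    # Later entries for the same model name overwrite earlier ones.
    states = {entry.get("name"): entry.get("state", "UNKNOWN") for entry in entries}
    return {
        name: states.get(name, "MISSING")
        for name in expected_models
        if states.get(name) != "READY"
    }
```

For this example, `models_not_ready(resp.text, ["densenet_onnx", "bidaf-9"])` would return an empty dictionary when both models are loaded.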

In [None]:
# Check the metadata of each model that will be used for inference.
resp = requests.get(f"{service_url}/v2/models/bidaf-9", headers=headers)
print(resp.text)

In [None]:
resp = requests.get(f"{service_url}/v2/models/densenet_onnx", headers=headers)
print(resp.text)
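The metadata responses describe each model's inputs (name, datatype, and shape), which is the information needed to build an inference request. As a hedged sketch, the helper below assembles a JSON body in the shape used by the v2 inference protocol; `build_infer_request` is a hypothetical name, and the input name, shape, and datatype must be taken from the metadata printed above rather than from this example.

```python
import math


def build_infer_request(input_name, shape, datatype, flat_data):
    """Build a single-input v2 inference request body (hypothetical helper)."""
    expected = math.prod(shape)
    if len(flat_data) != expected:
        raise ValueError(
            f"shape {list(shape)} needs {expected} values, got {len(flat_data)}"
        )
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": list(shape),
                "datatype": datatype,
                # Tensor values are sent flattened in row-major order.
                "data": list(flat_data),
            }
        ]
    }
```

The resulting dictionary could then be sent with `requests.post(f"{service_url}/v2/models/densenet_onnx/infer", headers=headers, json=payload)`, assuming the deployment exposes the standard inference route alongside the metadata routes used above.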