# Create a multimodel deployment

In this example, we serve two models from the same endpoint and the same deployment.

Both model files are registered as a single model asset on Azure and loaded simultaneously in the scoring script. The scoring script parses each request for a "model" field and routes the payload accordingly.

## 1. Configure parameters, assets, and clients

### 1.1 Set workspace details

In [None]:
subscription_id = "<SUBSCRIPTION_ID>"
resource_group = "<RESOURCE_GROUP>"
workspace_name = "<AML_WORKSPACE_NAME>"

### 1.2 Set endpoint details

In [None]:
import random

endpoint_name = f"multimod-{random.randint(0,10000)}"

### 1.3 Set asset paths
Define the directories that contain the two model files, the scoring script, and the test data, along with the path to the Conda file.

In [None]:
import os

base_path = "../../../../../cli/endpoints/online/custom-container/minimal/multimodel"
conda_file_path = (
    "../../../assets/environment/conda-yamls/online-endpoints-multimodel.yml"
)
models_path = os.path.join(base_path, "models")
code_path = os.path.join(base_path, "code")
test_data_path = os.path.join(base_path, "test-data")

### 1.4 Examine the models folder
The models folder contains two models which will be loaded simultaneously by the scoring script.

In [None]:
import os

os.listdir(models_path)
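
The scoring script in section 1.5 derives each model's key by dropping the last four characters of its file name. The sketch below illustrates that mapping and compares it with `Path.stem`. The file names used here are hypothetical stand-ins, not the actual folder contents.

```python
from pathlib import Path

# Hypothetical file names; check the os.listdir output for the real ones.
file_names = ["diabetes.sav", "iris.sav"]

# Slicing assumes a dot followed by a three-character extension.
keys_by_slice = [name[:-4] for name in file_names]

# Path.stem removes the final extension regardless of its length.
keys_by_stem = [Path(name).stem for name in file_names]

print(keys_by_slice)
print(keys_by_stem)

# With a longer extension the two approaches diverge.
long_name = "model.joblib"
print(long_name[:-4], Path(long_name).stem)
```

The two approaches agree for three-character extensions, so the slicing used in the scoring script is adequate as long as the model files follow that pattern.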

### 1.5 Examine the scoring script

The scoring script loads both models in the `init` function and stores them in a dictionary keyed on model name. In the `run` function, each request is parsed for a `model` key in the JSON body to choose the model. The `data` payload is then passed to that model.

```python 
import joblib
import os
import pandas as pd
from pathlib import Path
import json

models = None


def init():
    global models
    model_dir = Path(os.getenv("AZUREML_MODEL_DIR")) / "models"
    models = {m[:-4]: joblib.load(model_dir / m) for m in os.listdir(model_dir)}


def run(data):
    data = json.loads(data)
    try:
        # Look up the requested model inside the try block so that an
        # unknown name is reported as "No such model".
        model = models[data["model"]]
    except KeyError:
        raise KeyError("No such model")
    payload = pd.DataFrame(data["data"])
    ret = model.predict(payload)
    return pd.DataFrame(ret).to_json()

``` 
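
The routing idea can be tried locally without Azure or scikit-learn. The following is a minimal sketch, not the deployed script: `ConstantModel` and `route` are hypothetical names, and the stub models return fixed values in place of real predictions.

```python
import json


class ConstantModel:
    """Hypothetical stand-in for a fitted estimator; not part of the sample."""

    def __init__(self, value):
        self.value = value

    def predict(self, rows):
        # Return one constant prediction per input row.
        return [self.value for _ in rows]


# The real script builds this dictionary from the files in the model directory.
models = {"diabetes": ConstantModel(1.0), "iris": ConstantModel(0)}


def route(raw_request):
    """Parse the request, pick the model named in `model`, and score `data`."""
    request = json.loads(raw_request)
    name = request["model"]
    if name not in models:
        raise KeyError(f"No such model: {name}")
    return models[name].predict(request["data"])


iris_request = json.dumps({"model": "iris", "data": [[5.1, 3.5, 1.4, 0.2]]})
print(route(iris_request))
```

Checking the model name before scoring keeps an unknown-model error distinct from any error raised during prediction.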

### 1.6 Examine the Conda file
The Conda file is located at `sdk/python/assets/environment/conda-yamls/online-endpoints-multimodel.yml`.

```yaml
name: multimodel
channels:
  - defaults
dependencies: 
  - python=3.8
  - pip
  - pip: 
    - pandas 
    - numpy
    - scikit-learn
    - joblib
```

### 1.7 Create an MLClient instance

In [None]:
from azure.ai.ml import MLClient
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    Model,
    CodeConfiguration,
    Environment,
    BuildContext,
    ProbeSettings,
)
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
ml_client = MLClient(
    credential,
    subscription_id=subscription_id,
    resource_group_name=resource_group,
    workspace_name=workspace_name,
)

## 2. Create an endpoint

### 2.1 Define and create the endpoint

In [None]:
endpoint = ManagedOnlineEndpoint(name=endpoint_name)
poller = ml_client.online_endpoints.begin_create_or_update(endpoint)
poller.wait()

### 2.2 Confirm that creation was successful

In [None]:
from azure.ai.ml.exceptions import DeploymentException

status = poller.status()
if status != "Succeeded":
    raise DeploymentException(status)
else:
    print("Endpoint creation succeeded")
    endpoint = poller.result()
    print(endpoint)
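
The same status check is repeated for the deployment in section 3.4, so it could be factored into a small helper. The sketch below is one possible shape under stated assumptions: `result_or_raise` and `FakePoller` are hypothetical names, `FakePoller` only mimics the `status()` and `result()` methods used above so the example runs offline, and `RuntimeError` stands in for `DeploymentException`.

```python
class FakePoller:
    """Hypothetical stand-in exposing the status()/result() methods used above."""

    def __init__(self, status, result=None):
        self._status = status
        self._result = result

    def status(self):
        return self._status

    def result(self):
        return self._result


def result_or_raise(poller, what):
    """Return the poller's result, or raise if the operation did not succeed."""
    status = poller.status()
    if status != "Succeeded":
        raise RuntimeError(f"{what} finished with status {status!r}")
    return poller.result()


created = result_or_raise(FakePoller("Succeeded", {"name": "demo"}), "Endpoint creation")
print(created)
```

In the notebook, the helper would receive the poller returned by `begin_create_or_update` after `poller.wait()` has completed.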

## 3. Create the deployment

### 3.1 Create the environment

In [None]:
environment = Environment(
    name="minimal-multimodel-conda",
    image="mcr.microsoft.com/azureml/minimal-ubuntu20.04-py38-cpu-inference",
    conda_file=conda_file_path,
)
environment = ml_client.environments.create_or_update(environment)

### 3.2 Define the deployment

In [None]:
deployment = ManagedOnlineDeployment(
    name="custom-container-multimodel",
    endpoint_name=endpoint_name,
    model=Model(name="minimal-multimodel", path=models_path),
    code_configuration=CodeConfiguration(
        code=code_path, scoring_script="minimal-multimodel-score.py"
    ),
    environment=f"azureml:{environment.name}:{environment.version}",
    instance_type="Standard_DS2_v2",
    instance_count=1,
)

### 3.3 Create the deployment

In [None]:
poller = ml_client.online_deployments.begin_create_or_update(deployment)
poller.wait()

### 3.4 Confirm that creation was successful

In [None]:
status = poller.status()
if status != "Succeeded":
    raise DeploymentException(status)
else:
    print("Deployment creation succeeded")
    deployment = poller.result()
    print(deployment)

### 3.5 Set traffic to 100% 

In [None]:
endpoint.traffic = {"custom-container-multimodel": 100}
poller = ml_client.begin_create_or_update(endpoint)
poller.wait()

## 4. Test the endpoint
The `model` field in each JSON payload indicates which model scores the request.
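
For reference, a request file of the shape the scoring script expects can be assembled as follows. This is a sketch under stated assumptions: the `model` and `data` keys come from the scoring script, while the model name `"iris"`, the feature values, and the helper `write_request` are illustrative and may not match the shipped test data.

```python
import json
import os
import tempfile


def write_request(model_name, rows, directory):
    """Write a request body with the `model` and `data` keys the scoring script reads."""
    path = os.path.join(directory, f"{model_name}-request.json")
    with open(path, "w") as f:
        json.dump({"model": model_name, "data": rows}, f)
    return path


with tempfile.TemporaryDirectory() as tmp:
    # "iris" and the feature values are illustrative assumptions.
    request_path = write_request("iris", [[5.1, 3.5, 1.4, 0.2]], tmp)
    with open(request_path) as f:
        request_body = json.load(f)

print(request_body)
```

A file written this way could be passed as `request_file` to `ml_client.online_endpoints.invoke`, provided the model name matches a key loaded by the scoring script.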

### 4.1 Test the diabetes model

In [None]:
import json

res = ml_client.online_endpoints.invoke(
    endpoint_name, request_file=os.path.join(test_data_path, "diabetes-test-data.json")
)
print(json.loads(res))

### 4.2 Test the iris model

In [None]:
res = ml_client.online_endpoints.invoke(
    endpoint_name, request_file=os.path.join(test_data_path, "iris-test-data.json")
)
print(json.loads(res))

## 5. Delete assets

### 5.1 Delete the endpoint

In [None]:
poller = ml_client.online_endpoints.begin_delete(name=endpoint_name)