# Deploy a model served with Triton using a custom container in an online endpoint
Learn how to deploy a model using Triton as an online endpoint in Azure Machine Learning.

Triton is multi-framework, open-source software that is optimized for inference. It supports popular machine learning frameworks like TensorFlow, ONNX Runtime, PyTorch, NVIDIA TensorRT, and more. It can be used for your CPU or GPU workloads.

## Prerequisites

* To use Azure Machine Learning, you must have an Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning](https://azure.microsoft.com/free/).

* Install and configure the [Python SDK v2](sdk/setup.sh).

* You must have an Azure resource group, and you (or the service principal you use) must have Contributor access to it.

* You must have an Azure Machine Learning workspace. 

### Please note: for Triton no-code deployment, testing via local endpoints is currently not supported, so this tutorial only shows how to set up an online endpoint.

# 1. Connect to Azure Machine Learning Workspace

The [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/concept-workspace) is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section, we connect to the workspace in which the endpoint will be deployed.

In [1]:
# Import required libraries
from azure.ai.ml import MLClient
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    Model,
    Environment,
    CodeConfiguration,
)
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential, AzureCliCredential

## 1.2. Configure workspace details and get a handle to the workspace

To connect to a workspace, we need identifier parameters: a subscription ID, a resource group, and a workspace name. We pass these details to `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace. The cell below authenticates with `AzureCliCredential`, which reuses the session created by `az login`; [default Azure authentication](https://docs.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python) is the usual alternative. Check the [configuration notebook](../../jobs/configuration.ipynb) for more details on how to configure credentials and connect to a workspace.

In [2]:
# enter details of your AML workspace
subscription_id = "f57ce3c6-5c6f-4f1e-8cba-b782d8974590"
resource_group = "rg-azureml-pg"
workspace = "aml-pg-02"
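Hard-coding workspace identifiers makes a notebook harder to share. As a minimal sketch, the identifiers could instead be read from environment variables. The variable names `AZURE_SUBSCRIPTION_ID`, `AZURE_RESOURCE_GROUP`, and `AZUREML_WORKSPACE_NAME` and the helper `load_workspace_config` are assumptions made for this illustration, not names the SDK requires.

```python
import os

# Hypothetical environment variable names, chosen for this sketch only.
REQUIRED_VARS = {
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
    "resource_group": "AZURE_RESOURCE_GROUP",
    "workspace": "AZUREML_WORKSPACE_NAME",
}


def load_workspace_config(environ=None):
    """Return the three workspace identifiers, failing early if any is missing."""
    environ = os.environ if environ is None else environ
    missing = [var for var in REQUIRED_VARS.values() if not environ.get(var)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {key: environ[var] for key, var in REQUIRED_VARS.items()}


# Example with an explicit mapping instead of the real environment
config = load_workspace_config(
    {
        "AZURE_SUBSCRIPTION_ID": "<SUBSCRIPTION_ID>",
        "AZURE_RESOURCE_GROUP": "<RESOURCE_GROUP>",
        "AZUREML_WORKSPACE_NAME": "<WORKSPACE_NAME>",
    }
)
print(config)
```

The returned values can then be assigned to `subscription_id`, `resource_group`, and `workspace` before creating the `MLClient`.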

In [14]:
!az login --use-device-code

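# get a handle to the workspace
# AzureCliCredential reuses the `az login` session above; swap in
# DefaultAzureCredential() to restore the default authentication flow
ml_client = MLClient(
    AzureCliCredential(), subscription_id, resource_group, workspace
)

# list the names of the online endpoints that already exist in the workspace
for existing_endpoint in ml_client.online_endpoints.list():
    print(existing_endpoint.name)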

ClientAuthenticationError: /usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.12) or chardet (3.0.4) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3/dist-packages/azure/cli/__main__.py", line 33, in <module>
    az_cli = get_default_cli()
  File "/usr/lib/python3/dist-packages/azure/cli/core/__init__.py", line 554, in get_default_cli
    return AzCli(cli_name='az',
  File "/usr/lib/python3/dist-packages/azure/cli/core/__init__.py", line 61, in __init__
    register_ids_argument(self)  # global subscription must be registered first!
  File "/usr/lib/python3/dist-packages/azure/cli/core/commands/arm.py", line 182, in register_ids_argument
    from msrestazure.tools import parse_resource_id, is_valid_resource_id
  File "/usr/lib/python3/dist-packages/msrestazure/__init__.py", line 28, in <module>
    from .azure_configuration import AzureConfiguration
  File "/usr/lib/python3/dist-packages/msrestazure/azure_configuration.py", line 34, in <module>
    from msrest import Configuration
  File "/home/amluser/.local/lib/python3.8/site-packages/msrest/__init__.py", line 28, in <module>
    from .configuration import Configuration
  File "/home/amluser/.local/lib/python3.8/site-packages/msrest/configuration.py", line 38, in <module>
    from .universal_http.requests import (
  File "/home/amluser/.local/lib/python3.8/site-packages/msrest/universal_http/__init__.py", line 53, in <module>
    from ..exceptions import ClientRequestError, raise_with_traceback
  File "/home/amluser/.local/lib/python3.8/site-packages/msrest/exceptions.py", line 31, in <module>
    from azure.core.exceptions import SerializationError, DeserializationError
ImportError: cannot import name 'SerializationError' from 'azure.core.exceptions' (/usr/lib/python3/dist-packages/azure/core/exceptions.py)


# 2. Install Additional Requirements

Install additional Python requirements using the following command. These will be used for scoring.

In [5]:
%pip install numpy tritonclient[http] pillow gevent

Collecting tritonclient[http]
  Using cached tritonclient-2.25.0-py3-none-manylinux1_x86_64.whl (11.4 MB)
Collecting pillow
  Downloading Pillow-9.2.0-cp310-cp310-manylinux_2_28_x86_64.whl (3.2 MB)
Collecting gevent
  Downloading gevent-21.12.0-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.1 MB)
Collecting python-rapidjson>=0.9.1
  Downloading python_rapidjson-1.8-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.6 MB)
Collecting geventhttpclient>=1.4.4
  Downloading geventhttpclient-2.0.2-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_1

In [18]:
#container_registry = "<ACR_NAME>"
container_registry = "tritoncc"

# Create registry
!az acr create --name {container_registry} --resource-group {resource_group} --location westus2 --sku Basic

{
  "adminUserEnabled": false,
  "anonymousPullEnabled": false,
  "creationDate": "2022-09-09T17:15:23.320083+00:00",
  "dataEndpointEnabled": false,
  "dataEndpointHostNames": [],
  "encryption": {
    "keyVaultProperties": null,
    "status": "disabled"
  },
  "id": "/subscriptions/f57ce3c6-5c6f-4f1e-8cba-b782d8974590/resourceGroups/rg-azureml-pg/providers/Microsoft.ContainerRegistry/registries/tritoncc",
  "identity": null,
  "location": "westus2",
  "loginServer": "tritoncc.azurecr.io",
  "name": "tritoncc",
  "networkRuleBypassOptions": "AzureServices",
  "networkRuleSet": null,
  "policies": {
    "exportPolicy": {
      "status": "enabled"
    },
    "quarantinePolicy": {
      "status": "disabled"
    },
    "retentionPolicy": {
      "days": 7,
      "lastUpdatedTime": "2022-09-09T17:15:24.419619+00:00",
      "status": "disabled"
    },
    "trustPolicy": {
      "status": "disabled",
      "type": "Notary"
    }
  },
  "privateEndpointConnections": [],
  "pro

In [19]:
acr_endpoint = f'{container_registry}.azurecr.io/'
tag = "azureml-examples/triton-cc:latest"


# Build Image
!az acr build -r {container_registry} -t { acr_endpoint + tag } --resource-group {resource_group} .

Packing source code into tar to upload...
Uploading archived source code from '/tmp/build_archive_14a20d99c37244d88303508b36a33a0d.tar.gz'...
Sending context (28.953 MiB) to registry: tritoncc...
Queued a build with ID: cc1
Waiting for an agent...
2022/09/09 17:16:22 Downloading source code...
2022/09/09 17:16:23 Finished downloading source code
2022/09/09 17:16:24 Using acb_vol_0e78e7cd-64b6-4cd2-ab87-38133249f71b as the home volume
2022/09/09 17:16:24 Setting up Docker configuration...
2022/09/09 17:16:24 Successfully set up Docker configuration
2022/09/09 17:16:24 Logging in to registry: tritoncc.azurecr.io
2022/09/09 17:16:25 Successfully logged into tritoncc.azurecr.io
2022/09/09 17:16:25 Executing step ID: build. Timeout(sec): 28800, Working directory: '', Network: ''
2022/09/09 17:16:25 Scanning for dependencies...
2022/09/09 17:16:26 Successfully scanned dependencies
2022/09/09 17:16:26 Launching container with name: build
Sending

# 3. Deploy your online endpoint to Azure
Next, deploy your online endpoint to Azure.

## 3.1 Configure online endpoint
`endpoint_name`: The name of the endpoint. It must be unique in the Azure region. Naming rules are defined under [managed online endpoint limits](https://docs.microsoft.com/azure/machine-learning/how-to-manage-quotas#azure-machine-learning-managed-online-endpoints-preview).

`auth_mode` : Use `key` for key-based authentication. Use `aml_token` for Azure Machine Learning token-based authentication. A `key` does not expire, but `aml_token` does expire. 

Optionally, you can add a description and tags to your endpoint.
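Whichever of the two modes is chosen, the client presents the key or token as a bearer credential in the `Authorization` header, as the test cell in section 4 does. The sketch below illustrates that; the helper name `build_auth_headers` and the placeholder credential are assumptions made for this example, and the mode check only covers the two modes described above.

```python
VALID_AUTH_MODES = {"key", "aml_token"}


def build_auth_headers(credential, auth_mode="key"):
    """Build request headers for an endpoint key or an Azure ML token."""
    if auth_mode not in VALID_AUTH_MODES:
        raise ValueError(f"Unsupported auth_mode: {auth_mode!r}")
    if not credential:
        raise ValueError("An endpoint key or token is required")
    # Keys and tokens are both sent as a bearer credential.
    return {"Authorization": f"Bearer {credential}"}


headers = build_auth_headers("<YOUR_ENDPOINT_KEY>", auth_mode="key")
print(headers)
```

Because an `aml_token` expires, a client using that mode would need to fetch a fresh token and rebuild the headers periodically; a `key` can be reused until it is regenerated.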

In [25]:
# Creating a unique endpoint name with current datetime to avoid conflicts
import datetime

online_endpoint_name = "endpoint-" + datetime.datetime.now().strftime("%m%d%H%M%f")

# create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="this is a sample online endpoint",
    auth_mode="key",
    tags={"foo": "bar"},
)
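To see what the timestamp format produces, the following sketch applies the same `strftime` pattern to a fixed time. The helper name `make_endpoint_name` and the fixed timestamp are assumptions chosen for illustration; the timestamp was picked so that the result matches the endpoint name that appears in this notebook's outputs.

```python
import datetime


def make_endpoint_name(now, prefix="endpoint-"):
    """Append month, day, hour, minute, and microseconds to the prefix."""
    return prefix + now.strftime("%m%d%H%M%f")


# 9 September 2022, 18:41:00.598709
fixed_time = datetime.datetime(2022, 9, 9, 18, 41, 0, 598709)
name = make_endpoint_name(fixed_time)
print(name)       # endpoint-09091841598709
print(len(name))  # 23
```

Note that the pattern skips the seconds field, so uniqueness within a minute relies on the microsecond component.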

## 3.2 Create the endpoint
Using the `MLClient` created earlier, we will now create the Endpoint in the workspace. This command will start the endpoint creation and return a confirmation response while the endpoint creation continues.

In [26]:
ml_client.begin_create_or_update(endpoint)

ManagedOnlineEndpoint({'public_network_access': 'Enabled', 'provisioning_state': 'Succeeded', 'scoring_uri': 'https://endpoint-09091841598709.westus2.inference.ml.azure.com/score', 'swagger_uri': 'https://endpoint-09091841598709.westus2.inference.ml.azure.com/swagger.json', 'name': 'endpoint-09091841598709', 'description': 'this is a sample online endpoint', 'tags': {'foo': 'bar'}, 'properties': {'azureml.onlineendpointid': '/subscriptions/f57ce3c6-5c6f-4f1e-8cba-b782d8974590/resourcegroups/rg-azureml-pg/providers/microsoft.machinelearningservices/workspaces/aml-pg-02/onlineendpoints/endpoint-09091841598709', 'AzureAsyncOperationUri': 'https://management.azure.com/subscriptions/f57ce3c6-5c6f-4f1e-8cba-b782d8974590/providers/Microsoft.MachineLearningServices/locations/westus2/mfeOperationsStatus/oe:c2bc7f72-8acf-4a1d-9da7-efb7bf987221:3589541e-037e-4b79-94dd-ba0d77b07b9c?api-version=2022-02-01-preview'}, 'id': '/subscriptions/f57ce3c6-5c6f-4f1e-8cba-b782d8974590/resourceGroups/rg-azurem

## 3.3 Configure online deployment
A deployment is a set of resources required for hosting the model that does the actual inferencing. We will create a deployment for our endpoint using the `ManagedOnlineDeployment` class.

### Key aspects of deployment 
- `name` - Name of the deployment.
- `endpoint_name` - Name of the endpoint to create the deployment under.
- `model` - The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification.
- `environment` - The environment to use for the deployment. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification.
- `code_configuration` - The configuration for the source code and scoring script.
    - `path` - Path to the source code directory for scoring the model.
    - `scoring_script` - Relative path to the scoring file in the source code directory.
- `instance_type` - The VM size to use for the deployment. For the list of supported sizes, see [Managed online endpoints SKU list](https://docs.microsoft.com/en-us/azure/machine-learning/reference-managed-online-endpoints-vm-sku-list).
- `instance_count` - The number of instances to use for the deployment.

In [44]:
# inline model specification pointing at the Triton model repository entry
model = Model(
    name="sample-densenet-onnx-model-2",
    version="1",
    path="./models/model_1",
    type="triton_model",
)

# routes that Azure Machine Learning uses to probe and score the container;
# all three are served on port 8000
inference_config = {
    "liveness_route": {"path": "/v2/health/live", "port": 8000},
    "readiness_route": {"path": "/v2/health/ready", "port": 8000},
    "scoring_route": {"path": "/", "port": 8000},
}

# environment backed by the image built in the container registry above
environment = Environment(
    name="triton-cc-env",
    inference_config=inference_config,
    image=(acr_endpoint + tag),
)

# create a blue deployment
blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=online_endpoint_name,
    environment=environment,
    model=model,
    instance_type="Standard_NC6s_v3",
    instance_count=1,
    model_mount_path="/models",
)
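A typo in `inference_config` only surfaces after a long deployment attempt, so it can help to check the dictionary locally first. The following is a minimal sketch of such a check; the helper `check_inference_config` is hypothetical and only verifies the shape used in this notebook (three routes, each with a `path` and a `port`), not every rule the service may enforce.

```python
REQUIRED_ROUTES = ("liveness_route", "readiness_route", "scoring_route")


def check_inference_config(config):
    """Return a list of problems found in an inference_config dictionary."""
    problems = []
    for route in REQUIRED_ROUTES:
        entry = config.get(route)
        if entry is None:
            problems.append(f"{route}: missing")
            continue
        path = entry.get("path")
        port = entry.get("port")
        if not isinstance(path, str) or not path.startswith("/"):
            problems.append(f"{route}: path must be a string starting with '/'")
        if not isinstance(port, int) or not 1 <= port <= 65535:
            problems.append(f"{route}: port must be an integer from 1 to 65535")
    return problems


sample_config = {
    "liveness_route": {"path": "/v2/health/live", "port": 8000},
    "readiness_route": {"path": "/v2/health/ready", "port": 8000},
    "scoring_route": {"path": "/", "port": 8000},
}
problems = check_inference_config(sample_config)
print(problems)
```

An empty list means the dictionary has the expected shape; any entries describe what to fix before creating the `Environment`.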

### Readiness route vs. liveness route
An HTTP server defines paths for both liveness and readiness. A liveness route is used to check whether the server is running. A readiness route is used to check whether the server is ready to do work. In machine learning inference, a server could respond 200 OK to a liveness request before loading a model. The server could respond 200 OK to a readiness request only after the model has been loaded into memory.

Review the [Kubernetes documentation](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/) for more information about liveness and readiness probes.

Notice that this deployment uses separate paths for liveness (`/v2/health/live`) and readiness (`/v2/health/ready`), since Triton defines both routes.
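The difference between the two probes can be illustrated with a toy model of a server, which is not Triton itself. The class `ToyInferenceServer` is hypothetical, and the 503 status for "not ready" is an assumption made for this sketch; a real server may use a different non-200 code.

```python
class ToyInferenceServer:
    """Stand-in for an inference server that loads its model after startup."""

    def __init__(self):
        self.model_loaded = False

    def load_model(self):
        self.model_loaded = True

    def liveness(self):
        # The process is up, so the liveness probe succeeds immediately.
        return 200

    def readiness(self):
        # The readiness probe succeeds only once the model is in memory.
        return 200 if self.model_loaded else 503


server = ToyInferenceServer()
before = (server.liveness(), server.readiness())
server.load_model()
after = (server.liveness(), server.readiness())
print(before, after)
```

Traffic should only be routed to an instance once its readiness probe succeeds, while a failing liveness probe signals that the container needs to be restarted.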

## 3.4 Create the deployment
Using the `MLClient` created earlier, we will now create the deployment in the workspace. This command will start the deployment creation and return a confirmation response while the deployment creation continues.

In [45]:
ml_client.begin_create_or_update(blue_deployment)

Check: endpoint endpoint-09091841598709 exists
Creating/updating online deployment blue 

..........................................................................

Done (9m 49s)


## 3.5 Set traffic to 100% for deployment

In [46]:
# blue deployment takes 100% of the traffic
endpoint.traffic = {"blue": 100}
ml_client.begin_create_or_update(endpoint)

ManagedOnlineEndpoint({'public_network_access': 'Enabled', 'provisioning_state': 'Succeeded', 'scoring_uri': 'https://endpoint-09091841598709.westus2.inference.ml.azure.com/', 'swagger_uri': 'https://endpoint-09091841598709.westus2.inference.ml.azure.com/swagger.json', 'name': 'endpoint-09091841598709', 'description': 'this is a sample online endpoint', 'tags': {'foo': 'bar'}, 'properties': {'azureml.onlineendpointid': '/subscriptions/f57ce3c6-5c6f-4f1e-8cba-b782d8974590/resourcegroups/rg-azureml-pg/providers/microsoft.machinelearningservices/workspaces/aml-pg-02/onlineendpoints/endpoint-09091841598709', 'AzureAsyncOperationUri': 'https://management.azure.com/subscriptions/f57ce3c6-5c6f-4f1e-8cba-b782d8974590/providers/Microsoft.MachineLearningServices/locations/westus2/mfeOperationsStatus/oe:c2bc7f72-8acf-4a1d-9da7-efb7bf987221:8c0d9c83-3ab5-47b9-a089-933316ea0738?api-version=2022-02-01-preview'}, 'id': '/subscriptions/f57ce3c6-5c6f-4f1e-8cba-b782d8974590/resourceGroups/rg-azureml-pg/
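Before updating the endpoint, the traffic dictionary can be sanity-checked locally. This sketch assumes the goal is to route all traffic, so it requires the percentages to add up to 100; the helper `check_traffic` and the `green` deployment in the usage note are hypothetical.

```python
def check_traffic(traffic, deployment_names):
    """Validate a {deployment_name: percent} mapping before applying it."""
    unknown = sorted(set(traffic) - set(deployment_names))
    if unknown:
        raise ValueError(f"Unknown deployments: {', '.join(unknown)}")
    for name, percent in traffic.items():
        if not 0 <= percent <= 100:
            raise ValueError(f"{name}: percentage must be between 0 and 100")
    total = sum(traffic.values())
    if total != 100:
        raise ValueError(f"Traffic adds up to {total}, expected 100")
    return traffic


traffic = check_traffic({"blue": 100}, ["blue"])
print(traffic)
```

With a second deployment, a split such as `{"blue": 90, "green": 10}` would pass the same check, while `{"blue": 90, "green": 20}` would be rejected.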

# 4. Test the endpoint with sample data
This version of the Triton server requires image pre- and post-processing. The following cells show how to invoke the endpoint with this processing.

In [6]:
#!az login --use-device-code

# Get the details for online endpoint
endpoint = ml_client.online_endpoints.get(name="endpoint-09091841598709")

# existing traffic details
print(endpoint.traffic)

# Get the scoring URI
print(endpoint.scoring_uri)

ClientAuthenticationError: /usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.12) or chardet (3.0.4) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3/dist-packages/azure/cli/__main__.py", line 33, in <module>
    az_cli = get_default_cli()
  File "/usr/lib/python3/dist-packages/azure/cli/core/__init__.py", line 554, in get_default_cli
    return AzCli(cli_name='az',
  File "/usr/lib/python3/dist-packages/azure/cli/core/__init__.py", line 61, in __init__
    register_ids_argument(self)  # global subscription must be registered first!
  File "/usr/lib/python3/dist-packages/azure/cli/core/commands/arm.py", line 182, in register_ids_argument
    from msrestazure.tools import parse_resource_id, is_valid_resource_id
  File "/usr/lib/python3/dist-packages/msrestazure/__init__.py", line 28, in <module>
    from .azure_configuration import AzureConfiguration
  File "/usr/lib/python3/dist-packages/msrestazure/azure_configuration.py", line 34, in <module>
    from msrest import Configuration
  File "/home/amluser/.local/lib/python3.8/site-packages/msrest/__init__.py", line 28, in <module>
    from .configuration import Configuration
  File "/home/amluser/.local/lib/python3.8/site-packages/msrest/configuration.py", line 38, in <module>
    from .universal_http.requests import (
  File "/home/amluser/.local/lib/python3.8/site-packages/msrest/universal_http/__init__.py", line 53, in <module>
    from ..exceptions import ClientRequestError, raise_with_traceback
  File "/home/amluser/.local/lib/python3.8/site-packages/msrest/exceptions.py", line 31, in <module>
    from azure.core.exceptions import SerializationError, DeserializationError
ImportError: cannot import name 'SerializationError' from 'azure.core.exceptions' (/usr/lib/python3/dist-packages/azure/core/exceptions.py)


In [71]:
# Enter the scoring_url and auth token for your AML deployment

scoring_uri = endpoint.scoring_uri
auth_token = "<YOUR_ENDPOINT_KEY>"  # paste your endpoint key here; do not commit a real key
#img_url = "https://aka.ms/peacock-pic" # This is a sample image

In [70]:
# test the blue deployment with some sample data

import requests
import numpy as np
from PIL import Image

from scoring_utils import prepost

import gevent.ssl
import tritonclient.http as tritonhttpclient

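# Remove the scheme from the URL. A separate variable keeps scoring_uri intact,
# so rerunning this cell does not strip characters a second time.
scheme = "https://"
triton_url = scoring_uri[len(scheme):] if scoring_uri.startswith(scheme) else scoring_uri

# Initialize client handler
triton_client = tritonhttpclient.InferenceServerClient(
    url=triton_url,
    ssl=True,
    ssl_context_factory=gevent.ssl._create_default_https_context,
)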

# Create headers
headers = {}
headers["Authorization"] = f"Bearer {auth_token}"

# Check status of triton server
health_ctx = triton_client.is_server_ready(headers=headers)
print("Is server ready - {}".format(health_ctx))

# Check status of model
model_name = "model_1"
status_ctx = triton_client.is_model_ready(model_name, "1", headers)
print("Is model ready - {}".format(status_ctx))

#img_content = requests.get(img_url).content
img_data = prepost.preprocess('peacock-image.png')

# Populate inputs and outputs (names avoid shadowing the built-in input())
infer_input = tritonhttpclient.InferInput("data_0", img_data.shape, "FP32")
infer_input.set_data_from_numpy(img_data)
inputs = [infer_input]
requested_output = tritonhttpclient.InferRequestedOutput("fc6_1")
outputs = [requested_output]

result = triton_client.infer(model_name, inputs, outputs=outputs, headers=headers)
max_label = np.argmax(result.as_numpy("fc6_1"))
label_name = prepost.postprocess(max_label)
print(label_name)

gaierror: [Errno -2] Name or service not known
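The Triton client is given the endpoint address without the scheme. As an alternative sketch to string slicing, the standard library's `urllib.parse.urlsplit` can extract the host regardless of whether the scheme is still present. The helper name `triton_url_from_scoring_uri` is hypothetical, the sample URI is the one shown in the endpoint output earlier, and dropping the trailing slash is a choice made for this example.

```python
from urllib.parse import urlsplit


def triton_url_from_scoring_uri(scoring_uri):
    """Return host (plus any path) without the scheme or a trailing slash."""
    # Add a scheme if it is missing so that urlsplit can find the host.
    candidate = scoring_uri if "://" in scoring_uri else "https://" + scoring_uri
    parts = urlsplit(candidate)
    if not parts.netloc:
        raise ValueError(f"No host found in {scoring_uri!r}")
    return parts.netloc + parts.path.rstrip("/")


uri = "https://endpoint-09091841598709.westus2.inference.ml.azure.com/"
once = triton_url_from_scoring_uri(uri)
twice = triton_url_from_scoring_uri(once)
print(once)
print(once == twice)
```

Because the helper gives the same result when applied to its own output, rerunning a notebook cell that uses it does not corrupt the address.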

# 5. Managing endpoints and deployments

## 5.1 Get the logs for the new deployment
Get the logs for the blue deployment and verify them as needed.

In [53]:
ml_client.online_deployments.get_logs(
    name="blue", endpoint_name=online_endpoint_name, lines=50
)

"Instance status:\nSystemSetup: Succeeded\nUserContainerImagePull: Succeeded\nModelDownload: Succeeded\nUserContainerStart: Succeeded\n\nContainer events:\nKind: Pod, Name: Downloading, Type: Normal, Time: 2022-08-26T17:17:43.099773Z, Message: Start downloading models\nKind: Pod, Name: Pulling, Type: Normal, Time: 2022-08-26T17:17:48.443422Z, Message: Start pulling container image\nKind: Pod, Name: Pulled, Type: Normal, Time: 2022-08-26T17:22:00.421567Z, Message: Container image is pulled successfully\nKind: Pod, Name: Downloaded, Type: Normal, Time: 2022-08-26T17:22:00.421567Z, Message: Models are downloaded successfully\nKind: Pod, Name: Created, Type: Normal, Time: 2022-08-26T17:22:00.457065Z, Message: Created container inference-server\nKind: Pod, Name: Started, Type: Normal, Time: 2022-08-26T17:22:00.628379Z, Message: Started container inference-server\nKind: Pod, Name: ContainerReady, Type: Normal, Time: 2022-08-26T17:22:19.82993125Z, Message: Container is ready\n\nContainer logs

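`get_logs` returns a single string, so a quick way to verify a deployment is to pull the stage statuses out of the `Instance status:` block. The sketch below assumes the log layout seen in the output above, which is not a documented contract and may change; the helper `parse_instance_status` is hypothetical.

```python
def parse_instance_status(logs):
    """Return {stage: status} from the 'Instance status:' block of the log text."""
    statuses = {}
    in_block = False
    for line in logs.splitlines():
        if line.strip() == "Instance status:":
            in_block = True
            continue
        if in_block:
            if not line.strip():
                break  # a blank line ends the status block
            stage, _, status = line.partition(":")
            statuses[stage.strip()] = status.strip()
    return statuses


# Shortened copy of the output shown above
sample_logs = (
    "Instance status:\n"
    "SystemSetup: Succeeded\n"
    "UserContainerImagePull: Succeeded\n"
    "ModelDownload: Succeeded\n"
    "UserContainerStart: Succeeded\n"
    "\n"
    "Container events:\n"
)
statuses = parse_instance_status(sample_logs)
not_succeeded = [stage for stage, status in statuses.items() if status != "Succeeded"]
print(statuses)
print(not_succeeded)
```

An empty `not_succeeded` list indicates that every setup stage reported success; otherwise the listed stages are the first place to look in the full logs.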
# 6. Delete the endpoint

In [None]:
ml_client.online_endpoints.begin_delete(name=online_endpoint_name)