# Deploy a flow to a managed online endpoint

In this notebook, you'll learn how to deploy a flow to a managed online endpoint using the AzureML SDK v2.
The [sample basic chat flow](../../../../../cli/generative-ai/promptflow/basic-chat) serves as the example flow and is deployed to an AzureML managed online endpoint.

## Prerequisites

- Get familiar with [prompt flow](https://learn.microsoft.com/azure/machine-learning/prompt-flow/get-started-prompt-flow?view=azureml-api-2).
- Make sure you have created the connection used in the flow in your AzureML workspace.

## 1. Set up

### 1.1 Import the required libraries

In [None]:
# import required libraries
from azure.ai.ml import MLClient
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    Model,
    Environment,
    CodeConfiguration,
)
from azure.identity import DefaultAzureCredential

### 1.2 Configure workspace details and get a handle to the workspace

In [None]:
# enter details of your Azure Machine Learning workspace
subscription_id = "<SUBSCRIPTION_ID>"
resource_group = "<RESOURCE_GROUP>"
workspace = "<AZUREML_WORKSPACE_NAME>"

In [None]:
# get a handle to the workspace
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id=subscription_id,
    resource_group_name=resource_group,
    workspace_name=workspace,
)

## 2. Create Online Endpoint

In [None]:
# Define an endpoint name
endpoint_name = "basic-chat-endpoint-sdk"

# create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=endpoint_name, description="this is a sample endpoint", auth_mode="key"
)
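
Endpoint names generally have to be unique within an Azure region, so a fixed name like the one above can collide with an existing endpoint. The sketch below shows one way to derive a less collision-prone name. `unique_endpoint_name` is a hypothetical helper, not part of the SDK, and the naming rule it checks (starts with a letter, then letters, digits, or hyphens, 3 to 32 characters) is an assumption to verify against the current Azure documentation.

```python
import re
import uuid


def unique_endpoint_name(prefix: str = "basic-chat", suffix_len: int = 8) -> str:
    """Hypothetical helper: append a short random suffix to the prefix."""
    suffix = uuid.uuid4().hex[:suffix_len]
    name = f"{prefix}-{suffix}"
    # Assumed naming rule: a letter first, then letters, digits, or hyphens,
    # with a total length between 3 and 32 characters.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9-]{2,31}", name):
        raise ValueError(f"'{name}' does not match the assumed naming rule")
    return name


candidate_name = unique_endpoint_name()
print(candidate_name)
```

If you adopt a generated name, assign it to `endpoint_name` before constructing `ManagedOnlineEndpoint` so that the later cells pick it up.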

In [None]:
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

## 3. Create Deployment

In [None]:
model = Model(
    path="../../../../../cli/generative-ai/promptflow/basic-chat",
    properties={
        # the Test tab on the endpoint detail page needs this property
        # to recognize that the model comes from prompt flow
        "azureml.promptflow.source_flow_id": "basic-chat",
        # the following properties apply only to chat flows
        "azureml.promptflow.mode": "chat",
        "azureml.promptflow.chat_input": "question",
        "azureml.promptflow.chat_output": "answer",
    },
)
env = Environment(
    image="mcr.microsoft.com/azureml/promptflow/promptflow-runtime:latest",
    inference_config={
        "liveness_route": {"path": "/health", "port": 8080},
        "readiness_route": {"path": "/health", "port": 8080},
        "scoring_route": {"path": "/score", "port": 8080},
    },
)

blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=endpoint_name,
    model=model,
    environment=env,
    instance_type="Standard_E16s_v3",
    instance_count=1,
    environment_variables={
        "PROMPTFLOW_RUN_MODE": "serving",
        # currently used to pull connections from the workspace;
        # built from the variables defined earlier in this notebook
        "PRT_CONFIG_OVERRIDE": (
            f"deployment.subscription_id={subscription_id},"
            f"deployment.resource_group={resource_group},"
            f"deployment.workspace_name={workspace},"
            f"deployment.endpoint_name={endpoint_name},"
            "deployment.deployment_name=blue"
        ),
    },
)

In [None]:
ml_client.online_deployments.begin_create_or_update(blue_deployment).result()

In [None]:
# route 100% of the endpoint traffic to the blue deployment
endpoint.traffic = {"blue": 100}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()
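
Once an endpoint has more than one deployment, the traffic dictionary is easy to get wrong. The sketch below checks a split before it is assigned to `endpoint.traffic`. `validate_traffic` is a hypothetical helper, and the rule it enforces (percentages sum to either 0 or 100) is an assumption based on how the service is commonly documented, so confirm it for your API version.

```python
from typing import Dict


def validate_traffic(traffic: Dict[str, int]) -> Dict[str, int]:
    """Hypothetical check: each share is 0-100 and the total is 0 or 100."""
    for deployment_name, percent in traffic.items():
        if not 0 <= percent <= 100:
            raise ValueError(f"{deployment_name}: {percent} is outside 0-100")
    total = sum(traffic.values())
    if total not in (0, 100):
        raise ValueError(f"traffic sums to {total}, expected 0 or 100")
    return traffic


# Same split as the cell above: everything goes to "blue".
checked_traffic = validate_traffic({"blue": 100})
print(checked_traffic)
```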

## 4. Test the endpoint with sample data
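
The request file holds the flow inputs as JSON. As a minimal sketch, the snippet below writes such a file, assuming the flow takes the `question` input declared as `chat_input` in section 3 plus a `chat_history` list. The `chat_history` field, the example question, and the file name `sample-request-sketch.json` are illustrative assumptions; the file name is deliberately different so that an existing `sample-request.json` is not overwritten.

```python
import json
from pathlib import Path

# Assumed request shape for a chat flow: the current question plus prior turns.
sample_request = {
    "question": "What is Azure Machine Learning?",  # illustrative input
    "chat_history": [],  # assumed input name; empty for a first turn
}

# Hypothetical file name, chosen to avoid clobbering the repository's sample file.
request_path = Path("sample-request-sketch.json")
request_path.write_text(json.dumps(sample_request, indent=2), encoding="utf-8")
print(request_path.read_text(encoding="utf-8"))
```

To try it, pass the path of this file as `request_file` when invoking the endpoint.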

In [None]:
# test the blue deployment with some sample data
ml_client.online_endpoints.invoke(
    endpoint_name=endpoint_name,
    deployment_name="blue",
    request_file="./sample-request.json",
)
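
The call above returns the raw response body. A minimal sketch for pulling out the chat reply follows, assuming the body is a JSON string whose top-level key matches the `chat_output` name (`answer`) configured in section 3. `extract_answer` is a hypothetical helper, and `example_response` is a made-up payload used only so the snippet runs without a live endpoint.

```python
import json


def extract_answer(raw_response: str, output_key: str = "answer") -> str:
    """Hypothetical helper: parse the response body and return one flow output."""
    payload = json.loads(raw_response)
    if output_key not in payload:
        raise KeyError(f"'{output_key}' not found; got keys {sorted(payload)}")
    return payload[output_key]


# Made-up response body for illustration; a real one comes from invoke(...).
example_response = '{"answer": "Hello! How can I help you today?"}'
print(extract_answer(example_response))
```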