# Unleashing new possibilities: Deploy your LangChain application on a managed online endpoint

In the rapidly evolving AI landscape, swift and scalable deployment of AI applications is essential. Recognizing this, we are excited to unveil a new capability: applications built with LangChain, a leading tool for building LLM apps, can now be deployed quickly to managed online endpoints and benefit from everything those endpoints offer.

In this notebook, you will learn how to quickly deploy your LangChain app to a managed online endpoint for real-time inference. Follow the steps below to deploy your LangChain app. We provide a quickstart image that you can reuse for your own development, along with an example to follow along.

## LangChain Integration into Azure Machine Learning (AzureML)

**Requirements** - In order to benefit from this tutorial, you will need:
* A basic understanding of Machine Learning and Large Language Models
* An Azure account with an active subscription. [Create an account for free](https://azure.microsoft.com/free/?WT.mc_id=A261C142F)
* An Azure Machine Learning workspace, an Azure OpenAI Service endpoint, and a deployed model.
* A working GPT deployment in an Azure AI services resource.

**Motivations** - The LangChain framework allows for rapid development of applications powered by large language models. This sample creates a chatbot application backed by a large language model and deploys the application to AzureML.

**Outline** - 
1. Connect to Azure Machine Learning
2. Create an **AzureML Managed Online Endpoint**
3. Deploy the app to the endpoint
4. Test
5. Clean up resources


# 1. Connect to Azure Machine Learning

In [7]:
# required for Azure OpenAI API
AZURE_OPENAI_ENDPOINT = "<AOAI endpoint>"
AZURE_OPENAI_DEPLOYMENT = "<deployment-name>"
OPENAI_API_VERSION = "2023-03-15-preview"

# export as environment variables for the LangChain code to consume
%env OPENAI_API_VERSION=$OPENAI_API_VERSION
# the Python variable defined above is AZURE_OPENAI_ENDPOINT
%env AZURE_OPENAI_API_ENDPOINT=$AZURE_OPENAI_ENDPOINT
%env AZURE_OPENAI_DEPLOYMENT=$AZURE_OPENAI_DEPLOYMENT

env: OPENAI_API_VERSION=2023-03-15-preview
env: AZURE_OPENAI_API_ENDPOINT=https://aoai-ibisckmf5skye.openai.azure.com
env: AZURE_OPENAI_DEPLOYMENT=gpt-35-turbo
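
Before moving on, it can help to confirm that none of the settings above still holds a placeholder. The following is a minimal sketch, assuming placeholders are written in angle brackets as they are in this notebook. `find_unset` is a hypothetical helper and not part of any Azure SDK.

```python
def find_unset(settings):
    """Return the names whose values are empty or still look like <placeholders>."""
    unset = []
    for name, value in settings.items():
        text = (value or "").strip()
        if not text or (text.startswith("<") and text.endswith(">")):
            unset.append(name)
    return unset


# Example values; in the notebook you would pass the variables defined above.
example = {
    "AZURE_OPENAI_ENDPOINT": "<AOAI endpoint>",
    "AZURE_OPENAI_DEPLOYMENT": "gpt-35-turbo",
    "OPENAI_API_VERSION": "2023-03-15-preview",
}
print(find_unset(example))
```

An empty list means every setting has been filled in.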


### 1.1 Install required packages

In [None]:
%pip install -r requirements.txt

### 1.2 Set workspace details

In [21]:
# enter details of your AML workspace
SUBSCRIPTION_ID = "<SUBSCRIPTION_ID>"
RESOURCE_GROUP = "<RESOURCE_GROUP>"
AML_WORKSPACE_NAME = "<AML_WORKSPACE_NAME>"
AZURE_AI_SERVICES_NAME = "<AZURE_AI_SERVICES_NAME>"

### 1.3 Login to your Azure account

In [3]:
# Authenticate clients
from azure.identity import (
    DefaultAzureCredential,
    InteractiveBrowserCredential,
)

try:
    credential = DefaultAzureCredential(additionally_allowed_tenants=["*"])
    # Request a token up front so that a broken credential chain fails here
    credential.get_token("https://management.azure.com/.default")
except Exception:
    # Fall back to InteractiveBrowserCredential if DefaultAzureCredential does not work
    credential = InteractiveBrowserCredential(additionally_allowed_tenants=["*"])

# If the login above doesn't work, uncomment the line below and log in using a device code
# !az login --use-device-code

# 2. Managed Online Endpoint

In [None]:
# create an endpoint
import datetime

from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
)

from azure.ai.ml import (
    MLClient,
)

time = str(datetime.datetime.now().strftime("%m%d%H%M%f"))
online_endpoint_name = f"aml-llm-lc-demo-{time}"

# get a handle to the workspace
ml_client = MLClient(credential, SUBSCRIPTION_ID, RESOURCE_GROUP, AML_WORKSPACE_NAME)

# create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="online endpoint for LangChain server",
    auth_mode="key",
)

endpoint = ml_client.begin_create_or_update(endpoint).result()

print(endpoint)
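
The endpoint name above is made unique by appending a timestamp. As a rough sketch, the generated name can be checked before it is sent to Azure. The naming rule encoded here (3 to 32 characters, starting with a letter, then letters, digits, or hyphens) is an assumption that should be verified against the current Azure Machine Learning documentation. `make_endpoint_name` and `is_valid_endpoint_name` are hypothetical helpers.

```python
import datetime
import re

# Assumed naming rule (verify against the current Azure ML documentation):
# 3-32 characters, starts with a letter, then letters, digits, or hyphens.
ENDPOINT_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9-]{2,31}")


def make_endpoint_name(prefix, now=None):
    """Append a month/day/hour/minute/microsecond suffix to the prefix."""
    now = now or datetime.datetime.now()
    return f"{prefix}-{now.strftime('%m%d%H%M%f')}"


def is_valid_endpoint_name(name):
    """Return True if the whole name matches the assumed pattern."""
    return ENDPOINT_NAME_PATTERN.fullmatch(name) is not None


name = make_endpoint_name("aml-llm-lc-demo")
print(name, len(name), is_valid_endpoint_name(name))
```

With the prefix used in this notebook, the generated name is 30 characters long, so a longer prefix could push it past the assumed limit.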

In [None]:
# assign the Cognitive Services User role to the endpoint
endpoint_principal_id = endpoint.identity.principal_id
!az role assignment create --assignee-principal-type ServicePrincipal --assignee-object-id {endpoint_principal_id} --role "Cognitive Services User" --scope /subscriptions/{SUBSCRIPTION_ID}/resourceGroups/{RESOURCE_GROUP}/providers/Microsoft.CognitiveServices/accounts/{AZURE_AI_SERVICES_NAME}
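
The `--scope` argument in the command above is the resource ID of the Azure AI services account. A small sketch that builds it separately can make the command easier to read and check. `cognitive_services_scope` is a hypothetical helper, and the values passed to it are the placeholders from this notebook.

```python
def cognitive_services_scope(subscription_id, resource_group, account_name):
    """Build the resource ID of an Azure AI services (Cognitive Services) account."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.CognitiveServices/accounts/{account_name}"
    )


# Placeholder values; in the notebook you would pass the variables defined earlier.
scope = cognitive_services_scope(
    "<SUBSCRIPTION_ID>", "<RESOURCE_GROUP>", "<AZURE_AI_SERVICES_NAME>"
)
print(scope)
```

The resulting string can then be passed to the CLI as `--scope {scope}`.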

# 3. Deploy to Endpoint

In [None]:
from azure.ai.ml.entities import (
    ManagedOnlineDeployment,
    OnlineRequestSettings,
    Model,
)

environment_uri = (
    "azureml://registries/azureml/environments/minimal-app-quickstart/labels/latest"
)

deployment_name = f"deploy-{time}-4"
lc_deployment = ManagedOnlineDeployment(
    name=deployment_name,
    environment=environment_uri,
    model=Model(path="../src/langchain"),
    request_settings=OnlineRequestSettings(request_timeout_ms=60000),
    environment_variables={
        "OPENAI_API_VERSION": OPENAI_API_VERSION,
        "AZURE_OPENAI_ENDPOINT": AZURE_OPENAI_ENDPOINT,
        "AZURE_OPENAI_DEPLOYMENT": AZURE_OPENAI_DEPLOYMENT,
    },
    endpoint_name=online_endpoint_name,
    instance_type="Standard_F2s_v2",
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(lc_deployment).result()

endpoint.traffic = {deployment_name: 100}
ml_client.begin_create_or_update(endpoint).result()
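
The `endpoint.traffic` mapping above routes all requests to the single deployment. If you later split traffic across several deployments, a quick check like the sketch below may catch mistakes before the update call. It assumes the shares must add up to 100 (or 0 to route no traffic). `validate_traffic` and the deployment names are hypothetical.

```python
def validate_traffic(traffic):
    """Check a {deployment_name: percent} mapping before assigning it to an endpoint."""
    for name, percent in traffic.items():
        if not isinstance(percent, int) or not 0 <= percent <= 100:
            raise ValueError(f"{name}: percent must be an integer from 0 to 100")
    total = sum(traffic.values())
    # Assumption: the shares must add up to 100 (or 0 to route no traffic).
    if total not in (0, 100):
        raise ValueError(f"traffic sums to {total}, expected 0 or 100")
    return traffic


# Hypothetical deployment names for a gradual rollout.
print(validate_traffic({"deploy-blue": 90, "deploy-green": 10}))
```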

# 4. Test
Now that the endpoint has been deployed, let's test it.

In [None]:
# You can customize your inference experience by following the LangServe instructions. The code below is just a simple example.
"""

uncomment the code below to test the endpoint

from langserve import RemoteRunnable

token = ml_client.online_endpoints.get_keys(name=endpoint.name).primary_key
url = endpoint.scoring_uri
url = url.replace("/score", "")
runnable_az = RemoteRunnable(
    f"{url}/openai-functions-agent", headers={"Authorization": "Bearer " + token}
)
async for msg in runnable_az.astream({"chat_history": [], "input": "Hello?"}):
    print(msg, end="", flush=True)
"""

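If you prefer not to install `langserve` on the client, the same route can be called over plain HTTP. The sketch below only builds the request and does not send it. It assumes the app exposes the standard LangServe `invoke` route under the `openai-functions-agent` path used above. `build_invoke_request` is a hypothetical helper, and the URL and key are placeholders.

```python
import json
import urllib.request


def build_invoke_request(scoring_uri, key, route, payload):
    """Build (but do not send) a POST request for a LangServe invoke route."""
    # Strip the trailing "/score" that the managed endpoint adds to its scoring URI.
    if scoring_uri.endswith("/score"):
        base = scoring_uri[: -len("/score")]
    else:
        base = scoring_uri
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base}/{route}/invoke",
        data=body,
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
request = build_invoke_request(
    "https://example.invalid/score",
    "<endpoint-key>",
    "openai-functions-agent",
    {"chat_history": [], "input": "Hello?"},
)
print(request.full_url)
```

To send it, pass the request to `urllib.request.urlopen(request)` with the real `endpoint.scoring_uri` and the key returned by `ml_client.online_endpoints.get_keys`.
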
# 5. Clean up resources

In [None]:
ml_client.online_endpoints.begin_delete(name=online_endpoint_name)