# Langchain Integration into Azure Machine Learning (AzureML)

**Requirements** - In order to benefit from this tutorial, you will need:
* A basic understanding of Machine Learning and Large Language Models
* An Azure account with an active subscription. [Create an account for free](https://azure.microsoft.com/free/?WT.mc_id=A261C142F)
* An Azure Machine Learning Workspace, Azure Key Vault, and Azure Container Registry
* An OpenAI API key, which can be found under User Settings in OpenAI

**Motivations** - The Langchain framework allows for rapid development of applications powered by large language models. This sample creates a chat bot application backed by a large language model and deploys the application to AzureML.

**Outline** - 
1. Test the langchain app locally using the [**azureml-inference-server-http**](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-inference-server-http?view=azureml-api-2) package.
2. Deploy the app to an **AzureML Managed Online Endpoint**


# 1. Test Locally
Before deploying, you can test the langchain app locally. We use the [Langchain ChatGPT plugin](https://python.langchain.com/en/latest/modules/agents/tools/examples/chatgpt_plugins.html) as the example app here. Execute the code below to try it out. You can inspect [simple_agent_app_test.py](../src/langchain/simple_agent_app_test.py) to see the implementation: it is a langchain ZERO_SHOT_REACT_DESCRIPTION agent with the Klarna plugin.

In [None]:
import os

OPENAI_API_TYPE = "openai"  # 'azure' or 'openai'
OPENAI_API_KEY = "<OPENAI_API_KEY>"

# required for OpenAI API
OPENAI_ORG_ID = ""
OPENAI_MODEL_ID = "gpt-3.5-turbo"

# required for Azure OpenAI API
AZURE_OPENAI_API_ENDPOINT = "<AOAI endpoint>"
AZURE_OPENAI_API_DEPLOYMENT_NAME = "<deployment-name>"

# set to env var for the langchain code to consume
%env OPENAI_API_KEY=$OPENAI_API_KEY
%env OPENAI_API_TYPE=$OPENAI_API_TYPE
%env OPENAI_MODEL_ID=$OPENAI_MODEL_ID
%env OPENAI_ORG_ID=$OPENAI_ORG_ID
%env AZURE_OPENAI_API_ENDPOINT=$AZURE_OPENAI_API_ENDPOINT
%env AZURE_OPENAI_API_DEPLOYMENT_NAME=$AZURE_OPENAI_API_DEPLOYMENT_NAME

In [None]:
%pip install -r requirements.txt

In [None]:
# uncomment to test the app locally
# %run -i ../src/langchain/simple_agent_app_test.py

The test file above runs the langchain agent directly. To deploy to an Online Endpoint, let's turn it into a scoring script. You can view the code here: [simple_agent_score.py](../src/langchain/simple_agent_score.py). We essentially create an **init()** and a **run()** function and put the langchain code inside.

You can read more about the scoring script here: [How to deploy online endpoint - Understand the scoring script](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-online-endpoints?view=azureml-api-2&tabs=azure-cli#understand-the-scoring-script).

## 1.1 (Optional) Test locally using AzureML local server
Before deploying to an Online Endpoint, you can test the scoring script locally first. Azure Machine Learning offers [**azureml-inference-server-http**](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-inference-server-http?view=azureml-api-2) for this purpose. We will:
* Start a subprocess to start the server locally
* Test the server by sending JSON requests
* Kill the server

Start the server in a subprocess

In [None]:
"""
import subprocess

p = subprocess.Popen(
    "cd ../src/langchain/; azmlinfsrv --entry_script simple_agent_score.py",
    shell=True,
    stdout=subprocess.PIPE,
)
"""

You can stream the logs to see what's going on by executing this cell

In [None]:
"""
import time

while p.poll() is None:
    # Read one line from the output pipe
    output = p.stdout.readline()
    error = None
    if p.stderr:
        error = p.stderr.readline()
    if output:
        print(output.decode("utf-8").strip())
    if error:
        print(error.decode("utf-8").strip())
    time.sleep(0.1)
"""

Stop the log cell, then call the local server to test it
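
If you would rather not watch the logs to know when the server is up, you can poll the port before sending a request. This is a small sketch using only the standard library; `wait_for_port` is a hypothetical helper name, and port 5001 is taken from the request URL used in this notebook.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=30.0, interval_s=0.5):
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Not listening yet (or refused); wait a little and retry.
            time.sleep(interval_s)
    return False


# Example usage (not executed here):
# if not wait_for_port("localhost", 5001):
#     print("Server did not start in time; check the logs.")
```

A successful connection only shows that the port is open. It does not guarantee that the scoring script finished its `init()`.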

In [None]:
"""
import requests, json, time
from urllib.parse import urlsplit

url = "http://localhost:5001/score"
headers = {"Content-Type": "application/json"}

payload = json.dumps(
    {"question": "what are the top 5 results for womens t shirts on klarna?"}
)

response = requests.post(url, headers=headers, data=payload)
print("Result:\n", response.text)
"""

You should see a response similar to this:
```
    The top 5 results for women's t-shirts on Klarna are: 
    1. Armani Exchange Slim Fit Logo Cotton T-shirt - Black
    2. Balmain White Printed T-Shirt
    3. Versace Jeans Logo T-shirt - Black
    4. Icebreaker Women's Tech Lite II Merino Short Sleeve T-shirt - Grey
    5. Off-White Logo T-Shirt White
```

We are done testing, so let's kill the process

In [None]:
# pkill = subprocess.Popen("kill $(lsof -t -i :5001)", shell=True, stdout=subprocess.PIPE)

Now that the scoring script works, let's deploy it to an online endpoint.

# 2. Deploy Online Endpoint
At a high level, we will perform the following tasks:

* **Preparation**
    * Store the OpenAI key in Azure Key Vault to keep it safe.
* **Create Managed Online Endpoint**, and give the endpoint permission to access the Azure Key Vault above.
* **Deploy to Managed Online Endpoint**
* **Test**
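
For the Key Vault step in the list above, the deployed code needs some way to read the key at startup. The following is a hedged sketch, not the sample's actual implementation: `resolve_openai_key` and `fetch_from_key_vault` are hypothetical helper names, and it assumes the `azure-identity` and `azure-keyvault-secrets` packages are installed wherever the Key Vault path is used.

```python
from typing import Callable, Mapping


def fetch_from_key_vault(vault_url: str, secret_name: str) -> str:
    """Read one secret from Azure Key Vault using the ambient Azure identity."""
    # Imported lazily so this module still loads where the Azure SDK is absent.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    client = SecretClient(vault_url=vault_url, credential=DefaultAzureCredential())
    return client.get_secret(secret_name).value


def resolve_openai_key(
    env: Mapping[str, str],
    vault_url: str,
    secret_name: str,
    fetch: Callable[[str, str], str] = fetch_from_key_vault,
) -> str:
    """Prefer OPENAI_API_KEY from the environment; otherwise ask Key Vault."""
    key = env.get("OPENAI_API_KEY")
    if key:
        return key
    return fetch(vault_url, secret_name)


# Example usage (not executed here; the vault name is a placeholder):
# import os
# key = resolve_openai_key(os.environ, "https://<KEY_VAULT_NAME>.vault.azure.net", "OPENAI-API-KEY")
```

Passing the fetch function as a parameter keeps the lookup easy to exercise locally without touching a real vault.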

## 2.1 Preparation

In [None]:
# enter details of your AML workspace
SUBSCRIPTION_ID = "<SUBSCRIPTION_ID>"
RESOURCE_GROUP = "<RESOURCE_GROUP>"
AML_WORKSPACE_NAME = "<AML_WORKSPACE_NAME>"

# Keyvault information
KEYVAULT_NAME = "<KEY_VAULT_NAME>"
# Secret name in Key Vault for the OpenAI API key
KV_OPENAI_KEY = "OPENAI-API-KEY"

Log in to your account

In [None]:
# Authenticate clients
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

try:
    credential = DefaultAzureCredential(additionally_allowed_tenants=["*"])
    # Constructing the credential does not authenticate; request a token to verify it works
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    # Fall back to InteractiveBrowserCredential in case DefaultAzureCredential does not work
    credential = InteractiveBrowserCredential(additionally_allowed_tenants=["*"])

# If login doesn't work above, uncomment the code below and login using device code
# !az login --use-device-code

## 2.2 (Optional) Store Secrets in Azure Key Vault

In [None]:
"""
MY_OBJECT_ID = !az ad signed-in-user show --query id -o tsv
KEYVAULT_RESOURCE_URI = f"/subscriptions/{SUBSCRIPTION_ID}/resourcegroups/{RESOURCE_GROUP}/providers/Microsoft.KeyVault/vaults/{KEYVAULT_NAME}"

need_interactive_auth = False
if "AADSTS530003".lower() in MY_OBJECT_ID[0].lower():
    need_interactive_auth = True
    print("\n".join(MY_OBJECT_ID))
    print(
        "\nYou are probably getting this error because you are using a device login, and this operation needs an interactive login. If you can't log in interactively, you can copy and run the following commands in Azure Cloud Shell in Bash mode.\n"
    )
    print("MY_OBJECT_ID=`az ad signed-in-user show --query id -o tsv`")
    print(
        f"az role assignment create --role 'Key Vault Administrator' --scope {KEYVAULT_RESOURCE_URI}  --assignee-object-id $MY_OBJECT_ID --assignee-principal-type User"
    )
    print(
        f"az keyvault secret set --name {KV_OPENAI_KEY} --vault-name {KEYVAULT_NAME} --value {OPENAI_API_KEY}"
    )
else:
    # Let's set OpenAI key as a secret in keyvault
    need_interactive_auth = False
    !az role assignment create --role "Key Vault Administrator" --scope {KEYVAULT_RESOURCE_URI}  --assignee-object-id {MY_OBJECT_ID[0]} --assignee-principal-type User
    !az keyvault secret set --name {KV_OPENAI_KEY} --vault-name {KEYVAULT_NAME} --value {OPENAI_API_KEY}
"""

# 3. Manage Online Endpoint
## 3.1 Create Endpoint

In [None]:
# create an endpoint
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
)

from azure.ai.ml import (
    MLClient,
)

online_endpoint_name = "aml-llm-demo-langchain-endpoint"

# get a handle to the workspace
ml_client = MLClient(credential, SUBSCRIPTION_ID, RESOURCE_GROUP, AML_WORKSPACE_NAME)

# create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="online endpoint for Langchain server",
    auth_mode="key",
)

endpoint = ml_client.begin_create_or_update(endpoint).result()

print(endpoint)

## 3.2 (Optional) Grant Endpoint Permission to Dependencies
If you use Key Vault to store your OpenAI API key, uncomment the code below. The endpoint will use AAD to access dependent resources, so you don't have to hardcode secrets.

In [None]:
"""
# Allow the endpoint to access secrets in keyvault
KEYVAULT_RESOURCE_URI = f"/subscriptions/{SUBSCRIPTION_ID}/resourcegroups/{RESOURCE_GROUP}/providers/Microsoft.KeyVault/vaults/{KEYVAULT_NAME}"
need_interactive_auth = False
if need_interactive_auth:
    print(
        "If you can't log in interactively, you can run the following command in Azure Cloud Shell in Bash mode."
    )
    print(
        f"az role assignment create --role 'Key Vault Secrets User' --scope {KEYVAULT_RESOURCE_URI}  --assignee {endpoint.identity.principal_id}"
    )
else:
    !az role assignment create --role "Key Vault Secrets User" --scope {KEYVAULT_RESOURCE_URI}  --assignee {endpoint.identity.principal_id}
"""

## 3.3 Deploy to Endpoint

In [None]:
import datetime

from azure.ai.ml.entities import (
    ManagedOnlineDeployment,
    OnlineRequestSettings,
    Environment,
    CodeConfiguration,
)

KEYVAULT_URL = f"https://{KEYVAULT_NAME}.vault.azure.net"

env = Environment(
    conda_file="deployments/env.yml",
    image="mcr.microsoft.com/azureml/minimal-ubuntu20.04-py38-cpu-inference:latest",
)

deployment_name = f"deploy-{str(datetime.datetime.now().strftime('%m%d%H%M%f'))}"
sk_deployment = ManagedOnlineDeployment(
    name=deployment_name,
    environment=env,
    code_configuration=CodeConfiguration(
        code="../src", scoring_script="langchain/simple_agent_score.py"
    ),
    request_settings=OnlineRequestSettings(request_timeout_ms=60000),
    environment_variables={
        "OPENAI_API_KEY": OPENAI_API_KEY,
        "OPENAI_API_TYPE": OPENAI_API_TYPE,
        "OPENAI_MODEL_ID": OPENAI_MODEL_ID,
        "OPENAI_ORG_ID": OPENAI_ORG_ID,
        "AZURE_OPENAI_API_ENDPOINT": AZURE_OPENAI_API_ENDPOINT,
        "AZURE_OPENAI_API_DEPLOYMENT_NAME": AZURE_OPENAI_API_DEPLOYMENT_NAME,
    },
    endpoint_name=online_endpoint_name,
    instance_type="Standard_F2s_v2",
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(sk_deployment).result()

endpoint.traffic = {deployment_name: 100}
ml_client.begin_create_or_update(endpoint).result()

# 4. Test
Now that the endpoint has been deployed, let's test it. We will reuse the same request we sent when testing locally earlier.

In [None]:
import requests, json, time
from urllib.parse import urlsplit

url_parts = urlsplit(endpoint.scoring_uri)
url = url_parts.scheme + "://" + url_parts.netloc

token = ml_client.online_endpoints.get_keys(name=online_endpoint_name).primary_key
headers = {"Authorization": "Bearer " + token, "Content-Type": "application/json"}
payload = json.dumps(
    {"question": "what are the top 5 results for womens t shirts on klarna?"}
)

response = requests.post(f"{url}/score", headers=headers, data=payload)
print("Response:\n", response.text)
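
The URL and header handling in the cell above can be factored into a small helper, which also makes it easy to check without a live endpoint. This is an illustrative sketch: `build_score_request` is a hypothetical name, and the hostname in the usage comment is made up.

```python
from urllib.parse import urlsplit


def build_score_request(scoring_uri, key):
    """Return the /score URL and the auth headers for a key-authenticated endpoint."""
    parts = urlsplit(scoring_uri)
    # Keep only scheme and host, then append the scoring route explicitly.
    url = f"{parts.scheme}://{parts.netloc}/score"
    headers = {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
    return url, headers


# Example usage (not executed here; hostname is hypothetical):
# url, headers = build_score_request("https://my-endpoint.westus2.inference.ml.azure.com/score", token)
# response = requests.post(url, headers=headers, data=payload)
# response.raise_for_status()  # surface HTTP errors instead of printing an error body
```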

# 5. Delete the deployment and endpoint

In [None]:
ml_client.online_endpoints.begin_delete(name=online_endpoint_name)