# Semantic Kernel (SK) Integration into Azure Machine Learning (AzureML)

**Requirements** - In order to benefit from this tutorial, you will need:
* A basic understanding of Machine Learning and Large Language Models
* An Azure account with an active subscription. [Create an account for free](https://azure.microsoft.com/free/?WT.mc_id=A261C142F)
* An Azure Machine Learning Workspace, Azure Key Vault, and Azure Container Registry
* An OpenAI API key, which can be found under User Settings in OpenAI

**Motivations** - Semantic Kernel takes a slightly different approach to LLM agents. It offers a Plan->Execute pattern: the LLM first forms a plan, which a human can then confirm and execute. In this notebook, we use the [planner](https://github.com/microsoft/semantic-kernel/blob/main/samples/notebooks/python/05-using-the-planner.ipynb) example from Semantic Kernel as a base, with the following modifications:
* Created a **Python SemanticKernelHttp server** based on Flask.
* Deployed SemanticKernelHttp to an **AzureML Managed Online Endpoint**.

Managed online endpoints provide an easy-to-manage inferencing server for your ML workload, which makes them a good fit for LLM-based applications. Because we need a REST service, we won't use the default endpoint Docker image; we will create a custom Docker image instead.

**Outline**
1. Test SemanticKernelHttpServer locally
2. Prepare Dependencies
3. Deploy to Managed Online Endpoint
4. Test

# 1. Test SemanticKernelHttpServer Locally
Before we proceed, let's start up the server and test it locally. Grab your OpenAI information and fill in the variables below.

In [None]:
import os
OPENAI_API_TYPE='azure' # 'azure' or 'openai'
# read the key from the environment, or prompt for it if it is not set
OPENAI_API_KEY=os.environ.get('OPENAI_API_KEY') or input('OPENAI_API_KEY: ')

# required for OpenAI API
OPENAI_ORG_ID=''
OPENAI_MODEL_ID='gpt-3.5-turbo'

# required for Azure OpenAI API
AZURE_OPENAI_API_ENDPOINT='https://<azure-openai-endpoint>.openai.azure.com/'
AZURE_OPENAI_API_DEPLOYMENT_NAME='<deployment-name>'

# set to true for chat completion API, false for text completion
IS_CHAT_COMPLETION=True

# setting up env variables for local server
%env OPENAI_API_TYPE=$OPENAI_API_TYPE
%env OPENAI_API_KEY=$OPENAI_API_KEY
%env OPENAI_MODEL_ID=$OPENAI_MODEL_ID
%env OPENAI_ORG_ID=$OPENAI_ORG_ID
%env AZURE_OPENAI_API_ENDPOINT=$AZURE_OPENAI_API_ENDPOINT
%env AZURE_OPENAI_API_DEPLOYMENT_NAME=$AZURE_OPENAI_API_DEPLOYMENT_NAME
%env IS_CHAT_COMPLETION=$IS_CHAT_COMPLETION

# Install python dependencies
%pip install -r ../src/sk/requirements.txt --user
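
The cell above marks some variables as required only for one API type. As a minimal sketch of a pre-flight check, the helper below reports which settings are still empty or still contain a `<placeholder>`. The names `REQUIRED_SETTINGS` and `missing_settings` are hypothetical, and the grouping of required variables is an assumption read from the comments above (the organization ID is left out because the cell sets it to an empty string).

```python
# Hypothetical helper, not part of the tutorial's source code.
# Assumption: which variables each API type needs, based on the comments in the cell above.
REQUIRED_SETTINGS = {
    "openai": ["OPENAI_API_KEY", "OPENAI_MODEL_ID"],
    "azure": [
        "OPENAI_API_KEY",
        "AZURE_OPENAI_API_ENDPOINT",
        "AZURE_OPENAI_API_DEPLOYMENT_NAME",
    ],
}


def missing_settings(env):
    """Return the required setting names that are empty or still hold a <placeholder>."""
    api_type = env.get("OPENAI_API_TYPE", "")
    if api_type not in REQUIRED_SETTINGS:
        raise ValueError(f"OPENAI_API_TYPE must be 'azure' or 'openai', got {api_type!r}")
    missing = []
    for name in REQUIRED_SETTINGS[api_type]:
        value = env.get(name, "")
        # Treat unfilled template values such as '<deployment-name>' as missing.
        if not value or ("<" in value and ">" in value):
            missing.append(name)
    return missing
```

Calling `missing_settings(os.environ)` after the `%env` lines would return an empty list when everything needed for the selected API type has been filled in.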

In [None]:
# start the server locally (uncomment the line below; note that it blocks the notebook kernel while the server runs)
# %run -i ../src/sk/app.py

## 1.1 Test the server API
While the server is running, the Jupyter notebook kernel is blocked, so test the API from outside the notebook.
You can use Postman. If you don't have Postman, open a terminal and run the curl commands below.
```
curl -i -X POST -H "Content-Type: application/json" -d "{\"value\": \"Tomorrow is Valentine day. I need to come up with a few date ideas. She speaks French so write it in French.\"}" http://127.0.0.1:5001/planner/createplan
```
You should see a response like:
```
{
    "input": "Valentine's Day Date Ideas",
    "subtasks": [
        {"function": "WriterSkill.Brainstorm"},
        {"function": "WriterSkill.EmailTo", "args": {"to": "significant_other"}},
        {"function": "WriterSkill.Translate", "args": {"language": "French"}}
    ]
}
```
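
Before executing a plan, it can help to review it in a compact form. The following is a minimal sketch that parses a plan shaped like the sample response above and lists its subtasks in order. `plan_steps` is a hypothetical helper, and the field names (`subtasks`, `function`, `args`) are taken from the sample only; the real server may return additional fields.

```python
import json


def plan_steps(plan_text):
    """Return one readable line per subtask, in execution order."""
    plan = json.loads(plan_text)
    steps = []
    for subtask in plan.get("subtasks", []):
        args = subtask.get("args", {})
        arg_text = ", ".join(f"{key}={value}" for key, value in args.items())
        steps.append(f"{subtask['function']}({arg_text})")
    return steps


# The sample plan shown above.
sample_plan = """
{
    "input": "Valentine's Day Date Ideas",
    "subtasks": [
        {"function": "WriterSkill.Brainstorm"},
        {"function": "WriterSkill.EmailTo", "args": {"to": "significant_other"}},
        {"function": "WriterSkill.Translate", "args": {"language": "French"}}
    ]
}
"""

for number, step in enumerate(plan_steps(sample_plan), start=1):
    print(number, step)
```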

Then you can execute this plan:
```
curl -i -X POST -H "Content-Type: application/json" -d "{\"input\": \"Valentine's Day Date Ideas\",    \"subtasks\": [{\"function\": \"WriterSkill.Brainstorm\"},{\"function\": \"WriterSkill.EmailTo\", \"args\": {\"to\": \"significant_other\"}},{\"function\": \"WriterSkill.Translate\", \"args\": {\"language\": \"French\"}}]}" http://127.0.0.1:5001/planner/executeplan
```
You should see a response like:
```
Assurez-vous d'utiliser uniquement le français.

Cher(e) partenaire,

Je pensais à des activités amusantes et romantiques que nous pourrions faire ensemble et j'ai eu quelques idées. Nous pourrions avoir un dîner romantique dans un restaurant chic, ou faire un pique-nique dans le parc. Une autre idée est de faire une soirée cinéma à la maison avec du popcorn fait maison, ou aller pour un massage en couple dans un spa. Nous pourrions également essayer une dégustation de vin dans un vignoble local, ou aller patiner sur glace dans une patinoire à proximité. Si nous sommes d'humeur aventureuse, nous pourrions prendre un cours de cuisine ensemble, ou faire un tour en montgolfière au lever ou au coucher du soleil. Pour une expérience plus culturelle, nous pourrions visiter un musée ou une galerie d'art, ou faire une randonnée ou une promenade panoramique.

Faites-moi savoir ce que vous en pensez !

Merci,
[Votre nom]
```
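
The curl commands above need heavy quote escaping. As an alternative sketch, the same POST requests can be built in Python with the standard library, letting `json.dumps` handle the quoting. `build_post` is a hypothetical helper; the address and routes are the ones used in the curl commands, and the commented-out lines assume the server returns the plan as JSON, as in the sample response.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5001"  # local server address from the curl commands above


def build_post(path, body, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for the given route."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


create_req = build_post(
    "/planner/createplan",
    {"value": "Tomorrow is Valentine day. I need to come up with a few date ideas. "
              "She speaks French so write it in French."},
)

# To send the requests while the local server is running:
# with urllib.request.urlopen(create_req) as resp:
#     plan = json.loads(resp.read())
# execute_req = build_post("/planner/executeplan", plan)
# with urllib.request.urlopen(execute_req) as resp:
#     print(resp.read().decode("utf-8"))
```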

Great! Our server and plugin are both working locally. Let's deploy to AzureML so our team members can try it out too.

# 2. Deploy Server to AML Managed Online Endpoint
On a high level, we will perform the following tasks:

* **Preparation**
    * Store secrets in Azure Key Vault
    * Create a Docker image in Azure Container Registry
* **Create Managed Online Endpoint**, and grant the endpoint permission to access the resources above.
* **Deploy to Managed Online Endpoint**, now that everything the code depends on is in place.
* **Test**

## 2.1. Preparation

In [None]:
# Azure resources needed for the demo. You can change them to use your own. 
# If you don't have the resources, you could keep the names as they are and we will create them for you in the next step.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "llmops-demo"
REGION = "westus3"
AML_WORKSPACE_NAME = "llmops-demo-aml-ws"

# Keyvault information
KEYVAULT_NAME = "aml-llm-demo-kv"
# Secret name in Key Vault for the OpenAI API key
KV_OPENAI_KEY = "OPENAI-API-KEY"

ACR_NAME='amlllmdemoacr'
ACR_IMAGE_NAME='serving'
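
If you change the names above, note that Azure restricts them. The sketch below checks two of them against the naming rules as I understand them from Azure's documentation (Key Vault: 3-24 characters, letters, digits, and non-consecutive hyphens, starting with a letter and ending with a letter or digit; Container Registry: 5-50 alphanumeric characters). The helper names are hypothetical, and the rules should be verified against the current Azure documentation.

```python
import re

# Assumed rule: starts with a letter, ends with a letter or digit, no consecutive hyphens.
_KEYVAULT_PATTERN = re.compile(r"[A-Za-z](?:[A-Za-z0-9]|-(?!-))*[A-Za-z0-9]")
# Assumed rule: alphanumeric characters only.
_ACR_PATTERN = re.compile(r"[A-Za-z0-9]{5,50}")


def valid_keyvault_name(name):
    """Check a Key Vault name against the assumed length and character rules."""
    return 3 <= len(name) <= 24 and _KEYVAULT_PATTERN.fullmatch(name) is not None


def valid_acr_name(name):
    """Check a Container Registry name against the assumed length and character rules."""
    return _ACR_PATTERN.fullmatch(name) is not None


print(valid_keyvault_name("aml-llm-demo-kv"), valid_acr_name("amlllmdemoacr"))
```

This also explains why the registry name in the cell above has no hyphens while the Key Vault name does.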

Let's log in first.

In [None]:
# Authenticate clients
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

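try:
    credential = DefaultAzureCredential(additionally_allowed_tenants = ['*'])
    # Constructing the credential does not authenticate; request a token to verify that it works
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    # Fall back to InteractiveBrowserCredential in case DefaultAzureCredential does not work
    credential = InteractiveBrowserCredential(additionally_allowed_tenants = ['*'])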

# If login doesn't work above, uncomment the code below and login using device code
# !az login --use-device-code

### 2.2 Store Secrets in Azure Keyvault

In [None]:
MY_OBJECT_ID = !az ad signed-in-user show --query id -o tsv
KEYVAULT_RESOURCE_URI = f"/subscriptions/{SUBSCRIPTION_ID}/resourcegroups/{RESOURCE_GROUP}/providers/Microsoft.KeyVault/vaults/{KEYVAULT_NAME}"

need_interactive_auth = False
if "AADSTS530003".lower() in MY_OBJECT_ID[0].lower():
    need_interactive_auth = True
    print('\n'.join(MY_OBJECT_ID))
    print("\nYou are probably getting this error because you are using a device login, and this operation needs an interactive login. If you can't log in interactively, you can copy and run the following commands in Azure Cloud Shell in Bash mode.\n")
    print("MY_OBJECT_ID=`az ad signed-in-user show --query id -o tsv`")
    print(f"az role assignment create --role 'Key Vault Administrator' --scope {KEYVAULT_RESOURCE_URI}  --assignee-object-id $MY_OBJECT_ID --assignee-principal-type User")
    print(f"az keyvault secret set --name {KV_OPENAI_KEY} --vault-name {KEYVAULT_NAME} --value {OPENAI_API_KEY}")
else:
    # Let's set OpenAI key as a secret in keyvault
    need_interactive_auth = False
    !az role assignment create --role "Key Vault Administrator" --scope {KEYVAULT_RESOURCE_URI}  --assignee-object-id {MY_OBJECT_ID[0]} --assignee-principal-type User
    !az keyvault secret set --name {KV_OPENAI_KEY} --vault-name {KEYVAULT_NAME} --value {OPENAI_API_KEY}

### 2.3 Create Docker Image

In [None]:

!az acr build --image {ACR_IMAGE_NAME} --registry {ACR_NAME} ./environment/serving/. --subscription {SUBSCRIPTION_ID}

## 3. Managed Online Endpoint
### 3.1 Create Endpoint

In [None]:
# Authenticate clients
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

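try:
    credential = DefaultAzureCredential(additionally_allowed_tenants = ['*'])
    # Constructing the credential does not authenticate; request a token to verify that it works
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    # Fall back to InteractiveBrowserCredential in case DefaultAzureCredential does not work
    credential = InteractiveBrowserCredential()

# create an endpoint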
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
)

from azure.ai.ml import (
    MLClient,
)

online_endpoint_name = "aml-llm-demo-sk-endpoint"

# get a handle to the workspace
ml_client = MLClient(credential, SUBSCRIPTION_ID, RESOURCE_GROUP, AML_WORKSPACE_NAME)

try:
    endpoint = ml_client.online_endpoints.get(online_endpoint_name)
except Exception as ex:
    # create an online endpoint
    endpoint = ManagedOnlineEndpoint(
        name=online_endpoint_name,
        description="online endpoint for SemanticKernelHttp server",
        auth_mode="key",
    )

    endpoint = ml_client.begin_create_or_update(endpoint).result()

print(endpoint)

### 3.2 Grant Endpoint Permission to Dependencies
The endpoint uses its Azure Active Directory (AAD) identity to access dependent resources, so you don't have to hardcode secrets.

In [None]:
# Allow the endpoint to access keyvault and acr
KEYVAULT_RESOURCE_URI = f"/subscriptions/{SUBSCRIPTION_ID}/resourcegroups/{RESOURCE_GROUP}/providers/Microsoft.KeyVault/vaults/{KEYVAULT_NAME}"
ACR_RESOURCE_URI = f"/subscriptions/{SUBSCRIPTION_ID}/resourcegroups/{RESOURCE_GROUP}/providers/Microsoft.ContainerRegistry/registries/{ACR_NAME}"

need_interactive_auth=True
if need_interactive_auth:
    print("If you can't log in interactively, you can run the following commands in Azure Cloud Shell (Bash).")
    print(f"az role assignment create --role 'Key Vault Secrets User' --scope {KEYVAULT_RESOURCE_URI}  --assignee {endpoint.identity.principal_id}")
    print(f"az role assignment create --role 'AcrPull' --scope {ACR_RESOURCE_URI}  --assignee {endpoint.identity.principal_id}")
else:
    !az role assignment create --role "Key Vault Secrets User" --scope {KEYVAULT_RESOURCE_URI}  --assignee {endpoint.identity.principal_id}
    !az role assignment create --role "AcrPull" --scope {ACR_RESOURCE_URI}  --assignee {endpoint.identity.principal_id}
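
The resource URIs and `az role assignment create` commands above follow a fixed pattern. As a sketch, they can be built as argument lists, which avoids quoting problems with role names that contain spaces. `resource_uri` and `role_assignment_command` are hypothetical helpers; the flags are the ones used in the cell above, and `<principal-id>` stands in for `endpoint.identity.principal_id`.

```python
import shlex


def resource_uri(subscription_id, resource_group, provider, resource_type, name):
    """Build an Azure resource URI in the same shape as the cell above."""
    return (
        f"/subscriptions/{subscription_id}/resourcegroups/{resource_group}"
        f"/providers/{provider}/{resource_type}/{name}"
    )


def role_assignment_command(role, scope, assignee):
    """Return the az CLI arguments that grant `role` on `scope` to `assignee`."""
    return [
        "az", "role", "assignment", "create",
        "--role", role,
        "--scope", scope,
        "--assignee", assignee,
    ]


scope = resource_uri(
    "<subscription-id>", "llmops-demo", "Microsoft.KeyVault", "vaults", "aml-llm-demo-kv"
)
cmd = role_assignment_command("Key Vault Secrets User", scope, "<principal-id>")

# Print a copy-pasteable, shell-quoted version of the command.
print(shlex.join(cmd))
```

The list form can also be passed directly to `subprocess.run(cmd, check=True)` if you prefer not to use the notebook's `!` shell syntax.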

### 3.3 Deploy to Endpoint

In [None]:
import datetime
import logging
import sys

from azure.ai.ml.entities import (
    ManagedOnlineDeployment,
    OnlineRequestSettings,
    Model,
    Environment,
)

KEYVAULT_URL = f"https://{KEYVAULT_NAME}.vault.azure.net"

deployment_name = f"deploy-{str(datetime.datetime.now().strftime('%m%d%H%M%f'))}"
sk_deployment = ManagedOnlineDeployment(
    name = deployment_name,
    model=Model(path="../src"),
    request_settings=OnlineRequestSettings(
        request_timeout_ms= 60000
    ),
    environment= Environment(
        image=f"{ACR_NAME}.azurecr.io/{ACR_IMAGE_NAME}:latest",
        name="serving",
        description="A generic serving environment, allowing customer to provide their own entry point to bring up an http server",
        inference_config = {
            "liveness_route": {"port": 5001, "path": "/health"},
            "readiness_route": {"port": 5001, "path": "/health"},
            "scoring_route": {"port": 5001, "path": "/"},
        }),
    environment_variables={
        "AZUREML_SERVING_ENTRYPOINT": "../src/sk/entry.sh",
        "OPENAI_API_KEY": f"keyvaultref:{KEYVAULT_URL}/secrets/{KV_OPENAI_KEY}",
        "OPENAI_API_TYPE": OPENAI_API_TYPE,
        "OPENAI_MODEL_ID": OPENAI_MODEL_ID,
        "OPENAI_ORG_ID": OPENAI_ORG_ID,
        "AZURE_OPENAI_API_ENDPOINT": AZURE_OPENAI_API_ENDPOINT,
        "AZURE_OPENAI_API_DEPLOYMENT_NAME": AZURE_OPENAI_API_DEPLOYMENT_NAME,
        # reuse the flag set in section 1, passed as a string like the other values
        "IS_CHAT_COMPLETION": str(IS_CHAT_COMPLETION),
    },
    endpoint_name=online_endpoint_name,
    instance_type="Standard_F2s_v2",
    instance_count=1
)
ml_client.online_deployments.begin_create_or_update(sk_deployment).result()

endpoint.traffic = {deployment_name: 100}
ml_client.begin_create_or_update(endpoint).result()
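
In the deployment above, the API key is passed as a `keyvaultref:` reference rather than as the secret value itself, so the key never appears in the deployment definition. The sketch below builds such a reference and coerces all settings to strings, since environment variables reach the container as text. `keyvault_ref` and `as_env_strings` are hypothetical helpers; the reference format is copied from the cell above, and how it is resolved depends on the serving image and its entry script, which are not shown here.

```python
def keyvault_ref(vault_name, secret_name):
    """Build a secret reference in the format used by the deployment cell above."""
    return f"keyvaultref:https://{vault_name}.vault.azure.net/secrets/{secret_name}"


def as_env_strings(settings):
    """Convert every value to str so that booleans and numbers become plain text."""
    return {key: str(value) for key, value in settings.items()}


env = as_env_strings({
    "OPENAI_API_KEY": keyvault_ref("aml-llm-demo-kv", "OPENAI-API-KEY"),
    "IS_CHAT_COMPLETION": True,
})
print(env)
```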

# 4. Test
Now that the endpoint has been deployed, let's test it. We will reuse the same request we used when testing locally.

In [None]:
import requests, json
from urllib.parse import urlsplit

url_parts = urlsplit(endpoint.scoring_uri)
url = url_parts.scheme + "://" + url_parts.netloc

token = ml_client.online_endpoints.get_keys(name=online_endpoint_name).primary_key
headers = {'Authorization': 'Bearer ' + token, 'Content-Type': 'application/json'}
payload = json.dumps({
  "value": "Tomorrow is Valentine's day. I need to come up with a few date ideas. She speaks French so write it in French."
})

response = requests.post(f'{url}/planner/createplan', headers=headers, data=payload)
print('Created Plan:\n', response.text)


In [None]:
payload = response.text
response = requests.post(f'{url}/planner/executeplan', headers=headers, data=payload)
print('Execution Result:\n', response.text)
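
The test cells above strip the path from the endpoint's scoring URI so that the planner routes can be appended to the host, and they send the endpoint key as a bearer token. A minimal sketch of those two steps as reusable helpers follows; `base_url` and `auth_headers` are hypothetical names, and the example URI is made up for illustration.

```python
from urllib.parse import urlsplit


def base_url(scoring_uri):
    """Keep only the scheme and host of a scoring URI, dropping path and query."""
    parts = urlsplit(scoring_uri)
    return f"{parts.scheme}://{parts.netloc}"


def auth_headers(key):
    """Headers for a key-authenticated JSON request to the endpoint."""
    return {"Authorization": f"Bearer {key}", "Content-Type": "application/json"}


# Hypothetical scoring URI, for illustration only.
example_uri = "https://my-endpoint.westus3.inference.ml.azure.com/score"
print(base_url(example_uri) + "/planner/createplan")
```

Checking `response.status_code` (or calling `response.raise_for_status()`) before reusing `response.text` as the next payload would make a failed plan creation easier to spot.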