# APIM ❤️ OpenAI

## Vector Searching lab
![flow](../../images/vector-searching.gif)

Playground to try the [Retrieval Augmented Generation (RAG) pattern](https://learn.microsoft.com/en-us/azure/search/retrieval-augmented-generation-overview) with Azure AI Search, Azure OpenAI embeddings and Azure OpenAI completions. All the endpoints are managed via APIM.

> ℹ️ Reuses the [AI Orchestration with Azure AI Search](https://github.com/Azure/intro-to-intelligent-apps/blob/main/labs/03-orchestration/04-ACS/acs-lc-python.ipynb) notebook from the [intro to intelligent Apps workshop](https://github.com/Azure/intro-to-intelligent-apps/).

### Result
![result](result.png)

### TOC
- [0️⃣ Initialize notebook variables](#0)
- [1️⃣ Create the Azure Resource Group](#1)
- [2️⃣ Create deployment using 🦾 Bicep](#2)
- [3️⃣ Get the deployment outputs](#3)
- [4️⃣ Install packages](#4)
- [5️⃣ Create an Azure AI Search index and load movie data](#5)
- [🧪 Vector store searching using Azure AI Search](#search)
- [🧪 Bringing it All Together with Retrieval Augmented Generation (RAG) + Langchain (LC)](#langchain)
- [🗑️ Clean up resources](#clean)

### Prerequisites
- [Python 3.8 or later version](https://www.python.org/) installed
- [VS Code](https://code.visualstudio.com/) installed with the [Jupyter notebook extension](https://marketplace.visualstudio.com/items?itemName=ms-toolsai.jupyter) enabled
- [Azure CLI](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli) installed
- [An Azure Subscription](https://azure.microsoft.com/en-us/free/) with Contributor permissions
- [Access granted to Azure OpenAI](https://aka.ms/oai/access) (the mock service is not supported in this lab)
- [Sign in to Azure with Azure CLI](https://learn.microsoft.com/en-us/cli/azure/authenticate-azure-cli-interactively)

<a id='0'></a>
### 0️⃣ Initialize notebook variables

- Resources will be suffixed with a unique string based on your subscription ID.
- Adjust the location parameters according to your preferences and the [product availability by Azure region](https://azure.microsoft.com/en-us/explore/global-infrastructure/products-by-region/?cdn=disable&products=cognitive-services,api-management).
- Adjust the OpenAI model and version according to the [availability by region](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models).

In [None]:
import os
import json
import datetime
import requests

deployment_name = os.path.basename(os.path.dirname(globals()['__vsc_ipynb_file__']))
resource_group_name = f"lab-{deployment_name}" # change the name to match your naming style
resource_group_location = "westeurope"
apim_resource_name = "apim"
apim_resource_location = "westeurope"
apim_resource_sku = "Basicv2"
openai_resources = [ {"name": "openai1", "location": "uksouth"} ] # list of OpenAI resources to deploy.
openai_resources_sku = "S0"
openai_model_name = "gpt-35-turbo"
openai_model_version = "0613"
openai_deployment_name = "gpt-35-turbo"
openai_api_version = "2024-02-01"
openai_specification_url='https://raw.githubusercontent.com/Azure/azure-rest-api-specs/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/stable/' + openai_api_version + '/inference.json'
openai_backend_pool = "openai-backend-pool"
mock_backend_pool = "mock-backend-pool"
mock_webapps = [ ] # mocking is not supported in this lab

# variables specific for the vector search lab
log_analytics_name = "workspace"
app_insights_name = 'insights'
openai_embeddings_deployment_name = "text-embedding-ada-002"
openai_embeddings_model_name = "text-embedding-ada-002"
openai_embeddings_model_version = "2"
searchservice_resource_name = "search"
searchservice_sku = "standard"
searchservice_api_path = "searchservice"
searchindex_api_path = "searchindex"
searchindex_name = "movies"
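
The `deployment_name` lookup above reads `__vsc_ipynb_file__`, a variable that the VS Code Jupyter extension places in the notebook's globals. In other Jupyter front ends that variable may be missing, in which case the cell raises a `KeyError`. The following is a minimal sketch of a fallback, assuming the notebook is started from the lab folder; the helper name `derive_deployment_name` is hypothetical and not part of the lab.

```python
import os

def derive_deployment_name(namespace, fallback_dir=None):
    """Return the name of the folder that contains the notebook."""
    notebook_path = namespace.get("__vsc_ipynb_file__")
    if notebook_path:
        return os.path.basename(os.path.dirname(notebook_path))
    # Assumption: outside VS Code the working directory is the lab folder.
    return os.path.basename(os.path.abspath(fallback_dir or os.getcwd()))

# Illustrative path only; in the notebook the call would be derive_deployment_name(globals()).
example = derive_deployment_name(
    {"__vsc_ipynb_file__": "/repo/labs/vector-searching/lab.ipynb"}
)
print(example)
```

With this helper, the first assignment in the cell could become `deployment_name = derive_deployment_name(globals())`.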


<a id='1'></a>
### 1️⃣ Create the Azure Resource Group
All resources deployed in this lab will be created in the specified resource group. Skip this step if you want to use an existing resource group.

In [None]:
resource_group_stdout = ! az group create --name {resource_group_name} --location {resource_group_location}
if resource_group_stdout.n.startswith("ERROR"):
    print(resource_group_stdout)
else:
    print("✅ Azure Resource Group ", resource_group_name, " created ⌚ ", datetime.datetime.now().time())
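
The `!` shell magic used above only works inside IPython. If the same step has to run from a plain Python script, one option is `subprocess`. The sketch below builds the same `az group create` call and reports success from the process exit code instead of inspecting the output text. The function names are hypothetical. It also assumes `az` is directly executable on the `PATH`; on Windows the CLI is a `.cmd` wrapper and may need a different invocation.

```python
import subprocess

def group_create_command(name, location):
    """Build the same Azure CLI call that the cell above issues with `!`."""
    return ["az", "group", "create", "--name", name, "--location", location]

def create_resource_group(name, location, runner=subprocess.run):
    """Run the command and return (succeeded, output_text)."""
    completed = runner(
        group_create_command(name, location),
        capture_output=True,
        text=True,
    )
    succeeded = completed.returncode == 0
    # On failure the Azure CLI writes its message to stderr.
    return succeeded, (completed.stdout if succeeded else completed.stderr)
```

In the notebook this could be called as `create_resource_group(resource_group_name, resource_group_location)`. The `runner` parameter exists only so the function can be exercised without calling Azure.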

<a id='2'></a>
### 2️⃣ Create deployment using 🦾 Bicep

This lab uses [Bicep](https://learn.microsoft.com/en-us/azure/azure-resource-manager/bicep/overview?tabs=bicep) to declaratively define all the resources that will be deployed. Change the parameters or edit [main.bicep](main.bicep) directly to try different configurations.

In [None]:

if len(openai_resources) > 0:
    backend_id = openai_backend_pool if len(openai_resources) > 1 else openai_resources[0].get("name")
elif len(mock_webapps) > 0:
    backend_id = mock_backend_pool if len(mock_webapps) > 1 else mock_webapps[0].get("name")

# Read the policy template, then write a copy with the backend id filled in
with open("policy.xml", 'r') as policy_xml_file:
    policy_template_xml = policy_xml_file.read()
policy_xml = policy_template_xml.replace("{backend-id}", backend_id)
with open("policy.xml", 'w') as policy_xml_file:
    policy_xml_file.write(policy_xml)

bicep_parameters = {
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "mockWebApps": { "value": mock_webapps },
    "mockBackendPoolName": { "value": mock_backend_pool },
    "openAIBackendPoolName": { "value": openai_backend_pool },
    "openAIConfig": { "value": openai_resources },
    "openAIDeploymentName": { "value": openai_deployment_name },
    "openAISku": { "value": openai_resources_sku },
    "openAIModelName": { "value": openai_model_name },
    "openAIModelVersion": { "value": openai_model_version },
    "openAIAPISpecURL": { "value": openai_specification_url },
    "apimResourceName": { "value": apim_resource_name},
    "apimResourceLocation": { "value": apim_resource_location},
    "apimSku": { "value": apim_resource_sku},
    "logAnalyticsName": { "value": log_analytics_name },
    "applicationInsightsName": { "value": app_insights_name },
    "openAIEmbeddingsDeploymentName": { "value": openai_embeddings_deployment_name},
    "openAIEmbeddingsModelName": { "value": openai_embeddings_model_name},
    "openAIEmbeddingsModelVersion": { "value": openai_embeddings_model_version},
    "searchServiceName": { "value": searchservice_resource_name},
    "searchServiceSku": { "value": searchservice_sku},
    "searchServiceAPIPath": { "value": searchservice_api_path},
    "searchIndexAPIPath": { "value": searchindex_api_path}
  }
}
with open('params.json', 'w') as bicep_parameters_file:
    bicep_parameters_file.write(json.dumps(bicep_parameters))

! az deployment group create --name {deployment_name} --resource-group {resource_group_name} --template-file "main.bicep" --parameters "params.json"

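# Restore the original policy template so the placeholder is available for the next run
with open("policy.xml", 'w') as policy_xml_file:
    policy_xml_file.write(policy_template_xml)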
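
The cell above rewrites `policy.xml` with the backend id before the deployment and restores the template afterwards. If the cell is interrupted between those two steps, the file stays modified and the `{backend-id}` placeholder is lost. As a sketch of one way to make the restore unconditional, the substitution can be wrapped in a context manager; the name `rendered_policy` is hypothetical.

```python
import contextlib

@contextlib.contextmanager
def rendered_policy(path, backend_id, placeholder="{backend-id}"):
    """Temporarily replace the placeholder in the policy file, then restore it."""
    with open(path, "r", encoding="utf-8") as policy_file:
        template = policy_file.read()
    with open(path, "w", encoding="utf-8") as policy_file:
        policy_file.write(template.replace(placeholder, backend_id))
    try:
        yield path
    finally:
        # Runs even if the body raises, so the template is never left modified.
        with open(path, "w", encoding="utf-8") as policy_file:
            policy_file.write(template)
```

The deployment command would then run inside `with rendered_policy("policy.xml", backend_id):`.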


<a id='3'></a>
### 3️⃣ Get the deployment outputs

The APIM gateway URL will be used as the OpenAI and AI Search endpoints.

In [None]:
deployment_stdout = ! az deployment group show --name {deployment_name} -g {resource_group_name} --query properties.outputs.apimSubscriptionKey.value -o tsv
apim_subscription_key = deployment_stdout.n
deployment_stdout = ! az deployment group show --name {deployment_name} -g {resource_group_name} --query properties.outputs.apimResourceGatewayURL.value -o tsv
apim_resource_gateway_url = deployment_stdout.n
print("👉🏻 API Gateway URL: ", apim_resource_gateway_url)
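
The cell above calls `az deployment group show` once per output. An alternative is to query `properties.outputs` once with `-o json` and read every value from the result. The sketch below shows only the parsing step and assumes the usual shape of ARM deployment outputs, where each output is an object with `type` and `value`. The sample payload is a placeholder, not real deployment data, and the function name is hypothetical.

```python
import json

def parse_deployment_outputs(outputs_json):
    """Flatten {name: {"type": ..., "value": ...}} into {name: value}."""
    outputs = json.loads(outputs_json)
    return {name: entry["value"] for name, entry in outputs.items()}

# Placeholder payload shaped like `--query properties.outputs -o json`; values are made up.
sample = """
{
  "apimResourceGatewayURL": {"type": "String", "value": "https://example.invalid"},
  "apimSubscriptionKey": {"type": "String", "value": "not-a-real-key"}
}
"""
outputs = parse_deployment_outputs(sample)
print(outputs["apimResourceGatewayURL"])
```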

<a id='4'></a>
### 4️⃣ Install packages

In [None]:
! pip install openai -q
! pip install azure-search-documents==11.4.0 -q
! pip install azure-identity -q
! pip install langchain==0.1.6 -q
! pip install langchain-openai -q
! pip install langchain-community==0.0.19 -q

<a id='5'></a>
### 5️⃣ Create an Azure AI Search index and load movie data
Next, we'll step through the process of configuring an Azure AI Search index to store sample [movie data](movies.csv) and then loading the data into the index.

> ℹ️ The following code is really well explained in the [intro to AI workshop](https://github.com/Azure/intro-to-intelligent-apps/blob/main/labs/03-orchestration/04-ACS/acs-lc-python.ipynb)

In [None]:
from langchain_community.document_loaders.csv_loader import CSVLoader
from langchain_openai import AzureOpenAIEmbeddings
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (VectorSearch, VectorSearchProfile, HnswAlgorithmConfiguration, SemanticPrioritizedFields, SemanticSearch, SemanticField, SemanticConfiguration, SimpleField, SearchableField, SearchField, SearchFieldDataType, SearchIndex)
from azure.search.documents.models import (VectorizedQuery)


loader = CSVLoader(file_path='./movies.csv', source_column='original_title', encoding='utf-8', csv_args={'delimiter':',', 'fieldnames': ['id', 'original_language', 'original_title', 'popularity', 'release_date', 'vote_average', 'vote_count', 'genre', 'overview', 'revenue', 'runtime', 'tagline']})
data = loader.load()

# Rather than load all 500 movies into Azure AI search, we will use a
# smaller subset of movie data to make things quicker. The more movies you load,
# the more time it will take for embeddings to be generated.

data = data[1:51]
print('Loaded %s movies.' % len(data))


azure_openai_embeddings = AzureOpenAIEmbeddings(
    azure_endpoint = apim_resource_gateway_url,
    openai_api_key = apim_subscription_key,
    azure_deployment = openai_embeddings_deployment_name,
    openai_api_version = openai_api_version,
    model= openai_embeddings_model_name
)

fields = [
    SimpleField(name="id", type=SearchFieldDataType.String, key=True, sortable=True, filterable=True, facetable=True),
    SearchableField(name="title", type=SearchFieldDataType.String),
    SearchableField(name="overview", type=SearchFieldDataType.String),
    SearchableField(name="genre", type=SearchFieldDataType.String),
    SearchableField(name="tagline", type=SearchFieldDataType.String),
    SearchableField(name="release_date", type=SearchFieldDataType.DateTimeOffset, sortable=True),
    SearchableField(name="popularity", type=SearchFieldDataType.Double, sortable=True),
    SearchableField(name="vote_average", type=SearchFieldDataType.Double, sortable=True),
    SearchableField(name="vote_count", type=SearchFieldDataType.Int32, sortable=True),
    SearchableField(name="runtime", type=SearchFieldDataType.Int32, sortable=True),
    SearchableField(name="revenue", type=SearchFieldDataType.Int64, sortable=True),
    SearchableField(name="original_language", type=SearchFieldDataType.String),
    SearchField(name="vector", type=SearchFieldDataType.Collection(SearchFieldDataType.Single), searchable=True, vector_search_dimensions=1536, vector_search_profile_name="movies-vector-profile"),
]

vector_search = VectorSearch(
    profiles=[VectorSearchProfile(name="movies-vector-profile", algorithm_configuration_name="movies-vector-config")],
    algorithms=[HnswAlgorithmConfiguration(name="movies-vector-config")],
)

semantic_config = SemanticConfiguration(
    name="movies-semantic-config",
    prioritized_fields=SemanticPrioritizedFields(
        title_field=SemanticField(field_name="title"),
        keywords_fields=[SemanticField(field_name="genre")],
        content_fields=[SemanticField(field_name="title"),
                        SemanticField(field_name="overview"),
                        SemanticField(field_name="tagline"),
                        SemanticField(field_name="genre"),
                        SemanticField(field_name="release_date"),
                        SemanticField(field_name="popularity"),
                        SemanticField(field_name="vote_average"),
                        SemanticField(field_name="vote_count"),
                        SemanticField(field_name="runtime"),
                        SemanticField(field_name="revenue"),
                        SemanticField(field_name="original_language")],
    )
)

semantic_search = SemanticSearch(configurations=[semantic_config])

# Create the search index with the desired vector search and semantic configurations
index = SearchIndex(
    name=searchindex_name,
    fields=fields,
    vector_search=vector_search,
    semantic_search=semantic_search
)

index_client = SearchIndexClient(
    f"{apim_resource_gateway_url}/{searchservice_api_path}",
    AzureKeyCredential(apim_subscription_key)
)

result = index_client.create_or_update_index(index)

print(f'Index {result.name} created.')

# Loop through all of the movies and create a new item for each one.

items = []
for movie in data:
    content = movie.page_content
    fields = movie.page_content.split('\n')
    # Split on the first ': ' only, so values that contain ': ' (e.g. in a title) are not truncated
    movieId = (fields[0].split(': ', 1)[1])[:-2]
    movieTitle = (fields[2].split(': ', 1)[1])
    movieOverview = (fields[8].split(': ', 1)[1])
    movieGenre = (fields[7].split(': ', 1)[1])[1:-1]
    movieTagline = (fields[11].split(': ', 1)[1])
    movieReleaseDate = (fields[4].split(': ', 1)[1])
    moviePopularity = (fields[3].split(': ', 1)[1])
    movieVoteAverage = (fields[5].split(': ', 1)[1])
    movieVoteCount = (fields[6].split(': ', 1)[1])
    movieRuntime = (fields[10].split(': ', 1)[1])
    movieRevenue = (fields[9].split(': ', 1)[1])
    movieOriginalLanguage = (fields[1].split(': ', 1)[1])

    items.append(dict([
        ("id", movieId), 
        ("title", movieTitle),
        ("overview", movieOverview),
        ("genre", movieGenre),
        ("tagline", movieTagline),
        ("release_date", movieReleaseDate),
        ("popularity", moviePopularity),
        ("vote_average", movieVoteAverage),
        ("vote_count", movieVoteCount),
        ("runtime", movieRuntime),
        ("revenue", movieRevenue),
        ("original_language", movieOriginalLanguage),
        ("vector", azure_openai_embeddings.embed_query(content))
    ]))

    print(f"Movie {movieTitle} added.")

print(f"New items structure with embeddings created for {len(items)} movies.")

search_client = SearchClient(
    f"{apim_resource_gateway_url}/{searchindex_api_path}",
    searchindex_name,
    AzureKeyCredential(apim_subscription_key)
)

result = search_client.upload_documents(items)

print(f"Successfully loaded {len(data)} movies into Azure AI Search index.")
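
The loading loop above reads each field from `page_content` by its position in the list of lines, which ties the code to the column order in `fieldnames`. As a sketch of an alternative, the lines can be parsed into a dictionary and looked up by name. This assumes every field sits on a single `key: value` line; values that contain line breaks would need extra handling. The helper name `parse_page_content` and the sample record are made up for illustration.

```python
def parse_page_content(page_content):
    """Turn 'key: value' lines into a dict, splitting at the first colon only."""
    record = {}
    for line in page_content.split("\n"):
        key, separator, value = line.partition(":")
        if separator:  # skip lines without a "key:" prefix
            record[key.strip()] = value.strip()
    return record

# Made-up record in the "key: value" layout used by the loop above.
sample = "id: 1.0\noriginal_title: Example: The Sequel\ngenre: [Action]\ntagline: "
movie = parse_page_content(sample)
print(movie["original_title"])  # the colon inside the title is preserved
```

Inside the loop, `parse_page_content(movie.page_content)["overview"]` would then replace the positional lookup `fields[8]`.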

<a id='search'></a>
### 🧪 Vector store searching using Azure AI Search
We've loaded the movies into Azure AI Search, so now let's experiment with some of the different types of searches you can perform.

First we'll just perform a simple keyword search.

In [None]:
query = "What are the best movies about superheroes?"

search_client = SearchClient(
    f"{apim_resource_gateway_url}/{searchindex_api_path}",
    searchindex_name,
    AzureKeyCredential(apim_subscription_key)
)

results = list(search_client.search(
    search_text=query,
    query_type="simple",
    include_total_count=True,
    top=5
))

for result in results:
    print("Movie: {}".format(result["title"]))
    print("Genre: {}".format(result["genre"]))
    print("----------")
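
The keyword search above matches on the terms in the query text. A vector search instead ranks documents by how close their embeddings are to the embedding of the query. The sketch below illustrates that idea with an exhaustive nearest-neighbor ranking over toy 3-dimensional vectors. It is not what Azure AI Search runs: the index defined earlier uses an HNSW configuration, which is an approximate method, and its vectors have 1536 dimensions. Cosine similarity is assumed as the metric, and all names and numbers are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(query_vector, documents, k):
    """Exhaustively rank (title, vector) pairs by similarity to the query."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, doc[1]),
        reverse=True,
    )
    return [title for title, _ in ranked[:k]]

# Toy vectors, invented for illustration only.
documents = [
    ("superhero film", [0.9, 0.1, 0.0]),
    ("romantic comedy", [0.0, 0.2, 0.9]),
    ("comic-book adaptation", [0.8, 0.3, 0.1]),
]
top_two = nearest_neighbors([1.0, 0.2, 0.0], documents, k=2)
print(top_two)
```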

<a id='langchain'></a>
### 🧪 Bringing it All Together with Retrieval Augmented Generation (RAG) + Langchain (LC)
Now that the vector store is set up and the data is loaded, we are ready to implement the RAG pattern using AI orchestration. At a high level, the following steps are required:

1. Ask the question.
2. Create a prompt template with inputs.
3. Get the embedding representation of the question.
4. Use the embedded version of the question to search Azure AI Search (i.e. the vector store).
5. Inject the search results into the prompt template and execute the prompt to get the completion.

In [None]:
# Implement RAG using Langchain (LC)

from langchain_openai import AzureOpenAIEmbeddings
from langchain_openai import AzureChatOpenAI
from langchain.chains import LLMChain
import uuid

UUID = str(uuid.uuid4())
print(f"Request-Id: {UUID} - use this ID to trace the requests in Azure Application Insights.")

azure_openai_embeddings = AzureOpenAIEmbeddings(
    azure_endpoint = apim_resource_gateway_url,
    openai_api_key = apim_subscription_key,
    azure_deployment = openai_embeddings_deployment_name,
    openai_api_version = openai_api_version,
    model= openai_embeddings_model_name
)

azure_openai = AzureChatOpenAI(
    default_headers = {"Request-Id": UUID},
    azure_endpoint = apim_resource_gateway_url,
    openai_api_key = apim_subscription_key,
    azure_deployment = openai_deployment_name,
    openai_api_version = openai_api_version,
    model= openai_model_name
)

# Ask the question
query = "What are the best movies about superheroes?"

# Create a prompt template with variables, note the curly braces
from langchain.prompts import PromptTemplate
prompt = PromptTemplate(
    input_variables=["original_question","search_results"],
    template="""
    Question: {original_question}

    Do not use any other data.
    Only use the movie data below when responding.
    Provide detailed information about the synopsis of the movie.
    {search_results}
    """,
)

# Search Vector Store
search_client = SearchClient(
    f"{apim_resource_gateway_url}/{searchindex_api_path}",
    searchindex_name,
    AzureKeyCredential(apim_subscription_key)    
)

vector = VectorizedQuery(vector=azure_openai_embeddings.embed_query(query), k_nearest_neighbors=5, fields="vector")

results = list(search_client.search(
    search_text=query,
    query_type="simple",
    semantic_configuration_name="movies-semantic-config",
    include_total_count=True,
    vector_queries=[vector],
    select=["title","genre","overview","tagline","release_date","popularity","vote_average","vote_count","runtime","revenue","original_language"],
    top=5,
    headers={"Request-Id": UUID}
))

# Build the Prompt and Execute against the Azure OpenAI to get the completion
chain = LLMChain(llm=azure_openai, prompt=prompt, verbose=False)
response = chain.invoke({"original_question": query, "search_results": results})
print(response['text'])
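
The chain above receives `results` as a Python list, so the prompt ends up containing the string form of that list. A possible refinement is to render the results as plain labelled text before injecting them. The sketch below assumes each result can be read like a dictionary, as the keyword-search cell does with `result["title"]`. The function name and the sample records are placeholders, not data from the index.

```python
def format_search_results(results, fields=("title", "genre", "overview")):
    """Render each result as a short labelled block of text."""
    blocks = []
    for position, result in enumerate(results, start=1):
        lines = [f"Movie {position}:"]
        for field in fields:
            value = result.get(field)
            if value:  # skip missing or empty fields
                lines.append(f"  {field}: {value}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)

# Placeholder records standing in for Azure AI Search results.
sample_results = [
    {"title": "Example Movie A", "genre": "Action", "overview": "A placeholder synopsis."},
    {"title": "Example Movie B", "genre": "Drama", "overview": ""},
]
formatted = format_search_results(sample_results)
print(formatted)
```

The returned string could be passed as `search_results` in `chain.invoke` in place of the raw list.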

<a id='clean'></a>
### 🗑️ Clean up resources

When you're finished with the lab, you should remove all your deployed resources from Azure to avoid extra charges and keep your Azure subscription uncluttered.
Use the [clean-up-resources notebook](clean-up-resources.ipynb) for that.