# Build RAG with Azure AI Search

This notebook provides a sample script for the indexing pipeline in [Build a RAG solution in Azure AI Search](https://learn.microsoft.com/azure/search/tutorial-rag-build-solution).

Steps in this notebook include:

- Set up the environment
- Set up the Azure resources used in the pipeline
- Create an index, data source, skillset, and indexer on Azure AI Search
- Send a query to an LLM to chat with your data

Sample data is a collection of PDFs that you load into Azure Blob Storage and retrieve during indexing.

This tutorial assumes that the embedding and chat models are deployed on Azure OpenAI so that you can use the integrated vectorization capabilities of Azure AI Search.

## Prerequisites

You need the following Azure resources to run all of the code in this notebook.

- [Azure Storage](https://learn.microsoft.com/azure/storage/common/storage-account-create), general purpose account, used for storing the PDFs.

- [Azure OpenAI](https://learn.microsoft.com/azure/ai-services/openai/how-to/create-resource) provides the embedding and chat models.

- [Azure AI Services multiservice account](https://learn.microsoft.com/azure/ai-services/multi-service-resource), in the same region as Azure AI Search.

- [Azure AI Search](https://learn.microsoft.com/azure/search/search-create-service-portal), basic tier or higher is recommended. Choose the same region as Azure OpenAI and Azure AI multiservice.

To meet the same-region requirement, start by reviewing the [regions for the embedding and chat models](https://learn.microsoft.com/azure/ai-services/openai/concepts/models#model-summary-table-and-region-availability) you want to use. Once you identify a region, confirm that Azure AI Search with AI services integration is available in the [same region](https://learn.microsoft.com/azure/search/search-region-support#azure-public-regions).

## Set up Azure resources using the Azure portal

We recommend using the Azure portal for setting up resources.

You must be a subscription **Owner** or **User Access Administrator** to create roles. If you don't have permission to create roles, you can use API keys instead. If you're using keys, you can skip the steps that enable system assigned managed identities.

1. Download the required PDF files.

1. Sign in to the [Azure portal](https://portal.azure.com).

1. Make sure Azure Blob Storage, Azure AI Search, Azure OpenAI, and Azure AI multiservice resources are in the same region.

### Configure Azure Storage

1. On the Azure Storage left menu, select **Storage browser** > **Blob containers**, and then **Add container**.

1. Name the container *index-and-chat*.

1. On the left menu, select **Settings** > **Identity** and turn on system assigned managed identity.

### Configure Azure AI Search

1. On the left menu, select **Settings** > **Keys** to copy an admin API key.

1. On the left menu, select **Settings** > **Identity** and turn on system assigned managed identity.

### Configure Azure OpenAI

Deploy the following models on Azure OpenAI:

- text-embedding-3-large on Azure OpenAI for embeddings
- gpt-4o on Azure OpenAI for chat completion

You must have [**Cognitive Services OpenAI Contributor**](https://learn.microsoft.com/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-contributor) or higher to deploy models in Azure OpenAI.

1. Go to [Azure OpenAI Studio](https://oai.azure.com/).

1. Select **Deployments** on the left menu.

1. Select **Deploy model** > **Deploy base model**.

1. Select **text-embedding-3-large** from the dropdown list and confirm the selection.

1. Specify a deployment name. We recommend "text-embedding-3-large".

1. Accept the defaults.

1. Select **Deploy**.

1. Repeat the previous steps for **gpt-4o**.


### Configure search engine role-based access to Azure Storage

1. Sign in to the [Azure portal](https://portal.azure.com) and find your storage account.

1. On the left menu, select **Access control (IAM)**.

1. Add a role for **Storage Blob Data Reader**, assigned to the search service system-managed identity.

### Configure search engine role-based access to Azure models

Assign yourself *and* the search service identity permissions on Azure OpenAI. The code for this tutorial runs locally, so requests to Azure OpenAI originate from your system. In addition, embedding requests and query responses from the search engine are passed to Azure OpenAI. For these reasons, both you and the search service need permissions on Azure OpenAI.

1. Sign in to the [Azure portal](https://portal.azure.com) and find your Azure OpenAI resource.

1. On the left menu, select **Access control (IAM)**.

1. Add a role for [**Cognitive Services OpenAI User**](https://learn.microsoft.com/azure/ai-services/openai/how-to/role-based-access-control#cognitive-services-openai-userpermissions).

1. Select **Managed identity** and then select **Members**. Find the system-managed identity for your search service in the dropdown list.

1. Next, select **User, group, or service principal** and then select **Members**. Search for your user account and then select it from the dropdown list.

1. Select **Review and Assign** to create the role assignments.

This step concludes provisioning services in the Azure portal. In the next section, you switch to Visual Studio Code and a local environment.

## Create a virtual environment in Visual Studio Code

Create a virtual environment so that you can install the dependencies in isolation.

1. In Visual Studio Code, open the folder containing index-and-chat.ipynb.

1. Press Ctrl+Shift+P to open the command palette, search for "Python: Create Environment", and then select `Venv` to create a virtual environment in the current workspace.

1. Select Tutorial-RAG\tutorial-rag-requirements.txt for the dependencies.

It takes several minutes to create the environment. When the environment is ready, continue to the next step.

### Install packages

In [None]:
! pip install -r requirements-nb.txt --quiet

### Load .env file (Copy .env-sample to .env and update accordingly)

Set the appropriate environment variables below:

1. Use the [Document Layout Skill](https://learn.microsoft.com/en-us/azure/search/cognitive-search-skill-document-intelligence-layout) to convert PDFs and other compatible documents to markdown. It requires an [AI Services account](https://learn.microsoft.com/en-us/azure/search/cognitive-search-attach-cognitive-services) and a search service in a [supported region](https://learn.microsoft.com/en-us/azure/search/cognitive-search-attach-cognitive-services).
   1. Specify `AZURE_AI_SERVICES_KEY` if using key-based authentication, and specify `AZURE_AI_SERVICES_ENDPOINT`.
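
Before running the rest of the notebook, it can help to confirm that the required settings are present. The following is a minimal sketch, assuming you run it after `load_dotenv` has populated the environment. The variable names are the ones this notebook reads with `os.environ[...]`; the helper `find_missing_variables` is a hypothetical name introduced here for illustration.

```python
import os
from typing import Iterable, List, Mapping

# Settings that the notebook reads with os.environ[...], so they must be set.
REQUIRED_VARIABLES = [
    "AZURE_SEARCH_SERVICE_ENDPOINT",
    "BLOB_CONNECTION_STRING",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_VERSION",
    "AZURE_OPENAI_CHAT_DEPLOYMENT",
    "AZURE_AI_SERVICES_ENDPOINT",
]


def find_missing_variables(
    env: Mapping[str, str],
    required: Iterable[str] = REQUIRED_VARIABLES,
) -> List[str]:
    """Return the required names that are absent or empty in env (hypothetical helper)."""
    return [name for name in required if not env.get(name)]


missing = find_missing_variables(os.environ)
if missing:
    print("Missing required settings: " + ", ".join(missing))
else:
    print("All required settings are present.")
```

Settings read with `os.getenv` in the notebook, such as `AZURE_SEARCH_ADMIN_KEY` and `AZURE_OPENAI_KEY`, are optional and are deliberately left out of this check.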


In [1]:
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.core.credentials import AzureKeyCredential
import os

load_dotenv(override=True) # take environment variables from .env.

# Variables not used here do not need to be updated in your .env file
endpoint = os.environ["AZURE_SEARCH_SERVICE_ENDPOINT"]
credential = AzureKeyCredential(os.getenv("AZURE_SEARCH_ADMIN_KEY")) if os.getenv("AZURE_SEARCH_ADMIN_KEY") else DefaultAzureCredential()
index_namespace = os.getenv("AZURE_SEARCH_INDEX_NAMESPACE", "index-and-chat")
blob_connection_string = os.environ["BLOB_CONNECTION_STRING"]
# search blob datasource connection string is optional - defaults to blob connection string
# This field is only necessary if you are using MI to connect to the data source
# https://learn.microsoft.com/azure/search/search-howto-indexing-azure-blob-storage#supported-credentials-and-connection-strings
search_blob_connection_string = os.getenv("SEARCH_BLOB_DATASOURCE_CONNECTION_STRING", blob_connection_string)
blob_container_name = os.getenv("BLOB_CONTAINER_NAME", "index-and-chat")
azure_openai_endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
azure_openai_key = os.getenv("AZURE_OPENAI_KEY")
azure_openai_api_version = os.environ["AZURE_OPENAI_API_VERSION"]
azure_openai_embedding_deployment = os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT", "text-embedding-3-large")
azure_openai_model_name = os.getenv("AZURE_OPENAI_EMBEDDING_MODEL_NAME", "text-embedding-3-large")
azure_openai_model_dimensions = int(os.getenv("AZURE_OPENAI_EMBEDDING_DIMENSIONS", 3072))
azure_openai_chat_deployment = os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT"]
azure_ai_services_endpoint = os.environ["AZURE_AI_SERVICES_ENDPOINT"]
# This field is only necessary if you want to authenticate using a key to Azure AI Services
azure_ai_services_key = os.getenv("AZURE_AI_SERVICES_KEY", "")

# Deepest nesting level in markdown that should be considered. See https://learn.microsoft.com/azure/search/cognitive-search-skill-document-intelligence-layout to learn more
document_layout_depth = os.getenv("LAYOUT_MARKDOWN_HEADER_DEPTH", "h3")

## Connect to Blob Storage and load documents

Upload the sample PDFs to Blob Storage so that the indexer can retrieve them. The code below reads the sample documents from the data/benefitdocs folder.

In [None]:
from azure.storage.blob import BlobServiceClient  
import glob

def upload_sample_documents(
        blob_connection_string: str,
        blob_container_name: str,
        documents_directory: str,
        # Set to false if you want to use credentials included in the blob connection string
        # Otherwise your identity will be used as credentials
        use_user_identity: bool = True,
    ):
        # Connect to Blob Storage
        blob_service_client = BlobServiceClient.from_connection_string(logging_enable=True, conn_str=blob_connection_string, credential=DefaultAzureCredential() if use_user_identity else None)
        container_client = blob_service_client.get_container_client(blob_container_name)
        if not container_client.exists():
            container_client.create_container()

        pdf_files = glob.glob(os.path.join(documents_directory, '*.pdf'))
        for file in pdf_files:
            with open(file, "rb") as data:
                name = os.path.basename(file)
                if not container_client.get_blob_client(name).exists():
                    container_client.upload_blob(name=name, data=data)

upload_sample_documents(
    blob_connection_string=blob_connection_string,
    blob_container_name=blob_container_name,
    documents_directory = os.path.join("..", "..", "..", "..", "data", "benefitdocs")
    # documents_directory=r"your-local-path-to-sample-documents"
)

print(f"Setup sample data in {blob_container_name}")

Setup sample data in index-and-chat


## Create a blob data source connector on Azure AI Search

In [3]:
from azure.search.documents.indexes import SearchIndexerClient
from azure.search.documents.indexes.models import (
    SearchIndexerDataContainer,
    SearchIndexerDataSourceConnection
)
from azure.search.documents.indexes.models import NativeBlobSoftDeleteDeletionDetectionPolicy

# Create a data source 
indexer_client = SearchIndexerClient(endpoint, credential)
container = SearchIndexerDataContainer(name=blob_container_name)
data_source_connection = SearchIndexerDataSourceConnection(
    name=f"{index_namespace}-blob",
    type="azureblob",
    connection_string=search_blob_connection_string,
    container=container,
    data_deletion_detection_policy=NativeBlobSoftDeleteDeletionDetectionPolicy()
)
data_source = indexer_client.create_or_update_data_source_connection(data_source_connection)

print(f"Data source '{data_source.name}' created or updated")

Data source 'index-and-chat-blob' created or updated
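
If the search service connects to the storage account with its managed identity, `SEARCH_BLOB_DATASOURCE_CONNECTION_STRING` holds a resource ID instead of an account key. The sketch below shows the shape of that string. The function name `build_resource_id_connection_string` and the placeholder values are assumptions for illustration; substitute your own subscription, resource group, and storage account.

```python
def build_resource_id_connection_string(
    subscription_id: str,
    resource_group: str,
    storage_account: str,
) -> str:
    """Build a keyless data source connection string (hypothetical helper)."""
    return (
        f"ResourceId=/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account};"
    )


# Placeholder values; replace them with your own identifiers.
print(build_resource_id_connection_string(
    "<subscription-id>", "<resource-group>", "<storage-account>"
))
```

This form only works when the search service identity has been granted **Storage Blob Data Reader** on the storage account, as configured earlier.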


## Create search indexes

Vector and nonvector content is stored in search indexes.
This notebook creates two of them: one index for the chunks and one index for the parent markdown documents.

In [4]:
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SearchField,
    SearchFieldDataType,
    VectorSearch,
    HnswAlgorithmConfiguration,
    VectorSearchProfile,
    AzureOpenAIVectorizer,
    AzureOpenAIVectorizerParameters,
    SemanticConfiguration,
    SemanticSearch,
    SemanticPrioritizedFields,
    SemanticField,
    SearchIndex,
    BinaryQuantizationCompression
)

# Create a search index  
index_client = SearchIndexClient(endpoint=endpoint, credential=credential)  
child_index_fields = [  
    SearchField(name="parent_id", type=SearchFieldDataType.String, sortable=True, filterable=True, facetable=True),  
    SearchField(name="title", type=SearchFieldDataType.String),  
    SearchField(name="chunk_id", type=SearchFieldDataType.String, key=True, sortable=True, filterable=True, facetable=True, analyzer_name="keyword"),  
    SearchField(name="chunk", type=SearchFieldDataType.String, sortable=False, filterable=False, facetable=False),  
    SearchField(name="vector", type=SearchFieldDataType.Collection(SearchFieldDataType.Single), stored=False, vector_search_dimensions=azure_openai_model_dimensions, vector_search_profile_name="myHnswProfile"), 
    SearchField(name="header_1", type=SearchFieldDataType.String, sortable=False, filterable=False, facetable=False),
    SearchField(name="header_2", type=SearchFieldDataType.String, sortable=False, filterable=False, facetable=False),
    SearchField(name="header_3", type=SearchFieldDataType.String, sortable=False, filterable=False, facetable=False) 
]

parent_index_fields = [  
    SearchField(name="parent_id", type=SearchFieldDataType.String, key=True, sortable=True, filterable=True, facetable=True),  
    SearchField(name="title", type=SearchFieldDataType.String, searchable=True, filterable=True, sortable=False, facetable=True),  
    SearchField(name="content", type=SearchFieldDataType.String, searchable=True, filterable=False, sortable=False, facetable=False), 
    SearchField(name="metadata_storage_path", type=SearchFieldDataType.String, filterable=True, sortable=False, facetable=True)
]

  
# Configure the vector search configuration  
vector_search = VectorSearch(  
    algorithms=[  
        HnswAlgorithmConfiguration(name="myHnsw"),
    ],  
    profiles=[  
        VectorSearchProfile(  
            name="myHnswProfile",  
            algorithm_configuration_name="myHnsw",  
            vectorizer_name="myOpenAI",  
            compression_name="binaryQuantization"
        )
    ],  
    vectorizers=[  
        AzureOpenAIVectorizer(  
            vectorizer_name="myOpenAI",  
            kind="azureOpenAI",  
            parameters=AzureOpenAIVectorizerParameters(  
                resource_url=azure_openai_endpoint,  
                deployment_name=azure_openai_embedding_deployment,
                model_name=azure_openai_model_name,
                api_key=azure_openai_key,
            ),
        ),  
    ],
    compressions=[
        BinaryQuantizationCompression(compression_name="binaryQuantization")
    ]
)
  
semantic_config = SemanticConfiguration(  
    name="my-semantic-config",  
    prioritized_fields=SemanticPrioritizedFields(  
        content_fields=[SemanticField(field_name="chunk")],
        title_field=SemanticField(field_name="title")
    ),
)
  
# Create the semantic search with the configuration  
semantic_search = SemanticSearch(configurations=[semantic_config])  
  
# Create the search indexes
parent_index = SearchIndex(name=f"{index_namespace}-parent", fields=parent_index_fields)  
child_index = SearchIndex(name=f"{index_namespace}-child", fields=child_index_fields, vector_search=vector_search, semantic_search=semantic_search)
result = index_client.create_or_update_index(parent_index)  
print(f"{result.name} created")
result = index_client.create_or_update_index(child_index)  
print(f"{result.name} created")


index-and-chat-parent created
index-and-chat-child created
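
To get a feel for what the `binaryQuantization` compression in the vector profile changes, the following back-of-the-envelope sketch compares the raw payload of one vector at full precision and after binary quantization. It counts vector data only and ignores the HNSW graph and other index overhead. The service might also retain full-precision copies for rescoring depending on configuration, so treat the numbers as a rough guide rather than a storage estimate. The helper name `raw_vector_bytes` is introduced here for illustration.

```python
def raw_vector_bytes(dimensions: int, bits_per_dimension: int) -> int:
    """Size in bytes of one vector's raw payload, ignoring any index overhead."""
    return dimensions * bits_per_dimension // 8


dimensions = 3072  # default for AZURE_OPENAI_EMBEDDING_DIMENSIONS in this notebook
full_precision = raw_vector_bytes(dimensions, 32)  # Collection(Edm.Single) holds 32-bit floats
binary = raw_vector_bytes(dimensions, 1)           # binary quantization keeps 1 bit per dimension

print(f"float32: {full_precision} bytes per vector")
print(f"binary:  {binary} bytes per vector")
print(f"ratio:   {full_precision // binary}x")
```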


## Create a skillset

Skills drive integrated vectorization. [Document Layout](https://learn.microsoft.com/azure/search/cognitive-search-skill-document-intelligence-layout) converts each PDF to markdown sections. [Text Split](https://learn.microsoft.com/azure/search/cognitive-search-skill-textsplit) provides data chunking. Text Merge recombines the markdown sections into the full document content stored in the parent index. [AzureOpenAIEmbedding](https://learn.microsoft.com/azure/search/cognitive-search-skill-azure-openai-embedding) handles calls to Azure OpenAI, using the connection information you provide in the environment variables. An [index projection](https://learn.microsoft.com/azure/search/index-projections-concept-intro) specifies secondary indexes used for chunked data.
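
The skillset below chunks text with a maximum page length of 2000 and an overlap of 500. As a rough illustration of what overlapping chunks look like, here is a simplified sliding-window sketch. It is not how the Text Split skill is implemented: the skill also tries to break at sentence boundaries, whereas this hypothetical `split_with_overlap` function cuts at fixed character offsets.

```python
from typing import List


def split_with_overlap(text: str, max_length: int = 2000, overlap: int = 500) -> List[str]:
    """Cut text into fixed-size windows where each window repeats the tail of the previous one."""
    if overlap >= max_length:
        raise ValueError("overlap must be smaller than max_length")
    step = max_length - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_length])
        if start + max_length >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


# Synthetic 5000-character input, used only to show the window arithmetic.
sample = "".join(str(i % 10) for i in range(5000))
chunks = split_with_overlap(sample)
print([len(chunk) for chunk in chunks])
print(chunks[0][-500:] == chunks[1][:500])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of embedding some text twice.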

In [5]:
from azure.search.documents.indexes.models import (
    SplitSkill,
    InputFieldMappingEntry,
    OutputFieldMappingEntry,
    AzureOpenAIEmbeddingSkill,
    MergeSkill,
    SearchIndexerIndexProjection,
    SearchIndexerIndexProjectionSelector,
    SearchIndexerIndexProjectionsParameters,
    IndexProjectionMode,
    SearchIndexerSkillset,
    AIServicesAccountKey,
    AIServicesAccountIdentity,
    DocumentIntelligenceLayoutSkill
)

# Create a skillset name 
skillset_name = f"{index_namespace}-skillset"


layout_skill = DocumentIntelligenceLayoutSkill(
    description="Layout skill to read documents",
    context="/document",
    output_mode="oneToMany",
    markdown_header_depth=document_layout_depth,  # from LAYOUT_MARKDOWN_HEADER_DEPTH, defaults to "h3"
    inputs=[
        InputFieldMappingEntry(name="file_data", source="/document/file_data")
    ],
    outputs=[
        OutputFieldMappingEntry(name="markdown_document", target_name="markdownDocument")
    ]
)

split_skill = SplitSkill(  
    description="Split skill to chunk documents",  
    text_split_mode="pages",  
    context="/document/markdownDocument/*",  
    maximum_page_length=2000,  
    page_overlap_length=500,  
    inputs=[  
        InputFieldMappingEntry(name="text", source="/document/markdownDocument/*/content"),  
    ],  
    outputs=[  
        OutputFieldMappingEntry(name="textItems", target_name="pages")  
    ]
)

merge_skill = MergeSkill(
    description="Merge skill to get full document content",
    insert_pre_tag="",
    insert_post_tag="\n",
    context="/document",
    inputs=[
        InputFieldMappingEntry(name="itemsToInsert", source="/document/markdownDocument/*/content")
    ],
    outputs=[
        OutputFieldMappingEntry(name="mergedText", target_name="content")
    ]
)

embedding_skill = AzureOpenAIEmbeddingSkill(  
    description="Skill to generate embeddings via Azure OpenAI",  
    context="/document/markdownDocument/*/pages/*",  
    resource_url=azure_openai_endpoint,  
    deployment_name=azure_openai_embedding_deployment, 
    model_name=azure_openai_model_name,
    dimensions=azure_openai_model_dimensions,
    api_key=azure_openai_key,  
    inputs=[  
        InputFieldMappingEntry(name="text", source="/document/markdownDocument/*/pages/*"),  
    ],  
    outputs=[
        OutputFieldMappingEntry(name="embedding", target_name="vector")  
    ]
)

index_projections = SearchIndexerIndexProjection(  
    selectors=[  
        SearchIndexerIndexProjectionSelector(  
            target_index_name=child_index.name,  
            parent_key_field_name="parent_id",  
            source_context="/document/markdownDocument/*/pages/*",  
            mappings=[
                InputFieldMappingEntry(name="chunk", source="/document/markdownDocument/*/pages/*"),  
                InputFieldMappingEntry(name="title", source="/document/metadata_storage_name"),
                InputFieldMappingEntry(name="vector", source="/document/markdownDocument/*/pages/*/vector"),
                InputFieldMappingEntry(name="header_1", source="/document/markdownDocument/*/sections/h1"),
                InputFieldMappingEntry(name="header_2", source="/document/markdownDocument/*/sections/h2"),
                InputFieldMappingEntry(name="header_3", source="/document/markdownDocument/*/sections/h3"),
            ]
        )
    ],  
    parameters=SearchIndexerIndexProjectionsParameters(  
        projection_mode=IndexProjectionMode.INCLUDE_INDEXING_PARENT_DOCUMENTS  
    )  
)

skills = [layout_skill, split_skill, merge_skill, embedding_skill]

skillset = SearchIndexerSkillset(  
    name=skillset_name,  
    description="Skillset to chunk documents and generate embeddings",  
    skills=skills,  
    index_projection=index_projections,
    cognitive_services_account=AIServicesAccountKey(key=azure_ai_services_key, subdomain_url=azure_ai_services_endpoint) if azure_ai_services_key else AIServicesAccountIdentity(identity=None, subdomain_url=azure_ai_services_endpoint)
)

client = SearchIndexerClient(endpoint, credential)  
client.create_or_update_skillset(skillset)  
print(f"{skillset.name} created")  


index-and-chat-skillset created
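
The index projection in the skillset uses the source context `/document/markdownDocument/*/pages/*`, which means one child document per chunk of every markdown section. The toy sketch below mimics that fan-out with plain dictionaries. The document shape is a simplification made up for illustration (in the real enrichment tree, each page is a text node with the vector attached beneath it), and the service also generates the `chunk_id` key, which is omitted here.

```python
from typing import Any, Dict, List


def project_chunks(document: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten a toy enriched document into one child row per page (chunk)."""
    rows = []
    for section in document["markdownDocument"]:
        headers = section.get("sections", {})
        for page in section.get("pages", []):
            rows.append({
                "parent_id": document["parent_id"],
                "title": document["metadata_storage_name"],
                "chunk": page["text"],
                "vector": page["vector"],
                "header_1": headers.get("h1", ""),
                "header_2": headers.get("h2", ""),
                "header_3": headers.get("h3", ""),
            })
    return rows


# Hypothetical sample content; the vectors are shortened to two numbers.
toy_document = {
    "parent_id": "doc-1",
    "metadata_storage_name": "Benefit_Options.pdf",
    "markdownDocument": [
        {
            "sections": {"h1": "Plans", "h2": "Overview"},
            "pages": [
                {"text": "first chunk", "vector": [0.1, 0.2]},
                {"text": "second chunk", "vector": [0.3, 0.4]},
            ],
        },
        {
            "sections": {"h1": "Costs"},
            "pages": [{"text": "third chunk", "vector": [0.5, 0.6]}],
        },
    ],
}

rows = project_chunks(toy_document)
for row in rows:
    print(row["header_1"], "|", row["chunk"])
```

Every row carries the same `parent_id`, which is what links chunks in the child index back to their source document.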


## Create an indexer

In [6]:
from azure.search.documents.indexes.models import (
    SearchIndexer,
    IndexingParameters,
    IndexingParametersConfiguration,
    FieldMapping
)

# Create an indexer  
indexer_name = f"{index_namespace}-indexer"  

indexer_parameters = IndexingParameters(
    configuration=IndexingParametersConfiguration(
        allow_skillset_to_read_file_data=True,
        data_to_extract="storageMetadata",
        query_timeout=None))

indexer = SearchIndexer(  
    name=indexer_name,  
    description="Indexer to index documents and generate embeddings",  
    skillset_name=skillset_name,  
    target_index_name=parent_index.name,  
    data_source_name=data_source.name,
    parameters=indexer_parameters,
    field_mappings=[
        FieldMapping(source_field_name="metadata_storage_name", target_field_name="title"),
    ],
    output_field_mappings=[
        FieldMapping(source_field_name="/document/content", target_field_name="content"),
    ]
)  

indexer_client = SearchIndexerClient(endpoint, credential)  
indexer_result = indexer_client.create_or_update_indexer(indexer)  
  
# Run the indexer  
indexer_client.run_indexer(indexer_name)  
print(f' {indexer_name} is created and running. If queries return no results, please wait a bit and try again.')  


 index-and-chat-indexer is created and running. If queries return no results, please wait a bit and try again.
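
Rather than retrying queries until results appear, you can poll the indexer until it stops reporting that it is in progress. The following is a generic polling sketch, demonstrated with a fake status source so that it runs without Azure. The function `wait_while_in_progress` is a hypothetical helper. The commented usage assumes that `indexer_client.get_indexer_status(indexer_name).last_result.status` compares equal to the string `"inProgress"` while a run is active, and that `last_result` is populated, which might not be the case immediately after the first run is requested.

```python
import time
from typing import Callable


def wait_while_in_progress(
    get_status: Callable[[], str],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until it returns something other than 'inProgress'."""
    waited = 0.0
    status = get_status()
    while status == "inProgress":
        if waited >= timeout_seconds:
            raise TimeoutError(f"Still in progress after {timeout_seconds} seconds")
        sleep(poll_seconds)
        waited += poll_seconds
        status = get_status()
    return status


# Demonstration with a fake status source instead of a real indexer.
fake_statuses = iter(["inProgress", "inProgress", "success"])
final = wait_while_in_progress(lambda: next(fake_statuses), poll_seconds=0, sleep=lambda s: None)
print(final)

# Assumed usage against the service (not executed here):
# final = wait_while_in_progress(
#     lambda: indexer_client.get_indexer_status(indexer_name).last_result.status
# )
```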


### Chat with your data

The code below implements two strategies that you can use to chat with your data.

In [None]:
import asyncio
from typing import List
from azure.search.documents.aio import SearchClient
from azure.search.documents.models import VectorizableTextQuery
from openai import AsyncAzureOpenAI
from openai.types.chat import ChatCompletion, ChatCompletionSystemMessageParam, ChatCompletionUserMessageParam, ChatCompletionMessage, ChatCompletionMessageParam
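# Alias the async credential so it doesn't shadow the synchronous DefaultAzureCredential imported earlier
from azure.identity.aio import DefaultAzureCredential as AsyncDefaultAzureCredential, get_bearer_token_provider
from pydantic import BaseModel, Field
from typing import Optional

token_provider = get_bearer_token_provider(AsyncDefaultAzureCredential(), "https://cognitiveservices.azure.com/.default")
# The async search clients need an async token credential when no admin key is configured
search_credential = credential if isinstance(credential, AzureKeyCredential) else AsyncDefaultAzureCredential()
parent_index_client = SearchClient(endpoint=endpoint, index_name=parent_index.name, credential=search_credential)
child_index_client = SearchClient(endpoint=endpoint, index_name=child_index.name, credential=search_credential)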

client = AsyncAzureOpenAI(
    api_version=azure_openai_api_version,
    azure_endpoint=azure_openai_endpoint,
    api_key=azure_openai_key,
    azure_ad_token_provider=token_provider if not azure_openai_key else None
)

# This code can be customized to extract different entities from the query based on your requirements.
# See https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/structured-outputs for more information
# NOTE: Updating the tool definition with specific examples related to your data will help improve the accuracy.
class ExtractTitles(BaseModel):
    """Extracts titles from a query to use in a search filter."""
    titles: Optional[List[str]] = Field(..., description="List of titles extracted from the query. Complete file names are considered titles. For example, in the query 'Find the report on sales and the summary of the meeting using myreport.pdf', the titles would be ['myreport.pdf']. If no titles are found, return an empty list.")

async def extract_titles(query: str) -> List[str]:
   response: ChatCompletion = await client.beta.chat.completions.parse(
      model=azure_openai_chat_deployment,
      messages=[
         ChatCompletionSystemMessageParam(role="system", content="You are a helpful assistant that extracts titles from user queries."),
         ChatCompletionUserMessageParam(role="user", content=f"Extract the titles from the following query: '{query}'"),
      ],
      response_format=ExtractTitles
   )

   return response.choices[0].message.parsed.titles

async def answer_query_documents(query: str, chat_history: Optional[List[ChatCompletionMessageParam]] = None, include_parent_documents: bool = True, include_child_documents: bool = True) -> List[ChatCompletionMessageParam]:
   if not include_parent_documents and not include_child_documents:
      raise ValueError("At least one of include_parent_documents or include_child_documents must be True.")

   formatted_results = ""
   titles = []

   if include_parent_documents:
      # Step 1: Extract titles from the query
      titles = await extract_titles(query)

      # Step 2: If we found titles, include them in the query of the parent index
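      if titles:
         # Double any single quotes so that each title is a valid OData string literal
         escaped_titles = [title.replace("'", "''") for title in titles]
         results = await parent_index_client.search(
            filter=" or ".join([f"title eq '{title}'" for title in escaped_titles]),  # Filter by titles, must be exact match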
            top=len(titles),  # Limit to top results
            select=["title", "content"])
         formatted_results = "\n".join([f"{result['title']}\n{result['content']}" async for result in results])


   if len(formatted_results) == 0 and include_child_documents:
      # If no titles were found or no results were returned, search the child index with a vectorized query
      results = await child_index_client.search(
         search_text=query,
         vector_queries=[VectorizableTextQuery(text=query, k_nearest_neighbors=50, fields="vector")],  # Use vector search with k nearest neighbors
         query_type="semantic",
         semantic_configuration_name=child_index.semantic_search.configurations[0].name,  # Use the semantic configuration created earlier
         top=5,  # Limit to top 5 results
         select=["title", "chunk"]
      )

      # Format the results from the child index
      formatted_results = "\n".join([f"{result['title']}\n{result['chunk']}" async for result in results])

   assistant_system_message = "You are a helpful assistant that answers queries. You do not have access to the internet, but you can use documents in the chat history to answer the question. If the documents do not contain the answer, say 'I don't know'. You must cite your answer with the titles of the documents used. If you are unsure, say 'I don't know'."
   query_message = f"Answer the following query: {query}\nRelevant documents: {formatted_results}"
   messages = chat_history + [ ChatCompletionUserMessageParam(role="user", content=query_message) ] if chat_history else [
      ChatCompletionSystemMessageParam(role="system", content=assistant_system_message),
      ChatCompletionUserMessageParam(role="user", content=query_message),
   ] 
   
   response: ChatCompletion = await client.chat.completions.create(
      model=azure_openai_chat_deployment,
      messages=messages
   )

   message: ChatCompletionMessage = response.choices[0].message
   if titles:
      message.content += f"\nTitles used for the answer: {', '.join(titles)}"

   return messages + [message]


def get_last_answer(chat_history: List[ChatCompletionMessageParam]) -> Optional[str]:
    """Returns the last assistant message from the chat history, or None if there isn't one."""
    if chat_history and chat_history[-1].role == "assistant":
        return chat_history[-1].content
    
    return None


In [None]:
async def run_query(messages: List[str]) -> List[str]:
    chat_history = None
    answers = []
    for message in messages:
        chat_history = await answer_query_documents(message, chat_history)
        answers.append(get_last_answer(chat_history))

    return answers

In [None]:
# Each inner list is a group of queries that are asked in order and share one chat history.
queries = [
    ["Put your test cases here"]
]

In [None]:
await extract_titles(queries[0][0])

In [None]:
import pandas as pd

# Create a semaphore that allows up to 3 concurrent run_query calls.
semaphore = asyncio.Semaphore(3)

rows = []
tasks = []
async def task(i: int, query_group: List[str]):
    """Run one group of queries and return one result row per query."""
    async with semaphore:
        answers = await run_query(query_group)
    return [{
        # Label rows as "<group>.<position>" so test cases from different groups don't share a number
        "Test Case": f"{i + 1}.{j + 1}",
        "Query": query.strip(),
        "Answer": answer.strip() if answer else "No answer provided"
    } for j, (query, answer) in enumerate(zip(query_group, answers))]

for i, query_group in enumerate(queries):
    t = task(i, query_group)
    tasks.append(t)

# Collect results from all tasks
results = await asyncio.gather(*tasks)
for result in results:
    for row in result:
        rows.append(row)

df = pd.DataFrame(rows)
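
The semaphore in the cell above caps how many query groups run at the same time. The standalone sketch below shows the effect with simulated work in place of real calls. The names `run_limited` and `worker` are made up for this illustration, and `asyncio.sleep` stands in for the search and chat requests.

```python
import asyncio
from typing import List


async def run_limited(delays: List[float], limit: int = 3) -> int:
    """Run one simulated task per delay and return the peak number running at once."""
    semaphore = asyncio.Semaphore(limit)
    running = 0
    peak = 0

    async def worker(delay: float) -> None:
        nonlocal running, peak
        async with semaphore:  # waits here while `limit` workers are already inside
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(delay)  # stand-in for a network call
            running -= 1

    await asyncio.gather(*(worker(delay) for delay in delays))
    return peak


# In a notebook cell, use `await run_limited(...)` instead, because a loop is already running.
peak = asyncio.run(run_limited([0.01] * 10))
print(f"Peak concurrency: {peak}")
```

A low limit helps keep the run under the rate limits of the chat deployment. Raising it speeds up large test sets if your quota allows.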

In [None]:
with open("results_final.txt", "w", encoding="utf-8") as outfile:
    for _, row in df.iterrows():
        test_case = row.get("Test Case", "")
        question = row.get("Query", "")
        answer = row.get("Answer", "")
        outfile.write(f"test case {test_case}\n")
        outfile.write(f"{question}\n")
        outfile.write(f"{answer}\n\n")

#### Different Strategies

1. If you include the title of a PDF in your question, the search is automatically filtered to only include those PDFs. You can disable this behavior by passing `include_parent_documents=False` to `answer_query_documents`.
2. If you don't include any titles, or if the title filter returns no documents, normal chunk search is used. You can disable this behavior by passing `include_child_documents=False` to `answer_query_documents`.
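
For the first strategy, the extracted titles become an OData filter on the parent index. The sketch below shows the filter string produced for a list of titles, with single quotes doubled so that each title remains a valid string literal. The helper name `build_title_filter` and the second file name are hypothetical and used only to show the escaping.

```python
from typing import List


def build_title_filter(titles: List[str]) -> str:
    """Build an OData filter that matches any of the given titles exactly."""
    # OData string literals escape a single quote by doubling it.
    escaped = [title.replace("'", "''") for title in titles]
    return " or ".join(f"title eq '{title}'" for title in escaped)


print(build_title_filter(["Benefit_Options.pdf", "Employee's_Handbook.pdf"]))
```

Because the filter uses `eq`, a title only matches when the model extracts the complete file name, including its extension.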

In [8]:
chat_history = await answer_query_documents("Use Benefit_Options.pdf. What are the health insurance policies?")
get_last_answer(chat_history)


'The health insurance policies offered are:\n\n1. **Northwind Health Plus**: A comprehensive plan covering medical, vision, and dental services, prescription drug coverage (including generic, brand-name, and specialty drugs), mental health and substance abuse, preventive care, emergency services (both in-network and out-of-network), and more. It also includes routine physicals, well-child visits, immunizations, vision exams, glasses, contact lenses, dental exams, cleanings, fillings, hospital stays, doctor visits, lab tests, and X-rays.\n\n2. **Northwind Standard**: A basic plan covering medical, vision, and dental services, prescription drug coverage (only generic and brand-name drugs), and preventive care. It does not cover emergency services, mental health and substance abuse, or out-of-network services. It includes routine physicals, well-child visits, immunizations, vision exams, glasses, doctor visits, and lab tests.\n\nThese plans have different costs deducted from each paycheck

In [9]:
chat_history = await answer_query_documents("What are the benefits of the new health insurance policy?")
get_last_answer(chat_history)

'The new health insurance policy, Northwind Health Plus, provides comprehensive coverage for medical, vision, and dental services. Its benefits include:\n\n1. **Prescription Drug Coverage**: Northwind Health Plus covers a wide range of prescription drugs, including generic, brand-name, and specialty drugs.\n\n2. **Preventive Care**: This plan covers routine physicals, well-child visits, immunizations, mammograms, colonoscopies, and other cancer screenings.\n\n3. **Mental Health and Substance Abuse Coverage**: The plan includes coverage for mental health and substance abuse services.\n\n4. **Emergency Services**: Coverage is provided for both in-network and out-of-network emergency services.\n\n5. **Vision and Dental Services**: Northwind Health Plus covers vision exams, glasses, contact lenses, dental exams, cleanings, and fillings.\n\n6. **Out-of-Network Services**: Unlike Northwind Standard, Northwind Health Plus includes coverage for out-of-network services.\n\nThese benefits make N