### 1. Overview
Advanced Search Techniques with Azure AI Search: Keyword, Vector, and Hybrid Methods

This notebook demonstrates how to perform different types of searches using Azure AI Search, including keyword search, vector search, hybrid search, semantic ranking, and query rewriting.

### 2. Set Up Environment Variables
As in Journey 1, create a `.env` file in the same directory as this notebook and set the required variables.
The `.env.sample` file lists the variables that are needed.

Once the file is in place, the notebook loads these values with `python-dotenv`.
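
Before running the remaining cells, it can help to confirm that the variables are actually set. The sketch below is a minimal check, assuming the variable names used later in this notebook: `AZURE_SEARCH_ENDPOINT` and `AZURE_SEARCH_INDEX_NAME` are treated as required, while `AZURE_SEARCH_ADMIN_KEY` is optional because the notebook falls back to `DefaultAzureCredential`. The helper name `find_missing_variables` is hypothetical.

```python
import os

# Variable names taken from the cells in this notebook.
REQUIRED_VARIABLES = ["AZURE_SEARCH_ENDPOINT", "AZURE_SEARCH_INDEX_NAME"]


def find_missing_variables(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


missing = find_missing_variables(REQUIRED_VARIABLES)
if missing:
    print("Missing variables:", ", ".join(missing))
else:
    print("All required variables are set.")
```

Run it after `load_dotenv(...)` so that values from the `.env` file are included in the check.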

### 3. Load Environment Variables

Run the following cell to load the environment variables from the `.env` file and create the credential:

In [3]:
import os
from azure.core.credentials import AzureKeyCredential
from azure.identity import DefaultAzureCredential
from dotenv import load_dotenv

load_dotenv(override=True) # take environment variables from .env.

endpoint = os.environ["AZURE_SEARCH_ENDPOINT"]
index_name = os.environ["AZURE_SEARCH_INDEX_NAME"]
# Use key-based authentication if an admin key is set; otherwise fall back to DefaultAzureCredential.
admin_key = os.getenv("AZURE_SEARCH_ADMIN_KEY")
credential = AzureKeyCredential(admin_key) if admin_key else DefaultAzureCredential()

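If this cell fails with `ModuleNotFoundError: No module named 'azure'`, the Azure SDK packages are not installed in the notebook's kernel. Install them (for example with `pip install azure-search-documents azure-identity python-dotenv pandas`) and re-run the cell.

This ensures that the endpoint, index name, and credential are available before the API client is created.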

### 4. Set Up API Client and Define the Display Function

Initialize the `SearchClient` used to query the Azure AI Search index, then define a helper function that formats the results as a table so they are easier to read:

In [6]:
from azure.search.documents import SearchClient
import pandas as pd

search_client = SearchClient(endpoint, index_name, credential)

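def display_results(results):
    # Flatten the result documents into a DataFrame and drop columns that are entirely empty.
    df = pd.json_normalize(list(results)).dropna(axis=1, how='all')
    # Truncate long chunks so the table stays readable.
    df["chunk"] = df["chunk"].apply(lambda c: c[:300] + '...' if len(c) > 300 else c)
    # Show title, chunk, and score first, followed by any remaining columns.
    first_cols = ['title', 'chunk', '@search.score']
    df = df[first_cols + [col for col in df.columns if col not in first_cols]]

    # Return a styled table with wrapped, left-aligned text and no index column.
    return df.style.set_properties(**{
        'max-width': '500px',
        'text-align': 'left',
        'white-space': 'normal',
        'word-wrap': 'break-word'
    }).hide(axis="index")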
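
For environments where pandas styling is not available (for example, a plain terminal), the same truncation and column ordering can be approximated with the standard library. This is a minimal sketch, not part of the original notebook: the function names are hypothetical, and it assumes each result behaves like a dictionary with `title`, `chunk`, and `@search.score` keys.

```python
def truncate(text, limit=300):
    """Shorten text to `limit` characters, appending '...' when it was cut."""
    return text[:limit] + "..." if len(text) > limit else text


def format_results_plain(results, limit=300):
    """Render search hits (dict-like rows) as plain text, one line per hit."""
    lines = []
    for hit in results:
        score = hit.get("@search.score")
        score_text = f"{score:.6f}" if score is not None else "n/a"
        chunk = truncate(hit.get("chunk", ""), limit)
        lines.append(f"{score_text}  {hit.get('title', '')}  {chunk}")
    return "\n".join(lines)


# Illustrative row only; real rows come from search_client.search(...).
sample = [{"title": "a.pdf", "chunk": "x" * 305, "@search.score": 1.5}]
print(format_results_plain(sample))
```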


### 5. Perform Different Search Methods

#### Keyword Search

Execute a traditional keyword-based search:

In [7]:
results = search_client.search(search_text="What is Contoso", top=5, select=["title", "chunk"])

display_results(results)


| title | chunk | @search.score |
| --- | --- | --- |
| Northwind_Health_Plus_Benefits_Details.pdf | the tips outlined above, you can help ensure that your request for services or treatments is approved in a timely manner and that you are receiving the most appropriate care. The Group And You OTHER INFORMATION ABOUT THIS PLAN The Group and You The Northwind Health Plus plan is a gro... | 4.684504 |
| Northwind_Standard_Benefits_Details.pdf | At Contoso, we understand that medical costs can be intimidating and confusing, which is why we’ve partnered with Northwind Health to offer our employees the Northwind Standard plan. This plan provides a balance billing protection, meaning that you are protected from unexpected costs when visi... | 4.370165 |
| Northwind_Standard_Benefits_Details.pdf | providers that are not available from participating providers. Additionally, in some cases, the health plan may cover non-participating providers’ charges if there are no participating providers in your area. Tips In order to avoid costly balance billing amounts, it is important to make sure... | 3.893067 |
| Northwind_Health_Plus_Benefits_Details.pdf | about your health care. With the Northwind Health Plus Plan, you can take advantage of the coverage provided for these services and get the treatment you need. Substance Use Disorder Substance Use Disorder Coverage At Contoso, we are proud to offer our employees Northwind Health Plus, an i... | 3.590473 |
| Northwind_Standard_Benefits_Details.pdf | • Understand any restrictions associated with any government-sponsored programs you may be enrolled in. • Your Northwind Standard plan does not cover certain services, such as emergency care, mental health and substance abuse coverage, or out-of-network services. Be sure to explore alternat... | 3.263199 |


#### Vector Search

Retrieve documents using vector similarity search:

In [24]:
from azure.search.documents.models import VectorizableTextQuery

results = search_client.search(
    vector_queries=[VectorizableTextQuery(text="What is Contoso", k_nearest_neighbors=50, fields="text_vector")],
    top=5,
    select=["title", "chunk"]
)

display_results(results)

| title | chunk | @search.score |
| --- | --- | --- |
| employee_handbook.pdf | Contoso Electronics Employee Handbook This document contains information generated using a language model (Azure OpenAI). The information contained in this document is only for demonstration purposes and does not reflect the opinions or beliefs of Microsoft. Microsoft ... | 0.834683 |
| employee_handbook.pdf | as our partners and vendors. We may use this information to better understand our customers and improve our services. Contoso Electronics will not sell or rent your personal information to any third parties. Data Security and Protection Contoso Electronics is committed to protecting... | 0.830409 |
| employee_handbook.pdf | will respond promptly and appropriately. All incidents will be thoroughly investigated and the appropriate disciplinary action will be taken. Training and Education Contoso Electronics will provide regular training and education to all employees on workplace violence prevention and r... | 0.82996 |
| Benefit_Options.pdf | Contoso Electronics Plan and Benefit Packages This document contains information generated using a language model (Azure OpenAI). The information contained in this document is only for demonstration purposes and does not reflect the opinions or beliefs of Microsoft. Microsoft makes no represen... | 0.811813 |
| employee_handbook.pdf | aerospace industry, providing advanced electronic components for both commercial and military aircraft. We specialize in creating cutting- edge systems that are both reliable and efficient. Our mission is to provide the highest quality aircraft components to our customers, while maintaining a c... | 0.807292 |
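
Conceptually, vector search embeds the query text and returns the documents whose stored vectors are closest to the query vector. The sketch below shows that idea with brute-force cosine similarity over toy three-dimensional vectors. It is an illustration only: the vectors and document names are made up, a real index typically uses an approximate nearest-neighbor algorithm instead of a full scan, and the similarity metric configured on the `text_vector` field is an assumption here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query_vector, documents, k):
    """Return the k (name, similarity) pairs most similar to the query, best first."""
    scored = [(name, cosine_similarity(query_vector, vec)) for name, vec in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy embeddings (hypothetical values, not produced by a real model).
toy_documents = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [1.0, 1.0, 0.0],
}
top = nearest_neighbors([1.0, 0.1, 0.0], toy_documents, k=2)
print(top)
```

In the query above, `k_nearest_neighbors=50` plays the role of `k`, and `top=5` then limits how many of those candidates are returned.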


#### Hybrid Search (Keyword + Vector Search)

Combine keyword and vector search in a single request; the service merges the two result lists into one ranking:

In [23]:
results = search_client.search(
    search_text="What is Contoso",
    vector_queries=[VectorizableTextQuery(text="What is Contoso", k_nearest_neighbors=50, fields="text_vector")],
    top=5,
    select=["title", "chunk"]
)

display_results(results)

| title | chunk | @search.score |
| --- | --- | --- |
| employee_handbook.pdf | Contoso Electronics Employee Handbook This document contains information generated using a language model (Azure OpenAI). The information contained in this document is only for demonstration purposes and does not reflect the opinions or beliefs of Microsoft. Microsoft ... | 0.031592 |
| employee_handbook.pdf | will respond promptly and appropriately. All incidents will be thoroughly investigated and the appropriate disciplinary action will be taken. Training and Education Contoso Electronics will provide regular training and education to all employees on workplace violence prevention and r... | 0.030214 |
| Northwind_Health_Plus_Benefits_Details.pdf | the tips outlined above, you can help ensure that your request for services or treatments is approved in a timely manner and that you are receiving the most appropriate care. The Group And You OTHER INFORMATION ABOUT THIS PLAN The Group and You The Northwind Health Plus plan is a gro... | 0.03018 |
| employee_handbook.pdf | as our partners and vendors. We may use this information to better understand our customers and improve our services. Contoso Electronics will not sell or rent your personal information to any third parties. Data Security and Protection Contoso Electronics is committed to protecting... | 0.02938 |
| Northwind_Standard_Benefits_Details.pdf | At Contoso, we understand that medical costs can be intimidating and confusing, which is why we’ve partnered with Northwind Health to offer our employees the Northwind Standard plan. This plan provides a balance billing protection, meaning that you are protected from unexpected costs when visi... | 0.028893 |
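
The hybrid scores are much smaller than the keyword or vector scores because Azure AI Search merges the two ranked lists with Reciprocal Rank Fusion (RRF) rather than adding raw scores. The sketch below illustrates the idea. The constant of 60 is a commonly used default and an assumption here, the document ids are made up, and the function name is hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list of (id, score), best first.

    Each list contributes 1 / (k + rank) for every id it contains (rank starts at 1).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


keyword_ranking = ["doc_a", "doc_b", "doc_c"]  # hypothetical keyword results
vector_ranking = ["doc_c", "doc_a", "doc_d"]   # hypothetical vector results
fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.6f}")
```

With this constant, a document ranked first in both lists would score 2/61, roughly 0.033, which is consistent with the magnitude of the hybrid scores in the output.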


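#### Hybrid Search + Semantic Ranker

Rerank the hybrid results with the semantic ranker. The cell below sets the semantic configuration name, which is assumed to be the index name followed by `-semantic-configuration`:

In [None]:
# The semantic configuration name is expected to be the index name + "-semantic-configuration".
# If the search below fails, verify the name of the semantic configuration on your index.
semantic_configuration_name = index_name + "-semantic-configuration"

Then run the hybrid query with semantic ranking enabled: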

In [None]:
results = search_client.search(
    search_text="What is Contoso",
    vector_queries=[VectorizableTextQuery(text="What is Contoso", k_nearest_neighbors=50, fields="text_vector")],
    top=5,
    select=["title", "chunk"],
    query_type="semantic",
    semantic_configuration_name=semantic_configuration_name
)

display_results(results)

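If the semantic configuration name does not match a configuration defined on the index, the request fails with an error like the following (here the index is named `ragtime2`):

HttpResponseError: (InvalidRequestParameter) Unknown semantic configuration 'ragtime2-semantic-configuration'.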
Parameter name: semanticConfiguration
Code: InvalidRequestParameter
Message: Unknown semantic configuration 'ragtime2-semantic-configuration'.
Parameter name: semanticConfiguration
Exception Details:	(UnknownSemanticConfiguration) Unknown semantic configuration 'ragtime2-semantic-configuration'.
	Code: UnknownSemanticConfiguration
	Message: Unknown semantic configuration 'ragtime2-semantic-configuration'.
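
One way to resolve an unknown semantic configuration is to look up the names that actually exist on the index. The sketch below assumes the `azure-search-documents` package is installed, that `endpoint`, `index_name`, and `credential` are defined as in the earlier cells, and that the credential is allowed to read index definitions. Both helper names are hypothetical.

```python
def semantic_configuration_names(index):
    """Return the names of the semantic configurations defined on an index object."""
    semantic_search = getattr(index, "semantic_search", None)
    configurations = getattr(semantic_search, "configurations", None) or []
    return [configuration.name for configuration in configurations]


def fetch_semantic_configuration_names(endpoint, index_name, credential):
    """Fetch the index definition from the service and list its semantic configurations."""
    # Imported here so the helper above can be used without the Azure SDK installed.
    from azure.search.documents.indexes import SearchIndexClient

    index_client = SearchIndexClient(endpoint, credential)
    return semantic_configuration_names(index_client.get_index(index_name))


# Usage in the notebook (requires a reachable search service):
# print(fetch_semantic_configuration_names(endpoint, index_name, credential))
```

If the printed list does not contain the expected name, assign one of the listed names to `semantic_configuration_name` and re-run the query.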

#### Hybrid Search + Semantic Ranker + Query Rewriting

Add generative query rewriting on top of semantic ranking to improve relevance.

**Note**: Query rewriting is currently in public preview and is only available for search services on the Basic tier or higher in the **North Europe** or **Southeast Asia** regions.
See the [query rewriting documentation](https://learn.microsoft.com/en-us/azure/search/semantic-how-to-query-rewrite) for details.

In [None]:
# results = search_client.search(
#     search_text="What is Contoso",
#     vector_queries=[VectorizableTextQuery(text="What is Contoso", k_nearest_neighbors=50, fields="text_vector")],
#     top=5,
#     select=["title", "chunk"],
#     query_type="semantic",
#     semantic_configuration_name=semantic_configuration_name,
#     query_rewrites="generative",
#     query_language="en"
# )

# display_results(results)

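The cell above is commented out. Running it with the SDK version installed here raised the following error, which most likely means that this version of `azure-search-documents` does not accept the `query_rewrites` keyword argument and that a newer (possibly preview) release is needed:

TypeError: Session.request() got an unexpected keyword argument 'query_rewrites'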
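
When diagnosing a `TypeError` like this one, a useful first step is to check which SDK version is installed. The sketch below uses only the standard library. Which release first supports query rewriting is not stated in this notebook, so compare the printed version against the documentation linked in the note. The helper name is hypothetical.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names of the packages imported in this notebook.
for package in ("azure-search-documents", "azure-identity", "python-dotenv", "pandas"):
    print(package, installed_version(package) or "not installed")
```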

### 6. Challenge
Look at the data in the search index and consider how users might phrase their questions, and which search method would retrieve the most relevant chunks for each question.

1. Review the content of PerksPlus.pdf.
2. Formulate two questions that users might ask about this content.
3. Make an assumption about which search method will perform better (focus on keyword search vs. vector search).
4. Test the assumption by executing both searches and comparing the retrieved results.



In [None]:
question = "..."
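
To compare the two methods for the challenge, it can help to look at which documents each one returns and where they overlap. The sketch below works on plain lists of titles so it can be tried without a search service. The sample titles and the function name are hypothetical; in the notebook, the lists would come from the `title` field of each result set.

```python
def compare_rankings(keyword_titles, vector_titles):
    """Summarize the overlap between two lists of result titles."""
    keyword_set, vector_set = set(keyword_titles), set(vector_titles)
    return {
        "shared": sorted(keyword_set & vector_set),
        "keyword_only": sorted(keyword_set - vector_set),
        "vector_only": sorted(vector_set - keyword_set),
    }


# Hypothetical titles, for illustration only.
keyword_titles = ["PerksPlus.pdf", "employee_handbook.pdf"]
vector_titles = ["PerksPlus.pdf", "Benefit_Options.pdf"]
print(compare_rankings(keyword_titles, vector_titles))

# In the notebook, a list could be built like this:
# keyword_titles = [r["title"] for r in search_client.search(search_text=question, top=5, select=["title"])]
```

Because several chunks can share one title, this comparison is at the document level; inspecting the chunks with `display_results` remains the better way to judge relevance.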


## Troubleshooting

- **Environment Variables Not Loaded:** Ensure the `.env` file is set up correctly, or export the variables manually in your terminal before starting the notebook.
- **Authentication Issues:** If you authenticate with `DefaultAzureCredential` (for example, with a managed identity), make sure the identity has the required role assignments on the search service.
- **Search Results Are Empty:** Ensure your Azure AI Search index contains documents and, for vector and hybrid search, that the `text_vector` field is populated.
- **Query Rewriting Issues:** Ensure your search service supports semantic configurations and generative query rewrites, and that the semantic configuration name matches one defined on the index.

## Summary

This notebook demonstrates different search techniques using Azure AI Search, including keyword search, vector search, hybrid search, semantic ranking, and query rewriting. The approach enhances search accuracy by leveraging vector embeddings and semantic understanding to retrieve the most relevant documents.

