# 3. Azure AI Search: Advanced Search Techniques  
  
In this notebook, we will demonstrate various ways to search the newly created and populated Azure AI Search index. We will explore different search techniques including keyword search, vector search, hybrid search (combining keyword and vector search), hybrid search with a semantic ranker, and filtered search. These techniques will help you leverage the full capabilities of Azure AI Search to retrieve relevant information from your indexed data.  

# 3.1 Import Libraries and Load Environment Variables

In [30]:
# Import necessary libraries  
from azure.core.credentials import AzureKeyCredential  
from dotenv import load_dotenv  
from openai import AzureOpenAI  
import os  
import json  
  
# Load environment variables from .env file  
load_dotenv()  
  
# Get the service name and admin key from environment variables  
service_name = os.getenv('AZURE_SERVICE_NAME')  
admin_key = os.getenv('AZURE_ADMIN_KEY')  
  
# Get the Azure OpenAI API details from environment variables  
azure_openai_endpoint = os.getenv('AZURE_OPENAI_ENDPOINT')
azure_openai_key = os.getenv('AZURE_OPENAI_KEY')
azure_openai_embedding_model = os.getenv('AZURE_OPENAI_EMBEDDING_MODEL_NAME')
azure_openai_embedding_deployment = os.getenv('AZURE_OPENAI_EMBEDDING_DEPLOYMENT')
azure_openai_api_version = os.getenv('AZURE_OPENAI_API_VERSION')
  
# Use the service name and admin key as before  
endpoint = f"https://{service_name}.search.windows.net"  
credential = AzureKeyCredential(admin_key)  


# 3.2 Initialize Azure AI Search and Azure OpenAI Clients

In [31]:
# Import the SearchClient from Azure SDK  
from azure.search.documents import SearchClient  
from azure.search.documents.models import VectorizedQuery  
  
# Initialize the SearchClient  
index_name = "example-index"  
search_client = SearchClient(endpoint=endpoint, index_name=index_name, credential=credential)  
  
# Initialize the Azure OpenAI Client  
client = AzureOpenAI(  
    azure_deployment=azure_openai_embedding_deployment,  
    api_version=azure_openai_api_version,  
    azure_endpoint=azure_openai_endpoint,  
    api_key=azure_openai_key  
)  


## 3.3 Function to Generate Embeddings

In [32]:
# Function to generate embeddings using Azure OpenAI API  
def get_embedding(text, client):  
    response = client.embeddings.create(input=text, model=azure_openai_embedding_model)  
    return response.data[0].embedding


## 3.4 Keyword Search

In [33]:
# Perform a keyword search  
search_text = "half-past Nine"  
results = search_client.search(search_text=search_text, top=5)  
  
print("Keyword Search Results:")  
for result in results:  
    truncated_content = result['content'][:200] + "..." if len(result['content']) > 200 else result['content']  
    truncated_content = truncated_content.replace("\n", " ")
    print(f"ID: {result['id']}, Title: {result['title']}, Score: {result['@search.score']}, Content: {truncated_content}")  


Keyword Search Results:
ID: 38994724-b8e0-4261-9b2a-445c1aabc5a1, Title: House of Commons 2024-05-24, Score: 15.526085, Content: House of Commons  Friday 24 May 2024  The House met at half-past Nine o’clock
ID: ab985f7a-a3bd-4b26-b3bf-08e54eafe7fd, Title:  Valedictory Debate 2024-05-24, Score: 8.8596945, Content: It has been the honour of my life to serve as Colchester’s Member of Parliament for the past nine years, the last four and a half of which I have been a Government Minister at the Department for Work ...
ID: 20e95ff9-9069-4e5a-91bf-982067148ffe, Title:  Valedictory Debate 2024-05-24, Score: 7.8185525, Content: I put on record my thanks to my constituents, who have been a source of comfort, support and enlightenment over the past nine years. I also thank my wonderful staff, past and present, who have been un...
ID: b050cedc-e47d-46db-8560-521252efe446, Title:  Valedictory Debate 2024-05-24, Score: 5.9735227, Content: 31,000 emails or pieces of casework I have received over the 

## 3.5 Vector Search

In [34]:
# Perform a vector search  
query = "Why is the UK revoking the Burundi sanctions regime?"  
embedding = get_embedding(query, client)  
vector_query = VectorizedQuery(vector=embedding, k_nearest_neighbors=3, fields="contentVector")  
  
results = search_client.search(
    search_text=None,  
    vector_queries= [vector_query]
)  
  
print("Vector Search Results:")  
for result in results:  
    truncated_content = result['content'][:200] + "..." if len(result['content']) > 200 else result['content']  
    truncated_content = truncated_content.replace("\n", " ")
    print(f"ID: {result['id']}, Title: {result['title']}, Score: {result['@search.score']}, Content: {truncated_content}")  


Vector Search Results:
ID: c39349af-f0fa-4740-b247-63e63f8b95b0, Title:  Sanctions 2024-05-24, Score: 0.92085105, Content: Finally, we are also revoking the Burundi sanctions regime. That will remove an empty regime from the statute books. The decision in 2019 not to transpose into UK law designations under the original 2...
ID: 3b3a71b7-f76f-4d3a-b7d7-06fa8e7368f2, Title:  Sanctions 2024-05-24, Score: 0.8837239, Content: UK financial sanctions.
ID: eea57c89-238f-49b4-b30a-78c0a2b257a2, Title:  Sanctions 2024-05-24, Score: 0.86044437, Content: sanctions strategy, the Government keep their regimes under review and respond to changing circumstances. We are committed to lifting a regime out of a specific measure or revoking a designation when ...


## 3.6 Hybrid Search (Keyword and Vector)

In [35]:
# Perform a hybrid search (combining keyword and vector search)  
query = "Why is the UK revoking the Burundi sanctions regime?"  
embedding = get_embedding(query, client)  
vector_query = VectorizedQuery(vector=embedding, k_nearest_neighbors=3, fields="contentVector")  
  
results = search_client.search(
    search_text=query,  
    vector_queries=[vector_query],
    top=3
)  
  
print("Vector Search Results:")  
for result in results:  
    truncated_content = result['content'][:200] + "..." if len(result['content']) > 200 else result['content']  
    truncated_content = truncated_content.replace("\n", " ")
    print(f"ID: {result['id']}, Title: {result['title']}, Chunk {result['chunk_id']},Score: {result['@search.score']}, Content: {truncated_content}")  


Vector Search Results:
ID: c39349af-f0fa-4740-b247-63e63f8b95b0, Title:  Sanctions 2024-05-24, Chunk 31,Score: 0.03333333507180214, Content: Finally, we are also revoking the Burundi sanctions regime. That will remove an empty regime from the statute books. The decision in 2019 not to transpose into UK law designations under the original 2...
ID: eea57c89-238f-49b4-b30a-78c0a2b257a2, Title:  Sanctions 2024-05-24, Chunk 32,Score: 0.032258063554763794, Content: sanctions strategy, the Government keep their regimes under review and respond to changing circumstances. We are committed to lifting a regime out of a specific measure or revoking a designation when ...
ID: 3b3a71b7-f76f-4d3a-b7d7-06fa8e7368f2, Title:  Sanctions 2024-05-24, Chunk 26,Score: 0.03109932318329811, Content: UK financial sanctions.


## 3.7 Hybrid Search with Semantic Ranker

In [36]:
# Perform a hybrid search with semantic ranker  
from azure.search.documents.models import QueryType, QueryCaptionType, QueryAnswerType

query = "Why is the UK revoking the Burundi sanctions regime?"  
embedding = get_embedding(search_text, client)  
vector_query = VectorizedQuery(vector=embedding, k_nearest_neighbors=3, fields="contentVector")  

results = search_client.search(
    search_text=query, 
    vector_queries=[vector_query], 
    query_type=QueryType.SEMANTIC, 
    semantic_configuration_name='my-semantic-config', 
    query_caption=QueryCaptionType.EXTRACTIVE, 
    query_answer=QueryAnswerType.EXTRACTIVE,
    top=3
)
 
print("Hybrid Search with Semantic Ranker Results:")  
for result in results:  
    truncated_content = result['content'][:200] + "..." if len(result['content']) > 200 else result['content']  
    truncated_content = truncated_content.replace("\n", " ")
    print(f"ID: {result['id']}, Title: {result['title']}, Chunk {result['chunk_id']},Score: {result['@search.score']}, Content: {truncated_content}")  


Hybrid Search with Semantic Ranker Results:
ID: c39349af-f0fa-4740-b247-63e63f8b95b0, Title:  Sanctions 2024-05-24, Chunk 31,Score: 0.01666666753590107, Content: Finally, we are also revoking the Burundi sanctions regime. That will remove an empty regime from the statute books. The decision in 2019 not to transpose into UK law designations under the original 2...
ID: eea57c89-238f-49b4-b30a-78c0a2b257a2, Title:  Sanctions 2024-05-24, Chunk 32,Score: 0.016129031777381897, Content: sanctions strategy, the Government keep their regimes under review and respond to changing circumstances. We are committed to lifting a regime out of a specific measure or revoking a designation when ...
ID: 56dcf115-ab12-4229-975c-d097820495c9, Title:  Sanctions 2024-05-24, Chunk 97,Score: 0.012500000186264515, Content: thing—but there is still more to be done to ensure that sanctions regimes work appropriately, so that those people who should not be able to have directorships or ownership, or to money launde