## 📚 Prerequisites

Before running this notebook, configure Azure AI services, set the required configuration parameters, and create a Conda environment for reproducibility. Setup instructions, including how to create the Conda environment, are in the [REQUIREMENTS.md](REQUIREMENTS.md) file.

## 📋 Table of Contents

This notebook explores different types of retrieval methods and applies evaluation metrics to quantify the "gain". 

1. [**Evaluation Methodology**](#evaluation-methodology) 📊
   - This section covers the evaluation metrics and methodologies used to quantify the effectiveness of different retrieval strategies. We will explore various metrics such as precision, recall, and mean reciprocal rank (MRR) to measure the "gain" from each approach.

2. [**Retrieval Strategies using Azure AI Search**](#retrieval-strategies) 🔍
   - This section delves into different retrieval strategies using Azure AI Search, including Hybrid Search and State-of-the-Art (SOTA) Rerank approaches. We will compare these strategies to determine their effectiveness in retrieving relevant information.


## Evaluation Methodology - Retrieval Exhaustive 



1. **Collect Queries and Relevant Documents**:
   - Gather a set of queries (questions) that you expect users to ask.
   - For each query, determine which documents (texts or chunks) should be retrieved and assign a relevance score (higher means more relevant).

2. **Generate Run Data**:
   - Run your retrieval system on the same set of queries.
   - Capture the documents retrieved by your system along with their scores.

3. **Evaluate**:
   - Use the evaluation function to compare your system's output (run data) against the ground truth (qrels).
   - Choose appropriate metrics (like nDCG, MRR, Precision, Recall, F1) based on your evaluation needs.

For more detailed information on retrieval metrics, please refer to the following resource: [NDCG Metric](https://www.evidentlyai.com/ranking-metrics/ndcg-metric)
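
To make the metrics concrete, here is a minimal pure-Python sketch that recomputes them for the single example query used in this notebook. It assumes linear gain for nDCG (relevance divided by log2 of rank + 1), which reproduces the ranx output printed further down for this example. The shortened IDs (`_18`, `_16`, ...) stand in for the full chunk IDs, and `ranking_metrics` is a hypothetical helper, not part of ranx.

```python
import math


def dcg(gains):
    """Discounted cumulative gain with linear gain; ranks start at 1."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ranking_metrics(qrels, run, k=5):
    """qrels: {doc_id: relevance}, run: {doc_id: retrieval score} for one query."""
    ranked = sorted(run, key=run.get, reverse=True)   # best score first
    gains = [qrels.get(doc_id, 0) for doc_id in ranked[:k]]
    ideal_gains = sorted(qrels.values(), reverse=True)[:k]

    hits = sum(1 for g in gains if g > 0)
    n_relevant = sum(1 for rel in qrels.values() if rel > 0)
    precision = hits / k
    recall = hits / n_relevant if n_relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    # Reciprocal rank of the first relevant document (MRR for a single query).
    rr = next(
        (1 / rank for rank, d in enumerate(ranked, start=1) if qrels.get(d, 0) > 0),
        0.0,
    )
    ideal = dcg(ideal_gains)
    return {
        f"ndcg@{k}": dcg(gains) / ideal if ideal else 0.0,
        "mrr": rr,
        f"precision@{k}": precision,
        f"recall@{k}": recall,
        f"f1@{k}": f1,
    }


# Relevance judgments and retrieved scores copied from the example in this notebook.
qrels = {"_18": 5, "_16": 3, "_37": 2, "_5": 1}
run = {"_16": 0.83756506, "_18": 0.8328596, "_8": 0.82773894,
       "_37": 0.825119, "_101": 0.82344735}

metrics = ranking_metrics(qrels, run)
print(metrics)
```

Three of the five retrieved chunks are relevant (precision 0.6), three of the four judged chunks were found (recall 0.75), and nDCG falls below 1 because the most relevant chunk (`_18`) is ranked second rather than first.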

In [137]:
from ranx import Qrels, Run, evaluate

# Ground Truth Relevance Judgments (Qrels)
# Explanation:
# For the query "¿Cuál es el voltaje disponible para el DVC6200?" (What is the voltage available for the DVC6200?):
# - Document "7690b1eb-bb5c-42c3-b51a-2e3f78a1f58d_18" is highly relevant with a score of 5.
# - Document "7690b1eb-bb5c-42c3-b51a-2e3f78a1f58d_16" is moderately relevant with a score of 3.
# - Document "7690b1eb-bb5c-42c3-b51a-2e3f78a1f58d_37" is slightly relevant with a score of 2.
# - Document "7690b1eb-bb5c-42c3-b51a-2e3f78a1f58d_5" is the least relevant with a score of 1.
qrels_dict = {
    "¿Cuál es el voltaje disponible para el DVC6200?": {
        "7690b1eb-bb5c-42c3-b51a-2e3f78a1f58d_18": 5,  # Highly relevant
        "7690b1eb-bb5c-42c3-b51a-2e3f78a1f58d_16": 3,  # Moderately relevant
        "7690b1eb-bb5c-42c3-b51a-2e3f78a1f58d_37": 2,  # Slightly relevant
        "7690b1eb-bb5c-42c3-b51a-2e3f78a1f58d_5": 1   # Least relevant
    },
}

search_query_english = "What is the voltage available for the DVC6200?"
search_query_spanish = "¿Cuál es el voltaje disponible para el DVC6200?"

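# Define the vector query.
# Note: VectorizableTextQuery and search_client are created in the setup cell under
# "Retrieval Strategies using Azure AI Search"; run that cell before this one.
vector_query = VectorizableTextQuery(text=search_query_spanish, k_nearest_neighbors=5, fields="content_vector")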

# Execute the search query using the search client
results = search_client.search(
    search_text=None,
    vector_queries=[vector_query],
    select=["content", "id"]
)

# Explanation:
# This cell initializes a run dictionary for the search query, then iterates over the
# search results and stores each document ID with its @search.score.
# The ground-truth relevance judgments (Qrels) are defined at the top of this cell;
# the comparison against them happens in the next cell.

# Initialize the run dictionary to store the results
run_dict = {}
run_dict[search_query_spanish] = {}

# Extract the results and populate the run dictionary
for result in results:
    doc_id = result["id"]
    score = result["@search.score"]
    run_dict[search_query_spanish][doc_id] = score

# Print the run dictionary to see the results
print("Run Dictionary:")
print(run_dict)

Run Dictionary:
{'¿Cuál es el voltaje disponible para el DVC6200?': {'7690b1eb-bb5c-42c3-b51a-2e3f78a1f58d_16': 0.83756506, '7690b1eb-bb5c-42c3-b51a-2e3f78a1f58d_18': 0.8328596, '7690b1eb-bb5c-42c3-b51a-2e3f78a1f58d_8': 0.82773894, '7690b1eb-bb5c-42c3-b51a-2e3f78a1f58d_37': 0.825119, '7690b1eb-bb5c-42c3-b51a-2e3f78a1f58d_101': 0.82344735}}


In [138]:
# Create Qrels and Run objects
qrels = Qrels(qrels_dict)
run = Run(run_dict)

# Evaluate using various metrics
results = evaluate(qrels, run, ["ndcg@5", "mrr", "precision@5", "recall@5", "f1@5"])

# Print evaluation results
print(results)

{'ndcg@5': 0.8429183271429458, 'mrr': 1.0, 'precision@5': 0.6, 'recall@5': 0.75, 'f1@5': 0.6666666666666665}


## Retrieval Strategies using Azure AI Search

In [139]:
from dotenv import load_dotenv
import os
from azure.search.documents import SearchClient
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.models import VectorizedQuery
from azure.search.documents.models import VectorizableTextQuery

from src.aoai.azure_openai import AzureOpenAIManager

# Load environment variables from .env file
load_dotenv()
embedding_aoai_deployment_model = "foundational-canadaeast-ada"

model = os.environ["AZURE_AOAI_EMBEDDING_DEPLOYMENT_ID"]
aoai_client = AzureOpenAIManager(api_key=os.environ["AZURE_AOAI_API_KEY"],
                                 azure_endpoint=os.environ["AZURE_AOAI_API_ENDPOINT"], 
                                 api_version="2024-02-01", 
                                 embedding_model_name=embedding_aoai_deployment_model)

AZURE_SEARCH_INDEX_NAME = "complex-pdfs-traditional-rag" 
search_client = SearchClient(
    endpoint=os.environ["AZURE_AI_SEARCH_SERVICE_ENDPOINT"],
    index_name=AZURE_SEARCH_INDEX_NAME,
    credential=AzureKeyCredential(os.environ["AZURE_SEARCH_ADMIN_KEY"]),
)

In [125]:
search_query_english = "What is the voltage available for the DVC6200?"
search_query_spanish = "¿Cuál es el voltaje disponible para el DVC6200?"

In [142]:
from colorama import Fore, Style, init
from typing import List, Dict, Any
from ranx import Qrels, Run, evaluate

init(autoreset=True)

def display_search_results(results: List[Dict[str, Any]]) -> None:
    """
    Display search results with improved formatting and colors.

    Args:
        results (List[Dict[str, Any]]): List of search results.
    """
    for idx, result in enumerate(results):
        content = result["content"].replace("\n", " ")[:1000]
        score = result.get('@search.score', 'N/A')
        print(f"{Fore.CYAN}{'='*10} Result {idx + 1} {'='*10}{Style.RESET_ALL}")
        print(f"{Fore.YELLOW}Score: {score}{Style.RESET_ALL}")
        print(f"{Fore.GREEN}Content: {content}{Style.RESET_ALL}")
        print(f"{Fore.CYAN}{'=' * 40}{Style.RESET_ALL}")

def evaluate_search_results(
    search_results: List[Dict[str, Any]], 
    search_query: str, 
    qrels_dict: Dict[str, Dict[str, int]]
) -> Dict[str, float]:
    """
    Evaluate search results against ground truth relevance judgments.

    Args:
        search_results (List[Dict[str, Any]]): List of search results.
        search_query (str): The search query string.
        qrels_dict (Dict[str, Dict[str, int]]): Ground truth relevance judgments.

    Returns:
        Dict[str, float]: Evaluation metrics results.
    """
    # Initialize the run dictionary for this query
    run_dict = {search_query: {}}

    # Extract the results and populate the run dictionary
    for result in search_results:
        doc_id = result["id"]
        score = result["@search.score"]
        run_dict[search_query][doc_id] = score

    qrels = Qrels(qrels_dict)
    run = Run(run_dict)

    evaluation_results = evaluate(qrels, run, ["ndcg@5", "mrr", "precision@5", "recall@5", "f1@5"])

    return evaluation_results

## Vector Search

### Vector Similarity Search Using a Vectorizable Text Query

Vector similarity search is a technique used to find items that are similar to a given query based on their vector representations. In this context, a vectorizable text query is transformed into a vector using techniques such as word embeddings or sentence embeddings. The search process involves comparing the query vector with the vectors of other items in the dataset to find the most similar ones.

#### How It Works:
1. **Vectorization**: The text query is converted into a numerical vector using embedding techniques.
2. **Similarity Measurement**: The similarity between the query vector and the vectors of other items is measured using metrics such as cosine similarity.
3. **Retrieval**: Items with the highest similarity scores are retrieved as the most relevant results.
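
The three steps above can be sketched as a brute-force search in plain Python. This is only an illustration: the 3-dimensional vectors are made up, whereas real embeddings come from the embedding model and have many more dimensions, and Azure AI Search performs the equivalent work server-side.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Score every document against the query and keep the k most similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical toy embeddings (step 1 would normally call an embedding model).
doc_vecs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.6, 0.1],
}
query_vec = [1.0, 0.0, 0.0]

for doc_id, similarity in top_k(query_vec, doc_vecs, k=2):
    print(f"{doc_id}: {similarity:.4f}")
```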

#### Azure AI Search Specifics:
Vector queries also return their relevance in the `@search.score` field. The nearest neighbors are found with the HNSW (Hierarchical Navigable Small World) algorithm, an efficient approximate nearest-neighbor method for high-dimensional spaces, and the score is derived from the similarity metric configured on the vector field. The score range depends on that metric:
- **Cosine Similarity**: 0.333 to 1.00
- **Euclidean and DotProduct Similarities**: 0 to 1
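
The 0.333 to 1.00 range for the cosine metric is consistent with a score of the form `1 / (1 + cosine_distance)`, where `cosine_distance = 1 - cosine_similarity`. The sketch below converts between the two under that assumption; it matches how the Azure AI Search documentation describes the transformation as far as we are aware, but verify it against your service version. Both function names are hypothetical helpers.

```python
def cosine_similarity_to_score(similarity: float) -> float:
    """Map a cosine similarity in [-1, 1] to a score in [1/3, 1] (assumed transformation)."""
    cosine_distance = 1.0 - similarity
    return 1.0 / (1.0 + cosine_distance)


def score_to_cosine_similarity(score: float) -> float:
    """Invert the assumed transformation to recover the cosine similarity."""
    cosine_distance = (1.0 - score) / score
    return 1.0 - cosine_distance


# Top score from the vector search example in this notebook.
top_score = 0.83756506
print(f"score {top_score} -> cosine similarity {score_to_cosine_similarity(top_score):.4f}")
```

Under this assumption a score in the low 0.8s corresponds to a cosine similarity of roughly 0.8, which helps when comparing Azure AI Search scores with similarities computed directly from embeddings.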

In [147]:
# Define the vector query
# Pure Vector Search multi-lingual search
vector_query = VectorizableTextQuery(text=search_query_spanish, k_nearest_neighbors=5, fields="content_vector")

# Execute the search query and display the results
results = search_client.search(
    search_text=None,
    vector_queries=[vector_query],
    select=["content", "id"]
)
display_search_results(results)

# Execute the search query again for evaluation
results = search_client.search(
    search_text=None,
    vector_queries=[vector_query],
    select=["content", "id"]
)

# Evaluate the search results against the ground truth
evaluation_results = evaluate_search_results(results, search_query_spanish, qrels_dict)

# Print the evaluation results
print("\nEvaluation Results:")
for metric, value in evaluation_results.items():
    print(f"{metric}: {value:.4f}")

Score: 0.83756506
Content: There are several parameters that should be checked to ensure the control system is compatible with the DVC6200 digital valve controller.
Score: 0.8328596
Content: The voltage available at the DVC6200 digital valve controller must be at least 10 VDC. The voltage available at the instrument is not the actual voltage measured at the instrument when the instrument is connected. The voltage measured at the instrument is limited by the instrument and is typically less than the voltage available.   <!-- PageNumber="9" --> :unselected: <!-- PageHeader="Wiring Practices December 2022" -->   <!-- PageHeader="Instruction Manual D103605X012" -->   As shown in figure 2-2, the voltage available at the instrument depends upon:   · the control system compliance voltage   . if a filter, wireless THUM adapter, or intrinsic safety barrier is used, and   . the wire type and length.   The control system compliance voltage is the maximum voltage at the control system output termi

In [148]:
# This example demonstrates how to perform an exhaustive search on your vector index, whether it is an HNSW or ExhaustiveKNN index.
# This approach can be used to calculate ground-truth values for evaluation purposes.

vector_query = VectorizableTextQuery(text=search_query_spanish, k_nearest_neighbors=5, fields="content_vector", exhaustive=True)

# Execute the search query and display the results
results = search_client.search(
    search_text=None,
    vector_queries=[vector_query],
    select=["content", "id"]
)
display_search_results(results)

# Execute the search query again for evaluation
results = search_client.search(
    search_text=None,
    vector_queries=[vector_query],
    select=["content", "id"]
)

# Evaluate the search results against the ground truth
evaluation_results = evaluate_search_results(results, search_query_spanish, qrels_dict)

# Print the evaluation results
print("\nEvaluation Results:")
for metric, value in evaluation_results.items():
    print(f"{metric}: {value:.4f}")

Score: 0.8375646
Content: There are several parameters that should be checked to ensure the control system is compatible with the DVC6200 digital valve controller.
Score: 0.8328594
Content: The voltage available at the DVC6200 digital valve controller must be at least 10 VDC. The voltage available at the instrument is not the actual voltage measured at the instrument when the instrument is connected. The voltage measured at the instrument is limited by the instrument and is typically less than the voltage available.   <!-- PageNumber="9" --> :unselected: <!-- PageHeader="Wiring Practices December 2022" -->   <!-- PageHeader="Instruction Manual D103605X012" -->   As shown in figure 2-2, the voltage available at the instrument depends upon:   · the control system compliance voltage   . if a filter, wireless THUM adapter, or intrinsic safety barrier is used, and   . the wire type and length.   The control system compliance voltage is the maximum voltage at the control system output termin
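
One way to use exhaustive results as ground truth is to measure how many of the exact top-k neighbors the approximate (HNSW) query also returned. A minimal sketch, assuming the ordered document IDs from both queries have been collected into lists; `recall_against_exhaustive` and the sample IDs are hypothetical.

```python
def recall_against_exhaustive(approx_ids, exact_ids, k=5):
    """Fraction of the exhaustive top-k that also appears in the approximate top-k."""
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 0.0
    return len(exact_top & set(approx_ids[:k])) / len(exact_top)


# Hypothetical ID lists, ordered by descending score.
approx_ids = ["chunk_16", "chunk_18", "chunk_8", "chunk_37", "chunk_101"]
exact_ids = ["chunk_16", "chunk_18", "chunk_8", "chunk_37", "chunk_42"]

print(recall_against_exhaustive(approx_ids, exact_ids, k=5))
```

A value of 1.0 means the approximate index returned exactly the same top-k set as the exhaustive scan for that query.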

In [None]:
# Perform a Cross-Field Vector Search
# This example demonstrates how to perform a cross-field vector search, allowing you to query multiple vector fields simultaneously.
# Ensure that the same embedding model was used for the vector fields you decide to query.

# Define the vectorizable text query with the desired fields and number of nearest neighbors
# vector_query = VectorizableTextQuery(
#     text=search_query_spanish,
#     k_nearest_neighbors=3,
#     fields="contentVector, titleVector"
# )

# vector_query = VectorizableTextQuery(text=search_query_spanish, k_nearest_neighbors=3, fields="content_vector")

# results = search_client.search(  
#     search_text=None,  
#     vector_queries= [vector_query],
#     select=["content"],
# )  


In [152]:
# Define the vector query
# Pure Vector Search multi-lingual search

# This query is for the English search query. It searches for the 5 nearest neighbors
# in the "content_vector" field based on the vector representation of the English text.
vector_query_1 = VectorizableTextQuery(text=search_query_english, k_nearest_neighbors=5, fields="content_vector")

# This query is for the Spanish search query. It searches for the 5 nearest neighbors
# in the "content_vector" field based on the vector representation of the Spanish text.
vector_query_2 = VectorizableTextQuery(text=search_query_spanish, k_nearest_neighbors=5, fields="content_vector")

# Execute the search query and display the results
results = search_client.search(
    search_text=None,
    vector_queries=[vector_query_1, vector_query_2],
    select=["content", "id"]
)
display_search_results(results)

# Execute the search query again for evaluation
results = search_client.search(
    search_text=None,
    vector_queries=[vector_query_1, vector_query_2],
    select=["content", "id"]
)

# Evaluate the search results against the ground truth
evaluation_results = evaluate_search_results(results, search_query_spanish, qrels_dict)

# Print the evaluation results
print("\nEvaluation Results:")
for metric, value in evaluation_results.items():
    print(f"{metric}: {value:.4f}")

Score: 0.03306011110544205
Content: There are several parameters that should be checked to ensure the control system is compatible with the DVC6200 digital valve controller.
Score: 0.03306011110544205
Content: The voltage available at the DVC6200 digital valve controller must be at least 10 VDC. The voltage available at the instrument is not the actual voltage measured at the instrument when the instrument is connected. The voltage measured at the instrument is limited by the instrument and is typically less than the voltage available.   <!-- PageNumber="9" --> :unselected: <!-- PageHeader="Wiring Practices December 2022" -->   <!-- PageHeader="Instruction Manual D103605X012" -->   As shown in figure 2-2, the voltage available at the instrument depends upon:   · the control system compliance voltage   . if a filter, wireless THUM adapter, or intrinsic safety barrier is used, and   . the wire type and length.   The control system compliance voltage is the maximum voltage at the control 

In [160]:
# Define the vector query
# Pure Vector Search multi-lingual search

# This query is for the English search query. It searches for the 5 nearest neighbors
# in the "content_vector" field based on the vector representation of the English text.
# The weight of 2 means that this query will have a higher influence on the final search results.
vector_query_1 = VectorizableTextQuery(text=search_query_english, k_nearest_neighbors=5, fields="content_vector", weight=2)

# This query is for the Spanish search query. It searches for the 5 nearest neighbors
# in the "content_vector" field based on the vector representation of the Spanish text.
# The weight of 0.5 means that this query will have a lower influence on the final search results.
vector_query_2 = VectorizableTextQuery(text=search_query_spanish, k_nearest_neighbors=5, fields="content_vector", weight=0.5)

# Execute the search query and display the results
results = search_client.search(
    search_text=None,
    vector_queries=[vector_query_1, vector_query_2],
    select=["content", "id"]
)
display_search_results(results)

# Execute the search query again for evaluation
results = search_client.search(
    search_text=None,
    vector_queries=[vector_query_1, vector_query_2],
    select=["content", "id"]
)

# Evaluate the search results against the ground truth
evaluation_results = evaluate_search_results(results, search_query_spanish, qrels_dict)

# Print the evaluation results
print("\nEvaluation Results:")
for metric, value in evaluation_results.items():
    print(f"{metric}: {value:.4f}")

Score: 0.0415300577878952
Content: The voltage available at the DVC6200 digital valve controller must be at least 10 VDC. The voltage available at the instrument is not the actual voltage measured at the instrument when the instrument is connected. The voltage measured at the instrument is limited by the instrument and is typically less than the voltage available.   <!-- PageNumber="9" --> :unselected: <!-- PageHeader="Wiring Practices December 2022" -->   <!-- PageHeader="Instruction Manual D103605X012" -->   As shown in figure 2-2, the voltage available at the instrument depends upon:   · the control system compliance voltage   . if a filter, wireless THUM adapter, or intrinsic safety barrier is used, and   . the wire type and length.   The control system compliance voltage is the maximum voltage at the control system output terminals at which the control system can produce maximum loop current.   The voltage available at the instrument may be calculated from the following equation: 

In [37]:
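# vector_query_1 = VectorizableTextQuery(text=search_query_english, k_nearest_neighbors=3, fields="content_vector", weight=2)
# vector_query_2 = VectorizableTextQuery(text=search_query_spanish, k_nearest_neighbors=3, fields="content_vector", weight=0.5)

# # Adding a filter to ensure "DVC6200" appears in the content.
# # `content eq 'DVC6200'` would only match documents whose content is exactly "DVC6200";
# # search.ismatch does a full-text match instead (assumes "content" is a searchable field).
# filter_condition = "search.ismatch('DVC6200', 'content')"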

# results = search_client.search(  
#     search_text=None,  
#     vector_queries=[vector_query_1, vector_query_2],
#     filter=filter_condition,
#     select=["content"],
# )  
# display_search_results(results)

## Hybrid Search

### Hybrid Search with RRF Algorithm

**Hybrid Search**: Combines keyword-based search with vector-based search to leverage the strengths of both methods.

- **Vector Search**: Uses vector representations of text to find semantically similar documents.
- **Full-Text Search**: Uses keyword-based search to find documents containing specific terms.

**@search.score Parameter**: In a hybrid query, `@search.score` holds the fused RRF score of each result rather than a raw similarity or keyword score.

**RRF (Reciprocal Rank Fusion) Algorithm**: A rank-fusion method that combines the result lists of multiple queries. Each query's contribution to a document's score is bounded, so the upper limit of the fused score grows with the number of queries being fused. For example, merging three queries can produce higher RRF scores than merging only two.

By using the RRF algorithm, the hybrid search approach can effectively combine the results of multiple queries, providing more relevant and accurate search results. This approach leverages both vector-based and full-text search to improve overall search performance.
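
A minimal sketch of rank fusion, assuming the common formulation in which a document at rank `r` in one result list contributes `weight / (k + r)` to its fused score. The constant `k = 60`, the 1-based ranks, and the way weights are applied are assumptions for illustration; the exact constants used by Azure AI Search are not stated in this notebook. `rrf_fuse` and the document IDs are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings, weights=None, k=60):
    """Fuse several ranked lists of doc IDs into one list of (doc_id, score) pairs."""
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # A higher rank (smaller number) yields a larger contribution.
            scores[doc_id] += weight / (k + rank)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical rankings: one from a vector query, one from a full-text query.
vector_ranking = ["doc_a", "doc_b", "doc_c"]
keyword_ranking = ["doc_b", "doc_a", "doc_d"]

# Weighting the vector ranking more heavily breaks the tie in favor of doc_a.
for doc_id, score in rrf_fuse([vector_ranking, keyword_ranking], weights=[2.0, 0.5]):
    print(f"{doc_id}: {score:.5f}")
```

Because only ranks are used, the fused score does not depend on the scale of the underlying scores, which is what makes it practical to combine keyword scores with vector similarities.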

In [161]:
# Define the vector query
# Hybrid search: the same Spanish query is sent as full text (search_text) and as a vector query

# This vector query searches for the 3 nearest neighbors in the "content_vector" field
# based on the vector representation of the Spanish text.
# The weight of 2 gives the vector query a higher influence on the fused search results.
vector_query_1 = VectorizableTextQuery(text=search_query_spanish, k_nearest_neighbors=3, fields="content_vector", weight=2)


# Execute the search query and display the results
results = search_client.search(  
    search_text=search_query_spanish,  
    vector_queries=[vector_query_1],
    select=["content", "id"],
    top=5
)
display_search_results(results)

# Execute the search query again for evaluation
results = search_client.search(  
    search_text=search_query_spanish,  
    vector_queries=[vector_query_1],
    select=["content", "id"],
    top=5
)

# Evaluate the search results against the ground truth
evaluation_results = evaluate_search_results(results, search_query_spanish, qrels_dict)

# Print the evaluation results
print("\nEvaluation Results:")
for metric, value in evaluation_results.items():
    print(f"{metric}: {value:.4f}")

Score: 0.04599156230688095
Content: There are several parameters that should be checked to ensure the control system is compatible with the DVC6200 digital valve controller.
Score: 0.04577157646417618
Content: #### Available Mounting   DVC6200 digital valve controller or DVC6215 feedback unit: Integral mounting to Fisher :selected: 657/667 or GX actuators :selected: Window mounting to Fisher rotary actuators :selected: Sliding-stem linear applications Quarter-turn rotary applications :selected: DVC6205 base unit for 2 inch pipestand or wall mounting (for remote-mount)   The DVC6200 digital valve controller or DVC6215 feedback unit can also be mounted on other actuators that comply with IEC 60534-6-1, IEC 60534-6-2, VDI/VDE 3845 and NAMUR mounting standards.
Score: 0.043096162378787994
Content: The voltage available at the DVC6200 digital valve controller must be at least 10 VDC. The voltage available at the instrument is not the actual voltage measured at the instrument when the instru

## Semantic Hybrid Search

In [164]:
from azure.search.documents.models import QueryType, QueryCaptionType, QueryAnswerType

vector_query = VectorizableTextQuery(text=search_query_spanish, k_nearest_neighbors=3, fields="content_vector", exhaustive=True)

results = search_client.search(  
    search_text=search_query_spanish,  
    vector_queries=[vector_query],
    select=["content", "id"],
    query_type=QueryType.SEMANTIC,
    semantic_configuration_name='index-fields-semantic-config',
    query_caption=QueryCaptionType.EXTRACTIVE,
    query_answer=QueryAnswerType.EXTRACTIVE,
    top=5
)

semantic_answers = results.get_answers()

for answer in semantic_answers:
    print("=" * 40)
    if answer.highlights:
        print(f"Semantic Answer: {answer.highlights}")
    else:
        print(f"Semantic Answer: {answer.text}")
    print(f"Semantic Answer Score: {answer.score}")
    print("=" * 40)

for result in results:
    print("=" * 40)
    print(f"ID: {result['id']}")
    print(f"Reranker Score: {result['@search.reranker_score']}")
    content = result['content'][:100] + '...' if len(result['content']) > 100 else result['content']
    print(f"Content: {content}")

    captions = result.get("@search.captions", [])
    if captions:
        caption = captions[0]
        if caption.highlights:
            print(f"Caption: {caption.highlights}")
        else:
            print(f"Caption: {caption.text}")
    print("=" * 40)

results = search_client.search(  
    search_text=search_query_spanish,  
    vector_queries=[vector_query],
    select=["content", "id"],
    query_type=QueryType.SEMANTIC,
    semantic_configuration_name='index-fields-semantic-config',
    query_caption=QueryCaptionType.EXTRACTIVE,
    query_answer=QueryAnswerType.EXTRACTIVE,
    top=5
)

# Evaluate the search results against the ground truth
evaluation_results = evaluate_search_results(results, search_query_spanish, qrels_dict)

# Print the evaluation results
print("\nEvaluation Results:")
for metric, value in evaluation_results.items():
    print(f"{metric}: {value:.4f}")

Semantic Answer: The voltage available at the DVC6200 digital valve controller must be<em> at least 10 VDC</em> The voltage available at the instrument is not the actual voltage measured at the instrument when the instrument is connected The voltage measured at the instrument is limited by the instrument and is typically less than the voltage available.   <!-- PageNumber="9...
Semantic Answer Score: 0.84130859375
ID: 7690b1eb-bb5c-42c3-b51a-2e3f78a1f58d_18
Reranker Score: 3.650735378265381
Content: The voltage available at the DVC6200 digital valve controller must be at least 10 VDC. The voltage a...
Caption: The<em> voltage available</em> at the<em> DVC6200</em> digital valve controller must be at least 10 VDC The<em> voltage available</em> at the instrument is not the actual<em> voltage</em> measured at the instrument when the instrument is connected The<em> voltage</em> measured at the instrument is limited by the instrument and is typically less than the<em> voltage available.</em>
