# Evaluating Search Relevance with Azure Al Search
Source: [step by step guide to measuring azure ai search relevance](https://farzzy.hashnode.dev/step-by-step-guide-to-measuring-azure-ai-search-relevance-the-hello-world-of-information-retrieval)

Dataset:
 - Guidance [train.doj_guidance.jsonl.xz](https://huggingface.co/datasets/pile-of-law/pile-of-law/resolve/main/data/train.doj_guidance.jsonl.xz?download=true)
 - Eurlex [train.eurlex.jsonl.xz](https://huggingface.co/datasets/pile-of-law/pile-of-law/resolve/main/data/train.eurlex.jsonl.xz?download=true)
 - Memos [train.irs_legal_advice_memos.jsonl.xz](https://huggingface.co/datasets/pile-of-law/pile-of-law/resolve/main/data/train.irs_legal_advice_memos.jsonl.xz?download=true)
 - Memos [train.olc_memos.jsonl.xz](https://huggingface.co/datasets/pile-of-law/pile-of-law/resolve/main/data/train.olc_memos.jsonl.xz?download=true)

In [1]:
# httpx==0.27.2 is needed to avoid an inconsistencies in the openai interface
%pip install azure-identity==1.23.0 azure-search-documents==11.5.2 openai==1.43.1 ranx==0.3.20 dotenv tenacity pandas httpx==0.27.2 voyageai==0.2.4

Note: you may need to restart the kernel to use updated packages.


# Step 1: Environment and Resources configuration

This step is composed of many substep:

1. Load environment variables

2. Open Al embeddings configuration

3. Azure Al Search confguration

4. Load data and configure dataset

## Step 1.1: Load environment variables

In [None]:
import os

from dotenv import load_dotenv

load_dotenv() # take environment variables from .env file
load_dotenv('.env.iva') # take environment variables from .env.iva file

True

## Step 1.2: Open Al embeddings configuration

In [3]:
from abc import ABC, abstractmethod  
from typing import List   
  
  
class BaseEmbeddingsClient(ABC):  
    """  
    Abstract base class for an Embeddings Client.  
    Child classes must implement the generate_embeddings method.  
    """  
  
    def __init__(self, embeddings_config):  
        self.embeddings_config = embeddings_config  
        self.model = self.initialize_model()  
  
    @abstractmethod  
    def initialize_model(self):  
        """  
        Abstract method to initialize and return the model instance.  
        Needs to be implemented by child classes.  
        """  
        pass  
  
    @abstractmethod  
    async def generate_embeddings(self, model_inputs: List[str], batch_size: int = 20):  
        """  
        Abstract method to generate embeddings.   
        Needs to be implemented by child classes.  
        """  
        pass

In [4]:
# Initialize OpenAI client
import voyageai
from openai import AsyncAzureOpenAI
from typing import List
from tenacity import retry, stop_after_attempt, wait_fixed


#async_credential = AsyncDefaultAzureCredential()
openai_api_key = os.getenv("AZURE_OPENAI_EMBEDDINGS_SERVICE_KEY")
voyage_api_key = os.getenv("VOYAGE_API_KEY")

openai_embeddings_ada2_config = {
    "name": "ada2",
    "service_name": os.getenv("AZURE_OPENAI_EMBEDDINGS_SERVICE_NAME"),
    "api_version": os.getenv("AZURE_OPENAI_EMBEDDINGS_ADA2_API_VERSION"),
    "deployment_model": os.getenv("AZURE_OPENAI_EMBEDDINGS_ADA2_DEPLOYMENT_MODEL"),
    "vector_dimensions": 1536
}

openai_embeddings_t3small_config = {
    "name": "t3small",
    "service_name": os.getenv("AZURE_OPENAI_EMBEDDINGS_SERVICE_NAME"), 
    "api_version": os.getenv("AZURE_OPENAI_EMBEDDINGS_T3SMALL_API_VERSION"),
    "deployment_model": os.getenv("AZURE_OPENAI_EMBEDDINGS_T3SMALL_DEPLOYMENT_MODEL"),
    "vector_dimensions": 1536
}

embeddings_voyage_config = {
    "name": "voyage-law-2",
    "input_type": "document",
    "vector_dimensions": 1024
}

class AzureEmbeddingsClient(BaseEmbeddingsClient):

    def initialize_model(self):  
        """  
        Initialize and return the Azure-specific AsyncAzureOpenAI model.  
        """  
        return AsyncAzureOpenAI(  
            api_version=self.embeddings_config["api_version"],
            api_key=openai_api_key,
            azure_endpoint=f'https://{self.embeddings_config["service_name"]}.openai.azure.com',  
            max_retries=2,  
        ) 
    
    @retry(
        stop_after_attempt(15), #Retry up to X times
        wait_fixed(10), # Wait X seconds between retries
    )
    async def generate_embeddings(self, model_inputs: List[str], batch_size: int = 20):
        responses = []
        # Generate embeddings in batches
        batch_count = 0
        for i in range(0, len(model_inputs), batch_size):
            j = i + batch_size if i + batch_size <= len(model_inputs) else len(model_inputs)
            batch = model_inputs[i:j]
            #print(f"[Embeddings] [{self.embeddings_config['name']}] Processing batch #{batch_count), Batch: (1) -> {j}")
            try:
                response = await self.model.embeddings.create(
                    model=self.embeddings_config["deployment_model"], 
                    input=batch,
                )
                responses.extend(i.embedding for i in response.data)
                batch_count += 1

            except Exception as e:
                print(f"[Embeddings][{self.embeddings_config['name']}] Error while computing embeddings: {e}. Retrying...")
                raise

        return responses


class VoyageEmbeddingsClient(BaseEmbeddingsClient):
    def initialize_model(self):  
        """  
        Initialize and return the SyncVoyage model.  
        """  
        return voyageai.Client()
    

    async def generate_embeddings(self, model_inputs):
        try:
            responses = self.model.embed(model_inputs, model=self.embeddings_config["name"], input_type=self.embeddings_config["input_type"]).embeddings
        except Exception as e:
            print(f"[Embeddings][{self.embeddings_config['name']}] Error while computing embeddings: {e}.")
            raise
        
        return responses

## Step 1.3: Azure Al Search configuration

In [5]:
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient

sync_credential = AzureKeyCredential(os.getenv("SEARCH_SERVICE_KEY"))

azure_search_config = {
    "service_name": os.getenv("SEARCH_SERVICE_NAME"),
    "index_name": os.getenv("SEARCH_INDEX_NAME"),
    "api_version": os.getenv("SEARCH_API_VERSION"),
    "service_endpoint": f'https://{os.getenv("SEARCH_SERVICE_NAME")}.search.windows.net'
}

## Step 1.4: Load data and configure dataset

In [6]:
import pandas as pd

docs_number = 10

# Load dataset queries and groundtruth
folder = "legal-docs"
documents_df = pd.read_csv(f"dataset/{folder}/documents.csv", sep="\t", index_col="doc_id", keep_default_na=False)
queries_df = pd.read_csv(f"dataset/{folder}/query.csv", sep="\t", index_col="query_id")
labels_df = pd.read_csv(f"dataset/{folder}/label.csv", sep="\t")

# Map ground truth labels to scores
relevancy_scores = {"Exact": 10, "Partial": 5, "Irrelevant": 0}
labels_df["score"] = labels_df["label"].map(relevancy_scores)

# Ensure query id and doc_id columns are of type string (object)
labels_df["query_id"] = labels_df["query_id"].astype(str)
labels_df["doc_id"] = labels_df["doc_id"].astype(str)

# Filter by the document number
contents = documents_df["content"].tolist()[:docs_number]
filtered_documents_df = documents_df[documents_df.index.astype(int) <= docs_number]
filtered_labels_df = labels_df.loc[labels_df['doc_id'].astype(int) <= docs_number]
filtered_queries_df = queries_df[queries_df.index.isin(filtered_labels_df['query_id'].astype(int))]

In [7]:
documents_df.head()

Unnamed: 0_level_0,title,doc_class,category,content,product_features,rating_count,average_rating,review_count
doc_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
0,solid wood platform bed,Beds,Furniture / Bedroom Furniture / Beds & Headboa...,"good , deep sleep can be quite difficult to ha...",overallwidth-sidetoside:64.7|dsprimaryproducts...,15.0,4.5,15.0
1,all-clad 7 qt . slow cooker,Slow Cookers,Kitchen & Tabletop / Small Kitchen Appliances ...,"create delicious slow-cooked meals , from tend...",capacityquarts:7|producttype : slow cooker|pro...,100.0,2.0,98.0
2,all-clad electrics 6.5 qt . slow cooker,Slow Cookers,Kitchen & Tabletop / Small Kitchen Appliances ...,prepare home-cooked meals on any schedule with...,features : keep warm setting|capacityquarts:6....,208.0,3.0,181.0
3,all-clad all professional tools pizza cutter,"Slicers, Peelers And Graters",Browse By Brand / All-Clad,this original stainless tool was designed to c...,overallwidth-sidetoside:3.5|warrantylength : l...,69.0,4.5,42.0
4,baldwin prestige alcott passage knob with roun...,Door Knobs,Home Improvement / Doors & Door Hardware / Doo...,the hardware has a rich heritage of delivering...,compatibledoorthickness:1.375 '' |countryofori...,70.0,5.0,42.0


In [8]:
filtered_queries_df.head()

Unnamed: 0_level_0,query,query_class
query_id,Unnamed: 1_level_1,Unnamed: 2_level_1
1,smart coffee table,Coffee & Cocktail Tables
9,coffee table fire pit,Outdoor Fireplaces
14,beds that have leds,Beds
18,chrome bathroom 4 light vanity light,Vanity Lighting
24,wood coffee table set by storage,Living Room Table Sets


In [9]:
labels_df

Unnamed: 0,id,query_id,doc_id,label,score
0,0,0,25434,Exact,10
1,1,0,12088,Irrelevant,0
2,2,0,42931,Exact,10
3,3,0,2636,Exact,10
4,4,0,42923,Exact,10
...,...,...,...,...,...
233443,234010,478,15439,Partial,5
233444,234011,478,451,Partial,5
233445,234012,478,30764,Irrelevant,0
233446,234013,478,16796,Partial,5


In [10]:
filtered_labels_df.head()

Unnamed: 0,id,query_id,doc_id,label,score
8012,8012,62,1,Exact,10
8013,8013,62,2,Partial,5
9635,9635,76,4,Irrelevant,0
9901,9901,78,7,Partial,5
9973,9973,78,5,Partial,5


# Step 2: Prepare the code to run the evaluation

This is composed of many substeps:

1. Generate embeddings

2. Create/update a search index and upload data

3. Set-up code for searching

4. Gather search data (score)

5. Set-up evaluation tool (ranx)

## Step 2.1: Generate embeddings

In [11]:
async def generate_embeddings(embeddings_client, contents):
    content_embeddings = await embeddings_client.generate_embeddings(contents)
    return content_embeddings

## Step 2.2: Create/update a search index and upload data

In [12]:
from azure.search.documents.indexes.models import (
    HnswAlgorithmConfiguration,
    HnswParameters,
    SearchField,
    SearchFieldDataType,
    SearchIndex,
    SimpleField,
    VectorSearch,
    VectorSearchAlgorithmKind,
    VectorSearchAlgorithmMetric,
    VectorSearchProfile
)

def create_or_update_index(
    azure_search_config, index_name, vector_field_type, vector_dimensions
):
    search_index_client = SearchIndexClient(endpoint=azure_search_config["service_endpoint"], credential=sync_credential)
    # Define the search index fields based on your product schema
    fields = [
        SimpleField(name="doc_id", type=SearchFieldDataType.String, key=True), 
        SearchField(name="category", type=SearchFieldDataType.String, searchable=True, filterable=True),
        SearchField(name="title", type=SearchFieldDataType.String, searchable=True, filterable=True),
        SearchField(name="content", type=SearchFieldDataType.String, searchable=True),
        SearchField(name="content_vector", type=vector_field_type, vector_search_dimensions=vector_dimensions, vector_search_profile_name="my-vector-config"),
    ]
    
    # Vector search configuration with HNSW algorithm and query vectorizer
    vector_search = VectorSearch(
        profiles=[
            VectorSearchProfile( 
                name="my-vector-config", 
                algorithm_configuration_name="my-hnsw", 
            )
        ],
        algorithms=[
            HnswAlgorithmConfiguration(
                name="my-hnsw", 
                kind=VectorSearchAlgorithmKind.HNSW, 
                parameters=HnswParameters(metric=VectorSearchAlgorithmMetric.COSINE),
            )
        ]
    )

    index= SearchIndex(name=index_name, fields=fields, vector_search=vector_search)
    search_index_client.create_or_update_index(index=index)
    print(f"[SearchIndexClient][{index_name}] Created or updated index.")

In [13]:
from azure.search.documents import SearchIndexingBufferedSender

def upload_embeddings_to_index(service_endpoint, index_name, documents_df, content_embeddings, batch_size=100): 
    documents = []
    # Prepare documents with embeddings
    for i, content_embedding in enumerate(content_embeddings):
        document = {
            "doc_id": str(documents_df.index[i]),
            "category": documents_df["category"][i],
            "title": documents_df["title"][i], 
            "content": documents_df["content"][i], 
            "content_vector": content_embedding,
        } 
        documents.append(document)

        # Initialize SearchIndexingBufferedSender for batch uploads 
    with SearchIndexingBufferedSender(
        endpoint=service_endpoint,
        index_name=index_name,
        credential=sync_credential,
        auto_flush_interval=60,  # Automatically flush every 60 seconds
        initial_batch_action_count=batch_size # Batch size for actions
    ) as batch_client:
        # Upload documents in batches
        for doc_batch in [documents[i:i + batch_size] for i in range(0, len(documents), batch_size)]:
            batch_client.upload_documents(documents=doc_batch)

        print(f"[SearchIndexClient][{index_name}] Uploaded {len(documents)} documents using buffered sender.")

    #Ensure all documents are flushed
    batch_client.flush()

## Step 2.3: Set-up code for searching

In [14]:
async def search(search_client, embeddings_client, query_text: str, vector_fields: str, top: int): 
    query_vector = (await embeddings_client.generate_embeddings((query_text)))[0]

    vector_query = {
        "kind": "vector", 
        "vector": query_vector, 
        "fields": vector_fields,
        "k": top, 
    }
    response = search_client.search(search_text=None, vector_queries=[vector_query], top=top)

    return response

## Step 2.4: Gather search data (score)

In [15]:
from collections import defaultdict

async def gather_search_data(search_client, embeddings_client, queries_df, field, top): 
    run_dict = defaultdict(dict)

    for index, row in queries_df.iterrows():
        query_text = row["query"] 
        print(f"[SearchClient][{search_client._index_name}] Searching query {index}. Query: {query_text}")

        # Perform vector search using the Azure AI Search client
        results = await search(search_client, embeddings_client, query_text, vector_fields=field, top=top)
        

        query_id = f"{index}"  # Ensure query id matches what's in qrels
        count = 0
        # Use the actual product id from the search results instead of generating a 'doc 
        for result in results: 
            print(f"[SearchClient][{search_client._index_name}] - Searching query {index}. Result {count}: {result}")
            doc_id = result['doc_id'] 
            score = result['@search.score']

            # Populate the run dict using product id and score
            run_dict[str(query_id)][str(doc_id)] = score
            count += 0
        
        print(f"[SearchClient][{search_client._index_name}] - run_dict[{query_id}]: {run_dict[query_id]}")
    
    return run_dict

## Step 3: Execute the evaluation process

In [16]:
from ranx import Run

async def evaluation_process(azure_search_config, embeddings_config, documents_df, queries_df, data, k=3):
    # create search index
    index_name = f'{azure_search_config["index_name"]}-{embeddings_config["name"]}'

    create_or_update_index(
        azure_search_config,
        index_name=index_name,
        vector_field_type="Collection(Edm.Single)",  # OpenAI embedding storage format dim: 1536
        vector_dimensions=embeddings_config["vector_dimensions"]
    )
    
    # Generate the embeddings
    if "input_type" in embeddings_config:
        embeddings_client = VoyageEmbeddingsClient(embeddings_config)
    else:
        embeddings_client = AzureEmbeddingsClient(embeddings_config)
    content_embeddings = await generate_embeddings(
        embeddings_client,
        data["contents"]
    )

    # Upload embeddings to respective indexes
    upload_embeddings_to_index(
        azure_search_config["service_endpoint"],
        index_name,
        documents_df,
        content_embeddings
    )

    # Perform search
    search_client = SearchClient(
        endpoint=azure_search_config["service_endpoint"],
        index_name=index_name,
        credential=sync_credential, 
        api_version=azure_search_config["api_version"]
    )
    
    model_name = embeddings_config["name"]

    content_run_dict = await gather_search_data(search_client, embeddings_client, queries_df, "content_vector", top=k)

    # create runs for ranx
    content_run = Run(content_run_dict, name=f"{model_name}_content")
    
    return {
        "dict": (content_run_dict),
        "runs": (content_run)
    }

# Step 4: Compare the results

In [17]:
from ranx import compare


def compare_runs(qrels, *runs, result_folder, k=3):
    # Compare search relevance metrics across different models
    report = compare(
        qrels=qrels,
        runs=[
            *runs
        ],
        metrics=[
            f"precision@{k}", 
            f"recall@{k}", 
            f"mrr@{k}", 
            f"dcg@{k}", 
            f"ndcg@{k}"
        ],
        make_comparable=True # Ensure that qrels and runs have matching query IDs
    )

    # Convert the report to a DataFrame and display it
    results_df = report.to_dataframe()

    # Optionally, export results to a CSV
    results_df.to_csv(f"results/{result_folder}/comparison_results_k{k}.csv", index=False)
    return results_df


In [18]:
from ranx import Qrels

# Create qrels from labels after converting dtypes
qrels = Qrels.from_df(filtered_labels_df, q_id_col="query_id", doc_id_col="doc_id", score_col="score")

results_dfs = {}
for k in [3, 5, 10]:    # k being the number of top results to retrieve
    data = {
        "contents": contents
    }

    ada2_results = await evaluation_process(azure_search_config, openai_embeddings_ada2_config, filtered_documents_df, filtered_queries_df, data, k=k)
    t3small_results = await evaluation_process(azure_search_config, openai_embeddings_t3small_config, filtered_documents_df, filtered_queries_df, data, k=k)
    voyage_results = await evaluation_process(azure_search_config, embeddings_voyage_config, filtered_documents_df, filtered_queries_df, data, k=k)
    
    # this saves the results to a csv file
    results_dfs[k] = compare_runs(qrels, ada2_results["runs"], t3small_results["runs"], voyage_results["runs"], result_folder=folder, k=k)

[SearchIndexClient][ranx-index1-ada2] Created or updated index.
[SearchIndexClient][ranx-index1-ada2] Uploaded 10 documents using buffered sender.
[SearchClient][ranx-index1-ada2] Searching query 1. Query: smart coffee table
[SearchClient][ranx-index1-ada2] - Searching query 1. Result 0: {'content': 'vanity has an extra thick marble top with a bullnose edge on 3 sides spread for your faucet and white ceramic inset sink .', 'category': 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Bathroom Vanities / All Bathroom Vanities / Modern & Contemporary Bathroom Vanities', 'doc_id': '7', 'title': "36 '' single bathroom vanity", '@search.score': 0.83464867, '@search.reranker_score': None, '@search.highlights': None, '@search.captions': None}
[SearchClient][ranx-index1-ada2] - Searching query 1. Result 0: {'content': 'in the endless world of sofa style options , the chesterfield design is timeless . a style that fits the widest array of interior designs from semi modern to casual far

  scores[i] = _precision(qrels[i], run[i], k, rel_lvl)


[SearchIndexClient][ranx-index1-ada2] Created or updated index.
[SearchIndexClient][ranx-index1-ada2] Uploaded 10 documents using buffered sender.
[SearchClient][ranx-index1-ada2] Searching query 1. Query: smart coffee table
[SearchClient][ranx-index1-ada2] - Searching query 1. Result 0: {'content': 'vanity has an extra thick marble top with a bullnose edge on 3 sides spread for your faucet and white ceramic inset sink .', 'category': 'Home Improvement / Bathroom Remodel & Bathroom Fixtures / Bathroom Vanities / All Bathroom Vanities / Modern & Contemporary Bathroom Vanities', 'doc_id': '7', 'title': "36 '' single bathroom vanity", '@search.score': 0.83464867, '@search.reranker_score': None, '@search.highlights': None, '@search.captions': None}
[SearchClient][ranx-index1-ada2] - Searching query 1. Result 0: {'content': 'in the endless world of sofa style options , the chesterfield design is timeless . a style that fits the widest array of interior designs from semi modern to casual far

# 5. Comparison

In [22]:
results_df = pd.read_csv(f"results/{folder}/comparison_results_k3.csv")
results_df

Unnamed: 0,model_names,precision@3,recall@3,mrr@3,dcg@3,ndcg@3
0,ada2_content,0.24031,0.418605,0.395349,3.160275,0.397763
1,t3small_content,0.248062,0.44186,0.418605,3.31947,0.424281
2,voyage-law-2_content,0.255814,0.465116,0.465116,3.478664,0.45812


In [20]:
results_df = pd.read_csv(f"results/{folder}/comparison_results_k5.csv")
results_df

Unnamed: 0,model_names,precision@5,recall@5,mrr@5,dcg@5,ndcg@5
0,ada2_content,0.153488,0.465116,0.406977,3.260433,0.417794
1,t3small_content,0.153488,0.465116,0.424419,3.369548,0.434297
2,voyage-law-2_content,0.153488,0.465116,0.465116,3.478664,0.45812


In [21]:
results_df = pd.read_csv(f"results/{folder}/comparison_results_k10.csv")
results_df

Unnamed: 0,model_names,precision@10,recall@10,mrr@10,dcg@10,ndcg@10
0,ada2_content,0.076744,0.465116,0.406977,3.260433,0.417794
1,t3small_content,0.076744,0.465116,0.424419,3.369548,0.434297
2,voyage-law-2_content,0.076744,0.465116,0.465116,3.478664,0.45812
