![](./lab%20header%20image.png)

<div style="text-align: center;">
    <h3>Assignment No. 04</h3>
</div>

<img src="./Student%20Information.png" style="width: 100%;" alt="Student Information">

<div style="border: 1px solid #ccc; padding: 8px; background-color: #f0f0f0; text-align: start;">
    <strong>Q. Implement advanced information retrieval techniques by developing and evaluating Learning to Rank (LTR) models using machine learning methods. Your task is to rank documents based on their relevance to a given query, leveraging features such as term frequency, document length, and other query-document characteristics. Evaluate the performance of the developed models using standard evaluation metrics such as Normalized Discounted Cumulative Gain (NDCG) and Mean Reciprocal Rank (MRR). Document the process, including feature extraction, model training, and evaluation results, and compare the performance of different LTR models.</strong>
</div>

Learning to Rank (LTR) refers to the application of machine learning to build models that rank items (in this case, documents) based on their relevance to a particular query. It is widely used in information retrieval systems like search engines, recommendation systems, and question-answering systems.

LTR models aim to order a set of documents such that the most relevant documents appear higher in the ranked list for a given query. These models are trained on labeled data, where the relevance of documents is provided.

LTR methods can be broadly classified into three categories:

**1. Pointwise Approach**: In this method, the model learns to predict the relevance score of each document independently. The predicted score is then used to rank the documents.

**2. Pairwise Approach**: In the pairwise approach, the model learns to compare pairs of documents and determine which one is more relevant to a query. The model focuses on minimizing the number of incorrectly ordered pairs.

**3. Listwise Approach**: Here, the model directly optimizes the ranking of a list of documents. It evaluates the entire list at once, using ranking metrics such as NDCG.

#### Features for Learning to Rank
Typical features for ranking documents include:

- **Term Frequency (TF)**: Frequency of a query term in the document.
- **Inverse Document Frequency (IDF)**: Importance of a term across all documents.
- **Document Length**: Total number of terms in a document.
- **BM25 Score**: A popular ranking function based on term frequency and document length.
- **Query-Document Features**: Measures like the number of query terms found in the document, and the proximity of query terms.

#### Evaluation Metrics
- **Normalized Discounted Cumulative Gain (NDCG)**: Measures how well a ranking matches the ideal order. It discounts the relevance score based on the position of the document in the ranking. Higher ranks contribute more to the score.

- **Mean Reciprocal Rank (MRR)**: Measures the rank of the first relevant document in the result set. The reciprocal rank of a query is the inverse of the rank at which the first relevant document is found.

In [1]:
import xgboost as xgb
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import ndcg_score, average_precision_score

# Generating synthetic data
def generate_synthetic_data(num_queries=1000, num_docs_per_query=10):
    np.random.seed(42)
    data = []
    for qid in range(num_queries):
        for did in range(num_docs_per_query):
            tf = np.random.randint(1, 100)  # Term Frequency
            doc_len = np.random.randint(100, 1000)  # Document Length
            relevance = np.random.choice([0, 1, 2], p=[0.7, 0.2, 0.1])  # Relevance score
            feature_vector = [tf, doc_len, tf / doc_len]  # Features
            data.append([qid] + feature_vector + [relevance])
    
    columns = ['qid', 'term_frequency', 'doc_length', 'tf_doc_len_ratio', 'relevance']
    df = pd.DataFrame(data, columns=columns)
    return df

# Load synthetic data
df = generate_synthetic_data()

# Split features and labels
X = df[['term_frequency', 'doc_length', 'tf_doc_len_ratio']].values
y = df['relevance'].values
qid = df['qid'].values

# Split the data into training and test sets
X_train, X_test, y_train, y_test, qid_train, qid_test = train_test_split(X, y, qid, test_size=0.2, random_state=42)

# Prepare the data for XGBoost
train_group = np.unique(qid_train, return_counts=True)[1]
test_group = np.unique(qid_test, return_counts=True)[1]

dtrain = xgb.DMatrix(X_train, label=y_train)
dtrain.set_group(train_group)

dtest = xgb.DMatrix(X_test, label=y_test)
dtest.set_group(test_group)

# Parameters for Learning to Rank
params = {
    'objective': 'rank:pairwise',  # Pairwise ranking objective
    'eta': 0.1,  # Learning rate
    'gamma': 1.0,  # Minimum loss reduction
    'min_child_weight': 0.1,  # Minimum sum of instance weight (hessian)
    'max_depth': 6,  # Maximum depth of the trees
    'eval_metric': 'ndcg',  # Evaluation metric
}

# Train the model
rank_model = xgb.train(params, dtrain, num_boost_round=100)

# Predict on test data
y_pred = rank_model.predict(dtest)

# Evaluate using NDCG and MRR
def calculate_mrr(y_true, y_pred, qid):
    reciprocal_ranks = []
    for q in np.unique(qid):
        true_relevance = y_true[qid == q]
        predicted_scores = y_pred[qid == q]
        ranked_indices = np.argsort(predicted_scores)[::-1]
        sorted_true_relevance = true_relevance[ranked_indices]
        relevant_doc_indices = np.where(sorted_true_relevance > 0)[0]
        if len(relevant_doc_indices) > 0:
            reciprocal_ranks.append(1 / (relevant_doc_indices[0] + 1))
        else:
            reciprocal_ranks.append(0)
    return np.mean(reciprocal_ranks)

# NDCG evaluation
ndcg_score_value = ndcg_score([y_test], [y_pred], k=10)
print(f"NDCG Score: {ndcg_score_value:.4f}")

# MRR evaluation
mrr_score_value = calculate_mrr(y_test, y_pred, qid_test)
print(f"MRR Score: {mrr_score_value:.4f}")


NDCG Score: 0.2303
MRR Score: 0.4007


#### Explanation:

**1. Data Generation**: We simulate a dataset with query IDs and document features such as term frequency, document length, and relevance. Each query is associated with multiple documents.

**2. Model Training**: We train an XGBoost model using the pairwise ranking objective `(rank:pairwise)`. XGBoost is suitable for LTR as it supports ranking objectives.

**3. Evaluation**:

    - We use the NDCG metric to evaluate the quality of the ranked list.
    - We compute MRR by evaluating the position of the first relevant document in the ranked list.

<div style="float: right; border: 1px solid black; display: inline-block; padding: 10px; text-align: center">
    <br>
    <br>
    <span style="font-weight: bold;">Signature of Lab Incharge</span>
    <br>
    <span>(Prof. Rupali Sharma)</span> 
</div>