# Setting Up the Python Path

The following code adds the project root (the parent of the current working directory) and its `src` subdirectory to Python's module search path, `sys.path`, so that the project's modules and packages can be imported in the current session.



In [None]:
import os
import sys

# Get the current working directory
current_dir = os.getcwd()

# Get the parent directory of the current directory
root_dir = os.path.abspath(os.path.join(current_dir, os.pardir))
src_dir = os.path.join(root_dir, 'src')

if root_dir not in sys.path:
    sys.path.append(root_dir)

if src_dir not in sys.path:
    sys.path.append(src_dir)
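
The same idea can be wrapped in a small helper so the membership check is not repeated for each directory. This is a minimal sketch; `add_to_sys_path` is a hypothetical name and not part of the project.

```python
import os
import sys


def add_to_sys_path(*directories):
    """Append each directory to sys.path once; return the absolute paths that were added."""
    added = []
    for directory in directories:
        absolute = os.path.abspath(directory)
        if absolute not in sys.path:
            sys.path.append(absolute)
            added.append(absolute)
    return added


# Assumes the notebook lives one level below the project root, as in the cell above.
root_dir = os.path.abspath(os.path.join(os.getcwd(), os.pardir))
add_to_sys_path(root_dir, os.path.join(root_dir, "src"))
```

Calling the helper a second time with the same directories leaves `sys.path` unchanged.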

# Installation of Dependencies

Run `pip install -r requirements.txt` to install the Python packages listed in `requirements.txt`, which defines the dependencies the project needs to function correctly.

# Importing Modules from the `rapython` Package

In this section, we import the main modules of the `src.rapython` package. They provide evaluation metrics together with semi-supervised, supervised, and unsupervised rank aggregation (RA) algorithms.

- `evaluation`: Contains functions and classes for performance evaluation.
- `semi`: Implements semi-supervised rank aggregation algorithms.
- `supervised`: Includes supervised rank aggregation algorithms.
- `unsupervised`: Provides unsupervised rank aggregation algorithms.


In [None]:
import glob

import pandas as pd  # used by the evaluation cells later in the notebook

from src.rapython.evaluation import *
from src.rapython.semi import *
from src.rapython.supervised import *
from src.rapython.unsupervised import *
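
For readers new to rank aggregation, the toy sketch below shows the idea behind one classic unsupervised method, the Borda count: each input ranking awards points by position, and items are ordered by their total points. It is an illustration only. `toy_borda` is a hypothetical helper, its scoring convention (n points for the top item down to 1 for the last) is one common variant, and it is not the implementation behind the package's `bordacount`.

```python
from collections import defaultdict


def toy_borda(rankings):
    """Aggregate full rankings (best item first) with a simple Borda count."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            # The top item earns n points, the last item earns 1 point.
            scores[item] += n - position
    # Sort by descending score; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


voters = [
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["a", "c", "b"],
]
print(toy_borda(voters))  # ['a', 'b', 'c'] (scores: a=8, b=6, c=4)
```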

# Running Unsupervised RA Methods

The following code sets the path to the input dataset file and the output directory used by the unsupervised methods.

In [2]:
input_file_path = '..\\test\\full_lists\\data\\simulation_test.csv'
output_base_path = '..\\test\\full_lists\\results'
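
The methods in this notebook write their results into `output_base_path`, so it can help to make sure that directory exists before running them. The sketch below is an assumption about a convenient setup step, not something the package is known to require. `ensure_dir` is a hypothetical helper, and the paths are assembled with `os.path.join` so they also resolve on non-Windows systems.

```python
import os

# Same locations as in the cell above, assembled from path components.
data_dir = os.path.join("..", "test", "full_lists", "data")
results_dir = os.path.join("..", "test", "full_lists", "results")
input_file_path = os.path.join(data_dir, "simulation_test.csv")


def ensure_dir(path):
    """Create the directory (and any missing parents) if needed; return the path."""
    os.makedirs(path, exist_ok=True)
    return path


# Example usage (not executed here): output_base_path = ensure_dir(results_dir)
```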

This cell calls each unsupervised rank aggregation function in turn and prints status messages to the console. For each function, it builds the output file path from `output_base_path` and the function's name, then calls the function with `input_file_path` and that output path. Some functions take an extra argument, such as `InputType.RANK` or `mc_type=McType.MC1`. A message is printed after each call completes, and a final message confirms that all functions have been processed.


In [None]:


print("Starting to process all functions...\n")

print("Calling bordacount()...")
output_file_path = os.path.join(output_base_path, f"{bordacount.__name__}.csv")
bordacount(input_file_path, output_file_path)
print("Finished bordacount()\n")

print("Calling borda_score()...")
output_file_path = os.path.join(output_base_path, f"{borda_score.__name__}.csv")
borda_score(input_file_path, output_file_path)
print("Finished borda_score()\n")

print("Calling cg()...")
output_file_path = os.path.join(output_base_path, f"{cg.__name__}.csv")
cg(input_file_path, output_file_path)
print("Finished cg()\n")

print("Calling combanz()...")
output_file_path = os.path.join(output_base_path, f"{combanz.__name__}.csv")
combanz(input_file_path, output_file_path)
print("Finished combanz()\n")

print("Calling combmax()...")
output_file_path = os.path.join(output_base_path, f"{combmax.__name__}.csv")
combmax(input_file_path, output_file_path)
print("Finished combmax()\n")

print("Calling combmed()...")
output_file_path = os.path.join(output_base_path, f"{combmed.__name__}.csv")
combmed(input_file_path, output_file_path)
print("Finished combmed()\n")

print("Calling combmin()...")
output_file_path = os.path.join(output_base_path, f"{combmin.__name__}.csv")
combmin(input_file_path, output_file_path)
print("Finished combmin()\n")

print("Calling combsum()...")
output_file_path = os.path.join(output_base_path, f"{combsum.__name__}.csv")
combsum(input_file_path, output_file_path)
print("Finished combsum()\n")

print("Calling dibra()...")
output_file_path = os.path.join(output_base_path, f"{dibra.__name__}.csv")
dibra(input_file_path, output_file_path, InputType.RANK)
print("Finished dibra()\n")

print("Calling dowdall()...")
output_file_path = os.path.join(output_base_path, f"{dowdall.__name__}.csv")
dowdall(input_file_path, output_file_path)
print("Finished dowdall()\n")

print("Calling er()...")
output_file_path = os.path.join(output_base_path, f"{er.__name__}.csv")
er(input_file_path, output_file_path, InputType.RANK)
print("Finished er()\n")

print("Calling hpa()...")
output_file_path = os.path.join(output_base_path, f"{hpa.__name__}.csv")
hpa(input_file_path, output_file_path, InputType.RANK)
print("Finished hpa()\n")

print("Calling irank()...")
output_file_path = os.path.join(output_base_path, f"{irank.__name__}.csv")
irank(input_file_path, output_file_path, InputType.RANK)
print("Finished irank()\n")

print("Calling markovchainmethod()...")
output_file_path = os.path.join(output_base_path, f"{markovchainmethod.__name__}.csv")
markovchainmethod(input_file_path, output_file_path, mc_type=McType.MC1)
print("Finished markovchainmethod()\n")

print("Calling mean()...")
output_file_path = os.path.join(output_base_path, f"{mean.__name__}.csv")
mean(input_file_path, output_file_path)
print("Finished mean()\n")

print("Calling median()...")
output_file_path = os.path.join(output_base_path, f"{median.__name__}.csv")
median(input_file_path, output_file_path)
print("Finished median()\n")

print("Calling mork_heuristic()...")
output_file_path = os.path.join(output_base_path, f"{mork_heuristic.__name__}.csv")
mork_heuristic(input_file_path, output_file_path)
print("Finished mork_heuristic()\n")

print("Calling postndcg()...")
output_file_path = os.path.join(output_base_path, f"{postndcg.__name__}.csv")
postndcg(input_file_path, output_file_path, input_type=InputType.RANK)
print("Finished postndcg()\n")

print("Calling rrf()...")
output_file_path = os.path.join(output_base_path, f"{rrf.__name__}.csv")
rrf(input_file_path, output_file_path)
print("Finished rrf()\n")

print("All functions processed successfully!")
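
Because every call above follows the same pattern, the cell could also be written as a loop over (function, extra keyword arguments) pairs. The following is a sketch under that assumption; `run_all` and the two stand-in functions are hypothetical and only mimic the `(input_path, output_path, ...)` calling convention used above.

```python
import os


def run_all(methods, input_path, output_dir):
    """Call each method with its own output CSV path and return the paths used."""
    outputs = []
    for func, extra_kwargs in methods:
        output_path = os.path.join(output_dir, f"{func.__name__}.csv")
        print(f"Calling {func.__name__}()...")
        func(input_path, output_path, **extra_kwargs)
        print(f"Finished {func.__name__}()\n")
        outputs.append(output_path)
    return outputs


# Stand-ins for real aggregation functions (hypothetical; they only record the call).
calls = []


def method_a(input_path, output_path):
    calls.append(("method_a", output_path))


def method_b(input_path, output_path, input_type=None):
    calls.append(("method_b", output_path, input_type))


written = run_all([(method_a, {}), (method_b, {"input_type": "RANK"})], "in.csv", "results")
```

With the real package, the list would hold entries such as `(bordacount, {})` and `(markovchainmethod, {"mc_type": McType.MC1})`. Functions that are called above with a positional third argument, such as `dibra`, would need a variant that also forwards positional extras.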


# Running Supervised RA Methods

This cell defines the file paths for the training and test datasets. `train_file_path` points to the CSV file containing the training data, and `train_rel_path` to the corresponding relevance data. Similarly, `test_file_path` points to the test dataset and `test_rel_path` to its relevance data. These paths are used later in the notebook for model training and evaluation.

In [None]:
train_file_path = '..\\test\\full_lists\\data\\simulation_train.csv'
train_rel_path = '..\\test\\full_lists\\data\\simulation_train_rel.csv'

test_file_path = '..\\test\\full_lists\\data\\simulation_test.csv'
test_rel_path = '..\\test\\full_lists\\data\\simulation_test_rel.csv'

This code snippet runs the supervised rank aggregation methods: AggRankDE, CRF, Weighted Borda, the two IRA variants, and QI_IRA.

1. **AggRankDE**: The code first initializes the AggRankDE method and then trains it using the training data provided by `train_file_path` and `train_rel_path`. After training, it tests the model on the test dataset, saving the results to `aggrankde.csv`.

2. **CRF**: Next, the CRF method is initialized and trained for a specified number of epochs (2 in this case). It is then tested on the test dataset, with results saved to `crf.csv`.

3. **Weighted Borda**: The Weighted Borda method is also initialized and trained using the same training data. The results of the testing phase are saved to `weightedborda.csv`.

4. **IRA Methods**: The snippet also calls two variants of the IRA method, selected with `MethodType.IRA_RANK` and `MethodType.IRA_SCORE`. Each variant is run on the test dataset together with its relevance data (`test_rel_path`), and results are saved to `ira_rank.csv` and `ira_score.csv` respectively.

5. **QI_IRA Method**: Finally, `qi_ira` is called with the same test data and the same numeric arguments, and results are saved to `qi_ira.csv`.

Throughout the execution, the process is logged to the console to indicate the initialization, training, and testing stages of each method.


In [None]:
print("Initializing AggRankDE...")
aggRankDE = AggRankDE()
print("AggRankDE initialized.")

print("Training AggRankDE...")
aggRankDE.train(train_file_path, train_rel_path, InputType.RANK)
print("AggRankDE training completed.")

print("Testing AggRankDE...")
test_output_path = os.path.join(output_base_path, 'aggrankde.csv')
aggRankDE.test(test_file_path, test_output_path)
print("AggRankDE testing completed.\n")

print("Initializing CRF...")
crf = CRF()
print("CRF initialized.")

print("Training CRF...")
crf.train(train_file_path, train_rel_path, InputType.RANK, epoch=2)
print("CRF training completed.")

print("Testing CRF...")
test_output_path = os.path.join(output_base_path, 'crf.csv')
crf.test(test_file_path, test_output_path)
print("CRF testing completed.\n")

print("Initializing WeightedBorda...")
weightborda = WeightedBorda()
print("WeightedBorda initialized.")

print("Training WeightedBorda...")
weightborda.train(train_file_path, train_rel_path)
print("WeightedBorda training completed.")

print("Testing WeightedBorda...")
test_output_path = os.path.join(output_base_path, 'weightedborda.csv')
weightborda.test(test_file_path, test_output_path)
print("WeightedBorda testing completed.\n")

print("Calling IRA method (IRA_RANK)...")
test_output_path = os.path.join(output_base_path, 'ira_rank.csv')
ira(test_file_path, test_output_path, test_rel_path, 3, 2, 0.02, MethodType.IRA_RANK, InputType.RANK)
print("IRA method (IRA_RANK) completed.")

print("Calling IRA method (IRA_SCORE)...")
test_output_path = os.path.join(output_base_path, 'ira_score.csv')
ira(test_file_path, test_output_path, test_rel_path, 3, 2, 0.02, MethodType.IRA_SCORE, InputType.RANK)
print("IRA method (IRA_SCORE) completed.\n")

print("Calling QI_IRA method...")
test_output_path = os.path.join(output_base_path, 'qi_ira.csv')
qi_ira(test_file_path, test_output_path, test_rel_path, 3, 2, 0.02, InputType.RANK)
print("QI_IRA method completed.")
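
The start and finish messages around each stage can also be produced by a small context manager, which makes it easy to report how long a stage took. This is a sketch only; `stage` is a hypothetical helper and not part of the package.

```python
import time
from contextlib import contextmanager


@contextmanager
def stage(label):
    """Print a message before the wrapped block and, on success, after it."""
    print(f"{label}...")
    start = time.perf_counter()
    yield
    # Reached only if the wrapped block did not raise an exception.
    print(f"{label} completed in {time.perf_counter() - start:.2f}s")


# Example with a placeholder workload instead of a real model.
with stage("Training placeholder model"):
    total = sum(range(1000))
```

With the models above, one might write `with stage("Training AggRankDE"):` around the call `aggRankDE.train(train_file_path, train_rel_path, InputType.RANK)`.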


# Running Semi-Supervised RA Methods

This code snippet demonstrates the initialization, training, and testing of the SSRA (Semi-Supervised Rank Aggregation) method.

1. **Initialization**: The SSRA method is initialized by creating an instance of the `SSRA` class. A message confirming the successful initialization is printed to the console.

2. **Training**: The method is then trained using the specified training dataset located at `train_file_path` and the corresponding relevance information at `train_rel_path`. The training is conducted with the input type set to rank (`InputType.RANK`). Upon completion, a message is printed to indicate that the training process is finished.

3. **Testing**: After training, the SSRA method is tested with the test dataset found at `test_file_path`. The results are saved to a CSV file named `ssra.csv` in the specified output directory. A message confirming the successful completion of the testing process is printed to the console.


In [None]:
print("Initializing SSRA...")
ssra = SSRA()
print("SSRA initialized.")

print("Training SSRA...")
ssra.train(train_file_path, train_rel_path, InputType.RANK)
print("SSRA training completed.")

print("Testing SSRA...")
test_output_path = os.path.join(output_base_path, 'ssra.csv')
ssra.test(test_file_path, test_output_path)
print("SSRA testing completed.")


# Evaluation of Algorithm Results

This code snippet sets up the evaluation of ranking results from CSV files stored in a specified folder.

1. **Folder Path Definition**: The variable `folder_path` is initialized with the path to the directory containing the result CSV files.

2. **Retrieving CSV Files**: Using the `glob` module, the code retrieves all files with a `.csv` extension from the specified folder. The file paths are stored in the `csv_files` list for further processing.

3. **Loading Relevance Scores**: The relevance scores for the test dataset are loaded from a CSV file located at `..\\test\\full_lists\\data\\simulation_test_rel.csv`. This file contains the relevance information that will be used in the evaluation.

4. **Evaluation Setup**: An instance of the `Evaluation` class is created to facilitate the assessment of the ranking results.

5. **Results DataFrame Initialization**: An empty DataFrame named `results_df` is created with specified columns: 'File Name', 'mAP@10', 'NDCG@10', and 'Rank@1'. This DataFrame will be used to store the evaluation results of the ranking methods applied to the CSV files.


In [None]:
folder_path = '..\\test\\full_lists\\results'

# Collect all .csv files in the folder
csv_files = glob.glob(os.path.join(folder_path, '*.csv'))

# Load the relevance score file
rel_data = pd.read_csv('..\\test\\full_lists\\data\\simulation_test_rel.csv', header=None)

evaluation = Evaluation()

# Create an empty DataFrame to store the evaluation results
results_df = pd.DataFrame(columns=['File Name', 'mAP@10', 'NDCG@10', 'Rank@1'])

This code iterates over a list of CSV files containing ranking results, evaluates each file using multiple metrics, and stores the results in both a DataFrame and an Excel file.

1. **File Processing Loop**: For each CSV file in `csv_files`, the code extracts the file name (without the extension) for identification in the evaluation report.

2. **Evaluation Metrics Calculation**: The code calculates three evaluation metrics:
   - `mAP@10` (Mean Average Precision at 10)
   - `NDCG@10` (Normalized Discounted Cumulative Gain at 10)
   - `Rank@1` (Rank position of the first relevant item)
   These metrics assess the accuracy and relevance of the ranking output based on `rel_data`.

3. **Results Storage**: Each file's evaluation metrics are printed and appended as a new row in `results_df`, which consolidates results across all processed files.

4. **Excel Output**: Once all files are evaluated, the accumulated results in `results_df` are saved as an Excel file at `..\\test\\full_lists\\results\\evaluation_results.xlsx`, enabling easy review and analysis of the evaluation results.
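
To make the first two metrics concrete, the sketch below computes average precision at k and NDCG at k for a single ranked list with binary relevance labels. These are common textbook definitions written for illustration. The function names are hypothetical, and the package's `Evaluation` class may differ in details such as normalization or the handling of graded relevance.

```python
import math


def average_precision_at_k(relevance, k):
    """AP@k for one ranked list; relevance[i] is 1 if the item at rank i+1 is relevant."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank
    # Normalize by the number of relevant items found in the top k (one common convention).
    return precision_sum / hits if hits else 0.0


def ndcg_at_k(relevance, k):
    """NDCG@k using the log2(rank + 1) discount."""
    def dcg(labels):
        return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(labels, start=1))

    ideal = dcg(sorted(relevance, reverse=True)[:k])
    return dcg(relevance[:k]) / ideal if ideal else 0.0


ranked_relevance = [1, 0, 1, 0]  # relevant items at ranks 1 and 3
print(average_precision_at_k(ranked_relevance, 10))  # (1/1 + 2/3) / 2
print(ndcg_at_k(ranked_relevance, 10))
```

Under these definitions, mAP@k is the mean of AP@k over all queries.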


In [None]:
for file in csv_files:
    # Get the file name without its extension
    file_name = os.path.splitext(os.path.basename(file))[0]

    # Load the CSV file
    result_data = pd.read_csv(file, header=None)

    # Compute the evaluation metrics
    map_at_10 = evaluation.eval_mean_average_precision(result_data, rel_data, 10)
    ndcg_at_10 = evaluation.eval_ndcg(result_data, rel_data, 10)
    rank_at_1 = evaluation.eval_rank(result_data, rel_data, 1)

    # Print the evaluation metrics
    print(f"{file_name}: mAP@10: {map_at_10}, NDCG@10: {ndcg_at_10}, Rank@1: {rank_at_1}")

    # Append the evaluation metrics to the DataFrame as a new row
    new_row = pd.DataFrame({
        'File Name': [file_name],
        'mAP@10': [map_at_10],
        'NDCG@10': [ndcg_at_10],
        'Rank@1': [rank_at_1]
    })

    results_df = pd.concat([results_df, new_row], ignore_index=True)

# Write the evaluation results to an Excel file
output_excel_path = '..\\test\\full_lists\\results\\evaluation_results.xlsx'
results_df.to_excel(output_excel_path, index=False, sheet_name='sheet1')

print(f"Evaluation results have been written to {output_excel_path}")
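
An alternative to concatenating onto `results_df` inside the loop is to collect one dictionary per file and build the DataFrame once at the end. This avoids repeated copying, and it sidesteps the warning that some recent pandas versions emit when concatenating with an empty frame. The sketch below uses placeholder names and numbers, not real evaluation results.

```python
import pandas as pd

# Placeholder inputs: in the notebook these values would come from the evaluation loop.
placeholder_metrics = {
    "method_a": (0.5, 0.6, 1.0),
    "method_b": (0.4, 0.7, 0.0),
}

rows = []
for file_name, (map_at_10, ndcg_at_10, rank_at_1) in placeholder_metrics.items():
    rows.append({
        "File Name": file_name,
        "mAP@10": map_at_10,
        "NDCG@10": ndcg_at_10,
        "Rank@1": rank_at_1,
    })

# Build the DataFrame once, with the same column order as in the notebook.
results_df = pd.DataFrame(rows, columns=["File Name", "mAP@10", "NDCG@10", "Rank@1"])
print(results_df)
```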
