The task is to develop a new ranking method. The requirements are as follows:
1. Use the provided example programs as a reference, such as
(1) Lecture_7_Beginning of Learning-to-Rank.ipynb
(2) Lecture_8_Ranknet.ipynb

2. Use the provided dataset (Manaba -> Lecture-6): vali_as_train.txt as the training data and test.txt as the test data.

3. Develop your own ranking method. PyTorch is recommended but not required.

4. Compute the nDCG score of your method on the test data (test.txt).

5. If you used any reference papers, please cite them at the end.

Note: 
(1) Please submit your work as a Jupyter Notebook file and include the necessary descriptions.
(2) If two students submit duplicate files, neither of them will receive a grade.
(3) Please make sure that your Jupyter Notebook file runs without bugs.

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [3]:
# The function for loading data
def load_LETOR4(file, num_features=46):
    '''
    :param file: the input file
    :param num_features: the number of features
    :return: the list of tuples, each tuple consists of qid, doc_reprs, doc_labels
    '''
  
    feature_cols = [str(f_index) for f_index in range(1, num_features + 1)]

    df = pd.read_csv(file, sep=" ", header=None)
    # drop the '#docid', 'inc', 'prob' key tokens and their '=' separators; the values are kept
    df.drop(columns=df.columns[[-2, -3, -5, -6, -8, -9]], inplace=True)
    assert num_features == len(df.columns) - 5

    for c in range(1, num_features + 2):  # qid and feature columns: keep only the value from 'key:value'
        df.iloc[:, c] = df.iloc[:, c].apply(lambda x: x.split(":")[1])

    df.columns = ['rele_truth', 'qid'] + feature_cols + ['#docid', 'inc', 'prob']

    for c in ['rele_truth'] + feature_cols:
        df[c] = df[c].astype(np.float32)

    df['rele_binary'] = (df['rele_truth'] > 0).astype(np.float32)  # additional binarized column for later filtering

    list_Qs = []
    qids = df.qid.unique()
    np.random.shuffle(qids)  # shuffle query order (unseeded, so the order differs between runs)
    for qid in qids:
        sorted_qdf = df[df.qid == qid].sort_values('rele_truth', ascending=False)

        doc_reprs = sorted_qdf[feature_cols].values
        doc_labels = sorted_qdf['rele_truth'].values

        list_Qs.append((qid, doc_reprs, doc_labels))

    #if buffer: pickle_save(list_Qs, file=perquery_file)

    return list_Qs
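
To make the column handling in load_LETOR4 easier to follow, here is a minimal sketch that parses a single line by hand. The sample line is made up for illustration and shortened to 3 features (the real files are assumed to carry 46); the helper name parse_letor_line is hypothetical and is not part of the lecture code.

```python
# A made-up line in the layout the loader appears to expect:
# <label> qid:<id> <k>:<value> ... #docid = <id> inc = <x> prob = <y>
sample_line = "2 qid:100 1:0.056537 2:0.000000 3:0.666667 #docid = DOC-0001 inc = 0.5 prob = 0.25"


def parse_letor_line(line, num_features=3):
    """Return (relevance label, qid, feature values) for one line (hypothetical helper)."""
    tokens = line.split(" ")
    rele = float(tokens[0])                      # graded relevance label
    qid = tokens[1].split(":")[1]                # 'qid:100' -> '100'
    features = [float(t.split(":")[1])           # '1:0.056537' -> 0.056537
                for t in tokens[2:2 + num_features]]
    return rele, qid, features


rele, qid, features = parse_letor_line(sample_line)
print(rele, qid, features)
```

The loader does the same thing column-wise with pandas: it splits on spaces, drops the key and '=' tokens of the trailing metadata, and strips the 'key:' prefix from the qid and feature columns.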

In [4]:
# File paths for a local Jupyter Notebook run
train_file = './vali_as_train.txt'
test_file = './test.txt'

train_list_Qs = load_LETOR4(file=train_file)
test_list_Qs = load_LETOR4(file=test_file)
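
Requirement 4 asks for the nDCG score on the test data. The following is a minimal sketch of nDCG@k over the (qid, doc_reprs, doc_labels) tuples produced by the loader. It assumes the common formulation with gain 2^label - 1 and discount log2(rank + 1), which may differ from the variant used in the lecture notebooks. The names dcg_at_k, ndcg_at_k, mean_ndcg, and score_fn are hypothetical, and the toy data is made up so the block runs on its own.

```python
import numpy as np


def dcg_at_k(labels_in_ranked_order, k):
    """DCG@k for labels already arranged in ranked order."""
    labels = np.asarray(labels_in_ranked_order, dtype=np.float64)[:k]
    gains = np.power(2.0, labels) - 1.0
    discounts = np.log2(np.arange(2, labels.size + 2))  # log2(rank + 1), rank starts at 1
    return float(np.sum(gains / discounts))


def ndcg_at_k(scores, labels, k=10):
    """nDCG@k for one query; returns None when no document is relevant."""
    scores = np.asarray(scores, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.float64)
    order = np.argsort(-scores, kind="stable")   # highest score first
    ideal_dcg = dcg_at_k(np.sort(labels)[::-1], k)
    if ideal_dcg == 0.0:
        return None                              # assumption: such queries are skipped
    return dcg_at_k(labels[order], k) / ideal_dcg


def mean_ndcg(list_Qs, score_fn, k=10):
    """Average nDCG@k over queries; score_fn maps a feature matrix to one score per document."""
    values = []
    for qid, doc_reprs, doc_labels in list_Qs:
        value = ndcg_at_k(score_fn(doc_reprs), doc_labels, k)
        if value is not None:
            values.append(value)
    return float(np.mean(values)) if values else 0.0


# Toy data in the same tuple layout as the loader output (values are made up).
toy_list_Qs = [
    ("q1", np.array([[0.9], [0.1], [0.5]], dtype=np.float32), np.array([2.0, 0.0, 1.0], dtype=np.float32)),
    ("q2", np.array([[0.2], [0.8]], dtype=np.float32), np.array([0.0, 0.0], dtype=np.float32)),
]
toy_ndcg = mean_ndcg(toy_list_Qs, lambda X: X[:, 0], k=3)  # score = first feature
print(toy_ndcg)
```

With a trained model, score_fn would wrap the model's forward pass and mean_ndcg would be called on test_list_Qs. One caveat: load_LETOR4 sorts the documents of each query by rele_truth in descending order, so a scorer that outputs many tied scores can look better than it is when ties are broken by the stored order.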