# Example: How to call recall_at_k for a binary classification ranking model

In this example we create a model to rank LETOR04 MQ2008 dataset using xgboost XGBRanker. The below model utilizes code from https://github.com/dmlc/xgboost/tree/master/demo/rank

We start with installing metriks

In [1]:
!pip install metriks



In [2]:
import random
import numpy as np
import xgboost as xgb
from sklearn.datasets import load_svmlight_file
from sklearn.utils.extmath import softmax
from xgboost import DMatrix
from metriks.ranking import recall_at_k

We download MQ2008 data, unzip the same and copy the train, test and valid from Fold1 to the current folder.

In [3]:
!wget https://s3-us-west-2.amazonaws.com/xgboost-examples/MQ2008.rar
!unrar x MQ2008.rar
!mv -f MQ2008/Fold1/*.txt .

--2020-10-02 01:46:53--  https://s3-us-west-2.amazonaws.com/xgboost-examples/MQ2008.rar
Resolving s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)... 52.218.250.248
Connecting to s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)|52.218.250.248|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 15448795 (15M) [application/x-rar-compressed]
Saving to: ‘MQ2008.rar’


2020-10-02 01:46:55 (9.01 MB/s) - ‘MQ2008.rar’ saved [15448795/15448795]


UNRAR 5.50 freeware      Copyright (c) 1993-2017 Alexander Roshal


Extracting from MQ2008.rar

Creating    MQ2008                                                    OK
Creating    MQ2008/Fold1                                              OK
Extracting  MQ2008/Fold1/test.txt                                          0%  1%  2%  OK 
Extracting  MQ2008/Fold1/train.txt                                         2%  3%  4%  5%  6%  7%  8%  OK 
Extracting  MQ2008/Fold1/vali.tx

In [4]:
#confirm that train, test and vali are copied
!ls

MQ2008	MQ2008.rar  sample_data  test.txt  train.txt  vali.txt


MQ2008 - Million Query Track of TREK 2008 (https://arxiv.org/abs/1306.2597)
Data structure: 
Each row signifies query document pair (how relevant a document is for a query)
 - Column 1: relavance label
 - Column 2: Query id (We will call this as group)
 - rest of the columns: features that have been normalized
 - last column: comment

## Data grouping
Train, valid and test data are grouped as 
X => contains features
y => contains labels (relevance label)
group => contains query ids

In [5]:
#The below code is from xgboost
def save_data(group_data,output_feature,output_group):
    if len(group_data) == 0:
        return

    output_group.write(str(len(group_data))+"\n")
    for data in group_data:
        # only include nonzero features
        feats = [ p for p in data[2:] if float(p.split(':')[1]) != 0.0 ]        
        output_feature.write(data[0] + " " + " ".join(feats) + "\n")

def transform(input_file_name, output_file_name, output_group_name):

    fi = open(input_file_name)
    output_feature = open(output_file_name,"w")
    output_group = open(output_group_name,"w")
    
    group_data = []
    group = ""
    for line in fi:
        if not line:
            break
        if "#" in line: ##docid = GX004-93-7097963 i
            line = line[:line.index("#")]
        splits = line.strip().split(" ")
        if splits[1] != group:
            save_data(group_data,output_feature,output_group)
            group_data = []
        group = splits[1]
        group_data.append(splits)

    save_data(group_data,output_feature,output_group)

    fi.close()
    output_feature.close()
    output_group.close()

*transform* separates group related details from other details
group file contains number of items in each group. For instance if the entries are as below

0, group1, features
1, group1, features
0, group1, features
0, group2, features
1, group2, features

then group file will contain values [3,2] (3 entries of group1, 2 entries of group2)

In [6]:
transform("train.txt", "mq2008.train", "mq2008.train.group")
transform("vali.txt","mq2008.valid", "mq2008.valid.group")
transform("test.txt","mq2008.test", "mq2008.test.group")

relevance labels are separated as y values

In [7]:
x_train, y_train = load_svmlight_file("mq2008.train")
x_valid, y_valid = load_svmlight_file("mq2008.valid")
x_test, y_test = load_svmlight_file("mq2008.test")

metrix recall method is relevant for binary classification ranking. However, relevance labels in data set range from 0 to 2. Relevance label signifies how relevant the document is for the query. 0 signifies no relevance and anything greater than 0 is relevant. So for our purpose, we convert all values of label 2 to 1 to signify relevance.

In [8]:
y_train = np.where(y_train == 2, 1, y_train)
y_valid = np.where(y_valid == 2, 1, y_valid)
y_test = np.where(y_test == 2, 1, y_test)

Now datastructure for group is constructed as below

In [9]:
group_train = []
with open("mq2008.train.group", "r") as f:
    data = f.readlines()
    for line in data:
        group_train.append(int(line.split("\n")[0]))
group_valid = []
with open("mq2008.valid.group", "r") as f:
    data = f.readlines()
    for line in data:
        group_valid.append(int(line.split("\n")[0]))

group_test = []
with open("mq2008.test.group", "r") as f:
    data = f.readlines()
    for line in data:
        group_test.append(int(line.split("\n")[0]))

## Model
We use XGBRanker to create a model. Note that XGBRanker requires the group information to be sent as well. Most of the hyper parameters are retained as is from xgboost documentation

In [10]:
params = {'objective': 'rank:ndcg', 'learning_rate': 0.1,
          'gamma': 0.1, 'min_child_weight': 0.1,
          'max_depth': 6, 'n_estimators': 300, 'early_stopping_rounds': 5}
model = xgb.sklearn.XGBRanker(**params)
model.fit(x_train, y_train, group_train, verbose=True,
          eval_set=[(x_valid, y_valid)], eval_group=[group_valid])

[0]	eval_0-map:0.711908
[1]	eval_0-map:0.715253
[2]	eval_0-map:0.711601
[3]	eval_0-map:0.71427
[4]	eval_0-map:0.716693
[5]	eval_0-map:0.718969
[6]	eval_0-map:0.713003
[7]	eval_0-map:0.713006
[8]	eval_0-map:0.710418
[9]	eval_0-map:0.713416
[10]	eval_0-map:0.710826
[11]	eval_0-map:0.709178
[12]	eval_0-map:0.708738
[13]	eval_0-map:0.710649
[14]	eval_0-map:0.709609
[15]	eval_0-map:0.712088
[16]	eval_0-map:0.71396
[17]	eval_0-map:0.717295
[18]	eval_0-map:0.715476
[19]	eval_0-map:0.714976
[20]	eval_0-map:0.716271
[21]	eval_0-map:0.716302
[22]	eval_0-map:0.716862
[23]	eval_0-map:0.721199
[24]	eval_0-map:0.721671
[25]	eval_0-map:0.720256
[26]	eval_0-map:0.719076
[27]	eval_0-map:0.72268
[28]	eval_0-map:0.72272
[29]	eval_0-map:0.72302
[30]	eval_0-map:0.725206
[31]	eval_0-map:0.725265
[32]	eval_0-map:0.725606
[33]	eval_0-map:0.723357
[34]	eval_0-map:0.727385
[35]	eval_0-map:0.725793
[36]	eval_0-map:0.725981
[37]	eval_0-map:0.72468
[38]	eval_0-map:0.725571
[39]	eval_0-map:0.726212
[40]	eval_0-map:

XGBRanker(base_score=0.5, booster='gbtree', colsample_bylevel=1,
          colsample_bynode=1, colsample_bytree=1, early_stopping_rounds=5,
          gamma=0.1, learning_rate=0.1, max_delta_step=0, max_depth=6,
          min_child_weight=0.1, missing=None, n_estimators=300, n_jobs=-1,
          nthread=None, objective='rank:ndcg', random_state=0, reg_alpha=0,
          reg_lambda=1, scale_pos_weight=1, seed=None, silent=None, subsample=1,
          verbosity=1)

Interestingly XGBRanker model does not expect group data for test. Once prediction is done, it is important to look at the data by group because the predictions are relevant only within that group

In [11]:
pred = model.predict(x_test)

print(pred[0:group_test[0]])

[ 1.487788   -4.6278214   1.1964784   1.3164926  -0.68694735 -1.1854128
 -3.9207067  -4.8447275 ]


## recall_at_k metrics
Now it is time to call recall_at_k. 
- Note that each group has varied set of documents so the recall can be made only by group.
- Also predictions can range from negative to positive since it reflects the rank of documents. In order to get the recall we need probabilities, hence we softmax the prediction output
- We also used k=6 when calling recall_at_k. You can call any number of relevant documents as it seems logical 

In [12]:
start = 0
recall = []
for group in group_test:
  softmax_pred = softmax([pred[start:start+group]])
  y_true = np.expand_dims(y_test[start:start+group],0)
  recall.append(recall_at_k(y_true,softmax_pred,6))

Now look at the output for random 5 groups

In [13]:
import random
rand_num = random.sample(range(len(recall)) ,5)
for i in rand_num:
  print(f'recall for group{i+1} is: {recall[i]}')

recall for group46 is: 0.6666666666666666
recall for group37 is: 1.0
recall for group2 is: 0.10526315789473684
recall for group155 is: 0.2857142857142857
recall for group16 is: 1.0
