## RankLib: Learning-to-Rank Library

RankLib is a Java library of learning-to-rank algorithms. This notebook runs its command-line JAR (`RankLib-2.18.jar`) to train rankers, evaluate them on held-out data, and score documents with a saved model.

### Training and Testing Models

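The cells below train and test learning-to-rank models on three feature sets, optimizing NDCG@10 on the training data (`-metric2t`) and reporting ERR@10 on the test data (`-metric2T`), with a train-validation split of 0.2 (`-tvs .2`). The command template uses `-ranker 8` (Random Forests), while the training cells shown pass `-ranker 6` (LambdaMART). The three feature sets are: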

1. `linguistic_sentiment` model
2. `sbert` model
3. `sentiment_sarcasm` model

```bash
java -jar ../ranking-model/RankLib-2.18.jar -train <train_data> -test <test_data> -tvs .2 -ranker 8 -metric2t NDCG@10 -metric2T ERR@10 -save <model_path>
```
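
Because the training command differs between feature sets only in its data and model paths, it can be assembled programmatically. The following is a minimal sketch, not the notebook's own code: the helper name `build_train_command` is hypothetical, the directory layout is inferred from the paths used in this notebook, the model file name in the loop is a placeholder, and only flags that appear in the commands here are used.

```python
from pathlib import Path

# Paths taken from the commands in this notebook.
RANKLIB_JAR = "../ranking-model/RankLib-2.18.jar"
DATA_ROOT = Path("../Data/svm_format_2020_2021")


def build_train_command(feature_set, train_year, test_year, model_path, ranker=6):
    """Return a RankLib training command as an argument list (hypothetical helper)."""
    data_dir = DATA_ROOT / f"{feature_set}_merged"
    return [
        "java", "-jar", RANKLIB_JAR,
        "-train", (data_dir / f"merged_data_{train_year}.svmrank").as_posix(),
        "-test", (data_dir / f"merged_data_{test_year}.svmrank").as_posix(),
        "-tvs", ".2",
        "-ranker", str(ranker),      # 6 = LambdaMART, 8 = Random Forests
        "-metric2t", "NDCG@10",      # metric optimized on the training data
        "-metric2T", "ERR@10",       # metric reported on the test data
        "-save", model_path,
    ]


if __name__ == "__main__":
    for feature_set in ("linguistic_sentiment", "sbert", "sentiment_sarcasm"):
        # The model file name below is a placeholder, not the notebook's naming scheme.
        cmd = build_train_command(feature_set, 2020, 2021, f"{feature_set}_model.txt")
        print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` instead of using the `!java` shell escape, which avoids quoting problems if a path contains spaces.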

In [15]:
!java -jar ../ranking-model/RankLib-2.18.jar


Usage: java -jar RankLib.jar <Params>
Params:
  [+] Training (+ tuning and evaluation)
	-train <file>		Training data
	-ranker <type>		Specify which ranking algorithm to use
				0: MART (gradient boosted regression tree)
				1: RankNet
				2: RankBoost
				3: AdaRank
				4: Coordinate Ascent
				6: LambdaMART
				7: ListNet
				8: Random Forests
				9: Linear regression (L2 regularization)
	[ -feature <file> ]	Feature description file: list features to be considered by the learner, each on a separate line
				If not specified, all features will be used.
	[ -metric2t <metric> ]	Metric to optimize on the training data.  Supported: MAP, NDCG@k, DCG@k, P@k, RR@k, ERR@k (default=ERR@10)
	[ -gmax <label> ]	Highest judged relevance label. It affects the calculation of ERR (default=4, i.e. 5-point scale {0,1,2,3,4})
	[ -qrel <file> ]	TREC-style relevance judgment file. It only affects MAP and NDCG (default=unspecified)
	[ -silent ]		Do not print progress messages (which are printed by default)
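
The usage text notes that `-gmax` affects the calculation of ERR. A small worked example makes the dependence concrete. The sketch below follows the commonly cited definition of Expected Reciprocal Rank, in which a label `g` is mapped to a stopping probability `(2^g - 1) / 2^gmax`; it is an illustration under that assumption and may differ in detail from RankLib's internal implementation. The function name `err_at_k` is hypothetical.

```python
def err_at_k(labels, k=10, gmax=4):
    """Expected Reciprocal Rank over the top-k relevance labels of one ranked list."""
    err = 0.0
    p_continue = 1.0  # probability that the user has not stopped before this rank
    for rank, label in enumerate(labels[:k], start=1):
        p_stop = (2 ** label - 1) / (2 ** gmax)  # depends on the highest label gmax
        err += p_continue * p_stop / rank
        p_continue *= 1.0 - p_stop
    return err


if __name__ == "__main__":
    print(err_at_k([4, 0, 0]))          # 0.9375
    print(err_at_k([0, 4]))             # 0.46875
    print(err_at_k([2, 0], gmax=2))     # 0.75
```

With the default `gmax=4`, a document labeled 2 contributes a stopping probability of only 3/16, whereas with `gmax=2` the same label contributes 3/4, so setting `-gmax` to match the label scale of the data matters.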


# Train 2020 Models 

In [16]:
!java -jar ../ranking-model/RankLib-2.18.jar -train ../Data/svm_format_2020_2021/linguistic_sentiment_merged/merged_data_2020.svmrank -test ../Data/svm_format_2020_2021/linguistic_sentiment_merged/merged_data_2021.svmrank -tvs .2 -ranker 6 -metric2t NDCG@10 -metric2T ERR@10 -save ../Data/trained_models_2020_2021/linguistic_sentiment_models/linguistic_sentiment__model_2020_LambdaMart.txt


Discard orig. features
Training data:	../Data/svm_format_2020_2021/linguistic_sentiment_merged/merged_data_2020.svmrank
Test data:	../Data/svm_format_2020_2021/linguistic_sentiment_merged/merged_data_2021.svmrank
Train-Validation split: 0.2
Feature vector representation: Dense.
Ranking method:	LambdaMART
Feature description file:	Unspecified. All features will be used.
Train metric:	NDCG@10
Test metric:	ERR@10
Highest relevance label (to compute ERR): 4
Feature normalization: No
Model file: ../Data/trained_models_2020_2021/linguistic_sentiment_models/linguistic_sentiment__model_2020_LambdaMart.txt

[+] LambdaMART's Parameters:
No. of trees: 1000
No. of leaves: 10
No. of threshold candidates: 256
Min leaf support: 1
Learning rate: 0.1
Stop early: 100 rounds without performance gain on validation data

Reading feature file [../Data/svm_format_2020_2021/linguistic_sentiment_merged/merged_data_2020.svmrank]... [Done.]            
(49 ranked lists, 1806 entries read)
Reading feature file [

In [17]:
!java -jar ../ranking-model/RankLib-2.18.jar -train ../Data/svm_format_2020_2021/sbert_merged/merged_data_2020.svmrank -test ../Data/svm_format_2020_2021/sbert_merged/merged_data_2021.svmrank -tvs .2 -ranker 6 -metric2t NDCG@10 -metric2T ERR@10 -save ../Data/trained_models_2020_2021/sbert_models/sbert_model_2020_LambdaMart.txt


Discard orig. features
Training data:	../Data/svm_format_2020_2021/sbert_merged/merged_data_2020.svmrank
Test data:	../Data/svm_format_2020_2021/sbert_merged/merged_data_2021.svmrank
Train-Validation split: 0.2
Feature vector representation: Dense.
Ranking method:	LambdaMART
Feature description file:	Unspecified. All features will be used.
Train metric:	NDCG@10
Test metric:	ERR@10
Highest relevance label (to compute ERR): 4
Feature normalization: No
Model file: ../Data/trained_models_2020_2021/sbert_models/sbert_model_2020_LambdaMart.txt

[+] LambdaMART's Parameters:
No. of trees: 1000
No. of leaves: 10
No. of threshold candidates: 256
Min leaf support: 1
Learning rate: 0.1
Stop early: 100 rounds without performance gain on validation data

Reading feature file [../Data/svm_format_2020_2021/sbert_merged/merged_data_2020.svmrank]... [Done.]            
(49 ranked lists, 3612 entries read)
Reading feature file [../Data/svm_format_2020_2021/sbert_merged/merged_data_2021.svmrank]... [Done

In [18]:
!java -jar ../ranking-model/RankLib-2.18.jar -train ../Data/svm_format_2020_2021/sentiment_sarcasm_merged/merged_data_2020.svmrank -test ../Data/svm_format_2020_2021/sentiment_sarcasm_merged/merged_data_2021.svmrank -tvs .2 -ranker 6 -metric2t NDCG@10 -metric2T ERR@10 -save ../Data/trained_models_2020_2021/sentiment_sarcasm_models/sentiment_sarcasm_model_2020_LambdaMart.txt


Discard orig. features
Training data:	../Data/svm_format_2020_2021/sentiment_sarcasm_merged/merged_data_2020.svmrank
Test data:	../Data/svm_format_2020_2021/sentiment_sarcasm_merged/merged_data_2021.svmrank
Train-Validation split: 0.2
Feature vector representation: Dense.
Ranking method:	LambdaMART
Feature description file:	Unspecified. All features will be used.
Train metric:	NDCG@10
Test metric:	ERR@10
Highest relevance label (to compute ERR): 4
Feature normalization: No
Model file: ../Data/trained_models_2020_2021/sentiment_sarcasm_models/sentiment_sarcasm_model_2020_LambdaMart.txt

[+] LambdaMART's Parameters:
No. of trees: 1000
No. of leaves: 10
No. of threshold candidates: 256
Min leaf support: 1
Learning rate: 0.1
Stop early: 100 rounds without performance gain on validation data

Reading feature file [../Data/svm_format_2020_2021/sentiment_sarcasm_merged/merged_data_2020.svmrank]... [Done.]            
(49 ranked lists, 1806 entries read)
Reading feature file [../Data/svm_form
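
The training runs above optimize NDCG@10. As a reference for what that metric measures, here is a minimal sketch of NDCG@k for a single ranked list. It assumes the common exponential gain `2^rel - 1` and a `log2(rank + 1)` discount; whether RankLib uses exactly this formulation is an assumption, and the function names are hypothetical.

```python
import math


def dcg_at_k(labels, k=10):
    """Discounted cumulative gain of the first k labels, in ranked order."""
    return sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(labels[:k]))


def ndcg_at_k(labels, k=10):
    """DCG@k divided by the DCG@k of the ideal (label-sorted) ordering."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(labels, k) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    print(ndcg_at_k([3, 2, 1]))  # already ideally ordered
    print(ndcg_at_k([0, 1]))     # the relevant document is ranked second
```

A list that is already sorted by label scores 1.0, and a list with no relevant documents is assigned 0.0 here by convention.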

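## Rank 2021 Data

Note that the cells below load `*_model_2020.txt` files rather than the `*_model_2020_LambdaMart.txt` files saved by the training cells above, and RankLib reports the loaded models as Random Forests. The score files written here therefore contain Random Forests scores, despite the `_LambdaMart` suffix in their names.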

In [19]:
!java -jar ../ranking-model/RankLib-2.18.jar -load ../Data/trained_models_2020_2021/linguistic_sentiment_models/linguistic_sentiment__model_2020.txt -rank ../Data/svm_format_2020_2021/linguistic_sentiment_merged/merged_data_2021.svmrank -score ../Data/scorefiles_2020_2021/linguistic_sentiment_scorefiles/argument2021_scorefile_LambdaMart.txt


Discard orig. features
Model file:	../Data/trained_models_2020_2021/linguistic_sentiment_models/linguistic_sentiment__model_2020.txt
Feature normalization: No
Model:		Random Forests
Reading feature file [../Data/svm_format_2020_2021/linguistic_sentiment_merged/merged_data_2021.svmrank]... [Done.]            
(50 ranked lists, 3442 entries read)
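
RankLib reports the number of ranked lists and entries it reads from each `.svmrank` file. To sanity-check an input file against those counts, the sketch below tallies them, assuming the usual SVMrank/LETOR line layout `<label> qid:<id> <feature>:<value> ... # comment` and assuming that a new ranked list starts whenever the `qid` changes between consecutive lines. The function name and the sample lines are illustrative, not taken from the notebook's data.

```python
def count_ranked_lists(lines):
    """Return (number of ranked lists, number of entries) for SVMrank-style lines."""
    n_lists = 0
    n_entries = 0
    previous_qid = None
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop a trailing comment, if any
        if not line:
            continue
        tokens = line.split()
        if len(tokens) < 2 or not tokens[1].startswith("qid:"):
            raise ValueError(f"not an SVMrank line: {line!r}")
        if tokens[1] != previous_qid:  # a change of qid starts a new ranked list
            n_lists += 1
            previous_qid = tokens[1]
        n_entries += 1
    return n_lists, n_entries


if __name__ == "__main__":
    # Made-up sample lines for illustration only.
    sample = [
        "2 qid:1 1:0.5 2:0.1 # docA",
        "0 qid:1 1:0.2 2:0.9 # docB",
        "1 qid:2 1:0.3 2:0.4 # docC",
    ]
    print(count_ranked_lists(sample))  # (2, 3)
```

For a real file, pass an open file handle, for example `count_ranked_lists(open(path, encoding="utf-8"))`.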


In [20]:
!java -jar ../ranking-model/RankLib-2.18.jar -load ../Data/trained_models_2020_2021/sbert_models/sbert_model_2020.txt -rank ../Data/svm_format_2020_2021/sbert_merged/merged_data_2021.svmrank -score ../Data/scorefiles_2020_2021/sbert_scorefiles/argument2021_scorefile_LambdaMart.txt


Discard orig. features
Model file:	../Data/trained_models_2020_2021/sbert_models/sbert_model_2020.txt
Feature normalization: No
Model:		Random Forests
Reading feature file [../Data/svm_format_2020_2021/sbert_merged/merged_data_2021.svmrank]... [Done.]            
(50 ranked lists, 6884 entries read)


In [21]:
!java -jar ../ranking-model/RankLib-2.18.jar -load ../Data/trained_models_2020_2021/sentiment_sarcasm_models/sentiment_sarcasm_model_2020.txt -rank ../Data/svm_format_2020_2021/sentiment_sarcasm_merged/merged_data_2021.svmrank -score ../Data/scorefiles_2020_2021/sentiment_sarcasm_scorefiles/argument2021_scorefile_LambdaMart.txt


Discard orig. features
Model file:	../Data/trained_models_2020_2021/sentiment_sarcasm_models/sentiment_sarcasm_model_2020.txt
Feature normalization: No
Model:		Random Forests
Reading feature file [../Data/svm_format_2020_2021/sentiment_sarcasm_merged/merged_data_2021.svmrank]... [Done.]            
(50 ranked lists, 3442 entries read)


# Train 2021 Models

In [22]:
!java -jar ../ranking-model/RankLib-2.18.jar -train ../Data/svm_format_2020_2021/linguistic_sentiment_merged/merged_data_2021.svmrank -test ../Data/svm_format_2020_2021/linguistic_sentiment_merged/merged_data_2020.svmrank -tvs .2 -ranker 6 -metric2t NDCG@10 -metric2T ERR@10 -save ../Data/trained_models_2020_2021/linguistic_sentiment_models/linguistic_sentiment__model_2021__LambdaMart.txt


Discard orig. features
Training data:	../Data/svm_format_2020_2021/linguistic_sentiment_merged/merged_data_2021.svmrank
Test data:	../Data/svm_format_2020_2021/linguistic_sentiment_merged/merged_data_2020.svmrank
Train-Validation split: 0.2
Feature vector representation: Dense.
Ranking method:	LambdaMART
Feature description file:	Unspecified. All features will be used.
Train metric:	NDCG@10
Test metric:	ERR@10
Highest relevance label (to compute ERR): 4
Feature normalization: No
Model file: ../Data/trained_models_2020_2021/linguistic_sentiment_models/linguistic_sentiment__model_2021__LambdaMart.txt

[+] LambdaMART's Parameters:
No. of trees: 1000
No. of leaves: 10
No. of threshold candidates: 256
Min leaf support: 1
Learning rate: 0.1
Stop early: 100 rounds without performance gain on validation data

Reading feature file [../Data/svm_format_2020_2021/linguistic_sentiment_merged/merged_data_2021.svmrank]... [Done.]            
(50 ranked lists, 3442 entries read)
Reading feature file 

In [23]:
!java -jar ../ranking-model/RankLib-2.18.jar -train ../Data/svm_format_2020_2021/sbert_merged/merged_data_2021.svmrank -test ../Data/svm_format_2020_2021/sbert_merged/merged_data_2020.svmrank -tvs .2 -ranker 6 -metric2t NDCG@10 -metric2T ERR@10 -save ../Data/trained_models_2020_2021/sbert_models/sbert_model_2021__LambdaMart.txt


Discard orig. features
Training data:	../Data/svm_format_2020_2021/sbert_merged/merged_data_2021.svmrank
Test data:	../Data/svm_format_2020_2021/sbert_merged/merged_data_2020.svmrank
Train-Validation split: 0.2
Feature vector representation: Dense.
Ranking method:	LambdaMART
Feature description file:	Unspecified. All features will be used.
Train metric:	NDCG@10
Test metric:	ERR@10
Highest relevance label (to compute ERR): 4
Feature normalization: No
Model file: ../Data/trained_models_2020_2021/sbert_models/sbert_model_2021__LambdaMart.txt

[+] LambdaMART's Parameters:
No. of trees: 1000
No. of leaves: 10
No. of threshold candidates: 256
Min leaf support: 1
Learning rate: 0.1
Stop early: 100 rounds without performance gain on validation data

Reading feature file [../Data/svm_format_2020_2021/sbert_merged/merged_data_2021.svmrank]... [Done.]            
(50 ranked lists, 6884 entries read)
Reading feature file [../Data/svm_format_2020_2021/sbert_merged/merged_data_2020.svmrank]... [Don

In [24]:
!java -jar ../ranking-model/RankLib-2.18.jar -train ../Data/svm_format_2020_2021/sentiment_sarcasm_merged/merged_data_2021.svmrank -test ../Data/svm_format_2020_2021/sentiment_sarcasm_merged/merged_data_2020.svmrank -tvs .2 -ranker 6 -metric2t NDCG@10 -metric2T ERR@10 -save ../Data/trained_models_2020_2021/sentiment_sarcasm_models/sentiment_sarcasm_model_2021__LambdaMart.txt


Discard orig. features
Training data:	../Data/svm_format_2020_2021/sentiment_sarcasm_merged/merged_data_2021.svmrank
Test data:	../Data/svm_format_2020_2021/sentiment_sarcasm_merged/merged_data_2020.svmrank
Train-Validation split: 0.2
Feature vector representation: Dense.
Ranking method:	LambdaMART
Feature description file:	Unspecified. All features will be used.
Train metric:	NDCG@10
Test metric:	ERR@10
Highest relevance label (to compute ERR): 4
Feature normalization: No
Model file: ../Data/trained_models_2020_2021/sentiment_sarcasm_models/sentiment_sarcasm_model_2021__LambdaMart.txt

[+] LambdaMART's Parameters:
No. of trees: 1000
No. of leaves: 10
No. of threshold candidates: 256
Min leaf support: 1
Learning rate: 0.1
Stop early: 100 rounds without performance gain on validation data

Reading feature file [../Data/svm_format_2020_2021/sentiment_sarcasm_merged/merged_data_2021.svmrank]... [Done.]            
(50 ranked lists, 3442 entries read)
Reading feature file [../Data/svm_for

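## Rank 2020 Data

As in the 2021 ranking cells, the cells below load `*_model_2021.txt` files rather than the `*_model_2021__LambdaMart.txt` files saved by the training cells above, and RankLib reports the loaded models as Random Forests. The score files written here therefore contain Random Forests scores, despite the `_LambdaMart` suffix in their names.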

In [25]:
!java -jar ../ranking-model/RankLib-2.18.jar -load ../Data/trained_models_2020_2021/linguistic_sentiment_models/linguistic_sentiment__model_2021.txt -rank ../Data/svm_format_2020_2021/linguistic_sentiment_merged/merged_data_2020.svmrank -score ../Data/scorefiles_2020_2021/linguistic_sentiment_scorefiles/argument2020_scorefile_LambdaMart.txt


Discard orig. features
Model file:	../Data/trained_models_2020_2021/linguistic_sentiment_models/linguistic_sentiment__model_2021.txt
Feature normalization: No
Model:		Random Forests
Reading feature file [../Data/svm_format_2020_2021/linguistic_sentiment_merged/merged_data_2020.svmrank]... [Done.]            
(49 ranked lists, 1806 entries read)


In [26]:
!java -jar ../ranking-model/RankLib-2.18.jar -load ../Data/trained_models_2020_2021/sbert_models/sbert_model_2021.txt -rank ../Data/svm_format_2020_2021/sbert_merged/merged_data_2020.svmrank -score ../Data/scorefiles_2020_2021/sbert_scorefiles/argument2020_scorefile_LambdaMart.txt


Discard orig. features
Model file:	../Data/trained_models_2020_2021/sbert_models/sbert_model_2021.txt
Feature normalization: No
Model:		Random Forests
Reading feature file [../Data/svm_format_2020_2021/sbert_merged/merged_data_2020.svmrank]... [Done.]            
(49 ranked lists, 3612 entries read)


In [27]:
!java -jar ../ranking-model/RankLib-2.18.jar -load ../Data/trained_models_2020_2021/sentiment_sarcasm_models/sentiment_sarcasm_model_2021.txt -rank ../Data/svm_format_2020_2021/sentiment_sarcasm_merged/merged_data_2020.svmrank -score ../Data/scorefiles_2020_2021/sentiment_sarcasm_scorefiles/argument2020_scorefile_LambdaMart.txt


Discard orig. features
Model file:	../Data/trained_models_2020_2021/sentiment_sarcasm_models/sentiment_sarcasm_model_2021.txt
Feature normalization: No
Model:		Random Forests
Reading feature file [../Data/svm_format_2020_2021/sentiment_sarcasm_merged/merged_data_2020.svmrank]... [Done.]            
(49 ranked lists, 1806 entries read)
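
The `-score` option writes one score per entry to the given file. To turn such a file into per-query rankings, the sketch below groups the scores by query and sorts them in descending order. It assumes each line holds three whitespace-separated fields (query id, index of the document within its ranked list, score); that layout should be checked against the actual score files before relying on it. The function name and the sample lines are hypothetical.

```python
from collections import defaultdict


def rank_from_scores(lines):
    """Map each query id to its (doc_index, score) pairs, best score first."""
    by_query = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:  # skip blank or malformed lines
            continue
        qid, doc_index, score = parts
        by_query[qid].append((int(doc_index), float(score)))
    return {
        qid: sorted(docs, key=lambda d: d[1], reverse=True)
        for qid, docs in by_query.items()
    }


if __name__ == "__main__":
    # Made-up sample lines for illustration only.
    sample = ["1\t0\t0.2", "1\t1\t0.9", "2\t0\t-0.1"]
    print(rank_from_scores(sample))
```

The document index refers to the position of the entry within its query in the `.svmrank` input, so mapping back to document identifiers requires reading that file in the same order.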
