# (Py)FLAGR

Fuse, Learn, AGgregate, Rerank

FLAGR is a high-performance, modular library for rank aggregation. To ensure the highest possible performance, the core FLAGR library is written in C++ and implements a wide collection of unsupervised rank aggregation methods. Its modular design allows third-party programmers to implement their own algorithms and easily rebuild the entire library. FLAGR can be built as a standard application or as a shared library (`so` or `dll`). In the second case, it can be linked from other C/C++ programs, or even from programs written in other languages (e.g. Python, PHP, etc.).

In this context, PyFLAGR is a Python library that links to FLAGR and allows a developer to exploit the efficient FLAGR implementations from a standard Python program.


## Web site

The library is fully documented at [http://flagr.mywork.gr/](http://flagr.mywork.gr/).

## Importing and using PyFLAGR

PyFLAGR groups its supported rank aggregation methods in 6 modules:

1. `Linear`: This module contains the `CombSUM`, `CombMNZ`, `BordaCount` and `SimpleBordaCount` classes. `CombSUM` and `CombMNZ` support five normalization methods (see Renda et al., 2003). `BordaCount` and `SimpleBordaCount` are just wrappers of `CombSUM` with `borda` and `simple-borda` normalization, respectively.
2. `Majoritarian`: Includes `CondorcetWinners`, `CopelandWinners` and `OutrankingApproach` (Outranking Approach of Farah and Vanderpooten, 2007).
3. `MarkovChains`: Implements the fourth and most popular Markov Chain method, termed `MC4` (Dwork et al., 2001). Future releases of FLAGR will include the other three Markov Chain methods.
4. `Kemeny`: Includes `KemenyOptimal` (Kemeny Optimal Aggregation).
5. `RRA`: Includes `RRA` (Robust Rank Aggregation of Kolde et al., 2012, in two variants).
6. `Weighted`: This module implements several self-weighting rank aggregation methods. These methods automatically identify the expert voters and include:
   1. `PreferenceRelationsGraph`: the Preference Relations Graph method of Desarkar et al., 2016.
   2. `Agglomerative`: the Agglomerative method of Chatterjee et al., 2018.
   3. `DIBRA`: the Iterative, Distance-Based method of Akritidis et al., 2022.

The following statements import all PyFLAGR rank aggregation modules in a typical Jupyter notebook.

In [1]:
import pyflagr.Linear as Linear
import pyflagr.Majoritarian as Majoritarian
import pyflagr.MarkovChains as MarkovChains
import pyflagr.Kemeny as Kemeny
import pyflagr.RRA as RRA
import pyflagr.Weighted as Weighted


In [2]:
# Code snippet for displaying dataframes side by side
from IPython.display import display_html
from itertools import chain, cycle

def display_side_by_side(*args, titles=cycle([''])):
    # Render each dataframe as an inline HTML table under its (optional) title
    html_str = ''
    for df, title in zip(args, chain(titles, cycle(['<br>']))):
        html_str += '<th style="text-align:center"><td style="vertical-align:top">'
        html_str += f'<h2 style="text-align: center;">{title}</h2>'
        html_str += df.to_html().replace('table', 'table style="display:inline"')
        html_str += '</td></th>'
    display_html(html_str, raw=True)

All PyFLAGR rank aggregation methods include:
* a standard class constructor: several hyper-parameters of the corresponding algorithm and other execution arguments can be passed through the constructor. All constructor inputs have default values and are therefore optional. This means that all constructors can be called without *any* argument at all.
* an `aggregate` method that runs the algorithm on the selected input and (optionally) evaluates the generated aggregate list. In all algorithms, the `aggregate` method accepts the following arguments:

| Parameter    | Type                                         | Default Value  | Values  |
| :----------- | :--------------------------------------------| :--------------| :------ |
| `input_file` | String - Required, unless `input_df` is set. | Empty String   | A CSV file that contains the input lists to be aggregated. |
| `input_df` | Pandas DataFrame - Required, unless `input_file` is set. | `None` | A Pandas DataFrame that contains the input lists to be aggregated. **Note:** If both `input_file` and `input_df` are set, only the former is used; the latter is ignored. |
| `rels_file`  | String, Optional. | Empty String | A CSV file that contains the relevance judgements of the involved list elements. If such a file is passed, FLAGR will evaluate the generated aggregate list/s by computing several retrieval effectiveness evaluation measures. The results of the evaluation will be stored in the `eval_df` DataFrame. Otherwise, no evaluation will take place and `eval_df` will be empty. Read more on the evaluation of rank aggregation quality. |
| `rels_df`    | Pandas DataFrame, Optional. | `None` | A Pandas DataFrame that contains the relevance judgements of the involved list elements. If such a dataframe is passed, FLAGR will evaluate the generated aggregate list/s by computing several retrieval effectiveness evaluation measures. The results of the evaluation will be stored in the `eval_df` DataFrame. Otherwise, no evaluation will take place and `eval_df` will be empty. Read more on the evaluation of rank aggregation quality. **Note:** If both `rels_file` and `rels_df` are set, only the former is used; the latter is ignored. |
| `output_dir` | String, Optional. | Temporary directory (OS-specific) | The directory where the output files (aggregate lists and evaluation) will be stored. If it is not set, the default location will be used. |
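
The evaluation measures reported in `eval_df` when relevance judgements are supplied (Precision and nDCG at a cut-off point) can be illustrated with a minimal sketch. It assumes binary relevance and the common `log2(rank + 1)` discount; the function names and the `num_relevant` argument are hypothetical and are not part of PyFLAGR, and FLAGR's internal formulas may differ in detail.

```python
import math

def precision_at_k(rels, k):
    """Fraction of the top-k items that are relevant (rels holds 0/1 flags)."""
    return sum(rels[:k]) / k

def ndcg_at_k(rels, k, num_relevant):
    """nDCG@k with binary gains and a log2(rank + 1) discount (assumed formula)."""
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    # Ideal ranking: all relevant items (up to k of them) placed at the top
    ideal = sum(1 / math.log2(i + 2) for i in range(min(k, num_relevant)))
    return dcg / ideal if ideal > 0 else 0.0

# Hypothetical relevance flags of the top-5 items of an aggregate list
top5 = [1, 1, 0, 0, 0]
for k in range(1, 6):
    p = precision_at_k(top5, k)
    n = ndcg_at_k(top5, k, num_relevant=10)
    print(f"P@{k}={p:.6f}  N@{k}={n:.6f}")
```

For the flags `[1, 1, 0, 0, 0]` this gives P@3 ≈ 0.667 and N@3 ≈ 0.765.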


## Input/Output files

Please refer to [this article](http://flagr.mywork.gr/docs/38/input-and-output-files).

## Code examples

The following examples demonstrate the usage of all PyFLAGR rank aggregation methods.

In [3]:
#lists = '/media/leo/B65266EC5266B1331/phd_Research/08 - Datasets/TREC/Synthetic/MOSO.csv'
#qrels = '/media/leo/B65266EC5266B1331/phd_Research/08 - Datasets/TREC/Synthetic/MOSO_qrels.csv'

lists = 'D:/phd_Research/08 - Datasets/TREC/Synthetic/MOSO.csv'
qrels = 'D:/phd_Research/08 - Datasets/TREC/Synthetic/MOSO_qrels.csv'


### Linear methods: CombSUM

Member of `pyflagr.Linear`.

The `CombSUM` constructor supports the following parameters:


| Parameter      | Type        | Default Value  | Values |
| :------------- | :---------- | :--------------| :------|
| `eval_pts` | Integer, Optional. Considered only if `rels_file` or `rels_df` is set. | 10 | Determines the elements in the aggregate list on which the evaluation measures (i.e. Precision, and nDCG) will be computed. For example, for `eval_pts=10` FLAGR will compute $P@1, P@2, ... P@10$, and $N@1, N@2, ..., N@10$. |
| `norm` | String, Optional. | `borda` | Rank or score normalization methods:<ul><li>`borda`: The aggregation is performed by normalizing the element *rankings* according to the Borda normalization method. Equivalent to the `BordaCount` function.</li><li>`rank`: The aggregation is performed by normalizing the element *rankings* according to the Rank normalization method.</li><li>`score`: The aggregation is performed by normalizing the element *scores* according to the Score normalization method.</li><li>`z-score`: The aggregation is performed by normalizing the element *scores* according to the Z-Score normalization method.</li><li>`simple-borda`: Similar to `borda` normalization but no partial score is assigned to an element if it is not ranked by a voter.</li></ul> |
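
To make the effect of `norm` more concrete, here is a minimal pure-Python sketch of CombSUM with rank normalization. It assumes the usual definition in which the item at 1-based rank `r` of a list of length `n` scores `1 - (r - 1) / n`; the helper names are hypothetical and the code is not FLAGR's implementation.

```python
from collections import defaultdict

def rank_normalize(ranked_items):
    """Score of the item at 0-based position pos in a list of length n: 1 - pos / n."""
    n = len(ranked_items)
    return {item: 1 - pos / n for pos, item in enumerate(ranked_items)}

def comb_sum(lists):
    """Sum each item's normalized scores over all input lists, best first."""
    totals = defaultdict(float)
    for ranked_items in lists:
        for item, score in rank_normalize(ranked_items).items():
            totals[item] += score
    # Sort by descending score; ties are broken alphabetically
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))

# Three hypothetical voters, each ranking the same items (best first)
voters = [["a", "b", "c", "d"], ["b", "a", "d", "c"], ["a", "c", "b", "d"]]
for item, score in comb_sum(voters):
    print(item, score)
```

Swapping `rank_normalize` for a score-based function (min-max or z-score over the voters' scores) would give the `score` and `z-score` variants in the same way.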


In [4]:
csum = Linear.CombSUM(norm='rank', eval_pts=5)

# In this case, rels_file has been specified, so PyFLAGR returns two non-blank dataframes:
# * df_out contains the aggregate list produced by the selected algorithm
# * df_eval contains the effectiveness evaluation based on the relevance judgments in the rels_file
df_out, df_eval = csum.aggregate(input_file=lists, rels_file=qrels)

display_side_by_side(df_out.head(20), df_eval, titles=['Aggregate list','Evaluation'])


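| | Query | Voter | ItemID | Score |
| :-- | :-- | :-- | :-- | :-- |
| 0 | 1 | PyFLAGR | Q1-E39 | 13.466667 |
| 1 | 1 | PyFLAGR | Q1-E48 | 12.933333 |
| 2 | 1 | PyFLAGR | Q1-E23 | 11.933333 |
| 3 | 1 | PyFLAGR | Q1-E85 | 11.4 |
| 4 | 1 | PyFLAGR | Q1-E95 | 11.366667 |
| 5 | 1 | PyFLAGR | Q1-E94 | 11.3 |
| 6 | 1 | PyFLAGR | Q1-E33 | 11.0 |
| 7 | 1 | PyFLAGR | Q1-E63 | 10.7 |
| 8 | 1 | PyFLAGR | Q1-E100 | 10.666667 |
| 9 | 1 | PyFLAGR | Q1-E5 | 10.433333 |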

| | Query | AvgPrecision | P@1 | P@2 | P@3 | P@4 | P@5 | N@1 | N@2 | N@3 | N@4 | N@5 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 0 | Topic 1 | 0.533976 | 1.0 | 1.0 | 0.666667 | 0.5 | 0.4 | 1.0 | 1.0 | 0.765361 | 0.636682 | 0.553146 |
| 1 | Topic 2 | 0.539673 | 0.0 | 0.5 | 0.666667 | 0.5 | 0.4 | 0.0 | 0.386853 | 0.530721 | 0.441492 | 0.383566 |
| 2 | Topic 3 | 0.442181 | 1.0 | 1.0 | 0.666667 | 0.5 | 0.4 | 1.0 | 1.0 | 0.765361 | 0.636682 | 0.553146 |
| 3 | Topic 4 | 0.427771 | 0.0 | 0.0 | 0.0 | 0.25 | 0.2 | 0.0 | 0.0 | 0.0 | 0.168128 | 0.146068 |
| 4 | Topic 5 | 0.656236 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| 5 | Topic 6 | 0.496233 | 0.0 | 0.5 | 0.333333 | 0.25 | 0.4 | 0.0 | 0.386853 | 0.296082 | 0.246302 | 0.345191 |
| 6 | Topic 7 | 0.413981 | 0.0 | 0.0 | 0.0 | 0.0 | 0.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.131205 |
| 7 | Topic 8 | 0.487902 | 1.0 | 0.5 | 0.666667 | 0.75 | 0.6 | 1.0 | 0.613147 | 0.703918 | 0.753698 | 0.654809 |
| 8 | Topic 9 | 0.415008 | 0.0 | 0.5 | 0.333333 | 0.5 | 0.4 | 0.0 | 0.386853 | 0.296082 | 0.41443 | 0.360055 |
| 9 | Topic 10 | 0.480995 | 0.0 | 0.5 | 0.333333 | 0.5 | 0.4 | 0.0 | 0.386853 | 0.296082 | 0.41443 | 0.360055 |


In [5]:
csum = Linear.CombSUM(norm='score')

# In this case, rels_file has NOT been specified, so PyFLAGR returns two dataframes,
# * df_out contains the aggregate list produced by the selected algorithm
# * df_eval is blank
df_out, df_eval = csum.aggregate(input_file=lists)

display_side_by_side(df_out.head(20), df_eval, titles=['Aggregate list','Evaluation'])


| | Query | Voter | ItemID | Score |
| :-- | :-- | :-- | :-- | :-- |
| 0 | 1 | PyFLAGR | Q1-E39 | 13.206937 |
| 1 | 1 | PyFLAGR | Q1-E48 | 12.517291 |
| 2 | 1 | PyFLAGR | Q1-E23 | 11.551781 |
| 3 | 1 | PyFLAGR | Q1-E95 | 11.137948 |
| 4 | 1 | PyFLAGR | Q1-E85 | 11.103479 |
| 5 | 1 | PyFLAGR | Q1-E94 | 10.965562 |
| 6 | 1 | PyFLAGR | Q1-E33 | 10.689677 |
| 7 | 1 | PyFLAGR | Q1-E63 | 10.379323 |
| 8 | 1 | PyFLAGR | Q1-E100 | 10.344844 |
| 9 | 1 | PyFLAGR | Q1-E5 | 10.241416 |


### Linear methods: BordaCount and SimpleBordaCount

Member of `pyflagr.Linear`.

`BordaCount` is equivalent to `CombSUM` with `borda` normalization. Its constructor supports the following parameters:

| Parameter      | Type        | Default Value  | Values |
| :------------- | :---------- | :--------------| :------|
| `eval_pts` | Integer, Optional. Considered only if `rels_file` or `rels_df` is set. | 10 | Determines the elements in the aggregate list on which the evaluation measures (i.e. Precision, and nDCG) will be computed. For example, for `eval_pts=10` FLAGR will compute $P@1, P@2, ... P@10$, and $N@1, N@2, ..., N@10$. |
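
The difference between the two Borda variants only shows up when the input lists are partial. The sketch below assumes the classical convention in which, under `borda`, the items a voter did not rank share equally the points left over, whereas under `simple-borda` they receive nothing. The function name, the `partial` flag and the data are hypothetical; FLAGR's exact normalization may differ.

```python
def borda_scores(lists, partial=True):
    """Borda-style aggregation over possibly partial rankings.

    partial=True  : unranked items share the leftover points ('borda'-like).
    partial=False : unranked items receive no points ('simple-borda'-like).
    """
    universe = sorted({item for ranked in lists for item in ranked})
    n_items = len(universe)
    totals = dict.fromkeys(universe, 0.0)
    for ranked in lists:
        # A ranked item at 0-based position pos scores (n_items - pos) / n_items
        for pos, item in enumerate(ranked):
            totals[item] += (n_items - pos) / n_items
        if partial:
            # Average of the normalized points of the unoccupied positions
            leftover = (n_items - len(ranked) + 1) / (2 * n_items)
            for item in universe:
                if item not in ranked:
                    totals[item] += leftover
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical partial rankings from three voters
voters = [["a", "b", "c"], ["b", "a"], ["c", "d", "a"]]
print(borda_scores(voters, partial=True))
print(borda_scores(voters, partial=False))
```

In this toy input the order of the items is the same in both variants, but item `d`, which two voters left unranked, has a clearly lower score when no partial points are assigned.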


In [6]:
borda = Linear.BordaCount(eval_pts=7)

df_out, df_eval = borda.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)


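| | Query | AvgPrecision | P@1 | P@2 | P@3 | P@4 | P@5 | P@6 | P@7 | N@1 | N@2 | N@3 | N@4 | N@5 | N@6 | N@7 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 20 | MEAN | 0.48367 | 0.5 | 0.475 | 0.416667 | 0.425 | 0.42 | 0.45 | 0.442857 | 0.5 | 0.480657 | 0.438268 | 0.44024 | 0.434961 | 0.45275 | 0.447917 |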


In [7]:
sborda = Linear.SimpleBordaCount(eval_pts=7)

df_out, df_eval = sborda.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)


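| | Query | AvgPrecision | P@1 | P@2 | P@3 | P@4 | P@5 | P@6 | P@7 | N@1 | N@2 | N@3 | N@4 | N@5 | N@6 | N@7 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 20 | MEAN | 0.485053 | 0.55 | 0.5 | 0.466667 | 0.4375 | 0.44 | 0.483333 | 0.464286 | 0.55 | 0.511315 | 0.485196 | 0.462466 | 0.46083 | 0.48661 | 0.474093 |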


In [8]:
# Equivalent code for Borda Count: this produces the same results as Linear.BordaCount(eval_pts=7) above
csum = Linear.CombSUM(norm='borda', eval_pts=7)

df_out, df_eval = csum.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)


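| | Query | AvgPrecision | P@1 | P@2 | P@3 | P@4 | P@5 | P@6 | P@7 | N@1 | N@2 | N@3 | N@4 | N@5 | N@6 | N@7 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 20 | MEAN | 0.48367 | 0.5 | 0.475 | 0.416667 | 0.425 | 0.42 | 0.45 | 0.442857 | 0.5 | 0.480657 | 0.438268 | 0.44024 | 0.434961 | 0.45275 | 0.447917 |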


### Linear methods: CombMNZ

Member of `pyflagr.Linear`.

The `CombMNZ` constructor supports the following parameters:


| Parameter      | Type        | Default Value  | Values |
| :------------- | :---------- | :--------------| :------|
| `eval_pts` | Integer, Optional. Considered only if `rels_file` or `rels_df` is set. | 10 | Determines the elements in the aggregate list on which the evaluation measures (i.e. Precision, and nDCG) will be computed. For example, for `eval_pts=10` FLAGR will compute $P@1, P@2, ... P@10$, and $N@1, N@2, ..., N@10$. |
| `norm` | String, Optional. | `borda` | Rank or score normalization methods:<ul><li>`borda`: The aggregation is performed by normalizing the element *rankings* according to the Borda normalization method. Equivalent to the `BordaCount` function.</li><li>`rank`: The aggregation is performed by normalizing the element *rankings* according to the Rank normalization method.</li><li>`score`: The aggregation is performed by normalizing the element *scores* according to the Score normalization method.</li><li>`z-score`: The aggregation is performed by normalizing the element *scores* according to the Z-Score normalization method.</li><li>`simple-borda`: Similar to `borda` normalization but no partial score is assigned to an element if it is not ranked by a voter.</li></ul> |
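
For readers unfamiliar with CombMNZ, the following minimal sketch shows how it differs from CombSUM: the summed normalized score of an item is multiplied by the number of lists that actually contain it. Rank normalization is assumed here, and the function name and data are hypothetical rather than taken from FLAGR.

```python
from collections import defaultdict

def comb_mnz(lists):
    """CombMNZ: CombSUM score times the number of lists that rank the item."""
    sums = defaultdict(float)
    hits = defaultdict(int)
    for ranked in lists:
        n = len(ranked)
        for pos, item in enumerate(ranked):
            sums[item] += 1 - pos / n  # rank normalization (assumed)
            hits[item] += 1            # one more list contains this item
    scores = {item: sums[item] * hits[item] for item in sums}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical short lists: "b" appears in three of them, "d" in only one
voters = [["a", "b"], ["b", "c"], ["c", "a"], ["d", "b"]]
for item, score in comb_mnz(voters):
    print(item, score)
```

Item `d` is ranked first by its only voter, yet it ends up last because the multiplier rewards items that many voters agree on.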


In [9]:
cmnz = Linear.CombMNZ(norm='rank', eval_pts=7)

df_out, df_eval = cmnz.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)


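| | Query | AvgPrecision | P@1 | P@2 | P@3 | P@4 | P@5 | P@6 | P@7 | N@1 | N@2 | N@3 | N@4 | N@5 | N@6 | N@7 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 20 | MEAN | 0.485324 | 0.5 | 0.5 | 0.416667 | 0.4375 | 0.41 | 0.45 | 0.45 | 0.5 | 0.5 | 0.44134 | 0.451202 | 0.431364 | 0.454931 | 0.454479 |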


### Majoritarian methods: Condorcet Winners

Member of `pyflagr.Majoritarian`.

The `CondorcetWinners` constructor supports the following parameters:

| Parameter      | Type        | Default Value  | Values |
| :------------- | :---------- | :--------------| :------|
| `eval_pts` | Integer, Optional. Considered only if `rels_file` or `rels_df` is set. | 10 | Determines the elements in the aggregate list on which the evaluation measures (i.e. Precision, and nDCG) will be computed. For example, for `eval_pts=10` FLAGR will compute $P@1, P@2, ... P@10$, and $N@1, N@2, ..., N@10$. |


In [10]:
condorcet = Majoritarian.CondorcetWinners(eval_pts=7)

df_out, df_eval = condorcet.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)


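| | Query | AvgPrecision | P@1 | P@2 | P@3 | P@4 | P@5 | P@6 | P@7 | N@1 | N@2 | N@3 | N@4 | N@5 | N@6 | N@7 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 20 | MEAN | 0.479701 | 0.5 | 0.475 | 0.416667 | 0.4125 | 0.43 | 0.466667 | 0.471429 | 0.5 | 0.480657 | 0.438268 | 0.431834 | 0.440778 | 0.46333 | 0.46669 |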


### Majoritarian methods: Copeland Winners

Member of `pyflagr.Majoritarian`.

The `CopelandWinners` constructor supports the following parameters:

| Parameter      | Type        | Default Value  | Values |
| :------------- | :---------- | :--------------| :------|
| `eval_pts` | Integer, Optional. Considered only if `rels_file` or `rels_df` is set. | 10 | Determines the elements in the aggregate list on which the evaluation measures (i.e. Precision, and nDCG) will be computed. For example, for `eval_pts=10` FLAGR will compute $P@1, P@2, ... P@10$, and $N@1, N@2, ..., N@10$. |
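
Copeland's rule also relies on pairwise contests, but it additionally credits tied contests. The sketch below uses one common convention (one point per win, half a point per tie) and assumes that every voter ranks every item; both are assumptions made for illustration, and FLAGR may score contests differently.

```python
from itertools import combinations

def copeland(lists):
    """Copeland scores: 1 point per pairwise win, 0.5 per pairwise tie."""
    items = sorted({item for ranked in lists for item in ranked})
    positions = [{item: pos for pos, item in enumerate(ranked)} for ranked in lists]
    score = dict.fromkeys(items, 0.0)
    for a, b in combinations(items, 2):
        # Positive margin: more voters place a above b than b above a
        margin = sum((p[a] < p[b]) - (p[b] < p[a]) for p in positions)
        if margin > 0:
            score[a] += 1
        elif margin < 0:
            score[b] += 1
        else:
            score[a] += 0.5
            score[b] += 0.5
    return sorted(score.items(), key=lambda kv: (-kv[1], kv[0]))

# Four hypothetical voters with full rankings; "a" and "b" tie head to head
voters = [["a", "b", "c"], ["b", "a", "c"], ["a", "c", "b"], ["b", "c", "a"]]
print(copeland(voters))
```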


In [11]:
copeland = Majoritarian.CopelandWinners(eval_pts=7)

df_out, df_eval = copeland.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)


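| | Query | AvgPrecision | P@1 | P@2 | P@3 | P@4 | P@5 | P@6 | P@7 | N@1 | N@2 | N@3 | N@4 | N@5 | N@6 | N@7 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 20 | MEAN | 0.481844 | 0.5 | 0.45 | 0.416667 | 0.4 | 0.4 | 0.45 | 0.464286 | 0.5 | 0.461315 | 0.435196 | 0.420872 | 0.418134 | 0.448516 | 0.457814 |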


### Majoritarian methods: Outranking Approach

Member of `pyflagr.Majoritarian`.

The `OutrankingApproach` constructor supports the following parameters:

| Parameter      | Type        | Default Value  | Values |
| :------------- | :---------- | :--------------| :------|
| `eval_pts` | Integer, Optional. Considered only if `rels_file` or `rels_df` is set. | 10 | Determines the elements in the aggregate list on which the evaluation measures (i.e. Precision, and nDCG) will be computed. For example, for `eval_pts=10` FLAGR will compute $P@1, P@2, ... P@10$, and $N@1, N@2, ..., N@10$. |
| `pref` | Hyperparameter, Float, Optional. | 0    | Preference threshold.  |
| `veto` | Hyperparameter, Float, Optional. | 0.75 | Veto threshold.        |
| `conc` | Hyperparameter, Float, Optional. | 0    | Concordance threshold. |
| `disc` | Hyperparameter, Float, Optional. | 0.25 | Discordance threshold. |


In [12]:
outrank = Majoritarian.OutrankingApproach(eval_pts=7)

df_out, df_eval = outrank.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)


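| | Query | AvgPrecision | P@1 | P@2 | P@3 | P@4 | P@5 | P@6 | P@7 | N@1 | N@2 | N@3 | N@4 | N@5 | N@6 | N@7 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 20 | MEAN | 0.481749 | 0.5 | 0.425 | 0.433333 | 0.4125 | 0.44 | 0.441667 | 0.45 | 0.5 | 0.441972 | 0.443856 | 0.428076 | 0.444073 | 0.444712 | 0.449778 |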


### Markov Chain methods: MC4

Member of `pyflagr.MarkovChains`.

The `MC4` constructor supports the following parameters:

| Parameter      | Type        | Default Value  | Values |
| :------------- | :---------- | :--------------| :------|
| `eval_pts` | Integer, Optional. Considered only if `rels_file` or `rels_df` is set. | 10 | Determines the elements in the aggregate list on which the evaluation measures (i.e. Precision, and nDCG) will be computed. For example, for `eval_pts=10` FLAGR will compute $P@1, P@2, ... P@10$, and $N@1, N@2, ..., N@10$. |
| `ergodic_number` | Hyperparameter, Float, Optional. | 0.15    | The ergodic number.  |
| `delta` | Hyperparameter, Float, Optional. | 0.00000001 | The $\delta$ hyperparameter.        |
| `max_iterations` | Hyperparameter, Integer, Optional. | 200    | Maximum number of iterations. |
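
The role of the three hyperparameters can be illustrated with a small pure-Python sketch of an MC4-style chain: from the current item, another item is picked uniformly at random and the chain moves to it only if a majority of the voters prefer it. Treating `ergodic_number` as a uniform teleportation probability and `delta` as an L1 convergence threshold are assumptions of this sketch, which is not FLAGR's implementation.

```python
def mc4(lists, ergodic_number=0.15, delta=1e-8, max_iterations=200):
    """Rank items by the stationary distribution of an MC4-style Markov chain."""
    items = sorted({item for ranked in lists for item in ranked})
    n = len(items)
    pos = [{item: p for p, item in enumerate(ranked)} for ranked in lists]

    def majority_prefers(j, i):
        """True if most lists that rank both items place j above i."""
        both = [p for p in pos if i in p and j in p]
        return 2 * sum(p[j] < p[i] for p in both) > len(both)

    # Row-stochastic transition matrix: move i -> j with probability 1/n
    # when the majority prefers j, otherwise stay at i
    P = [[0.0] * n for _ in range(n)]
    for a, i in enumerate(items):
        for b, j in enumerate(items):
            if a != b and majority_prefers(j, i):
                P[a][b] = 1.0 / n
        P[a][a] = 1.0 - sum(P[a])
    # Uniform teleportation keeps the chain ergodic (assumed use of ergodic_number)
    P = [[(1 - ergodic_number) * x + ergodic_number / n for x in row] for row in P]

    # Power iteration until the L1 change drops below delta
    pi = [1.0 / n] * n
    for _ in range(max_iterations):
        nxt = [sum(pi[a] * P[a][b] for a in range(n)) for b in range(n)]
        converged = sum(abs(x - y) for x, y in zip(nxt, pi)) < delta
        pi = nxt
        if converged:
            break
    return sorted(zip(items, pi), key=lambda kv: (-kv[1], kv[0]))

# Three hypothetical voters
voters = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
for item, prob in mc4(voters):
    print(item, round(prob, 4))
```

A lower `max_iterations` simply stops the power iteration earlier, trading precision of the stationary distribution for speed.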


In [13]:
mch4 = MarkovChains.MC4(eval_pts=7, max_iterations=50)

df_out, df_eval = mch4.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)


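| | Query | AvgPrecision | P@1 | P@2 | P@3 | P@4 | P@5 | P@6 | P@7 | N@1 | N@2 | N@3 | N@4 | N@5 | N@6 | N@7 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 20 | MEAN | 0.496523 | 0.6 | 0.55 | 0.5 | 0.5125 | 0.52 | 0.516667 | 0.507143 | 0.6 | 0.561315 | 0.523464 | 0.527925 | 0.530822 | 0.527499 | 0.520399 |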


### Robust Rank Aggregation

Member of `pyflagr.RRA`.

The `RRA` constructor supports the following parameters:

| Parameter      | Type        | Default Value  | Values |
| :------------- | :---------- | :--------------| :------|
| `eval_pts` | Integer, Optional. Considered only if `rels_file` or `rels_df` is set. | 10 | Determines the elements in the aggregate list on which the evaluation measures (i.e. Precision, and nDCG) will be computed. For example, for `eval_pts=10` FLAGR will compute $P@1, P@2, ... P@10$, and $N@1, N@2, ..., N@10$. |
| `exact` | Hyperparameter, Boolean, Optional. | `False` | Determines whether the exact p-value correction algorithm of Stuart will be applied. |
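
As background, the core of Robust Rank Aggregation can be sketched as follows: each item's normalized ranks are compared against what would be expected if the lists were random, and the smallest resulting order-statistic p-value becomes the item's score (lower is better). The sketch applies a simple Bonferroni-style bound and does not implement the exact correction that `exact=True` refers to. Function names, the handling of unranked items (normalized rank 1.0) and the data are assumptions.

```python
from math import comb

def beta_score(x, k, m):
    """P(k-th smallest of m uniform(0, 1) draws <= x), the Beta(k, m-k+1) CDF."""
    return sum(comb(m, j) * x**j * (1 - x) ** (m - j) for j in range(k, m + 1))

def rho_score(norm_ranks):
    """Smallest order-statistic p-value, bounded by a Bonferroni-style factor."""
    m = len(norm_ranks)
    ordered = sorted(norm_ranks)
    rho = min(beta_score(x, k, m) for k, x in enumerate(ordered, start=1))
    return min(1.0, rho * m)  # simple bound, not the exact correction

def robust_ra(lists):
    """Score every item and sort in ascending order of score (best first)."""
    items = sorted({item for ranked in lists for item in ranked})
    scores = {}
    for item in items:
        # Normalized rank in (0, 1]; unranked items get 1.0 (assumption)
        ranks = [(ranked.index(item) + 1) / len(ranked) if item in ranked else 1.0
                 for ranked in lists]
        scores[item] = rho_score(ranks)
    return sorted(scores.items(), key=lambda kv: (kv[1], kv[0]))

# Three hypothetical voters
voters = [["a", "b", "c", "d"], ["a", "c", "b", "d"], ["b", "a", "d", "c"]]
for item, score in robust_ra(voters):
    print(item, score)
```

With so few and such short lists only item `a` obtains a score below 1; the method is designed for many, longer lists.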


In [14]:
robust = RRA.RRA(eval_pts=7, exact=True)

df_out, df_eval = robust.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)


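| | Query | AvgPrecision | P@1 | P@2 | P@3 | P@4 | P@5 | P@6 | P@7 | N@1 | N@2 | N@3 | N@4 | N@5 | N@6 | N@7 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 20 | MEAN | 0.437938 | 0.05 | 0.15 | 0.183333 | 0.2625 | 0.28 | 0.316667 | 0.335714 | 0.55 | 0.588685 | 0.544412 | 0.536945 | 0.518977 | 0.516932 | 0.510799 |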


### Weighted methods: Preference Relations Graph

Member of `pyflagr.Weighted`.

The `PreferenceRelationsGraph` constructor supports the following parameters:

| Parameter      | Type        | Default Value  | Values |
| :------------- | :---------- | :--------------| :------|
| `eval_pts` | Integer, Optional. Considered only if `rels_file` or `rels_df` is set. | 10 | Determines the elements in the aggregate list on which the evaluation measures (i.e. Precision, and nDCG) will be computed. For example, for `eval_pts=10` FLAGR will compute $P@1, P@2, ... P@10$, and $N@1, N@2, ..., N@10$. |
| `alpha`| Hyperparameter, Float, Optional. | 0.1 | The $\alpha$ hyper-parameter of the algorithm.  |
| `beta` | Hyperparameter, Float, Optional. | 0.5 | The $\beta$ hyper-parameter of the algorithm.  |


In [15]:
prf_graph = Weighted.PreferenceRelationsGraph(alpha=0.1, beta=0.5, eval_pts=7)

df_out, df_eval = prf_graph.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)


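| | Query | AvgPrecision | P@1 | P@2 | P@3 | P@4 | P@5 | P@6 | P@7 | N@1 | N@2 | N@3 | N@4 | N@5 | N@6 | N@7 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 20 | MEAN | 0.461636 | 0.3 | 0.3 | 0.333333 | 0.3375 | 0.34 | 0.408333 | 0.407143 | 0.3 | 0.3 | 0.323464 | 0.327925 | 0.330822 | 0.376005 | 0.378203 |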


### Weighted methods: Agglomerative Aggregation

Member of `pyflagr.Weighted`.

The `Agglomerative` constructor supports the following parameters:

| Parameter      | Type        | Default Value  | Values |
| :------------- | :---------- | :--------------| :------|
| `eval_pts` | Integer, Optional. Considered only if `rels_file` or `rels_df` is set. | 10 | Determines the elements in the aggregate list on which the evaluation measures (i.e. Precision, and nDCG) will be computed. For example, for `eval_pts=10` FLAGR will compute $P@1, P@2, ... P@10$, and $N@1, N@2, ..., N@10$. |
| `c1` | Hyperparameter, Float, Optional. | 2.5 | The $c_1$ hyper-parameter of the algorithm.  |
| `c2` | Hyperparameter, Float, Optional. | 1.5 | The $c_2$ hyper-parameter of the algorithm.  |

In [16]:
agg = Weighted.Agglomerative(c1=0.1, c2=0.2, eval_pts=7)

df_out, df_eval = agg.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)


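| | Query | AvgPrecision | P@1 | P@2 | P@3 | P@4 | P@5 | P@6 | P@7 | N@1 | N@2 | N@3 | N@4 | N@5 | N@6 | N@7 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 20 | MEAN | 0.255622 | 0.45 | 0.4 | 0.4 | 0.4125 | 0.46 | 0.458333 | 0.464286 | 0.45 | 0.411315 | 0.40866 | 0.41561 | 0.446363 | 0.446755 | 0.451634 |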


### Weighted methods: Iterative Distance-Based Aggregation

Member of `pyflagr.Weighted`.

The `DIBRA` constructor supports the following parameters:

| Parameter      | Type        | Default Value  | Values |
| :------------- | :---------- | :--------------| :------|
| `eval_pts` | Integer, Optional. Considered only if `rels_file` or `rels_df` is set. | 10 | Determines the elements in the aggregate list on which the evaluation measures (i.e. Precision, and nDCG) will be computed. For example, for `eval_pts=10` FLAGR will compute $P@1, P@2, ... P@10$, and $N@1, N@2, ..., N@10$. |
| `aggregator` | Hyperparameter, String, Optional. | `combsum:borda` | The baseline aggregation method. An extended weighted variant of the baseline method is applied internally by plugging the computed voter weights.<br> The list of the supported values includes:<br><ul><li>`combsum:borda`: CombSUM with Borda rank normalization.</li><li>`combsum:rank`: CombSUM with rankings normalization.</li><li>`combsum:score`: CombSUM with score min-max normalization.</li><li>`combsum:z-score`: CombSUM with score z-normalization.</li><li>`combmnz:borda`: CombMNZ with Borda rank normalization.</li><li>`combmnz:rank`: CombMNZ with rankings normalization.</li><li>`combmnz:score`: CombMNZ with score min-max normalization.</li><li>`combmnz:z-score`: CombMNZ with score z-normalization.</li><li>`condorcet`: The Condorcet Winners method.</li><li>`outrank`: The Outranking Approach.</li></ul>|
| `w_norm` | Hyperparameter, String, Optional. | `minmax` | The voter weights normalization method. The list of the supported values includes:<br><ul><li>`none`: The voter weights will not be normalized.</li><li>`minmax`: The voter weights will be normalized with min-max scaling.</li><li>`z`: The voter weights will be z-normalized</li></ul> |
| `dist` | Hyperparameter, String, Optional. | `cosine` | The metric that is used to measure the distance between an input list and the temporary aggregate list. The list of the supported values includes:<br><ul><li>`rho`: The Spearman's $\rho$ correlation coefficient.</li><li>`cosine`: Cosine similarity of the lists' vector representations.</li><li>`tau`: The Kendall's $\tau$ correlation coefficient.</li><li>`footrule`: A scaled variant of Spearman's Footrule distance.</li></ul> |
| `gamma` | Hyperparameter, Float, Optional. | 1.50 | Regulates the weight convergence speed. |
| `prune` | Hyperparameter, Boolean, Optional. | `False` | Triggers a weight-dependent list pruning mechanism. |
| `d1`    | Hyperparameter, Float, Optional. Used only when `prune=True` | 0.4 | The hyperparameter $\delta_1$ of the weight-dependent list pruning mechanism. |
| `d2`    | Hyperparameter, Float, Optional. Used only when `prune=True` | 0.1 | The hyperparameter $\delta_2$ of the weight-dependent list pruning mechanism. |
| `tol`    | Hyperparameter, Float, Optional. | 0.01 | Controls the convergence precision. This tolerance threshold represents the minimum precision of the difference between the voter weight in an iteration and the voter weight of the previous iteration.|
| `max_iter` | Hyperparameter, Integer, Optional. | 50 | Controls the maximum number of iterations. FLAGR will stop the execution of DIBRA once the requested number of iterations has been performed, even if the voter weights have not fully converged.|
| `pref` | Hyperparameter, Float, Optional. | 0    | Preference threshold.  |
| `veto` | Hyperparameter, Float, Optional. | 0.75 | Veto threshold.        |
| `conc` | Hyperparameter, Float, Optional. | 0    | Concordance threshold. |
| `disc` | Hyperparameter, Float, Optional. | 0.25 | Discordance threshold. |
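
The three `w_norm` options correspond to standard rescaling schemes, which the following sketch illustrates on a hypothetical vector of voter weights. The function name is hypothetical, and details such as the use of the population standard deviation and the handling of constant weights are assumptions rather than FLAGR's documented behavior.

```python
from statistics import mean, pstdev

def normalize_weights(weights, method="minmax"):
    """Rescale voter weights with one of the schemes named by `w_norm`."""
    if method == "none":
        return list(weights)
    if method == "minmax":
        lo, hi = min(weights), max(weights)
        if hi == lo:                      # all voters equally weighted
            return [1.0] * len(weights)
        return [(w - lo) / (hi - lo) for w in weights]
    if method == "z":
        mu, sigma = mean(weights), pstdev(weights)
        if sigma == 0:
            return [0.0] * len(weights)
        return [(w - mu) / sigma for w in weights]
    raise ValueError(f"unknown normalization method: {method}")

# Hypothetical voter weights after the iterative procedure has converged
weights = [1.0, 2.0, 4.0]
print(normalize_weights(weights, "none"))
print(normalize_weights(weights, "minmax"))
print(normalize_weights(weights, "z"))
```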


In [17]:
method_1 = Weighted.DIBRA(aggregator='combsum:rank', eval_pts=7)

df_out, df_eval = method_1.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)


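| | Query | AvgPrecision | P@1 | P@2 | P@3 | P@4 | P@5 | P@6 | P@7 | N@1 | N@2 | N@3 | N@4 | N@5 | N@6 | N@7 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 20 | MEAN | 0.494008 | 0.55 | 0.575 | 0.516667 | 0.5125 | 0.47 | 0.466667 | 0.464286 | 0.55 | 0.569343 | 0.529608 | 0.52463 | 0.495158 | 0.49029 | 0.486599 |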


In [18]:
method_2 = Weighted.DIBRA(eval_pts=7, gamma=1.5, prune=True, d1=0.3, d2=0.05)

df_out, df_eval = method_2.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)


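| | Query | AvgPrecision | P@1 | P@2 | P@3 | P@4 | P@5 | P@6 | P@7 | N@1 | N@2 | N@3 | N@4 | N@5 | N@6 | N@7 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 20 | MEAN | 0.503105 | 0.55 | 0.55 | 0.583333 | 0.5875 | 0.61 | 0.591667 | 0.557143 | 0.55 | 0.55 | 0.573464 | 0.577925 | 0.593942 | 0.583816 | 0.562393 |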


In [19]:
method_3 = Weighted.DIBRA(eval_pts=7, aggregator="outrank")

df_out, df_eval = method_3.aggregate(input_file=lists, rels_file=qrels)
df_eval.tail(1)


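| | Query | AvgPrecision | P@1 | P@2 | P@3 | P@4 | P@5 | P@6 | P@7 | N@1 | N@2 | N@3 | N@4 | N@5 | N@6 | N@7 |
| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- | :-- |
| 20 | MEAN | 0.49362 | 0.55 | 0.525 | 0.533333 | 0.5 | 0.47 | 0.458333 | 0.471429 | 0.55 | 0.530657 | 0.535196 | 0.512466 | 0.491149 | 0.481324 | 0.487617 |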
