We apply Longformer, a BERT-like model for long documents, to the MS MARCO document re-ranking task. More details about our model and experimental setup can be found in our paper.
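As a rough illustration of the re-ranking setup, the sketch below scores a single query-document pair with a Longformer cross-encoder via Hugging Face transformers. The checkpoint name, the binary-relevance head, and the fine-tuning requirement are assumptions, not the repository's exact code:

```python
# Minimal sketch (not the repository's exact code) of scoring one
# query-document pair with a Longformer cross-encoder.
import torch
from transformers import LongformerTokenizer, LongformerForSequenceClassification

# Assumed base checkpoint; the classification head is randomly initialised
# here and must be fine-tuned on MS MARCO before the scores are meaningful.
name = "allenai/longformer-base-4096"
tokenizer = LongformerTokenizer.from_pretrained(name)
model = LongformerForSequenceClassification.from_pretrained(name, num_labels=2)
model.eval()

query = "what is the capital of france"
document = "Paris is the capital and most populous city of France. ..."

# Encode query and document as one sequence, truncated to the 4096-token window.
inputs = tokenizer(query, document, truncation=True, max_length=4096, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# Assume index 1 is the "relevant" class; its logit serves as the ranking score.
score = logits[0, 1].item()
print(score)
```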
Due to computing limitations, the hyperparameters were not optimised. We default to the following values:

```
--lr=3e-05
--max_seq_len=4096
--num_warmup_steps=2500
```
For each query, we randomly sample 10 negative documents from the top 100 documents retrieved in the initial retrieval step.
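A minimal sketch of this sampling step is below; the names (`query_id`, `top100`, `qrels`) are hypothetical and not taken from `MarcoDataset.py`:

```python
import random

def sample_negatives(query_id, top100, qrels, k=10):
    """Sample k negatives from the top-100 docs of the initial retrieval step.

    top100: ranked list of doc ids for this query.
    qrels:  dict mapping query_id -> set of relevant doc ids.
    """
    # Any retrieved document not judged relevant is a negative candidate.
    candidates = [d for d in top100 if d not in qrels.get(query_id, set())]
    return random.sample(candidates, min(k, len(candidates)))
```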
To train the model, first download all the necessary data, as described in `data/README.md`. File names should match those expected in `MarcoDataset.py`.
You can then train with:

```bash
python run_longformer_marco.py
```
You can check all available hyperparameters with:

```bash
python run_longformer_marco.py --help
```
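For example, training with the default hyperparameters listed above would look like:

```bash
python run_longformer_marco.py --lr=3e-05 --max_seq_len=4096 --num_warmup_steps=2500
```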
Our results on the MS MARCO document re-ranking task:

| | Dev | Test |
|---|---|---|
| MRR@100 | 0.3366 | 0.307 |
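For reference, MRR@100 averages, over all queries, the reciprocal rank of the first relevant document among the top 100 results. A minimal sketch (the helper names are hypothetical):

```python
def mrr_at_k(rankings, qrels, k=100):
    """Mean reciprocal rank at cutoff k.

    rankings: dict query_id -> ranked list of doc ids.
    qrels:    dict query_id -> set of relevant doc ids.
    """
    total = 0.0
    for qid, ranked in rankings.items():
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id in qrels.get(qid, set()):
                total += 1.0 / rank
                break  # only the first relevant document counts
    return total / len(rankings)
```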
This work was done by Ivan Sekulic (Università della Svizzera italiana), Amir Soleimani (University of Amsterdam), Mohammad Aliannejadi (University of Amsterdam), and Fabio Crestani (Università della Svizzera italiana).
Please consider citing our paper if you use our code or models:
```
@misc{sekuli2020longformer,
  title={Longformer for MS MARCO Document Re-ranking Task},
  author={Ivan Sekulić and Amir Soleimani and Mohammad Aliannejadi and Fabio Crestani},
  year={2020},
  eprint={2009.09392},
  archivePrefix={arXiv},
  primaryClass={cs.IR}
}
```