ranking-meta-evaluation-data

Introduction

Rankings are a common system response in a variety of learning tasks like search, recommendation, and NLP. Developing evaluation metrics for rankings often involves meta-evaluation data consisting of rankings from a variety of systems for a shared set of system inputs. This repository simplifies gathering meta-evaluation data across several retrieval and recommendation domains.

Ranking Data

Given a set of systems, for each request in a domain we have:

  • an incomplete set of item labels, and
  • a set of top-k rankings, one per system.

Format

All data are in TREC format with relevance judgments (qrels) in standard four-column trec_eval format,

   <query id> <subtopic id> <document id> <relevance grade>

where the subtopic id is a 1-based identifier (or "0"/"Q0" when there are no subtopic labels); outside of the TREC context, you can simply use "0" for the second column. The relevance grade is ordinal, with values <= 0 indicating nonrelevance. Any query with no documents of relevance grade > 0 is removed from the evaluation.
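As a minimal sketch of reading this format, the following Python function parses a whitespace-separated qrels file and drops queries with no relevant documents, as described above (the function name `read_qrels` is illustrative, not part of this repository):

```python
from collections import defaultdict

def read_qrels(path):
    """Parse four-column trec_eval qrels: query id, subtopic id, document id, grade."""
    qrels = defaultdict(dict)
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 4:
                continue  # skip blank or malformed lines
            qid, _subtopic, docid, grade = parts
            qrels[qid][docid] = int(grade)
    # Drop queries that have no document with relevance grade > 0.
    return {q: docs for q, docs in qrels.items()
            if any(g > 0 for g in docs.values())}
```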

System runs are assumed to be in standard six-column trec_eval format, with one system per file,

   <query id> <iteration> <document id> <rank> <score> [<run id>]

where the iteration, rank, and run id fields are typically ignored; the rank is recomputed from the score.
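A companion sketch for run files, recomputing the ranking from the score column as noted above (again, `read_run` is an illustrative name and assumes plain whitespace-separated lines):

```python
def read_run(path):
    """Parse six-column trec_eval run lines: query id, iteration, document id,
    rank, score, run id. The stored rank is ignored; documents are re-ranked
    by descending score."""
    by_query = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) < 5:
                continue  # skip blank or malformed lines
            qid, _iteration, docid, _rank, score = parts[:5]
            by_query.setdefault(qid, []).append((docid, float(score)))
    return {q: [d for d, _ in sorted(docs, key=lambda x: -x[1])]
            for q, docs in by_query.items()}
```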

Downloading the Data

IMPORTANT: You must have a license to access NIST's TREC data; see the TREC website for more information.

Set the environment variables TREC_RESULTS_USER and TREC_RESULTS_PASSWORD and then run `make`. The data will be written to the qrels and runs directories.
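Assuming a POSIX shell, the download step looks like the following (the credential values are placeholders for those issued under NIST's TREC data license):

```shell
# Placeholder credentials; substitute the ones issued to you by NIST.
export TREC_RESULTS_USER="your-username"
export TREC_RESULTS_PASSWORD="your-password"

# Fetch and unpack the meta-evaluation data.
make

# Relevance judgments and system runs land in these directories.
ls qrels/ runs/
```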

Statistics

Retrieval

| domain tag | requests | systems | rel/request | items/request | reference |
| --- | --- | --- | --- | --- | --- |
| legal/2006 | 39 | 34 | 110.85 | 4835.07 | paper, www |
| legal/2007 | 43 | 68 | 101.023 | 22240.30 | paper, www |
| core/2017 | 50 | 75 | 180.04 | 8853.11 | paper, www |
| core/2018 | 50 | 72 | 78.96 | 7102.61 | www |
| deep-docs/2019 | 43 | 38 | 153.42 | 623.77 | paper, www |
| deep-docs/2020 | 45 | 64 | 39.27 | 99.55 | paper, www |
| deep-docs/2021 | 57 | 66 | 189.63 | 98.83 | paper, www |
| deep-docs/2022 | 76 | 42 | 1245.62 | 98.86 | paper, www |
| deep-docs/2023 | 82 | 5 | 75.10 | 100 | paper, www |
| deep-pass/2019 | 43 | 37 | 95.40 | 892.51 | paper, www |
| deep-pass/2020 | 54 | 59 | 66.78 | 978.01 | paper, www |
| deep-pass/2021 | 53 | 63 | 191.96 | 99.95 | paper, www |
| deep-pass/2022 | 76 | 100 | 628.145 | 97.5 | paper, www |
| deep-pass/2023 | 82 | 35 | 49.87 | 99.90 | paper, www |
| web/2009 | 50 | 48 | 129.98 | 925.31 | paper, www |
| web/2010 | 48 | 32 | 187.63 | 7013.21 | paper, www |
| web/2011 | 50 | 61 | 167.56 | 8325.07 | paper, www |
| web/2012 | 50 | 48 | 187.36 | 6719.53 | paper, www |
| web/2013 | 50 | 61 | 182.42 | 7174.38 | paper, www |
| web/2014 | 50 | 30 | 212.58 | 6313.98 | paper, www |
| robust/2004 | 249 | 110 | 69.93 | 913.82 | paper, www |

Recommendation

| domain tag | requests | systems | rel/request | items/request | reference |
| --- | --- | --- | --- | --- | --- |
| movielens/2018 | 6005 | 21 | 18.87 | 100.00 | paper, www |
| libraryThing/2018 | 7227 | 21 | 13.15 | 100.00 | paper, www |
| beerAdvocate/2018 | 17564 | 21 | 13.66 | 99.39 | paper, www |

Contact

Fernando Diaz
