GitHub - julian-urbano/ictir2021-metric: Reproduce results from the ICTIR 2021 paper "How do Metric Score Distributions affect the Type I Error Rate of Statistical Significance Tests in Information Retrieval?"

This repository contains the data and source code for the following paper:

J. Urbano, M. Corsi and A. Hanjalic, "How do Metric Score Distributions affect the Type I Error Rate of Statistical Significance Tests in Information Retrieval?", ACM SIGIR International Conference on the Theory of Information Retrieval, 2021.

A single ZIP file can be downloaded as well.

Project Structure

data/ Input data files (from the SIGIR 2019 paper).
output/ Generated output files and figures.
R/ Source code in R.
scratch/ Temporary files generated in the process.

All code is written in R. To run it yourself, you will need the following packages installed from CRAN: dplyr, rio, glue, emmeans, doParallel, tidyr, VineCopula, simIReff, ggplot2, forcats and moments. The parallel package is also required, but it ships with base R and needs no separate installation.
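As a convenience, the following minimal sketch checks which of the packages listed above are not yet installed. It is not part of the repository; the variable names `required` and `not_installed` are chosen here for illustration, and the installation call is left commented out so that nothing is installed without your consent.

```r
# Packages named in this README
required <- c("dplyr", "rio", "glue", "emmeans", "doParallel", "parallel",
              "tidyr", "VineCopula", "simIReff", "ggplot2", "forcats",
              "moments")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
available <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
not_installed <- required[!available]

if (length(not_installed) > 0) {
  message("Missing packages: ", paste(not_installed, collapse = ", "))
  # Uncomment to install the missing packages from CRAN:
  # install.packages(not_installed)
} else {
  message("All required packages are installed.")
}
```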

How to reproduce the results in the paper

The source files in R/ need to be run in order. You can run each file individually with Rscript R/<file>.R. They store intermediate data in scratch/ and the final data in output/.

It is important that you always run from the base directory.
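The procedure above can be sketched as a small driver function. This is an illustration only, not a script shipped with the repository: the function name `run_pipeline` and its arguments are hypothetical, and it assumes that the two-digit prefix of each file name defines the run order. With `dry_run = TRUE` (the default here) it only reports what it would run.

```r
run_pipeline <- function(script_dir = "R", dry_run = TRUE) {
  # The scripts use relative paths, so the working directory must be the base directory
  if (!dir.exists(script_dir)) {
    stop("Directory '", script_dir, "' not found. Run from the base directory.")
  }

  # Select files with a two-digit prefix (this skips helpers such as common.R)
  scripts <- sort(list.files(script_dir, pattern = "^[0-9]{2}-.*\\.R$",
                             full.names = TRUE))

  for (s in scripts) {
    if (dry_run) {
      message("Would run: Rscript ", s)
    } else {
      status <- system2("Rscript", s)
      if (status != 0) stop("Script failed: ", s)
    }
  }
  invisible(scripts)
}

# From the base directory of the repository:
# run_pipeline()                  # list the scripts in run order
# run_pipeline(dry_run = FALSE)   # actually run them
```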

R/01-emm1.R computes the estimated marginal means across copulas, margins and sample sizes, as well as confidence intervals.
R/02-skew.R computes the skewness of the original TREC data and the simulated data.
R/03-emm2.R computes the estimated marginal means across skewness levels, as well as confidence intervals.
R/99-paper.R generates all figures and stores them in output/.
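To make the skewness step more concrete, here is a minimal sketch of the moment-based sample skewness, written in base R. The function name `sample_skewness` and the toy score vectors are made up for illustration. The moments package listed among the requirements provides a skewness function, but whether R/02-skew.R uses exactly this estimator is an assumption.

```r
# Moment-based skewness: third central moment divided by the
# second central moment raised to the power 3/2
sample_skewness <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  m2 <- mean(d^2)
  m3 <- mean(d^3)
  m3 / m2^(3 / 2)
}

# Hypothetical per-topic scores, not taken from data/
symmetric_scores <- c(0.2, 0.4, 0.6, 0.8)
right_skewed_scores <- c(0, 0, 0, 0.1, 0.9)

sample_skewness(symmetric_scores)     # zero for a symmetric sample
sample_skewness(right_skewed_scores)  # positive for a long right tail
```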

Running all the code takes some time, so it is set up to run in parallel. Most of the above code parallelizes with foreach, using the doParallel package as the backend. By default, it uses all available cores in the machine. Edit R/common.R to modify this behavior and other parameters.
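For readers unfamiliar with this pattern, the following is a minimal sketch of a foreach loop running on a doParallel backend with a fixed number of workers. It does not reproduce the contents of R/common.R; the variable `n_workers` and the toy computation are assumptions made for illustration.

```r
library(foreach)
library(doParallel)

# Hypothetical setting: use 2 workers instead of all available cores.
# parallel::detectCores() reports how many cores the machine has.
n_workers <- 2
cl <- parallel::makeCluster(n_workers)
registerDoParallel(cl)

# Each iteration runs on one of the workers; .combine = c collects
# the results into a vector, in the order of the iterations
squares <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}

# Always release the workers when done
parallel::stopCluster(cl)

print(squares)
```

Limiting the number of workers in this way is useful on shared machines, where taking every core may not be acceptable.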

Note that the script R/99-paper.R is intended to generate the figures in the paper, plus all other tests and metrics not reported there. If you customize something and want a similar analysis, you will need to extend this script yourself.

License

The TREC results in data/ are anonymized and posted here with permission from the organizers.
Databases and their contents are distributed under the terms of the Creative Commons Attribution-ShareAlike 4.0 International License.
Software is distributed under the terms of the MIT License.

When using this archive, please cite the above paper:

@inproceedings{urbano2021metric,
  author = {Urbano, Juli\'{a}n and Corsi, Matteo and Hanjalic, Alan},
  booktitle = {ACM SIGIR International Conference on the Theory of Information Retrieval},
  title = {{How do Metric Score Distributions affect the Type I Error Rate of Statistical Significance Tests in Information Retrieval?}},
  year = {2021},
  pages = {xxx--xxx}
}
