Data, code, slides and blog post of the ESA 2018 Experiment

If you have any questions about this repository, please feel free to open an issue or send me an email at bast@cs.uni-freiburg.de with subject "ESA 2018 experiment".

The ESA 2018 Experiment was an in-depth analysis of two parallel program committees that reviewed the complete set of submissions independently. This repository provides the (anonymized) data behind the experiment, as well as a Python script to analyze and visualize the data in various ways. It also contains the blog post published at BLOG@CACM (https://github.com/ad-freiburg/esa2018-experiment/blob/master/BLOGPOST.md). The slides from the report presented at the business meeting of the conference can be found here: http://ad-publications.informatik.uni-freiburg.de/ESA_experiment_Bast_2018.pdf

Data

The anonymized data is given in six files, one for each PC and each reviewing phase. For example, scores-phase1-pc2.tsv contains a snapshot of the scores from PC2 after Phase 1. There is one line per submission, and each submission has the same line number in all files. The first eight columns are pairs of review score and confidence score. Most submissions received three reviews, in which case the seventh and eighth columns are empty. The voting results after Phase 3 are recorded in an additional ninth and tenth column (the average score and confidence from the votes).
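As an illustration of this format, here is a minimal parsing sketch (not part of the repository) that assumes exactly the column layout described above: up to four tab-separated pairs of review score and confidence in columns 1–8 and, after Phase 3, the voting average score and confidence in columns 9–10. The file name in the usage comment is one of the six files named above; the function name read_scores is made up for this example.

```python
import csv

def read_scores(path):
    """Parse one of the scores-*.tsv files: one submission per line,
    up to four (review score, confidence) pairs in columns 1-8 and,
    optionally, the voting (average score, confidence) in columns 9-10."""
    submissions = []
    with open(path, newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            reviews = [(float(row[i]), float(row[i + 1]))
                       for i in range(0, 8, 2)
                       if i + 1 < len(row) and row[i] != "" and row[i + 1] != ""]
            votes = None
            if len(row) >= 10 and row[8] != "":
                votes = (float(row[8]), float(row[9]))
            submissions.append({"reviews": reviews, "votes": votes})
    return submissions

# Example: snapshot of the PC2 scores after Phase 1.
# scores = read_scores("scores-phase1-pc2.tsv")
```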

Explanations of some details from the blog post

Here are explanations of a few details from the blog post. They also show by example how the analyze.py script works. Some of the explanations refer to the slides from the link above.

  1. If a fraction p_i of the papers is accepted with probability a_i, then the expected overlap is Σ_i p_i a_i² / Σ_i p_i a_i. For the simple model, where each paper is accepted independently of the others with the same fixed acceptance rate, the expected overlap is simply that acceptance rate. (A small numerical sketch of this formula follows after this list.)

  2. See slide 5 for the exact semantics of the scores. See slide 9 for the detailed scores of the 9 papers that were a "clear accept" in at least one PC.

  3. Run python3 analyze.py to see the definition of the l5 score for each paper (a single rule-based score from -2, -1, 0, +1, +2). It is also explained on slide 10. To see the confusion matrix between the two PCs after each phase, run python3 analyze.py l5 --confusion-pcs. To see how often each score was given by each PC, run python3 analyze.py l5 --print and execute the produced gnuplot script. The bars on the left show the clear rejects after each reviewing phase.

  4. To compute the Kendall tau correlation of the upper part of the ranking of the two PCs, run python3 analyze.py avt. The script also explains the avt score (the l5 score, but set to zero if no reviewer gave the paper a +2). To compute the p-values of an R-test, run python3 analyze.py avt --rtest.
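As a numerical sketch of the expected-overlap formula in item 1 (not part of the repository; the fractions and probabilities below are made-up example values), the following code evaluates Σ_i p_i a_i² / Σ_i p_i a_i and shows that it collapses to the plain acceptance rate in the simple model where every paper has the same acceptance probability.

```python
def expected_overlap(fractions, probs):
    """Expected overlap of two independent PCs when a fraction
    fractions[i] of the papers is accepted with probability probs[i]:
    sum_i p_i * a_i^2 / sum_i p_i * a_i."""
    numerator = sum(p * a * a for p, a in zip(fractions, probs))
    denominator = sum(p * a for p, a in zip(fractions, probs))
    return numerator / denominator

# Hypothetical mixture: 20% near-certain accepts, 30% borderline papers,
# 50% near-certain rejects.
print(expected_overlap([0.2, 0.3, 0.5], [0.9, 0.4, 0.05]))  # ~0.65

# Simple model: every paper accepted independently with the same rate 0.25,
# so the expected overlap equals that acceptance rate.
print(expected_overlap([1.0], [0.25]))  # 0.25
```

Similarly, the Kendall tau correlation mentioned in item 4 can be computed from two rank or score vectors with scipy.stats.kendalltau (the vectors below are placeholders, not the actual PC rankings):

```python
from scipy.stats import kendalltau

tau, p_value = kendalltau([4, 3, 2, 1], [4, 2, 3, 1])
print(tau, p_value)
```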
