Why Did You Not Compare With That? Identifying Papers for Use as Baselines

This repository contains the supplementary material for the paper titled "Why Did You Not Compare With That? Identifying Papers for Use as Baselines", accepted for publication at the 44th European Conference on Information Retrieval (ECIR 2022). A pre-print of the paper is available on arXiv.

We used the ACL Anthology Reference Corpus (ARC) [1] as the base data source for preparing the annotated dataset for our study. ARC consists of scholarly papers published at various Computational Linguistics venues up to December 2015. The corpus contains 22,875 articles and provides the original PDFs, the extracted text, the logical document structure (section information) of the papers, and citations parsed using the ParsCit tool [2].

The repository provides the annotated data for the baseline classification task. The main data file (baseline_labels.pkl) provides the following information about the 2075 papers annotated by human evaluators:

ARC Corpus ID of the paper;
a list of all the references in the paper extracted by ParsCit;
for each extracted reference, a binary label indicating whether the reference in the paper corresponds to a baseline or not. 1 indicates that the reference has been used as a baseline in the paper and 0 indicates a non-baseline reference.
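
The following is a minimal sketch of how the file might be read in Python. The `.pkl` extension suggests a Python pickle, but the internal layout of the object (list, dictionary, DataFrame, and so on) is not documented here, so the code only unpickles the file and leaves inspection to the reader. The function names `load_labels` and `baseline_fraction` are hypothetical helpers introduced for illustration, not part of the repository.

```python
import pickle
from pathlib import Path


def load_labels(path="baseline_labels.pkl"):
    """Unpickle the annotation file and return whatever object it holds.

    Assumption: the file is a standard Python pickle. Its internal layout
    is not documented, so inspect the returned object before relying on it.
    Only unpickle files obtained from a source you trust.
    """
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


def baseline_fraction(labels):
    """Return the fraction of references labelled 1 (baseline).

    `labels` is any iterable of 0/1 values, such as the binary labels
    attached to the references of one paper.
    """
    labels = list(labels)
    if not labels:
        return 0.0
    return sum(labels) / len(labels)
```

After calling `data = load_labels()`, checking `type(data)` and looking at a single entry is a reasonable first step for finding where the ARC Corpus ID, the reference list, and the binary labels are stored. If the object was pickled from a third-party library such as pandas, that library must be installed for unpickling to succeed.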

References

[1] Bird, S., Dale, R., Dorr, B.J., Gibson, B.R., Joseph, M.T., Kan, M.Y., Lee, D., Powley, B., Radev, D.R., Tan, Y.F.: The ACL Anthology Reference Corpus: A reference dataset for bibliographic research in computational linguistics. In: LREC. European Language Resources Association (2008)
[2] Councill, I.G., Giles, C.L., Kan, M.Y.: ParsCit: an open-source CRF reference string parsing package. In: LREC. European Language Resources Association (2008)
