Benchmarking the Generation of Fact Checking Explanations

This repository contains the code necessary to recreate the datasets employed in Benchmarking the Generation of Fact Checking Explanations, namely the LIAR++ and FullFact datasets. If you use these datasets, or any portion of them, in your work, we kindly ask that you cite our original paper.


LIAR++

We created LIAR++ starting from the LIAR-PLUS dataset (Alhindi et al., EMNLP 2018). LIAR-PLUS contains some entries in which the verdict was artificially created by extracting the last five sentences from the body of the article (silver data). In all other cases, verdicts were extracted from a specific section of the web page, usually titled Our ruling or Summing up (gold data). LIAR-PLUS is publicly available and can be found here. The dataset does not contain the articles' text, and the PolitiFact APIs are no longer available, so we leveraged the data IDs and scraped the articles from scratch. In the liar_plus_plus folder we provide the list of all LIAR-PLUS IDs as well as the list of IDs of the gold data employed in our experiments (divided into train, validation, and test sets). We also provide the code employed for scraping the articles from the PolitiFact website; a simplified sketch of the procedure is shown below.
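
Since the PolitiFact APIs are no longer available, each article has to be re-scraped from the live website. The snippet below is only a minimal Python sketch of that step, assuming the data IDs have already been mapped to article URLs; the CSS selector is a hypothetical placeholder, and the actual logic lives in the scraping code provided in the liar_plus_plus folder.

# Minimal sketch of re-scraping a PolitiFact article from its URL.
# The CSS selector below is a hypothetical placeholder, not the exact one
# used in the provided scraping code.
import requests
from bs4 import BeautifulSoup

def fetch_article_text(url: str) -> str:
    """Download a PolitiFact fact-check page and return its body text."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # Collect the paragraphs of the article body (placeholder selector).
    paragraphs = soup.select("div.m-textblock p")
    return "\n".join(p.get_text(strip=True) for p in paragraphs)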

FullFact

Following a procedure similar to the one used for LIAR++, we created a new dataset starting from the FullFact website. The dataset spans from 2010 to 2021 and covers several topics, such as health, economy, crime, law, and education. In FullFact the verdict is always present as a separate element of the web page, so there was no need to filter the data. The dataset comprises 1,838 claim-article-verdict triples, obtained by scraping the necessary information from the website pages. In the fullfact folder, we provide the list of URLs of the FullFact articles used to build our dataset (divided into train, validation, and test sets); a minimal scraping sketch is shown below.
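
As a companion to the URL lists, the sketch below shows how a FullFact page could be downloaded and turned into a claim-article-verdict triple. The file name train_urls.txt and the CSS selectors are illustrative assumptions only; the actual FullFact page structure should be inspected to find the correct selectors.

# Minimal sketch: read one split's URL list and scrape each FullFact page.
# The file name and CSS selectors are illustrative assumptions only.
import requests
from bs4 import BeautifulSoup

def load_urls(path: str) -> list[str]:
    """Read one URL per line, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

def scrape_fullfact_page(url: str) -> dict:
    """Fetch a FullFact article and extract claim, verdict, and body text."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    claim = soup.select_one(".claim")        # placeholder selector
    verdict = soup.select_one(".verdict")    # placeholder selector
    return {
        "url": url,
        "claim": claim.get_text(strip=True) if claim else "",
        "verdict": verdict.get_text(strip=True) if verdict else "",
        "article": "\n".join(
            p.get_text(strip=True) for p in soup.select("article p")
        ),
    }

if __name__ == "__main__":
    for url in load_urls("fullfact/train_urls.txt"):  # hypothetical file name
        triple = scrape_fullfact_page(url)
        print(triple["claim"], "->", triple["verdict"])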


BibTeX Citation

@article{10.1162/tacl_a_00601,
    author = {Russo, Daniel and Tekiroğlu, Serra Sinem and Guerini, Marco},
    title = "{Benchmarking the Generation of Fact Checking Explanations}",
    journal = {Transactions of the Association for Computational Linguistics},
    volume = {11},
    pages = {1250-1264},
    year = {2023},
    month = {10},
    issn = {2307-387X},
    doi = {10.1162/tacl_a_00601},
    url = {https://doi.org/10.1162/tacl\_a\_00601},
    eprint = {https://direct.mit.edu/tacl/article-pdf/doi/10.1162/tacl\_a\_00601/2163489/tacl\_a\_00601.pdf},
}

Citation

Daniel Russo, Serra Sinem Tekiroğlu, Marco Guerini; Benchmarking the Generation of Fact Checking Explanations. Transactions of the Association for Computational Linguistics 2023; 11: 1250–1264. doi: https://doi.org/10.1162/tacl_a_00601

For any questions or inquiries, please contact drusso@fbk.eu
