Benchmarking the Generation of Fact Checking Explanations

This repository contains the code necessary to recreate the datasets employed in Benchmarking the Generation of Fact Checking Explanations, namely the LIAR++ and FullFact datasets. If you use these datasets, or any portion of them, in your work, we kindly ask that you cite our original paper.


LIAR++

We created LIAR++ starting from the LIAR-PLUS dataset (Alhindi et al., EMNLP 2018). LIAR-PLUS contains some entries in which the verdict was artificially created by extracting the last five sentences from the body of the article (silver data). In all other cases, verdicts were extracted from a specific section of the web page, usually titled Our ruling or Summing up (gold data). LIAR-PLUS is publicly available and can be found here. The dataset does not contain the articles' text, and the PolitiFact APIs are no longer available, so we leveraged the data IDs and scraped the articles from scratch. In the liar_plus_plus folder we provide the list of all LIAR-PLUS IDs as well as the list of IDs of the gold data employed in our experiments (divided into train, validation, and test sets). We also provide the code employed for scraping the articles from the PolitiFact website; a simplified sketch of the procedure is shown below.
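
Since the PolitiFact APIs are no longer available, each article has to be re-scraped from the live website. The snippet below is only a minimal Python sketch of that step, assuming the data IDs have already been mapped to article URLs; the CSS selector is a hypothetical placeholder, and the actual logic lives in the scraping code provided in the liar_plus_plus folder.

# Minimal sketch of re-scraping a PolitiFact article from its URL.
# The CSS selector below is a hypothetical placeholder, not the exact one
# used in the provided scraping code.
import requests
from bs4 import BeautifulSoup

def fetch_article_text(url: str) -> str:
    """Download a PolitiFact fact-check page and return its body text."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # Collect the paragraphs of the article body (placeholder selector).
    paragraphs = soup.select("div.m-textblock p")
    return "\n".join(p.get_text(strip=True) for p in paragraphs)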

FullFact

Following a procedure similar to the one used for LIAR++, we created a new dataset starting from the FullFact website. The dataset spans from 2010 to 2021 and covers several topics, such as health, economy, crime, law, and education. In FullFact the verdict is always present as a separate element of the web page, so there was no need to filter the data. The dataset comprises 1,838 claim-article-verdict triples, obtained by scraping the necessary information from the website pages. In the fullfact folder, we provide the list of URLs of the FullFact articles used to build our dataset (divided into train, validation, and test sets); a minimal scraping sketch is shown below.
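
As a companion to the URL lists, the sketch below shows how a FullFact page could be downloaded and turned into a claim-article-verdict triple. The file name train_urls.txt and the CSS selectors are illustrative assumptions only; the actual FullFact page structure should be inspected to find the correct selectors.

# Minimal sketch: read one split's URL list and scrape each FullFact page.
# The file name and CSS selectors are illustrative assumptions only.
import requests
from bs4 import BeautifulSoup

def load_urls(path: str) -> list[str]:
    """Read one URL per line, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

def scrape_fullfact_page(url: str) -> dict:
    """Fetch a FullFact article and extract claim, verdict, and body text."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    claim = soup.select_one(".claim")        # placeholder selector
    verdict = soup.select_one(".verdict")    # placeholder selector
    return {
        "url": url,
        "claim": claim.get_text(strip=True) if claim else "",
        "verdict": verdict.get_text(strip=True) if verdict else "",
        "article": "\n".join(
            p.get_text(strip=True) for p in soup.select("article p")
        ),
    }

if __name__ == "__main__":
    for url in load_urls("fullfact/train_urls.txt"):  # hypothetical file name
        triple = scrape_fullfact_page(url)
        print(triple["claim"], "->", triple["verdict"])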


BibTeX Citation

@article{10.1162/tacl_a_00601,
    author = {Russo, Daniel and Tekiroğlu, Serra Sinem and Guerini, Marco},
    title = "{Benchmarking the Generation of Fact Checking Explanations}",
    journal = {Transactions of the Association for Computational Linguistics},
    volume = {11},
    pages = {1250-1264},
    year = {2023},
    month = {10},
    issn = {2307-387X},
    doi = {10.1162/tacl_a_00601},
    url = {https://doi.org/10.1162/tacl\_a\_00601},
    eprint = {https://direct.mit.edu/tacl/article-pdf/doi/10.1162/tacl\_a\_00601/2163489/tacl\_a\_00601.pdf},
}

Citation

Daniel Russo, Serra Sinem Tekiroğlu, Marco Guerini; Benchmarking the Generation of Fact Checking Explanations. Transactions of the Association for Computational Linguistics 2023; 11: 1250–1264. doi: https://doi.org/10.1162/tacl_a_00601

For any questions or inquiries, please contact drusso@fbk.eu
