WikiFactCheck-English

This repository contains the data to accompany 'Automated Fact-Checking of Claims from Wikipedia'.

Contents as follows:

.
│
├── wikifactcheck-en_full0.jsonl
├── wikifactcheck-en_full1.jsonl
├── wikifactcheck-en_full2.jsonl
├── wikifactcheck-en_full3.jsonl
├── wikifactcheck-en_full4.jsonl
│
├── wikifactcheck-en_test.jsonl
└── wikifactcheck-en_train.jsonl

As explained in the paper, the annotated portion of the corpus is split into train and test sets. The entirety of the data (including annotated as well as non-annotated) is contained in the full sets, split into 5 for space constraints.

You may want to make use of the provided loading script to make use of the dataset in your code. Tip: add the repository directory to your PATH so that you can use the script in your project folder.

usage: loadwfc-en.py [-h] [-d] [-f]
                     [-r [{train,test,full} [{train,test,full} ...]]]
                     [-n NUMLINES] [-t {json,python}]

optional arguments:
  -h, --help            show this help message and exit
  -d, --download        download dataset
  -f, --force           force re-download?
  -r [{train,test,full} [{train,test,full} ...]], --read [{train,test,full} [{train,test,full} ...]]
                        read from particular datasets (default: all)
  -n NUMLINES, --numlines NUMLINES
                        numlines to read from each one
  -t {json,python}, --fmt {json,python}
                        output format for --read option

Citation:

@InProceedings{wikifactchkeng:2020:LREC,
  author    = {Sathe, Aalok  and  Ather, Salar  and  Le, Tuan Manh  and  Perry, Nathan  and  Park, Joonsuk},
  title     = {Automated Fact-Checking of Claims from Wikipedia},
  booktitle      = {Proceedings of The 12th Language Resources and Evaluation Conference},
  month          = {May},
  year           = {2020},
  address        = {Marseille, France},
  publisher      = {European Language Resources Association},
  pages     = {6874--6882},
  url       = {https://www.aclweb.org/anthology/2020.lrec-1.849}
}

Name		Name	Last commit message	Last commit date
Latest commit History 27 Commits
wikifactcheck		wikifactcheck
.gitignore		.gitignore
LICENSE.md		LICENSE.md
README.md		README.md
loadwfc-en.py		loadwfc-en.py
transformers		transformers
wfc_dev.tsv		wfc_dev.tsv
wfc_train.tsv		wfc_train.tsv
wikifactcheck-english_full0.jsonl		wikifactcheck-english_full0.jsonl
wikifactcheck-english_full1.jsonl		wikifactcheck-english_full1.jsonl
wikifactcheck-english_full2.jsonl		wikifactcheck-english_full2.jsonl
wikifactcheck-english_full3.jsonl		wikifactcheck-english_full3.jsonl
wikifactcheck-english_full4.jsonl		wikifactcheck-english_full4.jsonl
wikifactcheck-english_test.jsonl		wikifactcheck-english_test.jsonl
wikifactcheck-english_train.jsonl		wikifactcheck-english_train.jsonl

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

wikifactcheck

wikifactcheck

.gitignore

.gitignore

LICENSE.md

LICENSE.md

README.md

README.md

loadwfc-en.py

loadwfc-en.py

transformers

transformers

wfc_dev.tsv

wfc_dev.tsv

wfc_train.tsv

wfc_train.tsv

wikifactcheck-english_full0.jsonl

wikifactcheck-english_full0.jsonl

wikifactcheck-english_full1.jsonl

wikifactcheck-english_full1.jsonl

wikifactcheck-english_full2.jsonl

wikifactcheck-english_full2.jsonl

wikifactcheck-english_full3.jsonl

wikifactcheck-english_full3.jsonl

wikifactcheck-english_full4.jsonl

wikifactcheck-english_full4.jsonl

wikifactcheck-english_test.jsonl

wikifactcheck-english_test.jsonl

wikifactcheck-english_train.jsonl

wikifactcheck-english_train.jsonl

Repository files navigation

WikiFactCheck-English

About

Releases

Packages

Languages

License

wikifactcheck-english/wikifactcheck-english

Folders and files

Latest commit

History

Repository files navigation

WikiFactCheck-English

About

Resources

License

Stars

Watchers

Forks

Languages