VENCE

Code for our AAAI 2023 paper: Converge to the Truth: Factual Error Correction via Iterative Constrained Editing.

Dependencies

pip install -r requirements.txt

Download dataset

The raw dataset can be downloaded from this Google Drive folder. This is the FEVER intermediate annotation data released by Evidence-based Factual Error Correction.

Download processed data

The processed data can be downloaded from this Google Drive folder. The data is organized as follows:

.
├── t5
│   └── data
│       ├── entity_loc.jsonl
│       ├── t5_data.json
│       ├── t5_test.json
│       ├── t5_train.json
│       └── t5_val.json
└── verfication
    └── data
        ├── test.json
        ├── train.json
        └── valid.json
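
After downloading, it can help to confirm that the files landed where the tree above expects them. The following is a minimal sketch, not part of the repository: the function name `missing_files` and the choice of root directory are assumptions made for illustration, and the relative paths are copied from the tree (including the repository's own `verfication` spelling).

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
# "verfication" is the folder name as it appears in the repository.
EXPECTED_FILES = [
    "t5/data/entity_loc.jsonl",
    "t5/data/t5_data.json",
    "t5/data/t5_test.json",
    "t5/data/t5_train.json",
    "t5/data/t5_val.json",
    "verfication/data/test.json",
    "verfication/data/train.json",
    "verfication/data/valid.json",
]


def missing_files(root):
    """Return the expected relative paths that do not exist under `root`.

    Assumption: `root` is the directory that contains the `t5` and
    `verfication` folders (for example, the repository root).
    """
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

Calling `missing_files(".")` from the directory that holds the downloaded folders returns an empty list when everything is in place, and otherwise the relative paths that still need to be downloaded.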

Data processing

As in verfication/data_process.py, the claim and evidence data for the proposal model should be processed as follows:

Set the prefix to "Please recover the part of the claim that was masked according to the evidence."
Format the claim and evidence as "substituted entity : evidence : {evidence} claim : {claim}" or "substituted one word : evidence : {evidence} claim : {claim}".
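
The two formatting steps above can be sketched as a small helper. This is an illustrative sketch rather than the repository's code: the function name `build_proposal_input`, the `kind` argument, and joining the prefix and the body with a single space are all assumptions. The prefix and the two templates are taken verbatim from the text above. How the masked span is marked inside the claim is not covered here; the claim string is passed through unchanged.

```python
# Prefix quoted from the data processing instructions above.
PREFIX = "Please recover the part of the claim that was masked according to the evidence."

# The two templates quoted above. The dictionary keys are hypothetical labels
# for choosing between the entity-level and the word-level variant.
TEMPLATES = {
    "entity": "substituted entity : evidence : {evidence} claim : {claim}",
    "word": "substituted one word : evidence : {evidence} claim : {claim}",
}


def build_proposal_input(claim, evidence, kind="entity"):
    """Format one claim/evidence pair for the proposal model.

    Assumption: the prefix and the formatted body are joined by one space.
    """
    if kind not in TEMPLATES:
        raise ValueError(f"kind must be one of {sorted(TEMPLATES)}")
    body = TEMPLATES[kind].format(evidence=evidence, claim=claim)
    return f"{PREFIX} {body}"
```

For example, `build_proposal_input("C", "E", kind="word")` yields the prefix followed by `substituted one word : evidence : E claim : C`.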

Download pre-trained checkpoints

Pre-trained checkpoints for T5 and verification models are stored here.

Note that the verification model is off-the-shelf and could be replaced with other fact verification models.
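
One way to picture what "could be replaced" means in practice is a narrow scoring interface that any fact verification model could satisfy. The sketch below is purely hypothetical and does not reflect how this repository wires in its verifier: `Verifier`, `support_score`, `OverlapVerifier`, and `pick_best` are invented names, and the token-overlap scorer is a toy stand-in, not a real verification model.

```python
from typing import Protocol


class Verifier(Protocol):
    """Hypothetical interface: anything that scores a claim against evidence."""

    def support_score(self, claim: str, evidence: str) -> float:
        """Return a score in [0, 1]; higher means better supported."""
        ...


class OverlapVerifier:
    """Toy stand-in (NOT a real fact verification model).

    Scores a claim by the fraction of its tokens that occur in the evidence.
    """

    def support_score(self, claim: str, evidence: str) -> float:
        claim_tokens = claim.lower().split()
        if not claim_tokens:
            return 0.0
        evidence_tokens = set(evidence.lower().split())
        hits = sum(token in evidence_tokens for token in claim_tokens)
        return hits / len(claim_tokens)


def pick_best(candidates, evidence, verifier: Verifier):
    """Return the candidate claim that the verifier scores highest."""
    return max(candidates, key=lambda c: verifier.support_score(c, evidence))
```

Because `pick_best` depends only on the `support_score` method, a wrapper around a fine-tuned classifier could be passed in place of `OverlapVerifier` without changing the calling code.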

Training the T5 model

export PJ_HOME=YOUR_PATH_TO/VENCE/
# Set $train_file and $valid_file to your T5 training and validation files before running.

python t5/finetune.py \
--model_name_or_path t5-base \
--do_train \
--do_eval \
--train_file $train_file \
--validation_file $valid_file \
--output_dir ${PJ_HOME}/t5/model \
--overwrite_output_dir \
--per_device_train_batch_size=4 \
--per_device_eval_batch_size=4 \
--predict_with_generate \
--text_column input \
--summary_column output

Training the verification model

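export PJ_HOME=YOUR_PATH_TO/VENCE/
# Set $train_file and $valid_file to your verification training and validation files before running.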

python verfication/finetune.py \
--model_name_or_path xlm-roberta-base \
--do_train \
--do_eval \
--train_file $train_file \
--validation_file $valid_file \
--per_device_train_batch_size=16 \
--per_device_eval_batch_size=8 \
--max_seq_length 512 \
--learning_rate 2e-5 \
--num_train_epochs 3 \
--output_dir ${PJ_HOME}/verfication/model \
--overwrite_output_dir

Running VENCE

python main/main.py \
--iter_num 15 \
--es_lm 0.08 \
--es_ver 100 \
--es_dis 8

Citation

If you find our work useful for your research, please cite our paper (pre-print; the official BibTeX entry is coming soon):

@article{chen2022converge,
  title={Converge to the Truth: Factual Error Correction via Iterative Constrained Editing},
  author={Chen, Jiangjie and Xu, Rui and Zeng, Wenxuan and Sun, Changzhi and Li, Lei and Xiao, Yanghua},
  journal={arXiv preprint arXiv:2211.12130},
  year={2022}
}
