
Zero-shot Faithfulness Evaluation for Text Summarization with Foundation Language Model

This paper has been accepted at EMNLP 2023.

Requirements

  • python==3.7
  • pytorch==1.11.0
  • transformers==4.28.1
  • scipy==1.7.3
  • scikit-learn==1.0.2
  • numpy==1.21.5
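
For reference, the environment can be set up with pip along these lines (note that the PyPI package for PyTorch is `torch`; a build matching your CUDA version may be needed):

```
pip install torch==1.11.0 transformers==4.28.1 scipy==1.7.3 scikit-learn==1.0.2 numpy==1.21.5
```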

Prepare datasets

Download the benchmark datasets and put them under the directory ./data (one way to fetch them is sketched after the table below). Modify the corresponding paths in load_dataset.py if necessary.

| Setting | Dataset | Val | Test | Source | Link |
|---|---|---:|---:|---|---|
| Inconsistency Detection (SUMMAC Benchmark) | CoGenSum | 1281 | 400 | C | https://github.com/tingofurro/summac |
| | SummEval | 850 | 850 | C | |
| | FRANK | 671 | 1575 | C+X | |
| | Polytope | 634 | 634 | C | |
| | FactCC | 931 | 503 | C | |
| | XSumFaith | 1250 | 1250 | X | |
| Faithfulness Rating | FRANKCNN | - | 1250 | C | https://github.com/NJUNLP/CoP |
| | QAGSCNN | - | 235 | C | |
| | SummEval | - | 1600 | C | https://github.com/Yale-LILY/SummEval |
| | FRANKXSUM | - | 996 | X | https://github.com/NJUNLP/CoP |
| | QAGSXSUM | - | 239 | X | |

Source: C = CNN/DailyMail, X = XSum.
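
For example, the benchmarks can be fetched from the repositories linked above; how the files are then arranged under ./data must match the paths expected by load_dataset.py:

```
mkdir -p data && cd data
git clone https://github.com/tingofurro/summac
git clone https://github.com/NJUNLP/CoP
git clone https://github.com/Yale-LILY/SummEval
```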

Probability Calculation

Calculate the probabilities with a foundation language model by running:

```
CUDA_VISIBLE_DEVICES=0 python3 main.py
```
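
For intuition, below is a minimal sketch of the kind of quantity computed in this step: per-token log-probabilities of the summary with and without the source document as a prefix, under a causal foundation LM. The model checkpoint, the prompt format, and the final averaging are assumptions for illustration; FFLM combines several such probability changes (see the paper for the exact formulation).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "huggyllama/llama-7b"  # assumed checkpoint; any causal LM works for this sketch
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.float16).cuda().eval()

@torch.no_grad()
def summary_logprobs(prefix: str, summary: str) -> torch.Tensor:
    """Per-token log-probabilities of `summary`, conditioned on `prefix`.

    Assumes the tokenizer prepends a BOS token (true for LLaMA-style
    tokenizers), so an empty prefix still yields one context token.
    """
    prefix_ids = tokenizer(prefix, return_tensors="pt").input_ids.cuda()
    summary_ids = tokenizer(summary, add_special_tokens=False,
                            return_tensors="pt").input_ids.cuda()
    input_ids = torch.cat([prefix_ids, summary_ids], dim=-1)
    logprobs = model(input_ids).logits.log_softmax(dim=-1)
    # The logits at position t predict token t+1, so shift by one.
    start = prefix_ids.size(-1) - 1
    preds = logprobs[0, start : start + summary_ids.size(-1)]
    return preds.gather(-1, summary_ids[0].unsqueeze(-1)).squeeze(-1)

document = "..."  # source document
summary = "..."   # candidate summary

# Probability change from conditioning on the source: a faithful summary
# should become more likely once the document is given as a prefix.
delta = (summary_logprobs(document + "\n", summary)
         - summary_logprobs("", summary)).mean().item()
print(delta)
```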

The results will be saved under the directory ./output; alternatively, they can be downloaded with this link.

FFLM

Then, the summary-level and system-level performance of FFLM can be calculated as follows, passing the path of a result file from the previous step as --file_path:

```
python3 summary-level-evaluation.py --file_path xxx
python3 system-level-evaluation.py --file_path xxx
```
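
As a rough sketch of what summary-level evaluation computes on the rating datasets, the metric scores are correlated with human judgments, e.g. using scipy (the file path and field names below are assumptions, not the repo's actual output schema):

```python
import json
from scipy.stats import pearsonr, spearmanr

scores, ratings = [], []
with open("output/frankcnn.json") as f:          # hypothetical result file
    for record in json.load(f):
        scores.append(record["fflm_score"])      # hypothetical field name
        ratings.append(record["human_rating"])   # hypothetical field name

# Summary-level evaluation: correlation between the metric and human ratings.
print("Pearson: ", pearsonr(scores, ratings)[0])
print("Spearman:", spearmanr(scores, ratings)[0])
```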

Citation

@inproceedings{jia2023fflm,
  title={Zero-shot Faithfulness Evaluation for Text Summarization with Foundation Language Model},
  author={Jia, Qi and Ren, Siyu and Liu, Yizhu and Zhu, Kenny Q.},
  booktitle={Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing},
  year={2023}
}
