UKPLab/emnlp-ws-2017-s3

Summarization evaluation tool for content selection evaluation.

This tool provides a score for a system summary given a set of human-written reference summaries. It is a more robust alternative to ROUGE for evaluating summarization systems.

It has been explicitly trained to exhibit high correlation with human judgments at the summary level (sometimes referred to as the micro-level).

If you reuse this software, please use the following citation:

@inproceedings{TUD-CS-2017-0202,
	title = {{Learning to Score System Summaries for Better Content Selection Evaluation}},
	author = {Peyrard, Maxime and Botschen, Teresa and Gurevych, Iryna},
	booktitle = {{Proceedings of the EMNLP workshop ``New Frontiers in Summarization''}},
	pages = {(to appear)},
	month = sep,
	year = {2017},
	location = {Copenhagen, Denmark}
}

Abstract: The evaluation of summaries is a challenging but crucial task of the summarization field. In this work, we propose to learn an automatic scoring metric based on the human judgments available as part of classical summarization datasets like TAC-2008 and TAC-2009. Any existing automatic scoring metric can be included as a feature; the model learns the combination exhibiting the best correlation with human judgments. The reliability of the new metric is tested in a further manual evaluation where we ask humans to evaluate summaries covering the whole scoring spectrum of the metric. We release the trained metric as an open-source tool.
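The core idea in the abstract — learning a combination of existing metrics that best fits human judgments — can be sketched as a simple least-squares regression. This is only an illustration of the approach, not the released model: the feature columns (stand-ins for metrics such as ROUGE-1, ROUGE-2, or JS divergence) and the human scores are invented, and the actual tool uses its own trained models under Data/models/.

```python
# Illustrative sketch of learning a metric combination from human judgments.
# All data below is hypothetical; this is NOT the released S3 model.
import numpy as np

def fit_metric_combination(features, human_scores):
    """Least-squares weights (plus bias) mapping metric scores to a human score."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])  # bias column
    weights, *_ = np.linalg.lstsq(X, human_scores, rcond=None)
    return weights

def score_summary(metric_scores, weights):
    """Score one summary from its vector of per-metric scores."""
    return float(np.dot(np.append(metric_scores, 1.0), weights))

# Hypothetical training data: one row per summary, one column per existing
# metric (e.g. ROUGE-1, ROUGE-2, JS divergence); targets are human judgments.
features = np.array([[0.40, 0.10, 0.30],
                     [0.55, 0.20, 0.20],
                     [0.30, 0.05, 0.45],
                     [0.60, 0.25, 0.15]])
human_scores = np.array([2.0, 4.0, 1.0, 5.0])

w = fit_metric_combination(features, human_scores)
print(score_summary(np.array([0.50, 0.15, 0.25]), w))
```

The real metric is trained on TAC-2008/2009 judgments and richer feature sets, but the learning objective — maximize agreement with human scores — is the same.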

Contact person: Maxime Peyrard, peyrard@aiphes.tu-darmstadt.de

http://www.ukp.tu-darmstadt.de/

Requirements

Usage

The trained models for English and German are available under Data/models/. For English, we used the dependency-based word embeddings from https://levyomer.wordpress.com/2014/04/25/dependency-based-word-embeddings/; for German, the following embeddings (with min count 5): https://www.ukp.tu-darmstadt.de/research/ukp-in-challenges/germeval-2014/

To test the installation, run:

python example.py [embedding_path] [model_folder]
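If you want to drive the test script from Python rather than the shell, a minimal wrapper can build the command line; the paths passed in the example below are hypothetical placeholders, not files shipped with the repository.

```python
# Minimal sketch of invoking example.py from Python; the embedding path and
# model folder shown are hypothetical placeholders.
import subprocess
import sys

def example_command(embedding_path, model_folder):
    """Build the argv list for running example.py with the given resources."""
    return [sys.executable, "example.py", embedding_path, model_folder]

cmd = example_command("embeddings.txt", "Data/models/en")
# subprocess.run(cmd, check=True)  # uncomment to actually run the script
print(cmd[1:])
```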
