SAMSA Evaluation Metric and Human Evaluation Benchmark for Text Simplification
If you use the metric or the benchmark, please cite the following paper:
Elior Sulem, Omri Abend and Ari Rappoport. Semantic Structural Evaluation for Text Simplification. Proc. of NAACL 2018.
The benchmark consists of human evaluation scores given by 5 annotators for the 4 elicitation questions described in the paper. Each annotator scored the same 700 (input, output) pairs.
The source sentences and the system outputs can be found at http://homepages.inf.ed.ac.uk/snaraya2/data/simplification-2016.tgz.
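For instance, per-pair scores for one elicitation question can be aggregated by averaging across the 5 annotators. The sketch below is a minimal example and assumes a hypothetical layout in which each annotator's ratings are stored as a list of 700 numbers; it is not the release's actual data format.

```python
def average_scores(annotator_scores):
    """Average ratings across annotators for each (input, output) pair.

    annotator_scores: list of per-annotator rating lists, all of equal
    length (e.g. 5 lists of 700 ratings in the benchmark's setting).
    Returns one averaged score per pair.
    """
    n_annotators = len(annotator_scores)
    n_pairs = len(annotator_scores[0])
    return [
        sum(scores[i] for scores in annotator_scores) / n_annotators
        for i in range(n_pairs)
    ]
```

A usage example with 3 annotators and 2 pairs: `average_scores([[1, 3], [2, 3], [3, 3]])` yields `[2.0, 3.0]`.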
./SAMSA_score.py: The SAMSA metric
./SAMSA_abl_score.py: The SAMSA metric without the non-splitting penalty