
HSplit-corpus

Gold-Standard Sentence Splitting Corpus

If you use the corpus, please cite the following paper:

  BLEU is Not Suitable for the Evaluation of Text Simplification
  Elior Sulem, Omri Abend and Ari Rappoport
  Proc. of EMNLP 2018

./HSplit

A gold-standard sentence splitting corpus consisting of the outputs written by 4 annotators, who were given the complex side of the test corpus of Xu et al. (2016) and followed the sentence splitting guidelines. HSplit 1 and 2 correspond to the Set 1 guidelines. HSplit 3 and 4 correspond to the Set 2 guidelines.

Uniform tokenization and truecasing styles are obtained using the Moses toolkit (Koehn et al., 2007).
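
As a rough illustration of how the four annotations might be used together as multiple references, here is a minimal Python sketch. It assumes, without this README confirming it, that each annotator's output is a plain-text UTF-8 file with one entry per line, line-aligned with the source sentences. The function names and the file paths you pass in are hypothetical; only the annotator-to-guideline mapping comes from the description above.

```python
from pathlib import Path
from typing import Dict, List, Sequence

# Guideline set for each annotator, as stated in this README:
# HSplit 1 and 2 follow Set 1, HSplit 3 and 4 follow Set 2.
GUIDELINE_SET = {1: 1, 2: 1, 3: 2, 4: 2}


def load_references(paths: Sequence[Path]) -> List[List[str]]:
    """Read one file per annotator and return, for every source sentence,
    the list of references in annotator order.

    Assumes one entry per line and identical line counts across files.
    """
    per_annotator = [
        Path(p).read_text(encoding="utf-8").splitlines() for p in paths
    ]
    lengths = {len(lines) for lines in per_annotator}
    if len(lengths) != 1:
        raise ValueError(f"reference files differ in length: {sorted(lengths)}")
    # Transpose: from per-annotator lists to per-sentence lists.
    return [list(refs) for refs in zip(*per_annotator)]


def group_by_guideline_set(refs: Sequence[str]) -> Dict[int, List[str]]:
    """Group the references for one sentence by guideline set (1 or 2).

    `refs` must be ordered as HSplit 1, 2, 3, 4.
    """
    grouped: Dict[int, List[str]] = {}
    for annotator, ref in enumerate(refs, start=1):
        grouped.setdefault(GUIDELINE_SET[annotator], []).append(ref)
    return grouped
```

Calling `load_references` with the four annotator files in order (HSplit 1 to 4) gives one four-element list per source sentence, and `group_by_guideline_set` separates those into the Set 1 and Set 2 references if the two guideline sets are to be evaluated separately.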

License

Creative Commons Attribution-ShareAlike 3.0 Unported license (CC BY-SA 3.0)