JFLEG (JHU FLuency-Extended GUG) corpus

Last updated: December 7th, 2018

(Make sure to download and use the latest version.)

link to the paper


Data

.
├── EACL_exp      # experiments in the EACL paper
│   ├── m2converter # script to create m2 format from plain texts
│   ├── mturk     # mechanical turk experiments
│   │   ├── sample.csv
│   │   ├── pairwise.csv
│   │   └── template.html
│   └── manual_eval # manual analysis of 100 sentences
│       ├── README.md
│       └── coded_sentences.csv
├── README.md     # This file
├── EACLshort037.pdf
├── dev           # dev set (754 sentences originally from the GUG **test** set)
│   ├── dev.ref0
│   ├── dev.ref1
│   ├── dev.ref2
│   ├── dev.ref3
│   ├── dev.spellchecked.src  # source spellchecked by enchant
│   └── dev.src   # source (This should be the input for your system.)
├── eval
│   └── gleu.py   # evaluation script (sentence-level GLEU score)
└── test          # test set (747 sentences originally from the GUG **dev** set)
    ├── test.ref0
    ├── test.ref1
    ├── test.ref2
    ├── test.ref3
    ├── test.spellchecked.src  # source spellchecked by enchant
    └── test.src   # source (This should be the input for your system.)
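
The layout above pairs each source file with four reference files. As a minimal sketch of how a split could be loaded, the Python below reads `dev.src` (or `test.src`) together with `ref0` to `ref3` and groups the references per sentence. It assumes that every file holds one sentence per line and that all five files are line-aligned; the function name `load_split` and its parameters are hypothetical and not part of this repository.

```python
from pathlib import Path
from typing import List, Tuple


def read_lines(path: Path) -> List[str]:
    """Read a text file as a list of lines without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def load_split(root: Path, split: str, n_refs: int = 4) -> Tuple[List[str], List[List[str]]]:
    """Load one split ("dev" or "test") from a checkout rooted at `root`.

    Returns (sources, references), where references[i] is the list of
    n_refs corrections for sources[i].
    Assumption: one sentence per line, all files line-aligned.
    """
    split_dir = root / split
    sources = read_lines(split_dir / f"{split}.src")
    per_file = [read_lines(split_dir / f"{split}.ref{k}") for k in range(n_refs)]

    # Fail early if any reference file does not line up with the source.
    for k, lines in enumerate(per_file):
        if len(lines) != len(sources):
            raise ValueError(
                f"{split}.ref{k} has {len(lines)} lines, expected {len(sources)}"
            )

    # Transpose from "one list per file" to "one list per sentence".
    references = [list(group) for group in zip(*per_file)]
    return sources, references
```

Calling `load_split(Path("."), "dev")` from the repository root should then yield 754 source sentences if the alignment assumption holds.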

Evaluation

e.g. python ./eval/gleu.py -r ./dev/dev.ref[0-3] -s ./dev/dev.src --hyp YOUR_SYSTEM_OUTPUT
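
If you prefer to call the script from Python rather than a shell, the command above can be assembled programmatically. The sketch below only builds the same argument list (the shell pattern `dev.ref[0-3]` expands to the four reference files) and passes it to `subprocess`. The helper names `gleu_command` and `run_gleu` are hypothetical; the sketch makes no assumption about what `gleu.py` prints and simply returns its standard output as text.

```python
import subprocess
from typing import List


def gleu_command(
    split: str,
    hyp_path: str,
    n_refs: int = 4,
    script: str = "./eval/gleu.py",
    interpreter: str = "python",
) -> List[str]:
    """Build the argument list equivalent to the shell example.

    Uses only the flags shown in this README: -r, -s and --hyp.
    """
    refs = [f"./{split}/{split}.ref{k}" for k in range(n_refs)]
    return [
        interpreter, script,
        "-r", *refs,
        "-s", f"./{split}/{split}.src",
        "--hyp", hyp_path,
    ]


def run_gleu(split: str, hyp_path: str) -> str:
    """Run the evaluation script and return whatever it writes to stdout."""
    result = subprocess.run(
        gleu_command(split, hyp_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script exits non-zero
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting issues when the hypothesis path contains spaces. Set `interpreter` to whichever Python executable the script runs under on your machine.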

Leader Board (published results)

N.B. Systems marked with an asterisk (*) are tuned on different data.

System                                   GLEU (dev)  GLEU (test)
Ge et al. (2018)                         N/A         62.42
Grundkiewicz and Junczys-Dowmunt (2018)  N/A         61.50
Junczys-Dowmunt et al. (2018)            N/A         59.90
Chollampatt and Ng (2018)                52.48       57.47
Chollampatt and Ng (2017)                51.01       56.78
Xie et al. (2018)*                       N/A         56.20
Sakaguchi et al. (2017)                  49.82       53.98
Ji et al. (2017)*                        48.93       53.41
Yuan and Briscoe (2016)*                 47.20       52.05
Junczys-Dowmunt and Grundkiewicz (2016)  49.74       51.46
Chollampatt et al. (2016)*               46.27       50.13
Felice et al. (2014)*                    42.81       46.04
=======================================  ==========  ===========
SOURCE                                   38.21       40.54
REFERENCE                                55.26       62.37

  • If you want to add your score, please e-mail a link to your paper and your system outputs to keisuke[at]cs.jhu.edu.
  • The REFERENCE scores are computed by scoring each reference individually and averaging the results.

Reference

The following papers should be cited in any publications that use this dataset:

Courtney Napoles, Keisuke Sakaguchi and Joel Tetreault. (EACL 2017): JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Valencia, Spain. April 03-07, 2017.

Michael Heilman, Aoife Cahill, Nitin Madnani, Melissa Lopez, Matthew Mulholland, and Joel Tetreault. (ACL 2014): Predicting Grammaticality on an Ordinal Scale. In Proceedings of the Association for Computational Linguistics. Baltimore, MD, USA. June 23-25, 2014.

bibtex information:

@InProceedings{napoles-sakaguchi-tetreault:2017:EACLshort,
  author    = {Napoles, Courtney  and  Sakaguchi, Keisuke  and  Tetreault, Joel},
  title     = {JFLEG: A Fluency Corpus and Benchmark for Grammatical Error Correction},
  booktitle = {Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers},
  month     = {April},
  year      = {2017},
  address   = {Valencia, Spain},
  publisher = {Association for Computational Linguistics},
  pages     = {229--234},
  url       = {http://www.aclweb.org/anthology/E17-2037}
}

@InProceedings{heilman-EtAl:2014:P14-2,
  author    = {Heilman, Michael  and  Cahill, Aoife  and  Madnani, Nitin  and  Lopez, Melissa  and  Mulholland, Matthew  and  Tetreault, Joel},
  title     = {Predicting Grammaticality on an Ordinal Scale},
  booktitle = {Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
  month     = {June},
  year      = {2014},
  address   = {Baltimore, Maryland},
  publisher = {Association for Computational Linguistics},
  pages     = {174--180},
  url       = {http://www.aclweb.org/anthology/P14-2029}
}

Questions

  • Please e-mail Courtney Napoles (napoles[at]cs.jhu.edu) and Keisuke Sakaguchi (keisuke[at]cs.jhu.edu).

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
