mlqe-pe

Multilingual Quality Estimation and Automatic Post-editing Dataset. This is an updated version of the MLQE dataset that adds post-editing data as well as Ru-En data. Please refer to the MLQE repo for the NMT models that generated the data. The multilingual NMT models used to generate translations for the zero-shot language pairs can be found here: mBART50 (many-to-one for Ps-En and Km-En, and one-to-many for En-Cs and En-Ja).
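As a quick reference, the pairing between zero-shot language pairs and mBART50 variants described above can be written as a small lookup. This is only an illustrative sketch: the names `ZERO_SHOT_MODEL_VARIANT` and `mbart50_variant` are hypothetical and not part of this repository, and the lowercase `src-tgt` key format is an assumption made for the example.

```python
# Hypothetical helper (not part of this repo): maps each zero-shot language
# pair named in the README to the mBART50 variant used to translate it.
ZERO_SHOT_MODEL_VARIANT = {
    "ps-en": "many-to-one",  # Pashto -> English
    "km-en": "many-to-one",  # Khmer -> English
    "en-cs": "one-to-many",  # English -> Czech
    "en-ja": "one-to-many",  # English -> Japanese
}


def mbart50_variant(lang_pair: str) -> str:
    """Return the mBART50 variant for a zero-shot pair such as 'Ps-En'."""
    key = lang_pair.strip().lower()
    if key not in ZERO_SHOT_MODEL_VARIANT:
        raise KeyError(f"{lang_pair!r} is not one of the zero-shot pairs")
    return ZERO_SHOT_MODEL_VARIANT[key]


if __name__ == "__main__":
    for pair in ("Ps-En", "Km-En", "En-Cs", "En-Ja"):
        print(pair, mbart50_variant(pair))
```

Pairs translated into English use the many-to-one variant, and pairs translated out of English use the one-to-many variant. Any other pair raises a `KeyError`, since the remaining language pairs rely on the models in the MLQE repo.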

Citation

If you use this data in your work, please cite:

@article{fomicheva2020mlqepe,
    title={{MLQE-PE}: A Multilingual Quality Estimation and Post-Editing Dataset}, 
    author={Marina Fomicheva and Shuo Sun and Erick Fonseca and Fr\'ed\'eric Blain and Vishrav Chaudhary and Francisco Guzm\'an and Nina Lopatina and Lucia Specia and Andr\'e F.~T.~Martins},
    year={2020},
    journal={arXiv preprint arXiv:2010.04480}
}
@article{tacl2020,
    title = {Unsupervised Quality Estimation for Neural Machine Translation},
    author = {Fomicheva, Marina and Sun, Shuo and Yankovskaya, Lisa and Blain, Frédéric and Guzmán, Francisco and Fishel, Mark and Aletras, Nikolaos and Chaudhary, Vishrav and Specia, Lucia},
    journal = {Transactions of the Association for Computational Linguistics},
    volume = {8},
    pages = {539--555},
    year = {2020}
}
