Human evaluation results and translation output for the Translator Human Parity Data release
Microsoft Translator Human Parity Data (v1.0)
This release contains the following data:
Two new references for newstest2017, one based on human translation from scratch (Reference-HT), the other based on human post-editing (Reference-PE).
Human parity translations generated by our research systems Combo-4, Combo-5, and Combo-6.
Output from the online machine translation service Online-A-1710, collected on October 16, 2017.
We release all data points collected in our human evaluation campaigns. This includes annotations for Subset-1, Subset-2, Subset-3, and Subset-4.
We share the (anonymized) annotator IDs, segment IDs, system IDs, type ID (either TGT or CHK, the second being a repeated judgment for the first), raw scores r in [0,100], as well as annotation start and end times.
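As an illustration of how the TGT and CHK judgments relate, the following minimal Python sketch pairs each repeated judgment with its first judgment and reports the mean absolute score difference per annotator. It is not part of the release. The record type `Judgment`, its field names, the numeric time fields, and the rule that a CHK item is matched to a TGT item by identical annotator, segment, and system IDs are all assumptions made for this example; the actual file layout in the release may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Judgment:
    """One annotation record (hypothetical field names, not the release schema)."""
    annotator_id: str
    segment_id: str
    system_id: str
    type_id: str       # "TGT" (first judgment) or "CHK" (repeated judgment)
    score: float       # raw score r in [0, 100]
    start_time: float  # assumed numeric timestamp
    end_time: float    # assumed numeric timestamp


def repeat_differences(judgments):
    """Return {annotator_id: [|TGT score - CHK score|, ...]}.

    Assumption: a CHK record repeats the TGT record that has the same
    annotator, segment, and system IDs.
    """
    first, repeated = {}, {}
    for j in judgments:
        if not 0 <= j.score <= 100:
            raise ValueError(f"score out of range [0, 100]: {j.score}")
        key = (j.annotator_id, j.segment_id, j.system_id)
        if j.type_id == "TGT":
            first[key] = j.score
        elif j.type_id == "CHK":
            repeated[key] = j.score
        else:
            raise ValueError(f"unknown type ID: {j.type_id}")

    diffs = defaultdict(list)
    for key, chk_score in repeated.items():
        if key in first:  # skip CHK records without a matching TGT record
            diffs[key[0]].append(abs(first[key] - chk_score))
    return dict(diffs)


def mean_repeat_difference(judgments):
    """Mean absolute TGT/CHK difference per annotator (lower is more consistent)."""
    return {a: mean(d) for a, d in repeat_differences(judgments).items()}
```

Annotators with no matched TGT/CHK pair are simply absent from the result, so a missing key should not be read as perfect consistency.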
Additionally, we share the combined data for the Meta-1 campaign on Subset-1.
If you use this data, we require that you cite our paper.
- URL: https://www.microsoft.com/en-us/download/details.aspx?id=56724
- Date: March 14, 2018
- Description: Initial release.
Microsoft Research Data License Agreement
For Translator Human Parity Data
This Microsoft Research Data License Agreement ("Agreement") is a legal agreement between you and Microsoft Corporation (or based on where you live, one of its affiliates). Please read them. They apply to the Microsoft Research dataset named above, which may include any associated materials, text or speech files, associated media and "online" or electronic documentation and any updates we provide in our discretion (together, the "Dataset"). The terms also apply to any Microsoft (i) updates, (ii) supplements, and (iii) internet-based services, and (iv) support services for this Dataset, unless other terms accompany those items. If so, those terms apply. By agreeing to this Agreement or by using the Dataset, you accept these terms. If you do not accept them, do not use the Dataset. If you comply with these terms, you have the rights below.
1. SCOPE OF RIGHTS. You may use, copy, modify, and create derivative works of the Dataset: i. for non-commercial or research purposes only. Examples of non-commercial uses are teaching, academic research, public demonstrations and personal experimentation; ii. for analyzing and testing purposes; and iii. to publish (or present papers or articles) on your results from using such Dataset, provided that no material portion of the Dataset is included in any such publication or presentation.
2. DISTRIBUTION RESTRICTIONS. You may not: (a) distribute the Dataset; (b) alter any copyright, trademark or patent notice in the Dataset; (c) use Microsoft's trademarks in a way that suggests your derivative works or modifications come from or are endorsed by Microsoft; or (d) include the Dataset in malicious, deceptive or unlawful programs.
3. OWNERSHIP. Microsoft retains all right, title, and interest in and to the Dataset.
4. FEEDBACK. If you give feedback about the Dataset to Microsoft, you give to Microsoft, without charge, the right to use, share and commercialize your feedback in any way and for any purpose. You also give to third parties, without charge, any patent rights needed for their products, technologies and services to use or interface with any specific parts of a Microsoft dataset or service that includes the feedback. You will not give feedback that is subject to a license that requires Microsoft to license its Dataset or documentation to third parties because we include your feedback in them. These rights survive this Agreement.
5. TERM; TERMINATION. The term of this Agreement will commence upon your acceptance of this Agreement and will continue unless terminated as provided herein. If you breach this Agreement or if you sue anyone over patents that you think may apply to or read on the Dataset or anyone's use of the Dataset, this Agreement (and your license and rights obtained herein) terminate automatically. If this agreement is terminated, you must return or destroy all full or partial copies of the Dataset in your possession immediately. Any sections that are intended to survive termination of this Agreement shall survive.
6. EXPORT RESTRICTIONS. The Dataset is subject to United States export laws and regulations. You must comply with all domestic and international export laws and regulations that apply to the Dataset. These laws include restrictions on destinations, end users and end use. For additional information, see www.microsoft.com/exporting.
7. ENTIRE AGREEMENT. This Agreement, and the terms for any supplements or updates that you use, are the entire agreement for the Dataset.
8. SEVERABILITY. If any court of competent jurisdiction determines that any provision of this Agreement is illegal, invalid or unenforceable, the remaining provisions will remain in full force and effect.
9. GOVERNING LAW AND VENUE. This Agreement is governed by and construed in accordance with the laws of the state of Washington, without reference to its choice of law principles to the contrary. Each party hereby consents to the jurisdiction and venue of the state and federal courts located in King County, Washington, with regard to any suit or claim arising under or by reason of this Agreement.
10. LEGAL EFFECT. This Agreement describes certain legal rights. You may have other rights under the laws of your country. You may also have rights with respect to the party from whom you acquired the Dataset. This Agreement does not change your rights under the laws of your country if the laws of your country do not permit it to do so.
11. NO ASSIGNMENT. You may not assign this Agreement or any rights or obligations hereunder, except with Microsoft's express written consent. Any attempted assignment in violation of this section will be void.
12. DISCLAIMER OF WARRANTY. The Dataset is licensed "as-is." You bear the risk of using it. Microsoft gives no express warranties, guarantees or conditions. You may have additional consumer rights or statutory guarantees under your local laws which this agreement cannot change. To the extent permitted under your local laws, Microsoft excludes the implied warranties of merchantability, fitness for a particular purpose and non-infringement.