Human evaluation results and translation output for the Translator Human Parity Data release

Translator-HumanParityData

Microsoft Translator Human Parity Data (v1.0)

This release contains the following data:

  1. References

    Two new references for newstest2017, one based on human translation from scratch (Reference-HT), the other based on human post-editing (Reference-PE).

  2. Translations

    Human parity translations generated by our research systems Combo-4, Combo-5, and Combo-6, and output from the online machine translation service Online-A-1710, collected on October 16, 2017.

  3. Evaluations

    We release all data points collected in our human evaluation campaigns. This includes annotations for Subset-1, Subset-2, Subset-3, and Subset-4.

    We share the (anonymized) annotator IDs, segment IDs, system IDs, type IDs (either TGT or CHK, where CHK marks a repeated judgment of a TGT item), raw scores r in [0,100], and annotation start and end times.

    Additionally, we share the combined data for the Meta-1 campaign on Subset-1.

  4. Citation

    When using this data we require that you cite our paper.
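
The evaluation fields listed under item 3 (annotator ID, segment ID, system ID, type ID, raw score, start and end times) lend themselves to a small analysis script. The following is a minimal sketch only: the CSV layout and the column names (`annotator_id`, `segment_id`, `system_id`, `type_id`, `score`, `start`, `end`) are assumptions made for illustration and may not match the released files, and the per-annotator z-score standardization is a common convention for this kind of scoring data, not necessarily the procedure used for this release. The sketch also assumes that a CHK row shares its annotator, segment, and system IDs with the TGT row it repeats.

```python
import csv
import io
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical sample in an assumed CSV layout; real files may differ.
SAMPLE = """annotator_id,segment_id,system_id,type_id,score,start,end
a1,1,Combo-6,TGT,80,0,5
a1,1,Combo-6,CHK,70,5,9
a1,2,Reference-HT,TGT,60,9,14
a2,1,Combo-6,TGT,50,0,4
a2,2,Reference-HT,TGT,90,4,8
"""


def read_rows(handle):
    """Read evaluation rows from a CSV file object that has a header line."""
    rows = []
    for row in csv.DictReader(handle):
        row["score"] = float(row["score"])  # raw score r in [0, 100]
        rows.append(row)
    return rows


def standardize_by_annotator(rows):
    """Attach a per-annotator z-score ("z") to every row."""
    scores = defaultdict(list)
    for row in rows:
        scores[row["annotator_id"]].append(row["score"])
    stats = {a: (mean(s), pstdev(s)) for a, s in scores.items()}
    for row in rows:
        mu, sigma = stats[row["annotator_id"]]
        # An annotator who always gives the same score gets z = 0.
        row["z"] = (row["score"] - mu) / sigma if sigma > 0 else 0.0
    return rows


def system_means(rows, key="z"):
    """Average the TGT judgments per system; CHK repeats are excluded."""
    by_system = defaultdict(list)
    for row in rows:
        if row["type_id"] == "TGT":
            by_system[row["system_id"]].append(row[key])
    return {system: mean(values) for system, values in by_system.items()}


def repeat_differences(rows):
    """Absolute raw-score gap between each CHK row and its matching TGT row."""
    def item_key(row):
        return (row["annotator_id"], row["segment_id"], row["system_id"])

    first = {item_key(r): r["score"] for r in rows if r["type_id"] == "TGT"}
    return [
        abs(r["score"] - first[item_key(r)])
        for r in rows
        if r["type_id"] == "CHK" and item_key(r) in first
    ]


if __name__ == "__main__":
    rows = standardize_by_annotator(read_rows(io.StringIO(SAMPLE)))
    print(system_means(rows, key="score"))
    print(system_means(rows, key="z"))
    print(repeat_differences(rows))
```

Standardizing per annotator before averaging reduces the effect of annotators who use the 0 to 100 scale differently, and the TGT/CHK gaps give a rough view of how consistently an annotator repeats a judgment.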

Releases

Translator-HumanParityData v1.0

Microsoft Research Data License Agreement

For Translator Human Parity Data 

This Microsoft Research Data License Agreement ("Agreement") is a legal
agreement between you and Microsoft Corporation (or based on where you
live, one of its affiliates). Please read its terms. They apply to the Microsoft
Research dataset named above, which may include any associated materials,
text or speech files, associated media and "online" or electronic documentation
and any updates we provide in our discretion (together, the "Dataset"). The
terms also apply to any Microsoft (i) updates, (ii) supplements, (iii)
internet-based services, and (iv) support services for this Dataset, unless
other terms accompany those items. If so, those terms apply.

By agreeing to this Agreement or by using the Dataset, you accept these terms.
If you do not accept them, do not use the Dataset. If you comply with these
terms, you have the rights below.

1.	SCOPE OF RIGHTS.
    You may use, copy, modify, and create derivative works of the Dataset:
      i.    for non-commercial or research purposes only. Examples of non-commercial
            uses are teaching, academic research, public demonstrations and personal
            experimentation;
     ii.    for analyzing and testing purposes; and
    iii.    to publish (or present papers or articles) on your results from using
            such Dataset, provided that no material portion of the Dataset is
            included in any such publication or presentation.

2.	DISTRIBUTION RESTRICTIONS.  You may not: (a) distribute the Dataset; (b) alter
any copyright, trademark or patent notice in the Dataset; (c) use Microsoft's
trademarks in a way that suggests your derivative works or modifications come from
or are endorsed by Microsoft; or (d) include the Dataset in malicious, deceptive
or unlawful programs.

3.	OWNERSHIP.  Microsoft retains all right, title, and interest in and to the Dataset.  

4.	FEEDBACK. If you give feedback about the Dataset to Microsoft, you give to
Microsoft, without charge, the right to use, share and commercialize your feedback
in any way and for any purpose. You also give to third parties, without charge, any
patent rights needed for their products, technologies and services to use or interface
with any specific parts of a Microsoft dataset or service that includes the feedback.
You will not give feedback that is subject to a license that requires Microsoft to
license its Dataset or documentation to third parties because we include your feedback
in them. These rights survive this Agreement.

5.	TERM; TERMINATION. The term of this Agreement will commence upon your acceptance of
this Agreement and will continue unless terminated as provided herein. If you breach
this Agreement or if you sue anyone over patents that you think may apply to or read
on the Dataset or anyone's use of the Dataset, this Agreement (and your license and
rights obtained herein) terminate automatically. If this Agreement is terminated, you
must return or destroy all full or partial copies of the Dataset in your possession
immediately.  Any sections that are intended to survive termination of this Agreement
shall survive.

6.	EXPORT RESTRICTIONS. The Dataset is subject to United States export laws and
regulations. You must comply with all domestic and international export laws and
regulations that apply to the Dataset. These laws include restrictions on
destinations, end users and end use. For additional information, see
www.microsoft.com/exporting.

7.	ENTIRE AGREEMENT. This Agreement, and the terms for any supplements or updates
that you use, are the entire agreement for the Dataset.

8.	SEVERABILITY. If any court of competent jurisdiction determines that any
provision of this Agreement is illegal, invalid or unenforceable, the remaining
provisions will remain in full force and effect.  

9.	GOVERNING LAW AND VENUE. This Agreement is governed by and construed in
accordance with the laws of the state of Washington, without reference to its
choice of law principles to the contrary.  Each party hereby consents to the
jurisdiction and venue of the state and federal courts located in King County,
Washington, with regard to any suit or claim arising under or by reason of
this Agreement. 

10.	LEGAL EFFECT. This Agreement describes certain legal rights. You may have
other rights under the laws of your country. You may also have rights with
respect to the party from whom you acquired the Dataset. This Agreement does not
change your rights under the laws of your country if the laws of your country do 
not permit it to do so.

11.	NO ASSIGNMENT. You may not assign this Agreement or any rights or
obligations hereunder, except with Microsoft's express written consent. Any
attempted assignment in violation of this section will be void.

12.	DISCLAIMER OF WARRANTY. The Dataset is licensed "as-is." You bear the risk
of using it. Microsoft gives no express warranties, guarantees or conditions.
You may have additional consumer rights or statutory guarantees under your
local laws which this Agreement cannot change. To the extent permitted under
your local laws, Microsoft excludes the implied warranties of merchantability,
fitness for a particular purpose and non-infringement.