tntc-project/MultiEnJa

MultiEnJa

Introduction

This repository contains 46 English source documents (SDs) from various content domains commonly handled by translation service providers (TSPs), together with several types of translation-related derivatives that we have produced and annotated in order to analyze the norms and competences involved in the translation process.

Contents

The textual data of the source documents are found here.

For all of them, we have so far produced the following types of derivatives (see README.md in each directory for details).

Figure 1 summarizes how these heterogeneous translations are produced and what can be analyzed by comparing them:

  • MT outputs vs. their post-edited version: the issues of MT
  • Human translation (HT) vs. MT plus post-editing (MT+PE): the gap between what is assured by ISO/TC37 (2015) and by ISO/TC37 (2017)
  • Draft/unpolished translation vs. final/polished translation: the art in human translation (or translation strategies)


Fig. 1: Overview of the processes of producing translations and their purposes.
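
As a starting point for the first comparison listed above (MT outputs vs. their post-edited versions), the following is a minimal sketch, not part of this repository, of how the edits made during post-editing could be extracted. It assumes you have already loaded one MT sentence and its post-edited counterpart as plain strings; the function names `edit_spans` and `similarity` and the sample sentences are hypothetical and chosen only for illustration. The comparison is character-based because Japanese text is not whitespace-delimited.

```python
from difflib import SequenceMatcher


def edit_spans(mt: str, pe: str) -> list[tuple[str, str, str]]:
    """Return (operation, MT segment, PE segment) for every changed span.

    The operation is one of "replace", "delete", or "insert", as reported
    by difflib. Unchanged spans are skipped.
    """
    # autojunk=False disables difflib's heuristic that ignores very
    # frequent characters in long sequences.
    matcher = SequenceMatcher(a=mt, b=pe, autojunk=False)
    return [
        (tag, mt[i1:i2], pe[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def similarity(mt: str, pe: str) -> float:
    """Character-level similarity in [0, 1]; 1.0 means no post-edits."""
    return SequenceMatcher(a=mt, b=pe, autojunk=False).ratio()


if __name__ == "__main__":
    # Hypothetical sentence pair, invented for illustration only;
    # it is not taken from the dataset.
    mt_output = "このソフトウェアは無料でダウンロードできます。"
    post_edited = "本ソフトウェアは無償でダウンロードできます。"

    for operation, mt_part, pe_part in edit_spans(mt_output, post_edited):
        print(f"{operation}: {mt_part!r} -> {pe_part!r}")
    print(f"similarity: {similarity(mt_output, post_edited):.2f}")
```

A character-level diff of this kind only locates where post-editors intervened; classifying why each change was made (the "issues of MT") still requires the annotations or manual analysis.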

Todo

  • Include other types of derivatives.
  • Increase the number of documents and cover additional translation directions.

References

License

Creative Commons License

Acknowledgments

This dataset is an outcome of JSPS KAKENHI Grant-in-Aid for Scientific Research (S) 19H05660, Developing a Translation Process Model and Constructing an Integrated Translation Environment through Detailed Descriptions of Translation Norms and Competences.
