Nicola Bertoldi edited this page May 31, 2019 · 11 revisions

Welcome to ModernMT

ModernMT is a context-aware, incremental and distributed general-purpose Machine Translation technology.

ModernMT will overcome four technology barriers that still hinder the wide adoption of currently available MT software by end-users and language service providers:

  • ModernMT will be a ready to install application that will not require any initial training phase.
  • The ModernMT system will manage context automatically so that it will not require building domain-specific systems.
  • ModernMT will enable scalability of data and users so that no more expensive ad-hoc hardware installations are needed.
  • ModernMT will create a data collection infrastructure that accelerates the process of filling the data gap between large web companies and the machine translation industry.

Getting Started

If you want to quickly install and run ModernMT, follow the steps in our Readme.

For more details about the installation procedure, go to the Installation page.

Once you have tested the system with an example engine, we suggest reading our CLI Documentation and API Documentation. There you will find all the details about the ModernMT Command Line Interface and REST API, respectively, which are fundamental if you want to start using ModernMT in your own translation process.

Translation process in ModernMT

The translation process in ModernMT is quite different from that of common, non-adapting Machine Translation technologies. The models you can create with this tool do not merge all the parallel data into a single indistinguishable heap; instead, a separate container is created for each data source. We call these containers memories.

Memory

A memory is a container of thematic parallel text; good examples of memories could be Europarl, a customer Translation Memory or even the content of a multilingual website. In short, there is no strict definition of what a memory must contain; it is up to the user, who can create a memory, append a single contribution (a source/translation pair) or import in batch all the translation units coming from a TMX file, for example.
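The following is a minimal sketch of a memory as a data structure, written only to illustrate the three operations mentioned above (create, append a contribution, batch import from TMX). The `Memory` class and its method names are hypothetical and are not part of ModernMT; the only thing taken from a real format is the TMX layout (`tu`, `tuv`, `seg` elements and the `xml:lang` attribute).

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# ElementTree expands the predefined "xml:" prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


@dataclass
class Memory:
    """Hypothetical in-memory model of a ModernMT memory (illustration only)."""
    name: str
    units: list = field(default_factory=list)  # (source, translation) pairs

    def add(self, source, translation):
        """Append a single contribution."""
        self.units.append((source, translation))

    def import_tmx(self, tmx_text, source_lang, target_lang):
        """Import every translation unit that has both languages.

        Returns the number of imported units.
        """
        root = ET.fromstring(tmx_text)
        imported = 0
        for tu in root.iter("tu"):
            segments = {}
            for tuv in tu.iter("tuv"):
                # TMX 1.4 uses xml:lang; older files may use a plain "lang".
                lang = tuv.get(XML_LANG) or tuv.get("lang")
                seg = tuv.find("seg")
                if lang and seg is not None:
                    segments[lang.lower()] = "".join(seg.itertext())
            src = segments.get(source_lang.lower())
            tgt = segments.get(target_lang.lower())
            if src is not None and tgt is not None:
                self.add(src, tgt)
                imported += 1
        return imported


SAMPLE_TMX = """<tmx version="1.4"><header/><body>
<tu>
  <tuv xml:lang="en"><seg>Good morning</seg></tuv>
  <tuv xml:lang="it"><seg>Buongiorno</seg></tuv>
</tu>
</body></tmx>"""

memory = Memory("website")
memory.add("Hello", "Ciao")               # single contribution
memory.import_tmx(SAMPLE_TMX, "en", "it")  # batch import
```

In a real installation these operations go through the ModernMT CLI or REST API rather than a local class; see the CLI Documentation and API Documentation for the actual commands.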

Context Vector

ModernMT is able to adapt the output for the very same source text, changing the translation according to a given Context Vector. This data structure holds a score for each of the memories most relevant to a translation, and this information is used during decoding to change the probabilities of the internal models.
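To make the idea of "a score for each relevant memory" concrete, here is a toy sketch that ranks memories by bag-of-words cosine similarity to a context text. This is not how ModernMT computes its scores (the scoring method is not described on this page); the function names, the similarity measure and the `limit` parameter are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts of a text."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def context_vector(context, memories, limit=10):
    """Map memory id -> score for the `limit` memories closest to `context`.

    `memories` maps a memory id to a list of its source sentences.
    Memories with a zero score are left out of the vector.
    """
    ctx = bag_of_words(context)
    scores = {}
    for memory_id, sentences in memories.items():
        score = cosine(ctx, bag_of_words(" ".join(sentences)))
        if score > 0:
            scores[memory_id] = score
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:limit])


memories = {
    1: ["the parliament adopted the resolution"],
    2: ["click the button to save the file"],
}
vector = context_vector("the parliament voted on the resolution", memories)
# Memory 1 (parliamentary text) receives a higher score than memory 2.
```

The resulting dictionary plays the role of the Context Vector: the decoder would give more weight to the memories with higher scores when translating text from that context.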

Calculating a Context Vector is very simple, and ModernMT does it automatically when you invoke the Translate API. You can specify a text, called the context, that better qualifies the source sentence you want to translate. For example, if you are about to translate a sentence from a document, you can use a few of the lines that come right before the source sentence.
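A translation request with a context text could be sketched as follows. The base URL, the `/translate` path and the parameter names `q`, `source`, `target` and `context` are assumptions made for this example; check them against the API Documentation for your ModernMT version before relying on them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of a locally running ModernMT node.
BASE_URL = "http://localhost:8045"


def translate_url(sentence, source, target, context=None, base_url=BASE_URL):
    """Build the request URL; parameter names are assumptions (see above)."""
    params = {"q": sentence, "source": source, "target": target}
    if context:
        # Text that qualifies the sentence, e.g. the preceding lines.
        params["context"] = context
    return f"{base_url}/translate?{urlencode(params)}"


def translate(sentence, source, target, context=None):
    """Send the request and return the decoded JSON response."""
    with urlopen(translate_url(sentence, source, target, context)) as response:
        return json.load(response)


# Use the lines that come right before the sentence as its context.
document = [
    "The patient was admitted on Monday.",
    "Blood tests were performed.",
    "The results were positive.",
]
url = translate_url(document[2], "en", "it", context=" ".join(document[:2]))
```

Only `translate_url` is called here, so the snippet runs without a server; calling `translate` requires a running engine.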

It is also possible to calculate the Context Vector explicitly with the Context Vector Guessing API. This comes in handy when you want to analyse a large piece of content, such as an entire document, and reuse the calculated Context Vector for multiple translation requests by passing it directly to the Translate API.
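The reuse pattern described above might look like the sketch below: one request to obtain the Context Vector for a whole document, then one translation request per sentence carrying the same vector. The `/context-vector` path, the `targets`, `text` and `context_vector` parameter names, and the `id:score,id:score` serialization are assumptions for illustration; the API Documentation is the reference for the actual interface.

```python
from urllib.parse import urlencode

# Assumed address of a locally running ModernMT node.
BASE_URL = "http://localhost:8045"


def context_vector_url(text, source, targets, base_url=BASE_URL):
    """URL of the (assumed) Context Vector Guessing endpoint."""
    params = {"source": source, "targets": ",".join(targets), "text": text}
    return f"{base_url}/context-vector?{urlencode(params)}"


def encode_context_vector(vector):
    """Serialize {memory_id: score} as 'id:score,id:score' (assumed format)."""
    return ",".join(f"{memory}:{score}" for memory, score in vector.items())


def translate_urls(sentences, source, target, vector, base_url=BASE_URL):
    """One translation URL per sentence, all sharing the same vector."""
    encoded = encode_context_vector(vector)
    urls = []
    for sentence in sentences:
        params = {
            "q": sentence,
            "source": source,
            "target": target,
            "context_vector": encoded,
        }
        urls.append(f"{base_url}/translate?{urlencode(params)}")
    return urls


sentences = ["First sentence.", "Second sentence."]
# Step 1: a single request for the whole document.
guess_url = context_vector_url(" ".join(sentences), "en", ["it"])
# Step 2: reuse the returned vector (hypothetical values shown here).
vector = {1: 0.5, 2: 0.25}
requests_to_send = translate_urls(sentences, "en", "it", vector)
```

Computing the vector once avoids repeating the context analysis for every sentence of the same document.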