
MutNMT: Basic and advanced features

gramirez-prompsit edited this page Sep 22, 2022 · 4 revisions

MutNMT from scratch

The landing page of MutNMT invites people with no technical knowledge of neural machine translation to use the tool. It sums up some of the tool's features, the project it is related to and the supporting materials we recommend. It also has a link to the code on GitHub, an acknowledgement to JoeyNMT, a link to the MultiTraiNMT project website and a Log in button:

MutNMT Homepage

Accessing MutNMT is simple: only a Google account is required. There are three user profiles: Beginners (default access to all basic features), Experts (access to basic and advanced features without administration rights) and Admins (full access to all features). By default, users log in as Beginners, and Admins can upgrade them to Experts or Admins.

Once in, users see a menu with all the features available for their profile. Under the menu, users will meet Nile, the guide to MutNMT, who introduces the different features and explains some of them further when called. The user profile and a Sign out button are also on this side of the screen. The screenshots below compare an Admin profile menu, with all possible features enabled, and a Beginner profile menu, with some of them restricted:

MutNMT User menu comparison MutNMT User menu comparison

The Experts menu is the same as the Admin menu, without the Admin section. From this moment on, users can start interacting with MutNMT.

Basic features

The first set of features was created to let users preview available corpora and engines and use them to translate texts and documents, inspect how translation is done, compare engines and evaluate machine translation output.

Data

MutNMT needs corpora in the form of parallel data to learn from (e.g. an English-French document where all English sentences have been translated into French). Beginners can access the Data section to see which corpora are available as Public corpora and add them to their own corpora section, called Your corpora. Each corpus is listed with its size, languages, domain, date of upload, description, etc.:

MutNMT Data section
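To make the idea of parallel data concrete, the following minimal sketch pairs two aligned sentence lists line by line. The sample sentences and the `pair_sentences` helper are illustrative assumptions and are not part of MutNMT.

```python
def pair_sentences(source_lines, target_lines):
    """Pair line i of the source with line i of the target."""
    if len(source_lines) != len(target_lines):
        raise ValueError("parallel files must have the same number of lines")
    return list(zip(source_lines, target_lines))


# Hypothetical English-French sample, one sentence per line.
english = ["Good morning.", "Where is the station?"]
french = ["Bonjour.", "Où est la gare ?"]

pairs = pair_sentences(english, french)
for source, target in pairs:
    print(f"{source}\t{target}")
```

The length check reflects the defining property of a parallel corpus: every source sentence has exactly one target sentence in the same position.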

Previewing, downloading and grabbing corpora are possible as part of the basic options for Corpora. A preview can show individual files (a sample of the sentences in one language or in the other) or parallel data (pairs of source and target sentences, each being the translation of the other), as in the following screenshot:

MutNMT Data preview

We will see later that, as part of the advanced features, uploading corpora is enabled for the Expert and Admin profiles. After uploading a corpus, the user can choose to share it so that all profiles can see it in the Public corpora section.

Engines

As with Data, there is a library of Engines in MutNMT, that is, machine translation systems that have already been trained and shared. Here, Beginners can see Public engines and grab them to make them part of their own engines (Your engines). They can use them to translate, inspect or compare different machine translation systems. Each engine is listed with its language pair, description, trainer, automatic score, etc.

MutNMT Engines section

Of special interest are the actions allowed: seeing the full training log of an engine, downloading the model, downloading the corpora it was trained with, and grabbing or removing the corpora. The training log will keep users busy and excited about the engine details! While Beginners can only view engines, Experts and Admins can also Resume the training of an engine.

Translate

With the engines available in the Your engines section, all users can copy and paste a set of sentences and translate them using one of those engines. The resulting translation appears in the text box, and it can be saved as a TMX, a standard format that stores the pairs of source and target sentences translated in this section. Documents are also allowed as input and, as output, you can get the same document format or, again, a TMX.

MutNMT Translate section
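For readers unfamiliar with TMX, the sketch below builds a minimal TMX 1.4 document from source and target sentence pairs using Python's standard library. It only illustrates the format: the `build_tmx` helper, the header values and the sample sentences are assumptions, and the TMX files exported by MutNMT may carry different metadata.

```python
import xml.etree.ElementTree as ET

# TMX marks the language of each variant with the standard xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang, tgt_lang):
    """Build a minimal TMX 1.4 tree from (source, target) sentence pairs."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",  # placeholder value (assumption)
        "creationtoolversion": "0.1",      # placeholder value (assumption)
        "segtype": "sentence",
        "o-tmf": "plain-text",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        # One translation unit (tu) per sentence pair, one variant (tuv) per language.
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return tmx


# Hypothetical sample pairs.
sample_pairs = [("Good morning.", "Bonjour."), ("Thank you.", "Merci.")]
tmx_root = build_tmx(sample_pairs, "en", "fr")
tmx_text = ET.tostring(tmx_root, encoding="unicode")
print(tmx_text)
```

Each `tu` element holds one translated sentence pair, which is why TMX files can be reused as parallel corpora.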

Concatenating engines and their translations is also possible in this section.

MutNMT Translate concatenate
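One way to picture concatenation is as a chain in which the output of one engine becomes the input of the next. The sketch below assumes that reading, which the text above does not spell out. The toy engines are plain dictionary lookups standing in for real translation systems, and `chain_engines` is a hypothetical helper, not a MutNMT function.

```python
from functools import reduce


def chain_engines(*engines):
    """Return a function that feeds each engine's output into the next one."""
    def translate(text):
        return reduce(lambda current, engine: engine(current), engines, text)
    return translate


# Toy stand-ins for real engines (assumption): simple phrase lookups.
EN_ES = {"good morning": "buenos días"}
ES_FR = {"buenos días": "bonjour"}


def en_to_es(text):
    return EN_ES.get(text, text)


def es_to_fr(text):
    return ES_FR.get(text, text)


en_to_fr = chain_engines(en_to_es, es_to_fr)
result = en_to_fr("good morning")
print(result)
```

Under this reading, the target language of each engine has to match the source language of the following one.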

Inspect

There are several options in this section, all aimed at seeing the inside of the translation engines at work. The first one allows users to input a sentence and see it at different steps of processing by a particular engine: pre-processed input, hypothesis generation (n-best), pre-processed output and final output.

MutNMT Inspect section

Users will also be able to compare engines sharing the same language pair to see the differences.

MutNMT Compare section

Evaluate

As a final step, users can evaluate machine translation output against other machine-translated texts or against the work of one or more professional translators. Evaluation in MutNMT provides automatic metrics that are very popular in the machine translation field, at document level and at sentence level, as well as a way to see the sentences along with their individual scores. All these results can also be downloaded as a spreadsheet.

MutNMT Evaluate section
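The difference between sentence-level and document-level scores can be sketched with a simple metric. This page does not name the metrics MutNMT computes, so the example uses word error rate (word-level edit distance divided by reference length, lower is better) purely as a stand-in. The function names and sample sentences are assumptions.

```python
def edit_distance(hyp_tokens, ref_tokens):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    previous = list(range(len(ref_tokens) + 1))
    for i, hyp_tok in enumerate(hyp_tokens, start=1):
        current = [i]
        for j, ref_tok in enumerate(ref_tokens, start=1):
            cost = 0 if hyp_tok == ref_tok else 1
            current.append(min(previous[j] + 1,          # drop a hypothesis word
                               current[j - 1] + 1,       # add a reference word
                               previous[j - 1] + cost))  # match or substitute
        previous = current
    return previous[-1]


def evaluate(hypotheses, references):
    """Return (document_score, sentence_scores) as word error rates."""
    sentence_scores = []
    total_edits = 0
    total_ref_words = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        edits = edit_distance(hyp_tokens, ref_tokens)
        sentence_scores.append(edits / max(len(ref_tokens), 1))
        total_edits += edits
        total_ref_words += len(ref_tokens)
    # The document score pools edits over all sentences instead of averaging
    # the sentence scores, so longer sentences weigh more.
    document_score = total_edits / max(total_ref_words, 1)
    return document_score, sentence_scores


# Hypothetical machine output and reference translation.
machine_output = ["the cat sat on the mat", "he go to school"]
reference = ["the cat sat on the mat", "he goes to school"]

doc_score, sent_scores = evaluate(machine_output, reference)
print(doc_score, sent_scores)
```

Sentence-level scores point to the individual sentences that went wrong, while the document-level score summarizes the whole text in one number.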

Advanced features

Some advanced features were also developed for Expert and Admin users. These make use of storage on the server (Data uploads) and of the computational power, based on expensive GPUs, that is needed to Train an engine. Besides this, Admins can see and manage users and processes and monitor server usage.

Data (corpora upload)

Besides the features described for Data in the Basic features section, Experts and Admins have a full section to upload new corpora to the Data libraries, either for their own use or to be shared with others. Up to 2GB of text can be uploaded in various shapes (individual files or a single file) and formats (TXT for individual files, TSV and TMX for single ones).

MutNMT Data section
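Before uploading a single-file corpus in TSV format, it can help to check that every line splits cleanly into a source and a target sentence. The sketch below assumes a two-column layout (source, tab, target), which this page does not specify. `read_tsv_corpus` is a hypothetical helper, not part of MutNMT.

```python
import csv
import io


def read_tsv_corpus(handle):
    """Read 'source<TAB>target' lines into two aligned sentence lists."""
    sources, targets = [], []
    # QUOTE_NONE keeps quotation marks inside sentences as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_number, row in enumerate(reader, start=1):
        if len(row) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 columns, got {len(row)}"
            )
        sources.append(row[0])
        targets.append(row[1])
    return sources, targets


# Hypothetical sample; a real corpus would be opened with open(path, newline="").
sample = io.StringIO("Good morning.\tBonjour.\nThank you.\tMerci.\n")
sources, targets = read_tsv_corpus(sample)
print(sources)
print(targets)
```

The two returned lists correspond to the individual-file shape (one TXT file per language), so the same check also shows how the two upload shapes relate.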

Train

This advanced feature lets Experts and Admins train neural machine translation engines using MutNMT. Users need to set up the engine details and configuration parameters and select the corpora for training a particular system.

MutNMT Train section

Once launched, the system lets users train for 1 hour (or less if they decide to stop it) and plots a live training log, with all settings, parameters and intermediate training results updated during the training process. Even energy consumption is computed and shown to users. In the end, the system produces a training log with all this information, accessible through the Engines section. From there, the user can inspect it and take other actions, such as computing metrics on a test corpus or resuming the training for one more hour.

MutNMT Train console MutNMT Train log

Admin

Finally, some options have been implemented for Admin profiles only.

The first one lets Admins manage users in MutNMT. Admins can see some basic information, delete users or change their profile:

MutNMT Admin Users section

The second option lets Admins monitor server usage (hard drive, RAM, CPU and GPU usage) and system processes:

MutNMT Admin System section

The third option lets Admins see ongoing trainings and stop them:

MutNMT Admin Instances section