This repo contains a Google Colab notebook to introduce the training of neural machine translation systems to students.

PRHLT/nmt-practical-session

 
 

Neural Machine Translation: a Practical Session

This repo contains a Google Colab notebook for a practical session that introduces students to training neural machine translation systems. It was created for the lab sessions of the machine translation module of the IARFID master's degree at the Universitat Politècnica de València.

You can open the notebook by clicking on the Open In Colab badge.

Overview

The goal of this lab session is to build machine translation systems based on neural networks (neural machine translation; NMT) from a dataset of bilingual parallel sentences using a custom version of the OpenNMT-py toolkit (Klein et al., 2017).

Dataset

The dataset used in this practical session is the Spanish–English language pair of the EuTrans corpus (Casacuberta et al., 2004), which consists of interactions between a customer and a receptionist at the front desk of a hotel. It ships with the custom version of OpenNMT-py that we are using and is located at OpenNMT-py/dataset/EuTrans.

Here we can see an example of its content:

por favor, ¿nos puede dar la llave de la habitación?

can you give us the key to the room, please?
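To make the idea of bilingual parallel sentences concrete, here is a minimal sketch of how aligned source and target lines can be paired. It assumes the usual line-aligned layout (the i-th Spanish line translates the i-th English line) and plain whitespace tokenization. The helper name `pair_parallel_lines` is hypothetical and is not part of OpenNMT-py.

```python
from typing import List, Tuple

SentencePair = Tuple[List[str], List[str]]


def pair_parallel_lines(source_lines: List[str],
                        target_lines: List[str]) -> List[SentencePair]:
    """Pair the i-th source line with the i-th target line.

    Assumption: both sides are line-aligned, so they must have the
    same number of lines. Tokens are obtained by splitting on whitespace.
    """
    if len(source_lines) != len(target_lines):
        raise ValueError("parallel corpora must have the same number of lines")
    return [(src.split(), tgt.split())
            for src, tgt in zip(source_lines, target_lines)]


# The example sentence pair shown above, held in memory for illustration.
spanish = ["por favor, ¿nos puede dar la llave de la habitación?"]
english = ["can you give us the key to the room, please?"]

pairs = pair_parallel_lines(spanish, english)
source_tokens, target_tokens = pairs[0]
print(source_tokens)
print(target_tokens)
```

With whitespace splitting, punctuation stays attached to the neighbouring word (for example `favor,`). A real pipeline would normally tokenize more carefully.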

Network description

The neural network that we are going to use for training the NMT system has the following configuration:

  • Encoder and decoder are both Transformers with a hidden size of 64.
  • 2 layers.
  • Transformer feed-forward hidden layer of size 64.
  • 2 self-attention heads.
  • Source word vectors (embeddings) of size 64.
  • Target word vectors (embeddings) of size 64.
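The configuration listed above can be written down as a small Python structure to show how the numbers relate to each other. This is only a sketch: the class `TransformerConfig` and its field names are hypothetical and are not OpenNMT-py options. The parameter estimate assumes a textbook Transformer layer (attention and feed-forward projections with biases, plus layer normalization) and that "2 layers" applies to both encoder and decoder. The actual implementation in the custom toolkit may differ. Embeddings and the output layer are left out because the vocabulary size is not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransformerConfig:
    """Hypothetical container for the hyperparameters listed above."""
    layers: int = 2              # assumed to apply to encoder and decoder alike
    model_dim: int = 64          # "64 neurons" (hidden size)
    ff_dim: int = 64             # feed-forward hidden size
    heads: int = 2               # self-attention heads
    src_word_vec_size: int = 64  # source embedding size
    tgt_word_vec_size: int = 64  # target embedding size

    @property
    def head_dim(self) -> int:
        """Dimension handled by each attention head."""
        if self.model_dim % self.heads != 0:
            raise ValueError("model_dim must be divisible by heads")
        return self.model_dim // self.heads


def attention_params(d: int) -> int:
    # Query, key, value and output projections, each d x d plus a bias.
    return 4 * (d * d + d)


def feed_forward_params(d: int, ff: int) -> int:
    # Two linear maps (d -> ff and ff -> d), each with a bias.
    return (d * ff + ff) + (ff * d + d)


def layer_norm_params(d: int) -> int:
    # One gain and one bias vector.
    return 2 * d


def encoder_layer_params(cfg: TransformerConfig) -> int:
    d = cfg.model_dim
    return (attention_params(d) + feed_forward_params(d, cfg.ff_dim)
            + 2 * layer_norm_params(d))


def decoder_layer_params(cfg: TransformerConfig) -> int:
    d = cfg.model_dim
    # Self-attention plus encoder-decoder attention, then feed-forward.
    return (2 * attention_params(d) + feed_forward_params(d, cfg.ff_dim)
            + 3 * layer_norm_params(d))


cfg = TransformerConfig()
total = cfg.layers * (encoder_layer_params(cfg) + decoder_layer_params(cfg))
print("per-head dimension:", cfg.head_dim)
print("estimated layer parameters (no embeddings):", total)
```

Under these assumptions the network is very small, which keeps training time short enough for a lab session.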

Colab resources

References
