pixelsandpointers/hierarchical-rl-joeynmt

Check the original repository: https://github.com/joeynmt/joeynmt

This is a fork created for a software project in the summer term of 2020. It is kept for representative purposes only.

Joey NMT

Goal and Purpose

The Joey NMT framework is developed for educational purposes. It aims to be a clean and minimalistic code base that helps novices quickly find answers to the following questions.

  • How to implement classic NMT architectures (RNN and Transformer) in PyTorch?
  • What are the building blocks of these architectures and how do they interact?
  • How to modify these blocks (e.g. deeper, wider, ...)?
  • How to modify the training procedure (e.g. add a regularizer)?

In contrast to other NMT frameworks, we will not aim for the most recent features or speed through engineering or training tricks, since this often goes hand in hand with an increase in code complexity and a decrease in readability.

However, Joey NMT re-implements baselines from major publications.

Check out the detailed documentation and our paper.

Contributors

Joey NMT is developed by Joost Bastings (University of Amsterdam) and Julia Kreutzer (Heidelberg University).

Features

Joey NMT implements the following features (aka the minimalist toolkit of NMT):

  • Recurrent Encoder-Decoder with GRUs or LSTMs
  • Transformer Encoder-Decoder
  • Attention Types: MLP, Dot, Multi-Head, Bilinear
  • Word-, BPE- and character-based input handling
  • BLEU, ChrF evaluation
  • Beam search with length penalty and greedy decoding
  • Customizable initialization
  • Attention visualization
  • Learning curve plotting

Coding

In order to keep the code clean and readable, we make use of:

  • Style checks: pylint with (mostly) PEP8 conventions, see .pylintrc.
  • Typing: Every function has documented input types.
  • Docstrings: Every function, class and module has docstrings describing their purpose and usage.
  • Unittests: Every module has unit tests, defined in test/unit/. Travis CI runs the tests and pylint on every push to ensure the repository stays clean.

Installation

Joey NMT is built on PyTorch and torchtext for Python >= 3.5.

  1. Clone this repository: git clone https://github.com/joeynmt/joeynmt.git
  2. Install joeynmt and its requirements: cd joeynmt, then pip3 install . (you might want to add --user for a local installation).
  3. Run the unit tests: python3 -m unittest

Warning! When running on GPU you need to manually install the suitable PyTorch version for your CUDA version. This is described in the PyTorch installation instructions.

Usage

For details, follow the tutorial in the docs.

Data Preparation

Parallel Data

For training a translation model, you need parallel data, i.e. a collection of source sentences and reference translations that are aligned sentence-by-sentence and stored in two files, such that each line in the reference file is the translation of the same line in the source file.
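The alignment requirement can be verified before training. The following is a minimal sketch, not part of JoeyNMT: count_aligned_pairs is a hypothetical helper, and the toy sentences stand in for the lines of two real files.

```python
from itertools import zip_longest


def count_aligned_pairs(src_lines, trg_lines):
    """Count sentence pairs, raising ValueError if the two sides are misaligned.

    Both arguments are iterables of lines, e.g. two open file handles.
    """
    n_pairs = 0
    for line_no, (src, trg) in enumerate(zip_longest(src_lines, trg_lines), start=1):
        # zip_longest pads the shorter side with None, which reveals a length mismatch.
        if src is None or trg is None:
            raise ValueError(f"one file ends early at line {line_no}")
        if not src.strip() or not trg.strip():
            raise ValueError(f"empty sentence at line {line_no}")
        n_pairs += 1
    return n_pairs


# Toy data standing in for the lines of a source and a reference file.
source = ["ein kleines haus\n", "guten morgen\n"]
reference = ["a small house\n", "good morning\n"]
print(count_aligned_pairs(source, reference))  # prints 2
```

With real data, pass two open file handles instead of the lists. A check like this only catches missing or empty lines; it cannot tell whether a line is actually a translation of its counterpart.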

Pre-processing

Before training a model on it, parallel data is most commonly filtered by length ratio, tokenized and true- or lowercased.

The Moses toolkit provides a set of useful scripts for this purpose.
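To illustrate what length-ratio filtering and lowercasing do, here is a rough pure-Python sketch. It is not a replacement for the Moses scripts: whitespace splitting stands in for a real tokenizer, keep_pair is a hypothetical helper, and the thresholds max_ratio and max_len are arbitrary assumptions.

```python
def keep_pair(src, trg, max_ratio=1.5, max_len=100):
    """Return True if a sentence pair passes simple length-based filters."""
    src_len = len(src.split())
    trg_len = len(trg.split())
    # Drop pairs with an empty side or an overly long side.
    if src_len == 0 or trg_len == 0:
        return False
    if src_len > max_len or trg_len > max_len:
        return False
    # Drop pairs whose lengths differ too much, in either direction.
    return max(src_len, trg_len) / min(src_len, trg_len) <= max_ratio


pairs = [
    ("Ein kleines Haus", "A small house"),
    ("Ja", "Yes , of course , that is absolutely right"),
]
# Lowercase the pairs that survive the filter.
filtered = [(s.lower(), t.lower()) for s, t in pairs if keep_pair(s, t)]
print(filtered)  # prints [('ein kleines haus', 'a small house')]
```

The second pair is dropped because one token against nine gives a ratio far above 1.5. Suitable thresholds depend on the language pair.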

In addition, you might want to build the NMT model not on the basis of words, but rather on sub-words or characters (this is the level option in JoeyNMT configurations). Currently, JoeyNMT supports the byte-pair encoding (BPE) format produced by subword-nmt.

Configuration

Experiments are specified in configuration files, in simple YAML format. You can find examples in the configs directory. small.yaml contains a detailed explanation of configuration options.

Most importantly, the configuration contains the description of the model architecture (e.g. number of hidden units in the encoder RNN), paths to the training, development and test data, and the training hyperparameters (learning rate, validation frequency etc.).

Training

Start

For training, run

python3 -m joeynmt train configs/small.yaml

This will train a model on the training data specified in the config (here: small.yaml), validate on validation data, and store model parameters, vocabularies, validation outputs and a small number of attention plots in the model_dir (also specified in config).

Note that pre-processing such as tokenization or BPE segmentation is not part of training and has to be done manually beforehand.

Tip: To avoid overwriting existing models, set overwrite: False in the model configuration.

Validations

The validations.txt file in the model directory reports the validation results at every validation point. Models are saved whenever a new best validation score is reached, in batch_no.ckpt, where batch_no is the number of batches the model has been trained on so far. best.ckpt links to the checkpoint that has so far achieved the best validation score.
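Given this naming scheme, the most recent numbered checkpoint can be picked out of a model directory listing. This is a minimal sketch under the assumption that checkpoints are named exactly <batch_no>.ckpt; latest_checkpoint is a hypothetical helper, not a JoeyNMT function, and the file names are made up.

```python
from pathlib import Path


def latest_checkpoint(file_names):
    """Return the numbered checkpoint with the highest batch count, or None."""
    numbered = [
        name for name in file_names
        if Path(name).suffix == ".ckpt" and Path(name).stem.isdigit()
    ]
    # Compare as integers: as strings, "500" would sort after "1500".
    return max(numbered, key=lambda name: int(Path(name).stem), default=None)


# Made-up listing of a model directory.
listing = ["500.ckpt", "1500.ckpt", "best.ckpt", "validations.txt"]
print(latest_checkpoint(listing))  # prints 1500.ckpt
```

For a real directory, pass os.listdir(model_dir). best.ckpt is skipped on purpose, since it points to the best-scoring checkpoint rather than the most recent one.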

Visualization

JoeyNMT uses TensorBoard to visualize training and validation curves and attention matrices during training. Launch TensorBoard with tensorboard --logdir model_dir/tensorboard (or python -m tensorboard.main ...) and then open the URL (default: localhost:6006) in a browser.

For a stand-alone plot, run python3 scripts/plot_validation.py model_dir --plot_values bleu PPL --output_path my_plot.pdf to plot curves of validation BLEU and PPL.

CPU vs. GPU

For training on a GPU, set use_cuda in the config file to True. This requires the necessary CUDA libraries to be installed.

Translating

There are three options for testing what the model has learned.

Whatever data you feed the model for translating, make sure it is properly pre-processed, just as you pre-processed the training data, e.g. tokenized and split into subwords (if working with BPEs).

1. Test Set Evaluation

For testing and evaluating on your parallel test/dev set, run

python3 -m joeynmt test configs/small.yaml --output_path out

This will generate translations for validation and test set (as specified in the configuration) in out.[dev|test] with the latest/best model in the model_dir (or a specific checkpoint set with load_model). It will also evaluate the outputs with eval_metric. If --output_path is not specified, it will not store the translation, and only do the evaluation and print the results.

2. File Translation

In order to translate the contents of a file not contained in the configuration (here my_input.txt), simply run

python3 -m joeynmt translate configs/small.yaml < my_input.txt > out

The translations will be written to stdout, or to the file given by --output_path if specified.

3. Interactive

If you just want to try a few examples, run

python3 -m joeynmt translate configs/small.yaml

and you'll be prompted to type input sentences that JoeyNMT will then translate with the model specified in the configuration.

Documentation and Tutorial

  • The docs include an overview of the NMT implementation, a walk-through tutorial for building, training, tuning, testing and inspecting an NMT system, the API documentation and FAQs.
  • A screencast of the tutorial is available on YouTube.
  • Jade Abbott wrote a notebook that runs on Colab and shows how to prepare data, train and evaluate a model, using low-resource African languages as an example.
  • Matthias Müller wrote a collection of scripts for installation, data download and preparation, model training and evaluation.

Benchmarks

Benchmark results on WMT and IWSLT datasets are reported here. Please also check the Masakhane repository (https://github.com/masakhane-io/masakhane) for benchmarks and available models for African languages.

Pre-trained Models

Pre-trained models from reported benchmarks for download (contains config, vocabularies, best checkpoint and dev/test hypotheses):

IWSLT14 de-en

IWSLT15 en-vi

WMT17

Following the pre-processing of the Sockeye paper.

Autshumato

Training with data provided in the Ukuxhumana project, with additional tokenization of the training data with the Moses tokenizer.

If you trained JoeyNMT on your own data and would like to share it, please email us so we can add it to the collection of pre-trained models.

Reinforcement Learning

JoeyNMT is now also capable of performing Reinforcement Learning through an integration of OpenAI Gym. This allows an easy implementation of Classic Control reinforcement learning scenarios such as Acrobot and CartPole. To load and train these environments, follow the example in configs/acrobot.yaml.

  1. Change the env_name variable to the OpenAI Gym environment of choice
  2. Set the scenario variable accordingly. Choices: "ClassicControl", "Atari", and "HRL"
  3. Define the agent model under the model category
  4. Change the remaining parameters as needed
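
The steps above can be pictured as a sanity check on the loaded configuration. This is only a sketch: it assumes the YAML has already been parsed into a flat Python dict, whereas the real layout in configs/acrobot.yaml may be nested differently. check_rl_config is a hypothetical helper and not part of JoeyNMT; only env_name, scenario, the model category and the three scenario names come from the steps above.

```python
VALID_SCENARIOS = {"ClassicControl", "Atari", "HRL"}


def check_rl_config(cfg):
    """Return a list of problems found in an RL configuration dict."""
    problems = []
    if not cfg.get("env_name"):
        problems.append("env_name is missing")
    if cfg.get("scenario") not in VALID_SCENARIOS:
        problems.append("scenario must be one of " + ", ".join(sorted(VALID_SCENARIOS)))
    if "model" not in cfg:
        problems.append("model section is missing")
    return problems


# Assumed flat layout; Acrobot-v1 is the Gym id of the Acrobot environment.
config = {"env_name": "Acrobot-v1", "scenario": "ClassicControl", "model": {}}
print(check_rl_config(config))  # prints []
```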

Implementing a custom reinforcement learning task

To implement your own custom reinforcement learning task, follow this outline:

  • Custom Environment: Inherit the OpenAI Gym class and follow the OpenAI Gym API
  • Custom Agent: Implement the abstract interface defined in joeynmt/agent.py and override the defined methods therein
  • Actor-Critic methods: Implement value networks following the abstract interface in joeynmt/value_network.py
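
As an illustration of the first point, the sketch below shows a toy environment that follows the classic Gym interface, where reset() returns an observation and step(action) returns (observation, reward, done, info). To stay runnable without Gym installed, it does not inherit from gym.Env; a real custom environment would do so and would also declare action_space and observation_space. LineWalkEnv and its reward scheme are invented for this example.

```python
class LineWalkEnv:
    """Toy task: walk from the left end of a line to the right end."""

    def __init__(self, length=5, max_steps=20):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        """Apply an action: 0 moves left, 1 moves right."""
        if action not in (0, 1):
            raise ValueError("action must be 0 (left) or 1 (right)")
        move = 1 if action == 1 else -1
        # Clamp the position so the agent cannot leave the line.
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps += 1
        reached_goal = self.position == self.length - 1
        done = reached_goal or self.steps >= self.max_steps
        reward = 1.0 if reached_goal else 0.0
        return self.position, reward, done, {}


env = LineWalkEnv(length=3)
observation = env.reset()
done = False
while not done:
    # A fixed "always move right" policy stands in for a trained agent.
    observation, reward, done, info = env.step(1)
print(observation, reward)  # prints 2 1.0
```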

Hierarchical reinforcement learning tasks

Furthermore, JoeyNMT supports hierarchical reinforcement learning tasks such as Interactive Semantic Parsing (https://github.com/LittleYUYU/Interactive-Semantic-Parsing) by Yao et al., as described in the following sections.

Task description

The implemented hierarchical reinforcement learning approach is an effort to reimplement the Interactive Semantic Parsing approach by Yao et al. We re-implemented the supervised baselines and the reinforcement learning method, and replaced the Latent Attention Model with JoeyNMT's Transformer Encoder.

The input is a recipe description. To see what these look like, please have a look at our data file (label: words). The model asks the user follow-up questions if the initial parse of the recipe description is not sufficient for a confident prediction. To reduce the number of questions, Reinforcement Learning is employed. This approach is considered hierarchical because the queries are subdivided into four subtasks, which gives the model the opportunity to optimise the order in which it processes the subtasks and to minimise user interaction.

Train


To pretrain the low-level agents for the hierarchical reinforcement learning approach described above, run

python ./subtask_pretraining_enhanced.py

If you want to customize the pretraining, you can change the parameters on lines 66 to 74 of that script, but do not change SEED and ASK_LABELS.


For training with Reinforcement Learning, run

python -m joeynmt rl_train configs/<my_experiment>.yaml

To customize the training process, you can alter the configuration files or write your own. If you decide to write your own, please have a look at the files in the configs directory. The comments in the configuration files explain which options are available for each parameter, what these options mean, and what the default settings are, where available.

Testing

For testing and evaluating, run

python -m joeynmt rl_test configs/<my_experiment>.yaml --ckpt models/<my_experiment>/best.ckpt

The evaluation output will be printed to screen. To save this output to a file, add the --output_file <output_path> option to the testing and evaluating command above.

Contributing

Since this codebase is supposed to stay clean and minimalistic, contributions addressing the following are welcome:

  • code correctness
  • code cleanliness
  • documentation quality
  • speed or memory improvements
  • resolving issues
  • providing pre-trained models

Code extending the functionality beyond the basics will most likely not end up in the master branch, but we're curious to learn what you used Joey for.

Projects and Extensions

Here we'll collect projects and repositories that are based on Joey, so you can find inspiration and examples on how to modify and extend the code.

  • Joey Toy Models. @bricksdont built a collection of scripts (https://github.com/bricksdont/joeynmt-toy-models) showing how to install JoeyNMT, preprocess data, train and evaluate models. This is a great starting point for anyone who wants to run systematic experiments, tends to forget python calls, or doesn't like to run notebook cells!
  • African NMT. @jaderabbit started an initiative at the Indaba Deep Learning School 2019 to "put African NMT on the map". The goal is to build and collect NMT models for low-resource African languages. The Masakhane repository contains and explains all the code you need to train JoeyNMT and points to data sources. It also contains benchmark models and configurations that members of Masakhane have built for various African languages.
  • Slack Joey. Code to locally deploy a Joey NMT model as chat bot in a Slack workspace. It's a convenient way to probe your model without having to implement an API. And bad translations for chat messages can be very entertaining, too ;)
  • Flask Joey. @kevindegila built a flask interface to Joey, so you can deploy your trained model in a web app and query it in the browser.
  • User Study. We evaluated the code quality of this repository by testing the understanding of novices through quiz questions. Find the details in Section 3 of the Joey NMT paper.
  • Self-Regulated Interactive Seq2Seq Learning. Julia Kreutzer and Stefan Riezler. Published at ACL 2019. Paper and Code. This project augments the standard fully-supervised learning regime by weak and self-supervision for a better trade-off of quality and supervision costs in interactive NMT.
  • Speech Joey. @Sariyusha is giving Joey ears for speech translation. Code.
  • Hieroglyph Translation. Joey NMT was used to translate hieroglyphs in this IWSLT 2019 paper by Philipp Wiesenbach and Stefan Riezler. They gave Joey NMT multi-tasking abilities.

If you used Joey NMT for a project, publication or built some code on top of it, let us know and we'll link it here.

Contact

Please leave an issue if you have questions or issues with the code.

For general questions, email us at joeynmt <at> gmail.com.

Reference

If you use Joey NMT in a publication or thesis, please cite the following paper:

@ARTICLE{JoeyNMT,
author = {{Kreutzer}, Julia and {Bastings}, Joost and {Riezler}, Stefan},
title = {Joey {NMT}: A Minimalist {NMT} Toolkit for Novices},
journal = {To Appear in EMNLP-IJCNLP 2019: System Demonstrations},
year = {2019},
month = {Nov},
address = {Hong Kong},
url = {https://arxiv.org/abs/1907.12484}
}

Naming

Joeys are infant marsupials.