seq2seq

Attention-based sequence to sequence learning

Dependencies

  • TensorFlow 1.2+ for Python 3
  • YAML and Matplotlib modules for Python 3: sudo apt-get install python3-yaml python3-matplotlib
  • A recent NVIDIA GPU

How to use

Train a model (CONFIG is a YAML configuration file, such as config/default.yaml):

./seq2seq.sh CONFIG --train -v 
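For orientation, here is a minimal sketch of what a training configuration might contain. The key names below are illustrative assumptions only; consult config/default.yaml for the actual schema used by the toolkit.

```yaml
# Hypothetical configuration sketch -- key names are assumptions,
# see config/default.yaml for the real ones.
label: my_model            # experiment name
model_dir: models/my_model # where checkpoints are written
data_dir: data/my_corpus   # pre-processed training data
cell_size: 256             # RNN state size
embedding_size: 128
batch_size: 64
max_steps: 300000
```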

Translate text using an existing model:

./seq2seq.sh CONFIG --decode FILE_TO_TRANSLATE --output OUTPUT_FILE

or for interactive decoding:

./seq2seq.sh CONFIG --decode

Example English→French model

This is the same model and dataset as Bahdanau et al. 2015.

config/WMT14/download.sh    # download WMT14 data into raw_data/WMT14
config/WMT14/prepare.sh     # preprocess the data, and copy the files to data/WMT14
./seq2seq.sh config/WMT14/baseline.yaml --train -v   # train a baseline model on this data

You should get BLEU scores similar to these (our model was trained on a single Titan X for about 4 days).

| Dev   | Test  | Test (+beam) | Steps | Time |
|-------|-------|--------------|-------|------|
| 25.04 | 28.64 | 29.22        | 240k  | 60h  |
| 25.25 | 28.67 | 29.28        | 330k  | 80h  |

Download this model here. To use this model, just extract the archive into the seq2seq/models folder, and run:

 ./seq2seq.sh models/WMT14/config.yaml --decode -v

Example German→English model

This is the same dataset as Ranzato et al. 2015.

config/IWSLT14/prepare.sh
./seq2seq.sh config/IWSLT14/baseline.yaml --train -v
| Dev   | Test  | Test (+beam) | Steps |
|-------|-------|--------------|-------|
| 28.32 | 25.33 | 26.74        | 44k   |

The model is available for download here.

Audio pre-processing

If you want to use the toolkit for Automatic Speech Recognition (ASR) or Automatic Speech Translation (AST), then you'll need to pre-process your audio files accordingly. This README details how it can be done. You'll need to install the Yaafe library, and use scripts/speech/extract-audio-features.py to extract MFCCs from a set of wav files.
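As a rough illustration of what MFCC extraction involves, here is a minimal NumPy sketch (frame, window, power spectrum, mel filterbank, log, DCT). This is not the toolkit's scripts/speech/extract-audio-features.py, which relies on Yaafe; the window size, hop, and filter counts below are arbitrary illustrative choices.

```python
import numpy as np

def mfcc(signal, sr=16000, n_fft=512, hop=160, n_mels=40, n_ceps=13):
    """Toy MFCC: frame -> power spectrum -> mel filterbank -> log -> DCT-II."""
    window = np.hanning(n_fft)
    frames = np.array([signal[i:i + n_fft] * window
                       for i in range(0, len(signal) - n_fft + 1, hop)])
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft   # (T, n_fft//2+1)

    # triangular mel filterbank
    hz2mel = lambda f: 2595 * np.log10(1 + f / 700)
    mel2hz = lambda m: 700 * (10 ** (m / 2595) - 1)
    mel_pts = np.linspace(hz2mel(0), hz2mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel2hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        lo, c, hi = bins[i - 1], bins[i], bins[i + 1]
        if c > lo:
            fbank[i - 1, lo:c] = (np.arange(lo, c) - lo) / (c - lo)
        if hi > c:
            fbank[i - 1, c:hi] = (hi - np.arange(c, hi)) / (hi - c)

    log_mel = np.log(power @ fbank.T + 1e-10)                 # (T, n_mels)

    # DCT-II decorrelates the filterbank energies; keep the first n_ceps
    n = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_mels))
    return log_mel @ dct.T                                    # (T, n_ceps)
```

With the defaults above, one second of 16 kHz audio yields 97 frames of 13 coefficients each.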

Features

  • YAML configuration files
  • Beam-search decoder
  • Ensemble decoding
  • Multiple encoders
  • Hierarchical encoder
  • Bidirectional encoder
  • Local attention model
  • Convolutional attention model
  • Detailed logging
  • Periodic BLEU evaluation
  • Periodic checkpoints
  • Multi-task training: train on several tasks at once (e.g. French->English and German->English MT)
  • Subword training and decoding
  • Input binary features instead of text
  • Pre-processing script: we provide a fully-featured Python script for data pre-processing (vocabulary creation, lowercasing, tokenizing, splitting, etc.)
  • Dynamic RNNs: we use symbolic loops instead of statically unrolled RNNs. This means that we don't need to manually configure bucket sizes, and that model creation is much faster.
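To make the attention model concrete, here is a toy NumPy sketch of the additive attention of Bahdanau et al. 2015 (this is not the toolkit's TensorFlow code; all variable names and dimensions are illustrative assumptions).

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(query, keys, W_q, W_k, v):
    """One decoding step of additive (Bahdanau-style) attention.

    query: decoder state, shape (d,)
    keys:  encoder states, shape (T, d)
    W_q, W_k: projections to a shared attention space, shape (a, d)
    v: scoring vector, shape (a,)
    """
    # score each source position: e_i = v^T tanh(W_q q + W_k k_i)
    scores = np.array([v @ np.tanh(W_q @ query + W_k @ k) for k in keys])
    weights = softmax(scores)   # attention distribution over source positions
    context = weights @ keys    # weighted sum of encoder states
    return context, weights
```

The context vector is then fed to the decoder together with the previous output embedding; the weights are what attention visualizations (e.g. via Matplotlib) plot as alignment matrices.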

Credits
