Attention-based sequence to sequence learning
Dependencies
- TensorFlow 1.2+ for Python 3
- YAML and Matplotlib modules for Python 3:
sudo apt-get install python3-yaml python3-matplotlib
- A recent NVIDIA GPU
How to use
Train a model (CONFIG is a YAML configuration file):
./seq2seq.sh CONFIG --train -v
Translate text using an existing model:
./seq2seq.sh CONFIG --decode FILE_TO_TRANSLATE --output OUTPUT_FILE
or for interactive decoding:
./seq2seq.sh CONFIG --decode
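Configuration files are plain YAML. The exact keys are defined by the toolkit, so the snippet below is only a hypothetical illustration of the general shape of such a file (every key name here is an assumption, not the toolkit's actual schema):

```yaml
# Hypothetical config -- key names are illustrative, not the toolkit's real schema.
model_dir: models/my_experiment   # where checkpoints and logs would be written
data_dir: data/WMT14
src_ext: en                       # source language file extension
trg_ext: fr                       # target language file extension
embedding_size: 512
rnn_size: 1024
batch_size: 64
learning_rate: 0.001
steps_per_checkpoint: 1000
```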
Example English→French model
This is the same model and dataset as in Bahdanau et al. (2015).
config/WMT14/download.sh                         # download WMT14 data into raw_data/WMT14
config/WMT14/prepare.sh                          # pre-process the data and copy the files to data/WMT14
./seq2seq.sh config/WMT14/baseline.yaml --train -v   # train a baseline model on this data
You should obtain similar BLEU scores (our model was trained on a single Titan X for about 4 days).
The trained model can be downloaded here. To use it, extract the archive into the
seq2seq/models folder and run:
./seq2seq.sh models/WMT14/config.yaml --decode -v
Example German→English model
This uses the same dataset as Ranzato et al. (2015).
config/IWSLT14/prepare.sh
./seq2seq.sh config/IWSLT14/baseline.yaml --train -v
The model is available for download here.
If you want to use the toolkit for Automatic Speech Recognition (ASR) or Automatic Speech Translation (AST), then you'll need to pre-process your audio files accordingly.
This README details how it can be done. You'll need to install the Yaafe library and use
scripts/speech/extract-audio-features.py to extract MFCC features from a set of WAV files.
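The toolkit's own script relies on Yaafe, but the underlying idea can be sketched with NumPy alone: slice the waveform into overlapping frames and compute a log power spectrum per frame. This is a simplified stand-in for MFCC extraction (the frame sizes and the synthetic signal below are illustrative assumptions, not the script's actual parameters):

```python
import numpy as np

def frame_features(signal, sample_rate=16000, frame_ms=25, hop_ms=10):
    """Slice a 1-D waveform into overlapping frames and return a log
    power spectrum per frame (a crude stand-in for MFCC features)."""
    frame_len = int(sample_rate * frame_ms / 1000)   # 400 samples per frame
    hop_len = int(sample_rate * hop_ms / 1000)       # 160 samples between frames
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    frames = np.stack([signal[i * hop_len : i * hop_len + frame_len]
                       for i in range(n_frames)])
    frames *= np.hamming(frame_len)                  # window to reduce spectral leakage
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + 1e-10)                     # shape: (n_frames, frame_len // 2 + 1)

# one second of synthetic audio in place of a real wav file
signal = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
feats = frame_features(signal)
print(feats.shape)  # (98, 201)
```

A real pipeline would additionally map the power spectrum onto a mel filterbank and apply a DCT to obtain the cepstral coefficients.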
Features
- YAML configuration files
- Beam-search decoder
- Ensemble decoding
- Multiple encoders
- Hierarchical encoder
- Bidirectional encoder
- Local attention model
- Convolutional attention model
- Detailed logging
- Periodic BLEU evaluation
- Periodic checkpoints
- Multi-task training: train on several tasks at once (e.g. French→English and German→English MT)
- Subword training and decoding
- Binary input features instead of text
- Pre-processing script: we provide a fully-featured Python script for data pre-processing (vocabulary creation, lowercasing, tokenizing, splitting, etc.)
- Dynamic RNNs: we use symbolic loops instead of statically unrolled RNNs, so there is no need to manually configure bucket sizes, and model creation is much faster
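The dynamic-RNN idea in the last item can be sketched in plain NumPy: loop over time steps on a padded batch and mask out positions past each sequence's true length, which is what a symbolic loop such as TensorFlow's tf.while_loop does without fixed bucket sizes (the tiny tanh RNN and all sizes below are illustrative assumptions):

```python
import numpy as np

def dynamic_rnn(inputs, lengths, W, U, b):
    """Run a simple tanh RNN over a padded batch with a loop over time,
    freezing each sequence's state once its true length is reached."""
    batch, max_time, _ = inputs.shape
    h = np.zeros((batch, W.shape[1]))
    for t in range(max_time):
        h_new = np.tanh(inputs[:, t] @ W + h @ U + b)
        mask = (t < lengths)[:, None]   # 1 for real steps, 0 for padding
        h = np.where(mask, h_new, h)    # keep the old state past sequence end
    return h

rng = np.random.default_rng(0)
inputs = rng.normal(size=(2, 5, 3))     # batch of 2, padded to length 5
lengths = np.array([5, 2])              # true sequence lengths
W, U, b = rng.normal(size=(3, 4)), rng.normal(size=(4, 4)), np.zeros(4)
final = dynamic_rnn(inputs, lengths, W, U, b)
print(final.shape)  # (2, 4)
```

Because padding is handled by the mask rather than by grouping sentences into fixed-size buckets, a single graph handles all sequence lengths.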