A Neural Machine Translation framework for training large-scale networks on multiple nodes with multiple GPUs.
- Python 2.7
- deepy >= 0.2
An example for the WMT15 translation task
- Clone neuralmt
git clone https://github.com/zomux/neuralmt
export PYTHONPATH="$PYTHONPATH:/path/to/neuralmt"
- Create a directory for WMT data
export WMT_ROOT="/path/to/your_wmt_folder"
mkdir $WMT_ROOT/text
mkdir $WMT_ROOT/models
- Tokenize the de-en training corpus and rename the files to the following filenames
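Tokenization splits punctuation away from words so the vocabulary stays manageable. The sketch below is only an illustration of the idea; WMT pipelines normally use a standard tool such as the Moses tokenizer script rather than anything this simple.

```python
import re

def tokenize(line):
    """Minimal illustrative tokenizer: lowercase the line and split
    punctuation into separate tokens. Not a replacement for the
    Moses tokenizer used in real WMT pipelines."""
    line = line.strip().lower()
    # surround punctuation with spaces so it becomes its own token
    line = re.sub(r"([.,!?;:\"()])", r" \1 ", line)
    return line.split()

print(tokenize("Hello, world!"))
```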
- Build training data
cd /path/to/neuralmt
python examples/gru_search/preprocess.py
- Train on 3 GPUs
python -m deepy.multigpu.launch examples/gru_search/train.py gpu0 gpu1 gpu2
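The launcher trains one model across several GPUs. A common scheme for this (assumed here for illustration; not necessarily deepy's exact implementation) is synchronous data parallelism: each device computes gradients on its own data shard, the gradients are averaged, and the shared parameters are updated once per step. A minimal NumPy sketch:

```python
import numpy as np

def parallel_sgd_step(params, shards, grad_fn, lr=0.1):
    """One synchronous data-parallel step: each 'device' computes a
    gradient on its own shard, gradients are averaged (an all-reduce),
    and the shared parameters are updated once."""
    grads = [grad_fn(params, shard) for shard in shards]  # per-device gradients
    avg_grad = np.mean(grads, axis=0)                     # average across devices
    return params - lr * avg_grad

# toy objective: mean squared error between params and the data points
grad_fn = lambda p, x: 2 * (p - x.mean(axis=0))
params = np.zeros(2)
shards = [np.ones((4, 2)), 3 * np.ones((4, 2))]  # two "GPUs"; overall data mean = 2
for _ in range(100):
    params = parallel_sgd_step(params, shards, grad_fn)
print(params)  # converges toward [2, 2], the mean of all shards
```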
Wait for several days
Test your model
(The test script only translates one sample sentence; you can modify it to translate a text file.)
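One way to extend the test script to whole files is to wrap its decoding call in a loop; `translate` below is a placeholder name for whatever decoding function the script actually exposes.

```python
def translate_file(in_path, out_path, translate):
    """Translate a text file line by line.
    `translate` is a placeholder for the model's decoding function
    (the real function name in the test script may differ)."""
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            fout.write(translate(line.strip()) + "\n")
```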
Training on multiple machines is still in development.
Although the current parallelism framework should extend to multiple machines easily, doing so requires some additional work.
- WMT15 German-English task (using the model in this example)
- BLEU: 21.29
- Duration: 2.5 days with 3 Titan X GPUs
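BLEU compares clipped n-gram overlap between the hypothesis and the reference, with a brevity penalty for short outputs. A toy sentence-level version is sketched below; real evaluations use corpus-level scripts such as `multi-bleu.perl`, so scores from this sketch will not match reported numbers.

```python
import math
from collections import Counter

def sentence_bleu(hyp, ref, max_n=4):
    """Toy sentence-level BLEU: clipped n-gram precisions for n=1..max_n,
    geometric mean, and a brevity penalty. For illustration only;
    corpus-level scripts are used for reported scores."""
    hyp, ref = hyp.split(), ref.split()
    precisions = []
    for n in range(1, max_n + 1):
        h = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
        r = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        overlap = sum(min(c, r[g]) for g, c in h.items())  # clipped matches
        precisions.append(overlap / max(sum(h.values()), 1))
    if min(precisions) == 0:
        return 0.0
    bp = min(1.0, math.exp(1 - len(ref) / len(hyp)))  # brevity penalty
    return bp * math.exp(sum(map(math.log, precisions)) / max_n)
```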
Raphael Shu, 2016