# Latent Alignment and Variational Attention

This is a PyTorch implementation of the paper *Latent Alignment and Variational Attention*, built as a fork of OpenNMT-py.

## Dependencies

The code was tested with Python 3.6 and PyTorch 0.4. To install the dependencies, run:

```bash
pip install -r requirements.txt
```

## Running the code

All commands are collected in the script `va.sh`.

### Preprocessing the data

To preprocess the data, run:

```bash
source va.sh && preprocess_bpe
```

The raw data in `data/iwslt14-de-en` was obtained from the fairseq repo with `BPE_TOKENS=14000`.
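For readers unfamiliar with BPE: fairseq's preparation script learns 14000 byte-pair merges from the training corpus and segments words accordingly. A minimal sketch of the merge-learning loop (plain Python with a toy two-word "corpus"; this is the algorithm, not the repo's or subword-nmt's actual code):

```python
from collections import Counter

def bpe_merges(word_freqs, num_merges):
    """Learn BPE merge rules from a dict mapping words to frequencies."""
    # Represent each word as a tuple of symbols plus an end-of-word marker.
    vocab = {tuple(w) + ("</w>",): f for w, f in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite the vocabulary with the best pair merged into one symbol.
        new_vocab = {}
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] = freq
        vocab = new_vocab
    return merges

print(bpe_merges({"low": 5, "lower": 2}, 2))
```

With `BPE_TOKENS=14000` the real script runs this loop (efficiently) for 14000 merges over the joint de-en training text.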

### Training the model

To train a model, run one of the following commands:

- Soft attention: `source va.sh && CUDA_VISIBLE_DEVICES=0 train_soft_b6`
- Categorical attention with exact evidence: `source va.sh && CUDA_VISIBLE_DEVICES=0 train_exact_b6`
- Variational categorical attention with exact ELBO: `source va.sh && CUDA_VISIBLE_DEVICES=0 train_cat_enum_b6`
- Variational categorical attention with REINFORCE: `source va.sh && CUDA_VISIBLE_DEVICES=0 train_cat_sample_b6`
- Variational categorical attention with Gumbel-Softmax: `source va.sh && CUDA_VISIBLE_DEVICES=0 train_cat_gumbel_b6`
- Variational categorical attention with the wake-sleep algorithm (Ba et al., 2015): `source va.sh && CUDA_VISIBLE_DEVICES=0 train_cat_wsram_b6`
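These variants differ mainly in how the latent alignment `a` is handled. A toy numerical sketch of the two extremes — exact marginalization versus a sampled ELBO — using plain Python and made-up numbers (this is not the repo's code; for simplicity the variational distribution q is taken equal to the prior p(a|x), so the KL term vanishes):

```python
import math
import random

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Toy example: 3 source positions. `scores` stands in for attention scores,
# `p_y_given_a` for the decoder's probability of one target word under each
# alignment. All numbers are made up for illustration.
scores = [1.0, 2.0, 0.5]
p_y_given_a = [0.2, 0.6, 0.1]
p_a = softmax(scores)  # prior over alignments p(a | x)

# Exact marginalization (what train_exact_b6 / train_cat_enum_b6 enumerate):
#   log p(y | x) = log sum_a p(a | x) p(y | a, x)
log_p_y = math.log(sum(pa * py for pa, py in zip(p_a, p_y_given_a)))

# Sampled ELBO (what train_cat_sample_b6 estimates, with REINFORCE gradients):
#   log p(y | x) >= E_{a~q}[log p(y | a, x)] - KL(q || p);  here q = p, KL = 0.
random.seed(0)
draws = random.choices(range(len(p_a)), weights=p_a, k=5000)
elbo_estimate = sum(math.log(p_y_given_a[a]) for a in draws) / len(draws)

print(log_p_y)        # exact log-likelihood
print(elbo_estimate)  # lower bound, by Jensen's inequality
```

The gap between the two printed values is the price of the variational approximation with this (deliberately loose) choice of q.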

Checkpoints will be saved to the project's root directory.

### Evaluating on test

The exact perplexity of the generative model can be obtained by running the following command, with `$model` replaced by the path to a saved checkpoint:

```bash
source va.sh && CUDA_VISIBLE_DEVICES=0 eval_cat $model
```
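"Exact" here means the alignment is marginalized out of each token probability rather than bounded via an ELBO. As a reminder of the bookkeeping, perplexity is the exponentiated mean negative log-likelihood per target token (a generic sketch with made-up numbers, not this repo's evaluation code):

```python
import math

# Hypothetical per-token log-likelihoods log p(y_t | y_<t, x), each already
# marginalized over alignments, for a tiny 5-token "test set".
log_likelihoods = [-1.2, -0.7, -2.1, -0.4, -1.0]

# Perplexity = exp(mean negative log-likelihood per token).
ppl = math.exp(-sum(log_likelihoods) / len(log_likelihoods))
print(round(ppl, 2))
```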

The model can also be used to generate translations of the test data:

```bash
source va.sh && CUDA_VISIBLE_DEVICES=0 gen_cat $model
sed -e "s/@@ //g" $model.out | perl tools/multi-bleu.perl data/iwslt14-de-en/test.en
```
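The `sed` expression simply reverses the BPE segmentation before BLEU scoring, by joining subwords that carry the `@@ ` continuation marker. An equivalent in Python, for readers unfamiliar with the convention:

```python
def remove_bpe(line: str) -> str:
    """Join BPE subwords: 'Vari@@ ational atten@@ tion' -> 'Variational attention'."""
    return line.replace("@@ ", "")

print(remove_bpe("Vari@@ ational atten@@ tion"))  # Variational attention
```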

## Trained Models

Models with the lowest validation PPL were selected for evaluation on the test set. Numbers differ slightly from those reported in the paper since this is a re-implementation.

| Model | Test PPL | Test BLEU |
|---|---|---|
| Soft Attention | 7.17 | 32.77 |
| Exact Marginalization | 6.34 | 33.29 |
| Variational Attention + Enumeration | 6.08 | 33.69 |
| Variational Attention + Sampling | 6.17 | 33.30 |