Scripts for running the experiments in our "Sparse Constrained Attention" paper at ACL 2018.

Sparse and Constrained Attention for Neural Machine Translation


by Chaitanya Malaviya, Pedro Ferreira, André Martins

This repository contains scripts for carrying out the experiments described in ref. [1] below.

Prerequisites

OpenNMT-py Unbabel fork

Preparing the data for a language pair

For the sake of an example, let's assume we're handling the ro-en language pair. The procedure below works for other language pairs, provided the file names are consistent.

  1. Store your data (tokenized and BPE'ed) in a data folder <DATA PATH>/ro-en. This must contain the following files:
  • corpus.bpe.ro
  • corpus.bpe.en
  • newsdev2016.bpe.ro
  • newsdev2016.bpe.en
  • newstest2016.bpe.ro
  • newstest2016.bpe.en
  • newstest2016.tc.en

Note 1: don't forget to include the non-BPE test target file (newstest2016.tc.en).

Note 2: these files should not include the sink symbol.

  2. Run the following script:
>> prepare_experiment.sh ro en

This will add the sink symbol to the source files, train the aligner, force alignments on the dev and test data, and run the scripts that create the gold, guided, actual, and predicted fertility files. It will also run the OpenNMT-py preprocess.py script to create the train and validation data files.
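To make two of these steps concrete, here is a minimal Python sketch. It is an illustration, not the repository's code. It assumes the sink symbol is the literal token `<SINK>` appended to each source line (the actual token is whatever the scripts define), and that alignments use the `i-j` format written by fast_align, with `i` the source index and `j` the target index. The function names `add_sink` and `fertilities` are hypothetical, and "fertility" is taken here to mean the number of target words aligned to a source word; how the scripts derive the gold, guided, actual, and predicted variants may differ.

```python
from collections import Counter

SINK = "<SINK>"  # assumption: the real token is defined by the repository's scripts


def add_sink(source_line):
    """Append the (assumed) sink token to a tokenized source sentence."""
    return source_line.rstrip("\n") + " " + SINK


def fertilities(alignment_line, source_length):
    """Count how many target words are aligned to each source position.

    alignment_line holds space-separated "i-j" pairs, where i indexes the
    source sentence and j the target sentence (both 0-based).
    """
    counts = Counter(int(pair.split("-")[0]) for pair in alignment_line.split())
    return [counts.get(i, 0) for i in range(source_length)]


source = add_sink("a b c")
print(source)
print(fertilities("0-0 1-2 2-1 2-3", len(source.split())))
```

In this toy example the third source word is aligned to two target words, and the sink position receives no alignment links.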

Note 1: you need to adjust the DATA, ALIGNER, and OPENNMT paths in prepare_experiment.sh.

Note 2: you need to adjust the PATH_FAST_ALIGN in the script force_align.py.

Note 3: you need to adjust the DATA path in the script fertility/train_test_fertility_predictor.sh.
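Before running prepare_experiment.sh, it can help to confirm that the data folder contains every file listed in step 1. The following Python sketch does that check. The helper name `missing_files` and the example path `data/ro-en` are hypothetical; substitute your own <DATA PATH>.

```python
from pathlib import Path


def missing_files(data_dir, src, tgt):
    """Return the required file names that are absent from data_dir."""
    required = [
        f"corpus.bpe.{src}",
        f"corpus.bpe.{tgt}",
        f"newsdev2016.bpe.{src}",
        f"newsdev2016.bpe.{tgt}",
        f"newstest2016.bpe.{src}",
        f"newstest2016.bpe.{tgt}",
        f"newstest2016.tc.{tgt}",  # non-BPE test target file
    ]
    return [name for name in required if not (Path(data_dir) / name).is_file()]


# Hypothetical location; replace with <DATA PATH>/ro-en.
for name in missing_files("data/ro-en", "ro", "en"):
    print("missing:", name)
```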

Training models

For training models with different configurations, run the following example scripts:

>> run_experiment.sh <gpuid> ro en softmax 0 &
>> run_experiment.sh <gpuid> ro en sparsemax 0 &
>> run_experiment.sh <gpuid> ro en csoftmax 0 fixed 2 &
>> run_experiment.sh <gpuid> ro en csparsemax 0 fixed 2 &
>> run_experiment.sh <gpuid> ro en csoftmax 0 fixed 3 &
>> run_experiment.sh <gpuid> ro en csparsemax 0 fixed 3 &
>> run_experiment.sh <gpuid> ro en csoftmax 0.2 fixed 3 &
>> run_experiment.sh <gpuid> ro en csparsemax 0.2 fixed 3 &
>> run_experiment.sh <gpuid> ro en csoftmax 0 guided &
>> run_experiment.sh <gpuid> ro en csparsemax 0 guided &
>> run_experiment.sh <gpuid> ro en csoftmax 0.2 guided &
>> run_experiment.sh <gpuid> ro en csparsemax 0.2 guided &

This will generate log files in the folder logs.

Note: you need to adjust the DATA, ALIGNER, and OPENNMT paths in run_experiment.sh.
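If you prefer to generate the list of runs rather than type it, the configurations above can be enumerated programmatically. The sketch below only builds the argument lists shown in the examples; it does not launch anything. The helper name `experiment_commands` is hypothetical, and the fifth argument (0 or 0.2) is passed through as-is, since this README does not document its meaning.

```python
UNCONSTRAINED = ["softmax", "sparsemax"]
CONSTRAINED = ["csoftmax", "csparsemax"]

# Trailing arguments used with the constrained attention types in the examples.
CONSTRAINED_SETTINGS = [
    ["0", "fixed", "2"],
    ["0", "fixed", "3"],
    ["0.2", "fixed", "3"],
    ["0", "guided"],
    ["0.2", "guided"],
]


def experiment_commands(gpuid, src, tgt):
    """Build the run_experiment.sh argument lists from the examples."""
    prefix = ["run_experiment.sh", str(gpuid), src, tgt]
    commands = [prefix + [attn, "0"] for attn in UNCONSTRAINED]
    for setting in CONSTRAINED_SETTINGS:
        for attn in CONSTRAINED:
            commands.append(prefix + [attn] + setting)
    return commands


for command in experiment_commands(0, "ro", "en"):
    print(" ".join(command))
```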

Evaluation

To measure the BLEU and METEOR scores of a model on the test files, pick the model with the best dev performance and use this script:

>> ./evaluate.sh <model> <DATA>/ro-en/newstest.bpe.sink.ro <DATA>/ro-en/newstest.tc.en

To measure coverage-related metrics (REP-score and DROP-score) on the test files, please follow the instructions in the README in the coverage-eval folder.

In addition, to dump the attention matrices, use the -dump_attn argument with translate.py. You can load the output file with pickle as follows:

>> import pickle
>> attn_matrices = pickle.load(open(<filename>, 'rb'))
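Once loaded, the matrices can be inspected directly. The sketch below is an illustration under stated assumptions: it treats the dumped file as a pickled list with one matrix per sentence, indexed as `attn[target_step][source_position]`, and it writes a small stand-in file so the example runs on its own. The real dump may use a different layout or tensor objects (in which case the library that produced them must be importable when unpickling). The helper `source_coverage` is hypothetical; it sums the attention each source word receives across all target steps.

```python
import os
import pickle
import tempfile


def source_coverage(attn):
    """Sum the attention each source position receives over all target steps.

    attn is assumed to be indexed as attn[target_step][source_position].
    """
    return [sum(column) for column in zip(*attn)]


# Stand-in for a dumped file: one matrix with 2 target steps and 3 source words.
example = [[[0.5, 0.5, 0.0], [0.0, 0.25, 0.75]]]
path = os.path.join(tempfile.mkdtemp(), "attn.pkl")
with open(path, "wb") as f:
    pickle.dump(example, f)

with open(path, "rb") as f:
    attn_matrices = pickle.load(f)

for attn in attn_matrices:
    print(source_coverage(attn))
```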

References

[1] Chaitanya Malaviya, Pedro Ferreira, André Martins. Sparse and Constrained Attention for Neural Machine Translation. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL). Melbourne, Australia, July 2018.
