Metric for Automatic Machine Translation Evaluation
We submitted this metric to the WMT19 Metrics Shared Task.
Paper: http://www.statmt.org/wmt19/pdf/53/WMT60.pdf
Requirements

- Python >= 3.6.0
- TensorFlow >= 1.11.0
- Clone the BERT repository (https://github.com/google-research/bert) and add it to your Python path:
  export PYTHONPATH="path to bert dir:$PYTHONPATH"
- Download the BERT model fine-tuned on MRPC and point TUNED_MODEL_DIR at it:
  export TUNED_MODEL_DIR="path to fine-tuned BERT model"
Usage

1. Prepare the test set

Put the test set in data/orig (file names: src, out, ref). A quick alignment check is sketched after this step.
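The snippet below is a minimal sanity check, assuming src, out, and ref are plain-text files with one line-aligned segment per line (the usual WMT convention); it is not part of this repository.

```python
from pathlib import Path

# Count segments in each test-set file under data/orig.
counts = {}
for name in ("src", "out", "ref"):
    with open(Path("data/orig") / name, encoding="utf-8") as f:
        counts[name] = sum(1 for _ in f)

# The three files must be line-aligned, i.e. equal in length.
assert len(set(counts.values())) == 1, f"line counts differ: {counts}"
print(counts)
```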
2. Make pseudo-references

Translate the source side of the test set with an off-the-shelf MT system and put the outputs in data/pseudo_references/.
Note: Do not use an off-the-shelf MT system whose output is contained in the test set (i.e., one of the systems under evaluation).
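As an illustration only, the sketch below produces pseudo-references with a MarianMT model from the Hugging Face transformers library (requires transformers and PyTorch, neither of which is a dependency of this repository). The model name is an arbitrary German-to-English example and the output file name pseudo_ref is hypothetical; any off-the-shelf MT system satisfying the note above will do.

```python
from transformers import MarianMTModel, MarianTokenizer

# Illustrative off-the-shelf MT system (hypothetical de-en choice).
model_name = "Helsinki-NLP/opus-mt-de-en"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

with open("data/orig/src", encoding="utf-8") as f:
    sources = [line.strip() for line in f]

# Translate in small batches and write one pseudo-reference per line.
with open("data/pseudo_references/pseudo_ref", "w", encoding="utf-8") as out:
    for i in range(0, len(sources), 16):
        batch = tokenizer(sources[i:i + 16], return_tensors="pt",
                          padding=True, truncation=True)
        outputs = model.generate(**batch)
        for sent in tokenizer.batch_decode(outputs, skip_special_tokens=True):
            out.write(sent + "\n")
```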
3. Filtering with BERT

sh script/filter.sh

Pseudo-references with paraphrase scores are written to data/sim_scores.
Filtered pseudo-references are written to data/filtered_pseudo_references/. The idea behind the filtering is illustrated below.
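The filtering itself is done by script/filter.sh; the sketch below only illustrates the underlying idea, under two assumptions that are ours, not the repository's: that the score file holds one paraphrase score per line, aligned with the (hypothetical) pseudo_ref file from the previous step, and that 0.5 is a reasonable cut-off.

```python
THRESHOLD = 0.5  # hypothetical cut-off, not taken from the paper

# Assumed format: one pseudo-reference per line, one score per line.
with open("data/pseudo_references/pseudo_ref", encoding="utf-8") as f:
    pseudo_refs = [line.rstrip("\n") for line in f]
with open("data/sim_scores", encoding="utf-8") as f:
    scores = [float(line) for line in f]

# Keep only pseudo-references that BERT judges to be paraphrases.
kept = [p for p, s in zip(pseudo_refs, scores) if s >= THRESHOLD]
print(f"kept {len(kept)} of {len(pseudo_refs)} pseudo-references")
```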
4. Evaluate

Evaluate the score with a metric that allows the use of multiple references.
To evaluate with sentence BLEU, download the Moses binaries and run

sh scripts/evaluate.sh [language] [path to moses folder]

The generated output_score file contains the sentence-BLEU score for each sentence.
Note: If you evaluate with other metrics, use metrics that take all references into account at once, rather than taking the maximum score over each single reference; see the toy comparison below.
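To make the note concrete, the toy sketch below uses NLTK (not a dependency of this repository) to score one hypothesis against two references jointly, and contrasts that with the per-reference maximum the note warns against.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

hyp = "the cat sat on the mat".split()
refs = [
    "the cat is on the mat".split(),  # gold reference
    "a cat sat on the mat".split(),   # kept pseudo-reference
]
smooth = SmoothingFunction().method1

# Correct: one score that takes all references into account at once.
joint = sentence_bleu(refs, hyp, smoothing_function=smooth)

# What the note warns against: the best score over single references.
per_ref_max = max(sentence_bleu([r], hyp, smoothing_function=smooth)
                  for r in refs)

print(f"multi-reference BLEU: {joint:.3f}, "
      f"max single-reference BLEU: {per_ref_max:.3f}")
```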