Machine Translation (MT) Evaluation Scripts
Tutorials for this repository can be found at: https://blog.machinetranslation.io/tag/machine-translation-evaluation/
All dependencies can be installed via:
pip3 install -r requirements.txt
To run the Python scripts and calculate the MT evaluation metrics on your machine translation output, you need to have two files:
- Reference: The human translation (target) file of your test dataset.
- System: The machine-translated output (prediction) that the MT model generates for the source side of the same test dataset used for “Reference”.
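Before scoring, it can help to confirm that the two files line up. The sketch below assumes each file holds one sentence per line and that line N of the Reference corresponds to line N of the System output; `load_parallel` is a hypothetical helper for illustration and is not part of this repository.

```python
from pathlib import Path


def load_parallel(ref_path, sys_path):
    """Read both files and return two lists of sentences, one per line.

    Assumption: one sentence per line, with the same line order in both files.
    """
    refs = Path(ref_path).read_text(encoding="utf-8").splitlines()
    hyps = Path(sys_path).read_text(encoding="utf-8").splitlines()
    if len(refs) != len(hyps):
        raise ValueError(
            f"Line count mismatch: {len(refs)} reference vs {len(hyps)} system lines"
        )
    return refs, hyps
```

For example, `refs, hyps = load_parallel("Reference.txt", "System.txt")` raises a `ValueError` if the files differ in length, which usually points to a dropped or split line in one of them.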
Corpus BLEU: Calculates the BLEU score for the whole corpus and prints the result.
python3 compute-bleu.py Reference.txt System.txt
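To illustrate what a corpus-level BLEU score measures, here is a minimal pure-Python sketch. It is not the code in `compute-bleu.py`: it assumes whitespace tokenization, a single reference per sentence, n-grams up to order 4, and no smoothing. Established toolkits typically apply their own tokenization and smoothing, so their scores will differ from this sketch. The names `ngrams` and `corpus_bleu` are chosen here for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Corpus-level BLEU on a 0-100 scale (illustrative, unsmoothed)."""
    matches = [0] * max_n
    totals = [0] * max_n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tok, hyp_tok = ref.split(), hyp.split()  # assumption: whitespace tokens
        ref_len += len(ref_tok)
        hyp_len += len(hyp_tok)
        for n in range(1, max_n + 1):
            ref_counts = ngrams(ref_tok, n)
            hyp_counts = ngrams(hyp_tok, n)
            # Clipped counts: each hypothesis n-gram is credited at most
            # as many times as it occurs in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, a zero match count at any order gives a score of 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the n-gram precisions, accumulated over the corpus.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)
```

Note that the n-gram counts are summed over all sentences before the precisions are computed, which is why a corpus score is not the same as the average of sentence scores.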
Sentence BLEU: Calculates the BLEU score sentence by sentence and saves the result to a file.
python3 compute-bleu-sentence.py Reference.txt System.txt
Sentence METEOR: Calculates the METEOR score sentence by sentence. Note that METEOR works at the sentence level only.
python3 sentence-meteor.py Reference.txt System.txt
Corpus WER: Calculates the WER score for the whole corpus and prints the result.
python3 corpus-wer.py Reference.txt System.txt
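As a rough illustration of WER (Word Error Rate), the sketch below computes the word-level edit distance between each System sentence and its Reference, then divides the total number of errors by the total number of reference words. This is one common definition and an assumption here: `corpus-wer.py` may tokenize or aggregate differently. The function names are illustrative only.

```python
def edit_distance(ref_tokens, hyp_tokens):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    previous = list(range(len(hyp_tokens) + 1))
    for i, ref_word in enumerate(ref_tokens, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp_tokens, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion of a reference word
                current[j - 1] + 1,      # insertion of a hypothesis word
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def corpus_wer(references, hypotheses):
    """Total word errors divided by total reference words (lower is better)."""
    total_errors = 0
    total_ref_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tok, hyp_tok = ref.split(), hyp.split()  # assumption: whitespace tokens
        total_errors += edit_distance(ref_tok, hyp_tok)
        total_ref_words += len(ref_tok)
    if total_ref_words == 0:
        raise ValueError("References contain no words")
    return total_errors / total_ref_words
```

A per-sentence score follows the same idea: call `edit_distance` on one sentence pair and divide by the length of that sentence's reference.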
Sentence WER: Calculates the WER score sentence by sentence and saves the result to a file.
python3 sentence-wer.py Reference.txt System.txt
If you have questions or suggestions, please feel free to contact me.