Evaluation of the morphological quality of machine translation outputs
README.md
evaluate_cs.py
evaluate_de.py
evaluate_fr.py

morpheval_v2

Evaluation of the morphological quality of machine translation outputs. The automatically generated English test suite should be translated into one of the supported target languages (French, Czech, German). The translation output is then analyzed to provide three types of information:

  • Adequacy: has the morphological information been well conveyed from the source?
  • Fluency: do we have local agreement?
  • Consistency: how confident is the system in its predictions?

Requirements

How To

Translate the source file morpheval.limsi.v2.en.sents and run the Moses tokenizer on the translation output (with the arguments -no-escape and -l {fr|cs|de}). The commands below assume the tokenized output is saved as output.tokenized.
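
The tokenization step can be scripted. The following is a minimal sketch, not part of this repository: it assumes a local Moses checkout with the tokenizer at mosesdecoder/scripts/tokenizer/tokenizer.perl, and the helper names tokenizer_command and tokenize_file are hypothetical. Only the -no-escape and -l arguments come from the instructions above.

```python
import subprocess

# Assumption: path to tokenizer.perl inside a local Moses checkout.
TOKENIZER = "mosesdecoder/scripts/tokenizer/tokenizer.perl"

# Target language codes accepted by the instructions above.
SUPPORTED = {"fr", "cs", "de"}


def tokenizer_command(lang, tokenizer=TOKENIZER):
    """Build the argument list for the Moses tokenizer (hypothetical helper)."""
    if lang not in SUPPORTED:
        raise ValueError(f"unsupported target language: {lang}")
    return ["perl", tokenizer, "-no-escape", "-l", lang]


def tokenize_file(src, dst, lang, tokenizer=TOKENIZER):
    """Pipe the raw translation in src through the tokenizer and write dst."""
    with open(src, encoding="utf-8") as fin, \
            open(dst, "w", encoding="utf-8") as fout:
        # tokenizer.perl reads from stdin and writes to stdout.
        subprocess.run(tokenizer_command(lang, tokenizer),
                       stdin=fin, stdout=fout, check=True)
```

For example, tokenize_file("output.raw", "output.tokenized", "cs") would produce the output.tokenized file used by the language-specific commands that follow; output.raw is a placeholder name for the untokenized translation.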

French

python3 evaluate_fr.py -i output.tokenized -n morpheval.limsi.v2.en.info -d lefff.pkl

Czech

cat output.tokenized | sed 's/$/\n/' | tr ' ' '\n' | morphodita/src/run_morpho_analyze dictionary --input=vertical --output=vertical > output.analysis
python3 evaluate_cs.py -i output.analysis -n morpheval.limsi.v2.en.info
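
The sed and tr stages of the Czech pipeline turn the tokenized output into vertical format: one token per line, with an empty line after each sentence. As an illustration of what reaches run_morpho_analyze, here is a small Python sketch of the same transformation. The function name to_vertical is hypothetical, and the sketch assumes the sed expression is interpreted as by GNU sed (that is, \n in the replacement is a newline).

```python
def to_vertical(lines):
    """Mirror `sed 's/$/\\n/' | tr ' ' '\\n'` on an iterable of lines.

    Each space becomes a line break, and every input line is followed
    by one empty line that marks the sentence boundary.
    """
    return "".join(
        line.rstrip("\n").replace(" ", "\n") + "\n\n"
        for line in lines
    )


if __name__ == "__main__":
    # Two tokenized sentences, as they would appear in output.tokenized.
    print(to_vertical(["Dobrý den .", "Ahoj"]), end="")
```

Like tr, this sketch splits on single spaces only, so repeated spaces would yield empty token lines.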

German

cat output.tokenized | tr ' ' '\n' | sort | uniq | ./smor > output.smored
python3 evaluate_de.py -i output.tokenized -n morpheval.limsi.v2.en.info -d output.smored
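
The first German command reduces the tokenized output to its unique word forms, so that each form is passed to SMOR only once. A rough Python equivalent of the tr, sort and uniq stages is sketched below; the function name unique_tokens is hypothetical, and the ordering may differ from the shell version, because sort follows the current locale while Python sorts by code point.

```python
def unique_tokens(lines):
    """Collect the distinct whitespace-separated tokens of all lines, sorted."""
    tokens = set()
    for line in lines:
        # split() without arguments also skips empty tokens,
        # which the shell pipeline would keep as a single empty line.
        tokens.update(line.split())
    return sorted(tokens)


if __name__ == "__main__":
    # One token per line, ready to be piped into the analyzer.
    for token in unique_tokens(["das Haus", "das Auto"]):
        print(token)
```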

Publication

Franck Burlot and François Yvon, Evaluating the morphological competence of machine translation systems. In Proceedings of the Second Conference on Machine Translation (WMT’17). Association for Computational Linguistics, Copenhagen, Denmark, 2017.
