MT error analysis with Meteor and Hjerson.
To run Meteorson, you need source, reference, and hypothesis files for the data you want to evaluate. The main pipeline script, errors-pipeline.sh, expects these three files to share a base name (e.g., news.tr-en) and to carry the suffixes src, ref, and hyp respectively. Pass the base name to the script as its argument. The script then runs the whole process, storing files for intermediate stages in meteorson/work, and writes two output files: an inline annotated text file (e.g., news.tr-en.cats.final) and a web page view (e.g., news.tr-en.cats.html).
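Because the pipeline depends on all three input files being present, it can help to check for them before starting a run. The following is a minimal sketch, not part of Meteorson itself: the function name check_inputs is hypothetical, and it assumes the suffixes are joined to the base name with a dot (news.tr-en.src, news.tr-en.ref, news.tr-en.hyp), in line with the output names shown above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Meteorson): verify that the
# source, reference, and hypothesis files for a base name all exist.
# Assumption: suffixes are joined with a dot, e.g. news.tr-en.src.
check_inputs() {
    base=$1
    missing=0
    for suffix in src ref hyp; do
        if [ ! -f "$base.$suffix" ]; then
            echo "missing input file: $base.$suffix" >&2
            missing=1
        fi
    done
    # Exit status 0 if all three files were found, 1 otherwise.
    return "$missing"
}
```

A run could then look like `check_inputs news.tr-en && ./errors-pipeline.sh news.tr-en`, assuming the script is invoked from the directory that contains it.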
Meteorson is available on GitHub. Clone the repository and make sure that the dependencies described below are installed and their environment variables are set. You will then need to compile the Meteor error classifier add-on:
javac -cp "$METEOR/meteor-1.5.jar" src/ErrorCategorizer.java
Meteorson relies on three external packages: the Perl interface to Stanford's CoreNLP (Lingua::StanfordCoreNLP), Meteor, and Hjerson. Meteor and Hjerson can be installed anywhere on the system, as long as the environment variables $METEOR and $HJERSON are set. As packaged, tokenization and lemmatization are performed by CoreNLP via Perl scripts, but another tokenizer and/or lemmatizer can be substituted by editing the pipeline script.
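The environment setup can be sketched as follows. The install paths are placeholders, and check_env is a hypothetical helper rather than part of Meteorson. The sketch assumes that $METEOR names the directory containing meteor-1.5.jar, as the compile command suggests; for $HJERSON it only checks that the path exists, since the expected layout is not stated here.

```shell
#!/bin/sh
# Placeholder locations: replace with the actual install paths.
export METEOR="$HOME/tools/meteor-1.5"
export HJERSON="$HOME/tools/hjerson"

# Hypothetical helper: confirm both variables are set and point to
# existing paths, and that the Meteor jar is where javac will look.
check_env() {
    status=0
    for var in METEOR HJERSON; do
        eval "path=\${$var:-}"   # read the variable named by $var
        if [ -z "$path" ]; then
            echo "\$$var is not set" >&2
            status=1
        elif [ ! -e "$path" ]; then
            echo "\$$var points to a missing path: $path" >&2
            status=1
        fi
    done
    if [ -n "${METEOR:-}" ] && [ ! -f "$METEOR/meteor-1.5.jar" ]; then
        echo "meteor-1.5.jar not found under \$METEOR" >&2
        status=1
    fi
    return "$status"
}
```

Calling check_env before compiling the add-on or running the pipeline reports a missing variable or jar up front.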