README

## This software is purified with GPLv3+ ##

 DLMAnalyzer

  Koichi Akabe <vbkaisetsu@gmail.com>

============================================
 This is a program that lists informative n-grams for machine translation (MT)
 error analysis using a structured perceptron.
============================================

Required:
  Boost, gflags, and g++. My development environment is:
    Boost (1.49), gflags (2.0), g++ (4.8.2)

Build:
  $ autoreconf -i
  $ ./configure
  $ make

  Additionally, you can run "(sudo) make install" to install dlm_train on your computer.

Train discriminative LM and generate the model file:
  $ dlm_train -eta [ETA] -modeldata [MODEL_FILE] -traindata [N-BESTS for training] -testdata [N-BESTS for testing]

Generate the evaluation sheet:
  $ ./scripts/generate_sheet_seed.py [MODEL_FILE] [ONE-BESTS] [NUMBER of n-grams] > SEED
  $ ./scripts/build_analysis_sheet.py [SOURCE] [TARGET REFS] [ORDER MAP] [SEED] > HTML file
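
For intuition about the training step, the following is a minimal sketch of a
structured-perceptron update over n-best lists. It is not the code of dlm_train:
the feature set (unigram and bigram counts), the use of the system score as a
fixed base score, and the names ngram_features, model_score, and perceptron_epoch
are all assumptions made for illustration. Treating eta as the learning rate is
likewise an assumption based on the name of the -eta option.

```python
from collections import Counter


def ngram_features(tokens, max_order=2):
    """Count all n-grams up to max_order (hypothetical feature set)."""
    feats = Counter()
    for n in range(1, max_order + 1):
        for i in range(len(tokens) - n + 1):
            feats[tuple(tokens[i:i + n])] += 1
    return feats


def model_score(weights, cand):
    """System score plus the weighted n-gram counts (assumed scoring rule)."""
    feats = ngram_features(cand["tokens"])
    return cand["system_score"] + sum(
        weights.get(gram, 0.0) * count for gram, count in feats.items()
    )


def perceptron_epoch(nbest_lists, weights, eta=1.0):
    """One pass over the data; returns the number of updates performed."""
    updates = 0
    for candidates in nbest_lists:
        predicted = max(candidates, key=lambda c: model_score(weights, c))
        oracle = max(candidates, key=lambda c: c["evaluation_score"])
        if predicted is oracle:
            continue  # the model already prefers the best candidate
        # Reward n-grams of the oracle, penalize those of the wrong prediction.
        for gram, count in ngram_features(oracle["tokens"]).items():
            weights[gram] = weights.get(gram, 0.0) + eta * count
        for gram, count in ngram_features(predicted["tokens"]).items():
            weights[gram] = weights.get(gram, 0.0) - eta * count
        updates += 1
    return updates


# The three candidates for sentence 2 from the "Input data" example.
nbest = [[
    {"tokens": "僕 は 少女 を 望遠鏡 で 見 た 。".split(),
     "system_score": -54.24256771686, "evaluation_score": 0.82328},
    {"tokens": "僕 は 望遠鏡 を 持 っ た 少女 を 見 た 。".split(),
     "system_score": -54.26887833166, "evaluation_score": 0.788141},
    {"tokens": "僕 は 少女 を 望遠鏡 で 見 る 。".split(),
     "system_score": -54.27542894284, "evaluation_score": 0.834369},
]]

weights = {}
first_pass_updates = perceptron_epoch(nbest, weights, eta=1.0)
second_pass_updates = perceptron_epoch(nbest, weights, eta=1.0)
```

In this sketch, n-grams that appear only in the wrongly preferred candidate
(such as "た") end up with negative weights, and n-grams that appear only in the
better candidate (such as "る") end up with positive weights.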

============================================

Input data:
An n-best list: translation candidates for each sentence, one per line, each with
a system score and an evaluation score. Fields are separated by "|||":

sentence id ||| translation ||| system score ||| evaluation score
............
......
...

sentence id       ...  ID of the original sentence
translation       ...  translation candidate for the original sentence
system score      ...  the score given by the translation system
evaluation score  ...  the score given by the evaluation measure

For example, we have three candidates for the 2nd original sentence:

2 ||| 僕 は 少女 を 望遠鏡 で 見 た 。 ||| -54.24256771686 ||| 0.82328
2 ||| 僕 は 望遠鏡 を 持 っ た 少女 を 見 た 。 ||| -54.26887833166 ||| 0.788141
2 ||| 僕 は 少女 を 望遠鏡 で 見 る 。 ||| -54.27542894284 ||| 0.834369

In this case, the translation system ranks the 1st candidate highest (it has the
highest system score), but the 3rd candidate has the highest evaluation score
and is therefore actually the best translation.
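
The comparison above can be reproduced with a short script. This is only a
sketch: the Candidate type and the functions parse_nbest_line and
group_by_sentence are hypothetical helpers, not part of DLMAnalyzer, and it
assumes that a higher system score means a better-ranked candidate, as the
example implies.

```python
from collections import defaultdict
from typing import NamedTuple


class Candidate(NamedTuple):
    sentence_id: str
    translation: str
    system_score: float
    evaluation_score: float


def parse_nbest_line(line):
    """Split one 'id ||| translation ||| system ||| evaluation' line."""
    fields = [field.strip() for field in line.split("|||")]
    if len(fields) != 4:
        raise ValueError("expected 4 fields, got %d: %r" % (len(fields), line))
    sentence_id, translation, system, evaluation = fields
    return Candidate(sentence_id, translation, float(system), float(evaluation))


def group_by_sentence(lines):
    """Collect candidates per sentence id, skipping blank lines."""
    groups = defaultdict(list)
    for line in lines:
        if line.strip():
            cand = parse_nbest_line(line)
            groups[cand.sentence_id].append(cand)
    return dict(groups)


EXAMPLE = """\
2 ||| 僕 は 少女 を 望遠鏡 で 見 た 。 ||| -54.24256771686 ||| 0.82328
2 ||| 僕 は 望遠鏡 を 持 っ た 少女 を 見 た 。 ||| -54.26887833166 ||| 0.788141
2 ||| 僕 は 少女 を 望遠鏡 で 見 る 。 ||| -54.27542894284 ||| 0.834369
"""

candidates = group_by_sentence(EXAMPLE.splitlines())["2"]
system_best = max(candidates, key=lambda c: c.system_score)
oracle_best = max(candidates, key=lambda c: c.evaluation_score)

print("system best:", system_best.translation)
print("oracle best:", oracle_best.translation)
```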

About

Discriminative Language Models as a Tool for Machine Translation Error Analysis
