vbkaisetsu/dlm-analyzer

## This software is purified with GPLv3+ ##

 DLMAnalyzer

  Koichi Akabe <vbkaisetsu@gmail.com>

============================================
 This program lists informative n-grams for machine translation (MT)
 error analysis using a structured perceptron.
============================================

Required:
  My development environment is:
    Boost (1.49), gflags (2.0), g++ (4.8.2)

Build:
  $ autoreconf -i
  $ ./configure
  $ make

  Additionally, you can run "(sudo) make install" to install dlm_train on your computer.

Train discriminative LM and generate the model file:
  $ dlm_train -eta [ETA] -modeldata [MODEL_FILE] -traindata [N-BESTS for training] -testdata [N-BESTS for testing]

Generate the evaluation sheet:
  $ ./scripts/generate_sheet_seed.py [MODEL_FILE] [ONE-BESTS] [NUMBER of n-grams] > SEED
  $ ./scripts/build_analysis_sheet.py [SOURCE] [TARGET REFS] [ORDER MAP] [SEED] > HTML file

============================================

Input data:
List of translation candidates for each sentence with system scores and evaluation scores.

sentence id ||| translation ||| system score ||| evaluation score
............
......
...

sentence id       ...  ID of original sentence
translation       ...  translation candidate for the original sentence
system score      ...  the score given by the translation system
evaluation score  ...  the score given by the evaluation measure
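
A minimal sketch of how one line in this format could be read, assuming the
fields are separated by "|||" exactly as shown above. The names Candidate and
parse_nbest_line are hypothetical and are not part of DLMAnalyzer; the sentence
id is kept as a string because the format does not say whether it is numeric.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """One translation candidate (hypothetical helper type)."""
    sentence_id: str
    translation: str
    system_score: float
    evaluation_score: float


def parse_nbest_line(line: str) -> Candidate:
    """Split one 'id ||| translation ||| system ||| evaluation' line."""
    fields = [field.strip() for field in line.split("|||")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    sentence_id, translation, system_score, evaluation_score = fields
    return Candidate(
        sentence_id,
        translation,
        float(system_score),
        float(evaluation_score),
    )


candidate = parse_nbest_line("2 ||| a b c ||| -1.5 ||| 0.25")
print(candidate.translation)
```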

For example, we have three candidates for the 2nd original sentence:

2 ||| 僕 は 少女 を 望遠鏡 で 見 た 。 ||| -54.24256771686 ||| 0.82328
2 ||| 僕 は 望遠鏡 を 持 っ た 少女 を 見 た 。 ||| -54.26887833166 ||| 0.788141
2 ||| 僕 は 少女 を 望遠鏡 で 見 る 。 ||| -54.27542894284 ||| 0.834369

In this case, the translation system ranks the 1st candidate highest
(system score -54.24256771686), but the 3rd candidate has the highest
evaluation score (0.834369) and is therefore actually the best translation.
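
The comparison above can be sketched as follows. This is only an illustration,
assuming that a higher value is better for both the system score and the
evaluation score (which matches the example). The function best_candidates is
hypothetical and is not part of DLMAnalyzer.

```python
# The three example candidates for sentence 2, copied from above.
NBEST = """\
2 ||| 僕 は 少女 を 望遠鏡 で 見 た 。 ||| -54.24256771686 ||| 0.82328
2 ||| 僕 は 望遠鏡 を 持 っ た 少女 を 見 た 。 ||| -54.26887833166 ||| 0.788141
2 ||| 僕 は 少女 を 望遠鏡 で 見 る 。 ||| -54.27542894284 ||| 0.834369
"""


def best_candidates(lines):
    """Return (system-best translation, evaluation-best translation).

    Assumes all lines belong to the same sentence id and that higher
    scores are better for both measures.
    """
    rows = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        _sid, text, system, evaluation = [f.strip() for f in line.split("|||")]
        rows.append((text, float(system), float(evaluation)))
    system_best = max(rows, key=lambda row: row[1])
    evaluation_best = max(rows, key=lambda row: row[2])
    return system_best[0], evaluation_best[0]


system_best, evaluation_best = best_candidates(NBEST.splitlines())
print("system best:    ", system_best)
print("evaluation best:", evaluation_best)
```

When the two results differ, as they do here, the n-best list contains a
candidate that the evaluation measure prefers over the system's own choice.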

About

Discriminative Language Models as a Tool for Machine Translation Error Analysis