Discriminative Language Models as a Tool for Machine Translation Error Analysis
## This software is distributed under GPLv3+ ##

DLMAnalyzer
Koichi Akabe <vbkaisetsu@gmail.com>

============================================

This program lists informative n-grams for MT error analysis using a
structured perceptron.

============================================

Requirements:

  The development environment is: Boost (1.49), gflags (2.0), g++ (4.8.2)

Build:

  $ autoreconf -i
  $ ./configure
  $ make

  Additionally, you can run "(sudo) make install" to install dlm_train on
  your computer.

Train the discriminative LM and generate the model file:

  $ dlm_train -eta [ETA] -modeldata [MODEL_FILE] -traindata [N-BESTS for training] -testdata [N-BESTS for testing]

Generate the evaluation sheet:

  $ ./scripts/generate_sheet_seed.py [MODEL_FILE] [ONE-BESTS] [NUMBER of n-grams] > [SEED]
  $ ./scripts/build_analysis_sheet.py [SOURCE] [TARGET REFS] [ORDER MAP] [SEED] > [HTML FILE]

============================================

Input data:

  A list of translation candidates for each sentence, with system scores and
  evaluation scores:

    sentence id ||| translation ||| system score ||| evaluation score
    ...

  sentence id      ... ID of the original sentence
  translation      ... a translation candidate for the original sentence
  system score     ... the score given by the translation system
  evaluation score ... the score given by the evaluation measure

  For example, suppose we have three candidates for the 2nd original sentence:

    2 ||| 僕 は 少女 を 望遠鏡 で 見 た 。 ||| -54.24256771686 ||| 0.82328
    2 ||| 僕 は 望遠鏡 を 持 っ た 少女 を 見 た 。 ||| -54.26887833166 ||| 0.788141
    2 ||| 僕 は 少女 を 望遠鏡 で 見 る 。 ||| -54.27542894284 ||| 0.834369

  In this case, the translation system outputs the 1st sentence as the best
  translation, but the 3rd candidate is actually the best according to the
  evaluation score.
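
============================================

Example (unofficial): checking an n-best file with Python:

  The following is a minimal sketch, not part of the DLMAnalyzer
  distribution; the script and function names are my own. It groups the
  candidates of an n-best file by sentence id and reports each sentence
  where the system-best candidate differs from the candidate with the
  highest evaluation score:

  #!/usr/bin/env python3
  # Unofficial helper sketch: parse "id ||| translation ||| system score |||
  # evaluation score" lines and compare the system-best candidate with the
  # oracle-best (highest evaluation score) candidate for each sentence.
  import sys
  from collections import defaultdict

  def parse_nbest(path):
      # Group candidates by sentence id; each entry is (translation, sys, eval).
      nbests = defaultdict(list)
      with open(path, encoding="utf-8") as f:
          for line in f:
              if not line.strip():
                  continue
              sent_id, translation, sys_score, eval_score = \
                  [field.strip() for field in line.split("|||")]
              nbests[sent_id].append((translation, float(sys_score), float(eval_score)))
      return nbests

  if __name__ == "__main__":
      for sent_id, cands in parse_nbest(sys.argv[1]).items():
          system_best = max(cands, key=lambda c: c[1])  # highest system score
          oracle_best = max(cands, key=lambda c: c[2])  # highest evaluation score
          if system_best != oracle_best:
              print("%s: system picked '%s', but '%s' has a higher evaluation score"
                    % (sent_id, system_best[0], oracle_best[0]))

  On the example above this would flag sentence 2, since the system-best
  candidate (evaluation score 0.82328) is not the highest-scoring one
  (0.834369); these system/oracle disagreements are the cases the
  informative n-grams are meant to help analyze.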