FDA

Feature Decay Algorithms (FDA)

Citation:

Ergun Biçici and Deniz Yuret, “Optimizing Instance Selection for Statistical Machine Translation with Feature Decay Algorithms”, IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), 2014.

FDA was developed as part of my PhD thesis:

Ergun Biçici. The Regression Model of Machine Translation. PhD thesis, Koç University, 2011. Supervisor: Deniz Yuret.

    Usage: fda [opts] train1 test1 [train2] [test2]

    train1: first (mandatory) arg gives source language train file
    test1 : second (mandatory) arg gives source language test file
    train2: third (optional) arg gives target language train file
            this is used to output target sentences as a second column
    test2 : fourth (optional) arg gives target language test file
            this is used to calculate metrics like bigram coverage

    If any of these arguments are "-", the data is read from stdin.
    Gzip compressed files are automatically recognized and handled.

    Other options (and defaults) are:
    -v (1): verbosity level, -v0 no messages, -v2 more detail
    -n (3): maximum ngram order for features
    -t (0): number of training words output, -t0 means no limit
    -o (null): output file, stdout is used if not specified

    The rest of the options are used to calculate feature and sentence scores:
    -i (1.0): initial feature score idf exponent
    -l (1.0): initial feature score ngram length exponent
    -d (0.5): final feature score decay factor
    -c (0.0): final feature score decay exponent
    -s (1.0): sentence score length exponent

    Formulas:
    initial feature score: fscore0 = idf^i * ngram^l
    final feature score  : fscore1 = fscore0 * d^cnt * cnt^(-c)
    sentence score       : sscore = sum_fscore1 * slen^(-s)
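To make the three formulas in the usage text concrete, here is a minimal Python sketch of the arithmetic. It is not taken from `fda.c`. The function names are hypothetical, `ngram` is read as the n-gram length in tokens, `cnt` is assumed to be the number of times a feature has already occurred in the selected sentences, and `cnt^(-c)` is assumed to be 1 when `cnt` is 0 (the usage text does not say how that case is handled). Default argument values mirror the defaults of `-i`, `-l`, `-d`, `-c`, and `-s`.

```python
def initial_feature_score(idf, ngram_len, i=1.0, l=1.0):
    """fscore0 = idf^i * ngram^l (ngram_len: assumed n-gram length in tokens)."""
    return (idf ** i) * (ngram_len ** l)


def final_feature_score(fscore0, cnt, d=0.5, c=0.0):
    """fscore1 = fscore0 * d^cnt * cnt^(-c).

    cnt is assumed to count prior occurrences of the feature among the
    sentences selected so far.
    """
    # Assumption: treat cnt^(-c) as 1 when cnt == 0 to avoid dividing by zero.
    polynomial_decay = 1.0 if cnt == 0 else cnt ** (-c)
    return fscore0 * (d ** cnt) * polynomial_decay


def sentence_score(feature_scores, slen, s=1.0):
    """sscore = sum_fscore1 * slen^(-s) (slen: sentence length in tokens)."""
    return sum(feature_scores) * slen ** (-s)


# A trigram with idf 2.0 that has already been seen twice in the selection.
f0 = initial_feature_score(2.0, 3)      # 2.0^1 * 3^1 = 6.0
f1 = final_feature_score(f0, cnt=2)     # 6.0 * 0.5^2 = 1.5
score = sentence_score([f1, 0.5], 4)    # (1.5 + 0.5) / 4 = 0.5
```

With the default `-d 0.5` and `-c 0.0`, each additional occurrence of a feature in the selected data halves its contribution, so sentences that bring in features not yet covered score higher than sentences that repeat features already selected.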

