IBM Models 1 and 2

Python implementation of IBM Models 1 and 2. Tested using the French-English parallel corpus from the HLT-NAACL 2003 Workshop.

We use various initializations for both models 1 and 2 (a sketch of all three follows the list):

  • Random initialization
  • Uniform initialization
  • Initialization of IBM model 2 with the translation table t from IBM model 1
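
As a rough illustration, the three ways of seeding the translation table t(f|e) could look as follows. This is a minimal sketch with hypothetical names (init_t, english_vocab, french_vocab), not the code in src/main.py.

import random
from collections import defaultdict

def init_t(english_vocab, french_vocab, mode="uniform", t_from_model1=None):
    # Sketch of the three initialization strategies for t(f|e).
    if mode == "model1" and t_from_model1 is not None:
        # Seed IBM model 2 with the lexical table learned by IBM model 1.
        return t_from_model1
    t = defaultdict(dict)
    for e in english_vocab:
        if mode == "uniform":
            # Every French word is equally likely given e.
            for f in french_vocab:
                t[e][f] = 1.0 / len(french_vocab)
        elif mode == "random":
            # Random positive weights, normalized to a proper distribution.
            weights = {f: random.random() + 1e-9 for f in french_vocab}
            total = sum(weights.values())
            for f in french_vocab:
                t[e][f] = weights[f] / total
    return t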

We also added the following improvements to IBM model 1 (see the sketch after the list):

  • Smoothing
  • N-Null words
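
The two improvements could be sketched as below; the smoothing constant alpha and the number of null words are illustrative defaults, not the values used in this repository.

def smoothed_t(count_ef, count_e, french_vocab_size, alpha=0.01):
    # Add-alpha smoothing of the M-step estimate t(f|e) = c(e,f) / c(e),
    # so that unseen word pairs keep a small non-zero probability.
    return (count_ef + alpha) / (count_e + alpha * french_vocab_size)

def add_null_words(english_sentence, n_null=3):
    # Prepend n null tokens instead of the single NULL word of standard
    # IBM model 1, giving unaligned French words more mass to attach to.
    return ["<NULL>"] * n_null + english_sentence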

Training

Train the models by executing:

python src/main.py

Be aware that intermediate states of the model are saved during training, which may result in large data files (400 MB+).
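
For orientation, the EM loop that such a training run performs for IBM model 1 has roughly the shape below. This is a minimal sketch using the init_t table from above; corpus loading, checkpointing of intermediate states, and the model 2 alignment distribution are omitted.

from collections import defaultdict

def train_ibm1(parallel_corpus, t, iterations=10):
    # parallel_corpus: list of (french_tokens, english_tokens) pairs.
    # t: initial translation table t[e][f], e.g. produced by init_t above.
    for _ in range(iterations):
        count_ef = defaultdict(float)  # expected counts c(e, f)
        count_e = defaultdict(float)   # expected counts c(e)
        for french, english in parallel_corpus:
            english = ["<NULL>"] + english
            for f in french:
                # E-step: posterior over which English word generated f.
                norm = sum(t[e][f] for e in english)
                for e in english:
                    delta = t[e][f] / norm
                    count_ef[(e, f)] += delta
                    count_e[e] += delta
        # M-step: re-estimate t(f|e) from the expected counts.
        for (e, f), c in count_ef.items():
            t[e][f] = c / count_e[e]
    return t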

Evaluation

The Python code dumps .eval files that can be used with the provided Perl script data/test/eval/wa_eval_align.pl. Alternatively, run bash eval.sh to execute all evaluations at once.
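
The Perl script reports the shared-task word-alignment metrics, including the alignment error rate (AER). As an illustration of what is measured, AER can be computed from a predicted alignment A and gold sure/possible link sets S and P as below; this is a sketch only, and the repository relies on wa_eval_align.pl for the actual scoring.

def alignment_error_rate(predicted, sure, possible):
    # predicted, sure, possible: sets of (french_position, english_position) links,
    # where the possible set is assumed to include the sure links.
    a_and_s = len(predicted & sure)
    a_and_p = len(predicted & possible)
    return 1.0 - (a_and_s + a_and_p) / (len(predicted) + len(sure))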
