## Overview

In this homework we tried two models for alignment:

- IBM Model 1
- HMM model

We also apply bidirectional training and decoding to both models. Initializing the HMM model with the pretrained IBM Model 1 parameters gave a relatively good result, an **AER of 0.113**, after only 3 training iterations of the HMM.

## Baseline (IBM model 1)

For each English-French word pair $(e, f)$, we keep a translation probability $t(f|e)$, which is initialized uniformly to $1/|f|$.

For each iteration, we:

* initialize count() and count_pair() to 0
* for each parallel sentence pair $(f,e)$
  * for each French word $f_i$
    * $z = \sum_{e_j} t(f_i|e_j)$
    * for each English word $e_j$
      * $c = t(f_i|e_j) / z$
      * $count\_pair(f_i|e_j) += c$
      * $count(e_j) += c$
* for each word pair (f,e) in count_pair()
  * $t(f|e) = count\_pair(f|e) / count(e)$

We repeat this process until the log likelihood changes by less than a fixed threshold epsilon between two consecutive iterations, or until a fixed number of iterations has run.
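
The loop above can be sketched in Python as follows. This is a minimal illustration, not the code used for the reported numbers: the function name `train_ibm1`, the defaults for `max_iters` and `epsilon`, and the choice to initialize with one over the French vocabulary size are all assumptions, and no NULL word is modeled because the pseudocode does not mention one.

```python
from collections import defaultdict
import math


def train_ibm1(bitext, max_iters=8, epsilon=1e-4):
    """EM for t(f|e) over a list of (french_tokens, english_tokens) pairs.

    Hypothetical helper: returns a dict-like table keyed by (f, e).
    """
    french_vocab = {f for f_sent, _ in bitext for f in f_sent}
    uniform = 1.0 / len(french_vocab)  # assumed meaning of the uniform start
    t = defaultdict(lambda: uniform)
    prev_ll = None

    for _ in range(max_iters):
        count_pair = defaultdict(float)  # expected count of (f, e) links
        count = defaultdict(float)       # expected count of e being linked
        ll = 0.0                         # log likelihood under the current t

        # E-step: distribute each French word over the English words.
        for f_sent, e_sent in bitext:
            for f in f_sent:
                z = sum(t[(f, e)] for e in e_sent)
                ll += math.log(z / len(e_sent))
                for e in e_sent:
                    c = t[(f, e)] / z
                    count_pair[(f, e)] += c
                    count[e] += c

        # M-step: renormalize the expected counts per English word.
        for (f, e), c in count_pair.items():
            t[(f, e)] = c / count[e]

        # Stop once the log likelihood has (nearly) stopped changing.
        if prev_ll is not None and abs(ll - prev_ll) < epsilon:
            break
        prev_ll = ll

    return t
```

On a toy corpus such as `[(["la", "maison"], ["the", "house"]), (["la", "fleur"], ["the", "flower"])]`, the table quickly comes to prefer `la` for `the`, because that pair co-occurs in both sentences.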

#### Result

After 8 iterations:
* Precision = 0.599407
* Recall = 0.773403
* AER = 0.341046
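
For reference, the three metrics can be computed as in the sketch below. It assumes the usual definitions based on a set of sure links and a set of possible links in the reference alignment; the function name `alignment_scores` and the representation of links as tuples are illustrative choices, and the sketch assumes that both the predicted set and the sure set are non-empty.

```python
def alignment_scores(predicted, sure, possible):
    """Precision, recall and AER for sets of alignment links.

    Hypothetical helper: each link is any hashable value, for example
    (sentence_index, french_index, english_index).
    """
    possible = possible | sure  # every sure link also counts as possible
    a_and_s = len(predicted & sure)
    a_and_p = len(predicted & possible)

    precision = a_and_p / len(predicted)
    recall = a_and_s / len(sure)
    aer = 1.0 - (a_and_s + a_and_p) / (len(predicted) + len(sure))
    return precision, recall, aer
```

Because precision is measured against the possible links and recall against the sure links, AER is not determined by precision and recall alone.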

## Improvements

### Bidirectional IBM model 1 (align using $Pr(f|e)$ and $Pr(e|f)$)

We train one model for $Pr(f|e)$ and another for $Pr(e|f)$, then decode the best alignment with each model independently. We report only the alignment links that appear in both alignment sets, i.e. their intersection.
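
A minimal sketch of this decode-then-intersect step is shown below. The helper names `decode_one_direction` and `intersect_alignments` are hypothetical, and the sketch assumes each translation table is a plain dict keyed by (generated word, conditioning word), so the table for $Pr(f|e)$ is keyed by (f, e) and the table for $Pr(e|f)$ by (e, f).

```python
def decode_one_direction(t, src_sent, tgt_sent):
    """Link every target word to the source word with the highest t[(tgt, src)].

    Returns a set of (src_index, tgt_index) pairs.
    """
    links = set()
    for j, tgt_word in enumerate(tgt_sent):
        scores = [t.get((tgt_word, src_word), 0.0) for src_word in src_sent]
        links.add((scores.index(max(scores)), j))
    return links


def intersect_alignments(t_f_given_e, t_e_given_f, f_sent, e_sent):
    """Keep only the (f_index, e_index) links that both directions agree on."""
    # Pr(f|e): each French word picks an English word; flip to (f_idx, e_idx).
    fe = {(f_idx, e_idx)
          for e_idx, f_idx in decode_one_direction(t_f_given_e, e_sent, f_sent)}
    # Pr(e|f): each English word picks a French word, already (f_idx, e_idx).
    ef = decode_one_direction(t_e_given_f, f_sent, e_sent)
    return fe & ef
```

Each direction links every word on its generated side, so a French word with no good English counterpart still gets a link under $Pr(f|e)$; the intersection drops such a link unless the other direction proposes the same one.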

#### Result

After 100 iterations:
* Precision = 0.867216
* Recall = 0.695146
* AER = 0.220469

#### Analysis

Intersecting the decoding results of the two alignment directions gives much higher precision than the baseline but lower recall: we discard many good links that appear in only one of the two directions. Both precision and recall can be improved by making the two directions agree during training instead of only intersecting at decoding time (Liang et al. [2]).

### HMM-based alignment model



#### Result

After 4 iterations:
* Precision = 0.731220
* Recall = 0.862803
* AER = 0.223748

### Combine HMM-based alignment model with bidirectional IBM model 1

#### Result

After 3 iterations:
* Precision = 0.949554
* Recall = 0.820456
* AER = 0.113253

## References

[1] "IBM Models". SMT Research Survey Wiki. 11 September 2015. Retrieved 20 November 2018.

[2] P. Liang, B. Taskar, and D. Klein. "Alignment by Agreement". In NAACL, 2006.

The following cell loads the saved bidirectional HMM checkpoint and validates it against the reference alignments:

    from HMMmodel import BiHMMmodel

    # Paths to the French text, the English text, and the reference alignments.
    f_path = "data/hansards.fr"
    e_path = "data/hansards.en"
    a_path = "data/hansards.a"
    with open(f_path) as f, open(e_path) as e, open(a_path) as a:
        f_data, e_data, a_data = f.readlines(), e.readlines(), a.readlines()

    # Tokenize each (French, English) sentence pair on whitespace.
    bitext = [[sentence.strip().split() for sentence in pair]
              for pair in zip(f_data, e_data)]
    # The same pairs with English first, for the reverse alignment direction.
    rev_bitext = [[e_sentence, f_sentence] for f_sentence, e_sentence in bitext]

    # Load the saved checkpoint and score it against the reference alignments.
    bihmmmodel = BiHMMmodel()
    bihmmmodel.load_model('bihmmckpt/bihmm_iter3.m')
    bihmmmodel.validate(bitext, rev_bitext, f_data, e_data, a_data)

Output:

    Precision = 0.949554
    Recall = 0.820456
    AER = 0.113253
