
Commit

improve bilingualLM alignment heuristics consistency
rsennrich committed Nov 26, 2014
1 parent ee759bf commit 4ca730a
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion scripts/training/bilingual-lm/extract.py
@@ -93,7 +93,8 @@ def get_ngrams(corpus_stem, align_file, tagged_stem, svocab, tvocab, slang,tlang

 if not spos_list:
     raise Exception("No alignments in sentence \nSRC: " + lines[0][:-1] + "\nTGT: " + lines[1][:-1])
-spos = (max(spos_list) + min(spos_list)) / 2
+midpos = (len(spos_list)-1) / 2
+spos = sorted(spos_list)[midpos]


# source-context, target-context, predicted word
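The change swaps the midpoint between the lowest and highest aligned source positions for the lower median of the aligned positions. The sketch below contrasts the two under a few assumptions: the function names `midpoint_heuristic` and `median_heuristic` are hypothetical and do not appear in `extract.py`, and `//` is used so the code runs on Python 3 (the `/` in the diff appears to rely on Python 2 integer division). One visible difference is that the median always returns a position that is actually in `spos_list`, while the midpoint may not.

```python
def midpoint_heuristic(spos_list):
    """Old behaviour: midpoint between the lowest and highest aligned position."""
    return (max(spos_list) + min(spos_list)) // 2


def median_heuristic(spos_list):
    """New behaviour: lower median of the aligned source positions."""
    midpos = (len(spos_list) - 1) // 2
    return sorted(spos_list)[midpos]


# Hypothetical example: a target word aligned to source positions 2, 3 and 9.
aligned = [2, 3, 9]
print(midpoint_heuristic(aligned))  # 5, which is not one of the aligned positions
print(median_heuristic(aligned))    # 3, which is one of the aligned positions
```

With an even number of alignments, such as `[4, 7]`, the median variant picks the lower of the two middle positions (4), because `(len(spos_list) - 1) // 2` rounds down.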