NMTGMinor.lowLatency

Based on https://github.com/quanpn90/NMTGMinor and https://github.com/jniehues-kit/SLT.KIT

Requirements

For training:

Python >= 3.6
pytorch
hdf5
apex
kaldiio

For evaluation:

sacrebleu
nmtpytorch

Data prep

Get pre-packaged data via https://github.com/srvk/how2-dataset

Place the dataset under $BASEDIR/data/orig/how2-300h-v1

Run the preprocessing script:

./prep.data.sh

Train

Train a unidirectional model with:

./train.sh

This calls NMTGMinor/Train.speech.uni0.sh.

The -limit_rhs_steps option sets the number of look-ahead steps. Setting -limit_rhs_steps=0 disables look-ahead.
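To make the look-ahead idea concrete, here is a minimal sketch of a mask that limits how far to the right each position can see. It assumes that the option bounds the number of future steps visible to each position; how the training code actually applies the limit is not stated here, and the function name `lookahead_mask` is hypothetical.

```python
def lookahead_mask(num_steps, limit_rhs_steps):
    """Build a boolean visibility mask (illustrative only, not the repo's code).

    mask[i][j] is True when position i may look at position j, i.e. when j is
    at most limit_rhs_steps positions to the right of i.
    """
    return [
        [j <= i + limit_rhs_steps for j in range(num_steps)]
        for i in range(num_steps)
    ]


# With limit_rhs_steps=0 every position sees only itself and its past.
for row in lookahead_mask(4, 0):
    print("".join("x" if allowed else "." for allowed in row))
```

Under this assumption, a larger value trades latency for more right context.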

Predict & Eval

Full-sequence decoding

Run prediction and evaluation with:

./pred.sh

Chunk-based incremental decoding

First, split the input utterances into chunks:

python ./smalltools/create_chunks.py $RESDIR/how2/data/prepro/eval/dev5.scp $RESDIR/how2/data/prepro/eval_partial_0.5sec 50
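As a rough illustration of what chunking an utterance can look like, the sketch below lists the lengths of the growing prefixes of one utterance. It rests on two assumptions that the command above does not confirm: that the last argument (50) is a chunk size in feature frames (which would match 0.5 sec at 10 ms per frame), and that each partial utterance is a prefix that grows by one chunk. The function name `partial_lengths` is hypothetical and is not part of create_chunks.py.

```python
import math


def partial_lengths(num_frames, chunk_size=50):
    """Frame counts of the growing prefixes of one utterance (illustrative).

    chunk_size=50 is an assumed value taken from the command-line example.
    """
    num_partial = math.ceil(num_frames / chunk_size)
    return [min(k * chunk_size, num_frames) for k in range(1, num_partial + 1)]


print(partial_lengths(120))  # [50, 100, 120]
```

Under these assumptions, an utterance of 120 frames would yield three partial sequences, the last of which is the full utterance.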

Point the test set at the partial utterances with symbolic links:

ln -s $RESDIR/how2/data/prepro/eval_partial_0.5sec/feats.scp $RESDIR/how2/data/prepro/eval/dev5.0.5sec.scp

ln -s $RESDIR/how2/data/prepro/eval_partial_0.5sec/num.partial.seqs.0.5sec.pickle $RESDIR/how2/data/prepro/eval/dev5.0.5sec.num.partial.seqs.pickle

Strategy: local agreement

Run prediction and evaluation with the local agreement strategy:

./pred.agree.sh
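The idea usually meant by local agreement can be sketched as follows: only the tokens on which the hypotheses for two consecutive chunks agree are treated as stable. This is a minimal sketch under that assumption, not the logic of pred.agree.sh, and the function name `agreed_prefix` is hypothetical.

```python
def agreed_prefix(prev_hyp, curr_hyp):
    """Longest common prefix of two token lists (illustrative only)."""
    prefix = []
    for prev_tok, curr_tok in zip(prev_hyp, curr_hyp):
        if prev_tok != curr_tok:
            break
        prefix.append(curr_tok)
    return prefix


prev_hyp = ["this", "is", "a", "cat"]
curr_hyp = ["this", "is", "the", "best", "way"]
print(agreed_prefix(prev_hyp, curr_hyp))  # ['this', 'is']
```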

Strategy: hold-n

Run prediction and evaluation with the hold-n strategy:

./pred.holdn.sh
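A hold-n policy is commonly understood as withholding the last n tokens of the current hypothesis until more input arrives. The sketch below illustrates that reading; it is an assumption about the strategy rather than a description of pred.holdn.sh, and the function name `hold_n` and the `is_final` flag are hypothetical.

```python
def hold_n(hypothesis, n, is_final=False):
    """Return the tokens that may be output now (illustrative only).

    The last n tokens are withheld unless the utterance is complete.
    """
    if is_final or n == 0:
        return list(hypothesis)
    return list(hypothesis[:-n])


hyp = ["we", "are", "going", "to", "start"]
print(hold_n(hyp, 2))                 # ['we', 'are', 'going']
print(hold_n(hyp, 2, is_final=True))  # ['we', 'are', 'going', 'to', 'start']
```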

Strategy: wait-k

Run prediction and evaluation with the wait-k strategy:

./pred.waitk.sh
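For orientation, a wait-k policy typically waits for k units of input before producing the first output token and then emits one more token per additional unit. The sketch below applies that idea with chunks as the input unit, which is an assumption; the exact policy in pred.waitk.sh may differ, and the names `wait_k` and `is_final` are hypothetical.

```python
def wait_k(hypothesis, num_chunks_read, k, is_final=False):
    """Return the tokens that may be output now (illustrative only).

    Nothing is output before k chunks have been read; after that, one more
    token is allowed per additional chunk.
    """
    if is_final:
        return list(hypothesis)
    budget = max(0, num_chunks_read - k + 1)
    return list(hypothesis[:budget])


hyp = ["we", "are", "going"]
print(wait_k(hyp, num_chunks_read=1, k=2))  # []
print(wait_k(hyp, num_chunks_read=2, k=2))  # ['we']
print(wait_k(hyp, num_chunks_read=3, k=2))  # ['we', 'are']
```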
