New beam-search framework: ScorerInterface, CPU/GPU float16/32/64 decoding, and new language models (SeqRNNLM and TransformerLM) #1092
Codecov Report
@@ Coverage Diff @@
## master #1092 +/- ##
==========================================
+ Coverage 76.36% 78.65% +2.29%
==========================================
Files 83 91 +8
Lines 7810 8045 +235
==========================================
+ Hits 5964 6328 +364
+ Misses 1846 1717 -129
Actually, the current GPU batch ASR decoding does not do anything special for rnnlm. See espnet/espnet/lm/pytorch_backend/lm.py Lines 134 to 146 in f1bb241
So you could just do the same thing for now. I want to revisit it after #980.
@sw005320 As there is no breaking change, I expect master is better.
OK.
I have finished the easy ones, Transformer and length penalty, because they are stateless in the current implementation. Tomorrow, I will implement the stateful ones (RNN ASR, RNN LM, CTC prefix score). Currently, the old and new implementations produce exactly the same scores. However, IMO, I do not want to reimplement this local pruning with espnet/espnet/nets/pytorch_backend/e2e_asr_transformer.py Lines 320 to 333 in 6759e7f
I do not see how to share scores between the decoder and CTC modules, because every decoder runs independently in my implementation. espnet/espnet/nets/pytorch_backend/beam_search.py Lines 130 to 133 in be342ae
Does this |
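The distinction drawn above between stateless scorers (Transformer, length penalty) and stateful ones (RNN LM, CTC prefix score) can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `ScorerInterface`: the class names `LengthBonus` and `ToyRecurrentLM`, the method names `init_state` and `score`, and the helper `combined_scores` are all assumptions made for this example.

```python
from typing import Any, Dict, List, Tuple


class LengthBonus:
    """Stateless scorer (hypothetical): every token gets the same bonus."""

    def __init__(self, vocab_size: int) -> None:
        self.vocab_size = vocab_size

    def init_state(self) -> Any:
        return None  # nothing to carry between decoding steps

    def score(self, prefix: List[int], state: Any) -> Tuple[List[float], Any]:
        return [1.0] * self.vocab_size, None


class ToyRecurrentLM:
    """Stateful scorer (hypothetical): the state is a running sum of tokens."""

    def __init__(self, vocab_size: int) -> None:
        self.vocab_size = vocab_size

    def init_state(self) -> int:
        return 0

    def score(self, prefix: List[int], state: int) -> Tuple[List[float], int]:
        # Update the state using only the most recent token.
        new_state = (state + prefix[-1]) % self.vocab_size
        # Toy rule: the token equal to the new state is preferred.
        scores = [0.0 if v == new_state else -1.0 for v in range(self.vocab_size)]
        return scores, new_state


def combined_scores(
    scorers: Dict[str, Any],
    weights: Dict[str, float],
    prefix: List[int],
    states: Dict[str, Any],
) -> Tuple[List[float], Dict[str, Any]]:
    """Sum the weighted scores and collect each scorer's next state."""
    total: List[float] = []
    new_states: Dict[str, Any] = {}
    for name, scorer in scorers.items():
        scores, new_states[name] = scorer.score(prefix, states[name])
        weighted = [weights[name] * s for s in scores]
        total = weighted if not total else [a + b for a, b in zip(total, weighted)]
    return total, new_states


scorers = {"lm": ToyRecurrentLM(4), "length_bonus": LengthBonus(4)}
weights = {"lm": 1.0, "length_bonus": 0.5}
states = {name: scorer.init_state() for name, scorer in scorers.items()}
total, states = combined_scores(scorers, weights, [1], states)
```

Keeping one state entry per scorer lets each scorer run independently, while the search loop only has to sum weighted scores.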
OK, I see how to do it. For simplicity, I will add a new option like
So first |
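As a generic illustration of the local pruning under discussion, and not necessarily the approach taken in this PR: a first scorer ranks the whole vocabulary, only the best few candidates are kept, and a more expensive second scorer is evaluated on those candidates alone. The names `prune_then_rescore`, `toy_partial_scorer`, `pre_beam_size`, and `partial_weight`, as well as the toy score values, are assumptions for this sketch.

```python
import heapq
from typing import Callable, Dict, List, Sequence


def prune_then_rescore(
    full_scores: Sequence[float],
    partial_scorer: Callable[[List[int]], Dict[int, float]],
    pre_beam_size: int,
    partial_weight: float,
) -> Dict[int, float]:
    """Score only the top candidates with an expensive second scorer."""
    # Local pruning: keep the token ids with the highest first-pass scores.
    candidates = heapq.nlargest(
        pre_beam_size, range(len(full_scores)), key=lambda tok: full_scores[tok]
    )
    # The second scorer sees only the surviving candidates.
    partial = partial_scorer(candidates)
    return {
        tok: full_scores[tok] + partial_weight * partial[tok] for tok in candidates
    }


# Toy data (hypothetical); values are exactly representable as floats.
decoder_scores = [-0.5, -2.0, -1.0, -3.0]
ctc_table = {0: -1.0, 1: -0.25, 2: -0.5, 3: -0.125}
requested: List[int] = []


def toy_partial_scorer(tokens: List[int]) -> Dict[int, float]:
    requested.extend(tokens)  # record which tokens were actually scored
    return {tok: ctc_table[tok] for tok in tokens}


combined = prune_then_rescore(decoder_scores, toy_partial_scorer, 2, 0.5)
```

Here the second scorer is asked about two tokens instead of four, which is the saving that local pruning is meant to provide.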
@kan-bayashi I have a suggestion for
# --extend-ignore for wip files for flake8-docstrings
flake8 --extend-ignore=D espnet test utils
# white list of files that should support flake8-docstrings
flake8 \
espnet/nets/beam_search.py \
espnet/nets/lm_interface.py \
espnet/nets/scorer_interface.py \
...
See how to disable flake8-docstrings temporarily: https://stackoverflow.com/questions/41860123/how-to-ignore-installed-flake8-plugin-easily-for-one-time
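For files placed on the white list, a docstring along the following lines should satisfy the most common flake8-docstrings checks (the D codes come from pydocstyle). This is only a sketch: the function `length_bonus` is hypothetical, and the exact set of enforced rules depends on the configured docstring convention.

```python
"""Toy module illustrating docstrings that pass common pydocstyle checks."""


def length_bonus(n_tokens: int, weight: float = 1.0) -> float:
    """Return a length bonus proportional to the number of tokens.

    Args:
        n_tokens: Number of tokens in the hypothesis.
        weight: Multiplier applied to the token count.

    Returns:
        The weighted length bonus.

    """
    return weight * n_tokens
```

The summary line is a single sentence ending with a period and is separated from the details by a blank line, which is what the most frequently triggered D rules look for.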
That’s a nice idea.
I agree. I made the black list. No more invalid docstrings 🍺 Now, I need to fix ppl things in
I believe it is done. Additionally, I split the flake8 part in
Fixed!
LGTM
Co-Authored-By: Tomoki Hayashi <hayashi.tomoki@g.sp.m.is.nagoya-u.ac.jp>
Many thanks!
This PR refactors beam search as discussed in #489
espnet.nets.pytorch_backend.lm.legacy:LegacyRNNLM
I will implement these essential features so as not to break current functionality:
lm_train.py --model-module xxx
asr_recog.py --lm xxx
(since it is not --rnnlm anymore; I will leave the existing --rnnlm as-is.)
I will not support the other existing features with custom LMs, such as ASR training with custom LMs, GPU batch decoding, streaming decoding, word LM, etc., in this PR.
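A `module:Class` string such as `espnet.nets.pytorch_backend.lm.legacy:LegacyRNNLM` can be resolved to a class at run time with the standard library. The sketch below shows one way an option like `--model-module` could be handled; the helper name `load_class` is an assumption for this example and is not claimed to be the function used in this PR.

```python
import importlib


def load_class(spec: str):
    """Resolve a 'package.module:ClassName' string to the class object."""
    module_name, sep, class_name = spec.partition(":")
    if not sep or not module_name or not class_name:
        raise ValueError(f"expected 'module:Class', got {spec!r}")
    # Import the module by its dotted path, then look up the attribute.
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Demonstrated with a standard-library class so the example runs anywhere.
ordered_dict_cls = load_class("collections:OrderedDict")
```

Because the class is looked up by name, a custom LM only needs to be importable; the command-line tools do not have to know about it in advance.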