time sync decoding for asr #4792
- Can you add the result in egs2/librispeech_100/asr1/README.md?
- Does this support GPU inference? (If so, please add some comments.)
- Add some test functions. Please check some examples in the normal beam search.
espnet2/bin/asr_inference.py
Outdated
sos=asr_model.sos,
decoder=asr_model.decoder,
lm=scorers["lm"] if "lm" in scorers else None,
ctc_weight=ctc_weight,
Can you give the weights and scorers the same interface as the normal BeamSearch?
done
espnet2/bin/asr_inference.py
Outdated
if time_sync:
    beam_search = TimeSyncBeamSearch(
        beam_size=beam_size,
        ctc=asr_model.ctc,
Can you add some error handling when there is no ctc?
done
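The check requested in this thread could look roughly like the sketch below. It is not the code that was merged: `require_ctc` is a hypothetical helper name, and the `SimpleNamespace` objects only stand in for a real ASR model that exposes a `ctc` attribute.

```python
from types import SimpleNamespace


def require_ctc(asr_model):
    """Return the model's CTC module, or raise if it is missing.

    Hypothetical helper for illustration; not a function from this PR.
    """
    ctc = getattr(asr_model, "ctc", None)
    if ctc is None:
        raise ValueError(
            "Time-synchronous decoding needs a CTC module, "
            "but asr_model.ctc is missing or None."
        )
    return ctc


# Stand-in objects; a real ASR model would be passed here instead.
model_with_ctc = SimpleNamespace(ctc=object())
model_without_ctc = SimpleNamespace(ctc=None)

ctc_module = require_ctc(model_with_ctc)

try:
    require_ctc(model_without_ctc)
except ValueError as err:
    message = str(err)  # the error explains why decoding cannot start
```

Failing early with a clear message, before the beam search object is built, avoids a less readable attribute error deep inside decoding.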
espnet/nets/time_sync_beam_search.py
Outdated
lambda: (float("-inf"), float("-inf"))
) # (p_nb, p_b)
tmp = []
for l in hyps:
I recommend adding some logging.debug calls that show how the hypotheses grow at each time step.
done
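For readers following this thread, here is a toy illustration of the idea: a plain CTC prefix beam search in the log domain that keeps a `(p_nb, p_b)` pair per prefix (non-blank and blank ending scores, as in the excerpt above) and logs the surviving hypotheses at every frame. It is a simplified sketch, not the PR's implementation: it uses CTC scores only, with no decoder or LM, and the function names are made up for the example.

```python
import logging
import math
from collections import defaultdict

logger = logging.getLogger(__name__)
NEG_INF = float("-inf")


def logaddexp(a, b):
    """Numerically stable log(exp(a) + exp(b)) that tolerates -inf."""
    if a == NEG_INF:
        return b
    if b == NEG_INF:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def ctc_prefix_beam_search(log_probs, beam_size, blank=0):
    """Toy CTC prefix beam search over per-frame log-probabilities."""
    # The empty prefix starts as "ending in blank" with probability 1.
    hyps = {(): (NEG_INF, 0.0)}
    for t, frame in enumerate(log_probs):
        nxt = defaultdict(lambda: (NEG_INF, NEG_INF))  # (p_nb, p_b)
        for prefix, (p_nb, p_b) in hyps.items():
            p_tot = logaddexp(p_nb, p_b)
            for tok, lp in enumerate(frame):
                if tok == blank:
                    # Blank keeps the prefix and moves mass to p_b.
                    nb, b = nxt[prefix]
                    nxt[prefix] = (nb, logaddexp(b, p_tot + lp))
                elif prefix and tok == prefix[-1]:
                    # Repeated token without a blank collapses into the prefix.
                    nb, b = nxt[prefix]
                    nxt[prefix] = (logaddexp(nb, p_nb + lp), b)
                    # A true repetition is only reachable through a blank.
                    ext = prefix + (tok,)
                    nb, b = nxt[ext]
                    nxt[ext] = (logaddexp(nb, p_b + lp), b)
                else:
                    ext = prefix + (tok,)
                    nb, b = nxt[ext]
                    nxt[ext] = (logaddexp(nb, p_tot + lp), b)
        ranked = sorted(nxt.items(), key=lambda kv: logaddexp(*kv[1]), reverse=True)
        hyps = dict(ranked[:beam_size])
        for prefix, (p_nb, p_b) in hyps.items():
            logger.debug("t=%d prefix=%s p_nb=%.3f p_b=%.3f", t, prefix, p_nb, p_b)
    return [(prefix, logaddexp(p_nb, p_b)) for prefix, (p_nb, p_b) in hyps.items()]


# Two frames over a 3-symbol vocabulary (index 0 is blank).
probs = [[0.1, 0.8, 0.1], [0.8, 0.1, 0.1]]
log_probs = [[math.log(p) for p in frame] for frame in probs]
results = ctc_prefix_beam_search(log_probs, beam_size=2)
best_prefix, best_score = results[0]  # prefix (1,), probability about 0.73
```

Calling `logging.basicConfig(level=logging.DEBUG)` before decoding prints one line per surviving hypothesis per frame, which makes it easy to see where a prefix enters or leaves the beam.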
espnet/nets/time_sync_beam_search.py
Outdated
beam_size: num hyps
sos: sos index
ctc: CTC module
pre_beam_ratio: pre_beam_ratio * beam_size = pre_beam
It would be better to explain pre_beam in more detail.
done
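As a rough illustration of the docstring formula above (`pre_beam_ratio * beam_size = pre_beam`), the sketch below assumes that pre_beam is the number of top-scoring tokens kept per frame before the more expensive scoring is applied. That reading, the helper name `pre_beam_candidates`, and the clamping to the vocabulary size are assumptions for the example, not details confirmed by the PR.

```python
def pre_beam_candidates(frame_log_probs, beam_size, pre_beam_ratio):
    """Return indices of the top `pre_beam` tokens for one frame.

    Hypothetical helper: pre_beam = int(pre_beam_ratio * beam_size),
    clamped to the range [1, vocabulary size].
    """
    pre_beam = int(pre_beam_ratio * beam_size)
    pre_beam = max(1, min(pre_beam, len(frame_log_probs)))
    ranked = sorted(
        range(len(frame_log_probs)),
        key=lambda i: frame_log_probs[i],
        reverse=True,
    )
    return ranked[:pre_beam]


# One frame of log-probabilities over a 6-token vocabulary.
frame = [-0.2, -3.0, -1.5, -4.0, -2.5, -5.0]
candidates = pre_beam_candidates(frame, beam_size=2, pre_beam_ratio=1.5)
# int(1.5 * 2) = 3, so the three best tokens survive: [0, 2, 4]
```

Under this reading, a larger ratio trades decoding speed for a smaller chance of pruning a token that the full scorers would have preferred.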
@YosukeHiguchi, could you also review this PR?
This pull request is now in conflict :(
…net-ml into time_sync_asr_pr
Codecov Report
@@            Coverage Diff             @@
##           master    #4792      +/-   ##
==========================================
+ Coverage   80.62%   80.66%   +0.04%
==========================================
  Files         535      536       +1
  Lines       47379    47517     +138
==========================================
+ Hits        38197    38329     +132
- Misses       9182     9188       +6
@sw005320 I think this passed CI except for an unrelated test.
Can you add a unit test?
Sorry for the delay. I added a unit test; it is very basic and just checks that the beam search produces some result. I do not compare against any hand-computed values.
I have two comments:
- Naming: following beam_search.py and beam_search_transducer.py, it would be better to name it beam_search_timesync.py. Please also change the other names accordingly (e.g., the class name TimeSyncBeamSearch --> BeamSearchTimeSync).
- I'm thinking about the best place to put this function. We generally want to add new functions under the espnet2 directory (e.g., espnet2/asr). At the same time, keeping it in its current place also makes sense, since it puts all beam search-related functions in the same directory. I think we can leave it as you have it, but I wanted to add this remark for future beam search design.
Thanks @sw005320. This is fixed now.
Thanks, @brianyan918 !
This PR supports time synchronous decoding for ASR.
Librispeech_100 results:
WER
RTF
Note on RTF: