Add InterCTC to E-Branchformer encoder, and the ability to save InterCTC inference output to files #5084
…ate CTC predictions to file during ASR inference
for more information, see https://pre-commit.ci
Thanks!
token_list = self.beam_search.token_list

for layer_idx, encoder_out in intermediate_outs:
    y = self.asr_model.ctc.argmax(encoder_out)[0]  # batch_size = 1
Does the intermediate CTC only support greedy decoding? I think that's fine.
Yes. I think greedy decoding can probably cover most use cases.
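For readers unfamiliar with the greedy decoding discussed above, here is a minimal, framework-free sketch: take the per-frame argmax, collapse consecutive repeats, then drop blanks. It is not the code from this PR. The function names, the toy `token_list`, and the choice of index 0 as the blank symbol are assumptions made for illustration.

```python
from itertools import groupby
from typing import List, Sequence

BLANK_ID = 0  # assumption: the blank symbol sits at index 0 of the token list


def greedy_ctc_decode(
    frame_scores: Sequence[Sequence[float]], blank_id: int = BLANK_ID
) -> List[int]:
    """Greedy (best-path) CTC decoding for a single utterance.

    frame_scores holds one score vector per frame (probabilities or logits).
    """
    # 1. Frame-wise argmax: pick the highest-scoring token for every frame.
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    # 2. Collapse runs of the same token into a single token.
    collapsed = [token for token, _ in groupby(best_path)]
    # 3. Remove blanks, which only serve as separators.
    return [token for token in collapsed if token != blank_id]


def ids_to_tokens(token_ids: Sequence[int], token_list: Sequence[str]) -> List[str]:
    """Map token ids back to their string form."""
    return [token_list[i] for i in token_ids]


if __name__ == "__main__":
    token_list = ["<blank>", "a", "b"]  # hypothetical vocabulary
    scores = [
        [0.10, 0.80, 0.10],  # a
        [0.20, 0.70, 0.10],  # a (repeat, collapsed)
        [0.90, 0.05, 0.05],  # blank (separates the two "a"s)
        [0.10, 0.60, 0.30],  # a
        [0.10, 0.20, 0.70],  # b
        [0.20, 0.10, 0.70],  # b (repeat, collapsed)
        [0.80, 0.10, 0.10],  # blank
    ]
    ids = greedy_ctc_decode(scores)
    print(" ".join(ids_to_tokens(ids, token_list)))
```

The blank between the second and fourth frames is what lets the two `a` tokens survive the collapse step. Without it they would merge into one.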
I think it is good as long as this new feature does not affect the default behavior without interCTC.
Codecov Report
@@            Coverage Diff             @@
##           master    #5084      +/-   ##
==========================================
- Coverage   75.88%   75.86%   -0.03%
==========================================
  Files         615      615
  Lines       54767    54824      +57
==========================================
+ Hits        41559    41591      +32
- Misses      13208    13233      +25
... and 8 files with indirect coverage changes
Thanks a lot!
Add InterCTC with optional self-conditioning to E-Branchformer encoder
This was originally written by @wanchichen.
The implementation is nearly the same as InterCTC in `TransformerEncoder` and `ConformerEncoder`.

Add the ability to write predictions of InterCTC layer(s) to file during inference
The output is saved at `path/to/asr_decode_dir/test_set/logdir/output.*/1best_recog/encoder_interctc_layer<layer_idx>.txt`.
`<layer_idx>` corresponds to the indices set in the model config file.
For example, if the config contains the line `interctc_layer_idx: [6,12]`, then there will be two files containing the output of the 6th and the 12th encoder layers: `encoder_interctc_layer6.txt` and `encoder_interctc_layer12.txt`.
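The naming scheme above can be sketched as a small standalone script that writes one file per configured layer index. It is not the writer used in this PR. The helper names and the `<utt_id> <hypothesis>` line format are assumptions made for illustration; only the `encoder_interctc_layer<layer_idx>.txt` pattern comes from the description.

```python
import tempfile
from pathlib import Path
from typing import Dict, Iterable, Union


def interctc_output_paths(
    output_dir: Union[str, Path], layer_indices: Iterable[int]
) -> Dict[int, Path]:
    """Build one output path per InterCTC layer index."""
    return {
        idx: Path(output_dir) / f"encoder_interctc_layer{idx}.txt"
        for idx in layer_indices
    }


def write_interctc_predictions(
    output_dir: Union[str, Path], predictions: Dict[int, Dict[str, str]]
) -> Dict[int, Path]:
    """Write predictions given as {layer_idx: {utt_id: hypothesis}}.

    Assumption: each line is "<utt_id> <hypothesis>"; the real file
    layout may differ.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = interctc_output_paths(out, predictions.keys())
    for idx, path in paths.items():
        with path.open("w", encoding="utf-8") as f:
            for utt_id, hyp in sorted(predictions[idx].items()):
                f.write(f"{utt_id} {hyp}\n")
    return paths


if __name__ == "__main__":
    # Mirrors a config containing `interctc_layer_idx: [6,12]`.
    demo = {
        6: {"utt1": "h e l o"},
        12: {"utt1": "h e l l o"},
    }
    with tempfile.TemporaryDirectory() as tmp:
        for idx, path in write_interctc_predictions(tmp, demo).items():
            print(idx, path.name)
```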

Misc
The code in this PR was used in many of my Aphasia speech detection experiments and produced good results. I also added more tests to make sure ASR without InterCTC still works.
@sw005320 In our last meeting I mentioned a bug I had encountered in my experiments, but it looks like it has already been patched.