[ESPnet2] Intermediate/Self-conditioned CTC #4084
@brianyan918, can I ask you to review this PR?
I believe this implementation is well done. I agree with passing the pointer to the CTC module in forward() rather than during init. I still need to make the same fix in espnet1, because the init-time approach there would cause issues when using multi-GPU.
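To illustrate the design point above, here is a minimal, dependency-free sketch of an encoder that receives the CTC module as an argument to `forward()` instead of storing it at construction time. It is not the code from this PR: `ToyEncoder`, `interctc_layer_idx`, and the list-of-floats "features" are hypothetical stand-ins, and the `ctc` callable is assumed to return a conditioning vector of the same size as its input.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]


class ToyEncoder:
    """Hypothetical encoder; it never stores a reference to the CTC module."""

    def __init__(self, layers: Sequence[Layer], interctc_layer_idx: Sequence[int]):
        self.layers = list(layers)
        # 1-based indices of layers whose outputs feed an intermediate CTC branch.
        self.interctc_layer_idx = set(interctc_layer_idx)

    def forward(
        self, x: Vector, ctc: Optional[Layer] = None
    ) -> Tuple[Vector, List[Tuple[int, Vector]]]:
        intermediate_outs: List[Tuple[int, Vector]] = []
        for idx, layer in enumerate(self.layers, start=1):
            x = layer(x)
            if idx in self.interctc_layer_idx:
                # Keep a copy of this layer's output for the intermediate loss.
                intermediate_outs.append((idx, list(x)))
                if ctc is not None:
                    # Self-conditioning (toy version): add the CTC-derived
                    # vector back onto the features for the next layer.
                    x = [xi + ci for xi, ci in zip(x, ctc(x))]
        return x, intermediate_outs
```

With `ctc=None` the encoder only collects intermediate outputs (InterCTC-style); passing a callable additionally enables the self-conditioning path, and the encoder itself holds no state tied to the CTC module.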
First results on LibriSpeech-100h (without an LM or beam-search decoding):
Overall, the results are really good, except that the SC-CTC results are slightly worse than InterCTC's, which contradicts our observations on espnet1 (https://arxiv.org/pdf/2110.05249.pdf). I will look over my implementation again and try to tune the models.
Codecov Report
@@           Coverage Diff           @@
##           master    #4084   +/-   ##
=======================================
  Coverage   80.94%   80.94%
=======================================
  Files         435      435
  Lines       37425    37651   +226
=======================================
+ Hits        30294    30477   +183
- Misses       7131     7174    +43
Can you add a test?
Thanks, @YosukeHiguchi!
This PR adds implementations of Intermediate/Self-conditioned CTC (Inter/SC-CTC) to espnet2, along with a strong baseline configuration for training a pure Conformer-CTC model. I'm still running some experiments to verify that the implementation is correct; after that, I will add the results.
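For context on how an intermediate CTC objective is typically formed, the sketch below blends the final-layer CTC loss with the mean of the intermediate-layer losses using a single weight. This assumes the weighted-average formulation commonly used for InterCTC and is not taken from this PR's code; the function name `combine_ctc_losses` and the parameter `interctc_weight` are illustrative.

```python
from typing import Sequence


def combine_ctc_losses(
    final_loss: float,
    inter_losses: Sequence[float],
    interctc_weight: float,
) -> float:
    """Blend the final-layer CTC loss with the mean intermediate-layer loss."""
    if not 0.0 <= interctc_weight <= 1.0:
        raise ValueError("interctc_weight must be in [0, 1]")
    if not inter_losses:
        # No intermediate branches: fall back to the plain CTC loss.
        return final_loss
    # Average over all intermediate branches (assumed formulation).
    inter_mean = sum(inter_losses) / len(inter_losses)
    return (1.0 - interctc_weight) * final_loss + interctc_weight * inter_mean
```

Setting `interctc_weight` to 0 recovers the plain CTC objective, so the same training code can serve as the baseline.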