add fastconformer timestamp (reopen) #8438
Signed-off-by: biscayan <skgudwn34@gmail.com>
for more information, see https://pre-commit.ci
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days. |
Do you remember how you came up with model_stride_sec = 0.08? Link to the line. This needs more investigation into timestamp generation for the FastConformer model.
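For context on the question above, here is a minimal sketch of one plausible origin for the 0.08 value; the thread itself does not confirm it. It assumes a 10 ms feature hop and an 8x encoder subsampling factor for FastConformer (4x for Conformer). The helper names `model_stride_sec` and `frames_to_seconds` and the frame indices are hypothetical, not part of the PR.

```python
import math


def model_stride_sec(window_stride_sec: float, subsampling_factor: int) -> float:
    """Seconds of audio covered by one encoder output frame."""
    return window_stride_sec * subsampling_factor


def frames_to_seconds(frame_indices, stride_sec):
    """Convert encoder frame indices to timestamps in seconds."""
    return [round(i * stride_sec, 2) for i in frame_indices]


# Assumption: the feature extractor hops 10 ms between frames.
FEATURE_HOP_SEC = 0.01

# Assumption: FastConformer subsamples by 8, Conformer by 4.
fastconformer_stride = model_stride_sec(FEATURE_HOP_SEC, 8)
conformer_stride = model_stride_sec(FEATURE_HOP_SEC, 4)

# Hypothetical frame indices at which three words start.
word_start_frames = [0, 5, 12]
print(frames_to_seconds(word_start_frames, fastconformer_stride))
print(frames_to_seconds(word_start_frames, conformer_stride))
```

Under these assumptions the same frame index maps to a timestamp twice as late for FastConformer as for Conformer, so using the wrong stride would shift every word boundary.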
@tango4j which other CTC models were you comparing against, and are they of similar size?
This PR was closed because it has been inactive for 7 days since being marked as stale. |
@KunalDhawan please test this PR on a known set and compare it against a similarly sized Conformer model.
Also add an integration test for testing purposes. Testing on an English dataset is good enough.
Hi @biscayan,
I ran offline diarization+ASR with timestamps using your proposed changes for the following 2 models:
- conformer-ctc-large: 120M params trained on 24.5K hours of English speech
- fastconformer-ctc-large: 115M params trained on 24.5K hours of English speech
Both models have similar parameter counts and were trained on the same data.
For evaluation, I used a cleaned version of CallHome-109.
Following is the performance of both the models:
- conformer-ctc-large:
  - DER: 0.0474
  - FA: 0.0111
  - MISS: 0.0285
  - CER: 0.0078
  - Spk. counting acc.: 0.9083
  - cpWER: 0.4640
  - WER: 0.2383
- fastconformer-ctc-large:
  - DER: 0.0474
  - FA: 0.0111
  - MISS: 0.0285
  - CER: 0.0078
  - Spk. counting acc.: 0.9083
  - cpWER: 0.5158
  - WER: 0.3153
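To put the gap between the two result sets above in relative terms, here is a small sketch that computes the absolute and relative degradation of fastconformer-ctc-large against conformer-ctc-large. The numbers are copied from the results above; the `degradation` helper is hypothetical and only illustrates the arithmetic.

```python
# Word-level metrics reported above (the diarization metrics are identical
# for both models, so only WER and cpWER differ).
conformer = {"cpWER": 0.4640, "WER": 0.2383}
fastconformer = {"cpWER": 0.5158, "WER": 0.3153}


def degradation(baseline: float, candidate: float):
    """Return (absolute, relative) increase of candidate over baseline."""
    abs_diff = candidate - baseline
    return abs_diff, abs_diff / baseline


results = {
    metric: degradation(conformer[metric], fastconformer[metric])
    for metric in conformer
}

for metric, (abs_diff, rel_diff) in results.items():
    print(f"{metric}: +{abs_diff:.4f} absolute, +{rel_diff:.1%} relative")
```

This works out to roughly a 32% relative increase in WER and an 11% relative increase in cpWER for the fastconformer model in the chunked setting.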
Clearly, the fastconformer model with the parameters proposed in this PR does not perform as expected. I also verified the performance of the conformer and fastconformer models on non-chunked ASR (librispeech test); there the fastconformer model performs better than the conformer model, as expected.
Looking at the model predictions on the CH109 samples used above, the fastconformer model appears to miss a few words in chunk-based prediction, which leads to the high observed WER and hence cpWER. Chunked ASR with fastconformer in offline_diar_with_asr_infer would have to be fixed before this script can support timestamp calculation with fastconformer.
Because of the reasons highlighted above, we'll close this incomplete PR and open another one with all the fixes.
Hi @KunalDhawan |
What does this PR do ?
I closed pull request #8341 because of many conflicts in the branch.
I tried to resolve them, but in the end I deleted the branch, created a new one, and added a sign-off to the commit.
Sorry for my mistakes.
Collection: [Note which collection this PR will affect]
Changelog
Usage
# Add a code snippet demonstrating how to use this
Jenkins CI
To run Jenkins, a NeMo User with write access must comment
jenkins
on the PR.
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
@titu1994 @nithinraok @tango4j
Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.
Additional Information