
add fastconformer timestamp (reopen) #8438

Closed
wants to merge 5 commits

Conversation

@biscayan commented Feb 16, 2024

What does this PR do?

I closed pull request #8341 because of the many conflicts in its branch.
I tried to resolve them, but in the end I deleted the branch, created a new one, and added the sign-off to the commit.
Sorry for my mistakes.

Collection: [Note which collection this PR will affect]

Changelog

  • Add specific line by line info of high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 
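
A minimal illustrative sketch of the kind of usage this enables (the helper and token format are hypothetical, not NeMo's actual API): grouping greedy-CTC token offsets into word-level timestamps with the 0.08 s FastConformer stride.

```python
# Illustrative only: group greedy-CTC token offsets into word-level timestamps.
# "_" stands in for the SentencePiece word-boundary marker; 0.08 s per output
# frame is the FastConformer stride this PR sets via model_stride_sec.

MODEL_STRIDE_SEC = 0.08  # 80 ms per output frame for FastConformer

def words_with_timestamps(tokens, frame_offsets, stride=MODEL_STRIDE_SEC):
    """tokens: decoded sub-words ("_" marks a word start); frame_offsets: their output-frame indices."""
    words = []
    for tok, off in zip(tokens, frame_offsets):
        t = off * stride
        if tok.startswith("_") or not words:      # new word begins
            words.append({"word": tok.lstrip("_"), "start": t, "end": t + stride})
        else:                                     # continuation of the current word
            words[-1]["word"] += tok
            words[-1]["end"] = t + stride
    return words

# "hello" starts at ~0.8 s, "world" at ~2.4 s (up to float rounding).
print(words_with_timestamps(["_he", "llo", "_world"], [10, 12, 30]))
```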

Jenkins CI

To run Jenkins, a NeMo User with write access must comment jenkins on the PR.

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open a "Draft" PR.

Who can review?

@titu1994 @nithinraok @tango4j
Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.

Additional Information

  • Related to # (issue)

Signed-off-by: biscayan <skgudwn34@gmail.com>
github-actions bot added the ASR label Feb 16, 2024
github-actions bot (Contributor) commented Mar 2, 2024

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

github-actions bot added the stale label Mar 2, 2024
github-actions bot removed the stale label Mar 5, 2024
@tango4j (Collaborator) commented Mar 5, 2024

@biscayan

Do you remember how you came up with model_stride_sec = 0.08? (Link to the line)
With 0.08 sec, FastConformer is not yielding the right timestamps; it generates slightly longer sample lengths.

This needs more investigation into timestamp generation for the FastConformer model.

@biscayan (Author) commented Mar 6, 2024

@tango4j
Thank you for reviewing.
The FastConformer paper explains that the model uses a downsampling schema with an 80 ms frame rate, meaning each output frame covers 80 ms.
I checked the extracted timestamps while listening to the wav file.
With model_stride_sec=0.04 the timestamps do not match the wav file, but after changing model_stride_sec from 0.04 to 0.08 they match it accurately.
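
As a minimal illustrative sketch (not NeMo's actual API), the stride simply scales output-frame indices into seconds, which is why the wrong value rescales every timestamp:

```python
# Illustrative sketch: map CTC output-frame indices to times.
# FastConformer downsamples 10 ms features by 8x -> 80 ms per output frame,
# so model_stride_sec should be 0.08 (Conformer's 4x downsampling gives 0.04).

def frame_to_seconds(frame_index: int, model_stride_sec: float = 0.08) -> float:
    """Time offset (in seconds) of a given CTC output frame."""
    return frame_index * model_stride_sec

# A word whose first non-blank token appears at output frame 25 starts at
# 25 * 0.08 = 2.0 s; with the old 0.04 value it would be reported at 1.0 s.
print(frame_to_seconds(25))        # 2.0
print(frame_to_seconds(25, 0.04))  # 1.0
```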

@nithinraok (Collaborator) commented:

jenkins

@tango4j (Collaborator) commented Mar 8, 2024

@biscayan
Did you check the cpWER, or did you only check the DER?
When I tested with CallHome data, FastConformer showed around 50% cpWER while the other CTC models show 18~20% errors.
I think there is some issue with the chunk-based ASR system.
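
For reference, a minimal sketch of how cpWER (concatenated minimum-permutation WER) can be computed; the helpers are illustrative, not NeMo's implementation, and it assumes equal reference/hypothesis speaker counts:

```python
from itertools import permutations

def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    d = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev, d[0] = d[0], i
        for j in range(1, len(hyp) + 1):
            cur = d[j]
            d[j] = min(d[j] + 1,                            # deletion
                       d[j - 1] + 1,                        # insertion
                       prev + (ref[i - 1] != hyp[j - 1]))   # substitution
            prev = cur
    return d[-1]

def cpwer(ref_by_spk, hyp_by_spk):
    """Concatenate each speaker's words, then take the speaker permutation
    that minimizes the total word errors."""
    refs = [r.split() for r in ref_by_spk]
    hyps = [h.split() for h in hyp_by_spk]
    n_ref = sum(len(r) for r in refs)
    best = min(
        sum(edit_distance(r, hyps[p]) for r, p in zip(refs, perm))
        for perm in permutations(range(len(hyps)))
    )
    return best / max(n_ref, 1)

# cpWER picks the best speaker mapping, so a consistent label swap scores 0.0;
# words attributed to the wrong speaker mid-conversation still count as errors.
print(cpwer(["hello there", "good morning"], ["good morning", "hello there"]))  # 0.0
```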

@biscayan (Author) commented:
@tango4j
I didn't compute the cpWER for the FastConformer model.
I only compared the ASR result with the transcript on clean speech data, and the result is almost accurate.
However, I also ran into the problem on real reverberated meeting data, where the model's ASR performance is low.

@nithinraok
Copy link
Collaborator

@tango4j which other CTC models were you comparing, and are both of similar size?

@tango4j (Collaborator) commented Mar 11, 2024

@nithinraok
Model size does not affect the timestamp issue with chunk-based ASR. Only FastConformer CTC models return inaccurate timestamps with chunk-based ASR.
We use Conformer-CTC large (stt_en_conformer_ctc_large) for testing purposes.

@biscayan (Author) commented:
@tango4j
Is there anything more I should do for the review?

github-actions bot (Contributor) commented:
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

github-actions bot added the stale label Mar 30, 2024
github-actions bot (Contributor) commented Apr 6, 2024

This PR was closed because it has been inactive for 7 days since being marked as stale.

github-actions bot closed this Apr 6, 2024
@nithinraok (Collaborator) commented:
@KunalDhawan please test this PR on a known set to compare against a similarly sized Conformer model.

@nithinraok reopened this May 8, 2024
@nithinraok (Collaborator) commented:
Also add an integration test for testing purposes! Testing on an English dataset is good enough.

@KunalDhawan (Collaborator) left a comment:

Hi @biscayan,
I ran offline diarization+ASR with timestamps using your proposed changes for the following 2 models:

  1. conformer-ctc-large: 120M params trained on 24.5K hours of English speech
  2. fastconformer-ctc-large: 115M params trained on 24.5K hours of English speech

Both models have a similar number of parameters and were trained on the same data.

For evaluation, I used a cleaned version of CallHome-109.

The performance of the two models is as follows:

  1. conformer-ctc-large:
    DER : 0.0474
    FA : 0.0111
    MISS : 0.0285
    CER : 0.0078
    Spk. counting acc. : 0.9083
    cpWER : 0.4640
    WER : 0.2383

  2. fastconformer-ctc-large:
    DER : 0.0474
    FA : 0.0111
    MISS : 0.0285
    CER : 0.0078
    Spk. counting acc. : 0.9083
    cpWER : 0.5158
    WER : 0.3153

Clearly, the FastConformer model with the parameters proposed in this PR doesn't perform up to expectations. I also verified the performance of the Conformer and FastConformer models on non-chunked ASR (LibriSpeech test); there, the FastConformer model performs better than the Conformer model, as expected.

Looking at the model predictions on the CH109 samples used above, it appears that the FastConformer model misses a few words in the chunk-based prediction, leading to the high observed WER and hence cpWER. Chunked ASR with FastConformer in offline_diar_with_asr_infer would have to be fixed before we can support timestamp calculation with FastConformer in this script.

For the reasons highlighted above, we'll close this incomplete PR and open another one with all the fixes.

@biscayan (Author) commented:
Hi @KunalDhawan,
Thank you for your report.
I will continue working on the PR later.
Thank you.
