Add InterCTC to E-Branchformer encoder, and the ability to save InterCTC inference output to files #5084

Merged 5 commits on Apr 7, 2023

@tjysdsg tjysdsg commented Apr 6, 2023

Add InterCTC with optional self-conditioning to E-Branchformer encoder

This was originally written by @wanchichen.
The implementation is nearly the same as InterCTC in TransformerEncoder and ConformerEncoder.

Add the ability to write predictions of InterCTC layer(s) to file during inference

The output is saved at path/to/asr_decode_dir/test_set/logdir/output.*/1best_recog/encoder_interctc_layer<layer_idx>.txt.

<layer_idx> corresponds to the indices set in the model config file.

For example, if the config contains the line interctc_layer_idx: [6,12], then there will be two files, one for the output of the 6th encoder layer and one for the 12th: encoder_interctc_layer6.txt and encoder_interctc_layer12.txt.
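
As a rough illustration of the naming scheme above, the sketch below builds the expected file paths from a list of layer indices. The helper name `interctc_output_paths` and the example directory are hypothetical and not part of ESPnet; only the `encoder_interctc_layer<layer_idx>.txt` pattern comes from this PR.

```python
from pathlib import Path
from typing import List


def interctc_output_paths(recog_dir: str, layer_indices: List[int]) -> List[Path]:
    """Return the expected InterCTC output file for each configured layer.

    Hypothetical helper for illustration only; not part of ESPnet.
    """
    return [
        Path(recog_dir) / f"encoder_interctc_layer{idx}.txt"
        for idx in layer_indices
    ]


# Example: the model config contains `interctc_layer_idx: [6, 12]`.
paths = interctc_output_paths("1best_recog", [6, 12])
print([p.name for p in paths])
```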

Misc

The code in this PR was used in many of my Aphasia speech detection experiments and produced good results. I also added more tests to make sure ASR without InterCTC still works.

@sw005320 In our last meeting I mentioned a bug I had encountered in my experiments, but it looks like it has already been patched.

@mergify mergify bot added the ESPnet2 label Apr 6, 2023

sw005320 commented Apr 6, 2023

Thanks!
@pyf98, can you review this PR?

@sw005320 sw005320 added Enhancement Enhancement ASR Automatic speech recognition labels Apr 6, 2023
@sw005320 sw005320 added this to the v.202303 milestone Apr 6, 2023
token_list = self.beam_search.token_list

for layer_idx, encoder_out in intermediate_outs:
    y = self.asr_model.ctc.argmax(encoder_out)[0]  # batch_size = 1
Collaborator

The intermediate CTC only supports greedy decoding? I think it's fine.

Contributor Author

Yes. I think greedy decoding can probably cover most use cases
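
For readers unfamiliar with the term, greedy CTC decoding as discussed in this thread can be sketched as follows: take the best-scoring token per frame, merge consecutive repeats, then drop blanks. This is a generic, dependency-free sketch and not the code from this PR; the function name `ctc_greedy_decode`, the list-of-lists score format, and `blank_id=0` are assumptions made for illustration. The diff above uses `self.asr_model.ctc.argmax(encoder_out)` for the per-frame argmax step.

```python
from itertools import groupby
from typing import List, Sequence


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    token_list: Sequence[str],
    blank_id: int = 0,  # assumption: the blank token has index 0
) -> List[str]:
    """Greedy CTC decoding for a single utterance.

    frame_scores[t][v] is the score of token v at frame t.
    """
    # 1. Pick the best-scoring token id at every frame.
    best_ids = [
        max(range(len(frame)), key=lambda v: frame[v]) for frame in frame_scores
    ]
    # 2. Merge consecutive repeats of the same token id.
    collapsed = [token_id for token_id, _ in groupby(best_ids)]
    # 3. Drop blanks and map the remaining ids to tokens.
    return [token_list[i] for i in collapsed if i != blank_id]


tokens = ["<blank>", "a", "b"]
scores = [
    [0.1, 0.8, 0.1],    # a
    [0.2, 0.7, 0.1],    # a (repeat, merged with the previous frame)
    [0.9, 0.05, 0.05],  # blank (separates the two "a" tokens)
    [0.1, 0.6, 0.3],    # a
    [0.1, 0.2, 0.7],    # b
]
print(ctc_greedy_decode(scores, tokens))  # ['a', 'a', 'b']
```

Because each frame is decided independently, no beam or language model is involved, which is why this form of decoding is cheap enough to run on every intermediate layer.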

pyf98 commented Apr 6, 2023

I think it is good as long as this new feature does not affect the default behavior without interCTC.

codecov bot commented Apr 7, 2023

Codecov Report

Merging #5084 (6b33255) into master (78db412) will decrease coverage by 0.03%.
The diff coverage is 88.33%.

@@            Coverage Diff             @@
##           master    #5084      +/-   ##
==========================================
- Coverage   75.88%   75.86%   -0.03%     
==========================================
  Files         615      615              
  Lines       54767    54824      +57     
==========================================
+ Hits        41559    41591      +32     
- Misses      13208    13233      +25     
Flag                        Coverage Δ
test_integration_espnet1    66.29% <ø> (ø)
test_integration_espnet2    48.44% <21.66%> (-0.06%) ⬇️
test_python                 65.78% <81.66%> (-0.08%) ⬇️
test_utils                  23.28% <ø> (ø)

Impacted Files                                   Coverage Δ
espnet2/asr/encoder/e_branchformer_encoder.py    93.87% <87.50%> (-1.28%) ⬇️
espnet2/bin/asr_inference.py                     87.91% <89.28%> (+0.33%) ⬆️

... and 8 files with indirect coverage changes

@sw005320 sw005320 added the auto-merge Enable auto-merge label Apr 7, 2023

sw005320 commented Apr 7, 2023

Thanks a lot!

@mergify mergify bot merged commit a505a23 into espnet:master Apr 7, 2023
25 checks passed
@tjysdsg tjysdsg deleted the interctc_patch branch April 9, 2023 17:19