Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] E-Branchformer Encoder in ESPnet2 #4812

Merged
merged 11 commits into from Dec 8, 2022
Merged

Conversation

kkim-asapp
Copy link
Contributor

Hi,
This PR is about the E-Branchformer encoder (Kim et al., SLT 2022: will be appeared, Arxiv: link).

The module is implemented in ESPnet2 with the base recipe for Librispeech ASR.
Also, I've adding an argument in Repeat module to support LayerDrop (link).

This PR is in progress, and I will update results on README file.

@sw005320 sw005320 added New Features ASR Automatic speech recogntion labels Dec 6, 2022
@sw005320 sw005320 added this to the v.202211 milestone Dec 6, 2022
x_input (Union[Tuple, torch.Tensor]): Input tensor w/ or w/o pos emb.
- w/ pos emb: Tuple of tensors [(#batch, time, size), (1, time, size)].
- w/o pos emb: Tensor (#batch, time, size).
mask (torch.Tensor): Mask tensor for the input (#batch, time).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The shape of mask is (#batch, 1, time)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kkim-asapp Is the shape correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @pyf98! You're right. It was wrong. Let me update it.

Copy link
Collaborator

@pyf98 pyf98 Dec 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember it was wrong in Conformer (or other encoders). If so, can you update it as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will do.

@pyf98
Copy link
Collaborator

pyf98 commented Dec 6, 2022

Looks good to me! I'm looking forward to seeing the new results and models.

Since the repeat function is updated to support the new Layer Drop, we may need to update the other encoders as well (probably in a later PR?).

@sw005320
Copy link
Contributor

sw005320 commented Dec 6, 2022

@kkim-asapp, thanks a lot for the great PR!
Can you fix the format error?
https://github.com/espnet/espnet/actions/runs/3632971395/jobs/6129594548#step:8:12

@kkim-asapp
Copy link
Contributor Author

Looks good to me! I'm looking forward to seeing the new results and models.

Since the repeat function is updated to support the new Layer Drop, we may need to update the other encoders as well (probably in a later PR?).

Right! Currently, I didn't touch any parts of stochastic_depth_rate which has similar purpose of this new Layer Drop.
If we agree that we don't need stochastic_depth_rate anymore, then I will replace them with this new method as well.

@kkim-asapp
Copy link
Contributor Author

@kkim-asapp, thanks a lot for the great PR! Can you fix the format error? https://github.com/espnet/espnet/actions/runs/3632971395/jobs/6129594548#step:8:12

I did!

@kkim-asapp
Copy link
Contributor Author

espnet/nets/pytorch_backend/conformer/contextual_block_encoder_layer.py
All lines in this file has ^M at the end. I don't think this is what we want, so I removed them.

@sw005320
Copy link
Contributor

sw005320 commented Dec 7, 2022

You can run some of the test scripts locally instead of checking it via CI.
https://github.com/espnet/espnet/blob/master/CONTRIBUTING.md#4-unit-testing

@kkim-asapp
Copy link
Contributor Author

You can run some of the test scripts locally instead of checking it via CI. https://github.com/espnet/espnet/blob/master/CONTRIBUTING.md#4-unit-testing

I did, but the error logs are different. Let me check after reinstall dependencies.

@sw005320
Copy link
Contributor

sw005320 commented Dec 7, 2022

Oh, OK.
It also depends on the version of the test script and it would not be the exactly same.

@codecov
Copy link

codecov bot commented Dec 7, 2022

Codecov Report

Merging #4812 (a294ca6) into master (a9f7c1e) will increase coverage by 11.38%.
The diff coverage is 54.20%.

@@             Coverage Diff             @@
##           master    #4812       +/-   ##
===========================================
+ Coverage   69.06%   80.44%   +11.38%     
===========================================
  Files         526      534        +8     
  Lines       46125    47118      +993     
===========================================
+ Hits        31858    37906     +6048     
+ Misses      14267     9212     -5055     
Flag Coverage Δ
test_integration_espnet1 66.38% <100.00%> (?)
test_integration_espnet2 48.83% <12.77%> (?)
test_python 68.86% <54.20%> (+0.61%) ⬆️
test_utils 23.34% <100.00%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
espnet/nets/pytorch_backend/conformer/encoder.py 93.57% <ø> (ø)
...et/nets/pytorch_backend/conformer/encoder_layer.py 93.10% <ø> (ø)
...kend/transformer/contextual_block_encoder_layer.py 81.30% <ø> (+6.54%) ⬆️
espnet/nets/pytorch_backend/transformer/encoder.py 93.60% <ø> (ø)
.../nets/pytorch_backend/transformer/encoder_layer.py 98.07% <ø> (+3.84%) ⬆️
espnet2/asr/decoder/transformer_decoder.py 94.95% <ø> (ø)
espnet2/asr/encoder/branchformer_encoder.py 94.14% <ø> (ø)
espnet2/asr/encoder/conformer_encoder.py 96.29% <ø> (ø)
...ackend/conformer/contextual_block_encoder_layer.py 6.08% <6.08%> (ø)
espnet2/asr/encoder/e_branchformer_encoder.py 95.09% <95.09%> (ø)
... and 152 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@kkim-asapp
Copy link
Contributor Author

Hi @pyf98, is there anything I can do to fix one last failing check? Maybe I need to fix This branch is out-of-date with the base branch by Merge the latest changes from master into this branch., but I'm not sure.

@sw005320
Copy link
Contributor

sw005320 commented Dec 8, 2022

You don't need it.
I can merge this PR.
Thanks, @kkim-asapp

@sw005320
Copy link
Contributor

sw005320 commented Dec 8, 2022

Please make a follow-up PR to provide an updated README file etc.

@sw005320 sw005320 merged commit e9a940e into espnet:master Dec 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ASR Automatic speech recogntion ESPnet1 ESPnet2 New Features
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants