
fix the gradient backward issue when joint training with s3prl frontend #5159

Merged
merged 1 commit into master on May 22, 2023

Conversation

@simpleoier simpleoier (Collaborator) commented May 5, 2023

In s3prl, `feature_grad_mult` was set to 0, so the feature extractor's forward pass effectively runs in a `torch.no_grad()` context here. That stops gradient back-propagation into the frontend during joint training. The fix is simply to set it to 1 manually.

This PR also removes the encoder layerdrop workaround, because s3prl already sets layerdrop to 0 for all upstreams, e.g. WavLM.
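For readers unfamiliar with `feature_grad_mult`: fairseq/s3prl scale the gradient flowing back into the feature extractor while leaving the forward output unchanged (a `GradMultiply`-style op), so a multiplier of 0 silently freezes the frontend. Below is a minimal pure-Python sketch of that behaviour with a hand-written chain rule; the toy `forward`/`backward` functions and their constants are illustrative, not the real s3prl code:

```python
def forward(x, w):
    """Toy pipeline: a 'feature extractor' followed by a 'downstream' layer."""
    feat = 2.0 * x   # stand-in for the frontend feature extractor
    out = w * feat   # stand-in for the downstream model
    return feat, out

def backward(x, w, grad_out, feature_grad_mult):
    """Hand-written chain rule mirroring GradMultiply's effect."""
    feat = 2.0 * x
    grad_feat = grad_out * w                   # gradient reaching the features
    grad_feat = grad_feat * feature_grad_mult  # GradMultiply: scale it
    grad_x = grad_feat * 2.0                   # gradient into the frontend
    grad_w = grad_out * feat                   # downstream weight still learns
    return grad_x, grad_w

# With feature_grad_mult = 0 (the s3prl default) the frontend gets no gradient:
gx0, gw0 = backward(x=3.0, w=0.5, grad_out=1.0, feature_grad_mult=0.0)
# With the fix (feature_grad_mult = 1) the gradient flows into the frontend:
gx1, gw1 = backward(x=3.0, w=0.5, grad_out=1.0, feature_grad_mult=1.0)
```

Note that the downstream gradient `grad_w` is identical in both cases, which is why the bug only shows up when you actually want to fine-tune the frontend jointly.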

@codecov codecov bot commented May 11, 2023

Codecov Report

Merging #5159 (d3dbd76) into master (4430286) will decrease coverage by 0.01%.
The diff coverage is 33.33%.

@@            Coverage Diff             @@
##           master    #5159      +/-   ##
==========================================
- Coverage   74.99%   74.99%   -0.01%     
==========================================
  Files         618      618              
  Lines       55588    55589       +1     
==========================================
  Hits        41689    41689              
- Misses      13899    13900       +1     
| Flag | Coverage Δ |
| --- | --- |
| test_integration_espnet1 | 66.28% <ø> (-0.01%) ⬇️ |
| test_integration_espnet2 | 47.61% <0.00%> (-0.01%) ⬇️ |
| test_python | 65.45% <33.33%> (-0.01%) ⬇️ |
| test_utils | 23.28% <ø> (ø) |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ |
| --- | --- |
| espnet2/asr/frontend/s3prl.py | 77.27% <33.33%> (-1.19%) ⬇️ |


@sw005320 sw005320 added this to the v.202307 milestone May 19, 2023
@sw005320 sw005320 added Bugfix SSL self-supervised learning labels May 19, 2023
@sw005320 sw005320 requested review from Emrys365 and removed request for pengchengguo May 19, 2023 02:40
@sw005320 sw005320 (Contributor) commented

@Emrys365, can you review this PR?
I hope this solves our issues completely.

@Emrys365 Emrys365 (Collaborator) left a comment

LGTM. I can also verify the gradient issue is resolved on my side.

Comment on lines -54 to -60:

```python
if getattr(
    upstream.upstream, "model", None
) is not None and upstream.upstream.model.__class__.__name__ in [
    "Wav2Vec2Model",
    "HubertModel",
]:
    upstream.upstream.model.encoder.layerdrop = 0.0
```
Collaborator

Why are these lines removed?

Collaborator

I think this is because S3PRL already sets encoder_layerdrop=0 when initializing an upstream, so we no longer need to do this in ESPnet. Am I right?

Collaborator (Author)

Yes, that's correct.
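To make the removed guard concrete, here is a standalone sketch of the logic this PR deletes, using stand-in classes instead of the real s3prl/fairseq objects (no s3prl import; `Wav2Vec2Model` below is a mock with the same attribute shape). Since s3prl now initializes upstreams with layerdrop already at 0, this check is redundant in ESPnet:

```python
class Encoder:
    """Mock of the fairseq transformer encoder that holds `layerdrop`."""
    def __init__(self, layerdrop=0.0):
        self.layerdrop = layerdrop

class Wav2Vec2Model:
    """Mock upstream model; the real guard matches on this class name."""
    def __init__(self, layerdrop=0.0):
        self.encoder = Encoder(layerdrop)

class Upstream:
    """Mock of the s3prl upstream wrapper exposing a `.model` attribute."""
    def __init__(self, model=None):
        self.model = model

def force_zero_layerdrop(upstream):
    """The guard ESPnet used to apply: zero out encoder layerdrop for
    wav2vec 2.0 / HuBERT upstreams so that all layers run every forward."""
    model = getattr(upstream, "model", None)
    if model is not None and model.__class__.__name__ in (
        "Wav2Vec2Model",
        "HubertModel",
    ):
        model.encoder.layerdrop = 0.0

up = Upstream(Wav2Vec2Model(layerdrop=0.1))
force_zero_layerdrop(up)
```

Layerdrop randomly skips encoder layers during training; it must be 0 at feature-extraction time so every forward pass produces features from the full stack, which is why both ESPnet and s3prl want it disabled.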


@sw005320 sw005320 merged commit 53ccddd into espnet:master May 22, 2023
22 of 25 checks passed
@sw005320 sw005320 (Contributor) commented
Thanks a lot for fixing it!
