fix the gradient backward issue when joint training with s3prl frontend #5159
Conversation
Force-pushed from 5c62225 to 33aa097
Force-pushed from 3434492 to d3dbd76
Codecov Report
|          | master | #5159  | +/-    |
|----------|--------|--------|--------|
| Coverage | 74.99% | 74.99% | -0.01% |
| Files    | 618    | 618    |        |
| Lines    | 55588  | 55589  | +1     |
| Hits     | 41689  | 41689  |        |
| Misses   | 13899  | 13900  | +1     |
@Emrys365, can you review this PR?
LGTM. I can also verify the gradient issue is resolved on my side.
if getattr(
    upstream.upstream, "model", None
) is not None and upstream.upstream.model.__class__.__name__ in [
    "Wav2Vec2Model",
    "HubertModel",
]:
    upstream.upstream.model.encoder.layerdrop = 0.0
Why are these lines removed?
I think this is because S3PRL already sets `encoder_layerdrop=0` when initializing an upstream, so we no longer need to do this in ESPnet. Am I right?
Yes, it is correct.
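For reference, a quick way to double-check this is to load an upstream and inspect its layerdrop value directly. This is only a sketch: it assumes the same `upstream.upstream.model` attribute path as the removed snippet above, which may differ across s3prl versions.

```python
# Sketch: confirm that an s3prl upstream already has layer drop disabled,
# which is why the ESPnet-side override shown above could be removed.
# The attribute path mirrors the snippet discussed in this thread.
from s3prl.nn import S3PRLUpstream

upstream = S3PRLUpstream("wav2vec2")
model = getattr(upstream.upstream, "model", None)
if model is not None and hasattr(model, "encoder"):
    print(model.encoder.layerdrop)  # expected to be 0.0 per this discussion
```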
> LGTM. I can also verify the gradient issue is resolved on my side.

Thanks a lot for fixing it!
In s3prl, `feature_grad_mult` was set to 0, so the forward pass runs inside a `torch.no_grad()` context here. That stops gradient back-propagation during joint training. To fix it, we simply set it to 1 manually.

Also in this PR, the encoder layerdrop handling is removed, because s3prl already sets it to 0 for all upstreams, e.g. WavLM.
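For anyone hitting the same problem on an older version, below is a minimal sketch of the workaround described above, not the exact diff in this PR. It assumes the `upstream.upstream.model` attribute path from the snippet discussed earlier and a fairseq-style upstream (e.g. Wav2Vec2Model, HubertModel) that exposes a `feature_grad_mult` attribute.

```python
# Minimal sketch of the fix described above (illustrative, not the PR diff).
# With feature_grad_mult=0, fairseq-style upstreams run the convolutional
# feature extractor under torch.no_grad(), which blocks back-propagation
# during joint training; setting it back to 1.0 restores the gradient path.
def enable_feature_grad(upstream):
    """Re-enable gradient flow through the s3prl upstream."""
    model = getattr(upstream.upstream, "model", None)
    if model is not None and hasattr(model, "feature_grad_mult"):
        model.feature_grad_mult = 1.0
```

A natural place to call such a helper would be right after the frontend constructs its s3prl upstream, before training starts.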