
Fix grad noise v2 #912

Merged
merged 4 commits into from Jun 26, 2019

Conversation

@sw005320
Contributor

sw005320 commented Jun 21, 2019

  • follow the new default parameters by #906
  • modify the interface of the add_gradient_noise function
  • add new-style documentation
  • fix to use .to(param.device) instead of .cuda()
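Based on the bullets above, a minimal sketch of what the revised add_gradient_noise could look like. The signature, default values, and annealing schedule here are assumptions for illustration, not copied from the diff:

```python
import torch


def add_gradient_noise(model, iteration, duration=100, eta=1.0, scale_factor=0.55):
    """Add annealed Gaussian noise to gradients (illustrative sketch).

    The noise scale decays as eta / interval ** scale_factor, where the
    interval grows with the iteration count (assumed schedule).
    """
    interval = (iteration // duration) + 1
    sigma = eta / interval ** scale_factor  # parses as eta / (interval ** scale_factor)
    for param in model.parameters():
        if param.grad is not None:
            noise = torch.randn(param.size()) * sigma
            # .to(param.device) instead of .cuda(): works on CPU and GPU alike
            param.grad += noise.to(param.device)
```

Using .to(param.device) keeps the function device-agnostic, so CPU-only training no longer fails on a hard-coded .cuda() call.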
@sw005320 sw005320 requested a review from creatorscan Jun 21, 2019
@sw005320 sw005320 added the Bugfix label Jun 21, 2019
@sw005320 sw005320 added this to the v.0.4.1 milestone Jun 21, 2019
@codecov

codecov bot commented Jun 21, 2019

Codecov Report

Merging #912 into master will increase coverage by 0.01%.
The diff coverage is 20%.


@@            Coverage Diff            @@
##           master    #912      +/-   ##
=========================================
+ Coverage   50.19%   50.2%   +0.01%     
=========================================
  Files          88      88              
  Lines        9780    9777       -3     
=========================================
  Hits         4909    4909              
+ Misses       4871    4868       -3
Impacted Files                        Coverage        Δ
espnet/asr/pytorch_backend/asr.py     0% <0%>         (ø) ⬆️
espnet/asr/asr_utils.py               26.49% <25%>    (ø) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 58dd322...45980f8. Read the comment docs.

@sw005320

Contributor Author

sw005320 commented Jun 21, 2019

@creatorscan, which is correct: eta / interval ** scale_factor or (eta / interval) ** scale_factor? Can you point me to the reference for this?
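For reference, the two expressions do differ in Python: ** binds tighter than /, so the unparenthesized form divides eta by the power. A quick check (values are arbitrary):

```python
eta, interval, scale_factor = 0.01, 4, 0.55

a = eta / interval ** scale_factor   # equivalent to eta / (interval ** scale_factor)
b = (eta / interval) ** scale_factor

print(a == eta / (interval ** scale_factor))  # True
print(a == b)                                 # False
```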

@sw005320

Contributor Author

sw005320 commented Jun 22, 2019

> @creatorscan, eta / interval ** scale_factor and (eta / interval) ** scale_factor, which is correct? Can you tell me the reference about this?

I got the original reference. It seems that eta / interval ** scale_factor is correct.
The second question is about placement: we should probably move the noise injection to right before optimizer.step(). Now that we use gradient accumulation, the current placement performs the injection #gradient_accum times per parameter update, which yields too-large noise.

@sw005320 sw005320 merged commit b31eea9 into espnet:master Jun 26, 2019
6 of 7 checks passed
  • codecov/patch: 20% of diff hit (target 50.19%)
  • ci/circleci: test-centos7: Your tests passed on CircleCI!
  • ci/circleci: test-debian9: Your tests passed on CircleCI!
  • ci/circleci: test-ubuntu16: Your tests passed on CircleCI!
  • ci/circleci: test-ubuntu18: Your tests passed on CircleCI!
  • codecov/project: 50.2% (+0.01%) compared to 58dd322
  • continuous-integration/travis-ci/pr: The Travis CI build passed
@sw005320 sw005320 mentioned this pull request Jun 26, 2019