How do you keep buffer fixed during gradient steps #20

Closed
zw615 opened this issue Sep 24, 2019 · 1 comment

zw615 commented Sep 24, 2019

Hello!
I've noticed your warning:

logging.warn(('{} contains buffer {}. The buffer will be treated as '
              'a constant and assumed not to change during gradient '
              'steps. If this assumption is violated (e.g., '
              'BatchNorm*d\'s running_mean/var), the computation will '
              'be incorrect.').format(m.__class__.__name__, n))

May I ask how you keep the buffers fixed during gradient steps (e.g., the running mean and running variance in batch norm)? This code only includes LeNet and AlexNet, so it isn't a problem there, but have you run experiments on networks with batch norm?

Thanks a lot!

ssnl (Owner) commented Sep 27, 2019

Hi, the code provided currently does not support batch norm. You can implement batch norm support by either (1) using batch norm always in eval mode (track_running_stats=False) or (2) adding code to track and add autograd graphs for buffers.
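For reference, here is a minimal sketch of option (1), assuming PyTorch's torch.nn API: constructing the batch-norm layers with track_running_stats=False means no running_mean/running_var buffers are registered, so normalization always uses the current batch's statistics and there is no buffer that can silently change during gradient steps. The small network below is a hypothetical example, not code from this repository.

import torch.nn as nn

class SmallConvNet(nn.Module):  # hypothetical example network, not part of this repo
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            # track_running_stats=False: no running_mean/running_var buffers are
            # created, so the "buffer treated as constant" warning does not apply.
            nn.BatchNorm2d(32, track_running_stats=False),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.classifier(x)

net = SmallConvNet()
# Sanity check: the batch-norm layer registers no buffers in this configuration.
print([name for name, _ in net.named_buffers()])  # -> []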

ssnl closed this as completed Sep 27, 2019