How do you keep buffer fixed during gradient steps #20

Closed
zw615 opened this issue Sep 24, 2019 · 1 comment

zw615 commented Sep 24, 2019

Hello!
I've noticed your warning:

logging.warn(('{} contains buffer {}. The buffer will be treated as '
              'a constant and assumed not to change during gradient '
              'steps. If this assumption is violated (e.g., '
              'BatchNorm*d\'s running_mean/var), the computation will '
              'be incorrect.').format(m.__class__.__name__, n))

May I ask how you keep the buffers fixed during gradient steps (e.g., the running mean and running variance in batch norm)? This code only includes LeNet and AlexNet, so it isn't a problem there, but have you run experiments on networks with batch norm?

Thanks a lot!

ssnl (Owner) commented Sep 27, 2019

Hi, the code provided currently does not support batch norm. You can implement batch norm support by either (1) using batch norm always in eval mode (track_running_stats=False) or (2) adding code to track and add autograd graphs for buffers.
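For reference, here is a minimal sketch of option (1), assuming PyTorch's torch.nn API: constructing the batch-norm layers with track_running_stats=False means no running_mean/running_var buffers are registered, so normalization always uses the current batch's statistics and there is no buffer that can silently change during gradient steps. The small network below is a hypothetical example, not code from this repository.

import torch.nn as nn

class SmallConvNet(nn.Module):  # hypothetical example network, not part of this repo
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            # track_running_stats=False: no running_mean/running_var buffers are
            # created, so the "buffer treated as constant" warning does not apply.
            nn.BatchNorm2d(32, track_running_stats=False),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.classifier(x)

net = SmallConvNet()
# Sanity check: the batch-norm layer registers no buffers in this configuration.
print([name for name, _ in net.named_buffers()])  # -> []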

ssnl closed this as completed Sep 27, 2019