
Meaning behind --update-mean-var --train-beta-gamma #40

Closed

Tamme opened this issue Feb 9, 2018 · 3 comments

Comments

@Tamme

Tamme commented Feb 9, 2018

Hi.

I haven't encountered this kind of variable updating in other projects. Does it originate from PSPNet, is this really how momentum is supposed to be used, or is it something else entirely?

Thanks,
Tamme

@hellochick
Owner

hellochick commented Feb 10, 2018

Hey @Tamme, let me explain the batch normalization layer first. There are four variables in a batch normalization layer: moving_mean, moving_variance, gamma, and beta. moving_mean and moving_variance are not trainable variables, so we need to update them using the update ops collected in tf.GraphKeys.UPDATE_OPS; you can take a look at the TensorFlow docs. So I use the flag --update-mean-var to decide whether to update the mean and variance (since updating them works better with a large batch size, we can freeze these two variables for better results when training with a small batch).
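
For reference, here is a minimal TF 1.x-style sketch of how two such flags could be wired into a training graph. The flag names mirror the CLI options discussed in this issue, but the toy model and the variable filtering are illustrative assumptions, not code copied from this repo's train.py:

```python
import tensorflow as tf  # TF 1.x API, matching the era of this repo

# Hypothetical stand-ins for the --update-mean-var / --train-beta-gamma flags.
update_mean_var = True
train_beta_gamma = True

# Toy conv + BN graph; training=True makes BN normalize with batch
# statistics and register the moving_mean / moving_variance update ops
# under tf.GraphKeys.UPDATE_OPS.
images = tf.placeholder(tf.float32, [None, 32, 32, 3])
labels = tf.placeholder(tf.float32, [None, 32, 32, 3])
net = tf.layers.conv2d(images, 3, 3, padding='same')
net = tf.layers.batch_normalization(net, training=True)
loss = tf.reduce_mean(tf.square(net - labels))

# When --train-beta-gamma is off, keep BN's beta/gamma out of the
# optimizer's var_list so only the remaining weights are trained.
all_trainable = tf.trainable_variables()
if train_beta_gamma:
    trainable = all_trainable
else:
    trainable = [v for v in all_trainable
                 if 'beta' not in v.name and 'gamma' not in v.name]

opt = tf.train.MomentumOptimizer(learning_rate=1e-3, momentum=0.9)
train_op = opt.minimize(loss, var_list=trainable)

# Only run the moving_mean / moving_variance update ops when
# --update-mean-var is set; otherwise those statistics stay frozen.
if update_mean_var:
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    train_op = tf.group(train_op, *update_ops)
```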

@manuel-88

Hey, when I run the training without --update-mean-var the evaluation results are almost zero. Do you know why, @hellochick?

@hellochick
Owner

@manuel-88, if you never update the mean and variance, the batch normalization layers effectively do nothing, since the moving statistics stay at their initial values. Maybe that is the problem.
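
To illustrate why evaluation collapses (a hypothetical numpy sketch, not code from this repo): if the moving statistics are never updated during training, inference-mode BN normalizes with their initial values (mean 0, variance 1), so activations pass through essentially unchanged and the rest of the network sees statistics it was never trained with:

```python
import numpy as np

# Inference-mode BN: (x - moving_mean) / sqrt(moving_var + eps) * gamma + beta
def bn_inference(x, moving_mean, moving_var, gamma=1.0, beta=0.0, eps=1e-5):
    return (x - moving_mean) / np.sqrt(moving_var + eps) * gamma + beta

x = np.random.normal(loc=5.0, scale=3.0, size=1000)  # actual activation stats

# Moving stats left at their initial values (never updated during training):
frozen = bn_inference(x, moving_mean=0.0, moving_var=1.0)
# Moving stats tracked during training:
updated = bn_inference(x, moving_mean=x.mean(), moving_var=x.var())

print(frozen.mean(), frozen.std())    # ~5, ~3 -- not normalized at all
print(updated.mean(), updated.std())  # ~0, ~1 -- what the network expects
```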
