shouldn't we use moving_mean and moving_var in training mode instead of batch mean and var? #16

Closed
jonhe88 opened this issue Oct 29, 2018 · 4 comments

Comments


jonhe88 commented Oct 29, 2018

if 'train' in stats_mode:
    # Training mode: normalize with the statistics of the current batch.
    xn = tf.nn.batch_normalization(
        list_input[i], mean, var, beta, gamma, bn_epsilon)
    if tf.get_variable_scope().reuse or 'gather' not in stats_mode:
        list_output.append(xn)
    else:
        # Gather stats: this is the main GPU device, so also fold the
        # batch statistics into the moving averages via the EMA update.
        xn = update_bn_ema(xn, mean, var, moving_mean, moving_var, bn_ema)
        list_output.append(xn)
else:
    # Inference mode: normalize with the accumulated moving statistics.
    xn = tf.nn.batch_normalization(
        list_input[i], moving_mean, moving_var, beta, gamma, bn_epsilon)
    list_output.append(xn)
@holyseven
Owner

What is your argument for using moving_mean and moving_var?


jonhe88 commented Oct 29, 2018

In the training branch of the pasted code, the mean and var of the current batch are used for the batch-norm computation. I think we should use the moving mean and var in BN instead.

@holyseven
Owner

Why do you think we should use the moving mean and var in BN during training?


jonhe88 commented Oct 29, 2018

You're right, this is the standard procedure: use the batch statistics during training and the moving averages only at inference. Using the moving mean during training has defects.
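
For reference, here is a minimal sketch of that standard behavior (a TF1-style illustration, not the code from this repo; the function name, the decay value, and the variable handling are assumptions):

import tensorflow as tf

def batch_norm(x, gamma, beta, moving_mean, moving_var,
               is_training, decay=0.9, epsilon=1e-5):
    """Minimal sketch: batch stats in training, moving stats at inference."""
    if is_training:
        # Training: normalize with the statistics of the current batch.
        # Per-channel moments over batch and spatial dims (assumes NHWC input).
        mean, var = tf.nn.moments(x, axes=[0, 1, 2])
        # Fold the batch statistics into the moving averages for inference.
        update_mean = tf.assign(
            moving_mean, decay * moving_mean + (1.0 - decay) * mean)
        update_var = tf.assign(
            moving_var, decay * moving_var + (1.0 - decay) * var)
        with tf.control_dependencies([update_mean, update_var]):
            return tf.nn.batch_normalization(
                x, mean, var, beta, gamma, epsilon)
    # Inference: normalize with the accumulated moving statistics.
    return tf.nn.batch_normalization(
        x, moving_mean, moving_var, beta, gamma, epsilon)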
