
Maybe wrong when computing batch_normalization? #7

Closed

lfwin opened this issue Apr 5, 2016 · 7 comments


lfwin commented Apr 5, 2016

Hi,
in the batch_normalization function, there is a line that either updates the mean and variance or not, depending on whether we are in training or evaluation mode, but both branches seem to return the same variables:
`mean, var = tf.python.control_flow_ops.cond(is_training, update_mean_var, lambda: (ema_mean, ema_var))`

aymericdamien (Member) commented:

Hi,

By "returning the same variables", do you mean the same value? Or is it another issue?
This operation returns a single tensor, whatever the value of "is_training". But during computation, the execution of the graph is switched according to the "is_training" value: if is_training is True, the first operation (the mean/variance update) is performed; otherwise the second one is.
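
As a quick illustration (a minimal standalone sketch, not TFLearn code, using the same `tf.python.control_flow_ops.cond` spelling as this thread), cond returns one tensor whose value is produced by whichever branch the predicate selects at run time:

```python
import tensorflow as tf

flag = tf.placeholder(tf.bool)
# cond returns a single tensor; the branch that produces its value
# is chosen at run time from the value fed for `flag`
result = tf.python.control_flow_ops.cond(
    flag,
    lambda: tf.constant(1.0),    # executed when flag is True
    lambda: tf.constant(-1.0))   # executed when flag is False

with tf.Session() as sess:
    print(sess.run(result, {flag: True}))   # 1.0
    print(sess.run(result, {flag: False}))  # -1.0
```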

If you are using that operation without TFLearn trainer class, you need to specify a training mode:

# To set training mode:
tflearn.is_training(True)
# To remove training mode:
tflearn.is_training(False)

(There is a little typo in the doc that can be confusing; this operation actually 'sets' the training mode, it does not 'check' it. I will update it.)
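
For completeness, here is a minimal sketch of toggling the mode around a plain session-based loop (an assumption on my side: that is_training also accepts a session argument, as in TFLearn's config module, rather than this being the documented API of this exact version):

```python
with tf.Session() as sess:
    tflearn.is_training(True, session=sess)   # batch norm updates its moving averages
    # ... run training steps ...
    tflearn.is_training(False, session=sess)  # moving averages are frozen for evaluation
    # ... run evaluation ...
```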


lfwin commented Apr 5, 2016

Hi,
During training, the update_mean_var function seems not to update the mean/variance; it just returns an identity of them:

```python
def update_mean_var():
    with tf.control_dependencies([ema_apply_op]):
        return tf.identity(ema_mean), tf.identity(ema_var)

is_training = tflearn.get_training_mode()
mean, var = tf.python.control_flow_ops.cond(
    is_training, update_mean_var, lambda: (ema_mean, ema_var))
```

aymericdamien (Member) commented:

tf.control_dependencies is used to ensure that ema_apply_op is executed before `tf.identity(ema_mean), tf.identity(ema_var)` are returned, so ema_apply_op is actually computed (updating the mean/variance).
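
To make the ordering concrete, here is a minimal standalone sketch (not from TFLearn; TF 0.x-era API to match this thread) showing that evaluating an op created inside the block forces the dependency to run first:

```python
import tensorflow as tf

counter = tf.Variable(0)
increment = tf.assign_add(counter, 1)

some_value = tf.constant(42)
with tf.control_dependencies([increment]):
    # Any op created in this block cannot run before `increment` has run.
    result = tf.identity(some_value)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    sess.run(result)          # forces the increment as a side effect
    print(sess.run(counter))  # 1 -- increment really did run
    sess.run(result)
    print(sess.run(counter))  # 2
```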


lfwin commented Apr 6, 2016

Hi,
sorry, I was not correct; I was confused by the usage of the tf.train.ExponentialMovingAverage class.
Another question: where does the op 'ema_apply_op' run during training?

aymericdamien (Member) commented:

ema_apply_op runs at training time only; every time a data batch 'enters' a batch_normalization layer, it is applied (so the mean/variance is updated at every batch).
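
Concretely, each run of ema_apply_op applies the standard moving-average update. A minimal standalone sketch (not TFLearn code; it assumes the shadow variable is initialized to the tracked variable's initial value, which is ExponentialMovingAverage's behavior for variables):

```python
import tensorflow as tf

# Per-step update applied by the apply op, with decay = 0.9:
#   shadow <- 0.9 * shadow + 0.1 * value
x = tf.Variable(0.0)
set_x = tf.assign(x, 1.0)
ema = tf.train.ExponentialMovingAverage(0.9)
ema_apply_op = ema.apply([x])  # shadow variable starts at x's initial value (0.0)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    sess.run(set_x)
    sess.run(ema_apply_op)
    print(sess.run(ema.average(x)))  # 0.9 * 0.0 + 0.1 * 1.0 = 0.1
```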


lfwin commented Apr 7, 2016

I found example code in tensorflow/tensorflow#1724 that is very similar to the code in batch_normalization:

```python
import tensorflow as tf
import numpy as np

x = tf.Variable(1.)
update_x = tf.assign_add(x, 1.0)

do_update = tf.placeholder(tf.bool)

ema = tf.train.ExponentialMovingAverage(.9)
ema_assign = ema.apply([x])

avg_without_update = ema.average(x)

with tf.control_dependencies([ema_assign]):
    avg_with_update = tf.identity(avg_without_update)

avg = tf.python.control_flow_ops.cond(
    do_update, lambda: avg_with_update, lambda: avg_without_update)

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    print(sess.run([update_x, avg], {do_update: False}))
    print(sess.run([update_x, avg], {do_update: False}))
    print(sess.run([update_x, avg], {do_update: False}))
    print(sess.run([update_x, avg], {do_update: False}))
    print(sess.run([update_x, avg], {do_update: False}))
```

Here do_update is False, but the average is still updated. Could you tell me what goes wrong?

aymericdamien (Member) commented:

It seems there is a bug here. Referring to the issue you linked, the dependencies of both condition branches (fn1 & fn2) are always executed, no matter the predicate given to tf.cond(), hence the result... I can try to see if there is a workaround, or wait for TensorFlow to patch it.
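
For reference, a workaround commonly suggested at the time (a sketch only, untested here, assuming the cond semantics that ops *created inside* a branch function run only when that branch is taken) is to call ema.apply() inside the branch function instead of outside:

```python
# Hypothetical rework of the example above (reusing x, do_update, ema);
# note ema.apply must NOT also be called outside the branch.
def with_update():
    ema_assign = ema.apply([x])  # created inside the branch => gated by cond
    with tf.control_dependencies([ema_assign]):
        return tf.identity(ema.average(x))

avg = tf.python.control_flow_ops.cond(
    do_update, with_update, lambda: ema.average(x))
```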
