Conditionally trainable variables and stochastic depth neural networks #8817
Comments
I am adding this to our list of models that we would like to make easier in TensorFlow. I don't have any personal knowledge of the paper, but regarding your comment about having a trainable flag: it seems like you could multiply by a vector of 0's or 1's to mask the variable dynamically and achieve the same effect. Let me know if that would be sufficient. Thanks!
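A minimal sketch of that masking idea (illustrative names, assuming a single weight matrix `W`; this is not code from the thread): entries where the fed mask is 0 receive a zero gradient, so they are effectively frozen for that step.

```python
import tensorflow as tf

# Hypothetical weight; a mask of 0s/1s is fed in each training step.
W = tf.Variable(tf.random_normal([4, 4]), name="W")
mask = tf.placeholder(tf.float32, shape=[4, 4])

# Only unmasked entries of W reach the loss, so dL/dW is zero
# wherever mask == 0 and those entries are not updated this step.
masked_W = W * mask
loss = tf.reduce_sum(tf.square(masked_W))
train = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
```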
Thank you @aselle,
I see. I think you could probably implement this using a custom optimizer that controls the update vector and disables updates using knowledge of which variables belong to which layers. This may not be easy, but it may be possible.
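A minimal sketch of that idea, without a full optimizer subclass (variable names are illustrative, not from the thread): compute gradients normally, then zero out the gradients of the gated layer's variables before applying them.

```python
import tensorflow as tf

a = tf.Variable(1.0, name="encoder_w")
b = tf.Variable(1.0, name="decoder_w")
loss = tf.square(a) + tf.square(b)

train_decoder = tf.placeholder(tf.bool)

opt = tf.train.GradientDescentOptimizer(0.1)
grads_and_vars = opt.compute_gradients(loss)

gated = []
for g, v in grads_and_vars:
    if v is b:  # gate only the "decoder" variable
        g = tf.cond(train_decoder, lambda g=g: g, lambda g=g: tf.zeros_like(g))
    gated.append((g, v))
train = opt.apply_gradients(gated)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train, {train_decoder: False})  # only `a` is updated
```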
Closing due to inactivity. I'll reopen this issue if @awav indicates the previous suggestion was not sufficient. Note: we have an on-call rotation for triaging issues. When filing issues, please let us take care of tagging team members for you.
@awav - correct me if I'm missing something, but is the goal to simply not update Variables that aren't used due to a conditional? TensorFlow already zeros out these gradients. Here's some sample code:

```python
import tensorflow as tf

tf.reset_default_graph()
a = tf.Variable(10.0)
b = tf.Variable(10.0)
switch = tf.placeholder(tf.bool)
res = tf.cond(switch, lambda: tf.multiply(2.0, a), lambda: tf.square(b))
opt = tf.train.GradientDescentOptimizer(0.05)
grads = opt.compute_gradients(res)
train = opt.apply_gradients(grads)
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    print(sess.run(grads, {switch: True}))
```

If you adjust the `switch` value, the unused branch's variable gets a zero gradient. When `switch` is True:

[(2.0, 10.0), (0.0, 10.0)]

When `switch` is False:

[(0.0, 10.0), (20.0, 10.0)]

For completeness, if you apply the gradients with different switches set, you only update one or the other:
I think the most likely problem that might occur when trying to implement stochastic depth is that you may not see the reduced computation, due to the less-lazy way `tf.cond` executes its branches.
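To make that caveat concrete (an illustration, not from the original comment): in graph mode, tensors created outside a branch function are captured as inputs to the cond and are computed regardless of the predicate; only ops created inside the branch lambdas execute conditionally.

```python
import tensorflow as tf

switch = tf.placeholder(tf.bool)
x = tf.constant(3.0)

# Built OUTSIDE the branch functions: captured as a cond input,
# so it executes no matter which branch is taken.
outside = tf.square(x)

res = tf.cond(switch,
              lambda: outside,      # reuses the eagerly computed tensor
              lambda: tf.sqrt(x))   # built inside: runs only if needed
```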
The conditional statement does not seem to cut it for me. In my case, I have a model of the form data -> encoder -> intermediate result -> decoder -> result. I would like to be able to mark the variables in the encoder and decoder as trainable during training by passing a boolean tensor. Is it possible to do this using `tf.cond`? When I pass a boolean tensor as `tf.get_variable(..., trainable=boolTensor)` I get a TypeError.
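One workaround sketch for this pattern (not from the thread; scope and variable names are illustrative): build one train op per variable subset and pick which op to run each step in ordinary Python, instead of feeding a boolean into `trainable`.

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 8])
with tf.variable_scope("encoder"):
    h = tf.layers.dense(x, 4)
with tf.variable_scope("decoder"):
    y = tf.layers.dense(h, 8)
loss = tf.reduce_mean(tf.square(y - x))

enc_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "encoder")

opt = tf.train.GradientDescentOptimizer(0.05)
train_enc_only = opt.minimize(loss, var_list=enc_vars)  # decoder frozen
train_all = opt.minimize(loss)

# Per step, choose the op in Python rather than inside the graph:
# sess.run(train_enc_only if freeze_decoder else train_all, {x: batch})
```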
I have the same problem. In my case, I have the model input -> features -> decode1 -> loss1
I came across a task where I would like to apply the stochastic depth regularization technique in TensorFlow (https://arxiv.org/pdf/1603.09382.pdf). TensorFlow doesn't provide enough settings to implement it. I found the closed issue #1784, which is similar to this request, where the discussion ended with the claim that the `tf.cond` and `tf.select` primitives are enough for this task. But if you read the paper carefully, it says that during training the depth changes in both directions: the forward and the backward propagation steps. Therefore the number of trainable W parameters of the network changes too. The core concept of TensorFlow is building the computation graph before the training session is run. Currently, I cannot create a dynamic computation graph in which, depending on a boolean value, the W parameters of a layer are not engaged in the optimization process. If `tf.Variable` accepted the `trainable` parameter as a boolean tensor in addition to a built-in boolean value, it would solve the problem. It would mean that TensorFlow natively operates with dynamic computation graphs, which is in fact a very powerful tool. I would appreciate any suggestions and ideas, so that this question can be closed for good.
@vrv, @martinwicke, @aselle
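For reference, a sketch of one stochastic-depth residual block built with `tf.cond` (illustrative code following the paper's training rule, not an official implementation; note it does not remove the skipped block's variables from the graph, which is exactly the limitation described above):

```python
import tensorflow as tf

def stochastic_depth_block(x, survival_prob, is_training):
    """Residual block that is randomly skipped during training."""
    width = x.get_shape().as_list()[-1]
    f = tf.layers.dense(x, width, activation=tf.nn.relu)
    survives = tf.random_uniform([]) < survival_prob

    def train_branch():
        # Skipped step: output is the identity, so the block's variables
        # receive zero gradients -- but they stay in the graph, and `f`
        # is still computed (the tf.cond caveat discussed above).
        return tf.cond(survives, lambda: x + f, lambda: x)

    def test_branch():
        # At test time, scale the residual by its survival probability.
        return x + survival_prob * f

    return tf.cond(is_training, train_branch, test_branch)

x = tf.placeholder(tf.float32, [None, 16])
is_training = tf.placeholder(tf.bool)
out = stochastic_depth_block(x, survival_prob=0.8, is_training=is_training)
```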