Dependencies of tensors created within a tf.while_loop() might not be executed #15891
Comments
@ebrevdo Do you have any thoughts on this or know who would know it best? |
Can you write a much smaller minimal failing test? |
oh wait; i see you did! |
Try passing |
I just now tried adding
As for the first test case, you can take Parts I and III separately to reproduce the issue:
from __future__ import division, print_function
import numpy as np
import tensorflow as tf
from tensorflow.python.ops import resource_variable_ops as rr

rs = np.random.RandomState(seed = 2)
A = rs.normal(size = (10, 10,))
print('singular values of A: %s' % (np.linalg.svd(A, compute_uv = False),))
B = rs.normal(size = (10, 10,))
print('singular values of B: %s' % (np.linalg.svd(B, compute_uv = False),))

A_var = tf.Variable(B)
init_A_var_op = tf.assign(A_var, A)
A_dep = tf.constant(9, tf.int32)

def loop_condition(j, A_dep):
    return j < 1

def loop_body(j, A_dep):
    with tf.control_dependencies([init_A_var_op]):
        A_dep = A_dep + 1
    return j + 1, A_dep

_, A_dep = tf.while_loop(loop_condition,
                         loop_body,
                         loop_vars = [tf.constant(0, tf.int32), A_dep],
                         parallel_iterations = 1,
                         back_prop = False)
with tf.control_dependencies([A_dep]):
    var_s = tf.svd(A_var, compute_uv = False)

with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    computed_s, computed_A_dep = session.run([var_s, A_dep])
    print('computed_s = %s, computed_A_dep = %d' % (computed_s, computed_A_dep,))
(Alternatively, take Parts I and IV separately.)
|
/CC @alextp can you take a look? |
Can you add a print(tf.get_default_graph().as_graph_def()) so I can understand how we're generating the wrong graph? (i.e. are control dependencies missing, mangled, or ignored because of weird variable stuff) |
I created a gist for the graph def: https://gist.github.com/dtrebbien/f917cb2891e0b141b9fa6323a3c55239
Here is the exact modified test case that I used to print the graph def:
from __future__ import division, print_function
import numpy as np
import tensorflow as tf

rs = np.random.RandomState(seed = 2)
A = rs.normal(size = (10, 10,))
print('singular values of A: %s' % (np.linalg.svd(A, compute_uv = False),))
B = rs.normal(size = (10, 10,))
print('singular values of B: %s' % (np.linalg.svd(B, compute_uv = False),))

graph = tf.Graph()
with graph.as_default():
    A_var = tf.Variable(B, name = 'A_var')
    init_A_var_op = tf.assign(A_var, A, name = 'init_A_var_op')
    A_dep = tf.constant(9, tf.int32, name = 'initial_A_dep')

    def loop_condition(j, A_dep):
        return j < 1

    def loop_body(j, A_dep):
        with tf.control_dependencies([init_A_var_op]):
            A_dep = tf.add(A_dep, 1, name = 'increment_A_dep')
        return j + 1, A_dep

    _, A_dep = tf.while_loop(loop_condition,
                             loop_body,
                             loop_vars = [tf.constant(0, tf.int32), A_dep],
                             parallel_iterations = 1,
                             back_prop = False)
    with tf.control_dependencies([A_dep]):
        var_s = tf.svd(A_var, compute_uv = False)

print(graph.as_graph_def())

with tf.Session(graph = graph) as session:
    session.run(tf.global_variables_initializer())
    computed_A_dep, computed_s, computed_A_dep2 = session.run([A_dep, var_s, A_dep])
    assert computed_A_dep == computed_A_dep2
    print('computed_s = %s, computed_A_dep = %d' % (computed_s, computed_A_dep,))
By the way, is there a utility that can graphically display this? |
Looking at the graph def, it looks like no node has control input "^init_A_var_op". Contrast that with the following working script, which does not use a tf.while_loop(): there, the "increment_A_dep/y" node (the const second argument to the "increment_A_dep" tf.add() op) has control input "^init_A_var_op":
from __future__ import division, print_function
import numpy as np
import tensorflow as tf

rs = np.random.RandomState(seed = 2)
A = rs.normal(size = (10, 10,))
print('singular values of A: %s' % (np.linalg.svd(A, compute_uv = False),))
B = rs.normal(size = (10, 10,))
print('singular values of B: %s' % (np.linalg.svd(B, compute_uv = False),))

graph = tf.Graph()
with graph.as_default():
    A_var = tf.Variable(B, name = 'A_var')
    init_A_var_op = tf.assign(A_var, A, name = 'init_A_var_op')
    A_dep = tf.constant(9, tf.int32, name = 'initial_A_dep')
    with tf.control_dependencies([init_A_var_op]):
        A_dep = tf.add(A_dep, 1, name = 'increment_A_dep')
    with tf.control_dependencies([A_dep]):
        var_s = tf.svd(A_var, compute_uv = False)

print(graph.as_graph_def())

with tf.Session(graph = graph) as session:
    session.run(tf.global_variables_initializer())
    computed_A_dep, computed_s, computed_A_dep2 = session.run([A_dep, var_s, A_dep])
    assert computed_A_dep == computed_A_dep2
    print('computed_s = %s, computed_A_dep = %d' % (computed_s, computed_A_dep,))
Here is an excerpt from the working script's graph def:
The non-working graph's "while/increment_A_dep/y" node has control input "^while/Identity" but not "^init_A_var_op". |
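For anyone who wants to repeat this check without reading the whole dump by eye, here is a minimal sketch that scans the text printed by print(graph.as_graph_def()) and lists the nodes that carry a given control input. The helper names (control_inputs_by_node, nodes_with_control_input) are hypothetical, and SAMPLE is a hand-written fragment in the same text format, not the actual gist contents. It assumes the usual one-field-per-line text output and is not a full protobuf parser.

```python
import re

# Quoted string (with escapes), so braces inside strings are not counted.
_STRING = re.compile(r'"(?:\\.|[^"\\])*"')
# A `name: "..."` or `input: "..."` field on its own line.
_FIELD = re.compile(r'(name|input):\s*"([^"]*)"\s*$')


def control_inputs_by_node(graph_def_text):
    """Map each node name to its control inputs (leading '^' removed)."""
    nodes = {}
    depth = 0
    in_node = False
    name, controls = None, []
    for raw_line in graph_def_text.splitlines():
        line = raw_line.strip()
        if depth == 0 and line.startswith("node {"):
            in_node, name, controls = True, None, []
        elif in_node and depth == 1:
            match = _FIELD.match(line)
            if match and match.group(1) == "name":
                name = match.group(2)
            elif match and match.group(2).startswith("^"):
                controls.append(match.group(2)[1:])
        # Track nesting, ignoring any braces that appear inside strings.
        bare = _STRING.sub('""', line)
        depth += bare.count("{") - bare.count("}")
        if depth == 0 and in_node:
            if name is not None:
                nodes[name] = controls
            in_node = False
    return nodes


def nodes_with_control_input(graph_def_text, op_name):
    """Return the sorted names of nodes that have '^op_name' as an input."""
    found = control_inputs_by_node(graph_def_text)
    return sorted(n for n, ctrl in found.items() if op_name in ctrl)


# Hand-written illustration only; not copied from the real graph def.
SAMPLE = '''
node {
  name: "init_A_var_op"
  op: "Assign"
  input: "A_var"
  input: "init_A_var_op/value"
}
node {
  name: "increment_A_dep/y"
  op: "Const"
  input: "^init_A_var_op"
  attr {
    key: "dtype"
    value {
      type: DT_INT32
    }
  }
}
node {
  name: "while/increment_A_dep/y"
  op: "Const"
  input: "^while/Identity"
}
'''

print(nodes_with_control_input(SAMPLE, "init_A_var_op"))
```

Running it against str(graph.as_graph_def()) for the non-working script and getting an empty list would match the observation above that no node has control input "^init_A_var_op".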
Ok, so this is a real bug. @asimshankar who do you think we should assign this to? There's a bug somewhere in the control dependency processing logic of WhileContext, somewhere around here most likely:
|
System information
python repro.py
.. where repro.py contains the test case to reproduce, listed below.
Describe the problem
Here is my test case:
Part I is basic setup. I create two random 10×10 matrices and compute their singular values:
Part II shows usage of control_dependencies() to guarantee that A has been assigned to A_var before the singular values of A_var are computed. The output from this part is:
(This is the expected result for Part II.)
In Part III, I have introduced use of a tf.while_loop(). Now, tf.svd() is returning the singular values of B:
(This is not the expected result for Part III. I expect that the singular values of A would be printed.)
In Part IV, based on reading #4663 (comment), I switched to using ResourceVariable. However, the output is still the same (the singular values of B):
(This is not the expected result for Part IV. I expect that the singular values of A would be printed.)
It appears the issue is that tf.control_dependencies() on tensors created by tf.while_loop() might not execute the tensors' own dependencies.
This used to work okay (around TensorFlow 1.1, if I recall correctly).
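Since every part of the test case comes down to asking whether the computed singular values are those of A or of B, the comparison can be automated instead of read off the printed arrays. The following is a minimal sketch in plain Python; identify_source is a hypothetical helper, and the numbers in CANDIDATES are placeholders for illustration, not output from the test case.

```python
import math


def identify_source(computed, candidates, rel_tol=1e-6, abs_tol=1e-9):
    """Return the label whose values all match `computed`, or None."""
    computed = [float(v) for v in computed]
    for label, expected in candidates.items():
        expected = [float(v) for v in expected]
        if len(expected) == len(computed) and all(
            math.isclose(c, e, rel_tol=rel_tol, abs_tol=abs_tol)
            for c, e in zip(computed, expected)
        ):
            return label
    return None


# Placeholder values only; in the test case these would be
# np.linalg.svd(A, compute_uv=False) and np.linalg.svd(B, compute_uv=False).
CANDIDATES = {"A": [3.0, 2.0, 1.0], "B": [4.0, 2.5, 0.5]}

print(identify_source([4.0, 2.5, 0.5], CANDIDATES))  # prints B
```

In the test case, passing computed_s together with the two NumPy results would return 'A' for the expected behavior and 'B' for the behavior reported in Parts III and IV.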
While searching for a previous report of this issue, I found #6087, which appears related, in that the sample code there has a tf.while_loop() that creates tensors with dependencies. When I run the sample code, I consistently get result = 10. This is an unexpected result, in my opinion. What is happening is that update_x runs exactly once, so for each of the 5 loop iterations, x has the value 2.
I tried rewriting the sample code to use a ResourceVariable, but the output is the same: