TensorArray grad bug #13355

Closed
albertz opened this issue Sep 28, 2017 · 8 comments

@albertz
Contributor

albertz commented Sep 28, 2017

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
  • TensorFlow installed from (source or binary): pip binary
  • TensorFlow version (use command below): v1.3.0-rc2-20-g0787eee 1.3.0
  • Python version: 3.6.1

Describe the problem

In some cases, tf.TensorArray does not pass the gradient through correctly. See the test case below.

Source code / logs

This fails:

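import tensorflow as tf
from numpy.testing import assert_allclose

# Assumed setup: the test relies on an existing TF 1.x graph-mode session.
session = tf.Session()


def test_tensorarray_grad_simple():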
  n_time = 1
  n_dim = 1
  x = [[1.42]]
  dy = [[2.42]]

  x = tf.convert_to_tensor(x)
  x.set_shape(tf.TensorShape((n_time, n_dim)))
  with tf.name_scope("gradients"):
    # Note that tensor_array_grad._GetGradSource() has this ugly hack
    # which requires that we have the "gradients" prefix.
    dy = tf.identity(tf.convert_to_tensor(dy), name="dy")
  dy.set_shape(tf.TensorShape((n_time, n_dim)))

  ta = tf.TensorArray(tf.float32, size=n_time, element_shape=tf.TensorShape((n_dim,)))
  for t in range(n_time):
    ta = ta.write(index=t, value=x[t])
  y = ta.stack()
  y.set_shape(tf.TensorShape((n_time, n_dim)))
  # y = y[::1]  -- if you add this, the test passes
  dx, = tf.gradients(ys=[y], grad_ys=[dy], xs=[x])
  vx, vdy, vy, vdx = session.run([x, dy, y, dx])
  print("x:", vx)
  print("y:", vy)
  print("dy:", vdy)
  print("dx:", vdx)
  assert_allclose(vx, vy)
  assert_allclose(vdy, vdx)

I get the output:

x: [[ 1.41999996]]
y: [[ 1.41999996]]
dy: [[ 2.42000008]]
dx: [[ 0.]]

Strangely, if you add something like y = y[::1] before taking the gradient, it passes.

albertz added a commit to rwth-i6/returnn that referenced this issue Sep 28, 2017
@cy89

cy89 commented Oct 8, 2017

@alextp would you please take a look?

@cy89 cy89 added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Oct 8, 2017
@alextp
Contributor

alextp commented Oct 9, 2017

@ebrevdo , can you take a look?

@ebrevdo
Contributor

ebrevdo commented Oct 9, 2017

This is definitely a bug, and it has to do with this line:

grad_source = _GetGradSource(grad)

The weird "gradients" prefix is there so that the TensorArray can differentiate between different calls to tf.gradients: each call to tf.gradients must create a separate gradient TensorArray. The reason you're getting zeros is that your gradients name prefix is not the same as the name prefix created by tf.gradients. Since gradients is already taken, tf.gradients is probably creating the name prefix gradients_1, and this difference in prefix is confusing the TensorArray.
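To make the renaming step concrete, here is a toy model of how a taken scope name gets a numeric suffix. It is a sketch only: `ToyNameScopes` and its `unique_name` method are hypothetical stand-ins written for this illustration, not TensorFlow code, and the real graph bookkeeping is more involved.

```python
class ToyNameScopes:
    """Toy stand-in for a graph's name-scope registry (not TensorFlow code)."""

    def __init__(self):
        self._used = {}  # base name -> number of times it has been requested

    def unique_name(self, name):
        count = self._used.get(name, 0)
        self._used[name] = count + 1
        # The first request keeps the bare name; later ones get a numeric suffix.
        return name if count == 0 else "%s_%d" % (name, count)


scopes = ToyNameScopes()
user_scope = scopes.unique_name("gradients")     # opened by the test to build dy
library_scope = scopes.unique_name("gradients")  # requested later by tf.gradients
print(user_scope, library_scope)
```

Under this model the test's dy lives under one prefix while everything built by the later tf.gradients call lives under another, which is the mismatch described above.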

Note that this will work:

tf.gradients(y, x, [[[2.42]]])

because the grad_ys [[[2.42]]] is converted to a tensor inside the tf.gradients, and will have the appropriate name scope.

One solution might be to wrap all grad_ys going into tf.gradients in a tf.identity that brings them into the same gradient name scope. Not sure it's the best solution.
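A rough sketch of why such a wrap could help, using plain strings instead of real tensors. The helper `grad_source_prefix` is a simplified, hypothetical stand-in for the prefix lookup discussed above, and the three tensor names are invented for illustration; none of this is the library's actual implementation.

```python
def grad_source_prefix(tensor_name):
    """Return the name up to and including the last 'gradients*' component.

    Simplified stand-in for the lookup discussed above, not the library code.
    """
    tokens = tensor_name.split("/")
    positions = [i for i, tok in enumerate(tokens) if tok.startswith("gradients")]
    if not positions:
        raise ValueError("no 'gradients' component in %r" % tensor_name)
    return "/".join(tokens[: positions[-1] + 1])


# Hypothetical tensor names, chosen only for illustration.
user_dy = "gradients/dy:0"                      # dy built in the test's own scope
internal_grad = "gradients_1/stack_grad/out:0"  # built inside the tf.gradients call
wrapped_dy = "gradients_1/grad_ys_0:0"          # dy after an identity in that scope

# Without the wrap the prefixes differ; with it they agree.
print(grad_source_prefix(user_dy) == grad_source_prefix(internal_grad))
print(grad_source_prefix(wrapped_dy) == grad_source_prefix(internal_grad))
```

If the gradient TensorArray is keyed by this prefix, the wrapped dy and the internally created gradients would resolve to the same one, so the incoming gradient would no longer be lost.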

@ebrevdo
Contributor

ebrevdo commented Oct 11, 2017

I have a solution; will test and push - if all is green, you will see it in 2-3 days.

@caisq caisq closed this as completed in d6b6169 Oct 13, 2017
@albertz
Contributor Author

albertz commented Nov 3, 2017

This is not fixed yet in 1.4.0. The test does not pass.

@ebrevdo
Contributor

ebrevdo commented Nov 3, 2017 via email

@albertz
Contributor Author

albertz commented Nov 3, 2017

No, the official 1.4.0 release, installed via pip install. I thought the commit would be in that release, but I'm not sure. Specifically, my version is v1.4.0-rc1-11-g130a514 1.4.0.

@ebrevdo
Contributor

ebrevdo commented Nov 3, 2017 via email
