Hessian (calling tf.gradients twice) of tf.scan fails #2598
We don't support taking gradients of nonscalars, so I wouldn't expect this to work. However, this particular error message is pretty confusing. @yuanbyu: Is there a way we could improve this error message if someone tries for a Hessian in the naive way and control flow is involved?

Yes, the error message should be better. Let me see what I can do.

@girving Here theta is just a scalar value though. What would be the workaround here?

@dementrock: That's true in this case, but we don't want to do extra work on the control flow ops if all it provides is higher-order derivatives w.r.t. scalars. What is your intended use case?

@girving I was doing some Hessian-vector product computations and had the same error, but the code snippet above is simpler and highlights the issue. So is there no way to get higher-order derivatives w.r.t. scan right now?

@dementrock: We don't support higher-order gradients even ignoring control flow.
@girving No support as in no official support, or will it not work at all? Seems like the following code at least compiles:

```python
import tensorflow as tf

theta = tf.Variable(initial_value=1.)

def fn(x, prev):
    return prev + x * theta

result = fn(fn(1.0, 2.0), 3.0)
grad_theta = tf.gradients(result, theta)
tf.gradients(grad_theta, theta)
```

Any plan to support it in the future?
@dementrock The problem is that the registered gradient routines would get significantly more complicated if both sides were nonscalar, and we don't want to support that kind of complexity. As discussed in #675, it's possible one could implement registered gradients with some sort of automatic machinery to map scalar gradient routines to nonscalar gradient routines, but this is a lot of work and we don't have any plans to do it. The other problem is that the applications I know of for nonscalar gradients aren't that compelling yet, since they tend to be impractically huge.

However, there are cases where higher-order gradient information arises while you're still differentiating a scalar, specifically Hessian-free, Krylov-ish methods where one evaluates the gradient dotted with a suitably chosen vector. If that last bit is what you're trying to do, or something similar, we'd be happy to accept pull requests to make control flow not interfere. It might be pretty complicated, though.
…while loop. Fixes tensorflow#2598 Change: 124304732
Environment info
Operating System: Mac OS X 10.11.2
Installed version of CUDA and cuDNN: None
Commit hash (installed from source): 4455f81
Steps to reproduce
Run the following script:
Running it results in the following error:
What have you tried?
Nothing beyond creating this minimal reproducible example
Logs or other output that would be helpful