Hessian (calling tf.gradients twice) of tf.scan fails #2598

Closed
dementrock opened this issue May 31, 2016 · 9 comments
dementrock commented May 31, 2016


Environment info

Operating System: Mac OS X 10.11.2

Installed version of CUDA and cuDNN: None

If installed from sources, provide the commit hash: 4455f81

Steps to reproduce

Run the following script:

import tensorflow as tf

theta = tf.Variable(initial_value=1.)


def fn(x, prev):
    return prev + x * theta

result = tf.scan(fn, [1., 2., 3.])

grad_theta = tf.gradients(result, theta)

# the second call to tf.gradients (for the Hessian) raises the TypeError below
tf.gradients(grad_theta, theta)

will result in the following error:

Traceback (most recent call last):
  File "sandbox/rocky/tf/small_example.py", line 13, in <module>
    tf.gradients(grad_theta, theta)
  File "/Users/dementrock/anaconda/envs/rllab/lib/python2.7/site-packages/tensorflow/python/ops/gradients.py", line 379, in gradients
    to_ops, from_ops)
  File "/Users/dementrock/anaconda/envs/rllab/lib/python2.7/site-packages/tensorflow/python/ops/gradients.py", line 185, in _PendingCount
    between_ops)
  File "/Users/dementrock/anaconda/envs/rllab/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 874, in MaybeCreateControlFlowState
    loop_state.AddWhileContext(op, between_op_list, between_ops)
  File "/Users/dementrock/anaconda/envs/rllab/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 738, in AddWhileContext
    for loop_exit in forward_ctxt.loop_exits:
TypeError: 'NoneType' object is not iterable

What have you tried?

Nothing beyond creating this minimal reproducible example.

@concretevitamin (Contributor)

Adding @yuanbyu, @ebrevdo.

girving commented Jun 7, 2016

We don't support taking gradients of nonscalars, so I wouldn't expect this to work. However, this particular error message is pretty confusing. @yuanbyu: Is there a way we could improve this error message if someone tries for a Hessian in the naive way and control flow is involved?
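
(For context: with a nonscalar ys, tf.gradients(ys, xs) computes the gradient of the implicit sum of ys rather than a per-element Jacobian, so a "Hessian" built by calling it twice is only well-defined for a scalar objective. A minimal sketch of that behavior, with x, ys and grad as illustrative names:)

import tensorflow as tf

theta = tf.Variable(initial_value=1.)
x = tf.constant([1., 2., 3.])
ys = x * tf.square(theta)  # nonscalar output, shape [3]

# tf.gradients implicitly sums over ys: this is d(sum(ys))/d(theta),
# not a per-element Jacobian of ys w.r.t. theta
grad, = tf.gradients(ys, [theta])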

yuanbyu commented Jun 7, 2016

Yes, the error message should be better. Let me see what I can do.

@dementrock (Author)

@girving Here theta is just a scalar, though. What would be the workaround here?

girving commented Jun 8, 2016

@dementrock: That's true in this case, but we don't want to do extra work on the control flow ops if all it provides is higher order derivatives w.r.t. scalars. What is your intended use case?

vrv closed this as completed in e5c136b on Jun 8, 2016
dementrock commented Jun 8, 2016

@girving I was doing some Hessian-vector product computations and hit the same error; the code snippet above is just a simpler reproduction that highlights the issue.

So is there currently no way to get higher-order derivatives through tf.scan?

girving commented Jun 8, 2016

@dementrock: We don't support higher-order gradients even ignoring control flow.

@dementrock (Author)

@girving No support as in no official support, or will it not work at all? The following code at least seems to compile:

import tensorflow as tf

theta = tf.Variable(initial_value=1.)


def fn(x, prev):
    return prev + x * theta

result = fn(fn(1.0, 2.0), 3.0)  # unrolled by hand, no tf.scan / control flow involved

grad_theta = tf.gradients(result, theta)

tf.gradients(grad_theta, theta)

Any plans to support it in the future?

girving commented Jun 9, 2016

@dementrock The problem is that the registered gradient routines would get significantly more complicated if both sides were nonscalar, and we don't want to support that kind of complexity. As discussed in #675, it's possible one could implement registered gradients with some sort of automatic machinery to map scalar gradient routines to nonscalar gradient routines, but this is a lot of work and we don't have any plans to do it.
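
(For context, a rough sketch of what a registered gradient routine looks like; "MySquare" is a hypothetical op type used only to illustrate the registration, not an existing TensorFlow op. Each routine receives the op and the gradient flowing in from above and returns one gradient per input, and this is the code that would have to grow to handle full Jacobians on both sides:)

import tensorflow as tf

# Hypothetical op type "MySquare"; the decorator records this function
# in the gradient registry under that name.
@tf.RegisterGradient("MySquare")
def _my_square_grad(op, grad):
    x = op.inputs[0]
    return grad * 2.0 * x  # d(x^2)/dx = 2x, chained with the incoming gradient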

The other problem is that the applications I know of for nonscalar gradients aren't that compelling yet, since the results tend to be impractically huge. However, there are cases where higher-order gradient information arises while you're still differentiating a scalar, specifically Hessian-free, Krylov-ish methods where one evaluates the gradient dotted with a suitably chosen vector.

If that last bit is what you're trying to do, or something similar, we'd be happy to accept pull requests to make control flow not interfere. It might be pretty complicated, though.
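
(A minimal sketch of that gradient-dotted-with-a-vector pattern, using the hand-unrolled stand-in from the snippet above with a quadratic fn so the second derivative is nonzero; the names fn, v and hvp are illustrative, and with tf.scan in the graph this still hits the control-flow error reported here:)

import tensorflow as tf

theta = tf.Variable(initial_value=1.)

def fn(prev, x):
    # quadratic in theta so the Hessian is nonzero
    return prev + x * tf.square(theta)

# unrolled by hand, so no control-flow ops are involved
result = fn(fn(fn(0., 1.), 2.), 3.)      # = (1 + 2 + 3) * theta**2

v = tf.constant(2.)                      # the "suitably chosen vector" (a scalar here)
grad, = tf.gradients(result, [theta])    # d(result)/d(theta) = 12 * theta
hvp, = tf.gradients(grad * v, [theta])   # Hessian-vector product via a second tf.gradients

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    print(sess.run(hvp))                 # 2 * (1 + 2 + 3) * v = 24.0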
