input and output of @tf.custom_gradient #21756
Comments
@andydavis1 would you PTAL, or reassign to someone who knows the custom gradients code?
The documentation could be improved here. CC @alextp @DSRYhh - do you have suggestions for better phrasing?
I'm preparing a PR which removes "list" from the documentation, fixing the issue you saw there. In your last example, the correct way to do this is shown in the sketch below: the second return value, when `variables` is not `None`, should be a list with one element per variable in `variables`. I'll clarify the documentation there too.
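A minimal sketch of this pattern, assuming a single variable `v` (the names `f` and `v` are placeholders, not code from the original comment):

```python
import tensorflow as tf

v = tf.Variable(3.0)

@tf.custom_gradient
def f(x):
    y = v * x  # the output depends on the input x and on the variable v
    def grad(dy, variables=None):
        grad_xs = dy * v       # gradient with respect to the input x
        grad_vars = [dy * x]   # a list: one element per variable in `variables`
        return grad_xs, grad_vars
    return y, grad

with tf.GradientTape() as tape:
    y = f(tf.constant(2.0))
print(tape.gradient(y, [v]))  # the gradient for v should be dy * x = 2.0
```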
@alextp To be more clear, for the second parameter (and the second return value): `grad_fn` accepts the *original* variables (not the *gradient* of the variables) and returns the *gradient* of the variables, is that correct? If that's correct, why not have `grad_fn` accept the *gradient* of the variables instead of the *original* variables, in order to be consistent with `grad_ys` (the first parameter)? In that case, we could use the derivatives of the variables from automatic differentiation instead of writing them manually.
`grad_ys` is the "downstream" gradient of the outputs of your function; since the variables are not outputs, there is no gradient already computed with respect to them. If you want, you can call `tf.gradients` or use `tf.GradientTape` yourself to compute the gradient with respect to the variables and then modify it, but we don't force you to do that, since it would waste computation in eager execution.
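A hedged sketch of that approach: record the forward pass on a `tf.GradientTape`, let the tape compute the variable gradients, and then modify them (the 0.5 scaling is a made-up modification):

```python
import tensorflow as tf

v = tf.Variable(2.0)

@tf.custom_gradient
def f(x):
    with tf.GradientTape() as tape:
        y = v * x  # record the forward pass so we can differentiate w.r.t. v
    def grad(dy, variables=None):
        # Compute the gradients with respect to the variables ourselves...
        grad_vars = tape.gradient(y, variables, output_gradients=dy)
        # ...and then modify them before returning (hypothetical scaling).
        grad_vars = [g * 0.5 for g in grad_vars]
        return dy * v, grad_vars  # (gradient w.r.t. x, gradients w.r.t. variables)
    return y, grad
```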
System information
Describe the problem
I am confused about the input and output of tf.custom_gradient.
Input
In the doc, it says: `x` is a `Tensor` or sequence of `Tensor` inputs to the function. But with multiple inputs, instead of taking a sequence of `Tensor`s, the function `f` takes N positional arguments. I think this is a mistake in the documentation. A sequence of `Tensor`s can't be passed to `f`: the decorator will try to convert `[x, y]` to a single `Tensor` and raises an error, while changing to positional arguments fixes the bug. Both cases can be reproduced with the code below.
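A minimal sketch of both cases, assuming two inputs whose shapes cannot be merged into one `Tensor` (the exact operations are placeholders):

```python
import tensorflow as tf

# Failing case: passing a sequence of Tensors as a single argument.
@tf.custom_gradient
def f(x):  # expects x == [a, b]
    s = x[0] * x[1]
    def grad(ds):
        return [ds * x[1], ds * x[0]]
    return s, grad

a = tf.constant(2.0)
b = tf.constant([3.0, 4.0])
f([a, b])  # tries tf.convert_to_tensor([a, b]) and raises an error
```

Changing `f` to take positional arguments works:

```python
@tf.custom_gradient
def f(x, y):  # N positional Tensor arguments
    s = x * y
    def grad(ds):
        return ds * y, ds * x  # one gradient per input
    return s, grad

f(tf.constant(2.0), tf.constant([3.0, 4.0]))  # works
```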
Related discussion can be found at https://stackoverflow.com/questions/51836242/tf-custom-gradient-with-multiple-inputs.
Output
This is a problem about the output of `grad_fn`. In the doc, `grad_vars` is a `list<Tensor>` with the derivatives of the `Tensor`s in `y` with respect to the variables, and the signature is `g(*grad_ys, variables=None)`. Is `variables` the original `variables`, or the gradient of `variables`, like `grad_ys`? Returning `grad_vars` as a `list<Tensor>` will raise an error; it seems like it handles `grad_vars` as a single `Tensor`. Changing `grad_vars` to a `Tensor` doesn't work either. The setup in question is sketched below.
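A minimal sketch of the setup being asked about, assuming a single variable `v` (the names `f` and `v` and the operations are assumptions):

```python
import tensorflow as tf

v = tf.Variable(1.0)

@tf.custom_gradient
def f(x):
    y = v * x  # y depends on the input x and on the variable v
    def grad(dy, variables=None):
        # The documented signature is g(*grad_ys, variables=None).
        # Question 1: does `variables` hold the original variables, or their
        # gradients (the way grad_ys holds the gradients of the outputs)?
        # Question 2: must grad_vars be a list<Tensor> or a single Tensor?
        grad_xs = dy * v
        grad_vars = [dy * x]  # one derivative per variable in `variables`
        return grad_xs, grad_vars
    return y, grad
```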