Problem with custom grad with multiple external variables #43535
Comments
The issue is in tensorflow/tensorflow/python/ops/custom_gradient.py, lines 507 to 508 at commit e798106:
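The embedded snippet did not survive extraction. Those two lines are the consistency check that produces the error quoted in this thread; roughly, it looks like this (a paraphrase, not an exact quote of the TensorFlow source, and the variable names here are guesses):

```python
# Paraphrase of custom_gradient.py, lines 507-508: grad_fn must return
# exactly one gradient per variable the tape captured for the function.
if len(variable_grads) != len(variables):
    raise ValueError("Must return gradient for each variable from "
                     "@custom_gradient grad_fn.")
```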
Are you telling me it's a TF problem or a problem in my code? I'm not sure what I should take away from those two lines: they seem logical. According to my code, the two `variables` that `actual_grad_fn` receives should be `varr` and `varr3` (I'm not sure why you said it only receives `e+varr`, since that's an output of `fn`), and indeed there's only one, I guess.
Yes, it was just a typo; I meant:
OK, so I've inspected it with the code, and I found something: in the above code, `varr3` is not registered in `variables_in_tape`, seemingly because it is used as an "identity function" and there is no operation involving it. For example, if I change my function's return to just add 0.0 to `varr3` (so the function is essentially the same), then TensorFlow will register `varr3` and, as expected, will require me to provide gradients for both `varr` and `varr3`. I'm pretty sure this is not expected behavior from the TensorFlow team's perspective; I'm not sure how to escalate, but you guys should check this out. Here's code to test it:
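The test snippet itself was lost in extraction; the following is a minimal sketch of the behavior being described, reconstructed from the names used in this thread (the exact function body is an assumption):

```python
import tensorflow as tf

varr = tf.Variable(1.0, name="varr")
varr3 = tf.Variable(3.0, name="varr3")

@tf.custom_gradient
def fn(x):
    e = tf.exp(x)

    def grad(*dys, variables=None):
        # Shows which variables the tape actually captured for fn.
        print("variables seen by grad_fn:", [v.name for v in variables])
        # One gradient for the input x, then one per captured variable.
        return dys[0] * e, [tf.zeros_like(v) for v in variables]

    # varr3 is returned as-is: no read op is recorded for it, so it is
    # missing from `variables` above. Returning varr3 + 0.0 instead makes
    # it show up, and a second variable gradient becomes mandatory.
    return [e + varr, varr3], grad

x = tf.constant(1.0)
with tf.GradientTape() as tape:
    tape.watch(x)
    y = fn(x)
tape.gradient(y, x)  # grad runs here and prints only varr, not varr3
```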
/cc @rohan100jain
@RochMollero
@Saduf2019 The user is asking something different. /cc @jaingaurav @wangpengmit Can you give feedback here?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.
Precisely, because you ran only the working version. The non-working version is when you set `return [e+varr, varr3]` (without the `0.0*`).
You're right here. We have some logic that detects which variables are used within a `custom_gradient` function, and it only works when there is an actual variable read (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/eager/pywrap_tfe_src.cc#L3294), not when a variable is simply passed through to the output. Right now we don't have plans to fix this immediately, so I'd recommend just doing `1.0 * varr` or something so that a read is triggered. On the other hand, contributions to fix this are welcome! One suggestion would be to look at the outputs from running the function (…
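Applied to the sketch above, the suggested workaround looks like this (the `1.0 *` multiplication is the only change; the surrounding function body is still an assumption):

```python
import tensorflow as tf

varr = tf.Variable(1.0)
varr3 = tf.Variable(3.0)

@tf.custom_gradient
def fn(x):
    e = tf.exp(x)

    def grad(*dys, variables=None):
        return dys[0] * e, [tf.zeros_like(v) for v in variables]

    # 1.0 * varr3 inserts an actual read (a ReadVariableOp), so the tape
    # now captures varr3 even though it is otherwise just passed through.
    return [e + varr, 1.0 * varr3], grad
```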
I was able to reproduce your issue in TF 2.11; please find the gist here. Thanks!
This issue is stale because it has been open for 180 days with no activity. It will be closed if no further activity occurs. Thank you. |
Describe the current behavior
The "variables" argument in custom grad seems to be buggy. For example this code, which is a complexified version of the "custom_grad" example in the doc:
This leads to the error:
ValueError: Must return gradient for each variable from @custom_gradient grad_fn.
Which I seem to be doing (`log1pexp` has 2 inputs, 2 outputs, and two external variables, so the return should be two lists of size 2 each?).
However, if I change line 23 to:
return [dy * (1 - 1 / (1 + e)) + 0 * varr2, tf.constant(0.0)], [0.0 * varr]
Then it doesn't raise an error, which doesn't make sense to me, since in that case `grad_vars` only contains gradient info for ONE of the external variables I registered via the `variables` parameter.
It also doesn't care about `varr2` at all...
Describe the expected behavior
The code above should not raise an error.
Standalone code to reproduce the issue
See above
Other info / logs
Stock Ubuntu 20; the error occurs with TF 2.2 and 2.3.