Add python API for get_gradients() method. #28926
Conversation
The get_gradients method was previously a pybind-only method with no user-facing documentation. This PR moves it into the Python distributed autograd API and adds appropriate documentation for it.

Differential Revision: [D18234443](https://our.internmc.facebook.com/intern/diff/D18234443/)
Pull Request resolved: #28926
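For context, a minimal sketch of how the resulting Python API is typically used, based on the `torch.distributed.autograd` documentation; the exact `backward()`/`get_gradients()` signatures shown here are assumptions drawn from later PyTorch releases and may differ slightly from the version in this PR.

```python
# Minimal sketch (assumed API, based on torch.distributed.autograd docs):
# run a distributed backward pass and read the gradients recorded in the
# distributed autograd context via get_gradients().
import os
import torch
import torch.distributed.rpc as rpc
import torch.distributed.autograd as dist_autograd

os.environ.setdefault("MASTER_ADDR", "localhost")
os.environ.setdefault("MASTER_PORT", "29500")
rpc.init_rpc("worker0", rank=0, world_size=1)  # single-process example

t1 = torch.rand((3, 3), requires_grad=True)
t2 = torch.rand((3, 3), requires_grad=True)

with dist_autograd.context() as context_id:
    loss = (t1 + t2).sum()
    # Gradients are recorded in the context, not accumulated into t1.grad / t2.grad.
    dist_autograd.backward(context_id, [loss])
    # get_gradients() returns a Dict[Tensor, Tensor] for this context.
    grads = dist_autograd.get_gradients(context_id)
    print(grads[t1], grads[t2])

rpc.shutdown()
```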
Not a comment on the PR itself.

Why is this map keyed by tensors? Reason I ask: in autograd, `backward` causes the grads to be accumulated in the `grad` member field of tensors with `requires_grad=True`. If you want access to the grads without accumulating them (i.e. compute them without side effects), you can use the `torch.autograd.grad` function. Similarly, I would expect `dist_autograd.backward` to accumulate grads in the `grad` member fields.
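To make that distinction concrete, here is a small sketch using plain (non-distributed) autograd; this is standard PyTorch behavior, not code from this PR.

```python
# Plain autograd: backward() accumulates into .grad, while torch.autograd.grad()
# returns gradients without touching .grad (no side effects).
import torch

x = torch.ones(3, requires_grad=True)
(x * 2).sum().backward()
(x * 2).sum().backward()
print(x.grad)  # tensor([4., 4., 4.]) -- accumulated over two backward passes

y = torch.ones(3, requires_grad=True)
(grad_y,) = torch.autograd.grad((y * 2).sum(), y)
print(grad_y)  # tensor([2., 2., 2.])
print(y.grad)  # None -- nothing was accumulated on the tensor
```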
@pietern The reason we don't accumulate grads on the …
You don't have to do this to attach documentation. Take a look at what we do in …
@ezyang If we have a handful of methods, wouldn't it be better to attach the documentation as we've done in this PR? This way it's easier for developers navigating the code to quickly see the documentation inline.
Making Python wrappers for C functions adds nontrivial overhead; it's one of the main reasons we've done it this way for regular torch functions. So in this case we decided to give up "there is always a `def ...` for every Python function" in the name of performance.
I agree this makes a lot of sense for the tensor methods. Although, in this case, isn't each method called only once per forward and backward pass? Do you feel the performance overhead would still be significant?
Yes, you might be fine. So the request to do it the other way is more of a hygiene thing (so that someone else doesn't come along and copy what you did here for a case where the performance does matter).
@pritamdamania87 If I remember correctly, if you pass a char* as the 3rd argument of the module `def`, it gets attached as the function's docstring.
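A hypothetical way to check the result of that approach (not from this PR): a docstring attached at binding time shows up directly on the bound function, so no Python-level wrapper is needed for `help()` to work.

```python
# Hypothetical check: a docstring attached when the function is bound in C++
# is visible on the raw function object, without any Python `def` wrapper.
import torch.distributed.autograd as dist_autograd

print(dist_autograd.get_gradients.__doc__)
help(dist_autograd.get_gradients)
```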
LGTM
This pull request has been merged in 17b0ab4.
Stack from ghstack:
The get_gradients method was previously a pybind-only method with no user-facing documentation. This PR moves it into the Python distributed autograd API and adds appropriate documentation for it.
Differential Revision: D18234443