[JIT] Propagate profiled information to DifferentiableGraph outputs #78875
Without profiled outputs, autodiff can't tell whether the outputs of a DifferentiableGraph should have requires_grad set. When there was no profiled information, autodiff defaulted to requires_grad=True, so it marked tensors as requires_grad when they shouldn't have been. This PR adds requires_grad info onto the type of the output, if it can be found in later uses of the output.
Adds a test for correct autodiff requires_grad behavior and a test to make sure the output type is correctly annotated in create_autodiff_subgraphs.
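To make the failure mode concrete, here is a minimal sketch of the defaulting behavior described above. It is not the PyTorch implementation: `std::optional<bool>` stands in for `c10::optional<bool>`, and `effectiveRequiresGrad` is a hypothetical helper name. It assumes C++17.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper: models "default to requires_grad=True when no
// profiled information is available" for a DifferentiableGraph output.
bool effectiveRequiresGrad(std::optional<bool> profiled) {
  // With no profiled value, fall back to true. This fallback is what
  // causes outputs to be marked as requiring grad unnecessarily.
  return profiled.value_or(true);
}

// Usage: an output whose profile says "no grad" only stays that way if
// the profiled value actually reaches the output type.
void effectiveRequiresGradExample() {
  assert(effectiveRequiresGrad(std::nullopt));  // missing info -> true
  assert(!effectiveRequiresGrad(false));        // profiled false is kept
}
```

Under this reading, the change is about making sure the optional is populated from later uses whenever possible, so the fallback is hit less often.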
ghstack-source-id: bc2a5cd Pull Request resolved: #78875
Nice !!!
c10::optional<bool> requiresGrad = c10::nullopt;
for (auto& use : diff_graph->output(i)->uses()) {
  if (use.user->kind() == prim::profile) {
    requiresGrad = getProfileNodeRequiresGrad(use.user);
If a value has multiple uses and one of the uses isn't profiled, we don't want to overwrite the requires_grad already found from another use.
if (use.user->kind() == prim::profile) {
   if (auto req_grad_use = getProfileNodeRequiresGrad(use.user)) {
      requiresGrad = req_grad_use;
   }
}
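A self-contained sketch of the same don't-overwrite idea, for anyone who wants to try it outside the JIT. Everything here is a stand-in: `UseInfo` and `findRequiresGrad` are hypothetical names, `std::optional<bool>` replaces `c10::optional<bool>`, and the `profiled` field plays the role of what `getProfileNodeRequiresGrad` returns. It assumes C++17.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for one use of a value.
struct UseInfo {
  bool is_profile_node;          // models use.user->kind() == prim::profile
  std::optional<bool> profiled;  // models the profiled requires_grad, if any
};

// Scan all uses and keep the requires_grad found on a profile node.
// A use without profiled info never overwrites an earlier result.
std::optional<bool> findRequiresGrad(const std::vector<UseInfo>& uses) {
  std::optional<bool> requires_grad = std::nullopt;
  for (const auto& use : uses) {
    if (!use.is_profile_node) {
      continue;
    }
    if (use.profiled.has_value()) {
      requires_grad = use.profiled;
    }
  }
  return requires_grad;
}

void findRequiresGradExample() {
  // The second profile node has no info; the first result survives.
  std::vector<UseInfo> uses = {{true, false}, {true, std::nullopt}};
  assert(findRequiresGrad(uses) == std::optional<bool>(false));
}
```

With a plain assignment instead of the `has_value()` check, the example above would end up with an empty optional, which is the overwrite this comment is warning about.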
if (dg_use.user->kind() == prim::profile) {
  requiresGrad = getProfileNodeRequiresGrad(dg_use.user);
  if (requiresGrad) {
    break;
nit: we are doing the early break for if (requiresGrad) in this loop but not the loop above
Value* dg_value = dg->inputs()[use.offset];
for (auto& dg_use : dg_value->uses()) {
  if (dg_use.user->kind() == prim::profile) {
    requiresGrad = getProfileNodeRequiresGrad(dg_use.user);
Maybe also do the same don't-overwrite check as above.
ghstack-source-id: d60714a Pull Request resolved: #78875
ghstack-source-id: 9b338c6 Pull Request resolved: #78875
@pytorchbot merge
@pytorchbot successfully started a merge job. Check the current status here
Hey @davidberard98.
…78875)
Pull Request resolved: #78875
Approved by: https://github.com/eellison
Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/1d2a6c2e94ddda1c1e5bd611796f0775c26ff456
Reviewed By: osalpekar
Differential Revision: D37085552
Pulled By: davidberard98
fbshipit-source-id: 188d8149e85dffb02d98f409cf08110c7fec9c14
@pytorchbot help
PyTorchBot Help: Merge, Revert, Rebase. For more info, consult the wiki.
@pytorchbot revert -m "Internal failures were bisected to this change" -c ghfirst
@pytorchbot successfully started a revert job. Check the current status here
…utputs" This reverts commit 1d2a6c2. Reverted #78875 on behalf of https://github.com/davidberard98 due to Internal failures were bisected to this change
…utputs"
Summary: This reverts commit 1d2a6c2. Reverted #78875 on behalf of https://github.com/davidberard98 due to Internal failures were bisected to this change
Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/5413580f9e6ce1d490d0b137595574befd28d5f4
Reviewed By: osalpekar, yuchenhao
Differential Revision: D37114173
fbshipit-source-id: f1256b2a317ed2b5c8979ef558412494208ab126