
Conversation

… when computing differentiable outputs that alias each other

[ghstack-poisoned]

pytorch-bot bot commented Dec 7, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/115315

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 3b51fe6 with merge base f591933:

UNSTABLE - The following job failed but was likely due to flakiness present on trunk and has been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

…ot_autograd when computing differentiable outputs that alias each other"

[ghstack-poisoned]
voznesenskym added a commit that referenced this pull request Dec 7, 2023
… when computing differentiable outputs that alias each other

ghstack-source-id: 7e9e529
Pull Request resolved: #115315

finger moved line
@voznesenskym voznesenskym added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 7, 2023
intermediate_base_tensor_id_to_output_idx: Dict[int, int] = {}
intermediate_bases: List[torch.Tensor] = []
# Why do we care if storage changed?
# There is a really care class of situations, which basically only happen with something
Collaborator Author

care -> rare
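
A minimal sketch (not code from this PR; the function and variable names are illustrative) of the situation the excerpt above deals with: a compiled function whose differentiable outputs alias each other through a shared intermediate base, which is why aot_autograd has to notice when that base's storage changes.

```python
import torch

# Hedged illustration: two outputs that are views of the same intermediate
# tensor ("base"), so they alias each other and share an intermediate base
# that aot_autograd must track for autograd purposes.
@torch.compile
def f(x):
    base = x * 2                  # intermediate tensor created inside the graph
    return base[:2], base[2:]     # two differentiable outputs aliasing `base`

a, b = f(torch.randn(4, requires_grad=True))
(a.sum() + b.sum()).backward()    # gradients flow back through the shared base
```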

#
# return out
#
# Esentially, what his code does is calls set_() with no_grad() - aka, our simulation
Collaborator Author

Mention the unsafe autograd preservation too, and that this is what fsdp does lol

#
# return out
#
# Esentially, what his code does is calls set_() with no_grad() - aka, our simulation
Collaborator Author

his -> this
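
As a hedged aside (illustrative names, not the PR's code or FSDP's actual implementation), the pattern the comments above refer to, calling set_() under no_grad() to swap a tensor's storage while leaving its autograd state untouched, roughly what FSDP does with its flat_param, looks like this:

```python
import torch

# Hedged sketch: swap a parameter's storage with set_() inside no_grad(),
# the "simulation" the reviewed comment mentions. Autograd metadata
# (requires_grad, any accumulated .grad) is preserved across the swap.
param = torch.nn.Parameter(torch.randn(8))
gathered = torch.randn(8)   # e.g. a freshly all-gathered full parameter

with torch.no_grad():
    param.set_(gathered)    # storage now aliases `gathered`

print(param.requires_grad)  # still True: autograd state untouched
```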

@voznesenskym voznesenskym marked this pull request as draft December 7, 2023 09:27
@voznesenskym
Collaborator Author

o needs a type check

@albanD albanD removed their request for review December 7, 2023 19:28
…ot_autograd when computing differentiable outputs that alias each other"

[ghstack-poisoned]
voznesenskym added a commit that referenced this pull request Dec 7, 2023
… when computing differentiable outputs that alias each other

ghstack-source-id: 6062d66
Pull Request resolved: #115315

finger moved line

Fixes
…ot_autograd when computing differentiable outputs that alias each other"

[ghstack-poisoned]
…ot_autograd when computing differentiable outputs that alias each other"

cc penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 aakhundov kadeng

[ghstack-poisoned]
voznesenskym added a commit that referenced this pull request Dec 8, 2023
… when computing differentiable outputs that alias each other

ghstack-source-id: 937d4d4
Pull Request resolved: #115315

finger moved line

Fixes

Fix

reword
@voznesenskym voznesenskym marked this pull request as ready for review December 8, 2023 03:31
@github-actions github-actions bot requested a review from albanD December 8, 2023 03:31
@albanD albanD removed their request for review December 8, 2023 15:36
@pytorchmergebot
Collaborator

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team (raised by workflow job)

@voznesenskym
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here.

pytorchmergebot pushed a commit that referenced this pull request Dec 13, 2023
…ok on flat_param (#112184)

Pull Request resolved: #112184
Approved by: https://github.com/albanD
ghstack dependencies: #115315
@facebook-github-bot facebook-github-bot deleted the gh/voznesenskym/290/head branch December 16, 2023 15:26
pytorchmergebot pushed a commit that referenced this pull request Dec 16, 2023
Support for something we need for both FSDP and optimizers. For sourced args that are not inputs (params, etc) - we use the dynamic_getattr flow on tensors. This soundly handles the storage and registration and guarding downstream of tensor_wrap for the grad values. For non sourced (true intermediates), we only support None (the idea being that if we have a true intermediate in the graph with grad, we are already doing something weird).
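
For context, a minimal sketch (names are illustrative, not taken from #115898) of the case this commit message describes: a compiled function that reads .grad on a parameter reaching the graph through a source (a global or module attribute) rather than as an explicit input.

```python
import torch

# Hedged example: `param` is a sourced arg (a captured global), not a function
# input, and its .grad is read inside the compiled region via the dynamic
# getattr path described above.
param = torch.nn.Parameter(torch.randn(4))
param.grad = torch.ones(4)          # gradient populated outside the region

@torch.compile
def apply_grad(lr):
    return param - lr * param.grad  # reads .grad on a non-input tensor

print(apply_grad(0.1))
```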

Pull Request resolved: #115898
Approved by: https://github.com/bdhirsh
ghstack dependencies: #115315, #112184
guilhermeleobas pushed a commit to guilhermeleobas/pytorch that referenced this pull request Dec 18, 2023
… when computing differentiable outputs that alias each other (pytorch#115315)

Pull Request resolved: pytorch#115315
Approved by: https://github.com/bdhirsh
guilhermeleobas pushed a commit to guilhermeleobas/pytorch that referenced this pull request Dec 18, 2023
guilhermeleobas pushed a commit to guilhermeleobas/pytorch that referenced this pull request Dec 18, 2023
Support for something we need for both FSDP and optimizers. For sourced args that are not inputs (params, etc) - we use the dynamic_getattr flow on tensors. This soundly handles the storage and registration and guarding downstream of tensor_wrap for the grad values. For non sourced (true intermediates), we only support None (the idea being that if we have a true intermediate in the graph with grad, we are already doing something weird).

Pull Request resolved: pytorch#115898
Approved by: https://github.com/bdhirsh
ghstack dependencies: pytorch#115315, pytorch#112184
@voznesenskym voznesenskym restored the gh/voznesenskym/290/head branch December 18, 2023 18:35
@facebook-github-bot facebook-github-bot deleted the gh/voznesenskym/290/head branch December 19, 2023 15:27
dmenig pushed a commit to dmenig/pytorch that referenced this pull request Dec 21, 2023
… when computing differentiable outputs that alias each other (pytorch#115315)

Pull Request resolved: pytorch#115315
Approved by: https://github.com/bdhirsh
dmenig pushed a commit to dmenig/pytorch that referenced this pull request Dec 21, 2023
dmenig pushed a commit to dmenig/pytorch that referenced this pull request Dec 21, 2023
Support for something we need for both FSDP and optimizers. For sourced args that are not inputs (params, etc) - we use the dynamic_getattr flow on tensors. This soundly handles the storage and registration and guarding downstream of tensor_wrap for the grad values. For non sourced (true intermediates), we only support None (the idea being that if we have a true intermediate in the graph with grad, we are already doing something weird).

Pull Request resolved: pytorch#115898
Approved by: https://github.com/bdhirsh
ghstack dependencies: pytorch#115315, pytorch#112184