Skip to content

Conversation

ezyang
Copy link
Contributor

@ezyang ezyang commented Apr 23, 2024

Stack from ghstack (oldest at bottom):

This is a subset of changes extracted from #124683

This PR contains modifications to make Inductor work with unbacked symbol inputs, which can occur when a data-dependent sized tensor is saved for backwards. The problems to be fixed:

  • When binding initial symbols, we unconditionally bind unbacked symbols (instead of computing if they are needed, which only looks at backed symbols)
  • Benchmark generation code doesn't work with unbacked symints as we have no hints to actually feed in real values. So I pick a random number and you are expected to fix it if it doesn't work
  • Need to make sure we don't install dependencies on unbacked SymInt inputs, that puts us down the "promptly deallocate the input" path, but that's pointless for unbacked SymInt

Fixes #124652

Signed-off-by: Edward Z. Yang ezyang@meta.com

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang

[ghstack-poisoned]
Copy link

pytorch-bot bot commented Apr 23, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/124739

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 7205be7 with merge base a22847a (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

[ghstack-poisoned]
ezyang added 2 commits April 23, 2024 14:37
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
@ezyang
Copy link
Contributor Author

ezyang commented Apr 25, 2024

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Apr 25, 2024
@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Details for Dev Infra team Raised by workflow job

@ezyang
Copy link
Contributor Author

ezyang commented Apr 25, 2024

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

pytorchmergebot pushed a commit that referenced this pull request Apr 25, 2024
#124782)

Previously, unbacked SymInts would gradually get larger and larger as we kept rebinding them.  Now, we do the replacement to preserve the old symbol.

Actually doing this is a bit tricky.  Here’s the order things happen when retracing data dependent:

1. Run fake tensor prop: allocate new unbacked SymInt
2. Run proxy tensor mode, calculate bindings and associate them with FX node
3. Run PropagateUnbackedSymInts, rename unbacked bindings to their old ones so they are consistent

So the problem is when we calculate bindings in step (2), we don't know
what the original names are yet, we only find out later at (3).  But by
the time (3) runs, we've already stuffed some new bindings in
meta["unbacked_bindings"] and we don't know how to update them!  To fix
this, I introduce resolve_unbacked_bindings which post facto applies any
of the renamings we discovered in (3).

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: #124782
Approved by: https://github.com/lezcano
ghstack dependencies: #124310, #124314, #124316, #124394, #124739
pytorchmergebot pushed a commit that referenced this pull request Apr 25, 2024
…124785)

This ensures we have correct SymInts when we allocate tangents.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: #124785
Approved by: https://github.com/lezcano
ghstack dependencies: #124310, #124314, #124316, #124394, #124739, #124782
pytorchmergebot pushed a commit that referenced this pull request Apr 25, 2024
…ass (#123735)

This ensures that first argument to record_shapeenv_event is a ShapeEnv
so we can appropriately short circuit when recording is not in progress.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: #123735
Approved by: https://github.com/ysiraichi, https://github.com/zou3519, https://github.com/albanD
ghstack dependencies: #124310, #124314, #124316, #124394, #124739, #124782, #124785
pytorch-bot bot pushed a commit that referenced this pull request May 3, 2024
This is a subset of changes extracted from #124683

This PR contains modifications to make Inductor work with unbacked symbol inputs, which can occur when a data-dependent sized tensor is saved for backwards. The problems to be fixed:

* When binding initial symbols, we unconditionally bind unbacked symbols (instead of computing if they are needed, which only looks at backed symbols)
* Benchmark generation code doesn't work with unbacked symints as we have no hints to actually feed in real values. So I pick a random number and you are expected to fix it if it doesn't work
* Need to make sure we don't install dependencies on unbacked SymInt inputs, that puts us down the "promptly deallocate the input" path, but that's pointless for unbacked SymInt

Fixes #124652

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: #124739
Approved by: https://github.com/jansel
ghstack dependencies: #124310, #124314, #124316, #124394
petrex pushed a commit to petrex/pytorch that referenced this pull request May 3, 2024
pytorch#124782)

Previously, unbacked SymInts would gradually get larger and larger as we kept rebinding them.  Now, we do the replacement to preserve the old symbol.

Actually doing this is a bit tricky.  Here’s the order things happen when retracing data dependent:

1. Run fake tensor prop: allocate new unbacked SymInt
2. Run proxy tensor mode, calculate bindings and associate them with FX node
3. Run PropagateUnbackedSymInts, rename unbacked bindings to their old ones so they are consistent

So the problem is when we calculate bindings in step (2), we don't know
what the original names are yet, we only find out later at (3).  But by
the time (3) runs, we've already stuffed some new bindings in
meta["unbacked_bindings"] and we don't know how to update them!  To fix
this, I introduce resolve_unbacked_bindings which post facto applies any
of the renamings we discovered in (3).

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: pytorch#124782
Approved by: https://github.com/lezcano
ghstack dependencies: pytorch#124310, pytorch#124314, pytorch#124316, pytorch#124394, pytorch#124739
petrex pushed a commit to petrex/pytorch that referenced this pull request May 3, 2024
…ytorch#124785)

This ensures we have correct SymInts when we allocate tangents.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: pytorch#124785
Approved by: https://github.com/lezcano
ghstack dependencies: pytorch#124310, pytorch#124314, pytorch#124316, pytorch#124394, pytorch#124739, pytorch#124782
pytorch-bot bot pushed a commit that referenced this pull request May 3, 2024
…ass (#123735)

This ensures that first argument to record_shapeenv_event is a ShapeEnv
so we can appropriately short circuit when recording is not in progress.

Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: #123735
Approved by: https://github.com/ysiraichi, https://github.com/zou3519, https://github.com/albanD
ghstack dependencies: #124310, #124314, #124316, #124394, #124739, #124782, #124785
@github-actions github-actions bot deleted the gh/ezyang/2695/head branch June 3, 2024 01:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants