
Inductor Freezing #100652

Closed
wants to merge 31 commits into from

Conversation

@eellison (Contributor) commented May 4, 2023

Stack from ghstack (oldest at bottom):

Adds a freezing pass, gated by the inductor flag `config.freezing`, that constant folds parameters. This runs post-functionalization in AOT Autograd, so it both captures dispatching and allows further passes to run on the functionalized graph. A few notes:

  • There is an option, `config.freezing_discard_parameters`, which takes the current eager modules and wraps their parameters in a Tensor subclass that errors if used (see the sketch after this list).
  • I needed to expose flat_params in aot_autograd in order to discard old references when we constant fold away parameters, as with amp. I also exposed `fw_metadata` to avoid constant folding mutated parameters.
  • Caching parameter transformations/constant folding across different inference runs is not yet implemented.
  • Checking the version_counter of constant-folded params is not yet implemented.
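
A minimal sketch of what the erroring subclass could look like (illustrative only; the class name and error message are assumptions, not this PR's actual implementation):

```python
import torch

class ErasedParameter(torch.Tensor):
    # Hypothetical wrapper for a parameter whose storage has been
    # constant-folded into the compiled graph: it keeps only metadata
    # (shape/dtype/device) and errors on any use.
    @staticmethod
    def __new__(cls, elem):
        return torch.Tensor._make_wrapper_subclass(
            cls, elem.shape, dtype=elem.dtype, device=elem.device
        )

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        raise RuntimeError(
            f"Tried to run {func} on a parameter discarded by freezing "
            "(config.freezing_discard_parameters); its storage was "
            "constant-folded into the compiled graph."
        )
```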

I'm not really sure what the actual naming should be. In JIT there was both "freezing", which was platform-agnostic, and "optimize for inference", which made device-specific optimizations. We're doing the latter here, but maybe freezing is a better name.
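
For reference, enabling the pass might look like the following. This is a usage sketch based on the flag names above, not documented API:

```python
import torch
import torch._inductor.config as inductor_config

# Flags as described in this PR (assumed spelling).
inductor_config.freezing = True
inductor_config.freezing_discard_parameters = True  # optional: drop eager params

model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU()).eval()
compiled = torch.compile(model)

with torch.no_grad():
    out = compiled(torch.randn(2, 16))  # parameters constant-folded on first run
```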

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @ngimel @yf225 @aakhundov @soumith @desertfire

Differential Revision: D46244033

@pytorch-bot bot commented May 4, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/100652

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 6 Unrelated Failures

As of commit 17a09e6 with merge base f37be77:

NEW FAILURE - The following job has failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorchmergebot pushed a commit that referenced this pull request May 8, 2023
Currently, the packed op doesn't support autocast, and the packing path ran before AOTAutograd, which changed the default autocast behavior. For now we disable the packing path; the bfloat16 packing path can work once it is moved after AOTAutograd (I will do that after #100652 is done).

Pull Request resolved: #100844
Approved by: https://github.com/jgong5, https://github.com/jansel
kiersten-stokes pushed a commit to kiersten-stokes/pytorch that referenced this pull request May 8, 2023 (pytorch#100844)
Gives a 1% boost on hf_Bert inference.
@eellison changed the title from Inductor Optimize For Inference/Freezing to Inductor Optimize For Freezing May 16, 2023
@eellison changed the title from Inductor Optimize For Freezing to Inductor Freezing May 16, 2023
eellison added a commit that referenced this pull request May 16, 2023
ghstack-source-id: 0ebf73bfa14ebebe376a2bba0d1acf0564379d44
Pull Request resolved: #100652

eellison added a commit that referenced this pull request May 16, 2023
ghstack-source-id: b5dc8c70ee1b6fcd02d2d2b4a1c9c1b9ed4af9db
Pull Request resolved: #100652

eellison added a commit that referenced this pull request May 16, 2023
ghstack-source-id: 2d504f26845f6c89b892a23ac48376a002894f51
Pull Request resolved: #100652
@eellison (Contributor, Author) commented Jun 9, 2023

@pytorchbot merge

@pytorch-bot bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) Jun 9, 2023
@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@pytorchmergebot (Collaborator)

Merge failed

Reason: 1 job has failed: trunk / win-vs2019-cpu-py3 / test (default, 3, 3, windows.4xlarge.nonephemeral)

Details for Dev Infra team: raised by workflow job.


@eellison (Contributor, Author)

@pytorchbot merge -f "unrelated failures"

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@osalpekar (Member)

@pytorchbot revert -m "This seems to be breaking test_aliased_param_return_cpu on trunk. See for more details: https://www.torch-ci.com/pytorch/pytorch/commit/d083d444ff41cfb2352f4f5e1780c1b9a2126049" -c landrace

@pytorchmergebot (Collaborator)

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot (Collaborator)

Can't revert PR that was landed via phabricator as D46244033. Please revert by going to the internal diff and clicking Unland.

@eellison eellison reopened this Jun 12, 2023
@eellison eellison closed this Jun 12, 2023
@eellison (Contributor, Author)

Test disabled here: #103466

@huydhn (Contributor) commented Jun 12, 2023

There are now other failing tests besides the one disabled above (https://hud.pytorch.org/pytorch/pytorch/commit/c3d3165f16dccd88872139b72cd421e0ceafdd9b), and the diff hasn't been landed internally yet. Should we submit a revert PR or disable the whole test_inductor_freezing file?

@facebook-github-bot facebook-github-bot deleted the gh/eellison/439/head branch June 16, 2023 14:16
pytorchmergebot pushed a commit that referenced this pull request Jun 28, 2023
During a revert, use the title of the "Meta Internal-Only Changes Check" to determine whether an internal diff is associated with the PR. When a PR is merged/closed, the "Meta Internal-Only Changes Check" status is always success, but the title message can differ (a sketch of the resulting decision follows this list):
- "There is no internal Diff connected, this can be merged now" means there is no internal change associated with the PR (or it was landed via the GitHub-first workflow).
- "The internal Diff has landed, this can be merged now" means the PR has an associated internal diff, and OSS and internal reverts must happen in sync using internal tooling (or a revert PR can be authored in OSS).

Add a regression test for #100652, which originated from an internal diff but was merged as an OSS PR.

Fixes #104232

Pull Request resolved: #104344
Approved by: https://github.com/bigfootjon, https://github.com/huydhn