Add Post Freezing Optimizations, turn on by default in torch.jit.freeze #50222

eellison · 2021-01-07T20:00:20Z

Stack from ghstack:

Re-enable no-op peepholes #50241
Add Post Freezing Optimizations, turn on by default in torch.jit.freeze #50222
Peephole Optimize out conv(x).dim(), which prevents BN fusion #50221
[JIT] Factor out peephole to own test file #50220
[JIT] Add Frozen Conv-> Add/Sub/Mul/Div fusion #50075
[JIT] Frozen Graph Conv-BN fusion #50074

This PR adds a pass that runs a set of optimizations after freezing. Currently this covers Conv-BN folding and Conv->Add/Sub/Mul/Div folding, and I'm also planning to add dropout removal.
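
For readers unfamiliar with Conv-BN folding, the arithmetic behind it can be sketched in plain Python. This is only an illustration of the general technique, not the code in this PR: it assumes a 1x1 convolution evaluated at a single pixel (so the convolution reduces to one dot product per output channel) and batch norm in inference mode with fixed running statistics. All function names and sample values below are hypothetical.

```python
import math

def conv1x1(weight, bias, x):
    """1x1 convolution at one pixel: a dot product per output channel, plus bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]

def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch norm applied per channel with fixed statistics."""
    return [g * (v - m) / math.sqrt(s + eps) + bt
            for v, g, bt, m, s in zip(y, gamma, beta, mean, var)]

def fold_bn_into_conv(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Return conv parameters that reproduce conv followed by batch norm."""
    # Per-output-channel scale that batch norm applies to the conv output.
    scale = [g / math.sqrt(s + eps) for g, s in zip(gamma, var)]
    new_weight = [[w * sc for w in row] for row, sc in zip(weight, scale)]
    new_bias = [(b - m) * sc + bt for b, m, sc, bt in zip(bias, mean, scale, beta)]
    return new_weight, new_bias

# Illustrative values only (2 input channels, 2 output channels).
weight = [[0.5, -1.0], [2.0, 0.25]]
bias = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.0, 0.3]
mean, var = [0.2, -0.1], [1.0, 4.0]
x = [1.0, 2.0]

unfused = batch_norm(conv1x1(weight, bias, x), gamma, beta, mean, var)
folded_weight, folded_bias = fold_bn_into_conv(weight, bias, gamma, beta, mean, var)
fused = conv1x1(folded_weight, folded_bias, x)
```

Here `unfused` and `fused` agree up to floating-point rounding. The fold is only valid once the parameters are constants, which is why it fits naturally after freezing.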

I would like some feedback on the API. torch.jit.freeze is technically in the ~prototype~ phase, so we have some leeway to make changes. I think that in the majority of cases the user will want to freeze their model and then run inference. I would prefer the optimization to be opt-out rather than opt-in. All internal/framework use cases of freezing use freeze_module, not the Python API, so this shouldn't break anything.

I have split the optimization pass out into a separate API to keep things potentially modular, even though I suspect that calling it on its own is an unlikely case. In a future PR I would like to add a torch::jit::freeze, intended for C++ use, that follows the same API as torch.jit.freeze and runs the optimizations.

Differential Revision: D25856264

[ghstack-poisoned]

facebook-github-bot · 2021-01-07T20:00:44Z

💊 CI failures summary and remediations

As of commit 8896c3e (more details on the Dr. CI page):

2/2 failures possibly* introduced in this PR
- 2/2 non-CircleCI failure(s)

Extra GitHub checks: 1 failed

Failed: GitHub Actions - clang-format

This comment was automatically generated by Dr. CI (expand for details).

This comment has been revised 72 times.

bzinodev

How does it interact with graph mode quantization? And with mobile? I suggest adding a couple of tests.

eellison · 2021-01-08T19:23:31Z

It doesn't, because they all use _freeze_module, not the torch.jit.freeze API. If they did, tests would break.

ZolotukhinM

Looks good to me!

ZolotukhinM · 2021-01-08T20:52:47Z

torch/csrc/jit/passes/frozen_graph_optimizations.cpp

+namespace jit {
+
+void OptimizeFrozenGraph(std::shared_ptr<Graph>& graph) {
+  // run a couple times to capture Conv -> Mul -> Add etc


It might be worth rerunning this for as long as something changes (with some threshold so that it does not run for too long).

This would be nice, but I'm not sure how relevant it is. I might do it as a follow-up.

I mentioned this simply because it's a common pattern in compilers. I agree with you that it's not clear whether it matters here.
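
The "run while something changed, up to a threshold" pattern discussed above can be sketched as follows. This is a toy model, not the actual `OptimizeFrozenGraph` implementation: the graph is just a list of op names, `fold_once` is a hypothetical pass that folds at most one arithmetic op into a preceding conv per call, and `max_iters` is an assumed cap standing in for the threshold.

```python
FOLDABLE = {"add", "sub", "mul", "div"}

def fold_once(ops):
    """Fold at most one arithmetic op that directly follows a conv.

    Returns (new_ops, changed). Folding is modeled by dropping the op,
    as if its effect had been absorbed into the conv's parameters.
    """
    for i in range(len(ops) - 1):
        if ops[i] == "conv" and ops[i + 1] in FOLDABLE:
            return ops[: i + 1] + ops[i + 2:], True
    return list(ops), False

def run_to_fixed_point(ops, max_iters=10):
    """Rerun the pass until nothing changes or the iteration cap is hit."""
    passes = 0
    changed = True
    while changed and passes < max_iters:
        ops, changed = fold_once(ops)
        passes += 1
    return ops, passes

graph = ["conv", "mul", "add", "relu"]
optimized, passes_run = run_to_fixed_point(graph)
```

For a Conv -> Mul -> Add chain the loop needs one pass per folded op plus a final pass that detects no change, whereas a fixed repeat count has to guess the longest chain in advance.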

ZolotukhinM · 2021-01-08T20:54:54Z

torch/jit/_freeze.py

@@ -10,7 +10,7 @@
 from torch.jit._script import RecursiveScriptModule, ScriptModule


-def freeze(mod, preserved_attrs: Optional[List[str]] = None):
+def freeze(mod, preserved_attrs: Optional[List[str]] = None, optimize: bool = True):


It might be easier to land this in two steps: 1) add a new flag with the new functionality, but disabled by default, 2) flip the default. This way if something breaks, only the flag switch would be reverted.
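
The shape of an opt-out flag like the one in the diff above can be sketched with stand-ins. Nothing here calls PyTorch: `freeze_only` and `optimize_frozen` are hypothetical placeholders for the freezing step and the separate post-freezing pass, and the "module" is simply a list of op names. Only the `freeze` signature mirrors the diff.

```python
from typing import List, Optional

def freeze_only(ops: List[str]) -> List[str]:
    """Stand-in for freezing: here it just copies the op list."""
    return list(ops)

def optimize_frozen(ops: List[str]) -> List[str]:
    """Stand-in for the post-freezing pass: drops batch_norm as if it were folded."""
    return [op for op in ops if op != "batch_norm"]

def freeze(ops: List[str],
           preserved_attrs: Optional[List[str]] = None,
           optimize: bool = True) -> List[str]:
    """Freeze, then optimize unless the caller opts out.

    preserved_attrs is accepted only to mirror the signature in the diff;
    this toy version ignores it.
    """
    frozen = freeze_only(ops)
    return optimize_frozen(frozen) if optimize else frozen

model = ["conv", "batch_norm", "relu"]
default_result = freeze(model)                   # optimization on by default
opted_out_result = freeze(model, optimize=False)  # caller opts out
```

Because the flag only gates a call to a separate function, flipping the default is a one-line change, which is what makes the two-step landing suggested above cheap to revert.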

facebook-github-bot · 2021-01-12T19:39:47Z

@eellison merged this pull request in a389b30.

xsacha · 2021-02-08T21:34:48Z

@eellison as a suggestion for another post freezing optimisation: you could convert the dtype of the weights in instances such as AMP. Currently you end up with a torchscript file containing 32-bit weights and script that converts specific weights to 16-bit wherever they are used.

eellison · 2021-02-08T23:23:48Z

@xsacha thanks for the suggestion! Do you have a repro by any chance?

Add Post Freezing Optimizations, turn on by default in torch.jit.freeze

7521aa6

[ghstack-poisoned]

This was referenced Jan 7, 2021

[JIT] Frozen Graph Conv-BN fusion #50074

Closed

[JIT] Add Frozen Conv-> Add/Sub/Mul/Div fusion #50075

Closed

[JIT] Factor out peephole to own test file #50220

Closed

Peephole Optimize out conv(x).dim(), which prevents BN fusion #50221

Closed

facebook-github-bot added the cla signed and oncall: jit labels Jan 7, 2021

Update on "Add Post Freezing Optimizations, turn on by default in torch.jit.freeze"

5d581fa

[ghstack-poisoned]

eellison requested review from Krovatkin, ZolotukhinM and bertmaher January 7, 2021 20:30

eellison pushed a commit that referenced this pull request Jan 7, 2021

Add Post Freezing Optimizations, turn on by default in torch.jit.freeze

22b5757

ghstack-source-id: 5acbb012b21c13cc7b6817f8ca6e0d20a616d8ab Pull Request resolved: #50222

eellison pushed a commit that referenced this pull request Jan 7, 2021

Add Post Freezing Optimizations, turn on by default in torch.jit.freeze

15770be

ghstack-source-id: 61c598b050a62772117920d85c54ddbc06edd914 Pull Request resolved: #50222

eellison pushed a commit that referenced this pull request Jan 7, 2021

Add Post Freezing Optimizations, turn on by default in torch.jit.freeze

e1613e6

ghstack-source-id: 75fece97a678d3ab95c4d4a0c251ffdc499d1f1f Pull Request resolved: #50222

eellison mentioned this pull request Jan 7, 2021

Re-enable no-op peepholes #50241

Closed

bzinodev reviewed Jan 8, 2021

eellison requested a review from bzinodev January 8, 2021 19:35

ZolotukhinM approved these changes Jan 8, 2021

eellison added 2 commits January 8, 2021 12:59

facebook-github-bot closed this in a389b30 Jan 12, 2021

facebook-github-bot added the Merged label Jan 12, 2021

facebook-github-bot deleted the gh/eellison/153/head branch January 16, 2021 15:18
