[quant][graphmode][fx] Support quantization for standalone module #44074
Conversation
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
💊 CI failures summary and remediations
As of commit 7dd90f4 (more details on the Dr. CI page):
💚 💚 Looks good so far! There are no failures yet. 💚 💚
This comment was automatically generated by Dr. CI.
…ule" Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
…ule" Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: 731a64a128fb5851c9af779b77fc3426319a94ec Pull Request resolved: #44074
…ule" Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: afdeb1374c9bf0a528ce404616faadfd5f05f77f Pull Request resolved: #44074
…ule" Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
…ule" Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: 47dae57c936d9bae8e2f83485d7ac7c5ef31fb74 Pull Request resolved: #44074
Codecov Report

```
@@             Coverage Diff              @@
##   gh/jerryzh168/428/base   #44074   +/- ##
=============================================
  Coverage          ?   68.61%
=============================================
  Files             ?      406
  Lines             ?    52072
  Branches          ?        0
=============================================
  Hits              ?    35729
  Misses            ?    16343
  Partials          ?        0
```

Continue to review full report at Codecov.
…ule" Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
…ule" Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
…ule" Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
…ule" Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
…ule" Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
…ule" Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
… module" Summary: Sometimes user need to quantize a submodule as one unit, and this submodule will be lowered to a different backend like accelerator. The submodule will be quantized with the same fx based graph mode quantization functions and will be connected with the rest of the model automatically. APIs: ```python class StandaloneModule(torch.nn.Module): def __init__(self): super().__init__() self.conv = torch.nn.Conv2d(1, 1, 1) def forward(self, x): return self.conv(x) class CustomTracer(Tracer): def is_leaf_module(self, m, module_qualified_name): return (m.__module__.startswith('torch.nn') and not isinstance(m, torch.nn.Sequential)) or \ isinstance(m, StandaloneModule) register_traceable_module_class(StandaloneModule) m = ModelThatUsesStandaloneModule() m = prepare_fx(m, qconfig_dict) calibrate(m, data) m = convert_fx(m) m.standalone = lower_to_acclerator(m.standalone) ``` Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
I think we can also put it in qconfig_dict; would that make sense?
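For concreteness, a sketch of what that could look like; this is the shape the later revision of the commit message below adopts:

```python
# standalone submodules named directly in the same qconfig_dict
# that is passed to prepare_fx
qconfig_dict = {
    "": qconfig,                               # global qconfig
    "standalone_module_name": ["standalone"],  # quantize m.standalone as one unit
}
```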
… module" Summary: Sometimes user need to quantize a submodule as one unit, and this submodule will be lowered to a different backend like accelerator. The submodule will be quantized with the same fx based graph mode quantization functions and will be connected with the rest of the model automatically. APIs: ```python class StandaloneModule(torch.nn.Module): def __init__(self): super().__init__() self.conv = torch.nn.Conv2d(1, 1, 1) def forward(self, x): return self.conv(x) class CustomTracer(Tracer): def is_leaf_module(self, m, module_qualified_name): return (m.__module__.startswith('torch.nn') and not isinstance(m, torch.nn.Sequential)) or \ isinstance(m, StandaloneModule) register_traceable_module_class(StandaloneModule) m = ModelThatUsesStandaloneModule() m = prepare_fx(m, qconfig_dict) calibrate(m, data) m = convert_fx(m) m.standalone = lower_to_acclerator(m.standalone) ``` Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
… module" Summary: Sometimes user need to quantize a submodule as one unit, and this submodule will be lowered to a different backend like accelerator. The submodule will be quantized with the same fx based graph mode quantization functions and will be connected with the rest of the model automatically. APIs: ```python class StandaloneModule(torch.nn.Module): def __init__(self): super().__init__() self.conv = torch.nn.Conv2d(1, 1, 1) def forward(self, x): return self.conv(x) class CustomTracer(Tracer): def is_leaf_module(self, m, module_qualified_name): return (m.__module__.startswith('torch.nn') and not isinstance(m, torch.nn.Sequential)) or \ isinstance(m, StandaloneModule) register_traceable_module_class(StandaloneModule) m = ModelThatUsesStandaloneModule() m = prepare_fx(m, qconfig_dict) calibrate(m, data) m = convert_fx(m) m.standalone = lower_to_acclerator(m.standalone) ``` Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
… module" Summary: Sometimes user need to quantize a submodule as one unit, and this submodule will be lowered to a different backend like accelerator. The submodule will be quantized with the same fx based graph mode quantization functions and will be connected with the rest of the model automatically. APIs: ```python class StandaloneModule(torch.nn.Module): def __init__(self): super().__init__() self.conv = torch.nn.Conv2d(1, 1, 1) def forward(self, x): return self.conv(x) class CustomTracer(Tracer): def is_leaf_module(self, m, module_qualified_name): return (m.__module__.startswith('torch.nn') and not isinstance(m, torch.nn.Sequential)) or \ isinstance(m, StandaloneModule) register_traceable_module_class(StandaloneModule) m = ModelThatUsesStandaloneModule() m = prepare_fx(m, qconfig_dict) calibrate(m, data) m = convert_fx(m) m.standalone = lower_to_acclerator(m.standalone) ``` Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
… module" Summary: Sometimes user need to quantize a submodule as one unit, and this submodule will be lowered to a different backend like accelerator. The submodule will be quantized with the same fx based graph mode quantization functions and will be connected with the rest of the model automatically. APIs: ```python class StandaloneModule(torch.nn.Module): def __init__(self): super().__init__() self.conv = torch.nn.Conv2d(1, 1, 1) def forward(self, x): return self.conv(x) class CustomTracer(Tracer): def is_leaf_module(self, m, module_qualified_name): return (m.__module__.startswith('torch.nn') and not isinstance(m, torch.nn.Sequential)) or \ isinstance(m, StandaloneModule) class ModelThatUsesStandaloneModule(...): def __init__(self): super().__init__() self.standalone = StandaloneModule() def forward(self, x): return self.standalone(x) m = ModelThatUsesStandaloneModule() qconfig_dict = {"": qconfig, "standalone_module_name": ["standalone"]} m = prepare_fx(m, qconfig_dict) calibrate(m, data) m = convert_fx(m) m.standalone = lower_to_acclerator(m.standalone) ``` Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
… module" Summary: Sometimes user need to quantize a submodule as one unit, and this submodule will be lowered to a different backend like accelerator. The submodule will be quantized with the same fx based graph mode quantization functions and will be connected with the rest of the model automatically. APIs: ```python class StandaloneModule(torch.nn.Module): def __init__(self): super().__init__() self.conv = torch.nn.Conv2d(1, 1, 1) def forward(self, x): return self.conv(x) class CustomTracer(Tracer): def is_leaf_module(self, m, module_qualified_name): return (m.__module__.startswith('torch.nn') and not isinstance(m, torch.nn.Sequential)) or \ isinstance(m, StandaloneModule) class ModelThatUsesStandaloneModule(...): def __init__(self): super().__init__() self.standalone = StandaloneModule() def forward(self, x): return self.standalone(x) m = ModelThatUsesStandaloneModule() qconfig_dict = {"": qconfig, "standalone_module_name": ["standalone"]} m = prepare_fx(m, qconfig_dict) calibrate(m, data) m = convert_fx(m) m.standalone = lower_to_acclerator(m.standalone) ``` Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
… module" Summary: Sometimes user need to quantize a submodule as one unit, and this submodule will be lowered to a different backend like accelerator. The submodule will be quantized with the same fx based graph mode quantization functions and will be connected with the rest of the model automatically. APIs: ```python class StandaloneModule(torch.nn.Module): def __init__(self): super().__init__() self.conv = torch.nn.Conv2d(1, 1, 1) def forward(self, x): return self.conv(x) class CustomTracer(Tracer): def is_leaf_module(self, m, module_qualified_name): return (m.__module__.startswith('torch.nn') and not isinstance(m, torch.nn.Sequential)) or \ isinstance(m, StandaloneModule) class ModelThatUsesStandaloneModule(...): def __init__(self): super().__init__() self.standalone = StandaloneModule() def forward(self, x): return self.standalone(x) m = ModelThatUsesStandaloneModule() qconfig_dict = {"": qconfig, "standalone_module_name": ["standalone"]} m = prepare_fx(m, qconfig_dict) calibrate(m, data) m = convert_fx(m) m.standalone = lower_to_acclerator(m.standalone) ``` Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
torch/fx/graph_module.py
Outdated
```diff
@@ -194,10 +194,20 @@ def __reduce__(self):
     def __deepcopy__(self, memo):
         fake_mod = torch.nn.Module()
         fake_mod.__dict__ = copy.deepcopy(self.__dict__)
-        return GraphModule(fake_mod, self.graph)
+        graph_module = GraphModule(fake_mod, self.graph)
```
These are changes from #45182; will rebase on master after it lands.
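For context, a hedged sketch of where that change plausibly heads once the rest of #45182 is in; the subclass name, attribute list, and loop below are illustrative assumptions, not the actual code:

```python
import copy
import torch
from torch.fx import GraphModule

class PreservingGraphModule(GraphModule):  # illustrative subclass
    def __deepcopy__(self, memo):
        fake_mod = torch.nn.Module()
        fake_mod.__dict__ = copy.deepcopy(self.__dict__)
        graph_module = GraphModule(fake_mod, self.graph)
        # Hypothetical: re-attach attributes that GraphModule's constructor
        # does not carry over, so subclass markers survive a deepcopy
        # instead of being silently dropped.
        for attr in ("_is_standalone_module",):  # illustrative name
            if hasattr(self, attr):
                setattr(graph_module, attr, getattr(self, attr))
        return graph_module
```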
lgtm! some optional nit comments inline
torch/quantization/fx/quantize.py
Outdated
```diff
@@ -199,7 +206,7 @@ def get_qconfig(module):
         elif node.op == 'call_module':
             self.qconfig_map[node.name] = get_qconfig(self.modules[node.target])

-    def _prepare(self, model, qconfig_dict, inplace, is_dynamic_quant):
+    def _prepare(self, model, qconfig_dict, inplace, is_dynamic_quant, is_child_module):
```
repeating docs is worth it in some cases, at least IMO. Up to you though.
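For example, a repeated doc on the new parameter might read like this; the wording is a sketch, not the PR's actual docstring:

```python
def _prepare(self, model, qconfig_dict, inplace, is_dynamic_quant, is_child_module):
    """Insert observers into `model` in preparation for calibration.

    Args:
        is_child_module: if True, `model` is a standalone submodule being
            prepared on its own rather than the top-level model; it is
            observed as one unit and later stitched back into the parent
            graph by the parent's prepare step.
    """
```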
… module" Summary: Sometimes user need to quantize a submodule as one unit, and this submodule will be lowered to a different backend like accelerator. The submodule will be quantized with the same fx based graph mode quantization functions and will be connected with the rest of the model automatically. APIs: ```python class StandaloneModule(torch.nn.Module): def __init__(self): super().__init__() self.conv = torch.nn.Conv2d(1, 1, 1) def forward(self, x): return self.conv(x) class CustomTracer(Tracer): def is_leaf_module(self, m, module_qualified_name): return (m.__module__.startswith('torch.nn') and not isinstance(m, torch.nn.Sequential)) or \ isinstance(m, StandaloneModule) class ModelThatUsesStandaloneModule(...): def __init__(self): super().__init__() self.standalone = StandaloneModule() def forward(self, x): return self.standalone(x) m = ModelThatUsesStandaloneModule() qconfig_dict = {"": qconfig, "standalone_module_name": ["standalone"]} m = prepare_fx(m, qconfig_dict) calibrate(m, data) m = convert_fx(m) m.standalone = lower_to_acclerator(m.standalone) ``` Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
… module" Summary: Sometimes user need to quantize a submodule as one unit, and this submodule will be lowered to a different backend like accelerator. The submodule will be quantized with the same fx based graph mode quantization functions and will be connected with the rest of the model automatically. APIs: ```python class StandaloneModule(torch.nn.Module): def __init__(self): super().__init__() self.conv = torch.nn.Conv2d(1, 1, 1) def forward(self, x): return self.conv(x) class CustomTracer(Tracer): def is_leaf_module(self, m, module_qualified_name): return (m.__module__.startswith('torch.nn') and not isinstance(m, torch.nn.Sequential)) or \ isinstance(m, StandaloneModule) class ModelThatUsesStandaloneModule(...): def __init__(self): super().__init__() self.standalone = StandaloneModule() def forward(self, x): return self.standalone(x) m = ModelThatUsesStandaloneModule() qconfig_dict = {"": qconfig, "standalone_module_name": ["standalone"]} m = prepare_fx(m, qconfig_dict) calibrate(m, data) m = convert_fx(m) m.standalone = lower_to_acclerator(m.standalone) ``` Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
… module" Summary: Sometimes user need to quantize a submodule as one unit, and this submodule will be lowered to a different backend like accelerator. The submodule will be quantized with the same fx based graph mode quantization functions and will be connected with the rest of the model automatically. APIs: ```python class StandaloneModule(torch.nn.Module): def __init__(self): super().__init__() self.conv = torch.nn.Conv2d(1, 1, 1) def forward(self, x): return self.conv(x) class CustomTracer(Tracer): def is_leaf_module(self, m, module_qualified_name): return (m.__module__.startswith('torch.nn') and not isinstance(m, torch.nn.Sequential)) or \ isinstance(m, StandaloneModule) class ModelThatUsesStandaloneModule(...): def __init__(self): super().__init__() self.standalone = StandaloneModule() def forward(self, x): return self.standalone(x) m = ModelThatUsesStandaloneModule() qconfig_dict = {"": qconfig, "standalone_module_name": ["standalone"]} m = prepare_fx(m, qconfig_dict) calibrate(m, data) m = convert_fx(m) m.standalone = lower_to_acclerator(m.standalone) ``` Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
```diff
@@ -0,0 +1,27 @@
+from torch.fx import GraphModule
+
+class ObservedStandaloneGraphModule(GraphModule):
```
@jamesr66a @zdevito is this what we want?
This looks fine to me in the absence of being able to directly customize what `symbolic_trace` returns. I've filed an issue to track that: #45534
… module" Summary: Sometimes user need to quantize a submodule as one unit, and this submodule will be lowered to a different backend like accelerator. The submodule will be quantized with the same fx based graph mode quantization functions and will be connected with the rest of the model automatically. APIs: ```python class StandaloneModule(torch.nn.Module): def __init__(self): super().__init__() self.conv = torch.nn.Conv2d(1, 1, 1) def forward(self, x): return self.conv(x) class CustomTracer(Tracer): def is_leaf_module(self, m, module_qualified_name): return (m.__module__.startswith('torch.nn') and not isinstance(m, torch.nn.Sequential)) or \ isinstance(m, StandaloneModule) class ModelThatUsesStandaloneModule(...): def __init__(self): super().__init__() self.standalone = StandaloneModule() def forward(self, x): return self.standalone(x) m = ModelThatUsesStandaloneModule() qconfig_dict = {"": qconfig, "standalone_module_name": ["standalone"]} m = prepare_fx(m, qconfig_dict) calibrate(m, data) m = convert_fx(m) m.standalone = lower_to_acclerator(m.standalone) ``` Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
… module" Summary: Sometimes user need to quantize a submodule as one unit, and this submodule will be lowered to a different backend like accelerator. The submodule will be quantized with the same fx based graph mode quantization functions and will be connected with the rest of the model automatically. APIs: ```python class StandaloneModule(torch.nn.Module): def __init__(self): super().__init__() self.conv = torch.nn.Conv2d(1, 1, 1) def forward(self, x): return self.conv(x) class CustomTracer(Tracer): def is_leaf_module(self, m, module_qualified_name): return (m.__module__.startswith('torch.nn') and not isinstance(m, torch.nn.Sequential)) or \ isinstance(m, StandaloneModule) class ModelThatUsesStandaloneModule(...): def __init__(self): super().__init__() self.standalone = StandaloneModule() def forward(self, x): return self.standalone(x) m = ModelThatUsesStandaloneModule() qconfig_dict = {"": qconfig, "standalone_module_name": ["standalone"]} m = prepare_fx(m, qconfig_dict) calibrate(m, data) m = convert_fx(m) m.standalone = lower_to_acclerator(m.standalone) ``` Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
… module" Summary: Sometimes user need to quantize a submodule as one unit, and this submodule will be lowered to a different backend like accelerator. The submodule will be quantized with the same fx based graph mode quantization functions and will be connected with the rest of the model automatically. APIs: ```python class StandaloneModule(torch.nn.Module): def __init__(self): super().__init__() self.conv = torch.nn.Conv2d(1, 1, 1) def forward(self, x): return self.conv(x) class CustomTracer(Tracer): def is_leaf_module(self, m, module_qualified_name): return (m.__module__.startswith('torch.nn') and not isinstance(m, torch.nn.Sequential)) or \ isinstance(m, StandaloneModule) class ModelThatUsesStandaloneModule(...): def __init__(self): super().__init__() self.standalone = StandaloneModule() def forward(self, x): return self.standalone(x) m = ModelThatUsesStandaloneModule() qconfig_dict = {"": qconfig, "standalone_module_name": ["standalone"]} m = prepare_fx(m, qconfig_dict) calibrate(m, data) m = convert_fx(m) m.standalone = lower_to_acclerator(m.standalone) ``` Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
… module" Summary: Sometimes user need to quantize a submodule as one unit, and this submodule will be lowered to a different backend like accelerator. The submodule will be quantized with the same fx based graph mode quantization functions and will be connected with the rest of the model automatically. APIs: ```python class StandaloneModule(torch.nn.Module): def __init__(self): super().__init__() self.conv = torch.nn.Conv2d(1, 1, 1) def forward(self, x): return self.conv(x) class CustomTracer(Tracer): def is_leaf_module(self, m, module_qualified_name): return (m.__module__.startswith('torch.nn') and not isinstance(m, torch.nn.Sequential)) or \ isinstance(m, StandaloneModule) class ModelThatUsesStandaloneModule(...): def __init__(self): super().__init__() self.standalone = StandaloneModule() def forward(self, x): return self.standalone(x) m = ModelThatUsesStandaloneModule() qconfig_dict = {"": qconfig, "standalone_module_name": ["standalone"]} m = prepare_fx(m, qconfig_dict) calibrate(m, data) m = convert_fx(m) m.standalone = lower_to_acclerator(m.standalone) ``` Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
… module" Summary: Sometimes user need to quantize a submodule as one unit, and this submodule will be lowered to a different backend like accelerator. The submodule will be quantized with the same fx based graph mode quantization functions and will be connected with the rest of the model automatically. APIs: ```python class StandaloneModule(torch.nn.Module): def __init__(self): super().__init__() self.conv = torch.nn.Conv2d(1, 1, 1) def forward(self, x): return self.conv(x) class CustomTracer(Tracer): def is_leaf_module(self, m, module_qualified_name): return (m.__module__.startswith('torch.nn') and not isinstance(m, torch.nn.Sequential)) or \ isinstance(m, StandaloneModule) class ModelThatUsesStandaloneModule(...): def __init__(self): super().__init__() self.standalone = StandaloneModule() def forward(self, x): return self.standalone(x) m = ModelThatUsesStandaloneModule() qconfig_dict = {"": qconfig, "standalone_module_name": ["standalone"]} m = prepare_fx(m, qconfig_dict) calibrate(m, data) m = convert_fx(m) m.standalone = lower_to_acclerator(m.standalone) ``` Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: [D23580642](https://our.internmc.facebook.com/intern/diff/D23580642) [ghstack-poisoned]
This pull request has been merged in 5539066.
Stack from ghstack:
Summary:
Sometimes users need to quantize a submodule as one unit, for example so that
the submodule can be lowered to a different backend, such as an accelerator.
The submodule is quantized with the same FX-based graph mode quantization functions
and is connected with the rest of the model automatically.
APIs:
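The API example from the final commit message, reproduced here; `qconfig`, `calibrate`, `data`, and `lower_to_accelerator` stand in for the user's own objects and functions:

```python
class StandaloneModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(1, 1, 1)

    def forward(self, x):
        return self.conv(x)

class ModelThatUsesStandaloneModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.standalone = StandaloneModule()

    def forward(self, x):
        return self.standalone(x)

m = ModelThatUsesStandaloneModule()
qconfig_dict = {"": qconfig, "standalone_module_name": ["standalone"]}
m = prepare_fx(m, qconfig_dict)
calibrate(m, data)
m = convert_fx(m)
# the standalone submodule can now be lowered separately, e.g. to an accelerator
m.standalone = lower_to_accelerator(m.standalone)
```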
Test Plan:
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: D23580642