[quant][pyper] Support aten::embedding_bag quantization in graph mode #43989
Summary:
Tracing the model produces an aten::embedding_bag node in the graph. This change adds the graph-mode passes needed to quantize that node as well.
Test Plan:
python test/test_quantization.py TestQuantizeDynamicJitOps.test_embedding_bag
💊 CI failures summary and remediations
As of commit 119b273 (more details on the Dr. CI page):
🚧 1 fixed upstream failure: These were probably caused by upstream breakages that were already fixed.
Please rebase on the
ghstack-source-id: 1686d1ba8c2eac1cc750b771f4fd542a473c2bd3
Pull Request resolved: #43989
Codecov Report
@@ Coverage Diff @@
## gh/supriyar/170/base #43989 +/- ##
=======================================================
Coverage ? 69.27%
=======================================================
Files ? 381
Lines ? 47239
Branches ? 0
=======================================================
Hits ? 32724
Misses ? 14515
Partials ? 0
Continue to review full report at Codecov.
lg, feel free to wait for @jerryzh168 if a deeper review is needed
@@ -259,10 +260,16 @@ bool matchArgPattern(
bool isWeight(Value* v) {
  bool result = matchArgPattern(
      v,
      AtenFuncArgs(
          {{"conv1d", 1}, {"conv2d", 1}, {"conv3d", 1}, {"linear", 1}}),
      // ate::embedding_bag(%weight, %input, %offsets, %scale_grad_by_freq,
nit: aten
?
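As an illustration of what the hunk above appears to do, here is a minimal Python sketch of argument-pattern matching for weights. It is not the C++ implementation from the PR: the names `WEIGHT_ARG_INDEX` and `is_weight_arg` are hypothetical, and the index 0 for `embedding_bag` is an assumption read off the diff comment, where `%weight` is the first argument.

```python
# Hypothetical lookup table: op name -> position of the weight argument.
# The conv/linear entries mirror the AtenFuncArgs list in the diff.
# The embedding_bag entry is an assumption based on the diff comment,
# which lists %weight as the first argument of the op.
WEIGHT_ARG_INDEX = {
    "conv1d": 1,
    "conv2d": 1,
    "conv3d": 1,
    "linear": 1,
    "embedding_bag": 0,
}


def is_weight_arg(op_name: str, arg_index: int) -> bool:
    """Return True if argument `arg_index` of `op_name` is its weight."""
    # Unknown ops return None from .get(), which never equals an int index.
    return WEIGHT_ARG_INDEX.get(op_name) == arg_index
```

Under these assumptions, `is_weight_arg("embedding_bag", 0)` is true while `is_weight_arg("linear", 0)` is false, since the weight of `linear` sits at index 1.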
from torch.quantization import QConfigDynamic, PlaceholderObserver
int4_dynamic_qconfig = QConfigDynamic(activation=PlaceholderObserver.with_args(dtype=torch.float,
                                                                               custom_op_name="embedding_bag_4bit"),
                                      weight=PlaceholderObserver.with_args(custom_op_name="embedding_bag_4bit"))
Why do we have a placeholder observer for weights? My understanding is that we can currently use real observers for 8-bit but not for 4-bit. Is that correct?
We currently use real observers and TorchBind classes for eager-mode 8-bit embedding quantization. For graph mode, we initially implemented this using the custom prepack ops for PyPer for both 8-bit and 4-bit, to be consistent with C2.
Going forward, in FX we can implement EmbeddingBag quantization using observers. I feel it would be overkill to update this code to use observers for 8-bit and placeholder observers for 4-bit. Let me know your thoughts.
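For readers unfamiliar with the 8-bit and 4-bit distinction discussed here, the following is a rough pure-Python sketch of quantizing one embedding row to a given bit width with a per-row scale and minimum. It is an illustration only: `quantize_row` and `dequantize_row` are hypothetical helpers, and the per-row min/max scheme is an assumption, not the prepack ops or observers used in this PR.

```python
from typing import List, Tuple


def quantize_row(row: List[float], bits: int = 8) -> Tuple[List[int], float, float]:
    """Quantize one row to unsigned `bits`-bit integers (assumed scheme)."""
    levels = 2 ** bits - 1              # 255 for 8-bit, 15 for 4-bit
    lo, hi = min(row), max(row)
    # Guard against a constant row, where hi - lo would be zero.
    scale = (hi - lo) / levels if hi > lo else 1.0
    # Map each value to the nearest integer level and clamp to the valid range.
    q = [min(levels, max(0, round((x - lo) / scale))) for x in row]
    return q, scale, lo


def dequantize_row(q: List[int], scale: float, lo: float) -> List[float]:
    """Recover approximate float values from the quantized row."""
    return [v * scale + lo for v in q]
```

With fewer bits there are fewer levels, so the reconstruction error per element (at most about half of `scale`) grows as the bit width drops from 8 to 4.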
This pull request has been merged in a0ae416.
Stack from ghstack:
Summary:
Tracing the model produces an aten::embedding_bag node in the graph.
This change adds the graph-mode passes needed to quantize that node as well.
Test Plan:
python test/test_quantization.py TestQuantizeDynamicJitOps.test_embedding_bag
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: D23460485