[quant][fx] Embedding quantization support #46677
Summary: Add support for weight only embedding quantization
Test Plan: python test/test_quantization.py TestQuantizeFxOps.test_qembedding_module
CI failures summary and remediations: as of commit 018e37c, there are no failures yet. (This comment was automatically generated by Dr. CI and has been revised 10 times.)
def convert(self, quantizer, node, load_arg, debug=False, convert_custom_config_dict=None):
    qconfig = quantizer.qconfig_map[node.name]
    assert not activation_is_statically_quantized(qconfig)
    qemb = nnq.Embedding
isn't this dynamic quant? why is this nnq instead of nnqd?
It is weight-only quantization. The main use case for embedding quantization is recommender models, which use static quant to quantize (mainly) embeddings and FC layers. So we decided to keep this in nnq to make it easy to quantize both embeddings and FCs in the same call.
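For readers unfamiliar with the term, here is a minimal sketch of what "weight only" means for an embedding table: only the stored rows are held as 8-bit codes, while the inputs (integer indices) and the outputs (floats) are untouched, so no activation observer or calibration is involved. This is plain Python for illustration, not the nnq.Embedding implementation; the per-row scale/minimum scheme and the names `quantize_row`, `dequantize_row`, and `WeightOnlyEmbedding` are assumptions made for this example.

```python
from typing import List, Tuple

# (uint8 codes, scale, row minimum); a hypothetical storage layout for this sketch.
QuantRow = Tuple[List[int], float, float]


def quantize_row(row: List[float]) -> QuantRow:
    """Map one float row onto integer codes in [0, 255] using its own range."""
    lo, hi = min(row), max(row)
    # A constant row has zero range; fall back to a scale of 1.0.
    scale = (hi - lo) / 255.0 or 1.0
    codes = [min(255, max(0, round((x - lo) / scale))) for x in row]
    return codes, scale, lo


def dequantize_row(qrow: QuantRow) -> List[float]:
    """Recover approximate float values from the stored codes."""
    codes, scale, lo = qrow
    return [c * scale + lo for c in codes]


class WeightOnlyEmbedding:
    """Embedding lookup whose table is stored quantized; outputs stay float."""

    def __init__(self, weight: List[List[float]]) -> None:
        # Quantization happens once, at conversion time, from the weights alone.
        self.rows = [quantize_row(r) for r in weight]

    def __call__(self, indices: List[int]) -> List[List[float]]:
        # Indices are plain integers, so there is no activation to quantize.
        return [dequantize_row(self.rows[i]) for i in indices]


table = [[0.0, 0.5, 1.0], [-2.0, 0.0, 2.0]]
emb = WeightOnlyEmbedding(table)
looked_up = emb([1, 0])  # rows 1 and 0, dequantized back to floats
```

Each recovered value differs from the original by at most half of that row's scale, which is the rounding error of the 8-bit code.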
I see. Now that we changed the API in FX, I feel we should probably put this in a separate namespace to make it clear that it is weight-only quantization, e.g. nn.quantized.weight_only.
Looks good, thanks
@register_quant_pattern(torch.nn.Embedding)
class Embedding(QuantizeHandler):
    def __init__(self, quantizer, node):
nit: this can be removed
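A rough sketch of the pattern under discussion, assuming the nit refers to an `__init__` that only forwards to the base class (the diff excerpt does not show its body, so that reading is an assumption). The registry, decorator, and class names below (`PATTERNS`, `register_pattern`, `Handler`, `FloatEmbedding`, `EmbeddingHandler`) are hypothetical stand-ins, not the actual FX quantization code.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a float module type to the handler class for it.
PATTERNS: Dict[type, type] = {}


def register_pattern(pattern: type) -> Callable[[type], type]:
    """Class decorator that records the handler for `pattern` and returns it unchanged."""
    def decorator(handler_cls: type) -> type:
        PATTERNS[pattern] = handler_cls
        return handler_cls
    return decorator


class Handler:
    """Stand-in base class that stores the quantizer and the matched node."""

    def __init__(self, quantizer, node) -> None:
        self.quantizer = quantizer
        self.node = node


class FloatEmbedding:
    """Stand-in for the float module type being matched."""


@register_pattern(FloatEmbedding)
class EmbeddingHandler(Handler):
    # No __init__ here: the inherited constructor already stores quantizer and node.
    def convert(self) -> str:
        return f"quantized({self.node})"


handler = PATTERNS[FloatEmbedding](None, "emb")
```

Dropping a constructor that adds nothing keeps the handler to the one method that differs from the base class.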
Codecov Report
@@ Coverage Diff @@
## gh/supriyar/200/base #46677 +/- ##
=====================================================
Coverage 68.39% 68.40%
=====================================================
Files 413 413
Lines 54569 54587 +18
=====================================================
+ Hits 37324 37341 +17
- Misses 17245 17246 +1
This pull request has been merged in e34c825.
Differential Revision: D24463305