Conversation

@supriyar (Contributor) commented Oct 21, 2020

Stack from ghstack:

Summary:
Add support for weight only embedding quantization

Test Plan:
python test/test_quantization.py TestQuantizeFxOps.test_qembedding_module


Differential Revision: D24463305
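
For context, a minimal sketch of how this feature would be exercised through FX graph mode quantization. This is an assumption-laden illustration, not code from the PR: the qconfig name and prepare_fx signature have shifted across releases (later versions take an example_inputs argument and live under torch.ao.quantization), so names here reflect one plausible API vintage.

    import torch
    from torch.quantization import float_qparams_weight_only_qconfig
    from torch.quantization.quantize_fx import prepare_fx, convert_fx

    class EmbeddingModule(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.emb = torch.nn.Embedding(num_embeddings=10, embedding_dim=12)

        def forward(self, indices):
            return self.emb(indices)

    model = EmbeddingModule().eval()
    # Weight-only: only the embedding weight is quantized (quint8 with float
    # qparams); inputs are integer indices and the output stays fp32.
    qconfig_dict = {"": float_qparams_weight_only_qconfig}
    prepared = prepare_fx(model, qconfig_dict)
    quantized = convert_fx(prepared)  # swaps nn.Embedding for nnq.Embedding

    indices = torch.randint(0, 10, (5,))
    out = quantized(indices)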

@dr-ci bot commented Oct 21, 2020

💊 CI failures summary and remediations

As of commit 018e37c (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



def convert(self, quantizer, node, load_arg, debug=False, convert_custom_config_dict=None):
    qconfig = quantizer.qconfig_map[node.name]
    # Embedding is quantized weight-only, so this qconfig must not request
    # static activation quantization.
    assert not activation_is_statically_quantized(qconfig)
    qemb = nnq.Embedding
Contributor: isn't this dynamic quant? why is this nnq instead of nnqd?

@supriyar (Author) Oct 21, 2020: It is weight-only quantization. The main use case for embedding quantization is recommender models, which use static quant to quantize embeddings and FCs (mainly). So we decided to keep this in nnq for ease of quantizing both embeddings and FCs in the same call.
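
To make the distinction concrete, here is a minimal eager-mode sketch of the weight-only flow described above, assuming nnq.Embedding.from_float and the float_qparams weight-only qconfig (exact names vary across releases):

    import torch
    import torch.nn.quantized as nnq
    from torch.quantization import float_qparams_weight_only_qconfig

    emb = torch.nn.Embedding(num_embeddings=10, embedding_dim=12)
    # Weight-only: the qconfig carries only a weight observer with float
    # qparams; there is no activation observer to satisfy.
    emb.qconfig = float_qparams_weight_only_qconfig
    qemb = nnq.Embedding.from_float(emb)  # lives in nn.quantized, not nn.quantized.dynamic

    indices = torch.randint(0, 10, (5,))
    out = qemb(indices)  # quantized weight lookup, fp32 output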

Contributor: I see. Now that we changed the API in FX, I feel we should probably put this in a separate namespace to make it clear that it is weight-only quantization, e.g. nn.quantized.weight_only.
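
Purely as an illustration of that suggestion (the weight_only namespace below is hypothetical and does not exist in PyTorch):

    # Hypothetical module layout sketched in this review thread:
    #   torch.nn.quantized.Linear                  # static quant: quantized activations and weights
    #   torch.nn.quantized.dynamic.Linear          # dynamic quant: fp32 activations, quantized weights
    #   torch.nn.quantized.weight_only.Embedding   # proposed home for weight-only modules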

@jerryzh168 (Contributor) left a review: Looks good, thanks


@register_quant_pattern(torch.nn.Embedding)
class Embedding(QuantizeHandler):
    def __init__(self, quantizer, node):
Contributor: nit: this can be removed

@codecov bot commented Oct 22, 2020

Codecov Report

Merging #46677 into gh/supriyar/200/base will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@                  Coverage Diff                  @@
##           gh/supriyar/200/base   #46677   +/-   ##
=====================================================
  Coverage                 68.39%   68.40%           
=====================================================
  Files                       413      413           
  Lines                     54569    54587   +18     
=====================================================
+ Hits                      37324    37341   +17     
- Misses                    17245    17246    +1     

@facebook-github-bot (Contributor): This pull request has been merged in e34c825.

@facebook-github-bot facebook-github-bot deleted the gh/supriyar/200/head branch October 26, 2020 14:17