Add scale-only version of the HQQ algorithm for IntxWeightOnlyConfig/Int8DynamicActivationIntxWeightConfig #3110
| Review comment: btw, are the changes in this file tested as well? Reply: I think this existing unit test covers them: https://github.com/pytorch/ao/blob/main/test/quantization/test_qat.py#L2321 | 
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -0,0 +1,23 @@ | ||
| # Copyright (c) Meta Platforms, Inc. and affiliates. | ||
| # All rights reserved. | ||
| # | ||
| # This source code is licensed under the BSD 3-Clause license found in the | ||
| # LICENSE file in the root directory of this source tree. | ||
|  | ||
| from enum import Enum | ||
|  | ||
|  | ||
| # can switch to StrEnum (https://docs.python.org/3/library/enum.html#enum.StrEnum) | ||
| # after python 3.10 is end of life (https://devguide.python.org/versions/) | ||
| class IntxChooseQParamsAlgorithm(str, Enum): | ||
| Review comment: Also, do we need to introduce yet another class? Can we just extend the existing Int4ChooseQParamsAlgorithm and add affine and hqq_scale_only, then rename/promote Int4ChooseQParamsAlgorithm to IntxChooseQParamsAlgorithm in a follow-up PR? Reply: In torchao's refactor (removing AffineQuantizedTensor), the direction is for subclasses not to share higher-level abstractions, but to define their own enums instead. This is how packing format works as well (intx_packing_format for the intx subclass, and int4_packing_format for the int4 subclass). I'll let @jerryzh168 comment here as well. Reply: Will defer to @jerryzh168 then. Reply (@jerryzh168): Yeah, we want local abstractions instead of global abstractions unless it's required. | ||
|     """Variant of quantization algorithm to calculate scale and zero_point""" | ||
|  | ||
|     """ | ||
|     Uses `torchao.quantization.quant_primitives.choose_qparams_affine` | ||
|     """ | ||
|     AFFINE = "affine" | ||
|  | ||
|     """ | ||
|     Uses `torchao.quantization.quant_primitives._choose_qparams_and_quantize_scale_only_hqq` | ||
|     """ | ||
|     HQQ_SCALE_ONLY = "hqq_scale_only" | ||
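
For readers unfamiliar with the `(str, Enum)` pattern used in the diff above, here is a minimal sketch of how such an enum behaves. The class is re-declared locally so the snippet runs without torchao installed, and `parse_algorithm` is a hypothetical helper written for illustration only; it is not part of this PR or of torchao.

```python
from enum import Enum


class IntxChooseQParamsAlgorithm(str, Enum):
    """Local re-declaration of the enum from the diff, for illustration only."""

    AFFINE = "affine"
    HQQ_SCALE_ONLY = "hqq_scale_only"


def parse_algorithm(value):
    """Hypothetical helper: accept either an enum member or its string value."""
    # Enum lookup by value returns the member; passing a member returns it unchanged.
    # An unknown string raises ValueError.
    return IntxChooseQParamsAlgorithm(value)


# Because of the `str` mixin, members compare equal to plain strings.
assert IntxChooseQParamsAlgorithm.AFFINE == "affine"

# Both spellings resolve to the same member object.
assert parse_algorithm("hqq_scale_only") is IntxChooseQParamsAlgorithm.HQQ_SCALE_ONLY
assert parse_algorithm(IntxChooseQParamsAlgorithm.AFFINE) is IntxChooseQParamsAlgorithm.AFFINE
```

The `str` mixin is what lets a config accept the plain string or the enum member interchangeably, which is presumably why the code comment mentions moving to `StrEnum` once Python 3.10 support is dropped.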