[quant][graphmode] FP16 quant support - Insert cast operators #40709
Conversation
Summary: Cast to kHalf and back to kFloat before the linear operator to mimic FP16 quant support. Test Plan: python test/test_quantization.py test_convert_dynamic_fp16 [ghstack-poisoned]
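For reference, a minimal eager-mode Python sketch (not part of this PR; `toy_linear_fp16` is an illustrative name) of the numerics the inserted casts are meant to mimic: the weight is round-tripped through half precision before the regular fp32 linear op.

```python
import torch
import torch.nn.functional as F

def toy_linear_fp16(x: torch.Tensor, w: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Mimic FP16 dynamic quantization: cast the weight to half and back to
    # float (what the inserted aten::to ops compute), then run fp32 linear.
    w_fp32 = w.to(torch.float16).to(torch.float32)
    return F.linear(x, w_fp32, b)

x = torch.randn(4, 8)
w = torch.randn(16, 8)
b = torch.randn(16)
out = toy_linear_fp16(x, w, b)
```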
…ors" Summary: Cast to kHalf and back to kFloat before the linear operator to mimic FP16 quant support Test Plan: python test/test_quantization.py test_convert_dynamic_fp16 Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
💊 CI failures summary and remediations: As of commit cabf163 (more details on the Dr. CI page): 💚 Looks good so far! There are no failures yet. 💚 This comment was automatically generated by Dr. CI and has been revised 6 times.
@@ -578,11 +589,9 @@ bool is_module(
  const auto& match_vmap = match.values_map;
  Value* relu = match_vmap.at(vmap.at(vname));
  auto type = relu->type()->cast<ClassType>();
this can be removed?
@@ -44,6 +44,9 @@ TORCH_API bool isScalar(Value* v);
// Check if value is the input of the graph
TORCH_API bool hitGraphInput(Value* value);

// Return the module name that corresponds to the value.
TORCH_API c10::optional<std::string> get_module_name(Value* value);
Should we use the same naming convention as the other functions? I just noticed this problem; the filter functions should probably follow the naming convention as well, but we can do that later.
if (quant_type == QuantType::DYNAMIC && isNoopObserver(observer->input(0))) {
  dequant = insertFP16CastOps(g, observer_out);
} else if (
    quant_type == QuantType::DYNAMIC && !isWeight(module, observer_out)) {
  Value* dtype = g->insertGetAttr(self, qparam_names.back());
  std::tie(choose_qparams, quant, dequant) = insertChooseQParamQuantDequant(
      g, observer_out, dtype, at::Symbol::aten(quantize_func));
Probably better to add a comment for the else branch; it accepts both dynamic (activation) quant and static quant.
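For readers of this hunk, a hedged Python sketch of what the two dynamic branches compute numerically; the helper names and the hand-rolled qparam math below are illustrative, not the actual pass implementation.

```python
import torch

def fp16_branch(w: torch.Tensor) -> torch.Tensor:
    # Weight observed by a NoopObserver: the pass inserts casts to kHalf
    # and back to kFloat, i.e. a round trip through half precision.
    return w.to(torch.float16).to(torch.float32)

def int8_dynamic_branch(x: torch.Tensor) -> torch.Tensor:
    # Non-weight value under dynamic quant: qparams are chosen from the
    # tensor at runtime, then the value is quantized and dequantized.
    qmin, qmax = 0, 255
    scale = float(x.max() - x.min()) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0
    zero_point = int(min(max(qmin - round(float(x.min()) / scale), qmin), qmax))
    xq = torch.quantize_per_tensor(x, scale, zero_point, torch.quint8)
    return xq.dequantize()
```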
…ors" Summary: Cast to kHalf and back to kFloat before the linear operator to mimic FP16 quant support Test Plan: python test/test_quantization.py test_convert_dynamic_fp16 Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
…ors" Summary: Cast to kHalf and back to kFloat before the linear operator to mimic FP16 quant support Test Plan: python test/test_quantization.py test_convert_dynamic_fp16 Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
This pull request has been merged in 55b5ab1.
Stack from ghstack:
Summary:
Cast to kHalf and back to kFloat before the linear operator to mimic FP16 quant support
Test Plan:
python test/test_quantization.py test_convert_dynamic_fp16
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: D22335977
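To exercise the path covered by the test plan above, a hedged usage sketch, assuming the graph-mode entry points of this era (`torch.quantization.quantize_dynamic_jit` and `float16_dynamic_qconfig`); exact names and import locations may differ across releases.

```python
import torch
from torch.quantization import float16_dynamic_qconfig, quantize_dynamic_jit

class M(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(8, 16)

    def forward(self, x):
        return self.fc(x)

# Script the model, then convert with an FP16 dynamic qconfig; the pass in
# this PR should insert the kHalf/kFloat casts around the linear op.
scripted = torch.jit.script(M())
quantized = quantize_dynamic_jit(scripted, {'': float16_dynamic_qconfig})
print(quantized.graph)
```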