Post-train quantization based on stats + additional modules quantized #136
Conversation
* Expose command line arguments for collecting and loading stats
* Integrate in image classification sample
* range_linear.py:
  * Load stats from YAML file or dict in post-train quantizer
  * Refactor layer wrappers to handle both dynamic and stats-based (aka "static") cases
* collector.py:
  * Add dedicated collector for quantization stats
  * Allow None classes filter, in which case collect stats for all types
  * Expose abstract 'save' function instead of existing 'to_xlsx'
* Fixes to typos and some comments
* Extract code for setting deterministic execution into a function, move to utils.py
* Add ability to shuffle test dataset (required since we don't collect stats on the whole test set)
* Move YAML loading functionality to utils.py
* Add unit tests for quantization based on stats
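The stats-based (aka "static") path described above amounts to deriving quantization parameters from pre-collected min/max statistics, instead of computing ranges dynamically at runtime. A minimal, dependency-free sketch of that idea (illustrative only; the function names are hypothetical and this is not Distiller's actual range_linear.py implementation):

```python
# Sketch: asymmetric range-based linear quantization parameters derived
# from pre-collected activation statistics (min/max per layer).
# Illustrative only -- not Distiller's actual implementation.

def linear_quant_params(stats_min, stats_max, num_bits=8):
    """Compute scale and zero-point from collected min/max stats."""
    # Make sure the representable range always covers zero, so that
    # zero-valued activations map to an exact quantized level.
    stats_min = min(stats_min, 0.0)
    stats_max = max(stats_max, 0.0)
    n_levels = 2 ** num_bits - 1
    scale = n_levels / (stats_max - stats_min)
    zero_point = round(-stats_min * scale)
    return scale, zero_point

def quantize(x, scale, zero_point, num_bits=8):
    """Quantize a single float value and clamp to [0, 2^bits - 1]."""
    q = round(x * scale) + zero_point
    return max(0, min(2 ** num_bits - 1, q))
```

With stats_min = -1.0 and stats_max = 3.0 at 8 bits, zero maps to quantized level 64, and the range endpoints map to 0 and 255.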
…d control
* Command line argument to configure post-train quantizer from file
* Integrated in image classification sample
* Add ability to load specific "component" from YAML file, without a full-blown scheduler
* Minor change in Quantizer initialization - change default of bits_overrides to None
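Loading a single "component" from a YAML file without building a full scheduler can be sketched as follows (assuming PyYAML is available; the section layout and the load_component helper are hypothetical illustrations, not Distiller's actual API):

```python
import yaml

# Example scheduler-style YAML with a 'quantizers' section. The keys and
# class name below are illustrative, not an exact Distiller schema.
CONFIG = """
quantizers:
  post_train_quantizer:
    class: PostTrainLinearQuantizer
    bits_activations: 8
    bits_parameters: 8
"""

def load_component(yaml_text, section):
    """Parse the YAML document and return just one named section,
    without instantiating a scheduler around it."""
    doc = yaml.safe_load(yaml_text)
    if section not in doc:
        raise KeyError("YAML document has no '%s' section" % section)
    return doc[section]
```

A caller can then pass the returned dict straight to whatever consumes that component's configuration.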
* Concat + element-wise add / mult supported in post-training
* Embeddings supported in quantization-aware training
* Wrapped PyTorch concat + element-wise add/mult ops in Modules so they could be recognized by the Quantizer
* Modified our ResNet implementation to use element-wise add modules instead of the operator
* Unit tests for concat, add and mult
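The reason for wrapping these ops in Modules: a quantizer that traverses the module tree can only find and replace ops that exist as named submodules, while bare operators used inside forward() are invisible to it. A minimal sketch of the idea (class names here are illustrative, not necessarily the ones Distiller defines):

```python
import torch
import torch.nn as nn

class EltwiseAdd(nn.Module):
    """Element-wise addition wrapped as a Module, so module-level
    traversal (e.g. named_modules()) can discover and replace it."""
    def forward(self, *inputs):
        res = inputs[0]
        for t in inputs[1:]:
            res = res + t
        return res

class BasicBlock(nn.Module):
    """Toy residual block using the wrapper instead of the '+' operator."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.add = EltwiseAdd()  # now visible as a named submodule

    def forward(self, x):
        return self.add(x, self.conv(x))
```

Because `add` appears in `named_modules()`, a quantizer can swap it for a quantized equivalent the same way it swaps Conv2d layers.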
All in all looks good.
Commented on a few typos.
One thing to make sure of before merging is that post-training quantization works well on a CPU-only machine.
Let's discuss controlling the amount of data used for training/validation/test - there are small changes I'd prefer you make (details in the specific remarks).
Thanks
Neta
        return dict_config(model, optimizer, sched_dict, scheduler)
    except yaml.YAMLError as exc:
        print("\nFATAL parsing error while parsing the schedule configuration file %s" % filename)
        raise
        raise
This raise is not needed...
This is a weird merge artifact in GitHub, the code on the branch doesn't actually have the redundant statement. So I can't fix it now. If it sticks after the merge I'll remove it then.
* Change semantics of 'qe_calibration' argument to match the new implementation of using part of dataset
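Using only part of the (shuffled) dataset for calibration can be sketched as follows. This is an illustrative helper, not the actual implementation behind qe_calibration; shuffling matters because sequential datasets are often ordered by class, and a head slice would then cover only a few classes:

```python
import random

def calibration_indices(dataset_len, fraction, seed=0):
    """Pick a reproducible random subset of sample indices on which to
    collect quantization stats. 'fraction' is e.g. 0.1 for 10%."""
    rng = random.Random(seed)        # fixed seed -> deterministic subset
    indices = list(range(dataset_len))
    rng.shuffle(indices)
    n = max(1, int(dataset_len * fraction))
    return indices[:n]
```

The returned index list can be fed to a sampler or subset wrapper so the stats-collection pass only touches that portion of the data.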
Summary of changes:
(1) Post-train quantization based on pre-collected statistics
(2) Quantized concat, element-wise addition / multiplication and embeddings
(3) Move post-train quantization command line args out of sample code
(4) Configure post-train quantization from YAML for more fine-grained control
(See PR #136 for more detailed descriptions of the changes)
This is a big PR; it will be easiest to review commit-by-commit. Not exactly best practice, but as it is, I prefer this to separate PRs.