This repository has been archived by the owner on May 1, 2023. It is now read-only.

Enable weights-only and activations-only post-training quantization for conv/linear modules #439

Merged
merged 2 commits into from
Dec 8, 2019

Conversation

guyjacob
Contributor

@guyjacob guyjacob commented Dec 5, 2019

Same functionality as #356, but this takes a different approach that reuses the existing PTQ wrapper modules. This reduces code duplication and keeps the quantized model "similar" (in terms of modules used) whether or not activations are quantized.

  • Allow RangeLinearQuantWrapper to accept num_bits_acts = None, in which case it acts as a simple pass-through during forward.
  • In RangeLinearQuantParamLayerWrapper, if bits_activations is None and num_bits_params > 0, perform quant and de-quant of the parameters instead of just quant.
  • Enable activations-only quantization for conv/linear modules. When PostTrainLinearQuantizer detects # bits != None for activations and # bits == None for weights, a fake-quantization wrapper is used.
  • Allow passing 0 in the --qe-bits-acts and --qe-bits-wts command-line arguments to invoke weights-only / activations-only quantization, respectively.
  • Minor refactoring for clarity in PostTrainLinearQuantizer's internal replace_* functions.
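The quant-then-de-quant step described in the second bullet (often called "fake quantization") can be sketched as below. This is an illustrative simplification, not Distiller's actual implementation; the `fake_quantize` helper and the symmetric max-abs scaling scheme are assumptions chosen for the example.

```python
# Hypothetical sketch (not the real Distiller code): weights-only PTQ keeps
# the layer's floating-point compute, but replaces each weight with its
# quantize -> de-quantize image so it lies on an integer grid.

def fake_quantize(values, num_bits):
    """Symmetric linear quant followed by de-quant (illustrative only)."""
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values) or 1.0
    scale = qmax / max_abs                    # float -> integer grid
    quantized = [round(v * scale) for v in values]   # quant step
    return [q / scale for q in quantized]            # de-quant step

weights = [0.5, -1.2, 0.03, 0.9]
fq_weights = fake_quantize(weights, num_bits=8)
# fq_weights are still floats, but only take values on the quantization grid,
# so the floating-point forward pass mimics integer-weight inference error.
```

The de-quant step is what distinguishes this mode from full PTQ: with no activation quantization downstream, the weights must come back to float scale for the regular floating-point kernels to consume them.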

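The pass-through behavior from the first bullet can be illustrated with a minimal wrapper. The class and attribute names here are invented for the sketch and do not mirror RangeLinearQuantWrapper's real internals:

```python
# Illustrative sketch only -- names are invented, not Distiller's API.
class PassThroughQuantWrapper:
    """Wraps a layer; skips activation quantization when num_bits_acts is None."""

    def __init__(self, wrapped_layer, num_bits_acts=None):
        self.wrapped_layer = wrapped_layer
        self.num_bits_acts = num_bits_acts

    def forward(self, x):
        if self.num_bits_acts is None:
            # Weights-only mode: forward is a plain pass-through.
            return self.wrapped_layer(x)
        # Otherwise: quantize input, run the layer, de-quantize output
        # (details omitted from this sketch).
        raise NotImplementedError("activation quantization not sketched here")

double = PassThroughQuantWrapper(lambda x: 2 * x)   # stand-in "layer"
```

Keeping the wrapper in place even when it does nothing is what keeps the module structure of the quantized model the same across the weights-only and full-quantization configurations.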
@guyjacob guyjacob changed the title Enable weights-only post-training quantization Enable weights-only and activations-only post-training quantization for conv/linear modules Dec 8, 2019
@guyjacob guyjacob merged commit 952028d into master Dec 8, 2019
@nzmora nzmora deleted the ptq_weights_only branch April 20, 2020 11:59
michaelbeale-IL pushed a commit that referenced this pull request Apr 24, 2023