
How to do the quantization of inputs and weights? #21

Closed
victorygogogo opened this issue May 31, 2018 · 14 comments

Comments

@victorygogogo

First, thanks for your paper.
I want to know how to quantize the weights and inputs to reproduce the results.

Can you point me to some code for the quantization?

@jiaxiang-wu
Collaborator

Hi @victorygogogo
The code for the quantization phase (learning codebooks and assignments) cannot be released at the moment. You may present your detailed questions here for further discussion.

@victorygogogo
Author

I trained a model in float.
I want to use 8-bit quantization to reduce my inference time,
so the weights, biases, and inputs should be quantized.
There is a problem: what scale should the result use?

That is why I read your paper. How do you solve this problem?

@jiaxiang-wu
Collaborator

8-bit quantization with weights, biases, and inputs quantized? Do you mean uniform quantization, where the possible quantization values are [-k, -k+1, ..., -1, 0, 1, ..., k-1, k]? If so, you may refer to DoReFa-Net, which suits your problem better.

Our approach uses non-uniform quantization, and only weights are quantized (biases and inputs are not quantized).
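For reference, here is a minimal sketch of symmetric uniform 8-bit quantization with a single per-tensor scale, which is the scheme the question seems to describe (it is not what this repository implements); the function name and layout are illustrative only:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Quantize a float tensor to int8 with a single per-tensor scale.
// Returns the scale such that real_value ~= scale * int8_value.
float QuantizeTensor(const std::vector<float>& src, std::vector<int8_t>* dst) {
  float max_abs = 0.0f;
  for (float v : src) max_abs = std::max(max_abs, std::fabs(v));
  float scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;
  dst->resize(src.size());
  for (std::size_t i = 0; i < src.size(); ++i) {
    int q = static_cast<int>(std::round(src[i] / scale));
    (*dst)[i] = static_cast<int8_t>(std::min(127, std::max(-127, q)));
  }
  return scale;
}

// With weights and inputs both quantized this way, the int32 accumulator of a
// dot product holds the real-valued result divided by (scale_w * scale_x), so
// the scale of the result is simply scale_w * scale_x; the bias can be
// quantized with that same combined scale and added into the accumulator.
```

As the comment notes, in this uniform scheme the scale of the accumulated result is just the product of the weight and input scales, which addresses the earlier question about the result's scale.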

@victorygogogo
Author

Only the weights are quantized?

If only the weights are quantized, how do you speed up the inference time?

For example, how do you speed up the GEMM-based convolution?

@jiaxiang-wu
Collaborator

jiaxiang-wu commented Jun 8, 2018

Sorry for the late reply.

In our work, only weights are quantized. During the test phase, the matrix multiplication is converted into a series of table look-up operations (please refer to our paper for details). This results in a reduction in FLOPs.
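The rough idea: split each input vector into subvectors, precompute the inner products between every input subvector and every sub-codeword once per input, and then each output response becomes a sum of table entries selected by the sub-codeword assignments. A minimal sketch of such a look-up-table matrix-vector product follows; the names and data layout are illustrative assumptions, not the actual CaffeEva code.

```cpp
#include <vector>

// Minimal sketch of an approximate mat-vec product via a precomputed
// look-up table (product-quantization style).
//
//   M : number of subspaces the input vector is split into
//   D : length of each subvector
//   K : number of sub-codewords per subspace (e.g. 256)
//   T : number of output units
//
// codebook[m][k][d] : d-th element of sub-codeword k in subspace m
// assign[m][t]      : sub-codeword index assigned to output unit t in subspace m
std::vector<float> ApproxMatVec(
    const std::vector<float>& input,  // length M * D
    const std::vector<std::vector<std::vector<float>>>& codebook,
    const std::vector<std::vector<int>>& assign,
    int M, int D, int K, int T) {
  // Step 1: build the look-up table once per input: inner product of every
  // input subvector with every sub-codeword. Cost: O(M * K * D) multiply-adds.
  std::vector<std::vector<float>> lut(M, std::vector<float>(K, 0.0f));
  for (int m = 0; m < M; ++m)
    for (int k = 0; k < K; ++k) {
      float s = 0.0f;
      for (int d = 0; d < D; ++d) s += input[m * D + d] * codebook[m][k][d];
      lut[m][k] = s;
    }
  // Step 2: each output unit only sums M table entries chosen by its
  // sub-codeword assignments. Cost: O(T * M) additions, no multiplications,
  // versus O(T * M * D) multiply-adds for the exact product.
  std::vector<float> output(T, 0.0f);
  for (int t = 0; t < T; ++t)
    for (int m = 0; m < M; ++m)
      output[t] += lut[m][assign[m][t]];
  return output;
}
```

The table is built once per input and shared across all output units, which is where the reduction in FLOPs comes from when T is large.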

@victorygogogo
Author

Can you tell me where to find the matrix multiplication with table look-ups in your code?

@jiaxiang-wu
Collaborator

For convolutional layers, please refer to:
void CaffeEva::CalcFeatMap_ConvAprx(...)

For fully-connected layers, please refer to:
void CaffeEva::CalcFeatMap_FCntAprx(...)

@victorygogogo
Author

@jiaxiang-wu
thank you!

@victorygogogo
Author

How do you do the weight quantization?
And how do you build the look-up tables (LUTs)?

@jiaxiang-wu
Collaborator

  1. Do you mean how to obtain the D (sub-codebooks) & B (sub-codeword assignments) matrices for each layer? The training code for these two is not included in this repository, and you need to implement it by yourself, under the guidance of our CVPR paper.

  2. Forward computation with look-up tables during the test phase is included in these two functions:

  • CalcFeatMap_ConvAprx()
  • CalcFeatMap_FCntAprx()
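For point 1, a rough sketch of the plain product-quantization baseline for learning D and B is shown below: k-means run independently in each subspace of a layer's weight matrix. This is not the authors' training code (the CVPR paper goes further and minimizes the error of each layer's response rather than just the weight reconstruction error); the names and data layout are assumptions.

```cpp
#include <vector>

// weights : T rows, each of length M * D; row t is the weight vector of
//           output unit t. Learns per-subspace sub-codebooks (D matrices)
//           and sub-codeword assignments (B matrices) by k-means.
void LearnCodebooks(const std::vector<std::vector<float>>& weights,
                    int M, int D, int K, int iters,
                    std::vector<std::vector<std::vector<float>>>* codebook,  // [M][K][D]
                    std::vector<std::vector<int>>* assign) {                 // [M][T]
  const int T = static_cast<int>(weights.size());
  codebook->assign(M, std::vector<std::vector<float>>(K, std::vector<float>(D, 0.0f)));
  assign->assign(M, std::vector<int>(T, 0));
  for (int m = 0; m < M; ++m) {
    // Initialize sub-codewords from existing weight subvectors.
    for (int k = 0; k < K; ++k)
      for (int d = 0; d < D; ++d)
        (*codebook)[m][k][d] = weights[k % T][m * D + d];
    for (int it = 0; it < iters; ++it) {
      // Assignment step: nearest sub-codeword for each weight subvector.
      for (int t = 0; t < T; ++t) {
        float best = 1e30f;
        int best_k = 0;
        for (int k = 0; k < K; ++k) {
          float dist = 0.0f;
          for (int d = 0; d < D; ++d) {
            float diff = weights[t][m * D + d] - (*codebook)[m][k][d];
            dist += diff * diff;
          }
          if (dist < best) { best = dist; best_k = k; }
        }
        (*assign)[m][t] = best_k;
      }
      // Update step: each sub-codeword becomes the mean of its subvectors.
      for (int k = 0; k < K; ++k) {
        std::vector<float> mean(D, 0.0f);
        int cnt = 0;
        for (int t = 0; t < T; ++t) {
          if ((*assign)[m][t] != k) continue;
          for (int d = 0; d < D; ++d) mean[d] += weights[t][m * D + d];
          ++cnt;
        }
        if (cnt > 0)
          for (int d = 0; d < D; ++d) (*codebook)[m][k][d] = mean[d] / cnt;
      }
    }
  }
}
```

The learned codebook and assign structures can then be fed to a look-up-table forward pass like the one sketched earlier in this thread.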

@victorygogogo
Author

OK!
Do you plan to open-source all of the code?

@jiaxiang-wu
Collaborator

jiaxiang-wu commented Jun 17, 2018

We do not have such a plan at the moment. Sorry.

@wuzhiyang2016

Is the whole code released?

@jiaxiang-wu
Collaborator

@wuzhiyang2016 No, only the inference code is released.
