How to quantize the inputs and weights? #21
Comments
I trained a model in float precision. I have read your paper; how do you solve this problem?

Hi @victorygogogo,
8-bit quantization, with weights, biases, and inputs all quantized? Do you mean uniform quantization, where the possible quantized values are [-k, -k+1, ..., -1, 0, 1, ..., k-1, k]? If so, you may want to refer to DoReFa-Net, which suits your problem better. Our approach uses non-uniform quantization, and only the weights are quantized (biases and inputs are not).
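For reference, the uniform scheme mentioned above can be sketched as follows. This is a minimal illustration, not code from this repository; the function names and the symmetric max-abs scaling are assumptions.

```python
import numpy as np

def uniform_quantize(w, num_bits=8):
    # Symmetric uniform quantization: map floats onto the integer grid
    # [-k, ..., k], where k = 2^(num_bits - 1) - 1 (127 for 8 bits).
    k = 2 ** (num_bits - 1) - 1
    scale = np.abs(w).max() / k          # step size between adjacent levels
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original floats.
    return q.astype(np.float32) * scale

w = np.array([-0.5, 0.0, 0.25, 0.5], dtype=np.float32)
q, s = uniform_quantize(w)
w_hat = dequantize(q, s)
```

Non-uniform quantization instead places the representable values adaptively (e.g. via a learned codebook), rather than on this fixed evenly spaced grid.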
Only the weights are quantized? If only the weights are quantized, how do you speed up the inference time? For example, how do you speed up the GEMM-based convolution?
Sorry for the late reply. In our work, only the weights are quantized. During the test phase, the matrix multiplication is converted into a series of table look-up operations (please refer to our paper for details). This results in a reduction in FLOPs.
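The table-lookup idea can be sketched as below. This is a simplified stand-in for the paper's product-quantization approach, not the repository's C++ implementation; the function names, the naive k-means, and all parameter choices are hypothetical.

```python
import numpy as np

def quantize_weights(W, num_sub=2, num_codewords=4, iters=20):
    # Split each weight row into `num_sub` subvectors and learn a small
    # k-means codebook per subspace; each subvector is replaced by the
    # index of its nearest codeword.
    d = W.shape[1] // num_sub
    codebooks, codes = [], []
    rng = np.random.default_rng(0)
    for s in range(num_sub):
        sub = W[:, s * d:(s + 1) * d]
        C = sub[rng.choice(len(sub), num_codewords, replace=False)].copy()
        for _ in range(iters):
            assign = np.argmin(((sub[:, None] - C[None]) ** 2).sum(-1), axis=1)
            for k in range(num_codewords):
                if np.any(assign == k):
                    C[k] = sub[assign == k].mean(axis=0)
        codebooks.append(C)
        codes.append(assign)
    return codebooks, codes

def lookup_matmul(x, codebooks, codes):
    # Test phase: precompute the inner product between each input subvector
    # and every codeword (a few dot products), then replace the full matrix
    # multiply by table look-ups and additions.
    num_sub = len(codebooks)
    d = len(x) // num_sub
    out = np.zeros(len(codes[0]))
    for s in range(num_sub):
        table = codebooks[s] @ x[s * d:(s + 1) * d]  # one dot per codeword
        out += table[codes[s]]                       # look-ups, no multiplies
    return out

rng = np.random.default_rng(1)
W = rng.standard_normal((8, 4))
x = rng.standard_normal(4)
cb, cd = quantize_weights(W)
y = lookup_matmul(x, cb, cd)
```

Because the look-up table is shared across all output rows, the per-row cost drops from a full dot product to a handful of table reads, which is where the FLOP reduction comes from.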
Can you tell me where to find the table look-up matrix multiplication in your code?
For convolutional layers, please refer to:
For fully-connected layers, please refer to:
@jiaxiang-wu How do you do the weight quantization?
ok ! |
We do not have such a plan for the moment. Sorry.
Is the whole code released?
@wuzhiyang2016 No, only the inference code is released. |
First, thanks for your paper.
I want to know how to quantize the weights and inputs to reproduce the result.
Can you point me to some code for the quantization?