Are the weights with quantization? #11
Comments
Hi, you are right, the weights are not quantized. I am not yet familiar with how to run inference with a quantized model, or whether it's possible with tfjs at all. But it would be awesome if we could reduce the model sizes that way. I will dig into it.
Quantization just changes your weights from float32 to uint8, so you get a 4x size decrease. I usually do it through the converter.
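To make the float32-to-uint8 mapping concrete, here is a minimal sketch of affine (min/scale) quantization, which is the scheme the tfjs converter's quantization code uses. The function name `quantize_uint8` is my own illustration, not a converter API:

```python
import numpy as np

def quantize_uint8(weights):
    """Map float32 weights onto [0, 255] with an affine transform.

    Stores one scale and one min per tensor; the uint8 buffer is
    a quarter the size of the float32 original.
    """
    w_min = float(weights.min())
    w_max = float(weights.max())
    # Guard against a constant tensor, where the range would be zero.
    scale = (w_max - w_min) / 255.0 if w_max > w_min else 1.0
    q = np.round((weights - w_min) / scale).astype(np.uint8)
    return q, scale, w_min

w = np.random.randn(4, 4).astype(np.float32)
q, scale, w_min = quantize_uint8(w)
print(q.nbytes, "bytes instead of", w.nbytes)  # 16 bytes instead of 64
```

In practice you would not call this by hand; the converter exposes the same behavior via its quantization option (e.g. requesting 1 byte per weight) when converting a model.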
I know that you can quantize the weights using bazel, but do the weights simply get dequantized once you load them again? I read somewhere that the ops in the network have to be aware of the quantized weights to run inference, but I might be wrong here. If it's the former, that should hopefully be easy to implement.
I'm not sure, I've never done manual quantization. https://github.com/tensorflow/tfjs-converter/blob/master/python/tensorflowjs/quantization_test.py From the looks of it, there could be a default scaling based on type.
Yep, it seems like you are right. Looking at the weight loader, dequantization is a simple scaling operation. Awesome! I will try to get this running soon; decreasing the model size from 28 MB to 7 MB looks promising.
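The "simple scaling operation" mentioned above can be sketched as follows, assuming the affine (scale, min) quantization scheme from the converter's quantization code; this is an illustration of the idea, not the weight loader's actual source:

```python
import numpy as np

def dequantize(q, scale, w_min):
    """Recover approximate float32 weights from a uint8 buffer:
    a single element-wise scale-and-shift."""
    return q.astype(np.float32) * scale + w_min

# Round trip on synthetic weights to check the reconstruction error.
w = np.linspace(-1.0, 1.0, 256).astype(np.float32)
w_min, scale = float(w.min()), (float(w.max()) - float(w.min())) / 255.0
q = np.round((w - w_min) / scale).astype(np.uint8)
w_hat = dequantize(q, scale, w_min)
print(np.abs(w - w_hat).max())  # at most half a quantization step
```

Since no op in the graph needs to change, the loader can dequantize on load and run inference exactly as before, at the cost of a small per-weight rounding error.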
Update: I managed to quantize the weights for the face detection and the face landmark model. Currently the changes are available on this branch. Apparently quantizing the face recognition model is not as straightforward, as it was not originally a tensorflow model. The issue is that simply quantizing all weights makes the model unusable, in that it returns wrong outputs. For now, leaving the weights of the conv64 layers uncompressed and quantizing the rest does seem to work, however. Long story short: I am still working on it.
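The selective approach described above (keep the quantization-sensitive conv64 layers in float32, quantize everything else) could be sketched like this; the helper and the name-matching rule are my own illustration, with the affine scheme assumed:

```python
import numpy as np

def quantize_selectively(named_weights, skip_substring="conv64"):
    """Quantize every tensor to uint8 except those whose name contains
    skip_substring, which stay float32 to preserve accuracy."""
    out = {}
    for name, w in named_weights.items():
        if skip_substring in name:
            out[name] = ("float32", w)
            continue
        w_min, w_max = float(w.min()), float(w.max())
        scale = (w_max - w_min) / 255.0 if w_max > w_min else 1.0
        q = np.round((w - w_min) / scale).astype(np.uint8)
        out[name] = ("uint8", q, scale, w_min)
    return out

weights = {
    "conv64_down": np.ones((2, 2), dtype=np.float32),  # kept as float32
    "fc": np.arange(4, dtype=np.float32),              # quantized
}
result = quantize_selectively(weights)
print(result["conv64_down"][0], result["fc"][0])  # float32 uint8
```

The trade-off is a smaller size reduction than full quantization, in exchange for keeping the layers that break under uint8 precision intact.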
And here it is :)
Thanks, will check it out.
I was looking at the weight file sizes; they seem to be the same size as in the original repos. It would be nice if the size could be reduced with quantization.