
Are the weights with quantization? #11

Closed
seranus opened this issue Jun 17, 2018 · 8 comments
Labels
enhancement New feature or request

Comments

@seranus

seranus commented Jun 17, 2018

I was looking at the weight file sizes; they seem to be the same size as in the original repos. It would be nice if the size could be reduced with quantization.

@justadudewhohacks
Owner

Hi,

You are right, the weights are not quantized. I am not yet familiar with how to run inference with a quantized model, or whether that's even possible with tfjs, but it would be awesome if we could reduce the model sizes that way. I will dig into it.

@seranus
Author

seranus commented Jun 18, 2018

Quantization is just converting your weights from float32 to uint8, so you get a 4x size decrease. I usually do it through the converter.
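
To make that concrete, here is a minimal sketch of affine min/max quantization from float32 to uint8, the kind of mapping the converter's quantization appears to apply. The function name and return shape are just illustrative, not the converter's actual code:

```ts
// Sketch: map each float32 weight into one of 256 uint8 buckets spanning [min, max].
function quantize(weights: Float32Array): { data: Uint8Array; scale: number; min: number } {
  let min = Infinity;
  let max = -Infinity;
  for (const w of weights) {
    if (w < min) min = w;
    if (w > max) max = w;
  }
  // One bucket covers (max - min) / 255; guard against constant tensors.
  const scale = (max - min) / 255 || 1;
  const data = new Uint8Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    data[i] = Math.round((weights[i] - min) / scale);
  }
  // scale and min have to be stored alongside the uint8 data for dequantization.
  return { data, scale, min };
}
```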

@justadudewhohacks
Owner

I know that you can quantize the weights using Bazel, but do the weights simply get dequantized again once you load them?

I read somewhere that the ops in the network have to be aware of the quantized weights to run inference, but I might be wrong here.

If it's simply the former, that should hopefully be easy to implement.

@seranus
Author

seranus commented Jun 18, 2018

I'm not sure; I never did manual quantization.

https://github.com/tensorflow/tfjs-converter/blob/master/python/tensorflowjs/quantization_test.py

From the looks of it, there could be a default scaling based on the data type.

@justadudewhohacks added the enhancement (New feature or request) label on Jun 18, 2018
@justadudewhohacks
Owner

Yep, seems like you are right. Looking at the weight loader, it's a simple scaling operation to dequantize the weights.

Awesome! I will try to get this running soon; decreasing the model size from 28 MB to 7 MB looks promising.
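
For reference, a minimal sketch of what that load-time scaling could look like (illustrative only, not the actual weight loader code), assuming the scale and min values were stored with each quantized tensor:

```ts
// Sketch: invert the quantization with a single scale-and-shift per value,
// so none of the ops in the network need to know the weights were quantized.
function dequantize(data: Uint8Array, scale: number, min: number): Float32Array {
  const weights = new Float32Array(data.length);
  for (let i = 0; i < data.length; i++) {
    weights[i] = data[i] * scale + min;
  }
  return weights;
}
```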

@justadudewhohacks
Owner

Update: So I managed to quantize the weights for the face detection and the face landmark model. Currently the changes are available on this branch.

Apparently quantizing the face recognition model is not as straightforward, as it originally was not a TensorFlow model. The issue here is that simply quantizing all weights makes the model unusable, in the sense that it returns wrong outputs. Right now, however, it seems that leaving the weights of the conv64 layers uncompressed and quantizing the rest does work out.

Long story short: I am still working on it.
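
For anyone curious, a hypothetical sketch of that selective approach, reusing the quantize sketch from the earlier comment and skipping layers by name. The pattern and map-based layout are assumptions for illustration, not the repo's actual weight format:

```ts
// Sketch: quantize every tensor except those whose names match a skip list
// (here the conv64 layers that produced wrong outputs when quantized).
const SKIP_PATTERNS = [/conv64/];

function quantizeSelectively(tensors: Map<string, Float32Array>) {
  const out = new Map<string, Float32Array | { data: Uint8Array; scale: number; min: number }>();
  for (const [name, weights] of tensors) {
    const skip = SKIP_PATTERNS.some(re => re.test(name));
    // Sensitive layers stay float32; everything else becomes uint8.
    out.set(name, skip ? weights : quantize(weights));
  }
  return out;
}
```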

@justadudewhohacks
Owner

And here it is :)

Model weights have been quantized, reducing the model sizes by ~75%:

  • face detection model: 21.7 MB -> 5.4 MB
  • face recognition model: 28.7 MB -> 7.0 MB
  • face landmark model: 21.9 MB -> 6.2 MB

Plus, model weights are now sharded into chunks of 4 MB to allow them to be cached by the browser.
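
A rough sketch of how such sharding could look; the 4 MB constant matches the description above, everything else (function name, in-memory slicing) is illustrative:

```ts
// Sketch: split one large weight buffer into 4 MB shards so each shard
// can be fetched and cached by the browser individually.
const SHARD_SIZE = 4 * 1024 * 1024; // 4 MB

function shardWeights(buffer: ArrayBuffer): ArrayBuffer[] {
  const shards: ArrayBuffer[] = [];
  for (let offset = 0; offset < buffer.byteLength; offset += SHARD_SIZE) {
    // slice clamps to the buffer length, so the last shard may be smaller.
    shards.push(buffer.slice(offset, offset + SHARD_SIZE));
  }
  return shards;
}
```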

@seranus
Author

seranus commented Jun 23, 2018

Thanks, will check it out

@seranus closed this as completed on Jun 23, 2018