
Bug when set ENABLE_BIAS_QUANT = True #28

Closed · dathudeptrai opened this issue Feb 10, 2020 · 5 comments

dathudeptrai commented Feb 10, 2020

Hi @volcacius, thanks for your implementation. I hit a bug when setting ENABLE_BIAS_QUANT = True. Can you help me debug it?

/usr/local/lib/python3.6/dist-packages/brevitas/nn/quant_conv.py in forward(self, input)
    224 
    225         if self.compute_output_bit_width:
--> 226             assert input_bit_width is not None
    227             output_bit_width = self.max_output_bit_width(input_bit_width, quant_weight_bit_width)
    228         if self.compute_output_scale:

AssertionError: 
@volcacius (Contributor)

Hello,
You need to pass a QuantTensor to the QuantConv2d layer, which happens when you set return_quant_tensor=True on the activation that precedes it. For the very first conv layer, you should insert a quantized identity (a quantized hard tanh) right at the beginning of the network.
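
For reference, here is a minimal sketch of that wiring (module and argument names follow a recent Brevitas release and may differ from the version discussed in this issue; treat it as a sketch rather than the exact fix):

import torch.nn as nn
from brevitas.nn import QuantIdentity, QuantConv2d, QuantReLU

class TinyQuantNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Quantized identity at the input so the first conv receives a
        # QuantTensor carrying a bit width and scale.
        self.input_quant = QuantIdentity(bit_width=8, return_quant_tensor=True)
        self.conv1 = QuantConv2d(3, 16, kernel_size=3, weight_bit_width=8,
                                 return_quant_tensor=True)
        # The activation also returns a QuantTensor, so the following conv
        # can compute its output bit width and scale (required once bias
        # quantization is enabled).
        self.act1 = QuantReLU(bit_width=8, return_quant_tensor=True)
        self.conv2 = QuantConv2d(16, 16, kernel_size=3, weight_bit_width=8)

    def forward(self, x):
        x = self.input_quant(x)
        x = self.act1(self.conv1(x))
        return self.conv2(x)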

As a side note, I see you are working on a quantized MelGAN implementation. GANs can be quite tricky to quantize. We have a working 8 bit version internally that we are going to release at some point in the next couple of months.

Alessandro

dathudeptrai (Author) commented Feb 10, 2020

@volcacius, yeah, I see. I managed to quantize MelGAN to float16 on TFLite, and it runs 2x faster than real time. At 8 bit the accuracy drops a lot and there is a lot of white noise. I'm still investigating your implementation and the TFLite one; somehow the outputs of TFLite and your framework differ at 8 bit (32-bit and 16-bit are almost the same). If you know how your quantization procedure differs from TFLite's, please let me know. I thought it was because of the bias, since I wasn't applying fake quantization to it, but removing the bias didn't solve the problem.
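
For context, float16 post-training quantization in TFLite is just a converter setting; a minimal sketch (the SavedModel path below is a placeholder, not one taken from this thread):

import tensorflow as tf

# Convert a SavedModel of the generator to a float16-quantized TFLite model.
converter = tf.lite.TFLiteConverter.from_saved_model("melgan_generator_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]

tflite_model = converter.convert()
with open("melgan_generator_fp16.tflite", "wb") as f:
    f.write(tflite_model)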

@volcacius (Contributor)

It really depends on how you are setting up the quantized layers.
In general TFLite is a great tool for production-oriented quantization, while Brevitas is oriented towards research, which is why it provides many more options.
For what it's worth, our internal results at 8 bit with MelGAN on LJSpeech are on par with floating-point quality. I'll be happy to share the details once the model is released.

dathudeptrai (Author) commented Feb 10, 2020

@volcacius Looking forward to your model. Note that my results on LJSpeech at 8 bit (based on this framework) are also on par with float32 (in PyTorch), but there are some differences after converting to TFLite. BTW, thanks again for your great implementation.

@volcacius (Contributor)

I see now, glad to hear about your good results and thanks for the positive feedback! Please cite us if you plan to release/publish them somewhere; I would really appreciate it.
On the export side, unfortunately there aren't any plans for a TFLite-compatible flow at the moment. We are working on a custom ONNX-based flow, but it's going to target deployment on our own FPGAs.
