
Quantization for speeding up #52

Closed
ymli39 opened this issue Mar 5, 2020 · 1 comment
ymli39 commented Mar 5, 2020

Hi, thanks for your code. Could you help me with the following question? I have incorporated your provided layers into a DenseUNet model; I have:

import brevitas.nn as qnn
from brevitas.core.quant import QuantType

conv = qnn.QuantConv2d(in_channels=params['num_channels'],
                       out_channels=params['num_filters'],
                       kernel_size=(params['kernel_h'], params['kernel_w']),
                       padding=(padding_h, padding_w),
                       stride=params['stride_conv'],
                       weight_quant_type=QuantType.INT,
                       weight_bit_width=8)

batchnorm = qnn.BatchNorm2dToQuantScaleBias(num_features=params['num_channels'],
                                            weight_quant_type=QuantType.INT,
                                            weight_bit_width=8)

relu = qnn.QuantReLU(quant_type=QuantType.INT, bit_width=8, max_val=6)

sigmoid = qnn.QuantSigmoid(bit_width=8, quant_type=QuantType.INT)

I only replaced those layers with their qnn counterparts and did not change anything else. The model trains successfully, but the running time on both GPU and CPU is actually slower than the PyTorch nn implementation. Did I do anything wrong? Shouldn't the model speed up training and inference by about 4x?

ymli39 changed the title from "Quantization for speed up" to "Quantization for speeding up" on Mar 5, 2020
volcacius (Contributor) commented

Hello,

Glad to hear about your good training results. However, Brevitas is a library oriented towards research on quantization-aware (re)training; it doesn't take care of deployment. It's up to the user to export a trained model to some kind of optimized hw+sw backend. Our main open source backend (currently being developed) is FINN, which deploys quantized models as custom dataflow architectures on FPGAs.
The fact that inference is slower than torch.nn is expected: quantization-aware operations involve exposing a differentiable, integer-only datapath on top of floating point, which can be expensive.
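To make the overhead concrete, here is a minimal sketch of what a fake-quantized weight path looks like (not Brevitas's actual implementation; the fake_quantize helper below is hypothetical, for illustration only):

import torch

def fake_quantize(x, bit_width=8):
    # Simulate a signed integer grid in floating point: scale, round, clamp, rescale.
    qmax = 2 ** (bit_width - 1) - 1
    scale = x.abs().max() / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    # The result is still a float tensor, so the following conv/matmul runs as an
    # ordinary floating-point kernel plus this extra work. QAT libraries also wrap
    # round() in a straight-through estimator so the path stays differentiable.
    return q * scale

w_q = fake_quantize(torch.randn(16, 3, 3, 3))

So every "quantized" layer still pays the full floating-point cost, plus the scaling, rounding, and clamping on top; the speedup only appears once the trained model is exported to a backend with real integer kernels.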
You might want to consider moving to PyTorch's official quantization tools. They won't be as good in terms of accuracy, but deployment to CPU/GPU is easier.
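For reference, a minimal eager-mode sketch of PyTorch's post-training static quantization workflow (TinyNet is a made-up stand-in for your model, and the 'fbgemm' qconfig targets x86 CPU inference):

import torch
import torch.nn as nn
import torch.quantization as tq

class TinyNet(nn.Module):
    # Hypothetical toy model standing in for the real network.
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # converts float input to int8
        self.conv = nn.Conv2d(3, 8, kernel_size=3)
        self.relu = nn.ReLU()
        self.dequant = tq.DeQuantStub()  # converts int8 output back to float
    def forward(self, x):
        return self.dequant(self.relu(self.conv(self.quant(x))))

model = TinyNet().eval()
model.qconfig = tq.get_default_qconfig('fbgemm')  # x86 CPU backend
prepared = tq.prepare(model)                       # insert calibration observers
prepared(torch.randn(1, 3, 32, 32))                # run some calibration data through
int8_model = tq.convert(prepared)                  # swap in real int8 CPU kernels

After convert(), the model executes with actual int8 kernels on CPU, which is where the latency benefit comes from.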

Alessandro
