why not quantize the activation of the last conv layer in a block #20

Closed
frankgt opened this issue Sep 27, 2021 · 3 comments

frankgt commented Sep 27, 2021

Hi,
Thanks for releasing your code, but I have a question about one implementation detail.
In quant_block.py, take the following code for ResNet-18 and ResNet-34 as an example: disable_act_quant is set to True for conv2, which disables quantization of the output of conv2.

class QuantBasicBlock(BaseQuantBlock):
    """
    Implementation of Quantized BasicBlock used in ResNet-18 and ResNet-34.
    """
    def __init__(self, basic_block: BasicBlock, weight_quant_params: dict = {}, act_quant_params: dict = {}):
        super().__init__(act_quant_params)
        self.conv1 = QuantModule(basic_block.conv1, weight_quant_params, act_quant_params)
        self.conv1.activation_function = basic_block.relu1
        self.conv2 = QuantModule(basic_block.conv2, weight_quant_params, act_quant_params, disable_act_quant=True)

        # modify the activation function to ReLU
        self.activation_function = basic_block.relu2

        if basic_block.downsample is None:
            self.downsample = None
        else:
            self.downsample = QuantModule(basic_block.downsample[0], weight_quant_params, act_quant_params,
                                          disable_act_quant=True)
        # copying all attributes in original block
        self.stride = basic_block.stride
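
For reference, here is a minimal sketch of how a disable_act_quant flag of this kind typically takes effect inside a quantized conv wrapper's forward pass. This is an illustrative simplification and not the repository's verbatim QuantModule; the class name QuantModuleSketch and the quantizer arguments are placeholders.

import torch.nn as nn
import torch.nn.functional as F

class QuantModuleSketch(nn.Module):
    """Illustrative stand-in for QuantModule: a conv with fake-quantized weights
    and an optional activation quantizer on its output."""
    def __init__(self, conv: nn.Conv2d, weight_quantizer: nn.Module,
                 act_quantizer: nn.Module, disable_act_quant: bool = False):
        super().__init__()
        self.conv = conv
        self.weight_quantizer = weight_quantizer   # fake-quantizes the weights
        self.act_quantizer = act_quantizer         # fake-quantizes the output activations
        self.activation_function = nn.Identity()   # the block replaces this with ReLU
        self.disable_act_quant = disable_act_quant

    def forward(self, x):
        w = self.weight_quantizer(self.conv.weight)
        out = F.conv2d(x, w, self.conv.bias, self.conv.stride,
                       self.conv.padding, self.conv.dilation, self.conv.groups)
        out = self.activation_function(out)
        if self.disable_act_quant:
            # conv2 / downsample case above: leave this output in full precision;
            # quantization happens later, after the block's element-wise add.
            return out
        return self.act_quantizer(out)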

This gives a boost in accuracy. Below are the results I get using your code and the same ImageNet dataset used in the paper; [1] and [2] denote the modifications I made to the original code.

[screenshot of the accuracy results]

[1]: quant_block.py → QuantBasicBlock → __init__: in self.conv2 = QuantModule(..., disable_act_quant=True) and self.downsample = QuantModule(basic_block.downsample[0], weight_quant_params, act_quant_params, disable_act_quant=True), change disable_act_quant from True to False;
[2]: quant_block.py → QuantInvertedResidual → __init__: in self.conv = nn.Sequential(..., QuantModule(..., disable_act_quant=True)), change disable_act_quant from True to False.

However, I do not think this is applicable to most NPUs, which quantize the output of every conv layer.
So why not quantize the activation of the last conv layer in a block? Is there a particular reason for this?
Also, for the methods you compared against in your paper, have you checked whether they do the same thing as you do?

yhhhli (Owner) commented Sep 28, 2021

Hi, thanks for your comment.

Indeed, there is some disagreement on where to insert activation quantization nodes. As you point out, some hardware can use this computation graph (TensorRT, for example) while other hardware cannot. However, many works choose to quantize only the input and the weights of a Conv2d layer and do not handle the shortcut branch; under that setting, we believe our method would have lower accuracy. It is not possible to find a single setting shared by all the methods we compared against.
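
To make the placement concrete, the block-level convention under discussion looks roughly like the following. This is only an illustrative sketch, not the repository's verbatim forward; it assumes the block-level act_quantizer lives in BaseQuantBlock (an assumption based on the __init__ quoted above).

def basic_block_forward(block, x):
    # Sketch of the block-level placement: with disable_act_quant=True on conv2
    # and the downsample branch, their outputs stay in full precision until
    # after the element-wise add, and a single activation quantizer runs there.
    residual = x if block.downsample is None else block.downsample(x)
    out = block.conv1(x)        # conv1 output quantized inside its QuantModule
    out = block.conv2(out)      # conv2 output NOT quantized here
    out = block.activation_function(out + residual)   # relu2 after the add
    return block.act_quantizer(out)                   # one quantization node per block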

frankgt commented Sep 29, 2021

Hi,
Thanks for your reply.
Indeed, the quantization scheme depends on the implementation of the NPU hardware.
Another question: I am running your code on a single 2080Ti GPU, and a CUDA out-of-memory error occurs for the deeper networks with a large batch size (64). I have to use a smaller batch size, 32 for MobileNetV2 or even 16 for ResNet-50.
Do you have the same problem? Any suggestions?

yhhhli (Owner) commented Sep 29, 2021

Good question.

In fact, GPU memory is mainly consumed by storing the inputs and outputs of each block, so there are two ways to deal with this:

  1. Use multi-GPU reconstruction; we release code for this based on torch.distributed.
  2. Reduce the size of the calibration set, for example to 512 or 768 samples (a rough sketch of this follows below).
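
As a rough illustration of option 2, a smaller calibration set can be assembled as below. The function name build_calibration_set and its arguments are placeholders for this sketch, not the repository's actual API.

import torch
from torch.utils.data import DataLoader

def build_calibration_set(train_dataset, num_samples=512, batch_size=32):
    """Collect a small, fixed calibration tensor. Fewer samples mean less GPU
    memory when block inputs/outputs are cached during reconstruction."""
    loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    batches, collected = [], 0
    for images, _ in loader:        # assumes a standard (image, label) dataset
        batches.append(images)
        collected += images.size(0)
        if collected >= num_samples:
            break
    return torch.cat(batches, dim=0)[:num_samples]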

frankgt closed this as completed Sep 30, 2021