Some questions about the noise #2

Closed
HazardFY opened this issue Dec 16, 2021 · 1 comment

HazardFY commented Dec 16, 2021

Hello, thanks for sharing the code!
But I have some questions about it. In the code, it seems that the noise is applied only to the BN layers, not the Conv layers. According to the description in the paper, the perturbations to the weight and bias of a neuron may cancel each other out due to the BN layers. So if the network contains BN layers, the ANP algorithm only needs to perturb the neurons in the BN layers; otherwise, it will perturb the neurons in the Conv layers. Is that right? Could you please provide experimental code for a network that does not contain BN layers?

csdongxian (Owner) commented

Thank you for your interest in our research.

> The ANP algorithm only needs to perturb the neurons in the BN layers; otherwise, it will perturb the neurons in the Conv layers. Is that right?

You are correct. Let me explain in more detail:

During inference, a BatchNorm layer is a linear function, so it can be absorbed into the preceding Conv layer (see How to absorb batch norm layer weights into Convolution layer weights?). This means a complete layer actually consists of a Conv layer and a BatchNorm layer, and so do the neurons in this layer. Taking ResNet as an example, we can use the scaling factors in the BatchNorm layer to control the outputs of the neurons in that layer (similar methods can be found in [1]). As a result, perturbing BN layers is, in my opinion, more natural than perturbing Conv layers.
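
For illustration, here is a minimal PyTorch sketch of that folding, assuming a `Conv2d` immediately followed by a `BatchNorm2d` in eval mode (the function name and setup are mine, not part of the ANP repository):

```python
import torch
import torch.nn as nn

@torch.no_grad()
def fold_bn_into_conv(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Absorb an inference-time BatchNorm into the preceding Conv layer.

    At inference, BN computes y = gamma * (x - mean) / sqrt(var + eps) + beta,
    a linear map that folds into the Conv weight and bias.
    """
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      stride=conv.stride, padding=conv.padding, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)  # per-channel gamma / std
    fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
    conv_bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    fused.bias.copy_((conv_bias - bn.running_mean) * scale + bn.bias)
    return fused

# Sanity check: the fused layer matches Conv followed by BN in eval mode.
conv, bn = nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16)
with torch.no_grad():  # give BN non-trivial statistics and affine parameters
    bn.running_mean.uniform_(-1, 1)
    bn.running_var.uniform_(0.5, 2.0)
    bn.weight.uniform_(0.5, 2.0)
    bn.bias.uniform_(-1, 1)
bn.eval()
x = torch.randn(1, 3, 32, 32)
assert torch.allclose(fold_bn_into_conv(conv, bn)(x), bn(conv(x)), atol=1e-5)
```

Because the Conv and BN fold into one linear map, perturbing the BN parameters perturbs the effective weight and bias of the fused neuron at the same time.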

[1] Zhuang Liu, Jianguo Li, Zhiqiang Shen, Gao Huang, Shoumeng Yan, Changshui Zhang. Learning Efficient Convolutional Networks through Network Slimming. In ICCV, 2017. https://arxiv.org/pdf/1708.06519.pdf
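
And for concreteness, a hypothetical sketch of what perturbing these "complete" neurons through their BN parameters could look like; the function, the noise bound `delta`, and the uniform noise are illustrative assumptions, not the repository's actual implementation:

```python
import torch
import torch.nn as nn

def perturb_bn(bn: nn.BatchNorm2d, delta: float = 0.4) -> None:
    """Apply multiplicative noise to each BN channel (illustrative, not the ANP code).

    Each channel of the BN layer corresponds to one Conv+BN neuron, so scaling
    gamma and beta perturbs that neuron's effective weight and bias together.
    """
    with torch.no_grad():
        bn.weight.mul_(1 + torch.empty_like(bn.weight).uniform_(-delta, delta))
        bn.bias.mul_(1 + torch.empty_like(bn.bias).uniform_(-delta, delta))

# Usage: perturb every BN layer of a model in place.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU())
for m in model.modules():
    if isinstance(m, nn.BatchNorm2d):
        perturb_bn(m)
```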

> Could you please provide experimental code for a network that does not contain BN layers?

Sorry, I can't. Modern DNNs rely heavily on normalization layers (BatchNorm, LayerNorm, and others), and without them they typically fail to train on commonly used datasets such as CIFAR-10.

Hope this helps :)
