
Calling the fuse function raises an error #4

Closed
xiaowanzizz opened this issue May 12, 2021 · 1 comment

Comments

@xiaowanzizz

```python
def fuse_conv_and_bn(conv, bn):
    # https://tehnokv.com/posts/fusing-batchnorm-and-conv/
    with torch.no_grad():
        # init
        fusedconv = torch.nn.Conv2d(conv.in_channels,
                                    conv.out_channels,
                                    kernel_size=conv.kernel_size,
                                    stride=conv.stride,
                                    padding=conv.padding,
                                    bias=True)

        # prepare filters
        w_conv = conv.weight.clone().view(conv.out_channels, -1)
        w_bn = torch.diag(bn.weight.div(torch.sqrt(bn.eps + bn.running_var)))
        fusedconv.weight.copy_(torch.mm(w_bn, w_conv).view(fusedconv.weight.size()))

        # prepare spatial bias
        if conv.bias is not None:
            b_conv = conv.bias
        else:
            b_conv = torch.zeros(conv.weight.size(0))
        b_bn = bn.bias - bn.weight.mul(bn.running_mean).div(torch.sqrt(bn.running_var + bn.eps))
        fusedconv.bias.copy_(torch.mm(w_bn, b_conv.reshape(-1, 1)).reshape(-1) + b_bn)

        return fusedconv
```

```
Fusing layers...
Traceback (most recent call last):
  File "test.py", line 263, in
    opt.augment)
  File "test.py", line 45, in test
    model.fuse()
  File "/home/zzf/Desktop/yolov3-dbb+representbatchnorm/models.py", line 402, in fuse
    fused = torch_utils.fuse_conv_and_bn(conv, b)
  File "/home/zzf/Desktop/yolov3-dbb+representbatchnorm/utils/torch_utils.py", line 83, in fuse_conv_and_bn
    w_bn = torch.diag(bn.weight.div(torch.sqrt(bn.eps + bn.running_var)))
RuntimeError: matrix or a vector expected
```

After replacing the batchnorm in my own network, this error is raised. Please help resolve it.
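For context, a minimal sketch (not the repo's code, layer sizes arbitrary) of what likely triggers this: `torch.diag` only accepts 1-D or 2-D input, so it works on a standard `BatchNorm2d` weight but fails once the weight carries extra dimensions.

```python
import torch

# Standard BatchNorm2d keeps `weight` and `running_var` as 1-D tensors of
# shape (C,), so torch.diag() can build the (C, C) scaling matrix:
bn = torch.nn.BatchNorm2d(4)
w_bn = torch.diag(bn.weight.div(torch.sqrt(bn.eps + bn.running_var)))
print(w_bn.shape)  # torch.Size([4, 4])

# torch.diag() only accepts 1-D or 2-D input. A modified BN layer that
# stores its weight with extra dimensions, e.g. (C, 1, 1), makes the same
# call raise the RuntimeError seen in the traceback:
try:
    torch.diag(torch.randn(4, 1, 1))
except RuntimeError as err:
    print("RuntimeError:", err)
```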

@Hanqer
Collaborator

Hanqer commented May 27, 2021

[image: frozen BN rewritten as a channel-wise linear (scale + shift) transform, from the linked post]
As shown in https://tehnokv.com/posts/fusing-batchnorm-and-conv/.
The frozen BN can be written as a channel-wise sparsely connected convolution with a bias. The first matrix represents the scaling weight of BN together with the scaling of the affine transformation; the bias term represents the centering weight of BN and the bias of the affine transformation.
So BN can be written as a 1x1 depth-wise convolution. Convolution is linear w.r.t. kernel weight and bias, so the 1x1 depth-wise convolution can be fused into the preceding normal convolution using a diagonal-matrix weight (channel-wise sparse).
But RBN performs very different operations: additional center calibration and scaling calibration are added. If you want to fuse RBN into a convolution, the fusion result might be a dynamic version of convolution, because RBN introduces instance-specific weights. I hope you can study it in depth.
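The linearity argument can be checked numerically. A minimal sketch (independent of the repo's helper; layer sizes are arbitrary): for a frozen, eval-mode `BatchNorm2d`, conv followed by BN equals a single conv whose weight and bias absorb the BN's channel-wise scale and shift.

```python
import torch

torch.manual_seed(0)
conv = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=True)
bn = torch.nn.BatchNorm2d(8).eval()  # frozen: uses running statistics

with torch.no_grad():
    # give the BN non-trivial frozen statistics and affine parameters
    bn.weight.copy_(torch.rand(8) + 0.5)
    bn.bias.copy_(torch.randn(8))
    bn.running_mean.copy_(torch.randn(8))
    bn.running_var.copy_(torch.rand(8) + 0.5)

    # absorb BN: BN(y) = y*scale + (beta - mean*scale), per output channel
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=True)
    fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
    fused.bias.copy_((conv.bias - bn.running_mean) * scale + bn.bias)

    x = torch.randn(2, 3, 16, 16)
    print(torch.allclose(bn(conv(x)), fused(x), atol=1e-5))  # True
```

This only holds because the BN is frozen; RBN's instance-specific calibration breaks the input-independence of `scale`, which is why a static fusion no longer applies.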

@gasvn gasvn closed this as completed Sep 1, 2022