In fact, there are several variants of the residual block. As discussed in the ResNet ECCV paper (https://arxiv.org/pdf/1603.05027.pdf), the activation after the addition (the original version) impedes information propagation and causes performance degradation. You can find a detailed comparison of the different variants in the paper.
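For concreteness, here is a minimal sketch (illustrative class names, not the repo's code; a plain 2-conv block with BatchNorm is assumed) contrasting the original post-activation block with the full pre-activation block from that paper:

```python
import torch.nn as nn

class PostActivationBlock(nn.Module):
    """Original ResNet block: the ReLU after the addition also acts on the skip path."""
    def __init__(self, dim):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1), nn.BatchNorm2d(dim), nn.ReLU(True),
            nn.Conv2d(dim, dim, 3, padding=1), nn.BatchNorm2d(dim),
        )
        self.relu = nn.ReLU(True)

    def forward(self, x):
        # activation applied after the addition
        return self.relu(x + self.body(x))

class PreActivationBlock(nn.Module):
    """Full pre-activation block: norm and ReLU precede each conv; the shortcut is untouched."""
    def __init__(self, dim):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(dim), nn.ReLU(True), nn.Conv2d(dim, dim, 3, padding=1),
            nn.BatchNorm2d(dim), nn.ReLU(True), nn.Conv2d(dim, dim, 3, padding=1),
        )

    def forward(self, x):
        # identity shortcut, no activation after the addition
        return x + self.body(x)
```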
Thanks for the reply (and the excellent paper and code!). One follow-on question: if the goal is to use a full pre-activation configuration (to preserve the skip connection), shouldn't the norm and activation appear before the convolution in the ResNet?
Since there are several variants, our implementation does not strictly follow the pre-activation design, but it still removes the activation after the addition operator. This implementation style is also widely adopted in research work.
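A minimal sketch of that style (assumed structure, mirroring the printout below rather than quoting the repo verbatim): the branch keeps the conv-norm-ReLU ordering, and the sum with the shortcut is returned without a trailing activation.

```python
import torch.nn as nn

class ResnetBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # pad -> conv -> norm -> ReLU -> pad -> conv -> norm, as in the printed model
        self.conv_block = nn.Sequential(
            nn.ReflectionPad2d(1),
            nn.Conv2d(dim, dim, kernel_size=3),
            nn.InstanceNorm2d(dim),
            nn.ReLU(True),
            nn.ReflectionPad2d(1),
            nn.Conv2d(dim, dim, kernel_size=3),
            nn.InstanceNorm2d(dim),
        )

    def forward(self, x):
        # skip connection; note there is no ReLU applied to the sum
        return x + self.conv_block(x)
```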
I'm sorry if I'm just being silly, but shouldn't there be a ReLU (or other non-linearity) between ResNet layers (after the add)? For example, the mapping net architecture looks like:
```
...
(10): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(11): ReLU(inplace=True)
(12): ResnetBlock(
  (conv_block): Sequential(
    (0): ReflectionPad2d((1, 1, 1, 1))
    (1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1))
    (2): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (3): ReLU(inplace=True)
    (4): ReflectionPad2d((1, 1, 1, 1))
    (5): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1))
    (6): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
  )
)
(13): ResnetBlock(
  (conv_block): Sequential(
    (0): ReflectionPad2d((1, 1, 1, 1))
    (1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1))
...
```