Thanks for providing an implementation of this architecture. I'm currently implementing a variant of it, and I wondered why padding is used in the conv layers, since it is not mentioned in the LiLaNet paper (at least I couldn't find it). From pytorch-LiLaNet/lilanet/model/lilanet.py, lines 78 to 81 at f68aae9:
self.branch1 = BasicConv2d(in_channels, n, kernel_size=(7, 3), padding=(2, 0))
self.branch2 = BasicConv2d(in_channels, n, kernel_size=3)
self.branch3 = BasicConv2d(in_channels, n, kernel_size=(3, 7), padding=(0, 2))
self.conv = BasicConv2d(n * 3, n, kernel_size=1, padding=1)
I think it makes sense to apply padding along the axis where the kernel size is 7, so that the spatial dimensions decrease in the same way as along the axis with kernel size 3. But this isn't mentioned in the paper, or am I missing something?
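The equal-shrinkage intuition can be checked with the standard conv output-size formula (the 64 x 512 input size below is just an illustrative assumption, not from the repo):

```python
def out_size(size, kernel, pad):
    """Conv output size for stride 1, dilation 1: in + 2*pad - kernel + 1."""
    return size + 2 * pad - kernel + 1

h, w = 64, 512  # hypothetical range-image input

# branch1: kernel (7, 3), padding (2, 0)
b1 = (out_size(h, 7, 2), out_size(w, 3, 0))
# branch2: kernel 3, no padding
b2 = (out_size(h, 3, 0), out_size(w, 3, 0))
# branch3: kernel (3, 7), padding (0, 2)
b3 = (out_size(h, 3, 0), out_size(w, 7, 2))

print(b1, b2, b3)  # (62, 510) (62, 510) (62, 510)
```

With padding 2 on the kernel-7 axis, every branch shrinks each spatial dimension by exactly 2, so the branch outputs can be concatenated.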
Also, why is a padding of 1 applied in the 1x1 convolution?
Bonus question: So both spatial dimensions are decreased by one after each LiLaBlock. Why not use an (additional) padding of 1 so that the size is preserved through the network?
I'm just a beginner in deep learning so any help is appreciated.
Okay, I realized that the equal decrease along both spatial dimensions is actually required so that all three parallel branch convolutions produce tensors with the same spatial dimensions, which LiLaNet needs for the subsequent channel-wise concatenation.
I also realized that with the current choice of padding the spatial dimensions are indeed preserved through the network. However, this is achieved by the padding of 1 in the 1x1 convolution, which seems a bit strange to me: on the padded border, the 1x1 kernels see nothing but the zero padding.
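That size bookkeeping can be made explicit with the same output-size formula (64 is a hypothetical input height):

```python
def out_size(size, kernel, pad):
    """Conv output size for stride 1, dilation 1: in + 2*pad - kernel + 1."""
    return size + 2 * pad - kernel + 1

h = 64                                          # hypothetical input height
after_branches = out_size(h, 3, 0)              # each branch: 64 -> 62
after_fusion = out_size(after_branches, 1, 1)   # 1x1 conv, padding 1: 62 -> 64
print(after_branches, after_fusion)  # 62 64
```

So the block as a whole preserves the spatial size, but only because the 1x1 fusion conv pads by 1, and on that one-pixel border its inputs are all zeros.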
What about removing this padding and instead increasing the padding of the preceding conv layers, where the kernels are larger?
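A sketch of that suggested variant, with size-preserving ("same") padding on the branches and no padding on the 1x1 fusion conv. The `BasicConv2d` stand-in and the class name `LiLaBlockSamePad` are my own assumptions for illustration (the repo's `BasicConv2d` also applies batch norm):

```python
import torch
import torch.nn as nn


class BasicConv2d(nn.Module):
    """Minimal stand-in: conv + ReLU (the repo's version also adds BatchNorm)."""

    def __init__(self, in_channels, out_channels, **kwargs):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, **kwargs)

    def forward(self, x):
        return torch.relu(self.conv(x))


class LiLaBlockSamePad(nn.Module):
    """LiLaBlock variant: "same" padding on each branch, none on the 1x1 conv."""

    def __init__(self, in_channels, n):
        super().__init__()
        self.branch1 = BasicConv2d(in_channels, n, kernel_size=(7, 3), padding=(3, 1))
        self.branch2 = BasicConv2d(in_channels, n, kernel_size=3, padding=1)
        self.branch3 = BasicConv2d(in_channels, n, kernel_size=(3, 7), padding=(1, 3))
        self.conv = BasicConv2d(n * 3, n, kernel_size=1)  # no padding needed

    def forward(self, x):
        out = torch.cat([self.branch1(x), self.branch2(x), self.branch3(x)], dim=1)
        return self.conv(out)


x = torch.randn(1, 2, 64, 512)
y = LiLaBlockSamePad(2, 8)(x)
print(y.shape)  # torch.Size([1, 8, 64, 512])
```

Here every branch already keeps the input size (e.g. kernel 7 with padding 3: 64 + 6 - 7 + 1 = 64), so the 1x1 conv never operates on a pure-zero border.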