
[Question] Usage of padding #1

Open
ptoews opened this issue Apr 19, 2021 · 1 comment

ptoews commented Apr 19, 2021

Thanks for providing an implementation of this architecture. I'm currently implementing a variant of it and wondered why padding is applied in the conv layers, even though it is not mentioned in the LiLaNet paper (at least I couldn't find it):

self.branch1 = BasicConv2d(in_channels, n, kernel_size=(7, 3), padding=(2, 0))
self.branch2 = BasicConv2d(in_channels, n, kernel_size=3)
self.branch3 = BasicConv2d(in_channels, n, kernel_size=(3, 7), padding=(0, 2))
self.conv = BasicConv2d(n * 3, n, kernel_size=1, padding=1)

I think it makes sense to apply padding along the axis where the kernel size is 7, so that the spatial dimensions shrink in the same way as along the axis with kernel size 3. But this isn't mentioned in the paper, or am I missing something?
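For concreteness, here is a quick shape check (just a sketch: plain nn.Conv2d stands in for BasicConv2d, assuming the wrapper does not change the spatial size further, and the input size and channel counts are arbitrary example values):

import torch
import torch.nn as nn

x = torch.randn(1, 2, 64, 512)  # arbitrary example input (N, C, H, W)

branch1 = nn.Conv2d(2, 16, kernel_size=(7, 3), padding=(2, 0))
branch2 = nn.Conv2d(2, 16, kernel_size=3)
branch3 = nn.Conv2d(2, 16, kernel_size=(3, 7), padding=(0, 2))

for branch in (branch1, branch2, branch3):
    print(branch(x).shape)
# All three print torch.Size([1, 16, 62, 510]): with this padding every branch
# shrinks H and W by 2, exactly like the unpadded 3x3 branch.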

Also, why is a padding of 1 applied in the 1x1 convolution?

Bonus question: So both spatial dimensions are decreased by one after each LiLaBlock. Why not use an (additional) padding of 1 so that the size is preserved through the network?

I'm just a beginner in deep learning, so any help is appreciated.

ptoews commented Apr 19, 2021

Okay, so I realized that the equal decrease along both spatial dimensions is actually required so that all of the parallel convolutions produce tensors with the same spatial dimensions, which LiLaNet requires because of the subsequent stacking (concatenation along the channel dimension).
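To double-check this (hypothetical shapes, matching the sketch in my first comment): concatenation along the channel dimension only works when every other dimension, including H and W, agrees.

import torch

# Three branch outputs with identical spatial dimensions can be stacked channel-wise.
a = torch.randn(1, 16, 62, 510)
b = torch.randn(1, 16, 62, 510)
c = torch.randn(1, 16, 62, 510)
print(torch.cat([a, b, c], dim=1).shape)  # torch.Size([1, 48, 62, 510])
# If the branches shrank H or W by different amounts, torch.cat would raise an error.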

I also realized that with the current choice of padding, the spatial dimensions are indeed preserved through the network. However, this is only because of the padding of 1 in the 1x1 convolution, which seems a bit strange to me: at the edges, a 1x1 kernel can only ever see the zero padding itself.
What about removing this padding and instead increasing the padding of the previous conv layers, where the kernels are larger?
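Roughly what I have in mind (just a sketch with hypothetical values, again using plain nn.Conv2d in place of BasicConv2d and arbitrary channel counts): pad each branch one step more and drop the padding on the 1x1 convolution, which also preserves the spatial size.

import torch
import torch.nn as nn

x = torch.randn(1, 2, 64, 512)  # arbitrary example input (N, C, H, W)

# Branches padded so that each one preserves H and W on its own.
branch1 = nn.Conv2d(2, 16, kernel_size=(7, 3), padding=(3, 1))
branch2 = nn.Conv2d(2, 16, kernel_size=3, padding=1)
branch3 = nn.Conv2d(2, 16, kernel_size=(3, 7), padding=(1, 3))
conv = nn.Conv2d(16 * 3, 16, kernel_size=1)  # 1x1 conv without padding

out = conv(torch.cat([branch1(x), branch2(x), branch3(x)], dim=1))
print(out.shape)  # torch.Size([1, 16, 64, 512]) -> spatial size preserved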
