1x1 or 3x3 stem conv? #22

lucasb-eyer · 2020-07-26T12:46:33Z

Hi, just like you, I wanted to try s2d stem after reading the isometric nets paper :)

I noticed that in your paper, Figure 1, you show using 4x4 s2d followed by a 1x1 conv64. However, in your code here you clearly follow the 4x4 s2d by a 3x3 conv64. So, which one is used for the results in the paper?

lucasb-eyer · 2020-07-26T13:20:44Z

Update:

I did try both in my setting, and 1x1 conv gives NaN loss very early, while 3x3 actually works.
Speed-wise, it seems both 1x1 and 3x3 are similar (I'm surprised by this), and both faster than the original stem.
In the isometric nets paper, Table 2 also suggests they're using 1x1, while the text at the bottom left of page 4 suggests 3x3. Hard to tell without code, so I'll reach out to the authors.

From my current experiment, I am guessing that the 1x1 in your paper is a typo and should be 3x3. However, it appears as 1x1 both in Fig1 and Tab2, making this unlikely. So I'm looking forward to your clarification.

mrT23 · 2020-07-27T13:13:16Z

Hi Lucas.
Thanks for the comment.
You are correct, there is a mismatch between the code and the paper.
The code is correct: it should be 3x3.

Don't be surprised that 1x1 and 3x3 convs are similar on GPU (in my past tests, 3x3 was usually 1.5-2 times slower, not the 9 times its FLOP count would suggest).
That's because GPUs are limited by memory access, not FLOPs. Due to caching, a 3x3 conv has a similar memory cost to a 1x1 conv.
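
A rough back-of-the-envelope sketch of the FLOPs-versus-memory point, not a benchmark. It assumes a 224x224 RGB input (so 56x56x48 after 4x4 s2d), 64 output channels as in the stem discussed here, and an idealized cache where each input, weight, and output element is touched exactly once. The helper `conv_cost` is hypothetical and only for illustration.

```python
def conv_cost(h, w, c_in, c_out, k):
    """Return (multiply-accumulates, elements touched) for a stride-1 'same' conv.

    Assumption: perfect caching, so every input, weight and output element
    is read or written exactly once.
    """
    macs = h * w * c_in * c_out * k * k
    elements = h * w * c_in + k * k * c_in * c_out + h * w * c_out
    return macs, elements


# Assumed setting: 224x224 RGB image after 4x4 space-to-depth.
H = W = 224 // 4      # 56
C_IN = 3 * 4 * 4      # 48 channels after s2d
C_OUT = 64

macs_1x1, mem_1x1 = conv_cost(H, W, C_IN, C_OUT, k=1)
macs_3x3, mem_3x3 = conv_cost(H, W, C_IN, C_OUT, k=3)

print("MAC ratio 3x3 / 1x1:    ", macs_3x3 / macs_1x1)
print("element ratio 3x3 / 1x1:", mem_3x3 / mem_1x1)
```

Under these assumptions the 3x3 conv does 9 times the arithmetic but touches only slightly more data, since the extra cost is just the larger weight tensor. Real kernels will not hit this ideal, so the measured gap can differ.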

I don't think replacing the 3x3 conv with 1x1 should give NaNs. Make sure you initialize it properly.

Tal

lucasb-eyer · 2020-07-27T20:49:16Z

Thanks for your quick answer! Yeah, I agree the NaNs are suspicious and unexpected.

However, using 1x1 with s2d makes the receptive field of the stem only 4x4, whereas the original stem has 11x11 after the max-pool, and this difference will amplify a lot throughout the full network. Using 3x3 with s2d makes the receptive field of the stem 12x12, so almost exactly the same as the original. So, I agree with your code, 3x3 makes a lot more sense.
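
The receptive-field numbers above can be checked with a small sketch. It assumes the original stem is the standard ResNet one (7x7 conv with stride 2, then 3x3 max-pool with stride 2) and models the 4x4 s2d as a layer with kernel 4 and stride 4. The function name `receptive_field` is just for illustration.

```python
def receptive_field(layers):
    """Receptive field, in input pixels, of a stack of (kernel, stride) layers."""
    rf = 1     # receptive field of a single input pixel
    jump = 1   # distance in input pixels between neighbouring outputs
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


# Assumed standard ResNet stem: 7x7 conv stride 2, then 3x3 max-pool stride 2.
original_stem = [(7, 2), (3, 2)]
# 4x4 space-to-depth modeled as kernel 4, stride 4, followed by the stem conv.
s2d_1x1_stem = [(4, 4), (1, 1)]
s2d_3x3_stem = [(4, 4), (3, 1)]

for name, stem in [("original", original_stem),
                   ("s2d + 1x1", s2d_1x1_stem),
                   ("s2d + 3x3", s2d_3x3_stem)]:
    print(name, receptive_field(stem))
# original 11
# s2d + 1x1 4
# s2d + 3x3 12
```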

mrT23 closed this as completed Aug 3, 2020
