
why the speed slower than pvtv2-b1? #18

Closed

lucasjinreal opened this issue Dec 20, 2021 · 5 comments

Comments

@lucasjinreal

lucasjinreal commented Dec 20, 2021

Recently I trained a transformer based instance seg model, tested with different backbone, here is the result and speed test:

[Image: table comparing backbones by precision, speed, and training batch size]

The batch size shown is the training batch size. Why is PoolFormer the slowest one? Is that normal?

It is slower than PVTv2-b1 and its precision is lower too...

@yuweihao
Collaborator

yuweihao commented Dec 20, 2021

Hi @jinfagang, PoolFormer here is just a tool to demonstrate MetaFormer, and the implementation may not be efficient enough for industrial use. For example, `nn.AvgPool2d` may not be well optimized in CUDA. It can be replaced with a depthwise conv, `self.token_mixer = nn.Conv2d(in_channels=dim, out_channels=dim, kernel_size=3, stride=1, padding=1, groups=dim)`, to speed things up. For GroupNorm, I don't yet know how to speed it up.
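As a minimal sketch of the swap described above (the `dim`, input shape, and kernel size are illustrative assumptions, not values from the PoolFormer source), the depthwise conv is a shape-compatible drop-in for the average-pooling token mixer:

```python
import torch
import torch.nn as nn

dim = 64  # illustrative channel count

# PoolFormer-style token mixer: 3x3 average pooling, stride 1
pool_mixer = nn.AvgPool2d(kernel_size=3, stride=1, padding=1,
                          count_include_pad=False)

# The suggested drop-in: a learnable 3x3 depthwise conv
dw_mixer = nn.Conv2d(in_channels=dim, out_channels=dim,
                     kernel_size=3, stride=1, padding=1, groups=dim)

x = torch.randn(1, dim, 56, 56)
# Both mixers preserve the (N, C, H, W) shape, so one can replace the other
assert pool_mixer(x).shape == dw_mixer(x).shape
```

Because the depthwise conv has learnable weights (unlike fixed pooling), the model would normally need retraining after this swap, which is what the next question is about.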

@lucasjinreal
Author

@yuweihao Hi, does that mean I would have to retrain the whole model if I change `nn.AvgPool2d` to a depthwise conv?

@chuong98

chuong98 commented Dec 21, 2021

@jinfagang You don't have to. In our experiments, replacing GN with BN and then reimplementing the pooling layer as a fixed, predefined depthwise conv gave us about a 30% speed-up, with about a 1% accuracy drop on ImageNet.
If you use BN, you can fuse the Conv-BN pair to speed it up further.
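One way to build such a fixed depthwise conv without retraining is to freeze its weights to a uniform 1/9 kernel, which reproduces 3x3 average pooling exactly (when padded zeros are included in the average; note PoolFormer's own pooling uses `count_include_pad=False`, so values differ at the borders). This is a sketch of the idea, not chuong98's actual implementation:

```python
import torch
import torch.nn as nn

dim = 64  # illustrative channel count

# Fixed depthwise conv meant to reproduce 3x3 average pooling,
# so pretrained PoolFormer weights still apply
fixed_dw = nn.Conv2d(dim, dim, kernel_size=3, stride=1, padding=1,
                     groups=dim, bias=False)
with torch.no_grad():
    fixed_dw.weight.fill_(1.0 / 9.0)  # uniform kernel == averaging
fixed_dw.weight.requires_grad_(False)  # keep it frozen

x = torch.randn(2, dim, 14, 14)
out_conv = fixed_dw(x)
# count_include_pad=True makes the border math match the zero-padded conv
out_pool = nn.AvgPool2d(3, stride=1, padding=1, count_include_pad=True)(x)
assert torch.allclose(out_conv, out_pool, atol=1e-6)
```

Unlike `nn.AvgPool2d`, this conv can then use cuDNN's depthwise-conv kernels, and if followed by BN it can be fused with it at inference time.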

@lucasjinreal
Author

@chuong98 Can you share your pretrained fixed depthwise conv? How do you set its weights?

@yuweihao
Collaborator

yuweihao commented Dec 21, 2021

Hi @jinfagang, @chuong98, I just found that CUDA much prefers the NHWC layout over NCHW [1]. However, PyTorch uses NCHW by default, and PoolFormer also uses this layout. Switching layouts may be another way to speed it up [2].

[Figure from [1]: NHWC vs. NCHW tensor layout performance on CUDA]

[1] https://docs.nvidia.com/deeplearning/performance/dl-performance-convolutional/index.html#tensor-layout
[2] https://pytorch.org/tutorials/intermediate/memory_format_tutorial.html
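Following the tutorial in [2], switching a model to NHWC only changes the underlying memory layout, not the reported tensor shapes; this sketch (with an arbitrary toy model, not PoolFormer itself) shows the two conversion calls involved:

```python
import torch
import torch.nn as nn

# Toy stand-in model; the same calls would apply to PoolFormer
model = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU()).eval()
x = torch.randn(1, 3, 224, 224)

# Convert weights and input to channels_last (NHWC in memory);
# tensor shapes remain (N, C, H, W)
model = model.to(memory_format=torch.channels_last)
x = x.to(memory_format=torch.channels_last)

with torch.no_grad():
    y = model(x)

assert y.shape == (1, 64, 224, 224)       # logical shape unchanged
# Conv2d propagates the channels_last layout to its output
assert y.is_contiguous(memory_format=torch.channels_last)
```

The speed-up from this is hardware- and kernel-dependent; per [1], it mainly helps convolutions running on Tensor Cores.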

@yuweihao yuweihao closed this as completed Mar 1, 2022