
Question about GSA #9

Closed
kejie-cn opened this issue May 24, 2021 · 2 comments

@kejie-cn

Hello, thank you very much for your excellent work. I have a question about GSA. As I understand it, GSA in the paper takes one representative from each window, so the sr_ratio should equal the window size ([7, 7, 7, 7]) when computing Key and Value, yet in the code it is [8, 4, 2, 1]. Is there anything wrong with my understanding?

from functools import partial
import torch.nn as nn

# `BACKBONES` and `ALTGVT` are defined in the surrounding Twins codebase.
@BACKBONES.register_module()
class alt_gvt_large(ALTGVT):
    def __init__(self, **kwargs):
        super(alt_gvt_large, self).__init__(
            patch_size=4, embed_dims=[128, 256, 512, 1024], num_heads=[4, 8, 16, 32],
            mlp_ratios=[4, 4, 4, 4], qkv_bias=True,
            norm_layer=partial(nn.LayerNorm, eps=1e-6), depths=[2, 2, 18, 2],
            wss=[7, 7, 7, 7],        # LSA window sizes
            sr_ratios=[8, 4, 2, 1],  # GSA key/value sub-sampling ratios
            extra_norm=True, drop_path_rate=0.3,
        )
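For concreteness, here is the key count each stage gets at a 224 x 224 input (feature maps of 56, 28, 14, and 7, assuming the usual 2x downsampling per stage), comparing the released sr_ratios with the sr = 7 everywhere that I expected; a quick sanity-check script, not part of the repo:

feature_sizes = [56, 28, 14, 7]   # per-stage H (= W) for a 224 x 224 input
for hw, sr in zip(feature_sizes, [8, 4, 2, 1]):
    print(f"{hw}x{hw}, sr={sr}: {(hw // sr) ** 2} keys")   # 49, 49, 49, 49
for hw in feature_sizes:
    print(f"{hw}x{hw}, sr=7: {(hw // 7) ** 2} keys")       # 64, 16, 4, 1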
@cxxgtxy
Collaborator

cxxgtxy commented May 24, 2021

Thanks for your attention.
You are right, and we will make this clearer in the next version of the paper.
If we used 7 x 7 in the last stage, there would be only 1 key, because the feature map there is itself 7 x 7; in that case GSA reduces to ordinary global self-attention (see the code).
As for stage 3 (feature size 14 x 14), using sr = 7 would leave only 4 keys, which would limit the representational power of the network.
In general, GSA is a mechanism for collecting global information efficiently, and you can try different sr_ratios in your own projects.
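To make the mechanism concrete, here is a minimal sketch of PVT-style spatial-reduction attention, the sub-sampling that GSA builds on; the class name and layer choices below are illustrative, not the exact Twins implementation:

import torch
import torch.nn as nn

class SpatialReductionAttention(nn.Module):
    """Illustrative PVT-style attention: keys/values are computed on a
    sub-sampled feature map, so their count is (H // sr) * (W // sr)."""
    def __init__(self, dim, num_heads, sr_ratio):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, dim * 2)
        self.proj = nn.Linear(dim, dim)
        self.sr_ratio = sr_ratio
        if sr_ratio > 1:
            # Strided conv shrinks the map before K/V are computed.
            self.sr = nn.Conv2d(dim, dim, kernel_size=sr_ratio, stride=sr_ratio)
            self.norm = nn.LayerNorm(dim)

    def forward(self, x, H, W):
        B, N, C = x.shape                       # N == H * W
        q = self.q(x).reshape(B, N, self.num_heads, self.head_dim).transpose(1, 2)
        if self.sr_ratio > 1:
            x_ = x.transpose(1, 2).reshape(B, C, H, W)
            x_ = self.sr(x_).reshape(B, C, -1).transpose(1, 2)
            x_ = self.norm(x_)                  # (B, (H//sr)*(W//sr), C)
        else:
            x_ = x                              # sr=1: plain global attention
        kv = self.kv(x_).reshape(B, -1, 2, self.num_heads, self.head_dim)
        k, v = kv.permute(2, 0, 3, 1, 4)        # each (B, heads, N_kv, head_dim)
        attn = ((q @ k.transpose(-2, -1)) * self.scale).softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

# Stage 3 of alt_gvt_large: 14 x 14 map, sr=2 keeps 7 * 7 = 49 keys.
attn = SpatialReductionAttention(dim=512, num_heads=16, sr_ratio=2)
out = attn(torch.randn(1, 14 * 14, 512), 14, 14)   # -> (1, 196, 512)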

@kejie-cn
Author


I see. Thanks for your reply.

littleSunlxy pushed a commit to littleSunlxy/Twins that referenced this issue Nov 4, 2021
* add test tutorial
* remove torch/torchvision from requirements
* update getting started
* rename drop_out_ratio -> dropout_ratio