Question about Local Self-Attention of your code #6

Huzhen757 · 2021-10-25T08:24:52Z

Hi, I'm very interested in your work on Local Self-Attention and feature fusion in Transformers. I have one question. In the source code, the input image size for the image classification task is fixed at 224 or 384; in other words, the size is an integer multiple of 32. If the input size is not fixed, for example 800x1333 in a detection task, the feature map can still be divided into window-sized windows by padding, but how should the key_padding_mask be handled?

The shape of the attention weight map is [bs x H/7 x W/7, 49, 49] (the default window size is 7), but the key padding mask has shape [1, HW]. How can I convert this mask to match the attention weight map?

I sincerely hope you can give me some advice on this question. Thanks!

LayneH · 2021-10-25T08:59:51Z

Hi,

Thank you for your interest.
Actually, our Local Self-Attention does not require the input size to be divisible by 32 or 7.
If the input size is not divisible by 7, we simply pad the input feature map (as in this line) to ensure each local window contains 7x7 pixels.
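
The padding arithmetic described above can be sketched as follows. This is only an illustration, not the repository's code: the helper name `window_padding`, the choice to pad on the bottom and right, and the 25x42 feature-map size are all assumptions made for the example.

```python
def window_padding(height, width, window_size=7):
    """Return (pad_bottom, pad_right) so both sides become multiples of window_size.

    Hypothetical helper; padding on the bottom/right is an assumption.
    """
    pad_bottom = (window_size - height % window_size) % window_size
    pad_right = (window_size - width % window_size) % window_size
    return pad_bottom, pad_right


# Assumed example: a 25x42 feature map with the default window size of 7.
feat_h, feat_w = 25, 42
pad_b, pad_r = window_padding(feat_h, feat_w)
padded_h, padded_w = feat_h + pad_b, feat_w + pad_r

# Each window holds 7x7 positions, so this is the number of local windows.
num_windows = (padded_h // 7) * (padded_w // 7)
print(pad_b, pad_r, padded_h, padded_w, num_windows)
```

The outer `% window_size` keeps the padding at zero when a side is already divisible by the window size.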

Huzhen757 · 2021-10-25T09:13:52Z

Thanks for your reply.
I would like to know how the key padding mask is matched to the attention weight map. Is it done like this?

    if key_padding_mask is not None:
        attn_output_weights = attn_output_weights.view(
            bsz, num_heads, tgt_len, src_len
        )
        attn_output_weights = attn_output_weights.masked_fill(
            key_padding_mask.unsqueeze(1).unsqueeze(2),
            float("-inf"),
        )
        attn_output_weights = attn_output_weights.view(
            bsz * num_heads, tgt_len, src_len
        )

Lines 325 to 335

LayneH · 2021-10-25T12:52:44Z

In Local Self-Attention, the key_padding_mask is always None, which means that we do not mask out the padded zeros in the self-attention operation.
If you want to use this argument for your own reasons, I suggest you first divide the mask according to the window size and then pass it as the key_padding_mask argument.
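
One possible way to "divide the mask according to the window size" is sketched below. It is not taken from the repository: the function name `window_key_padding_mask`, the `[B, H, W]` boolean layout (True means "ignore this position"), the bottom/right padding, and the row-major window order are assumptions, and the window order would need to match however the model partitions its feature map.

```python
import torch


def window_key_padding_mask(mask, window_size=7):
    """Split a [B, H, W] bool mask (True = ignore) into per-window masks.

    Returns a bool tensor of shape [B * nH * nW, window_size * window_size].
    Hypothetical helper; layout and window order are assumptions.
    """
    B, H, W = mask.shape
    pad_b = (window_size - H % window_size) % window_size
    pad_r = (window_size - W % window_size) % window_size
    Hp, Wp = H + pad_b, W + pad_r

    # Positions added by window padding are marked True (ignored).
    padded = mask.new_ones((B, Hp, Wp))
    padded[:, :H, :W] = mask

    # [B, nH, ws, nW, ws] -> [B, nH, nW, ws, ws] -> [B * nH * nW, ws * ws]
    padded = padded.view(
        B, Hp // window_size, window_size, Wp // window_size, window_size
    )
    return padded.permute(0, 1, 3, 2, 4).reshape(-1, window_size * window_size)


# Assumed example: an 8x14 map whose last four columns are image padding.
mask = torch.zeros(1, 8, 14, dtype=torch.bool)
mask[:, :, 10:] = True
win_mask = window_key_padding_mask(mask)
print(win_mask.shape)
```

The result has one row per window, so it lines up with an attention weight tensor whose batch dimension is the number of windows. Note that a window consisting entirely of masked positions would make every key `-inf` and the softmax would produce NaN, so such windows may need separate handling.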

Huzhen757 · 2021-10-26T04:51:46Z

OK, fine, I will try it myself.
Thanks for your careful reply.

PkuRainBow closed this as completed Oct 26, 2021
