Some questions about self_attn #12

Closed
Asthestarsfalll opened this issue Feb 26, 2022 · 3 comments
Comments

@Asthestarsfalll

Hi.

  1. Why is there no permute operation before view in mode h?
# for mode h
projected_query = self.query_conv(x).permute(0, 1, 3, 2).contiguous().view(*view).permute(0, 2, 1)
  2. Why use sigmoid instead of softmax?
@AngeLouCN
Owner

Hi, I do not think permute is needed. For mode = 'h', the shape of projected_query is (batch_size, height, channel*width) and the shape of projected_key is (batch_size, channel*width, height). The attention_map is projected_query * projected_key, with shape (batch_size, height, height). The shape of projected_value is (batch_size, channel*width, height). The output is projected_value * attention_map, and we reshape the output from (batch_size, channel*width, height) to (batch_size, channel, width, height). The same holds for mode = 'w'. You may also find other repositories that define self_attn in a similar way.
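The shapes listed above can be checked with a short PyTorch sketch. The sizes are hypothetical, and random tensors stand in for the outputs of the query, key and value convolutions (only query_conv is named in this thread). It checks shapes only and says nothing about which elements end up grouped together.

```python
import torch

# Hypothetical sizes, chosen only for illustration.
batch_size, channel, height, width = 2, 3, 4, 5

# Random tensors stand in for the convolution outputs (an assumption).
q = torch.randn(batch_size, channel, height, width)
k = torch.randn(batch_size, channel, height, width)
v = torch.randn(batch_size, channel, height, width)

view = (batch_size, -1, height)  # mode = 'h'

projected_query = q.view(*view).permute(0, 2, 1)            # (B, H, C*W)
projected_key = k.view(*view)                               # (B, C*W, H)
attention_map = torch.bmm(projected_query, projected_key)   # (B, H, H)
projected_value = v.view(*view)                             # (B, C*W, H)
out = torch.bmm(projected_value, attention_map)             # (B, C*W, H)

# Reshape as described in the comment above.
out_4d = out.view(batch_size, channel, width, height)

for name, t in [("projected_query", projected_query),
                ("projected_key", projected_key),
                ("attention_map", attention_map),
                ("projected_value", projected_value),
                ("out", out)]:
    print(name, tuple(t.shape))
```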

As for sigmoid, I found that sigmoid gives better results in our network.
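For readers comparing the two options, here is a minimal sketch of how sigmoid and softmax differ on the same score matrix. The scores are made up, and this does not explain why sigmoid works better in this network; it only shows that softmax normalizes each row to sum to 1 while sigmoid squashes every entry independently.

```python
import torch

# Made-up attention scores: 2 query positions, 3 key positions.
scores = torch.tensor([[2.0, 1.0, 0.0],
                       [0.0, 0.0, 0.0]])

soft = torch.softmax(scores, dim=-1)  # rows compete: each row sums to 1
sig = torch.sigmoid(scores)           # elementwise: rows need not sum to 1

print(soft.sum(dim=-1))  # row sums under softmax
print(sig.sum(dim=-1))   # row sums under sigmoid
```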

@Asthestarsfalll
Author

Asthestarsfalll commented Feb 26, 2022

I think permute is necessary. Although the shapes are valid for the matrix multiplications, the result means something very different for mode h than for mode w. Without permute, projected_query does not actually gather the columns into the dimension of size height.
For example:
[image]
For mode W, the reshape is correct:
[image]
Without permute for mode H:
[image]
With permute for mode H:
[image]
[0, 5, 10, 15] is a column of a.
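The same comparison can be reproduced in a few lines. This is a sketch under assumptions: a is taken to be a single-channel tensor of height 4 and width 5 filled with 0..19, which is consistent with [0, 5, 10, 15] being a column, and the tensor is used directly in place of the output of query_conv.

```python
import torch

# Assumed example tensor: batch 1, channel 1, height 4, width 5.
a = torch.arange(20).view(1, 1, 4, 5)
B, C, H, W = a.shape

# Mode w: each query row gathers one column of a.
q_w = a.view(B, -1, W).permute(0, 2, 1)            # (B, W, C*H)

# Mode h without permute: rows mix elements from different rows of a.
q_h_plain = a.view(B, -1, H).permute(0, 2, 1)      # (B, H, C*W)

# Mode h with permute: each query row gathers one row of a.
# contiguous() is needed because view() cannot merge the permuted dims.
q_h_perm = (a.permute(0, 1, 3, 2).contiguous()
             .view(B, -1, H).permute(0, 2, 1))     # (B, H, C*W)

print(q_w[0, 0].tolist())        # [0, 5, 10, 15]
print(q_h_plain[0, 0].tolist())  # [0, 4, 8, 12, 16]
print(q_h_perm[0, 0].tolist())   # [0, 1, 2, 3, 4]
```

q_h_plain and q_h_perm have the same shape, so the later bmm calls run in both cases; only the second one attends over whole rows of a.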

@Asthestarsfalll
Author

see here
