The visualization of decoder attention_weight #30

Open
sally1913105 opened this issue Apr 27, 2021 · 3 comments

@sally1913105

[attached screenshot: per-frame attention weight maps]
I want to visualize the attention weights of the decoder module. I take the output of multihead_attn in the last layer of the decoder, but its shape is (bs, 360, 36*h*w), where h*w is the shape of the feature map. I don't understand why there are 36 different attention weights for the same instances of the same frame, as the attached picture shows.
Can you explain what this means?
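
For reference, here is a minimal runnable sketch of where that shape comes from: `nn.MultiheadAttention` returns head-averaged weights of shape (bs, tgt_len, src_len), so 360 decoder queries attending over 36 frames of h*w positions gives (bs, 360, 36*h*w). The hidden size, head count, and 16x16 feature map below are illustrative assumptions, not values confirmed in this thread.

```python
import torch
import torch.nn as nn

# Toy illustration of the captured weight shape (all sizes are assumptions).
bs, d_model, nhead = 1, 384, 8
num_frames, num_queries = 36, 360
h, w = 16, 16

mha = nn.MultiheadAttention(d_model, nhead, batch_first=True)
queries = torch.randn(bs, num_queries, d_model)          # decoder queries
memory = torch.randn(bs, num_frames * h * w, d_model)    # encoder memory over all frames

# nn.MultiheadAttention returns (attn_output, attn_output_weights);
# the weights are averaged over heads: (bs, tgt_len, src_len).
out, attn = mha(queries, memory, memory, need_weights=True)
print(attn.shape)  # torch.Size([1, 360, 9216]) == (bs, 360, 36*h*w)
```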

@Epiphqny
Owner

Hi @sally1913105, we compute the spatial and temporal attention, so for a 36-frame sequence there are 36 attention weights for each prediction, even though the prediction is for a specific frame. In this way, features from other frames can help the segmentation of that frame.
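
A small sketch of how those 36 per-prediction spatial maps can be recovered from the captured weights (the h, w values below are assumptions for illustration; use the feature map size of your own input):

```python
import torch

# Split the last dimension of the captured weights into 36 per-frame maps.
bs, num_queries, num_frames, h, w = 1, 360, 36, 16, 16
attn = torch.rand(bs, num_queries, num_frames * h * w)   # captured weights

attn_maps = attn.view(bs, num_queries, num_frames, h, w)
# attn_maps[b, q] now holds 36 spatial attention maps for query q:
# one (h, w) map per frame in the clip, i.e. spatial + temporal attention.
print(attn_maps.shape)  # torch.Size([1, 360, 36, 16, 16])
```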

@sally1913105
Author

Thank you for your answer! Can I think of it this way: within the 36 attention weights of the i-th prediction, only the i-th attention weight corresponds to the i-th frame's features, while the others correspond to other frames' features? And how should these 36 attention weights be combined?

@Epiphqny
Owner

Hi @sally1913105, for each prediction we only use the attention weights of the corresponding frame at this stage. The weights do not need to be combined; interaction with other frames is realized by the following 3D convolutions.
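
A sketch of that selection step for visualization, assuming the 360 queries are laid out frame-major as 36 frames x 10 instance queries (this ordering is an assumption; please check the repo's inference code) and that the selected map is then upsampled for overlaying on the frame:

```python
import torch
import torch.nn.functional as F

# For the i-th prediction, keep only the attention map of its own frame.
num_frames, num_ins, h, w = 36, 10, 16, 16
attn_maps = torch.rand(1, num_frames * num_ins, num_frames, h, w)  # from the previous sketch

query_idx = 5                       # some prediction index in [0, 360)
frame_idx = query_idx // num_ins    # frame this prediction belongs to (assumed ordering)

own_frame_map = attn_maps[0, query_idx, frame_idx]       # (h, w)
# Upsample to the original frame resolution (assumed 360x640 here) for overlay.
vis = F.interpolate(own_frame_map[None, None], size=(360, 640),
                    mode="bilinear", align_corners=False)[0, 0]
print(vis.shape)  # torch.Size([360, 640])
```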
