Questions about Cellular Attention Mechanism/Network #109

Open
AbrahamRabinowitz opened this issue Jun 6, 2023 · 1 comment


@AbrahamRabinowitz (Contributor)

I have a couple of clarifying questions about cellular attention networks and the code for the attention mechanism in the conv and message passing files.

  • I may be mistaken, but it seems that no normalization is applied to the attention coefficients in the current implementation (the reference paper uses a softmax for this purpose). Should we leave the current attention mechanism as is, or is it worth rewriting the code to implement the normalization? (A sketch of the normalized version follows this list.)
  • Regarding the tensor diagram for CANs: for the neighborhood aggregation, the tensor diagram applies a non-linearity to each within-neighborhood aggregation and then performs the inter-neighborhood aggregation, whereas the referenced CAN paper performs the inter-neighborhood aggregation first and then applies the non-linearity (the two orderings are written out after this list). Would it be OK to go with the formula given by the paper? That would also let our implementation reduce to the Hodge Laplacian layer of the referenced Roddenberry et al. paper when the option to use attention is set to false.
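For concreteness, here is a minimal, self-contained sketch of what I mean by softmax normalization per neighborhood, in the spirit of GAT-style attention. This is not the actual `conv.py` API; `neighborhood_softmax` and its arguments are names I made up for illustration:

```python
import torch


def neighborhood_softmax(att_scores, target_index, num_cells):
    """Normalize raw attention scores so they sum to 1 over each
    target cell's neighborhood (a softmax per neighborhood).

    att_scores:   (num_entries,) raw scores, one per nonzero of the
                  neighborhood matrix.
    target_index: (num_entries,) index of the target cell for each score.
    """
    # Subtract the per-neighborhood max for numerical stability.
    max_per_cell = torch.full((num_cells,), float("-inf"))
    max_per_cell = max_per_cell.scatter_reduce(
        0, target_index, att_scores, reduce="amax"
    )
    scores = torch.exp(att_scores - max_per_cell[target_index])

    # Sum the exponentiated scores within each neighborhood, then divide.
    denom = torch.zeros(num_cells).scatter_add(0, target_index, scores)
    return scores / denom[target_index]


# Toy example: cell 0 has two neighbors, cell 1 has one.
raw = torch.tensor([1.0, 2.0, 0.5])
tgt = torch.tensor([0, 0, 1])
print(neighborhood_softmax(raw, tgt, num_cells=2))
# tensor([0.2689, 0.7311, 1.0000]) -- sums to 1 within each neighborhood
```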
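And to make the ordering question concrete, here are the two update rules as I read them, where σ is the non-linearity, ⨁ is the inter-neighborhood aggregation, AGG is the within-neighborhood aggregation, and m⁽ᵏ⁾ are the messages (the exact symbols are my own shorthand):

```latex
% Tensor diagram as currently drawn: non-linearity inside the sum
h_x^{(\ell+1)} = \bigoplus_{k} \sigma\!\Big( \mathrm{AGG}_{y \in \mathcal{N}_k(x)}\, m^{(k)}_{y \to x} \Big)

% Referenced CAN paper: non-linearity after the inter-neighborhood sum
h_x^{(\ell+1)} = \sigma\!\Big( \bigoplus_{k}\, \mathrm{AGG}_{y \in \mathcal{N}_k(x)}\, m^{(k)}_{y \to x} \Big)
```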
@mhajij (Member) commented Jun 24, 2023


hi @AbrahamRabinowitz

(1) I would suggest you normalize as done in the original paper.
(2) Same here: please follow what the original paper suggests. We will update the tensor diagram accordingly later. (A rough sketch of that ordering is below.)
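To illustrate (2), a rough PyTorch sketch with hypothetical names (`CANLayerSketch` is not the repo's class, and `relu` stands in for whatever non-linearity the layer uses): the non-linearity is applied once, after the inter-neighborhood sum, so with attention turned off the layer reduces to a plain Hodge-Laplacian-style convolution over the neighborhood matrices:

```python
import torch
import torch.nn as nn


class CANLayerSketch(nn.Module):
    """Hypothetical sketch: inter-neighborhood sum first, non-linearity last.

    With att=False this reduces to sigma(sum_k N_k @ x @ W_k), i.e. the
    Hodge-Laplacian-style layer of Roddenberry et al.
    """

    def __init__(self, in_channels, out_channels, num_neighborhoods, att=True):
        super().__init__()
        self.att = att
        self.weights = nn.ParameterList(
            nn.Parameter(torch.randn(in_channels, out_channels) * 0.01)
            for _ in range(num_neighborhoods)
        )

    def forward(self, x, neighborhoods, att_coeffs=None):
        # x: (num_cells, in_channels); neighborhoods: list of
        # (num_cells, num_cells) matrices N_k; att_coeffs: optional list of
        # softmax-normalized coefficient matrices with the same sparsity.
        out = 0
        for k, n_k in enumerate(neighborhoods):
            a_k = att_coeffs[k] if (self.att and att_coeffs is not None) else n_k
            out = out + a_k @ x @ self.weights[k]
        # Non-linearity applied once, after the inter-neighborhood sum,
        # as in the CAN paper.
        return torch.relu(out)


# Toy usage: 4 cells, two neighborhoods (e.g. lower and upper parts).
x = torch.randn(4, 3)
n_down, n_up = torch.rand(4, 4), torch.rand(4, 4)
layer = CANLayerSketch(3, 5, num_neighborhoods=2, att=False)
print(layer(x, [n_down, n_up]).shape)  # torch.Size([4, 5])
```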
