
how to understand “dropping all structural information” #27

Closed
guoyejun opened this issue Jun 12, 2019 · 3 comments


guoyejun commented Jun 12, 2019

Hi,

I'm studying GAT, and it's awesome work!

I'm wondering how to understand "without ... depending on knowing the graph structure upfront", and "dropping all structural information".

I also see "we only compute e_ij for nodes j in N_i, where N_i is some neighborhood of node i in the graph" in the paper, and I also see the function adj_to_bias() in the code, which requires knowing the adjacency matrix (the edges).

So my understanding is that we do need to know the graph structure (edge information) from the beginning. Thanks.
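For context, the core idea behind an adj_to_bias()-style helper can be sketched in a few lines of plain NumPy (this is an illustration of the masking idea, not the repository's exact function, which also handles batching and multi-hop neighbourhoods):

```python
import numpy as np

def adjacency_to_bias(adj, big_negative=-1e9):
    """Turn a binary adjacency matrix (self-loops included) into an
    additive attention bias: 0 where an edge exists, a very large
    negative number elsewhere, so that a subsequent softmax assigns
    non-edges (numerically) zero weight."""
    return big_negative * (1.0 - adj)

# Toy 3-node graph: nodes 0 and 1 are connected; node 2 is isolated.
adj = np.array([[1., 1., 0.],
                [1., 1., 0.],
                [0., 0., 1.]])
bias = adjacency_to_bias(adj)
# bias is 0 on existing edges and -1e9 on missing ones.
```

Adding this bias to the attention logits before the softmax is what makes the edge information (the graph structure) available to the model.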


PetarV- commented Jun 12, 2019

Hi Yejun,

Thank you for the issue, and your kind interest in GAT!

The phrase "without depending on the graph structure upfront" refers to the training/testing routine. Namely, GAT is an inductive method: the mechanism it learns is, in principle, not conditioned on the graph it has been trained on. This means that, at test time, you can apply GAT to any structure you'd like (including ones unseen at training time). This is in stark contrast to many previously published methods, which were transductive and, in theory, wouldn't work outside of the graph they were trained on.

Regarding the second phrase, I believe you've misread the paper a little bit. From what I recall, the phrase appears here:

"In its most general formulation, the model allows every node to attend on every other node, dropping all structural information"

This is the formulation before masked attention is introduced (i.e. we just do all-pairs self-attention as in the Transformer paper). Indeed, in this version the graph is not used at all. Afterwards we introduce the neighbourhoods, and the graph structure is injected.

So, to confirm, the GAT model does not drop all structural information. It uses the local adjacency information of every node to determine which other nodes to attend over. That being said, it only needs the local information (i.e. a node does not need to know anything about a node that is outside of its neighbourhood).
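As a rough illustration of the two formulations described above (plain NumPy, not the repository's TensorFlow code; the toy graph and logits are made up):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n = 4
scores = rng.normal(size=(n, n))  # raw attention logits e_ij

# All-pairs formulation: every node attends over every other node,
# so the graph structure is ignored entirely.
alpha_dense = softmax(scores)

# Masked attention: add a large negative bias to logits of non-neighbours,
# so they receive (numerically) zero weight after the softmax.
adj = np.array([[1, 1, 0, 0],
                [1, 1, 1, 0],
                [0, 1, 1, 1],
                [0, 0, 1, 1]], dtype=float)  # self-loops included
alpha_masked = softmax(scores + (-1e9) * (1.0 - adj))
```

In the masked version, each node's attention weights still sum to one, but all of the mass falls on its neighbourhood; only the local adjacency information is needed.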

Hope that helps! Let me know if more clarification is needed.

Thanks,
Petar

guoyejun commented

Thanks Petar!

By the way, how does GAT handle directed edges (edges with an arrow)? My understanding is that it depends on the definition of 'neighborhood': alpha in one direction is learned/computed, while alpha in the other direction is simply zero.


PetarV- commented Jun 19, 2019

Hi Yejun,

The general answer is "it's up to you". :)

In the simplest case, as you suggested, the attention is simply not computed over one direction. Other authors like to include a notion of two "edge types" (inbound/outbound) and learn a separate set of attention heads for each edge type. I'd say it really depends on the problem you're trying to solve (and how semantically expressive the edges actually are), but ultimately the framework is quite flexible with respect to how you choose to approach this.
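Both options can be sketched in plain NumPy (the directed toy graph and logits here are invented for illustration, and the edge-type variant only shows the masking, not the per-type learned heads):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Directed toy graph: edge 0 -> 1 exists, 1 -> 0 does not (self-loops kept).
adj = np.array([[1., 1.],
                [0., 1.]])
scores = np.array([[0.2, 0.5],
                   [0.7, 0.1]])

# Option 1: attention is simply not computed against the edge direction.
alpha = softmax(scores + (-1e9) * (1.0 - adj))
# Node 1 now places (numerically) all of its weight on itself.

# Option 2: two "edge types" (outbound / inbound), one attention pass each.
alpha_out = softmax(scores + (-1e9) * (1.0 - adj))     # follow the arrows
alpha_in  = softmax(scores + (-1e9) * (1.0 - adj.T))   # reverse the arrows
# A separate set of attention heads could then be learned for each type
# and their outputs combined (e.g. concatenated or averaged).
```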

Thanks,
Petar

@PetarV- PetarV- closed this as completed Jul 9, 2019