GAT model #37

Merged: 2 commits merged into master from gcn-like on Aug 10, 2018

Conversation

jermainewang (Member)

Graph attention networks. Also fixed some bugs.

jermainewang (Member Author)

Benchmark results: https://docs.google.com/spreadsheets/d/1qhKvNfPqYLrnSbBH4CP6KLVpvm5TOsPPtrFLTI7H_NU/edit?usp=sharing

Summary:
We compare the DGL implementation with the official implementation by the author: https://github.com/PetarV-/GAT
The author has two implementations:

  • Dense: It treats the adjacency matrix as a dense matrix and uses dense matmul and a dense softmax.
  • Sparse: It uses sparse operations such as SPMV and sparse_softmax.

Note that the author's implementation uses TensorFlow, while DGL uses the PyTorch backend.
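
For context, both of the author's versions start from the same per-pair attention logits, LeakyReLU(a^T [W h_i || W h_j]); they differ only in how the softmax and the neighbor aggregation are carried out. A minimal PyTorch sketch of the logit computation (the sizes and parameter names are illustrative, not taken from either codebase):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

N, in_dim, out_dim = 4, 5, 8            # illustrative sizes
h = torch.randn(N, in_dim)              # input node features
W = nn.Linear(in_dim, out_dim, bias=False)
# Split the attention vector a into the halves applied to z_i and z_j,
# since a^T [z_i || z_j] = a_l^T z_i + a_r^T z_j.
a_l = nn.Parameter(torch.randn(out_dim))
a_r = nn.Parameter(torch.randn(out_dim))

z = W(h)                                # (N, out_dim) projected features
# Broadcast-add row and column scores to get all pairwise logits at once.
e = F.leaky_relu((z @ a_l).unsqueeze(1) + (z @ a_r).unsqueeze(0),
                 negative_slope=0.2)    # (N, N)
```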

[image: benchmark comparison of the DGL and TF GAT implementations]

The naive implementation (looping over every node and doing message passing per node) is extremely slow: 50-100x slower than the batched implementation. DGL-GAT-batch-GPU is not much faster than the CPU version, which points to significant overhead on the Python side. TF-GAT is much faster than the current DGL-GAT for two reasons:

  • Degree bucketing is avoided in both the dense and the sparse implementation. For example, in the dense version they fill the empty (non-edge) slots with -Inf, so softmax can be applied directly (see the sketch right after this list).
  • DGL cannot use SPMV because of the custom reducer; TF-GAT-sparse can. They build a sparse matrix with the same pattern as the adjacency matrix but with the non-zero values set to the attention values, then multiply this sparse matrix with the node features to get the attended features from the neighbors (see the second sketch below).
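
A minimal PyTorch sketch of the dense -Inf trick from the first point (the adjacency matrix and logits below are random placeholders; the original is TensorFlow code):

```python
import torch
import torch.nn.functional as F

N, out_dim = 4, 8                        # illustrative sizes
adj = (torch.rand(N, N) < 0.5).float()   # placeholder 0/1 adjacency
adj.fill_diagonal_(1.0)                  # self-loops keep every row non-empty
e = torch.randn(N, N)                    # raw attention logits
z = torch.randn(N, out_dim)              # projected node features

# Fill non-edge slots with -inf; exp(-inf) = 0, so a plain row-wise
# softmax puts zero weight on non-neighbors and no degree bucketing is needed.
alpha = F.softmax(e.masked_fill(adj == 0, float('-inf')), dim=1)
out = alpha @ z                          # dense matmul aggregates neighbors
```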

These two optimizations cannot easily be adopted by DGL right now, as they require us to understand what the softmax and the user-defined reduce function are doing. We plan to address this in future work.
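
For reference, the sparse trick from the second point could look roughly like this in PyTorch (an approximation of the idea rather than the TF code; `scatter_reduce` needs a recent PyTorch version):

```python
import torch

N, D = 4, 8                               # illustrative sizes
src = torch.tensor([0, 1, 2, 0])          # placeholder edge list: src -> dst
dst = torch.tensor([1, 2, 0, 2])
e = torch.randn(src.numel())              # one raw attention logit per edge
z = torch.randn(N, D)                     # projected node features

# Softmax over the incoming edges of each destination node
# (max-subtraction for numerical stability).
e_max = torch.full((N,), float('-inf')).scatter_reduce(0, dst, e, reduce='amax')
e_exp = (e - e_max[dst]).exp()
denom = torch.zeros(N).scatter_add(0, dst, e_exp)
alpha = e_exp / denom[dst]                # normalized attention per edge

# Sparse matrix with the adjacency's pattern but attention values as entries;
# a single SPMV then gathers the attended neighbor features for every node.
att = torch.sparse_coo_tensor(torch.stack([dst, src]), alpha, (N, N))
out = torch.sparse.mm(att, z)             # (N, D) attended features
```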

@jermainewang jermainewang merged commit ee24169 into master Aug 10, 2018
@jermainewang jermainewang deleted the gcn-like branch August 10, 2018 02:48