[Roadmap] v0.4 release draft #666

jermainewang opened this issue Jun 17, 2019 · 8 comments

@jermainewang (Member) commented Jun 17, 2019

Hey everyone, with v0.3 released last week, it's time to move forward again! Here is a draft plan for the v0.4 release.

[Feature] Heterogeneous graph

This has been a highly requested feature since the birth of DGL, and it is finally time to push for it. v0.4 will largely revolve around this support, which includes but is not limited to the items below (a hypothetical construction sketch follows the list):

  • A new DGLHeteroGraph class and relevant APIs.
  • Adapting our message passing kernels to heterograph scenarios.
  • Sampling on heterograph.
  • Models and tutorials.
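
To make the proposal concrete, here is a hypothetical sketch of what heterograph construction could look like; the `dgl.heterograph` name, the relation-dict format, and the per-type feature access are illustrative assumptions, not a settled design:

```python
import torch
import dgl

# Two node types (user, item) connected by two relation types; each
# relation is keyed by a (source type, edge type, destination type)
# triple and mapped to its (src_ids, dst_ids) edge list.
g = dgl.heterograph({
    ('user', 'follows', 'user'): ([0, 1], [1, 2]),
    ('user', 'clicks', 'item'): ([0, 1, 2], [0, 0, 1]),
})

# Features would be stored per node type rather than globally.
g.nodes['user'].data['h'] = torch.randn(3, 16)
g.nodes['item'].data['h'] = torch.randn(2, 16)
```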

The plan still needs refinement. Please reply with your thoughts.

[Feature] Global pooling module

Our current graph pooling (readout) support is limited, with only basic sum/max readout operations. In v0.4, we want to enrich this part. Here is a tentative list of pooling operations to include (a minimal attention-readout sketch follows the list):

  • Set2Set
  • Attention pooling
  • Diffpool
  • Mean readout
  • More efficient sum/max/mean readout using dedicated kernels
  • UNet
  • Unpooling (sending readout results back to node/edge features)
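
As a concrete reference point for the attention-pooling item, here is a minimal single-graph sketch in PyTorch; the class and its gate network are illustrative, not a committed module design:

```python
import torch
import torch.nn as nn

class AttentionReadout(nn.Module):
    """Minimal sketch of gated attention readout for a single graph."""
    def __init__(self, in_feats):
        super().__init__()
        self.gate = nn.Linear(in_feats, 1)  # one attention score per node

    def forward(self, h):
        # h: (num_nodes, in_feats) node features
        alpha = torch.softmax(self.gate(h), dim=0)  # normalize over nodes
        return (alpha * h).sum(dim=0)               # weighted-sum readout
```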

[Feature] Unified graph data format and loader

This is a leftover item from the v0.3 roadmap. The idea is to define our own data storage format and provide easy utilities to convert, load, and save to/from it (a rough interface sketch follows the list).

  • Define the data format
  • Efficient loader and saver
  • Convert all currently hosted datasets into this format.
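
Since the format itself is still to be defined, the sketch below only illustrates the intended save/load interface; the function names and the pickle encoding are placeholders, not the proposed binary format:

```python
import pickle

def save_graphs(path, graphs):
    # Placeholder encoding: each graph as its edge list plus node features.
    payload = [{'edges': g.edges(), 'ndata': dict(g.ndata)} for g in graphs]
    with open(path, 'wb') as f:
        pickle.dump(payload, f)

def load_graphs(path):
    with open(path, 'rb') as f:
        return pickle.load(f)
```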

[Feature] Distributed KVStore for embeddings

Depending on bandwidth, we wish to implement our own distributed KVStore that can store embeddings across multiple machines. If time is too tight, we could postpone this to the next cycle.
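
For a sense of the intended interface, below is a single-machine stand-in for the push/pull API such a store typically exposes; the names are assumptions, and the cross-machine sharding and networking are elided:

```python
import numpy as np

class EmbeddingKVStore:
    """Single-machine stand-in for a distributed embedding store."""
    def __init__(self, num_embeddings, dim):
        self.table = np.zeros((num_embeddings, dim), dtype=np.float32)

    def pull(self, ids):
        # Fetch embedding rows for the given integer IDs.
        return self.table[ids]

    def push(self, ids, grads, lr=0.01):
        # Sparse SGD-style update; duplicate IDs accumulate correctly.
        np.subtract.at(self.table, ids, lr * grads)
```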

[Feature] Knowledge base modules

Depending on bandwidth, we wish to provide a module with common algorithms for training embeddings on knowledge graphs. If time is too tight, we could postpone this to the next cycle.
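
As an example of the kind of algorithm such a module would cover, here is a minimal TransE scoring function in PyTorch; treat it as a sketch of one candidate algorithm, not the proposed module API:

```python
import torch

def transe_score(head, rel, tail, p=1):
    # TransE models a triple (h, r, t) as h + r ≈ t in embedding space,
    # so a smaller translation distance means a more plausible triple.
    return -torch.norm(head + rel - tail, p=p, dim=-1)
```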

Models & Examples

To demonstrate our new heterograph APIs, we need models and examples. Here is a tentative list:

  • Metapath2vec (the walk primitive at its core is sketched after this list)
  • GCMC
  • Heterogeneous Attention Network
  • GMC
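
For Metapath2vec, the core primitive is a metapath-guided random walk over a heterograph. A minimal sketch, assuming a hypothetical `neighbors(g, node, etype)` helper that returns the neighbors reachable via a given edge type:

```python
import random

def metapath_walk(g, start, metapath, neighbors):
    # Follow edge types in metapath order, e.g. ['writes', 'written-by']
    # on an author-paper graph, picking a random neighbor at each hop.
    walk = [start]
    for etype in metapath:
        nbrs = neighbors(g, walk[-1], etype)  # hypothetical helper
        if not nbrs:
            break
        walk.append(random.choice(nbrs))
    return walk
```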

NN Module

These are the leftovers from v0.3.

  • GGNN
  • GAT

Please leave your feedback. Thank you!

@jermainewang jermainewang added this to the v0.4 milestone Jun 17, 2019

@jermainewang jermainewang pinned this issue Jun 17, 2019

@mufeili (Member) commented Jun 17, 2019

For the pooling module, shall we also support common clustering algorithms (KNN, spectral clustering, ...)?

@jermainewang (Member, Author) commented Jun 17, 2019

> For the pooling module, shall we also support common clustering algorithms (KNN, spectral clustering, ...)?

I think we will focus on DL-based pooling methods in this release. For KNN and spectral clustering, I would suggest converting our graph to numpy/scipy and using sklearn. If the conversion is handled carefully (probably with zero-copy support), it should be very efficient.
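
A rough sketch of that route, assuming the graph exposes a scipy adjacency conversion (the exact method name here is an assumption):

```python
from sklearn.cluster import SpectralClustering

# Assumed conversion helper; the real DGL method name may differ.
adj = g.adjacency_matrix_scipy(fmt='csr')

# Treat the symmetric adjacency matrix as a precomputed affinity.
labels = SpectralClustering(
    n_clusters=4, affinity='precomputed'
).fit_predict(adj)
```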

@yzh119 (Member) commented Jun 17, 2019

Should GraphSAGE also be included in the NN modules?
Set Transformer is also a kind of graph pooling mechanism; if we have time, we could try it.

@aksnzhy (Collaborator) commented Jun 18, 2019

The CPU-based kvstore can be released in 0.4. The GPU-direct kvstore could be in the next cycle.

@tbright17 commented Jun 20, 2019

Self-attention graph pooling is simpler but more powerful than DiffPool: https://arxiv.org/abs/1904.08082. It could be good to include it.

@HQ01 (Collaborator) commented Jun 20, 2019

> Self-attention graph pooling is simpler but more powerful than DiffPool: https://arxiv.org/abs/1904.08082. It could be good to include it.

Just want to mention that there is an inconsistency between DiffPool's reported experimental results and the self-attention graph pooling paper's reported DiffPool results, though.

@tbright17 commented Jun 20, 2019

> Self-attention graph pooling is simpler but more powerful than DiffPool: https://arxiv.org/abs/1904.08082. It could be good to include it.

> Just want to mention that there is an inconsistency between DiffPool's reported experimental results and the self-attention graph pooling paper's reported DiffPool results, though.

Wow the gap is really big...

@mufeili (Member) commented Jul 18, 2019

Depending on our bandwidth, we may want to add examples for three important applications:

  1. Molecule Property Prediction: molecular graphs are probably among the most important applications for small graphs. For this area, Neural Message Passing for Quantum Chemistry can be a good example candidate. During our discussion with the Tencent Alchemy team, this model achieved the best performance among previous work on the quantum chemistry tasks they are interested in. It has also been mentioned in the discussion forum before. I will take this.
  2. Point Cloud: an important topic for constructing graphs over non-graph data and bridging graph computing with CV and graphics, as mentioned in Issue #719.
  3. Geometry/3D data: the latest wave of deep learning on graphs is strongly tied to geometric data and can be collectively considered geometric deep learning. There may be high interest in applying graph neural networks to more general geometric data, as mentioned in a discussion thread before.