
[Roadmap] v0.4 release tracker #666

Closed · 17 of 22 tasks · Milestone: v0.4
jermainewang opened this issue Jun 17, 2019 · 12 comments

jermainewang (Member) commented Jun 17, 2019

Tentative release date: 09/30

[Feature] Heterogeneous graph

This has been a highly demanded feature since the birth of DGL, and it is finally time to push for it. v0.4 will largely be about this support, which includes but is not limited to the following (a brief API sketch follows the list):

  • A new DGLHeteroGraph class and relevant APIs.
  • Adapt our message passing kernels to heterograph scenarios.
  • Sampling on heterograph.
  • Model demonstration
    • RGCN
    • GCMC
    • One metapath-based model (e.g. HAN)
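For a concrete picture, here is a minimal sketch of what constructing and inspecting such a heterograph could look like. The `dgl.heterograph` constructor and the signatures shown are illustrative of the design direction, not a settled API:

```python
import dgl

# A toy heterograph with two node types (user, game) and two relation
# types, keyed by (source type, relation, destination type) triples.
g = dgl.heterograph({
    ('user', 'follows', 'user'): ([0, 1], [1, 2]),
    ('user', 'plays', 'game'):   ([0, 1, 2], [0, 0, 1]),
})

print(g.ntypes)                   # node types, e.g. ['game', 'user']
print(g.etypes)                   # edge types, e.g. ['follows', 'plays']
print(g.number_of_nodes('user'))  # 3
```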

Tracker

[Feature] Global pooling module (Done in v0.3.1)

Our current graph pooling (readout) support is limited to basic sum/max readout operations. In v0.4, we want to enrich this part.
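For reference, the basic readouts available today look roughly like this (a minimal sketch using the existing `dgl.sum_nodes`/`dgl.max_nodes` helpers; richer learnable pooling modules are what this item targets):

```python
import dgl
import torch

g = dgl.DGLGraph()
g.add_nodes(4)
g.ndata['h'] = torch.randn(4, 16)

# Graph-level representations from simple node-feature readouts.
hg_sum = dgl.sum_nodes(g, 'h')
hg_max = dgl.max_nodes(g, 'h')
```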

[Feature] Enrich NN modules (Mostly done in v0.3.1)

Tracker

@yzh119 please update.

[Feature] Unified graph data format

The idea is to define our own data storage format and provide easy utilities to convert, load, and save graphs to/from that format. (RFC #758)
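A sketch of how such utilities might be used, modeled on the save/load helpers discussed in the RFC; treat the exact module paths and names here as tentative:

```python
import dgl
from dgl.data.utils import save_graphs, load_graphs

g1 = dgl.DGLGraph([(0, 1), (1, 2)])
g2 = dgl.DGLGraph([(0, 1)])

# Serialize a list of graphs to a single binary file...
save_graphs('graphs.bin', [g1, g2])

# ...and restore them later.
graphs, _ = load_graphs('graphs.bin')
```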

Tracker:

[Application] Knowledge base embedding

Tracker:

  • Implementation of popular knowledge graph embedding methods (a scoring-function sketch follows this list)
    • TransX, TransE, ComplEx
    • DistMult
    • RotatE
  • Datasets
    • FB15k
    • Full Freebase
  • Release training scripts for
    • Single-machine multi-process
    • Single-machine multi-GPU
  • Release of pre-trained embeddings
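For context, the scoring functions behind several of these methods are short enough to sketch directly. A minimal illustration in plain PyTorch, not tied to any DGL API:

```python
import torch

def transe_score(h, r, t):
    # TransE: -||h + r - t||_1; higher means more plausible.
    return -torch.norm(h + r - t, p=1, dim=-1)

def distmult_score(h, r, t):
    # DistMult: trilinear dot product <h, r, t>.
    return (h * r * t).sum(dim=-1)

# Batch of 5 triples with 200-dimensional embeddings.
h, r, t = (torch.randn(5, 200) for _ in range(3))
print(transe_score(h, r, t).shape)    # torch.Size([5])
print(distmult_score(h, r, t).shape)  # torch.Size([5])
```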

Other

  • Update tutorials (the "at a glance" and message passing tutorials) to shift the focus to built-in functions and NN modules; see the sketch below.
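For context, the built-in-function style the tutorials would emphasize looks roughly like this (API names as of v0.3):

```python
import dgl
import dgl.function as fn
import torch

g = dgl.DGLGraph([(0, 1), (1, 2), (2, 0)])
g.ndata['h'] = torch.randn(3, 8)

# One round of message passing using built-in, kernel-backed message
# and reduce functions instead of Python UDFs.
g.update_all(fn.copy_src(src='h', out='m'),
             fn.sum(msg='m', out='h_new'))
print(g.ndata['h_new'].shape)  # torch.Size([3, 8])
```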

Postpone to v0.5

  • [Feature] Distributed KVStore for embeddings: We wish to implement our own distributed KVStore that can store embeddings across multiple machines (a hypothetical interface sketch follows). If it's too rushed, we can postpone this to the next cycle.
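As a rough illustration of the intended interface, a single-machine stand-in is sketched below; everything here (the class name and the `pull`/`push` methods) is hypothetical, since no such API existed yet:

```python
import torch

class EmbeddingKVStore:
    """Hypothetical sketch: maps entity IDs to embedding rows. A
    distributed version would shard the table across machines and
    serve pull/push requests over the network."""

    def __init__(self, num_embeddings, dim):
        self._table = torch.zeros(num_embeddings, dim)

    def pull(self, ids):
        # Fetch the embedding rows for the given IDs.
        return self._table[ids]

    def push(self, ids, grads, lr=0.01):
        # Apply sparse gradient updates to the selected rows.
        self._table[ids] -= lr * grads

store = EmbeddingKVStore(num_embeddings=1000, dim=64)
ids = torch.tensor([3, 7, 42])
emb = store.pull(ids)
store.push(ids, torch.randn(3, 64))
```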
@jermainewang jermainewang added this to the v0.4 milestone Jun 17, 2019
@jermainewang jermainewang pinned this issue Jun 17, 2019
mufeili (Member) commented Jun 17, 2019

For the pooling module, shall we also support common clustering algorithms (KNN, spectral clustering, ...)?

jermainewang (Member, Author) commented:

> For the pooling module, shall we also support common clustering algorithms (KNN, spectral clustering, ...)?

I think we will focus on DL-based pooling methods in this release. For KNN and spectral clustering, I would suggest converting our graph to numpy/scipy and using sklearn. If the conversion is handled carefully (probably with zero-copy support), it should be very efficient.
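A sketch of that workflow, assuming the graph can be exported as a SciPy sparse adjacency matrix (the exact export method varies across DGL versions):

```python
import dgl
import numpy as np
from sklearn.cluster import SpectralClustering

# Two directed triangles; spectral clustering should separate them.
g = dgl.DGLGraph([(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)])

# Export to SciPy and symmetrize into a valid affinity matrix.
adj = g.adjacency_matrix_scipy(fmt='csr').astype(np.float64)
affinity = adj + adj.T

labels = SpectralClustering(n_clusters=2,
                            affinity='precomputed').fit_predict(affinity)
print(labels)  # e.g. [0 0 0 1 1 1]
```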

yzh119 (Member) commented Jun 17, 2019

Should GraphSAGE also be included in the NN modules?
Also, Set Transformer is a kind of graph pooling mechanism; if we have time, we could try it.

aksnzhy (Contributor) commented Jun 18, 2019

The CPU-based KVStore can be released in v0.4; the GPU-direct KVStore could come in the next cycle.

tbright17 commented:

Self-attention graph pooling is simpler but more powerful than DiffPool: https://arxiv.org/abs/1904.08082. It could be good to include it.

HQ01 (Contributor) commented Jun 20, 2019

> Self-attention graph pooling is simpler but more powerful than DiffPool: https://arxiv.org/abs/1904.08082. It could be good to include it.

Just want to mention that there is an inconsistency between the experiment results reported in the DiffPool paper and the DiffPool results reported in the self-attention graph pooling paper, though.

tbright17 commented:

> Self-attention graph pooling is simpler but more powerful than DiffPool: https://arxiv.org/abs/1904.08082. It could be good to include it.

> Just want to mention that there is an inconsistency between the experiment results reported in the DiffPool paper and the DiffPool results reported in the self-attention graph pooling paper, though.

Wow the gap is really big...

mufeili (Member) commented Jul 18, 2019

Depending on our bandwidth, we may want to add examples for three important applications:

  1. Molecular property prediction: Molecular graphs are probably among the most important applications for small graphs. For this area, Neural Message Passing for Quantum Chemistry can be a good example candidate. During our discussion with the Tencent Alchemy team, this model achieved the best performance among previous work on the quantum chemistry tasks they are interested in. It has also been mentioned previously in the discussion forum. I will take this.
  2. Point clouds: An important topic for constructing graphs over non-graph data and bridging graph computing with CV and graphics, as mentioned in issue #719.
  3. Geometry/3D data: The latest wave of deep learning on graphs is strongly correlated with geometric data and can be collectively considered geometric deep learning. There may be high interest in applying graph neural networks to more general geometric data, as mentioned in a previous discussion thread.

@jermainewang jermainewang changed the title [Roadmap] v0.4 release draft [Roadmap] v0.4 release tracker Sep 9, 2019
jermainewang (Member, Author) commented:

Changed the draft to a progress tracker. The target release date is 09/30.

For all committers @zheng-da @szha @BarclayII @VoVAllen @ylfdq1118 @yzh119 @GaiYu0 @mufeili @aksnzhy @zzhang-cn @ZiyueHuang , please vote with +1 if you agree with this plan.

aksnzhy (Contributor) commented Sep 10, 2019

@jermainewang Actually, the kvstore has been finished and we have already finished a demo training distributed DistMult on the FB15k data. Should we release this demo in 0.4?

jermainewang (Member, Author) commented:

> @jermainewang Actually, the kvstore has been finished and we have already finished a demo training distributed DistMult on the FB15k data. Should we release this demo in 0.4?

Yes, let's push for the feature. But it's OK if we decide it needs more time to polish; in that case we can highlight it in v0.5 instead.

jermainewang (Member, Author) commented:

v0.4 has been released. Thanks everyone for the support.

@jermainewang jermainewang unpinned this issue Oct 9, 2019