
Memory consumption of the GGNN module #852

Closed
m09 opened this issue Sep 10, 2019 · 5 comments

Comments

m09 commented Sep 10, 2019

🐛 Bug

The memory consumption of the GGNN module prevents its application to medium/big graphs.

To Reproduce

Steps to reproduce the behavior:

  1. Run a GGNN on any graph with >10k edges and it will consume many gigabytes of memory.

Expected behavior

  1. The GGNN module should not require an excessive amount of memory.

Environment

Confirmed to consume excessive memory in the following environment:

  • DGL Version (e.g., 1.0): 0.3.1
  • Backend Library & Version (e.g., PyTorch 0.4.1, MXNet/Gluon 1.3): PyTorch 1.1.0
  • OS (e.g., Linux): Ubuntu 18.04
  • How you installed DGL (conda, pip, source): pip install dgl-cu100
  • Build command you used (if compiling from source):
  • Python version: 3.7
  • CUDA/cuDNN version (if applicable): 10
  • GPU models and configuration (e.g. V100): GTX 1070
  • Any other relevant information:

Additional context

The problem stems from the formulation used in the implementation, where each edge (which carries a type) is embedded: because the linear transformation for each type is modelled this way, the weights are replicated once for every edge of that type. See https://github.com/src-d/formatml/blob/master/formatml/modules/graph_encoders/ggnn.py for an implementation that maintains explicit Linears and doesn't have this problem (it can probably be improved a lot, by the way; the link is just to outline the difference in implementation).
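To make the difference concrete, here is a minimal sketch in plain PyTorch (not DGL's or formatml's actual code; all tensor and variable names are illustrative). The first variant materializes one weight matrix per edge, so its memory grows with the number of edges E; the second keeps one `nn.Linear` per edge type and applies it only to the edges of that type, so the weights are never replicated:

```python
import torch
import torch.nn as nn

num_nodes, num_edge_types, d = 5, 3, 4
src = torch.tensor([0, 1, 2, 3, 4, 0])      # source node of each edge
etype = torch.tensor([0, 1, 2, 0, 1, 2])    # type of each edge
h = torch.randn(num_nodes, d)               # node features

# Memory-hungry variant: replicate a (d x d) weight matrix per edge.
W = torch.randn(num_edge_types, d, d)
per_edge_W = W[etype]                       # (E, d, d) -- grows with E
msgs_a = torch.bmm(h[src].unsqueeze(1), per_edge_W).squeeze(1)

# Memory-friendly variant: one explicit Linear per edge type, applied
# only to the edges of that type (weights stored once per type).
linears = nn.ModuleList(nn.Linear(d, d, bias=False)
                        for _ in range(num_edge_types))
with torch.no_grad():
    for t, lin in enumerate(linears):
        lin.weight.copy_(W[t].t())          # same weights, so outputs match

msgs_b = torch.empty_like(msgs_a)
for t in range(num_edge_types):
    mask = etype == t
    msgs_b[mask] = linears[t](h[src][mask])

assert torch.allclose(msgs_a, msgs_b, atol=1e-5)
```

Both variants compute the same messages; the only difference is whether the per-type weight matrices are gathered into an (E, d, d) tensor or kept as E-independent parameters.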

yzh119 (Member) commented Sep 10, 2019

Thanks for reporting this.
There are two kinds of implementation. The first saves the weight on the edges, at a memory cost of O(E * d_in * d_out). The second first projects the node features with all relation matrices; while this saves memory, the time complexity is high (O(R * d_in * d_out) per projection), so when R is large this implementation is not efficient.
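The second strategy can be sketched as follows (an illustrative snippet, not DGL's internals): pre-project every node feature with all R relation matrices, then gather the right projection per edge. The intermediate tensor is (N, R, d_out), independent of the number of edges E, but every node pays for all R projections even if only a few relation types touch it:

```python
import torch

N, R, E, d_in, d_out = 6, 3, 10, 4, 4
h = torch.randn(N, d_in)                    # node features
W = torch.randn(R, d_in, d_out)             # one matrix per relation
src = torch.randint(0, N, (E,))             # source node of each edge
etype = torch.randint(0, R, (E,))           # relation type of each edge

# Project every node with all R relation matrices up front:
proj = torch.einsum('nd,rde->nre', h, W)    # (N, R, d_out)
msgs = proj[src, etype]                     # pick the right relation per edge

# Equivalent edge-wise computation, for checking:
ref = torch.stack([h[src[i]] @ W[etype[i]] for i in range(E)])
assert torch.allclose(msgs, ref, atol=1e-5)
```

This trades the O(E * d_in * d_out) edge-replicated weights for an O(N * R * d_out) projection tensor, which is exactly the time/space balance described above.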

We are considering balancing the time/space complexity with a set of new kernels, and we will notify you of our further updates.

m09 (Author) commented Sep 10, 2019

Thanks for the fast reply. If both use cases (a huge number of relations with a small enough graph, and any number of relations with a medium/big graph) are common enough, it may be worth splitting the module into two implementations and letting the user decide which one is better adapted.

yzh119 (Member) commented Sep 28, 2019

Yes; since we have no plans for new kernels in the near term, I'll refactor the code to use the implementation you mentioned. For DGL v0.5 we should have a better solution with the new segment ops.
Thanks!

yzh119 (Member) commented Sep 30, 2019

m09 (Author) commented Sep 30, 2019

Looks good to me 💯
