
[Model] GraphSAGE with control variate sampling on new sampler #1355

Merged: 17 commits into dmlc:master from graphsage-cv on Apr 26, 2020

Conversation

@BarclayII (Collaborator) commented on Mar 12, 2020

Description

This implements control variate sampling from https://arxiv.org/abs/1710.10568 with the new sampler interface.
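For reference, the estimator in that paper aggregates only the difference between current and historical activations over the sampled neighbors, then adds a full-neighborhood aggregation of the (cheap, precomputed) historical activations as the control variate. A minimal PyTorch sketch, where cv_aggregate, P_full, P_sampled, h, and h_hist are illustrative names rather than the PR's actual API:

import torch

def cv_aggregate(P_full, P_sampled, h, h_hist):
    # P_sampled: sampled and rescaled propagation matrix (sparse, N x N)
    # P_full:    full propagation matrix (sparse, N x N)
    # h:         current activations (N x D)
    # h_hist:    historical activations (N x D)
    # Variance-reduced aggregation: the delta over sampled neighbors
    # plus the full aggregation of the historical activations.
    return torch.sparse.mm(P_sampled, h - h_hist) + torch.sparse.mm(P_full, h_hist)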

@zheng-da @lingfanyu Please take a look if you are interested.

TODO

  • Profile performance (speed and memory)
  • Verify correctness on multi-GPU
  • Better document the code
  • Refactor the code with vanilla GraphSAGE and extract common logic. (Skipping this for now.)

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [NN], [Model], [Doc], [Feature])
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

@BarclayII closed this on Mar 13, 2020
@BarclayII reopened this on Mar 13, 2020
@lingfanyu (Collaborator)

Wondering why not start with a non-distributed version?

@BarclayII (Collaborator, Author)

There's not much difference between a non-distributed version and a distributed one: the only differences are initializing a distributed context and synchronizing the gradients among GPUs.
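Concretely, those two pieces look roughly like this (a sketch assuming NCCL with one process per GPU; the function names and the address/port are placeholders, not the PR's code):

import torch
import torch.distributed as dist

def init_distributed(rank, world_size, dev_id):
    # Difference 1: initialize the distributed context.
    dist.init_process_group('nccl', init_method='tcp://127.0.0.1:12345',
                            rank=rank, world_size=world_size)
    torch.cuda.set_device(dev_id)

def sync_gradients(model, world_size):
    # Difference 2: average gradients across GPUs after backward().
    for param in model.parameters():
        if param.grad is not None:
            dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)
            param.grad /= world_size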

@BarclayII marked this pull request as ready for review on March 28, 2020 11:55
"""
Copys features and labels of a set of nodes onto GPU.
"""
blocks[0].srcdata['features'] = g.ndata['features'][blocks[0].srcdata[dgl.NID]].to(dev_id)
Collaborator:
By the way, the original paper plays a trick: it aggregates the input features for the first layer in advance.
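That trick amounts to one full-graph aggregation of the raw input features before training, since the layer-0 inputs never change. A sketch using DGL's built-in message functions with a mean aggregator ('features_agg' is a hypothetical field name, not the PR's):

import dgl.function as fn

# One-time full-graph mean aggregation of the input features; the first
# layer can then read 'features_agg' instead of re-aggregating 'features'
# for every minibatch.
g.update_all(fn.copy_u('features', 'm'), fn.mean('m', 'features_agg'))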

examples/pytorch/graphsage/train_cv_multi_gpu.py
        else:
            assert isinstance(exception, Exception)
            raise exception.__class__(trace)
    return decorated_function
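For context, the full helper presumably looks something like the reconstruction below: run the function body in a fresh thread, ship any exception together with its formatted traceback back through a queue, and re-raise it in the caller (thread_wrapped_func matches the name used in other DGL examples; the details are a sketch, not the exact diff):

import traceback
from functools import wraps
from _thread import start_new_thread
import torch.multiprocessing as mp

def thread_wrapped_func(func):
    @wraps(func)
    def decorated_function(*args, **kwargs):
        queue = mp.Queue()
        def _queue_result():
            exception, trace, res = None, None, None
            try:
                res = func(*args, **kwargs)
            except Exception as e:
                exception = e
                trace = traceback.format_exc()
            # Ship the result (or the failure) back to the caller.
            queue.put((res, exception, trace))
        start_new_thread(_queue_result, ())
        result, exception, trace = queue.get()
        if exception is None:
            return result
        else:
            assert isinstance(exception, Exception)
            raise exception.__class__(trace)
    return decorated_function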
Collaborator:

Can we keep this somewhere inside DGL?

Collaborator (Author):

Totally agree. Will put it in utils.

It is everywhere in our newly contributed examples already.

Collaborator (Author):

I'm now wondering if utils is a good place to put this function, since it depends on PyTorch multiprocessing module.

@zheng-da (Collaborator)

Why not follow the original example?

@BarclayII (Collaborator, Author)

I thought using push could save communication compared to using pull. My bad. Will change it.

# Column name of the layer-(i + 1) history.
hist_col = 'hist_%d' % (i + 1)

# Move the freshly computed activations to CPU and scatter them back
# into the historical-activation storage for the seed nodes.
h_new = block.dstdata['h_new'].cpu()
g.ndata[hist_col][ids] = h_new
Member:

Will there be a data race issue?

Collaborator (Author):

Apparently the child processes don't share the same frame.
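For reference, those per-layer history columns would presumably be allocated once on CPU before the worker processes start, along these lines (n_layers and n_hidden are illustrative):

import torch

# Hypothetical setup: one CPU-resident history column per hidden layer,
# overwritten in place as minibatches produce new activations.
for i in range(n_layers - 1):
    g.ndata['hist_%d' % (i + 1)] = torch.zeros(g.number_of_nodes(), n_hidden)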

@jermainewang (Member) left a comment:

LGTM

@jermainewang (Member)

Other suggestions:

  • Add the training results to the README.
  • Does control-variate-based sampling converge faster? If so, you could highlight that in the README too.
  • There is a lot of code duplication between train_cv.py and train_cv_multi_gpu.py. I can see that putting everything in one script is potentially easier for our users to copy-paste, so I'll leave it to you to decide whether to put the common parts in a shared file.

@BarclayII (Collaborator, Author)

I'll leave them as is.

@BarclayII merged commit 97bb85d into dmlc:master on Apr 26, 2020
@BarclayII deleted the graphsage-cv branch on April 26, 2020