Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/465 directed graphsage #479

Merged
merged 32 commits into from Sep 23, 2019
Merged

Conversation

@geoffj-d61
Copy link
Contributor

geoffj-d61 commented Aug 22, 2019

Updated the GraphSAGE aggregation architecture to create a directed GraphSAGE algorithm that separately handles in-node and out-node neighbourhoods. Some (temporary) demo examples have been added to check the differences in behaviour between directed and undirected GraphSAGE.

Note that example 3 (directed GraphSAGE, no in-nodes, only out-nodes) is now broken; this is not because of directionality, but because of an existing bug in the framework: the mean of a tensor with a 0 dimension produces NaNs. Undirected GraphSAGE suffers this same bug (try setting the 2nd sampling dimension to 0 in either example 1 or 2). Will need help to fix this.

geoffj-d61 added 16 commits Aug 14, 2019
…oks.
@geoffj-d61 geoffj-d61 added the ml label Aug 22, 2019
@geoffj-d61 geoffj-d61 added this to the v0.4 Sprint 11 milestone Aug 22, 2019
@geoffj-d61 geoffj-d61 requested review from adocherty and youph Aug 22, 2019
@review-notebook-app

This comment has been minimized.

Copy link

review-notebook-app bot commented Aug 22, 2019

Check out this pull request on ReviewNB: https://app.reviewnb.com/stellargraph/stellargraph/pull/479

You'll be able to see notebook diffs and discuss changes. Powered by ReviewNB.

Copy link

codeclimate bot left a comment

The PR diff size of 7185 lines exceeds the maximum allowed for the inline comments feature.

@codeclimate

This comment has been minimized.

Copy link

codeclimate bot commented Aug 22, 2019

Code Climate has analyzed commit 591da33 and detected 7 issues on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 7

View more on Code Climate.

geoffj-d61 added 3 commits Aug 23, 2019
Copy link
Contributor

adocherty left a comment

GraphSAGEAggregator.get_config:

  • Currently the models fail when saved and loaded for directed models as there is a new parameter introduced (neigh_dim). We will need to add the new parameter to the config dictionary so that the layer can be saved/loaded. This is superceded by the updates to the aggregators I've committed.

  • Related to that we should probably have a test for saving/loading each model!

General:

  • As there is no DirectedGraphSAGELinkGenerator I am guessing that this means that currently the directed GraphSAGE only supports node inference models? I think we should make a note of this in the DirectedGraphSAGE docstring.

  • I've added the DirectedGraphSAGE class to the docs/api.txt file to include this class in the API documentation. Currently this is a manual process, so we can have a curated list of classes of which we document the API.

  • Finally, I've refactored the aggregators classes to generalize more easily to multiple groups of neighbours so that now all aggregators work in the same way by creating multiple weights/models for each neighbourhood group. I'm hoping we can use this implementation to generalize the aggregators to the HinSAGE class. This implementation also avoids the NaN issues when the inputs have zero neighbours (for example in_samples = [0, 0]) by only including the input when it is has no zero-dimensional inputs (this also allows for in_samples = [0, 4] for example); therefore, example notebook 3 now works. In addition the aggregators naturally reduces to a MLP without the need for a special method to create one.

  • Note that this allows the models to be saved without the change to add neigh_dim to the config dictionary.

  • I also replaced the _compute_input_sizes method with something more fun 😄

  • A general comment is that I think some of the files are getting too large and we should think about splitting them up, for examplemappers/node_mappers.py. I'll make an issue for this.

Notebooks:

  • I've added some code to the experim.ipynb to save & load the model to test this.

  • I think that we should move the notebook demos/node-classification/directed-graphsage/example4-directed-graph-fully-directed-graphsage.ipynb to demos/node-classification/graphsage/directed-graphsage-on-cora-example.ipynb and remove the other notebooks for now. However, I think these other notebooks are useful but shouldn't be released in the demos directory. Perhaps we can put them in the stellar-research repository under a suitable directory?

stellargraph/layer/graphsage.py Outdated Show resolved Hide resolved
stellargraph/layer/graphsage.py Outdated Show resolved Hide resolved
stellargraph/layer/graphsage.py Show resolved Hide resolved
stellargraph/layer/graphsage.py Outdated Show resolved Hide resolved
stellargraph/layer/graphsage.py Outdated Show resolved Hide resolved
stellargraph/layer/graphsage.py Outdated Show resolved Hide resolved
stellargraph/layer/graphsage.py Outdated Show resolved Hide resolved
out_layer = stage_tree[0]

# Remove neighbourhood dimension from output tensors of the stack
out_layer = Reshape(K.int_shape(out_layer)[2:])(out_layer)

This comment has been minimized.

Copy link
@adocherty

adocherty Sep 2, 2019

Contributor

This will break if the neighbourhood dimension is not one. While currently I don't know of any cases where this is true, it doesn't hurt to check:

Suggested change
out_layer = Reshape(K.int_shape(out_layer)[2:])(out_layer)
if K.int_shape(x)==1:
out_layer = Reshape(K.int_shape(out_layer)[2:])(out_layer)

This comment has been minimized.

Copy link
@geoffj-d61

geoffj-d61 Sep 9, 2019

Author Contributor

@adocherty what is x here in if K.int_shape(x)==1? The method input is xin which is the (flattened binary tree) list of tensors, so it's not that. And the out_layer we are expecting to have 2 or more dimensions. So I'm a bit confused here!

Is it meant to be if K.int_shape(out_layer)[1] == 1?

This comment has been minimized.

Copy link
@geoffj-d61

geoffj-d61 Sep 20, 2019

Author Contributor

@adocherty I think the above is the last outstanding issue.

This comment has been minimized.

Copy link
@adocherty

adocherty Sep 23, 2019

Contributor

Hi, yes I meant K.int_shape(out_layer)[1] == 1

This comment has been minimized.

Copy link
@geoffj-d61

geoffj-d61 Sep 23, 2019

Author Contributor

Added and committed!

stellargraph/layer/graphsage.py Outdated Show resolved Hide resolved
stellargraph/data/explorer.py Outdated Show resolved Hide resolved
…behaviour.
@adocherty

This comment has been minimized.

Copy link
Contributor

adocherty commented Sep 2, 2019

OK, my changes have broken the unit tests. This is due to the AttentionalAggregator which doesn't fit into the paradigm of the other aggregators. I'll commit a fix tomorrow 😟

Updated: Fixed the bug. I'm still not happy with the hacks required for AttentionalAggregator

adocherty and others added 3 commits Sep 2, 2019
@geoffj-d61

This comment has been minimized.

Copy link
Contributor Author

geoffj-d61 commented Sep 5, 2019

@adocherty I have moved example4 into graphsage, as you suggested, and deleted directed-graphsage. The example and experimental notebooks have been saved as branch feature/465-directed-graphsage on stellar-research.

geoffj-d61 added 5 commits Sep 9, 2019
geoffj-d61 added 2 commits Sep 23, 2019
@geoffj-d61

This comment has been minimized.

Copy link
Contributor Author

geoffj-d61 commented Sep 23, 2019

The one thing I forgot to check fully was the demo, which got broken by the keras -> tensorflow.keras changes! Fixed now.

adocherty added 2 commits Sep 23, 2019
…tellargraph/stellargraph into feature/465-directed-graphsage
Copy link
Contributor

adocherty left a comment

@geoffj-d61 the changes look good, let's go ahead and merge this. I also changed the notebook to add a plot with the different neighbourhood types and updated some of the text.

@geoffj-d61 geoffj-d61 merged commit 87601db into develop Sep 23, 2019
2 checks passed
2 checks passed
buildkite/stellargraph Build #1565 passed (21 minutes, 54 seconds)
Details
coverage/coveralls First build on feature/465-directed-graphsage at 84.918%
Details
@geoffj-d61 geoffj-d61 deleted the feature/465-directed-graphsage branch Sep 23, 2019
huonw added a commit that referenced this pull request Jan 13, 2020
This was accidentally missed in #479.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.