New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Reversed Graph and Transform Module #331

Merged
merged 32 commits into from Jan 4, 2019

Conversation

Projects
None yet
5 participants
@mufeili
Copy link
Member

mufeili commented Dec 31, 2018

Description

@jermainewang Here is an API for creating the reverse of a DGLGraph with node/edge features shared. This can be potentially helpful when dealing with undirected graphs. We also make a reorg by putting line_graph and reverse into a transform.py module.

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [Model], [Doc], [Feature]])
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented
  • To the my best knowledge, examples are either not affected by this change,
    or have been fixed to be compatible with this change
  • Related issue is referred in this PR

Changes

  • API implementation
  • Related docs completed
  • Test case provided
  • A small fix for the installation guide: removed an outdated TODO

mufeili added some commits Dec 31, 2018

Revert "Fix"
This reverts commit 1728826.
Fix
Fix
Fix
@jermainewang

This comment has been minimized.

Copy link
Member

jermainewang commented Dec 31, 2018

@mufeili I would make it simpler -- the dgl.reverse generates a new graph that optionally shares node/edge features with the input graph. Similar to a tensor operation that generates new tensor, our graph operation always generates new graph. We could also have a member function DGLGraph.reverse that calls dgl.reverse. In fact, I feel line_graph should be defined in such manner. We could put them in a transform.py module. The ReversedGraph is not needed and we don't need to maintain an _reverse field in DGLGraph.

@mufeili

This comment has been minimized.

Copy link
Member Author

mufeili commented Jan 1, 2019

@jermainewang If the reversed graph shares node/edge features with the original graph and the topology of the original graph changes after creating the reverse, wouldn't it be very strange that the number of the rows in the frame is different from the number of nodes/edges?

@jermainewang

This comment has been minimized.

Copy link
Member

jermainewang commented Jan 1, 2019

That's true and is the same for line graph shared mode and subgraph shared mode. This is the problem of mutable operation. For example, in pytorch:

>>> import torch as th
>>> x = th.ones((10,))
>>> y, z = x.split(5)
>>> x[0] = -1  # mutate x
>>> x[7] = -2  # mutate x
>>> y
tensor([-1.,  1.,  1.,  1.,  1.])
>>> z
tensor([ 1.,  1., -2.,  1.,  1.])
@mufeili

This comment has been minimized.

Copy link
Member Author

mufeili commented Jan 1, 2019

@jermainewang So let's say we ignore this issue for the time being, put reverse and line_graph in a transform.py and allow users to do things like g.reverse(), g.line_graph()?

@zheng-da

This comment has been minimized.

Copy link
Collaborator

zheng-da commented Jan 1, 2019

Why do we need this operation in an undirected graph? When we reverse an undirected graph, we get the same graph.

@mufeili

This comment has been minimized.

Copy link
Member Author

mufeili commented Jan 1, 2019

@zheng-da I suppose the point is that when we represent an undirected static graph using a directed graph, rather than adding edges for both directions, we can do rg=g.reverse() instead.

@zheng-da

This comment has been minimized.

Copy link
Collaborator

zheng-da commented Jan 1, 2019

Is it how we implement undirected graphs? It doesn't sound right. For example, when we do update_all, messages should be sent to all nodes. If we split an undirected graph into two direct graphs, we'll have to run update_all on two graphs and nodes in a graph can only receive some messages.

@zheng-da

This comment has been minimized.

Copy link
Collaborator

zheng-da commented Jan 1, 2019

undirected graphs are very common. We shouldn't implement undirected graphs very differently from directed graphs.

@VoVAllen

This comment has been minimized.

Copy link
Collaborator

VoVAllen commented Jan 1, 2019

What's the typical scenario for this API? Also what's the relationship between undirected graph and this?

To me it's better to design an API for sharing features rather than a new type of graph.

@jermainewang

This comment has been minimized.

Copy link
Member

jermainewang commented Jan 1, 2019

My thoughts for undirected graph:

  • Current solution for undirected graph is fine -- aka treating undirected graph as bi-directional graph.
  • What is missing is when the bi-directional edges (u, v) and (v, u) share the same features.

This motivates the dgl.reverse API. Some demo codes:

G = ...  # some directed graph
Gr = dgl.reverse(G, shared=True)  # the reverse graph where u->v in G shares feature with v->u in Gr
# message passing (suppose we want to do update_all)
# first send along two directions
G.send(ALL, ...)
Gr.send(ALL, ...)
# then recv either G or Gr to update the nodes
G.recv(ALL, ...)

This looks tedious I agree. So the next step is to provide an undirected graph class (using the reverse API as a building block):

class UndirectedDGLGraph:
  def __init__(self, graph_data):
    self.G = DGLGraph(graph_data)
    self.Gr = dgl.reverse(self.G)
  def update_all(mfunc, rfunc, afunc):
    self.G.send(ALL, mfunc)
    self.Gr.send(ALL, mfunc)
    self.G.recv(ALL, rfunc, afunc)

The rest message passing APIs can be implemented in the same way. If you guys don't like this technical solution, feel free to propose another one.

@jermainewang

This comment has been minimized.

Copy link
Member

jermainewang commented Jan 1, 2019

Hmm... I found the problem here. G.recv cannnot receive messages sent in Gr. Any suggestions?

@BarclayII

This comment has been minimized.

Copy link
Collaborator

BarclayII commented Jan 2, 2019

Do we have applications where (u, v) and (v, u) in the same graph share features?

In my mind, the reverse graph is only helpful for doing stuff such as "reversed pull" (like that in graph_nets).

@jermainewang

This comment has been minimized.

Copy link
Member

jermainewang commented Jan 2, 2019

Do we have applications where (u, v) and (v, u) in the same graph share features?

No there is not, but a high chance to be one in the future. Think about a task to learn similarity between two nodes. The "similarity" would be shared by both direction.

In my mind, the reverse graph is only helpful for doing stuff such as "reversed pull" (like that in graph_nets).

Reversed pull looks like a push on the reverse direction. It is doable in current framework (using bi-directional edges). The question is how shall we support shared edge features?

@BarclayII

This comment has been minimized.

Copy link
Collaborator

BarclayII commented Jan 2, 2019

No there is not, but a high chance to be one in the future. Think about a task to learn similarity between two nodes. The "similarity" would be shared by both direction.

Then currently one can somehow manually maintain a mapping between edges and their corresponding reverse edges. Once the mapping is obtained, one can scatter the rows into the graph. Sounds tedious but doable with the current API.

Reversed pull looks like a push on the reverse direction. It is doable in current framework (using bi-directional edges). The question is how shall we support shared edge features?

No they are different. Reversed pull is still a pull: there is only one receiver node, but instead of pulling along inbound edges like we do now, it pulls along outbound edges. Pushing is sending messages from one node to all neighbors, so there are multiple receivers.

@VoVAllen

This comment has been minimized.

Copy link
Collaborator

VoVAllen commented Jan 2, 2019

We should let user control the message passing process although it would be tedious. Since user may want do message passing on G and rG at the same time or do rG at first and so on.
The UndirectedDGLGGraph is a bit confusing since message passing itself has the direction. Is it possible to do sharing features within the scope of setting edge attribute?

@zheng-da

This comment has been minimized.

Copy link
Collaborator

zheng-da commented Jan 2, 2019

actually, all of our test graphs we use (cora, citeseer and pubmed) are undirected graphs. They aren't undirected graphs originally, but they were converted to undirected graphs by the original paper. So is the Reddit graph.

@zheng-da

This comment has been minimized.

Copy link
Collaborator

zheng-da commented Jan 2, 2019

If the goal is to handle an undirected graph better, I don't think we should use reverse to solve the problem. To be honest, it's quite counter-intuitive. In an undirected graph, a vertex needs to get messages from both in-edges and out-edges. Neither G nor Gr can receive all messages required by the following node update function.

The simplest solution, I think, is to modify the graph index to make sure (u, v) and (v, u) have the same edge id. This is the root cause of this problem. It seems to me that this is easy to achieve. Whenever adding a new edge, we can reverse the edge and assign the same edge id to the reversed edge.

@VoVAllen

This comment has been minimized.

Copy link
Collaborator

VoVAllen commented Jan 2, 2019

If edges share id, what should be the result of g.edata['feat']?

If we show all the edges attributes including "reverse edges", it seems inevitable to risk breaking the consistency on shared feature by user's mutation.

@jermainewang

This comment has been minimized.

Copy link
Member

jermainewang commented Jan 2, 2019

@zheng-da

This comment has been minimized.

Copy link
Collaborator

zheng-da commented Jan 2, 2019

so we need edge id that refers to edge data and message id that refers to messages? How about we use (edge id, u > v) as a message id? The message from u to v and message from v to u are both stored on edge (u, v), u > v refers to the right message. Is this doable?

mufeili added some commits Jan 3, 2019

Reorg transform and update reverse (#2)
* Reorg transform and update reverse

* Fix doc and test

* Update test
Fix

@mufeili mufeili changed the title [Feature] Reversed Graph with Shared Features [Feature] Reversed Graph and Transform Module Jan 3, 2019

@mufeili mufeili requested a review from jermainewang Jan 3, 2019

Show resolved Hide resolved python/dgl/graph.py Outdated
Show resolved Hide resolved python/dgl/transform.py Outdated
Show resolved Hide resolved python/dgl/transform.py Outdated
Show resolved Hide resolved python/dgl/transform.py Outdated

mufeili added some commits Jan 4, 2019

Fix
Fix
Fix
Fix
@jermainewang

This comment has been minimized.

Copy link
Member

jermainewang commented Jan 4, 2019

LGTM. Thanks murphy!

@jermainewang jermainewang merged commit 24bbdb7 into dmlc:master Jan 4, 2019

1 check passed

continuous-integration/jenkins/pr-merge This commit looks good
Details

@jermainewang jermainewang referenced this pull request Jan 4, 2019

Merged

[Test] Unify tests for different backends #333

5 of 5 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment