New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Option to include initial labels in weisfeiler_lehman_subgraph_hashes
#6601
Option to include initial labels in weisfeiler_lehman_subgraph_hashes
#6601
Conversation
I think the original description is correct. Each node's label is created by combining the node labels of nodes within i-hops of the node. Since those nodes at distance i have their labels created from nodes at distance i, the node labels are influenced by all nodes within What do you think? If it really should be 2i then we should change back the switch in that text from #6598. Other changes in the doc_string may or may not need adjusting too. And maybe we should add enough text to make this aspect of the description more clear. |
Just for context, I originally wrote the function & docstring, and I think this error is an easy mistake to make - which is probably why I got it wrong the first time around. But I'm quite sure it should be just The flaw in your description is that a node label at iteration At each iteration, every node receives information from its immediate neighbors only. Therefore, information cannot possibly traverse the network at a faster rate than 1 edge per iteration. The algorithm is essentially a message passing algorithm, like pregel or like graph convolutional networks. What we're describing here is the 'receptive field' of the algorithm, and literature like slide 40 in this stanford course http://snap.stanford.edu/class/cs246-2021/slides/20-GNN.pdf establish that K layers = K hop receptive field. Finally, a practical way of convincing yourself is to override the Notice that in the 2 successive iterations, never is a node label more than Here's a snippet to show it: import networkx as nx
import matplotlib.pyplot as plt
nx.algorithms.graph_hashing._hash_label = lambda u, *_: str(u)
G = nx.path_graph(5)
nx.set_node_attributes(G, {u: u for u in G}, "label0")
wl = nx.weisfeiler_lehman_subgraph_hashes(G, node_attr="label0")
nx.set_node_attributes(G, {u: wl[u][0] for u in G}, "label1")
nx.set_node_attributes(G, {u: wl[u][1] for u in G}, "label2")
_, axs = plt.subplots(3, figsize=(8, 20))
pos = nx.spring_layout(G)
nx.draw(G, pos=pos, ax=axs[0], labels=nx.get_node_attributes(G, "label0"))
nx.draw(G, pos=pos, ax=axs[1], labels=nx.get_node_attributes(G, "label1"))
nx.draw(G, pos=pos, ax=axs[2], labels=nx.get_node_attributes(G, "label2"))
plt.show() |
So, you are saying that each iteration only collects the labels from the neighbors... and not the neighbors-neighbors when i=2 and the nbrs-nbrs-nbrs when i=3... etc. OK. clearly I misunderstood. Another change that would be helpful is to go through the doc_string and use |
I think I caught all the cases @dschult - I changed |
Sorry this has taken so long to get back to -- and Thanks for responding so quickly to the new Issue! :) It looks like this PR is now conflicting with main in one of the doc strings. Can you merge/rebase the main branch into this one? Go ahead and make the changes needed to fix #7330 and flag me -- I will try to make it a fast review this time. :) |
9795796
to
055f243
Compare
networkx/algorithms/graph_hashing.py
Outdated
Lists of subgraph hashes are sorted in increasing order of depth from | ||
their root node, with the hash at index i corresponding to a subgraph | ||
of nodes at most i edges distance from u. Thus, each list will contain | ||
``iterations + 1`` elements - a hash for a subgraph at each depth, and | ||
additionally a hash of the initial node label (or equivalently a | ||
subgraph of depth 0) | ||
`iterations` elements - a hash for a subgraph at each depth. If | ||
`include_initial_labels` is set to `True`, each list will additionally | ||
have contain a hash of the initial node label (or equivalently a | ||
subgraph of depth 0) prepended, totalling `iterations + 1` elements. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should I be using latex notation in this paragraph for i and u?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general guide I use is "test is better than latex except when it isn't". So symbols like i and u should be kept in text rather than LaTeX. Mathematical expressions can become unreadable in text format so that's when we should use LaTeX.
weisfeiler_lehman_subgraph_hashes
weisfeiler_lehman_subgraph_hashes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks @aaronzo , just a minor nit on backticks. FWIW the rule of thumb re: backticks is single backticks for parameters and link targets, double backticks for inline literals.
The changes relating to #7330 look good to me. Thanks for getting to this so quickly! |
Co-authored-by: Ross Barnowski <rossbar@berkeley.edu>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @aaronzo and @kalekundert for this! And @rossbar for review
…s` (networkx#6601) * docstring * change n -> u for node in docstring * fix issue 7330 * Update networkx/algorithms/graph_hashing.py Co-authored-by: Ross Barnowski <rossbar@berkeley.edu> --------- Co-authored-by: Ross Barnowski <rossbar@berkeley.edu>
Apologies for the spam folks, I thought I had caught the mistakes in #6598 but had to correct some things further down too - good catch from @dschult in the PR comments.
I've read the whole docstring through a couple of times now and it looks right.