-
-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Knotty Centrality by Shanahan and Wildie #10
Conversation
I know this has been up for a while, but I'll make a few comments if the authors are still interested in getting it incorporated. The algorithm seems sound, though I haven't read the original paper in depth. I'll make a few inline comments on the code. I think this could probably be included in |
networkx/algorithms/knotty.py
Outdated
nodes, kc = _find_knotty_centre(G, compact) | ||
g = G.subgraph(nodes) | ||
|
||
return g, nodes, kc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just return g
or nodes
. One can be found from the other easily. I would advocate for just nodes
It looks like you use |
Your comments are great! I'll fix it as soon as I get a chance. |
Needs update or closing. |
a285931
to
175d2ee
Compare
Sorry for never getting back to this, my Ph.D. got in the way. I made the changes suggested above by @bjedwards but when testing I keep getting:
from the _best_perm function. I didn't get that error two years ago, so I guess the system recursion depth needs to be changed (it's default 999 apparently). I suppose it needs to be written iteratively rather than recursively. |
@swederik Sometimes getting back to code after 8 years is the charm :D I went through the PR, made the changes to be usable in the current version, rebased to main and did some minor clean ups. The resulting changes are here main...MridulS:networkx:knotty_cent_review I tried to run the code but I'm getting the |
# value of knotty-centrality | ||
|
||
if choices: | ||
choices2 = [choices[i] for i in range(0, len(choices))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per the original paper, this should be
choices2 = choices[1:]
And the above will remedy the endless recursion. But I question the name of the function and its description. It's clearly iterating over combinations, not permutations.
it would be more pythonic (and not recursive) to write
def _best_perm(given, choices, G, CIJ, compact, BC):
best_nodes = given
best_kc = _compute_knotty_centrality(G, CIJ, given, compact, BC)
for r in range(1, len(choices)+1):
for subset in combinations(given, r):
nodes = np.hstack((given, subset))
kc = _compute_knotty_centrality(G, CIJ, nodes, compact, BC)
if kc > best_kc:
best_kc = kc
best_nodes = nodes
return best_nodes, best_kc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @boothby!! That indeed fixes the recursion error.
Just to check the algorithm I did a naive implementation and the results don't seem to match up:
from itertools import chain, combinations
def knotty_centrality_naive(G):
graph_subsets = _graph_powerset(set(G))
bc = nx.betweenness_centrality(G)
bc_sum = sum(bc.values())
bc = {n: val/bc_sum for n, val in bc.items()}
current_max = 0
current_max_set = {}
count = 0
for subset in graph_subsets:
kc_val = _knotty_cent(G, subset, bc)
count += 1
if kc_val > current_max:
current_max = kc_val
current_max_set = subset
print(current_max, current_max_set)
return current_max, current_max_set, count
def _graph_powerset(iterable):
s = list(iterable)
return chain.from_iterable(combinations(s, r) for r in range(2, len(s)+1))
def _knotty_cent(G, subset, bc):
subset_graph = G.subgraph(subset)
edges = subset_graph.number_of_edges()
nodes = subset_graph.number_of_nodes()
bc_subset = sum(bc[i] for i in subset)
return (edges*bc_subset) / ((nodes - 1) * (nodes))
Gives the highest knotty_centrality of 0.333 but the implementation in the PR gives 0.666.
And interestingly the paper states that knotty centrality is for directed graphs, @swederik do you remember by any chance what was the motivation to put in the error for a directed graph?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the paper, they define the knotty centrality for subgraphs which have at least two nodes, as you have implemented in your naive version. It appears that the implementation here does not respect that condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it was probably because I was not working with directed graphs at the time and did not test it. I ported over the MATLAB code after meeting the authors at the Brain Connectivity Workshop conference at the time. I don't recall if I ever ended up using it in an actual publication.
return nodes, kc | ||
|
||
def _compute_knotty_centrality(G, CIJ, nodes, compact, BC): | ||
# Returns knotty-centrality of the subgraph of CIJ comprising only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing CIJ
around seems largely wasteful here -- it looks like it's only being used to get the size n = len(G)
, after a quite expensive "binarise" step which redundant because that happens at the top level.
In retrospect, doing the powerset in a recursive (or pseudo-recursive to avoid blowing the stack limit) manner would be good, because BCtot
and density
could be done in constant / n
time respectively instead of n
and n^2
time per call.
On naming: the function |
Hi guys, Thanks so much for putting in the effort to get this merged. I am astonished that this PR still exists after eleven years! I'm not working on this topic anymore (I started this PR during my PhD when working on brain networks) and I am currently on paternity leave so I can't really help too much with this, but I think you are all amazing for picking it up and running with it, and I apologize if I was a terrible programmer 11 years ago :-) I'll tag @mpshanahan here in case he's interested in helping, though I have no idea if he is active on Github. edit: I've checked the Allow Edits by Maintainers checkbox so you are free to push to the PR branch if you want. |
Hello!
This is an implementation of the Knotty Centrality measure by Murray Shanahan and Mark Wildie.
I have ported it from their MATLAB code in the hopes of getting it into NetworkX.
"Attempts to find the sub-graph of G with the highest value for
knotty-centrality. Carries out a series of exhaustive searches on
subsets of the nodes ranked by "indirect" betweenness centrality, then
carries out a phase of hill-climbing to see whether the sub-graph can
be improved by adding further nodes."
The relevant reference is here:
Shanahan M, Wildie M (2012) Knotty-Centrality: Finding the Connective Core of a Complex Network. PLoS ONE 7(5): e36579. doi:10.1371/journal.pone.0036579
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0036579#pone.0036579.s003
and it returns results similar to the rich club coefficient.
UPDATE: I can now confirm that this works as intended...
Testing script and data can be found here:
https://www.dropbox.com/sh/irsau7ojcgtv3z9/ZZvOfckfEX