New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added girth computation function #6633
Conversation
This looks good to me, @alabarre. Would you mind adding tests in |
I added a few. Mine pass, but others fail:
I'm leaving them alone, just wanted to let you know. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the tests! I just have a couple comments/questions...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took the liberty of fixing up the References formatting for the docstring. Generally LGTM, just a few minor suggestions for code/tests, most of which are nits!
Co-authored-by: Dan Schult <dschult@colgate.edu>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took the liberty of making a few minor changes to help get this over the line:
- I merged with main to pick up the latest CI fixes
- I added
girth
to theautosummary
so it shows up in the reference docs - A few minor formatting fixups.
This LGTM, thanks @alabarre !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the "resolved" comments above were not actually implemented even though agreement was indicated. But they are minor items.
I have 2 concerns:
- Shouldn't this have a
weight
argument? Or is girth only used for unweighted graphs? I guess this could be a first PR with the default weight of 1. But I suspect people will request a weighted version eventually. - I don't think we should be mapping all the edges to frozensets. We should be able to test for an edge in an undirected graph without using the frozensets.
I will push the minor changes to remove frozensets and include the agreed upon comments above that got missed. Hopefully it will become easier to read as well, but we'll see.
I think the weighted version should perhaps be a separate PR.
I agree that the weighted version should be a separate PR. Negative weights, for example, would confound this algorithm. |
Hmmm... I'm not sure this makes sense for negative weights... Can't you just go back and forth across one of the negative edges and get as negative a cycle as you want? Or are we limiting this to simple cycles? Anyway, I was assuming we would use Dijkstra -- which would require positive weights anyway. @boothby do you think this PR is ready then? I could approve it, but would feel better with someone else looking at what I did rather than approving it myself. :} |
I presume that simple cycles are assumed. I hadn't considered the weighted girth problem today, but there is interest in negative weights in the literature. https://www.sciencedirect.com/science/article/pii/S1570866715001021 I'll re-read this PR today. |
Oh my! That article is looking for the negative "cost" cycle with the fewest edges -- not with the least cost... I would restrict any PR I do to positive weights and use dijkstra to find the shortest paths. |
Back to you, Dan. I tightened up the length bound, and reverted a change you made that iterated over all of the edges of G; all in the name of performance. I also added the trick of deleting nodes as we go, and in so doing caught a bug in handling isolated nodes. Sorry; your version was so short and sweet. |
And for what it's worth, the calls to bfs and dijkstra are throwing away a lot of information and re-computing it. I've implemented this algorithm "my way" in alabarre/networkx@girth...boothby:networkx:girth-faster but I have not been so presumptuous to push a wholesale rewrite. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the bfs_edges
approach is not ideal because it causes us to go back and find the lengths. We do have at least one function that returns both distance and predecessors. Maybe that could help us avoid including another copy of BFS by using:
pred, dist = nx.predecessor(G, n, return_seen=True)
tree = nx.Graph((p[0], u) for u, p in pred.items() if u != n)
for u, unbrs in tree.adj.items():
...
But maybe we should just write another version of BFS here as in your branch.
Okay, I gave this all a good hard stare. What this algorithm really calls for is a function analogous to BACK_EDGE = "back"
TREE_EDGE = "tree"
FORWARD_EDGE = "forward"
LEVEL_EDGE = "level"
def bfs_labeled_edges(G, sources):
if sources in G:
sources = [sources]
depth = {s: 0 for s in sources}
queue = deque((0, s) for s in sources)
label = [LEVEL_EDGE, FORWARD_EDGE, BACK_EDGE]
while queue:
du, u = queue.popleft()
for v in G[u]:
dv = depth.get(v)
if dv is None:
depth[v] = dv = du + 1
queue.append((dv, v))
yield u, v, du, dv, TREE_EDGE
else:
yield u, v, du, dv, label[dv-du] this greatly simplifies our girth computation: def girth(G):
girth = depth_limit = inf
skip_labels = (
nx.algorithms.traversal.breadth_first_search.BACK_EDGE,
nx.algorithms.traversal.breadth_first_search.TREE_EDGE,
)
forward_edge = nx.algorithms.traversal.breadth_first_search.FORWARD_EDGE
level_edge = nx.algorithms.traversal.breadth_first_search.LEVEL_EDGE
for n in G:
for u, v, du, dv, label in nx.bfs_labeled_edges(G, n):
if du > depth_limit:
break
if label in skip_labels:
pass
else:
length = du + dv + 1
if length < girth:
depth_limit = du - (label is level_edge)
girth = length
return girth This gets my K200 benchmark down to ~25ms. Should I commit it? I'm mostly seeking commentary on ...
for n in G
depth = {n:0}
for u, v, label in nx.bfs_labeled_edges(G, n):
if label is TREE_EDGE:
depth[v] = depth[u] + 1
... |
It occurs to me that we could do |
I agree that we should create a As I think about edge types I'm confused by the BACK_EDGEs. At first I couldn't get them to happen. But I think now that I've convinced myself that they will not occur for undirected graphs. And they definitely can occur for directed graphs. For the undirected graphs, you would always traverse an edge outward before it would be reached going inward. So, while I very much enjoyed(!) the triplet of edge labels indexed by I'm still thinking about whether to have the labels yield the distances -- it is certainly convenient, but also (as you mention) likely to have a way to compute it based on the previous edges. I'm leaning toward a pure version in traversal, and then the |
As I drifted off to sleep last night, the directed case occurred to me. There, back-jumps can be arbitrarily long so my triplet is certainly invalid. I think you're assuming that each edge should occur exactly once, and concluding that back-edges won't happen in the undirected case. And that's certainly achievable, but currently my code emits each undirected edge exactly twice. Perhaps this? def _undirected_bfs_labeled_edges(G, sources):
if sources in G:
sources = [sources]
depth = {s: 0 for s in sources}
visited = set()
queue = deque((0, s) for s in sources)
while queue:
du, u = queue.popleft()
visited.add(u)
for v in G[u]:
dv = depth.get(v)
if dv is None:
depth[v] = dv = du + 1
queue.append((dv, v))
yield u, v, TREE_EDGE
else:
if du == dv:
# could use either "in" or "not in" -- this emits self-loops
if v in visited:
yield u, v, LEVEL_EDGE
elif du < dv:
yield u, v, FORWARD_EDGE
def _directed_bfs_labeled_edges(G, sources):
if sources in G:
sources = [sources]
depth = {s: 0 for s in sources}
queue = deque((0, s) for s in sources)
while queue:
du, u = queue.popleft()
for v in G[u]:
dv = depth.get(v)
if dv is None:
depth[v] = dv = du + 1
queue.append((dv, v))
yield u, v, TREE_EDGE
elif du < dv:
yield u, v, FORWARD_EDGE
elif du > dv:
yield u, v, BACK_EDGE
else:
yield u, v, LEVEL_EDGE |
* added new helper function bfs_labeled_edges * added support for even girth and odd girth
Oh, drat. I was hasty in adding the option for odd&even girth. Thanks, Petersen Graph, for yet again showing us the way. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!! :)
Whatever the choice for self-loops (see below), it looks like we don't need to separate directed and undirected. The use of visited
should apply in the same way to both types of graphs -- so that self.loops are handled in both cases. And the case du > dv
will not arise in the undirected case.
Thanks!
Self-loops are reported in both directed and undirected right now. The directed case does not need to track In the undirected case, every edge is traversed twice, so we need to use
Toggling either choice changes whether or not loops are reported in the undirected case. |
new benchmark
from networkx import girth, complete_graph, cycle_graph, chordal_cycle_graph, Graph
from time import perf_counter
test_graphs = complete_graph(200), cycle_graph(1000), Graph([(u, v) for u, v, _ in chordal_cycle_graph(1019).edges if u != v])
g = girth
for G in test_graphs:
t0, *girths, t1 = (
perf_counter(),
g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
perf_counter()
)
print(f"{(t1-t0)/len(girths):.4f}", end=' ')
print() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed I didn’t. I had no time to review or chime in on the latest developments, but it’s exciting to see what this humble snippet has become. Thanks to everyone involved, I’m looking forward to our next collaborations. |
* added girth computation function * added tests for girth * removed useless method and added seed to random call as recommended * Add numpydoc reference formatting to docstring. * Apply suggestions from code review * Update networkx/algorithms/tests/test_cycles.py Co-authored-by: Dan Schult <dschult@colgate.edu> * Update networkx/algorithms/tests/test_cycles.py * Blacken. * Add girth fn to reference. * avoid frozenset and include other comments * fix import order * performance improvement to girth; added test for isolated nodes * Update networkx/algorithms/cycles.py Co-authored-by: Dan Schult <dschult@colgate.edu> * took Dan's suggestion to use nx.predecessor to obviate nx.single_source... * reordered computation to only find one min-length (even/odd) cycle per root * copying is no longer worthwhile for runtime * rewrote girth for speed and hopefully clarity * added new helper function bfs_labeled_edges * added support for even girth and odd girth * doctest typos * doctest fix * doctests were a mistake * removed faulty parity argument * patched up un-tested unit test * unified directed & undirected bfs_labeled_edges * touchup on bfs_labeled_edges; added _dispatch * more performance improvements --------- Co-authored-by: Ross Barnowski <rossbar@berkeley.edu> Co-authored-by: Mridul Seth <mail@mriduls.com> Co-authored-by: Dan Schult <dschult@colgate.edu> Co-authored-by: Kelly Boothby <boothby@dwavesys.com>
* added girth computation function * added tests for girth * removed useless method and added seed to random call as recommended * Add numpydoc reference formatting to docstring. * Apply suggestions from code review * Update networkx/algorithms/tests/test_cycles.py Co-authored-by: Dan Schult <dschult@colgate.edu> * Update networkx/algorithms/tests/test_cycles.py * Blacken. * Add girth fn to reference. * avoid frozenset and include other comments * fix import order * performance improvement to girth; added test for isolated nodes * Update networkx/algorithms/cycles.py Co-authored-by: Dan Schult <dschult@colgate.edu> * took Dan's suggestion to use nx.predecessor to obviate nx.single_source... * reordered computation to only find one min-length (even/odd) cycle per root * copying is no longer worthwhile for runtime * rewrote girth for speed and hopefully clarity * added new helper function bfs_labeled_edges * added support for even girth and odd girth * doctest typos * doctest fix * doctests were a mistake * removed faulty parity argument * patched up un-tested unit test * unified directed & undirected bfs_labeled_edges * touchup on bfs_labeled_edges; added _dispatch * more performance improvements --------- Co-authored-by: Ross Barnowski <rossbar@berkeley.edu> Co-authored-by: Mridul Seth <mail@mriduls.com> Co-authored-by: Dan Schult <dschult@colgate.edu> Co-authored-by: Kelly Boothby <boothby@dwavesys.com>
* added girth computation function * added tests for girth * removed useless method and added seed to random call as recommended * Add numpydoc reference formatting to docstring. * Apply suggestions from code review * Update networkx/algorithms/tests/test_cycles.py Co-authored-by: Dan Schult <dschult@colgate.edu> * Update networkx/algorithms/tests/test_cycles.py * Blacken. * Add girth fn to reference. * avoid frozenset and include other comments * fix import order * performance improvement to girth; added test for isolated nodes * Update networkx/algorithms/cycles.py Co-authored-by: Dan Schult <dschult@colgate.edu> * took Dan's suggestion to use nx.predecessor to obviate nx.single_source... * reordered computation to only find one min-length (even/odd) cycle per root * copying is no longer worthwhile for runtime * rewrote girth for speed and hopefully clarity * added new helper function bfs_labeled_edges * added support for even girth and odd girth * doctest typos * doctest fix * doctests were a mistake * removed faulty parity argument * patched up un-tested unit test * unified directed & undirected bfs_labeled_edges * touchup on bfs_labeled_edges; added _dispatch * more performance improvements --------- Co-authored-by: Ross Barnowski <rossbar@berkeley.edu> Co-authored-by: Mridul Seth <mail@mriduls.com> Co-authored-by: Dan Schult <dschult@colgate.edu> Co-authored-by: Kelly Boothby <boothby@dwavesys.com>
I implemented the algorithm mentioned on Wikipedia to compute the girth of a graph. This request seems to have been dropped a while ago (#1030), hopefully it is still of interest to some of us.