Added girth computation function #6633

alabarre · 2023-04-12T17:59:04Z

I implemented the algorithm mentioned on Wikipedia to compute the girth of a graph. This request seems to have been dropped a while ago (#1030), hopefully it is still of interest to some of us.

networkx/algorithms/cycles.py

boothby · 2023-04-14T01:25:11Z

This looks good to me, @alabarre. Would you mind adding tests in /networkx/algorithms/tests/test_cycles.py?

alabarre · 2023-04-14T05:51:09Z

This looks good to me, @alabarre. Would you mind adding tests in /networkx/algorithms/tests/test_cycles.py?

I added a few. Mine pass, but others fail:

================================================================================== short test summary info ===================================================================================
FAILED networkx/algorithms/components/tests/test_connected.py::TestConnected::test_connected_components[convert] - KeyError: 'networkx.plugins'
FAILED networkx/algorithms/tests/test_structuralholes.py::TestStructuralHoles::test_constraint_directed[convert] - KeyError: 'networkx.plugins'
======================================================================== 2 failed, 4305 passed, 233 skipped in 48.77s ========================================================================

I'm leaving them alone, just wanted to let you know.

dschult

Thanks for the tests! I just have a couple comments/questions...

networkx/algorithms/tests/test_cycles.py

rossbar

I took the liberty of fixing up the References formatting for the docstring. Generally LGTM, just a few minor suggestions for code/tests, most of which are nits!

networkx/algorithms/cycles.py

networkx/algorithms/tests/test_cycles.py

networkx/algorithms/cycles.py

networkx/algorithms/tests/test_cycles.py

Co-authored-by: Dan Schult <dschult@colgate.edu>

rossbar

I took the liberty of making a few minor changes to help get this over the line:

I merged with main to pick up the latest CI fixes
I added girth to the autosummary so it shows up in the reference docs
A few minor formatting fixups.

This LGTM, thanks @alabarre !

dschult

It looks like the "resolved" comments above were not actually implemented even though agreement was indicated. But they are minor items.

I have 2 concerns:

Shouldn't this have a weight argument? Or is girth only used for unweighted graphs? I guess this could be a first PR with the default weight of 1. But I suspect people will request a weighted version eventually.
I don't think we should be mapping all the edges to frozensets. We should be able to test for an edge in an undirected graph without using the frozensets.

I will push the minor changes to remove frozensets and include the agreed upon comments above that got missed. Hopefully it will become easier to read as well, but we'll see.

I think the weighted version should perhaps be a separate PR.

boothby · 2023-05-19T18:06:08Z

I agree that the weighted version should be a separate PR. Negative weights, for example, would confound this algorithm.

dschult · 2023-05-19T18:25:04Z

Hmmm... I'm not sure this makes sense for negative weights... Can't you just go back and forth across one of the negative edges and get as negative a cycle as you want? Or are we limiting this to simple cycles?

Anyway, I was assuming we would use Dijkstra -- which would require positive weights anyway.

@boothby do you think this PR is ready then? I could approve it, but would feel better with someone else looking at what I did rather than approving it myself. :}

boothby · 2023-05-19T18:38:03Z

I presume that simple cycles are assumed. I hadn't considered the weighted girth problem today, but there is interest in negative weights in the literature.

https://www.sciencedirect.com/science/article/pii/S1570866715001021

I'll re-read this PR today.

dschult · 2023-05-19T23:40:53Z

Oh my! That article is looking for the negative "cost" cycle with the fewest edges -- not with the least cost...
We don't even look for that with our shortest_path suite of functions. I guess if the count of edges represent the objective and the edge weights represent the constraint -- we are getting the least length of those cycles with negative cost.

I would restrict any PR I do to positive weights and use dijkstra to find the shortest paths.

boothby · 2023-05-20T00:04:55Z

Back to you, Dan.

I tightened up the length bound, and reverted a change you made that iterated over all of the edges of G; all in the name of performance. I also added the trick of deleting nodes as we go, and in so doing caught a bug in handling isolated nodes. Sorry; your version was so short and sweet.

boothby · 2023-05-20T01:46:31Z

And for what it's worth, the calls to bfs and dijkstra are throwing away a lot of information and re-computing it. I've implemented this algorithm "my way" in alabarre/networkx@girth...boothby:networkx:girth-faster but I have not been so presumptuous to push a wholesale rewrite.

dschult

I agree that the bfs_edges approach is not ideal because it causes us to go back and find the lengths. We do have at least one function that returns both distance and predecessors. Maybe that could help us avoid including another copy of BFS by using:

    pred, dist = nx.predecessor(G, n, return_seen=True)
    tree = nx.Graph((p[0], u) for u, p in pred.items() if u != n)
    for u, unbrs in tree.adj.items():
        ...

But maybe we should just write another version of BFS here as in your branch.

networkx/algorithms/cycles.py

networkx/algorithms/tests/test_cycles.py

boothby · 2023-05-22T01:18:24Z

Okay, I gave this all a good hard stare. What this algorithm really calls for is a function analogous to dfs_labeled_edges. Here's what I came up with: in our breadth first search, one sees four kinds of edges (tree, forward, back, and level). It's easy to efficiently traverse the graph and label each edge with the following:

BACK_EDGE = "back"
TREE_EDGE = "tree"
FORWARD_EDGE = "forward"
LEVEL_EDGE = "level"
def bfs_labeled_edges(G, sources):
    if sources in G:
        sources = [sources]

    depth = {s: 0 for s in sources}
    queue = deque((0, s) for s in sources)
    label = [LEVEL_EDGE, FORWARD_EDGE, BACK_EDGE]
    while queue:
        du, u = queue.popleft()
        for v in G[u]:
            dv = depth.get(v)
            if dv is None:
                depth[v] = dv = du + 1
                queue.append((dv, v))
                yield u, v, du, dv, TREE_EDGE
            else:
                yield u, v, du, dv, label[dv-du]

this greatly simplifies our girth computation:

def girth(G):
    girth = depth_limit = inf
    skip_labels = (
        nx.algorithms.traversal.breadth_first_search.BACK_EDGE,
        nx.algorithms.traversal.breadth_first_search.TREE_EDGE,
    )
    forward_edge = nx.algorithms.traversal.breadth_first_search.FORWARD_EDGE
    level_edge = nx.algorithms.traversal.breadth_first_search.LEVEL_EDGE
    for n in G:
        for u, v, du, dv, label in nx.bfs_labeled_edges(G, n):
            if du > depth_limit:
                break
            if label in skip_labels:
                pass
            else:
                length = du + dv + 1
                if length < girth:
                    depth_limit = du - (label is level_edge)
                    girth = length

    return girth

This gets my K200 benchmark down to ~25ms. Should I commit it? I'm mostly seeking commentary on bfs_labeled_edges here. The name is probably weird because I'm yielding depth information along with the edges. A more "pure" version might omit that info from the generator and recompute it (quick enough). A version seen as impure could live in cycles.py.

... 
    for n in G
        depth = {n:0}
        for u, v, label in nx.bfs_labeled_edges(G, n):
            if label is TREE_EDGE:
                depth[v] = depth[u] + 1
...

boothby · 2023-05-22T01:21:04Z

It occurs to me that we could do girth(G, parity = None/'even'/'odd') by optionally stuffing an extra label into skip_labels and thinking carefully about the stopping criterion.

dschult · 2023-05-22T14:08:23Z

I agree that we should create a bfs_labeled_edges function to match dfs_labeled_edges. It allows nice flexibility for the traversals. I have thought that before, but this seems like a good time to do it. I had always thought it should live in traversal/breadth_first_search.py, but could maybe be convinced otherwise. I also think it should work with directed and undirected graphs -- and probably multigraphs too if that doesn't prove to be too nasty.

As I think about edge types I'm confused by the BACK_EDGEs. At first I couldn't get them to happen. But I think now that I've convinced myself that they will not occur for undirected graphs. And they definitely can occur for directed graphs. For the undirected graphs, you would always traverse an edge outward before it would be reached going inward.

So, while I very much enjoyed(!) the triplet of edge labels indexed by dv-du, I think it isn't needed for the undirected case and for directed case we'll need to check if dv-du is negative rather than -1. Does this make sense?

I'm still thinking about whether to have the labels yield the distances -- it is certainly convenient, but also (as you mention) likely to have a way to compute it based on the previous edges. I'm leaning toward a pure version in traversal, and then the girth code (and maybe a doc_string example in the bfs_labeled_edges doc_string?) can show how to compute the distances. But I'd like to think/play about it some more. :}

boothby · 2023-05-22T14:53:47Z

As I drifted off to sleep last night, the directed case occurred to me. There, back-jumps can be arbitrarily long so my triplet is certainly invalid.

I think you're assuming that each edge should occur exactly once, and concluding that back-edges won't happen in the undirected case. And that's certainly achievable, but currently my code emits each undirected edge exactly twice.

Perhaps this?

def _undirected_bfs_labeled_edges(G, sources):
    if sources in G:
        sources = [sources]

    depth = {s: 0 for s in sources}
    visited = set()
    queue = deque((0, s) for s in sources)
    while queue:
        du, u = queue.popleft()
        visited.add(u)
        for v in G[u]:
            dv = depth.get(v)
            if dv is None:
                depth[v] = dv = du + 1
                queue.append((dv, v))
                yield u, v, TREE_EDGE
            else:
                if du == dv:
                    # could use either "in" or "not in" -- this emits self-loops
                    if v in visited:
                        yield u, v, LEVEL_EDGE
                elif du < dv:
                    yield u, v, FORWARD_EDGE

def _directed_bfs_labeled_edges(G, sources):
    if sources in G:
        sources = [sources]

    depth = {s: 0 for s in sources}
    queue = deque((0, s) for s in sources)
    while queue:
        du, u = queue.popleft()
        for v in G[u]:
            dv = depth.get(v)
            if dv is None:
                depth[v] = dv = du + 1
                queue.append((dv, v))
                yield u, v, TREE_EDGE
            elif du < dv:
                yield u, v, FORWARD_EDGE
            elif du > dv:
                yield u, v, BACK_EDGE
            else:
                yield u, v, LEVEL_EDGE

* added new helper function bfs_labeled_edges * added support for even girth and odd girth

boothby · 2023-05-22T22:47:49Z

Oh, drat. I was hasty in adding the option for odd&even girth. Thanks, Petersen Graph, for yet again showing us the way.

dschult

Nice!! :)

Whatever the choice for self-loops (see below), it looks like we don't need to separate directed and undirected. The use of visited should apply in the same way to both types of graphs -- so that self.loops are handled in both cases. And the case du > dv will not arise in the undirected case.

Thanks!

networkx/algorithms/traversal/tests/test_bfs.py

networkx/algorithms/traversal/breadth_first_search.py

boothby · 2023-05-23T15:15:33Z

The use of visited should apply in the same way to both types of graphs -- so that self.loops are handled in both cases

Self-loops are reported in both directed and undirected right now. The directed case does not need to track visited because edges are only traversed in a single direction. I'm reluctant to use the extra memory when not needed.

In the undirected case, every edge is traversed twice, so we need to use visited or another construct to filter edges from occurring twice (or, produce "reverse" edges -- which is also fine, but then self-loops would be reported only once). By using the visited filter, there are two choices

mark u as visited before or after iterating over G[u]
check if v in visited or v not in visited

Toggling either choice changes whether or not loops are reported in the undirected case.

boothby · 2023-05-24T22:41:01Z

new benchmark

head    K200   C1000  CC1019
5f516c6 1.7005 3.5432 0.4079 
a9f9343 1.6351 3.5444 0.4078 
bb815f6 1.7013 3.5563 0.4143 
e85f26c 1.6755 3.4549 0.3939 
344f5ea 1.6772 3.4542 0.4120 
25a4ae7 1.6477 3.4241 0.3932 
c2ac2ff 1.6864 3.5351 0.4074 
282312e 1.7071 3.5349 0.4011 
722039a 1.6862 3.4973 0.3965 
b0a63af 3.8599 2.8787 0.7263 
07e673a 0.2681 1.1041 0.0331 
54f2921 0.2803 1.1071 0.0333 
7b21cc9 0.2475 0.5321 0.0141 
40dde43 0.0340 0.5402 0.0107 
1af2d48 0.0132 1.3616 0.0120 
db297d4 0.0132 0.7919 0.0073 
2a60dd8 0.0135 0.8059 0.0076 
2b2516c 0.0134 0.7952 0.0077 
c789b37 0.0131 0.7806 0.0072 
0f44580 0.0136 0.8134 0.0078 
0d99c88 0.0138 0.8153 0.0077 
c101cf2 0.0109 0.7663 0.0069 
bcae004 0.0104 0.7665 0.0080
0e426c8 0.0093 0.4212 0.0063

from networkx import girth, complete_graph, cycle_graph, chordal_cycle_graph, Graph
from time import perf_counter

test_graphs = complete_graph(200), cycle_graph(1000), Graph([(u, v) for u, v, _ in chordal_cycle_graph(1019).edges if u != v])
g = girth
for G in test_graphs:
    t0, *girths, t1 = (
        perf_counter(),
        g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
        g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
        g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
        g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
        g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
        g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
        g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
        g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
        g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
        g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G), g(G),
        perf_counter()
    )
    print(f"{(t1-t0)/len(girths):.4f}", end=' ')
print()

boothby · 2023-05-25T22:32:13Z

I just noticed that this ticket is (now) related to #6340 and #6359 -- my approach is slightly different; it's worth looking at them holistically.

boothby · 2023-05-28T01:55:38Z

@dschult or @rossbar please take a (hopefully) final look. I should really be done picking at this.

dschult

Thanks for this @alabarre I bet you didn't expect this explosion of activity, but that shows that it is an interesting issue -- and it got us thinking about more than just girth. :)

Thanks @boothby for making this shine! :}

alabarre · 2023-05-29T12:16:35Z

Indeed I didn’t. I had no time to review or chime in on the latest developments, but it’s exciting to see what this humble snippet has become. Thanks to everyone involved, I’m looking forward to our next collaborations.

* added girth computation function * added tests for girth * removed useless method and added seed to random call as recommended * Add numpydoc reference formatting to docstring. * Apply suggestions from code review * Update networkx/algorithms/tests/test_cycles.py Co-authored-by: Dan Schult <dschult@colgate.edu> * Update networkx/algorithms/tests/test_cycles.py * Blacken. * Add girth fn to reference. * avoid frozenset and include other comments * fix import order * performance improvement to girth; added test for isolated nodes * Update networkx/algorithms/cycles.py Co-authored-by: Dan Schult <dschult@colgate.edu> * took Dan's suggestion to use nx.predecessor to obviate nx.single_source... * reordered computation to only find one min-length (even/odd) cycle per root * copying is no longer worthwhile for runtime * rewrote girth for speed and hopefully clarity * added new helper function bfs_labeled_edges * added support for even girth and odd girth * doctest typos * doctest fix * doctests were a mistake * removed faulty parity argument * patched up un-tested unit test * unified directed & undirected bfs_labeled_edges * touchup on bfs_labeled_edges; added _dispatch * more performance improvements --------- Co-authored-by: Ross Barnowski <rossbar@berkeley.edu> Co-authored-by: Mridul Seth <mail@mriduls.com> Co-authored-by: Dan Schult <dschult@colgate.edu> Co-authored-by: Kelly Boothby <boothby@dwavesys.com>

added girth computation function

5f516c6

rossbar reviewed Apr 13, 2023

View reviewed changes

networkx/algorithms/cycles.py Outdated Show resolved Hide resolved

added tests for girth

a9f9343

dschult reviewed Apr 14, 2023

View reviewed changes

networkx/algorithms/tests/test_cycles.py Outdated Show resolved Hide resolved

networkx/algorithms/tests/test_cycles.py Outdated Show resolved Hide resolved

alabarre and others added 2 commits April 14, 2023 22:28

removed useless method and added seed to random call as recommended

bb815f6

Add numpydoc reference formatting to docstring.

e85f26c

rossbar reviewed Apr 26, 2023

View reviewed changes

MridulS reviewed May 2, 2023

View reviewed changes

networkx/algorithms/cycles.py Outdated Show resolved Hide resolved

MridulS reviewed May 2, 2023

View reviewed changes

networkx/algorithms/tests/test_cycles.py Outdated Show resolved Hide resolved

Apply suggestions from code review

80e9259

dschult reviewed May 2, 2023

View reviewed changes

networkx/algorithms/tests/test_cycles.py Outdated Show resolved Hide resolved

rossbar and others added 5 commits May 18, 2023 23:29

Update networkx/algorithms/tests/test_cycles.py

344f5ea

Co-authored-by: Dan Schult <dschult@colgate.edu>

Update networkx/algorithms/tests/test_cycles.py

25a4ae7

Blacken.

c2ac2ff

Merge branch 'main' into girth

282312e

Add girth fn to reference.

722039a

rossbar approved these changes May 19, 2023

View reviewed changes

avoid frozenset and include other comments

1d4ec6f

dschult reviewed May 19, 2023

View reviewed changes

fix import order

b0a63af

performance improvement to girth; added test for isolated nodes

07e673a

dschult reviewed May 20, 2023

View reviewed changes

networkx/algorithms/cycles.py Outdated Show resolved Hide resolved

networkx/algorithms/tests/test_cycles.py Outdated Show resolved Hide resolved

rewrote girth for speed and hopefully clarity

db297d4

* added new helper function bfs_labeled_edges * added support for even girth and odd girth

boothby force-pushed the girth branch from 7275e00 to db297d4 Compare May 22, 2023 16:44

doctest typos

2a60dd8

boothby force-pushed the girth branch from 4f9b381 to 2a60dd8 Compare May 22, 2023 17:38

boothby added 2 commits May 22, 2023 10:54

doctest fix

2b2516c

doctests were a mistake

c789b37

removed faulty parity argument

0f44580

dschult reviewed May 23, 2023

View reviewed changes

networkx/algorithms/traversal/tests/test_bfs.py Show resolved Hide resolved

dschult reviewed May 23, 2023

View reviewed changes

networkx/algorithms/traversal/breadth_first_search.py Outdated Show resolved Hide resolved

patched up un-tested unit test

0d99c88

unified directed & undirected bfs_labeled_edges

c101cf2

boothby force-pushed the girth branch from 48642bf to c101cf2 Compare May 24, 2023 04:23

touchup on bfs_labeled_edges; added _dispatch

bcae004

more performance improvements

0e426c8

dschult approved these changes May 28, 2023

View reviewed changes

dschult merged commit af0a6e2 into networkx:main May 28, 2023
34 checks passed

jarrodmillman added this to the 3.2 milestone Jun 4, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added girth computation function #6633

Added girth computation function #6633

alabarre commented Apr 12, 2023

boothby commented Apr 14, 2023

alabarre commented Apr 14, 2023 •

edited

dschult left a comment

rossbar left a comment

rossbar left a comment

dschult left a comment •

edited

boothby commented May 19, 2023

dschult commented May 19, 2023

boothby commented May 19, 2023

dschult commented May 19, 2023

boothby commented May 20, 2023

boothby commented May 20, 2023

dschult left a comment •

edited

boothby commented May 22, 2023 •

edited

boothby commented May 22, 2023 •

edited

dschult commented May 22, 2023

boothby commented May 22, 2023 •

edited

boothby commented May 22, 2023

dschult left a comment

boothby commented May 23, 2023

boothby commented May 24, 2023 •

edited

boothby commented May 25, 2023 •

edited

boothby commented May 28, 2023

dschult left a comment

alabarre commented May 29, 2023

Added girth computation function #6633

Added girth computation function #6633

Conversation

alabarre commented Apr 12, 2023

boothby commented Apr 14, 2023

alabarre commented Apr 14, 2023 • edited

dschult left a comment

Choose a reason for hiding this comment

rossbar left a comment

Choose a reason for hiding this comment

rossbar left a comment

Choose a reason for hiding this comment

dschult left a comment • edited

Choose a reason for hiding this comment

boothby commented May 19, 2023

dschult commented May 19, 2023

boothby commented May 19, 2023

dschult commented May 19, 2023

boothby commented May 20, 2023

boothby commented May 20, 2023

dschult left a comment • edited

Choose a reason for hiding this comment

boothby commented May 22, 2023 • edited

boothby commented May 22, 2023 • edited

dschult commented May 22, 2023

boothby commented May 22, 2023 • edited

boothby commented May 22, 2023

dschult left a comment

Choose a reason for hiding this comment

boothby commented May 23, 2023

boothby commented May 24, 2023 • edited

boothby commented May 25, 2023 • edited

boothby commented May 28, 2023

dschult left a comment

Choose a reason for hiding this comment

alabarre commented May 29, 2023

alabarre commented Apr 14, 2023 •

edited

dschult left a comment •

edited

dschult left a comment •

edited

boothby commented May 22, 2023 •

edited

boothby commented May 22, 2023 •

edited

boothby commented May 22, 2023 •

edited

boothby commented May 24, 2023 •

edited

boothby commented May 25, 2023 •

edited