SCC benchmarks and use of G._adj in Tarjan algorithm #8064

dschult merged 5 commits into networkx:main
amcandio
left a comment
Looks good to me! I wonder if we should make _adj "public" at this point
One of the reasons to keep it private: the `adj` view blocks item assignment, while writing through `_adj` directly can silently corrupt the graph:

```
In [1]: import networkx as nx

In [2]: G = nx.path_graph(4)

In [3]: G.adj[1] = 2
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[3], line 1
----> 1 G.adj[1] = 2

TypeError: 'AdjacencyView' object does not support item assignment

In [4]: G._adj[1] = 2

In [5]: print(G)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[5], line 1
----> 1 print(G)
.....
.....
TypeError: object of type 'int' has no len()
```

If required you can always use it. What we could probably do is document this speedup, if we haven't already!
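For read-only access, though, the raw dict and the view agree; a small sketch (variable names are mine, not from the PR) contrasting the two and showing that mutation should still go through the `Graph` API:

```python
import networkx as nx

G = nx.path_graph(4)

# Read-only access: G._adj (the raw dict-of-dicts) and G.adj (the
# read-only AdjacencyView wrapping it) expose the same neighbors;
# the raw dict just skips the view's wrapper overhead.
neighbors_view = {v: set(G.adj[v]) for v in G}
neighbors_raw = {v: set(G._adj[v]) for v in G}
assert neighbors_view == neighbors_raw

# Mutation must still go through the Graph API so that both
# directions of each undirected edge stay consistent.
G.add_edge(0, 3)
assert 0 in G._adj[3] and 3 in G._adj[0]
```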
```python
_graphs = [
    nx.erdos_renyi_graph(100, 0.1, directed=True),
    nx.erdos_renyi_graph(100, 0.5, directed=True),
    nx.erdos_renyi_graph(100, 0.9, directed=True),
    nx.erdos_renyi_graph(1000, 0.01, directed=True),
    nx.erdos_renyi_graph(1000, 0.05, directed=True),
    nx.erdos_renyi_graph(1000, 0.09, directed=True),
```
These values aren't actually sufficient for measuring SCC performance at all - the graphs are too small and the edge probability is (way) too high. I'll play around with finding better values. In terms of execution time (per benchmark), what's the upper limit of what we'd want to add? A few seconds? Up to a minute?
My personal preference would be to keep runtimes of benchmarks manageable, at least within one collection of benchmarks.
Long-running benchmarks are okay in principle, but at the very least I'd advocate for them to be split off into their own class, which makes them easier to de-select with the asv CLI.
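A minimal sketch of that split, following asv's convention of one class per benchmark group (class names, graph sizes, and probabilities here are illustrative, not from this PR; as far as I can tell, asv's `timeout` applies per benchmark, not per suite):

```python
import networkx as nx


class SCCBenchmarksQuick:
    """Fast cases; cheap enough to run on every `asv continuous`."""

    def setup(self):
        self.G = nx.erdos_renyi_graph(100, 0.1, directed=True)

    def time_strongly_connected_components(self):
        # Consume the generator so the whole algorithm actually runs.
        list(nx.strongly_connected_components(self.G))


class SCCBenchmarksSlow:
    """Long-running cases, split into their own class so they are
    easy to de-select, e.g. `asv run --bench SCCBenchmarksQuick`
    skips this class entirely."""

    timeout = 120  # seconds, applied by asv to each benchmark

    def setup(self):
        self.G = nx.erdos_renyi_graph(2000, 0.005, directed=True)

    def time_strongly_connected_components(self):
        list(nx.strongly_connected_components(self.G))
```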
That makes sense. I'll try to keep the benchmarks so they don't go beyond the 120 second timeout (which I assume refers to the whole suite, not each individual test?).
In the spirit of your second paragraph: how come the benchmarks/ folder doesn't roughly mirror the structure of the main folder?
> These values aren't actually sufficient for measuring SCC performance at all - the graphs are too small and the edge probability is (way) too high. I'll play around with finding better values.
I recently ran into a similar issue when benchmarking Dijkstra's algorithm. You might need some engineering to get a challenging graph. One approach I can think of here is:
- Generate a random DAG of size S
- Randomly assign each node of the graph to a node in the DAG
- For each candidate edge (u, v), add it to the graph only if random() < p and either DAG[u] == DAG[v] or (DAG[u], DAG[v]) is an edge of the DAG

This way you guarantee that your graph has at least S strongly connected components.
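The construction above could be sketched like this (the function name and parameters are mine, not part of networkx; inter-block edges can only follow the DAG's order, so no SCC spans two blocks):

```python
import random

import networkx as nx


def scc_challenge_graph(s, n, p, seed=None):
    """Sketch: guarantee many SCCs by projecting a random n-node
    digraph onto a random s-node DAG, as described above."""
    rng = random.Random(seed)

    # 1. A random DAG on s nodes: orient every edge from the lower
    #    label to the higher one, which rules out cycles.
    dag = nx.DiGraph()
    dag.add_nodes_from(range(s))
    for u in range(s):
        for v in range(u + 1, s):
            if rng.random() < 0.5:
                dag.add_edge(u, v)

    # 2. Randomly assign each of the n graph nodes to a DAG node.
    block = {v: rng.randrange(s) for v in range(n)}

    # 3. Keep a candidate edge (u, v) only when it respects the DAG:
    #    both endpoints share a block, or their blocks form a DAG edge.
    G = nx.DiGraph()
    G.add_nodes_from(range(n))
    for u in range(n):
        for v in range(n):
            if u == v:
                continue
            if rng.random() < p and (
                block[u] == block[v] or dag.has_edge(block[u], block[v])
            ):
                G.add_edge(u, v)
    return G
```

Since every cross-block edge points "down" the DAG, each strongly connected component lives inside a single block, so a graph built this way has at least one SCC per nonempty block.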
rossbar
left a comment
Okay, the benchmarks are now probing the behavior as expected. I see minor speedups for some benchmark cases (up to 20%) on my machine and, more importantly, no performance degradations! This is expected, since the only code change is switching from `G[v]` to `G._adj[v]`, which should only improve performance.
As a quick reference for anyone who wants to run the benchmarks locally, try:

```
asv continuous --bench DirectedAlgorithmBenchmarks main enh/benchmark-scc
```

Once that's completed, you can run `asv compare main enh/benchmark-scc` to get a quick summary of the performance comparisons!
Thanks for putting this together @Peiffap !
Thanks for taking a look, @rossbar. It'd be good to get this in to unblock #8056! Thanks for the note on running benchmarks locally, too. This kind of information around benchmarking should be part of the contributor's guide, IMO! Performance is a big consideration in most recent PRs and it's a shame that the information on how to measure it properly is so hard to find. |
dschult
left a comment
This looks good -- and nice to have the timing tools set up for directed case here.
* Use `G._adj` in inner loops of SCC algorithm
* Add benchmarks for SCC algorithms
* Use neighbor iterator to avoid re-traversing neighbors
* Make generators iterate fully

Co-authored-by: Ross Barnowski <rossbar@caltech.edu>
This PR is a minified version of #8056 that removes the changes to Kosaraju's algorithm to simplify review of both, i.e. it:

- uses `G._adj` in `strongly_connected_components`.
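The pattern the PR applies can be illustrated on a plain iterative DFS (this is an illustrative sketch, not networkx's actual Tarjan implementation): bind `G._adj` to a local once, then index the raw dict in the hot loop instead of going through `G[v]`:

```python
import networkx as nx


def reachable_from(G, source):
    """Nodes reachable from `source` via an iterative DFS.

    Illustrative only: shows the micro-optimization of hoisting
    G._adj out of the inner loop rather than calling G[v] each time.
    """
    adj = G._adj  # one attribute lookup, hoisted out of the loop
    seen = {source}
    stack = [source]
    while stack:
        v = stack.pop()
        for w in adj[v]:  # cheaper than `for w in G[v]` in a hot loop
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen
```

Usage: on `nx.path_graph(4, create_using=nx.DiGraph)` (edges 0→1→2→3), `reachable_from(G, 0)` yields all four nodes, while `reachable_from(G, 2)` yields only `{2, 3}`.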