simplify stack in dfs #6366

Merged
merged 5 commits into networkx:main on May 27, 2023

Conversation

@Tortar (Contributor) commented Jan 15, 2023

The stack in dfs only needs to hold parent and children; the depth can be controlled from outside of it. This makes the function slightly better in terms of time and space, and in my opinion also readability :-)
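
A minimal sketch of the proposed shape, for readers skimming the thread (this is not the PR's exact diff; the function name, the depth_limit default, and the behaviour at the depth limit are simplifications made here for illustration):

    import networkx as nx

    def dfs_edges_sketch(G, source, depth_limit=None):
        # The stack holds only (parent, children-iterator) pairs; the current
        # depth lives in one local counter instead of in every stack entry.
        if depth_limit is None:
            depth_limit = len(G)
        visited = {source}
        depth_now = 0
        stack = [(source, iter(G[source]))]
        while stack:
            parent, children = stack[-1]
            try:
                child = next(children)
            except StopIteration:
                stack.pop()          # this branch is exhausted, retreat a level
                depth_now -= 1
                continue
            if child not in visited:
                yield parent, child
                visited.add(child)
                if depth_now < depth_limit:
                    stack.append((child, iter(G[child])))
                    depth_now += 1
                # at the limit the edge is still yielded but not expanded
                # (one of several possible depth-limit conventions)

    G = nx.path_graph(5)
    print(list(dfs_edges_sketch(G, source=0)))  # [(0, 1), (1, 2), (2, 3), (3, 4)]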

@rossbar (Contributor) left a comment

I personally find it harder to reason about with the depth tracked separately outside of the stack, but that's admittedly pretty subjective. Out of curiosity what's the primary motivation for the proposed change?
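
(Roughly, the state-in-the-stack shape being referred to looks like the sketch below; this is a from-memory approximation for contrast, not necessarily the exact code on main.)

    import networkx as nx

    def dfs_edges_state_in_stack(G, source, depth_limit):
        # Each stack entry carries its own remaining depth alongside the node
        # and its children iterator, so no separate counter is needed.
        visited = {source}
        stack = [(source, depth_limit, iter(G[source]))]
        while stack:
            parent, depth_now, children = stack[-1]
            try:
                child = next(children)
            except StopIteration:
                stack.pop()
                continue
            if child not in visited:
                yield parent, child
                visited.add(child)
                if depth_now > 1:
                    stack.append((child, depth_now - 1, iter(G[child])))

    G = nx.path_graph(4)
    print(list(dfs_edges_state_in_stack(G, source=0, depth_limit=len(G))))
    # [(0, 1), (1, 2), (2, 3)]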

@Tortar (Contributor, Author) commented Jan 21, 2023

From the speed point of view I measured anywhere from a tiny 3% improvement to a somewhat more relevant 15% in the best case, so not that much. I think that memory-wise the stack this way consumes probably 1/3 less space, which is good given that it is the second data structure in this method in terms of memory occupied. But all in all it is not much of a difference; the primary motivation was that I found it interesting to think about this optimization :D
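
(For anyone who wants to repeat such a measurement, a rough timing sketch; the graph size and repeat counts below are arbitrary choices, not the setup actually used here:)

    # run once on main and once on this PR's branch, then compare the numbers
    import timeit

    import networkx as nx

    G = nx.path_graph(100_000)
    best = min(timeit.repeat(lambda: list(nx.dfs_edges(G, source=0)), number=10, repeat=5))
    print(f"best of 5 repeats: {best:.3f} s per 10 traversals")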

@rossbar (Contributor) commented Jan 21, 2023

I think that memory-wise the stack this way consumes probably 1/3 less space

Well, at least 1/3 the number of objects, which is not the same as memory, since depth_now is just an integer. I was curious what this would look like in practice, so I tested with path_graph(10_000_000)¹ and used memray for memory profiling:

On main: [memray flamegraph screenshot: mem_main]

This PR: [memray flamegraph screenshot: mem_patch8]

So that's a 400 MB reduction in total usage and 300 fewer allocations for this call (which should help with performance), but accounting for the other changes it ends up being about 300 MB saving for dfs_edges (out of 6.8 GB, so ~4%).

Sorry for the bad screengrab quality. To reproduce (though you may want to use fewer nodes on systems with <16 GB RAM):

  1. Create a script, e.g. dfs_memtest.py

    import networkx as nx
    G = nx.path_graph(10_000_000)
    e = list(nx.dfs_edges(G, source=0))
  2. Follow the memray instructions to create the flamegraph
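
For what it's worth, the memray capture can also be driven from Python through its Tracker API instead of the memray run CLI; a minimal sketch (the output filename is an arbitrary choice):

    import networkx as nx
    from memray import Tracker

    # record allocations made while the traversal runs, then render the
    # resulting capture file with memray's flamegraph reporter
    with Tracker("dfs_memtest.bin"):
        G = nx.path_graph(10_000_000)
        e = list(nx.dfs_edges(G, source=0))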

Footnotes

  1. The reasoning here was that this graph would maximize the memory saving, as each node only ever has one child, and the nodes are also integers.

@Tortar (Contributor, Author) commented Jan 22, 2023

That's interesting, thanks for the flamegraph and the explanation of how to reproduce it! Do you think it is worth it? I don't consider the code less readable; in my opinion it is actually clearer with this edit at what level of the dfs we are.

@rossbar (Contributor) commented Jan 22, 2023

it is actually clearer with this edit

This is inherently subjective - there is no "right" way to do it. The reason I find it more readable is that the state (i.e. the current depth) is tracked with the current node, whereas the proposed change tracks the state separately. Though again, this is entirely subjective, so we're unlikely to make much progress discussing it further :)

A minor improvement to memory consumption is a more concrete justification for the proposed change. I'd like to hear what others think!

@Tortar (Contributor, Author) commented Jan 22, 2023

Indeed, I was too assertive; I meant to say that it is easier to track when we change levels in the dfs. But anyway, I agree that the objective motivation for the change has to be weighed against the small improvement in memory/time!

@ImHereForTheCookies (Contributor)

This makes sense to me. 👍

@boothby (Contributor) commented May 26, 2023

FWIW I agree that a local value is preferable to loading the stack down with longer state tuples. I prefer a slightly different idiom than the one used here:

    while stack:
        parent, children = stack[-1]
        try:
            child = next(children)
            ....
        except StopIteration:
            stack.pop()

# without try/except

    while stack:
        parent, children = stack[-1]
        for child in children:
            ...
            break
        else:
            stack.pop()

By avoiding exceptions, the for/break/else idiom tends to be a little faster.
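
To make that concrete, a small micro-benchmark of the two idioms in isolation (names and numbers here are illustrative only; the real effect depends on the Python version and the graph shape):

    import timeit

    def next_child_try(children):
        # try/except idiom: pay for an exception when the iterator is exhausted
        try:
            return next(children)
        except StopIteration:
            return None  # the caller would stack.pop() here

    def next_child_for(children):
        # for/break/else idiom: the body runs at most once
        for child in children:
            return child  # plays the role of the break in the DFS loop
        return None       # body never ran: iterator exhausted, caller pops

    empty = ()
    print(timeit.timeit(lambda: next_child_try(iter(empty)), number=1_000_000))
    print(timeit.timeit(lambda: next_child_for(iter(empty)), number=1_000_000))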

@boothby (Contributor) commented May 27, 2023

@Tortar I have implemented my suggestion here: https://github.com/boothby/networkx/tree/tortar-patch-8 -- I would push the changes to your repo but the pre-commit hooks are currently faulty in your branch (not your fault; easily resolved with git rebase -i main) -- after rebasing I cannot push to your branch without --force which is awfully impolite.

Note that we only need to break when we grow the stack -- the for/break/else pattern is most effective in dfs_labeled_edges, where non-tree edges are yielded "for free" within the iterator, resulting in fewer touches on the stack:

        while stack:
            parent, children = stack[-1]
            for child in children:
                if child in visited:
                    yield parent, child, "nontree"
                else:
                    yield parent, child, "forward"
                    visited.add(child)
                    if depth_now < depth_limit:
                        stack.append((child, iter(G[child])))
                        depth_now += 1
                        break
                    else:
                        yield parent, child, "reverse-depth_limit"
            else:
                stack.pop()
                depth_now -= 1
                if stack:
                    yield stack[-1][0], parent, "reverse"
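
For reference, a quick way to see the labels dfs_labeled_edges produces (the graph choice is arbitrary; a cycle is used so that both tree and non-tree edges appear):

    import networkx as nx

    G = nx.cycle_graph(4)
    for u, v, label in nx.dfs_labeled_edges(G, source=0):
        print(u, v, label)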

@Tortar (Contributor, Author) commented May 27, 2023

Thanks @boothby for the improvements! I will push them in a moment and make you a co-author of the pull :-)

@dschult (Member) left a comment

There is also #6714 which looks to fix #6479...

I think we can/should go ahead and merge these changes as a first step -- and then address the strange behavior seen with depth-limited search.

I'm approving this and I'll have @boothby press the green button. There are a lot of PRs related to this and the order may impact merge conflicts. Best to have control in those cases I guess. :}

Thanks!

@boothby boothby merged commit 8410c37 into networkx:main May 27, 2023
34 checks passed
@jarrodmillman jarrodmillman added this to the 3.2 milestone Jun 4, 2023
Alex-Markham pushed a commit to Alex-Markham/networkx that referenced this pull request Oct 13, 2023
* simplify stack in dfs

* Update depth_first_search.py

* Update depth_first_search.py

* Add suggestions

Co-authored-by: boothby <kelly.r.boothby@gmail.com>

* Update depth_first_search.py

---------

Co-authored-by: boothby <kelly.r.boothby@gmail.com>
dschult pushed a commit to BrunoBaldissera/networkx that referenced this pull request Oct 23, 2023
cvanelteren pushed a commit to cvanelteren/networkx that referenced this pull request Apr 22, 2024