nx.dfs_labeled_edges does not return last visited edge if depth_limit is specified #6239

ikarsokolov · 2022-11-26T08:09:13Z

nx.dfs_labeled_edges does not return last (deepest) visited edge when traversing backwards if depth_limit is specified.

Current Behavior

graph = nx.path_graph(5, nx.DiGraph)
list(nx.dfs_labeled_edges(graph, source=0))

[(0, 0, 'forward'),
 (0, 1, 'forward'),
 (1, 2, 'forward'),
 (2, 3, 'forward'),
 (3, 4, 'forward'),
 (3, 4, 'reverse'),
 (2, 3, 'reverse'),
 (1, 2, 'reverse'),
 (0, 1, 'reverse'),
 (0, 0, 'reverse')]

list(nx.dfs_labeled_edges(graph, source=0, depth_limit=4))

[(0, 0, 'forward'),
 (0, 1, 'forward'),
 (1, 2, 'forward'),
 (2, 3, 'forward'),
 (3, 4, 'forward'),
 (2, 3, 'reverse'),
 (1, 2, 'reverse'),
 (0, 1, 'reverse'),
 (0, 0, 'reverse')]

Note that edge (3, 4, 'reverse') is missing.

Expected Behavior

graph = nx.path_graph(5, nx.DiGraph)
list(nx.dfs_labeled_edges(graph, source=0, depth_limit=4))

[(0, 0, 'forward'),
 (0, 1, 'forward'),
 (1, 2, 'forward'),
 (2, 3, 'forward'),
 (3, 4, 'forward'),
 (3, 4, 'reverse'),
 (2, 3, 'reverse'),
 (1, 2, 'reverse'),
 (0, 1, 'reverse'),
 (0, 0, 'reverse')]

Environment

Python version: 3.10
NetworkX version: 2.8.8 and 3.0rc1

The text was updated successfully, but these errors were encountered:

dschult · 2022-11-26T17:13:07Z

I can verify that there is a bug in dfs_labeled_edges for the reporting of reverse edges at the requested depth_level. The fix is to add an else clause for the check on depth_now > 1. And there are almost no tests currently for "reverse" edge reporting in the test suite. So we should add those.

dschult · 2022-11-26T20:28:21Z

This may not be a bug afterall... more like a question about the definition of a "reverse" edge.

A "reverse" edge is one in which both u and v have been visited and the edge is in the DFS tree. But with the depth_limit, we don't actually visit node v when we move forward along (u, v)... we reach the depth limit and stop that branch. Because we haven't visited that node, some people could argue that we shouldn't report the "reverse" edge.

One result is that for a fixed depth_limit, the preorder of nodes reports one depth more of nodes than the post-order.

But we can define the labeled DFS edges in either manner. For depth_limit 2, we could report the forward and reverse edges that we would-have-seen without a depth_limit for the edges that we touch. Or we could report the reverse edges only if we actually visit the node at the far end of the edge. I can't find any good definition of what constitutes a reverse edge (also called a back-edge) when there is a depth_limit to the traversal.

@ikarsokolov can you describe the application you are working with and why you expect there to be a reverse edge for each forward edge when the depth_limit is set?

I can see an advantage to ensuring that "reverse" edge is reported for each "forward" edge. You gain information about when you stopped looking at a node -- but you don't have information about whether the DFS had finished visiting that node or you reached the depth_limit. We can easily limit the impact of this change to just this function. What is the right thing to report?

ikarsokolov · 2022-11-26T21:51:05Z

@dschult thanks for quick and detailed responses!

can you describe the application you are working with and why you expect there to be a reverse edge for each forward edge when the depth_limit is set?

Simple example:
I have tree and want to do DFS from the root. I also have blacklist of nodes that should be excluded from DFS result with their complete subtrees. Something like "I want filtered iterator over tree without these particular nodes and all their children".

I listen to traversal events:

"forward" edge pointed to blacklisted node => blacklisted subtree traversal is started, subsequent nodes should be dropped from the result.
Matching "reverse" edge => exit from blacklisted node, subtree traversal is finished, subsequent nodes should be kept in the result.

If depth_limit is set and blacklisted node happens to be one level above limit I will never receive signal that node + it's subtree traversal is complete.

A "reverse" edge is one in which both u and v have been visited and the edge is in the DFS tree. But with the depth_limit, we don't actually visit node v when we move forward along (u, v)... we reach the depth limit and stop that branch. Because we haven't visited that node, some people could argue that we shouldn't report the "reverse" edge.

I see your point. I think in this case we should have another label besides "forward". Something like "examine".
See how this is done in boost graph library.
They handled traversal with visitor pattern but the principle is the same.

vis.examine_edge(e, g) is invoked on every out-edge of each vertex after it is discovered.
vis.tree_edge(e, g) is invoked on each edge as it becomes a member of the edges that form the search tree. If you wish to record predecessors, do so at this event point.
vis.back_edge(e, g) is invoked on the back edges in the graph.

examine_edge is missing label in NetworkX implementation.

The only drawback I see here is wasted computational power. "Examined" nodes on last depth level will not be included in final result. At least on my naive implementation. My graph has about ~1M nodes so a lot of nodes can be enumerated here,

dschult · 2022-11-27T08:52:47Z

With our code, you can get the "examine" trait of an edges by looking for etype in ("forward", "nontree"). So, I'm not sure it helps all that much. The difference I see is that the boost code has a "finish_vertex" moment. That's really what you are looking for: a point when you can be sure the tree from that node has been explored completely. That's what we use "reverse" to indicate.

But we have to figure out whether we consider the subtree-from-that-node to have been fully explored when the depth_limit stopped the exploration. Currently, we don't indicate that we stopped exploring the tree-from-that-node. I think it would be good to report the edge as "reverse-depth_limit" to make a note that a reverse edge should be indicated (we are done exploring that node), but that it isn't the same as a normal "reverse" edge. I'll put together a PR for this.

In the meantime, a workaround is for you to track the depth yourself, and when a "forward" edge reaches the depth_limit, the corresponding "reverse" edge is guaranteed to immediately follow that "forward" edge. So, you don't have to enter blacklisted mode at all. You know that no subtree nodes will be reported.
Something like:

if etype == "forward":
    if v in excluded_nodes:
        if depth < depth_limit:  # no need to blacklist the subtree if depth == depth_limit
            blacklisted_subtree = True

Fixes networkx#6239

ikarsokolov · 2022-11-28T19:50:28Z

@dschult I tried your patch and it works exactly like I wanted.
Thanks.

dschult added a commit to dschult/networkx that referenced this issue Nov 27, 2022

Add dfs_labeled_edges reporting of reverse edges due to depth_limit.

a8cde41

Fixes networkx#6239

dschult mentioned this issue Nov 27, 2022

Add dfs_labeled_edges reporting of reverse edges due to depth_limit. #6240

Merged

MridulS closed this as completed in #6240 Dec 1, 2022

dschult mentioned this issue Mar 11, 2023

dfs_postorder_nodes doesn't work with depth_limit #6479

Open

dschult mentioned this issue May 29, 2023

depth limited search labeled edges #6716

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

nx.dfs_labeled_edges does not return last visited edge if depth_limit is specified #6239

nx.dfs_labeled_edges does not return last visited edge if depth_limit is specified #6239

ikarsokolov commented Nov 26, 2022

dschult commented Nov 26, 2022

dschult commented Nov 26, 2022

ikarsokolov commented Nov 26, 2022 •

edited

dschult commented Nov 27, 2022

ikarsokolov commented Nov 28, 2022

nx.dfs_labeled_edges does not return last visited edge if depth_limit is specified #6239

nx.dfs_labeled_edges does not return last visited edge if depth_limit is specified #6239

Comments

ikarsokolov commented Nov 26, 2022

Current Behavior

Expected Behavior

Environment

dschult commented Nov 26, 2022

dschult commented Nov 26, 2022

ikarsokolov commented Nov 26, 2022 • edited

dschult commented Nov 27, 2022

ikarsokolov commented Nov 28, 2022

ikarsokolov commented Nov 26, 2022 •

edited