Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add bfs_layers function to the bfs_* module #5870

Closed
dschult opened this issue Jul 15, 2022 · 5 comments
Closed

Add bfs_layers function to the bfs_* module #5870

dschult opened this issue Jul 15, 2022 · 5 comments

Comments

@dschult
Copy link
Member

dschult commented Jul 15, 2022

The PR #5788 by @kpetridis24 on VF2++ has a nice function called BFS_levels which in slightly generalized form would be a good addition to the breadth_first_search.py module. It is very similar to the topological_generations function but not restricted to DAGs. The function in #5788 processes each level, while I think the generalized function should just yield the levels as sets of nodes like the topological_generations function does. I suppose it could be lists instead of sets since there is a graph ordering of the nodes.

What's the right name? The right signature? The right generated output?
One idea:
def bfs_layers(G, source, reverse=False): yielding sets of nodes starting with {source}
But maybe bfs_levels, or bfs_generations?

Should it be allowed to have multiple source nodes? (topological generations treats all nodes with no incoming edges as being in the initial generation) The other BFS functions require a single source node.

@rossbar
Copy link
Contributor

rossbar commented Jul 16, 2022

I agree this would be a really nice addition! Re: API I'm not sure what's best, though I do think it makes sense for the "layers" to be lists instead of sets to preserve information about the visit order within each layer (IIUC).

@whym1here
Copy link
Contributor

whym1here commented Jul 17, 2022

Should it be implemented using layer-wise bfs and return a list(or generator) with the nodes in each layer? This will also work with multiple sources.

def bfs_layers(G, source, depth_limit=None, reverse=False):
  """To be added soon and reverse needs to be handled too"""
  neighbors = lambda node: iter(_neighbors(node))
  visited = {}
  if isintance(source, Iterable): # from collection.abc import Iterable
    visited = set(source)
  else:
    visisted = {source} 
  if depth_limit is None:
        depth_limit = len(G)
  # from collections import deque 
  queue = deque([(node, depth_limit, neighbors(node))] for node in visited)
  while queue:
    layer = set({})
    while queue:
      parent, depth_now, children = queue[0]
      try:
        child = next(children)
        if child not visited:
          layer.add(child)
          visited.add(child)
          if depth_now > 1:
            queue.append((child, depth_now - 1, neighbors(child)))
      except StopIteration:
          queue.popleft()
      yield layer

something like this?

@dschult
Copy link
Member Author

dschult commented Jul 17, 2022

@rossbar I agree that returning lists would be better than sets because they retain the information for the order in which the BFS is progressing. I think we might need a lookup in the current layer for each node. The current layer is smaller than the whole graph, but it feels wrong to do a lookup on a list. I guess a dict is a good combination of list and set: ordered and yet fast lookup.

@still-n0thing I think the code to handle a single source and multiple sources can be cleaner (and match other parts of NetworkX) with the idiom:

def bfs_layers(G, sources, ...)
    ...
    if sources in G:
        sources = [sources]
    ...

I think the function descendants_at_distance() in breadth_first_search.py already provides much of the framework, and we could take the code from there to make the bfs_layers generator and have descendants_at_distance call bfs_layers. So, the heart of bfs_layers would be:

    while current_layer:
        yield current_layer
        next_layer = []
        for node in current_layer:
            for child in G[node]:
                if child not in visited:
                    visited.add(child)
                    next_layer.append(child)
        current_layer = next_layer

@whym1here
Copy link
Contributor

whym1here commented Jul 17, 2022

@dschult can you take a look at #5879 and help me with the testing if possible?

@dschult
Copy link
Member Author

dschult commented Aug 19, 2022

Fixed in #5879. Thanks @still-n0thing!

@dschult dschult closed this as completed Aug 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

3 participants