-
Notifications
You must be signed in to change notification settings - Fork 36
Open
Description
largest_connected_component() crashes with ZeroDivisionError on empty graphs
Repository: microsoft/DeepGNN
Description
The largest_connected_component() function in src/python/deepgnn/graph_engine/snark/preprocess/sampler/metric.py does not guard against empty graphs (0 nodes), causing two crashes:
def largest_connected_component(g: nx.digraph) -> float:
"""Return the scaled size of the largest strongly connected component."""
largest_strongly_connected_component = g.subgraph(
max(nx.strongly_connected_components(g), key=len)
#^^^ ValueError: max() arg is an empty sequence (when g has 0 nodes)
)
return nx.number_of_nodes(
largest_strongly_connected_component
) / nx.number_of_nodes(g)
# ^^^ ZeroDivisionError (when g has 0 nodes)max()raisesValueErroron the empty iterator fromstrongly_connected_components()of an empty graphnx.number_of_nodes(g)returns 0 →ZeroDivisionError
Empty graphs are a realistic edge case in graph preprocessing pipelines — e.g., when a filtered subgraph has no matching nodes, or when processing an empty graph partition.
The sibling function densification() in the same file has the same issue with math.log(..., nx.number_of_nodes(g)).
Suggested fix
def largest_connected_component(g: nx.digraph) -> float:
"""Return the scaled size of the largest strongly connected component."""
n = nx.number_of_nodes(g)
if n == 0:
return 0.0
largest_strongly_connected_component = g.subgraph(
max(nx.strongly_connected_components(g), key=len)
)
return nx.number_of_nodes(largest_strongly_connected_component) / nThe same guard should be added to densification(), diameter(), and max_adjacency() in this file.
How this was found
This was identified via static analysis using a3-python, which flagged the unguarded division as a DSE-confirmed reachable DIV_ZERO.
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels