Summary
Separate from the RAPIDS graph-build regression, PyGraphistry should consider a local optimization feature for reusing a built cuGraph graph across repeated GPU algorithm calls on unchanged topology.
This is a product/perf feature request, not a bug report.
Motivation
We currently pay the to_cugraph() / from_cudf_edgelist() graph-build cost each time a cuGraph algorithm is run, unless the caller manually threads G=... through compute_cugraph() in graphistry/plugins/cugraph.py.
The current branch investigation showed:
- graph build is often the dominant cost
- the main RAPIDS regression is upstream-facing
- but local reuse of an already-built G is still a valid PyGraphistry optimization for repeated algorithm calls
Scope
Intended scope:
- repeated compute_cugraph() calls
- repeated layout_cugraph() calls
- GFQL CALL graphistry.cugraph.* reuse on unchanged topology
Explicitly out of scope:
- generic GFQL MATCH / hop traversal
- dataframe-based search stages
- trying to fix the RAPIDS renumber regression by caching
Design direction
Potential design:
- cache built cuGraph graph state on the Plottable
- key it by:
  - edge table identity
  - source/destination bindings
  - edge-weight binding
  - directed
  - kind
  - relevant from_cudf_edgelist options
- invalidate on topology or option changes
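A minimal sketch of the keying idea above, to make the discussion concrete. Everything here is hypothetical: GraphCacheKey, make_key, and GraphCache are illustrative names, not existing PyGraphistry code, and the build callable stands in for the real to_cugraph() / from_cudf_edgelist() path.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class GraphCacheKey:
    """Hypothetical cache key; field names are illustrative only."""
    edges_id: int                          # identity of the edge table object
    source: str
    destination: str
    edge_weight: Optional[str]
    directed: bool
    kind: str
    options: Tuple[Tuple[str, Any], ...]   # sorted build options (must be hashable)


def make_key(edges: Any, source: str, destination: str,
             edge_weight: Optional[str] = None, directed: bool = True,
             kind: str = "Graph", **options: Any) -> GraphCacheKey:
    return GraphCacheKey(id(edges), source, destination, edge_weight,
                         directed, kind, tuple(sorted(options.items())))


class GraphCache:
    def __init__(self, build: Callable[..., Any]) -> None:
        self._build = build                # stand-in for the real graph build
        self._entries: Dict[GraphCacheKey, Tuple[Any, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, edges: Any, **bindings: Any) -> Any:
        key = make_key(edges, **bindings)
        entry = self._entries.get(key)
        if entry is not None:
            self.hits += 1
            return entry[1]
        self.misses += 1
        graph = self._build(edges, **bindings)
        # Keep a reference to the edge table so its id() cannot be
        # recycled for a different object while the entry is alive.
        self._entries[key] = (edges, graph)
        return graph
```

Keying on object identity rather than content avoids hashing the edge table, at the cost of missing on equal-but-distinct tables. Holding the edge table and the built graph alive is a memory tradeoff that would need a bounded or single-slot policy in practice.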
Likely invalidators:
- edges(...)
- filtering
- hop()
- rebinding source/destination/weight
- reset_caches()
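One possible way to get these invalidations cheaply is to let topology-changing methods return a new object with an empty cache slot, while non-topology changes share the slot. The sketch below assumes that approach; MiniPlottable, relabel, and built_graph are hypothetical stand-ins, not PyGraphistry's Plottable API.

```python
from dataclasses import dataclass, field, replace
from typing import Any, Callable, Tuple


@dataclass(frozen=True)
class MiniPlottable:
    """Hypothetical stand-in for a Plottable; not the real class."""
    edge_rows: Tuple[Tuple[str, str], ...]
    source: str = "src"
    destination: str = "dst"
    label: str = ""
    # Mutable slot holding the built graph; excluded from equality.
    _slot: dict = field(default_factory=dict, compare=False, repr=False)

    def edges(self, edge_rows) -> "MiniPlottable":
        # Topology change: the new object starts with an empty slot.
        return replace(self, edge_rows=tuple(edge_rows), _slot={})

    def bind(self, **bindings: str) -> "MiniPlottable":
        # Rebinding source/destination changes the graph: empty slot.
        return replace(self, _slot={}, **bindings)

    def relabel(self, label: str) -> "MiniPlottable":
        # Not a topology change: the slot (and cached graph) is shared.
        return replace(self, label=label)

    def reset_caches(self) -> None:
        self._slot.clear()

    def built_graph(self, build: Callable[["MiniPlottable"], Any]) -> Any:
        if "G" not in self._slot:
            self._slot["G"] = build(self)
        return self._slot["G"]
```

With this shape, filtering and hop() would invalidate automatically as long as they go through edges(...), so only reset_caches() needs explicit handling.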
Existing hook
We already have a manual reuse hook:
- compute_cugraph(..., G=...) in graphistry/plugins/cugraph.py
This request is about making that reuse practical and automatic when safe.
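For reference, manual reuse today looks roughly like the sketch below. It assumes g is a Plottable with edges already bound, that to_cugraph() can be called with no arguments, and that both algorithm names are supported by the installed version. run_two_algorithms is a hypothetical helper name.

```python
def run_two_algorithms(g):
    """Build the cuGraph graph once and thread it through two calls."""
    # Pay the graph-build cost a single time.
    G = g.to_cugraph()
    # Each call reuses G instead of rebuilding from the edge table.
    g2 = g.compute_cugraph("pagerank", G=G)
    g3 = g2.compute_cugraph("betweenness_centrality", G=G)
    return g3
```

The caller has to know that nothing between the two calls changed the topology, which is exactly the bookkeeping an automatic cache would take over.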
Requested outcome
- Decide whether this should be an internal optimization first or a public feature immediately.
- If implemented, keep it separate from the RAPIDS regression work.
- Add explicit tests for cache hits, misses, and invalidation on topology changes.