Problem
Bounded MATCH ... WITH ... MATCH ... reentry currently works, but the internal design is hard to reason about.
Today the same concept is spread across multiple mechanisms:
- `start_nodes_query` in `graphistry/compute/gfql/cypher/lowering.py`
- hidden `__cypher_reentry_*` columns and expression rewrites in `graphistry/compute/gfql/cypher/lowering.py`
- `_cypher_entity_projection_meta` side-channel metadata
- `_compiled_query_reentry_state()` stitching logic in `graphistry/compute/gfql_unified.py`
That makes the compiler/runtime contract implicit instead of explicit. A senior compiler / graph language / GPU engineer joining the project would have to reconstruct the model from several places at once.
Why This Matters
- harder to audit vectorization and backend purity
- harder to extend to the next row-seeded features
- hidden invariants across compiler + runtime increase maintenance risk
- `lowering.py` and `gfql_unified.py` are longer and conceptually denser than they need to be
Proposed Refactor
Treat bounded reentry as a first-class plan/runtime concept rather than a protocol assembled from side channels.
Recommended steps:
- Introduce an explicit `ReentryPlan` (or `SeededMatchPlan`) dataclass.
  - carried alias
  - id column
  - carried scalar outputs
  - ordering contract
  - trailing match alias contract
- Replace the current hidden-property rewrite protocol.
  - stop encoding carried scalars as synthetic `__cypher_reentry_*` property accesses
  - instead carry an explicit scalar mapping in the plan contract
- Move runtime stitching into a dedicated reentry module.
  - keep `gfql_unified.py` as dispatch/orchestration
  - move reentry-specific assembly/validation into a smaller targeted runtime helper module
- Make row-order and seed-row semantics explicit.
  - preserve order as part of the contract, not as an inferred merge behavior
- Split `lowering.py` by concern where useful.
  - general lowering
  - result projection planning
  - bounded reentry planning
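To make the proposal concrete, here is a minimal sketch of what an explicit plan object and a runtime helper consuming it could look like. Everything in it is an assumption for discussion: the field names, the `stitch_reentry_rows` helper, and the validation rules are hypothetical and do not exist in the codebase. It uses plain Python rows for readability; a real implementation would need to stay vectorized over pandas/cudf frames.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Mapping, Sequence

Row = Dict[str, Any]


@dataclass(frozen=True)
class ReentryPlan:
    """Hypothetical explicit contract for MATCH ... WITH ... MATCH ... reentry."""

    carried_alias: str    # alias carried across WITH
    id_column: str        # column identifying the seed node of each row
    trailing_alias: str   # alias the trailing MATCH is expected to start from
    # Explicit scalar mapping: output name -> seed-row column.
    # This stands in for synthetic hidden property accesses.
    carried_scalars: Mapping[str, str] = field(default_factory=dict)
    # Ordering contract: emit results in seed-row order when True.
    preserve_seed_order: bool = True

    def __post_init__(self) -> None:
        # Assumed validation rules, chosen only for illustration.
        if not (self.carried_alias and self.id_column and self.trailing_alias):
            raise ValueError("aliases and id_column must be non-empty")
        if self.id_column in self.carried_scalars:
            raise ValueError("carried scalar output collides with id column")


def _merge(plan: ReentryPlan, seed: Row, match_row: Row) -> Row:
    """Copy the match row and attach the carried scalars from its seed row."""
    merged = dict(match_row)
    for out_name, seed_column in plan.carried_scalars.items():
        merged[out_name] = seed[seed_column]
    return merged


def stitch_reentry_rows(
    plan: ReentryPlan, seed_rows: Sequence[Row], match_rows: Sequence[Row]
) -> List[Row]:
    """Join trailing-match rows back to seed rows under the plan's order contract."""
    key = plan.id_column
    if plan.preserve_seed_order:
        # Group matches by seed id, then walk seeds in their given order.
        matches_by_id: Dict[Any, List[Row]] = {}
        for row in match_rows:
            matches_by_id.setdefault(row[key], []).append(row)
        return [
            _merge(plan, seed, row)
            for seed in seed_rows
            for row in matches_by_id.get(seed[key], [])
        ]
    # Otherwise keep the order in which the trailing match produced rows.
    seeds_by_id: Dict[Any, List[Row]] = {}
    for seed in seed_rows:
        seeds_by_id.setdefault(seed[key], []).append(seed)
    return [
        _merge(plan, seed, row)
        for row in match_rows
        for seed in seeds_by_id.get(row[key], [])
    ]


plan = ReentryPlan(
    carried_alias="a",
    id_column="id",
    trailing_alias="a",
    carried_scalars={"score": "score"},
)
seeds = [{"id": "n2", "score": 0.9}, {"id": "n1", "score": 0.4}]
matches = [{"id": "n1", "b": "x"}, {"id": "n2", "b": "y"}, {"id": "n2", "b": "z"}]
stitched = stitch_reentry_rows(plan, seeds, matches)
```

In this sketch the output order follows `seeds` (rows for `n2` first, then `n1`) because the plan says so, not because of how a merge happens to behave. The carried `score` value also arrives through the declared mapping rather than through a hidden column.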
Non-Goals
- ... `WITH` semantics here
Success Criteria
- existing bounded-reentry semantics stay green
- current pandas + cudf bounded-reentry tests stay green
- the reentry contract becomes readable from one place
- low-hundreds LOC reduction across `lowering.py` + `gfql_unified.py` is plausible from collapsing duplicate protocol layers
- follow-on work for multi-alias row carriers / optional null-extension becomes easier to reason about
Context
Current bounded-reentry hardening/validation work is in PR #975.
This issue is the follow-on cleanup/refactor lane, not a request to reopen that PR scope.