GFQL monolith shrinkdown S3: split lowering.py projection + reentry concerns#1303
Merged
GFQL monolith shrinkdown S3: split lowering.py projection + reentry concerns#1303
Conversation
This was referenced May 5, 2026
lmeyerov
added a commit
that referenced
this pull request
May 5, 2026
…5-05) Master moved ahead with #1303 (GFQL monolith shrinkdown S3 — split lowering.py projection + reentry concerns into cypher/projection_planning.py and cypher/reentry/runtime.py). Diff overlap with T3 was zero (T3 lives in ir/, #1303 in cypher/; CHANGELOG was the only candidate-conflict file and #1303 didn't touch it). Both lanes share the conceptual surface "lowering produces LogicalPlan-shaped output that the IR verifies", so adding TestSeamWith1303LoweringSplit (4 tests) to pin the seam: - Co-import of #1303 split modules + T3 metadata module without circular surprise; lowering still re-exposes the split delegators. - Realistic NodeScan→PatternMatch→Filter→Project chain — invariant 6 passes through carve-outs and column drops. - OPTIONAL MATCH plan with non-nullable scalar — pins that invariant 5 fires exactly once; invariant 6 stays silent (no false-positive double signal). - Helper contract on a RowSchema shape mirroring projection_planning.py's whole-row + scalar-projection emits (NodeRef + ScalarType columns). Refresh CHANGELOG count from 57 to 61. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
6 tasks
lmeyerov
added a commit
that referenced
this pull request
May 5, 2026
…lable User raised that this PR was net additive (+869/-1) — surprising next to the #1260 monolith-shrinkdown lane. Audit found two safe in-house consolidations: invariants 5 and 6 in verifier.py both contained inline `isinstance(typ, ScalarType) and ... typ.nullable` checks that T3's new is_nullable helper now expresses directly. Apply both. Two further candidates were inspected and intentionally deferred: - binder.py:151 nullable-OR union merge — T2 #1302 hot zone - lowering.py:571 BoundVariable nullable+null_extended flatten — #1295/#1303 hot zone Both deferrals are documented in CHANGELOG and plan.md so the T3.b follow-on slice has a concrete punch list. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4 tasks
lmeyerov
added a commit
that referenced
this pull request
May 5, 2026
User pressure-tested PR #1304 against #1260-style code-reduction goals and repo-quality / adoption goals. Honest critiques landed; applying the cheap mitigations: - Item 1: re-export the eight metadata helpers from `graphistry.compute.gfql.ir` so external adopters can discover them via the package surface, not the deep `ir.metadata` path. - Item 2: drop `assert hasattr(...)` private-symbol coupling from the TestSeamWith1303LoweringSplit co-import test. #1303's projection_planning and reentry/runtime modules `globals().update(vars(lowering))` to inherit the shared symbol table, which clobbers `__name__` and `__file__`. Use sys.modules registration as the right load-success witness — purely behavioral, no internal-symbol coupling. - Item 5: verifier.py docstring now disclosures honestly that `verify(...)` is currently called only by the test suite — invariants document intended semantics, not always-on runtime safety. Callers integrating strict-mode validation (T2 #1302) or arrow coercion (T4) should plan on calling verify explicitly. - Item 3: opened **#1309** as the T3.b tracking issue for the deferred binder.py:151 + lowering.py:571 consolidations onto helpers. Linked from CHANGELOG so the punch list lives in GitHub, not just in plan.md. Items 4 (PR description framing on `ScalarType.kind="unknown"`) and 6 (coordinator handoff messaging on indirect GraphistryGPT relevance) are non-code edits handled separately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lmeyerov
added a commit
that referenced
this pull request
May 5, 2026
…5-05) Master moved ahead with #1303 (GFQL monolith shrinkdown S3 — split lowering.py projection + reentry concerns into cypher/projection_planning.py and cypher/reentry/runtime.py). Diff overlap with T3 was zero (T3 lives in ir/, #1303 in cypher/; CHANGELOG was the only candidate-conflict file and Both lanes share the conceptual surface "lowering produces LogicalPlan-shaped output that the IR verifies", so adding TestSeamWith1303LoweringSplit (4 tests) to pin the seam: - Co-import of #1303 split modules + T3 metadata module without circular surprise; lowering still re-exposes the split delegators. - Realistic NodeScan→PatternMatch→Filter→Project chain — invariant 6 passes through carve-outs and column drops. - OPTIONAL MATCH plan with non-nullable scalar — pins that invariant 5 fires exactly once; invariant 6 stays silent (no false-positive double signal). - Helper contract on a RowSchema shape mirroring projection_planning.py's whole-row + scalar-projection emits (NodeRef + ScalarType columns). Refresh CHANGELOG count from 57 to 61. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lmeyerov
added a commit
that referenced
this pull request
May 5, 2026
…lable User raised that this PR was net additive (+869/-1) — surprising next to the #1260 monolith-shrinkdown lane. Audit found two safe in-house consolidations: invariants 5 and 6 in verifier.py both contained inline `isinstance(typ, ScalarType) and ... typ.nullable` checks that T3's new is_nullable helper now expresses directly. Apply both. Two further candidates were inspected and intentionally deferred: - binder.py:151 nullable-OR union merge — T2 #1302 hot zone - lowering.py:571 BoundVariable nullable+null_extended flatten — #1295/#1303 hot zone Both deferrals are documented in CHANGELOG and plan.md so the T3.b follow-on slice has a concrete punch list. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lmeyerov
added a commit
that referenced
this pull request
May 5, 2026
User pressure-tested PR #1304 against #1260-style code-reduction goals and repo-quality / adoption goals. Honest critiques landed; applying the cheap mitigations: - Item 1: re-export the eight metadata helpers from `graphistry.compute.gfql.ir` so external adopters can discover them via the package surface, not the deep `ir.metadata` path. - Item 2: drop `assert hasattr(...)` private-symbol coupling from the TestSeamWith1303LoweringSplit co-import test. #1303's projection_planning and reentry/runtime modules `globals().update(vars(lowering))` to inherit the shared symbol table, which clobbers `__name__` and `__file__`. Use sys.modules registration as the right load-success witness — purely behavioral, no internal-symbol coupling. - Item 5: verifier.py docstring now disclosures honestly that `verify(...)` is currently called only by the test suite — invariants document intended semantics, not always-on runtime safety. Callers integrating strict-mode validation (T2 #1302) or arrow coercion (T4) should plan on calling verify explicitly. - Item 3: opened **#1309** as the T3.b tracking issue for the deferred binder.py:151 + lowering.py:571 consolidations onto helpers. Linked from CHANGELOG so the punch list lives in GitHub, not just in plan.md. Items 4 (PR description framing on `ScalarType.kind="unknown"`) and 6 (coordinator handoff messaging on indirect GraphistryGPT relevance) are non-code edits handled separately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lmeyerov
added a commit
that referenced
this pull request
May 5, 2026
…5-05) Master moved ahead with #1303 (GFQL monolith shrinkdown S3 — split lowering.py projection + reentry concerns into cypher/projection_planning.py and cypher/reentry/runtime.py). Diff overlap with T3 was zero (T3 lives in ir/, #1303 in cypher/; CHANGELOG was the only candidate-conflict file and Both lanes share the conceptual surface "lowering produces LogicalPlan-shaped output that the IR verifies", so adding TestSeamWith1303LoweringSplit (4 tests) to pin the seam: - Co-import of #1303 split modules + T3 metadata module without circular surprise; lowering still re-exposes the split delegators. - Realistic NodeScan→PatternMatch→Filter→Project chain — invariant 6 passes through carve-outs and column drops. - OPTIONAL MATCH plan with non-nullable scalar — pins that invariant 5 fires exactly once; invariant 6 stays silent (no false-positive double signal). - Helper contract on a RowSchema shape mirroring projection_planning.py's whole-row + scalar-projection emits (NodeRef + ScalarType columns). Refresh CHANGELOG count from 57 to 61. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lmeyerov
added a commit
that referenced
this pull request
May 5, 2026
…lable User raised that this PR was net additive (+869/-1) — surprising next to the #1260 monolith-shrinkdown lane. Audit found two safe in-house consolidations: invariants 5 and 6 in verifier.py both contained inline `isinstance(typ, ScalarType) and ... typ.nullable` checks that T3's new is_nullable helper now expresses directly. Apply both. Two further candidates were inspected and intentionally deferred: - binder.py:151 nullable-OR union merge — T2 #1302 hot zone - lowering.py:571 BoundVariable nullable+null_extended flatten — #1295/#1303 hot zone Both deferrals are documented in CHANGELOG and plan.md so the T3.b follow-on slice has a concrete punch list. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lmeyerov
added a commit
that referenced
this pull request
May 5, 2026
User pressure-tested PR #1304 against #1260-style code-reduction goals and repo-quality / adoption goals. Honest critiques landed; applying the cheap mitigations: - Item 1: re-export the eight metadata helpers from `graphistry.compute.gfql.ir` so external adopters can discover them via the package surface, not the deep `ir.metadata` path. - Item 2: drop `assert hasattr(...)` private-symbol coupling from the TestSeamWith1303LoweringSplit co-import test. #1303's projection_planning and reentry/runtime modules `globals().update(vars(lowering))` to inherit the shared symbol table, which clobbers `__name__` and `__file__`. Use sys.modules registration as the right load-success witness — purely behavioral, no internal-symbol coupling. - Item 5: verifier.py docstring now disclosures honestly that `verify(...)` is currently called only by the test suite — invariants document intended semantics, not always-on runtime safety. Callers integrating strict-mode validation (T2 #1302) or arrow coercion (T4) should plan on calling verify explicitly. - Item 3: opened **#1309** as the T3.b tracking issue for the deferred binder.py:151 + lowering.py:571 consolidations onto helpers. Linked from CHANGELOG so the punch list lives in GitHub, not just in plan.md. Items 4 (PR description framing on `ScalarType.kind="unknown"`) and 6 (coordinator handoff messaging on indirect GraphistryGPT relevance) are non-code edits handled separately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lmeyerov
added a commit
that referenced
this pull request
May 5, 2026
…Plan seams (#1304) * feat(gfql-ir): type/null metadata propagation contract (#1300) Add a stable helper surface over the type/nullability metadata BoundIR / LogicalPlan already carry, plus a propagation-continuity invariant in the LogicalPlan verifier. T3 of #1262 / umbrella #1046. graphistry/compute/gfql/ir/metadata.py exposes seven helpers: is_nullable, with_nullable, widen_to_nullable, column_logical_type, column_is_nullable, merge_types, bound_variable_type. ScalarType is the only nullable carrier today; structural types (NodeRef, EdgeRef, PathType, ListType) pass through unchanged. verifier.py gains invariant 6: for unary ops, columns shared between input.output_schema and op.output_schema must agree on kind family; ScalarType nullability is monotone-widening, with Filter as the sole carve-out (it may drop NULL rows). The check is skipped when either schema has no columns so existing planner-emitted plans that initialise output_schema=RowSchema() remain valid. Adds graphistry/tests/compute/gfql/test_ir_type_propagation.py (48 tests) covering helper contract surface and verifier pass/fail cases (kind mismatch, nullability narrowing on Project/Distinct/OrderBy, Filter carve-out, dropped columns, structural pass-through, list-element recursion). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * review-w1: address PatternMatch carve-out + parametrize structural tests Wave 1 review found: - I1 IMPORTANT: PatternMatch with predicates can narrow nullability the same way Filter does (predicates may drop rows where pattern matches fail). Add PatternMatch to _NULLABILITY_NARROWING_OPS. Optional arms remain locked nullable=True via invariant 5; this carve-out only changes invariant 6 narrowing-detection on the *non-optional* lane. - I3 SUGGESTION→IMPORTANT: bound_variable_type structural pass-through was only tested for NodeRef. Parametrize across NodeRef / EdgeRef / PathType / ListType to pin the documented contract. Adds positive PatternMatch carve-out test in TestPropagationContinuity. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * review-w2: extend narrowing carve-out + add bound_variable_is_nullable Wave 2 amplification on wave-1 fixes was clean. New IMPORTANT findings: - B-I1: SemiApply / AntiSemiApply (Cypher EXISTS / NOT EXISTS subquery filters) are filter-shaped row-droppers. Same forward-correctness risk as wave-1 PatternMatch carve-out. Add both to _NULLABILITY_NARROWING_OPS, lowercase the type alias to match the rest of verifier.py, and update the file-header invariant 6 doc. - B-I2: bound_variable_type silently dropped bv.nullable for structural variables (the binder records nullable=True on optional-arm whole-row aliases even though NodeRef/EdgeRef have no nullable dimension). Adding bound_variable_is_nullable as the canonical "is this variable nullable" helper that returns bv.nullable directly, with the docstring in bound_variable_type now pointing callers there. - B-S2: CHANGELOG test count refresh after wave 1 added 4 tests. Adds parametrized TestBoundVariableIsNullable across all 5 LogicalType families so the contract is pinned for both nullable=True and =False on every kind. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * review-w3: generalize narrowing diagnostic + refresh CHANGELOG count Wave 3 converged the review (waves 2 + 3 both non-significant advance, 0 BLOCKER + 0 IMPORTANT new findings in either). Applying the two most actionable SUGGESTIONs from wave-3: - B-S1: invariant-6 error message previously read "only Filter may drop NULL rows" — wave-2 extended the carve-out to PatternMatch / SemiApply / AntiSemiApply, so the user-visible diagnostic now lists the full set derived from `_NULLABILITY_NARROWING_OPS`. Existing test substring assertions (`"narrowed nullability"`) still match. - A3: CHANGELOG test count refreshed from "≈60" to the actual 57 (wave-2 B-S2 specifically asked for an accurate count). Two additional fixes that fell out of wave-3 review naturally are kept as deferred-cosmetic SUGGESTIONs (e.g. parametrizing the carve-out tests across all four narrowing ops); they would be additive only and do not gate convergence. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(t3): seam amplification post-#1303 lowering split (rebase 2026-05-05) Master moved ahead with #1303 (GFQL monolith shrinkdown S3 — split lowering.py projection + reentry concerns into cypher/projection_planning.py and cypher/reentry/runtime.py). Diff overlap with T3 was zero (T3 lives in ir/, #1303 in cypher/; CHANGELOG was the only candidate-conflict file and Both lanes share the conceptual surface "lowering produces LogicalPlan-shaped output that the IR verifies", so adding TestSeamWith1303LoweringSplit (4 tests) to pin the seam: - Co-import of #1303 split modules + T3 metadata module without circular surprise; lowering still re-exposes the split delegators. - Realistic NodeScan→PatternMatch→Filter→Project chain — invariant 6 passes through carve-outs and column drops. - OPTIONAL MATCH plan with non-nullable scalar — pins that invariant 5 fires exactly once; invariant 6 stays silent (no false-positive double signal). - Helper contract on a RowSchema shape mirroring projection_planning.py's whole-row + scalar-projection emits (NodeRef + ScalarType columns). Refresh CHANGELOG count from 57 to 61. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(verifier): consolidate inline nullability checks onto is_nullable User raised that this PR was net additive (+869/-1) — surprising next to the #1260 monolith-shrinkdown lane. Audit found two safe in-house consolidations: invariants 5 and 6 in verifier.py both contained inline `isinstance(typ, ScalarType) and ... typ.nullable` checks that T3's new is_nullable helper now expresses directly. Apply both. Two further candidates were inspected and intentionally deferred: - binder.py:151 nullable-OR union merge — T2 #1302 hot zone - lowering.py:571 BoundVariable nullable+null_extended flatten — #1295/#1303 hot zone Both deferrals are documented in CHANGELOG and plan.md so the T3.b follow-on slice has a concrete punch list. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(t3): pressure-test action items 1, 2, 5 + T3.b tracking (#1309) User pressure-tested PR #1304 against #1260-style code-reduction goals and repo-quality / adoption goals. Honest critiques landed; applying the cheap mitigations: - Item 1: re-export the eight metadata helpers from `graphistry.compute.gfql.ir` so external adopters can discover them via the package surface, not the deep `ir.metadata` path. - Item 2: drop `assert hasattr(...)` private-symbol coupling from the TestSeamWith1303LoweringSplit co-import test. #1303's projection_planning and reentry/runtime modules `globals().update(vars(lowering))` to inherit the shared symbol table, which clobbers `__name__` and `__file__`. Use sys.modules registration as the right load-success witness — purely behavioral, no internal-symbol coupling. - Item 5: verifier.py docstring now disclosures honestly that `verify(...)` is currently called only by the test suite — invariants document intended semantics, not always-on runtime safety. Callers integrating strict-mode validation (T2 #1302) or arrow coercion (T4) should plan on calling verify explicitly. - Item 3: opened **#1309** as the T3.b tracking issue for the deferred binder.py:151 + lowering.py:571 consolidations onto helpers. Linked from CHANGELOG so the punch list lives in GitHub, not just in plan.md. Items 4 (PR description framing on `ScalarType.kind="unknown"`) and 6 (coordinator handoff messaging on indirect GraphistryGPT relevance) are non-code edits handled separately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * review-w6: fix verifier docstring + consolidate query_graph onto helper Wave 6 found 1 BLOCKER + 1 IMPORTANT: BLOCKER (Track A): verifier.py docstring "honesty" wording from the pressure-test commit overshot. Claimed `verify(...)` is "currently invoked only by the test suite; no production caller in the repo runs it on planner-emitted plans" — wrong. `passes/manager.py:71,86` runs verify after every tier-1 / tier-2 pass and treats any diagnostic as fatal; `cypher/lowering.py:429` runs it via `_verify_selected_logical_plan` to gate LogicalPlan routing for covered Cypher shapes. So the invariants ARE real safety nets on those paths. Replaced the docstring section with an accurate "Production callers" list and a narrower "Scope caveat for invariant 6" note: the kind/nullability check short-circuits when output_schema columns are empty, which is most planner-emitted Project nodes today — that part of the framing is still correct. IMPORTANT (Track B): missed consolidation candidate at `ir/query_graph.py:224` (`is_required = not var.nullable` on a BoundVariable). Same package as T3, no sibling-PR hot-zone risk. Apply `bound_variable_is_nullable(var)` directly. CHANGELOG bumped from "two consolidations" to "three consolidations within `ir/`". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Implements #1301 (S3 under #1260): split
cypher/lowering.pyby concern boundaries while preserving behavior and compatibility.Concern split
graphistry/compute/gfql/cypher/projection_planning.pygraphistry/compute/gfql/cypher/reentry/runtime.pylowering.pyvia delegator shims.Scope constraints honored
CompiledCypher*in this slice.Metrics
graphistry/compute/gfql/cypher/lowering.py: 9659 -> 8667 LOC (-992 LOC)Validation
./bin/ruff.sh graphistry/compute/gfql/cypher/lowering.py graphistry/compute/gfql/cypher/projection_planning.py graphistry/compute/gfql/cypher/reentry/runtime.py./bin/pytest.sh graphistry/tests/compute/gfql/cypher/test_lowering.py -q./bin/pytest.sh graphistry/tests/compute/gfql/cypher -qResults:
1188 passed, 89 skipped, 15 xfailedPart of #1260
Closes #1301