fix(gfql/row): resolve bare relationship alias in row expressions (#1072)#1249
Merged
fix(gfql/row): resolve bare relationship alias in row expressions (#1072)#1249
Conversation
0de4097 to
cd955fe
Compare
) Bare edge aliases in select / where_rows / order_by row expressions raised "unsupported token in row expression" because _gfql_resolve_token only had a node-id fallback for the bare-alias case. Add an edge fallback: prefer the edge-id column, otherwise render the relationship as a Cypher-style [:TYPE {props}] string, mirroring what cypher RETURN <relAlias> already produces via result_postprocess. Direct GFQL chains and cypher-lowered chains now see the same column shape. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…entry (#1071) Pre-rewrite the prefix WITH stage in _compile_bounded_reentry_query so secondary whole-row aliases are demoted to scalar property carries (S.X AS __cypher_reentry_<S>_<X>__) and downstream S.X references compose with the existing single-whole-row machinery (#1047/#1068). RETURN of a secondary whole-row alias, and re-binding a secondary alias as a node variable in the trailing MATCH, remain unsupported with precise errors. Closes #1071. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sions Records the user-facing fix and amplifications for #1072 under ## [Development] → ### Fixed: bare edge-alias resolution in row expressions, shared vectorized renderer in row/entity_props.py, edge-alias gating via _gfql_rows_edge_aliases, string-escape / float / empty-type rendering normalization, and re-entry WHERE secondary-alias predicate rewriting via synthesized text. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ring Wave 4 finding T2 (IMPORTANT): TestRelationshipAliasInRowExpression had 14 new tests in PR #1249 but zero cuDF coverage on the new edge-alias rendering path, despite Wave 2 vectorization being specifically motivated by cuDF support (Wave 1 finding #4). Adds test_select_bare_relationship_alias_renders_on_cudf_when_available following the sibling pattern at test_row_pipeline_vectorized_cudf_when_available (pytest.importorskip("cudf")). Asserts: (a) result keeps cuDF backend, (b) bare-alias rendering produces "[:WORKS_AT {workFrom: 2010}]", (c) select/return cell-for-cell parity holds on cuDF. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wave 4 finding D1 (SUGGESTION): the conditional at pipeline.py:2310 and the raise at pipeline.py:2312 produce identical errors. After the node-id branch (lines 2306-2309) returns, the only remaining outcome is "unsupported token", regardless of edge_aliases membership — both branches raised the same f-string. Collapsed to a single raise. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
11 tasks
cd955fe to
5770312
Compare
lmeyerov
added a commit
that referenced
this pull request
May 3, 2026
…ing failfast (post-#1249 merge) After merging origin/master (which brought in #1249's row pipeline changes), the chained-reentry-same-primary positive test now hits the pre-existing "unique carried node rows" failfast at gfql_unified.py:~1091. This is not a regression introduced by slice 4.3d.2 — the carry-uniqueness constraint is fundamental to the current scalar-carry runtime model. Multiple rows sharing the same `a` value (one per friend after the first reentry) cannot be carried through to a second reentry that re-uses `a` as the source. The rebinding shape (Q2-style `(a)-[:R]->(friend) ... (friend)-[:S]->(c)`) remains validated and continues to pass — that's the actual cross-boundary case the slice closes. Retarget the test to assert the unique-rows failfast fires, locking the known limitation so a future slice that lifts the constraint must update this test alongside the runtime change. Refs #1256. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lmeyerov
added a commit
that referenced
this pull request
May 3, 2026
…ry (#1256) (#1258) * feat(cypher): forward secondary whole-row carry through chained reentry (#1256) Closes the chained-reentry portion of slice 4.3d (#1256) so multi-alias whole-row carry survives more than one reentry boundary. Two extensions to `_demote_secondary_whole_row_aliases` (the active rewrite path on master after #1071 superseded slice 4.3b's `_rewrite_multi_whole_row_prefix`): 1. **Forwarding-item drop** — bare-identifier projection items in downstream `WITH` stages whose name is a secondary alias are dropped at compile time *before* `_collect_secondary_property_refs` walks them for bare references. Same intent as slice 4.3c, integrated into the active path. Without this, forwarding patterns like `WITH a, x, friend, ...` tripped the bare-ref failfast even though the property carry already lives as a hidden scalar. 2. **Hidden-column forwarding** — for every `(secondary_alias, prop)` collected from trailing clauses, the synthesized `__cypher_reentry_<S>_<X>__` hidden alias is appended as a bare passthrough item to every downstream `WITH` stage so each recursive `compile_cypher_query` call sees it as a scalar carry. Without this, the inner compile failed alias resolution on the rewritten hidden identifier once a chained stage narrowed projection scope. New positive tests cover three previously-blocked shapes: * `test_string_cypher_admits_multi_stage_secondary_alias_carry_through_chained_reentry` — chained reentry where the trailing MATCH continues to use the primary * `test_string_cypher_admits_secondary_alias_carry_across_reentry_source_rebinding` — source rebound between boundaries (`(a)-[:R]->(friend) ... (friend)-[:S]->(c)`) * `test_string_cypher_admits_multi_alias_distinct_forwarding_through_reentry` — multi-alias `DISTINCT` forwarding through a single boundary New failfast test pins the remaining gap (free-form intermediate MATCH that does not start from any carried alias — the LDBC SNB IC3 prefix shape) to the existing scoped error so future closure is regression-locked. Validation: * 780 cypher lowering tests pass (was 776; +4 new). * Touched-module mypy clean. * Repro `plans/1256-cross-reentry-boundary-carry/repro/repro_ic3.py`: simple rebinding cases now compile + execute correctly; literal IC3 still fails at the free-form intermediate MATCH gate (separate follow-up). Refs #1256 (closes the chained-reentry portion; free-form intermediate MATCH remains open), #999 (IC3 partial), #989 (row-carrier IR umbrella). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test+docs(cypher): pin aggregate-failfast interaction with chained-reentry carry (#1256 W1-I1) Wave-1 review flagged that hidden-column forwarding could in principle silently mutate grouping when the appended `__cypher_reentry_<S>_<X>__` bare item lands inside a `WITH a, friend, count(*) AS n` chain — the carry would join the group key set. Empirically those queries already fail at the pre-existing "aggregate would need repeated MATCH rows from a relationship pattern" failfast, so no silent wrong-result path exists today. Pin that behavior with a regression-lock test so a future change that lifts the aggregate failfast must reckon with carry-grouping interaction explicitly. Also document the DISTINCT/aggregate interaction in `_demote_secondary_whole_row_aliases` inline so the next reader sees the design tradeoff. Refs #1256. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(cypher): scoped failfast for chained-reentry carry + aggregate WITH (#1256 W2-IMPORTANT-1) Wave-2 review uncovered a silent wrong-result for the corner shape `MATCH (a),(x) WITH a, x MATCH (a) WITH a, x, count(*) AS n RETURN x.id, n` (node-only trailing MATCH + aggregating downstream WITH + carry forward). PR HEAD returned `xid=None, n=1`; master rejects the same query at the secondary-whole-row failfast. Add a guard in `_demote_secondary_whole_row_aliases`: when carry forwarding is about to append a hidden column to a downstream WITH stage, refuse if any item in that stage contains an aggregate function call. Raise a scoped #1256 failfast pointing at the gap. The relationship-pattern aggregate path is also now covered by this guard (the existing "aggregate would need repeated MATCH rows" failfast still fires for non-carry queries; the scoped error is clearer when a carry is actually involved). Test changes: - Rename `..._with_aggregate_hits_existing_aggregate_failfast` to `..._with_aggregate_relationship_match_failfast` and update the regex to match the new scoped error. - New `..._with_aggregate_node_only_match_failfast` regression-locks the W2-IMPORTANT-1 case. Validation: - 782 cypher lowering tests pass (was 780 before slice 4.3d.2; +2 net). - Touched-module mypy clean. - Scoped failfast wording is symmetric with the slice 4.3a/b admit-gate failfast (#989) so error consumers can pattern-match. Refs #1256. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(cypher): retarget chained-reentry-same-primary test to pre-existing failfast (post-#1249 merge) After merging origin/master (which brought in #1249's row pipeline changes), the chained-reentry-same-primary positive test now hits the pre-existing "unique carried node rows" failfast at gfql_unified.py:~1091. This is not a regression introduced by slice 4.3d.2 — the carry-uniqueness constraint is fundamental to the current scalar-carry runtime model. Multiple rows sharing the same `a` value (one per friend after the first reentry) cannot be carried through to a second reentry that re-uses `a` as the source. The rebinding shape (Q2-style `(a)-[:R]->(friend) ... (friend)-[:S]->(c)`) remains validated and continues to pass — that's the actual cross-boundary case the slice closes. Retarget the test to assert the unique-rows failfast fires, locking the known limitation so a future slice that lifts the constraint must update this test alongside the runtime change. Refs #1256. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #1072. Bare edge aliases in row expressions (e.g.
select([("rel", "workAt")]),where_rows("workAt IS NOT NULL")) raisedunsupported token in row expression: 'workAt'because_gfql_resolve_token's bare-alias fallback only knew about node aliases ({alias}.{node_id}lookup).Before / after
Before — direct GFQL chain over a bindings row table:
After: returns a Cypher-style rendering, matching what
RETURN <relAlias>already produces viaresult_postprocess:Change
graphistry/compute/gfql/row/pipeline.py:_gfql_resolve_tokenso an edge alias resolves to either the edge-id column ({alias}.{_edge}) when present, or to a rendered[:TYPE {props}]string via the new_gfql_render_relationship_aliashelper._source/_destination/_edge/__*columns and the alias-indicator column ({alias}.{alias}); rows whose alias columns are all null becomepd.NA.Tests
graphistry/tests/compute/gfql/test_row_pipeline_ops.py::TestRelationshipAliasInRowExpression(3 new tests):select(items=[("rel", "workAt")])renders the Cypher string,select(items=[("yr", "workAt.workFrom")])keeps working (regression guard),where_rows("workAt IS NOT NULL")keeps the row.Full
graphistry/tests/compute/gfql/suite: 1576 passed, 87 skipped, 15 xfailed.Test plan
python -m pytest graphistry/tests/compute/gfql/./bin/lint.shand./bin/mypy.shon touched files