fix(gfql): preserve alias property columns through non-final WITH aggregate (#1054) #1057

Merged
lmeyerov merged 3 commits into master from feat/1054-entity-blob-group-key on Apr 5, 2026


lmeyerov commented Apr 4, 2026

Fixes #1054.

Problem

When a multi-alias bindings-row query places a non-final WITH aggregate stage (e.g. WITH tag, sum(cd) AS total) before RETURN tag.name, the post-aggregate projection drops the tag.* property columns, so tag.name cannot be resolved in the following stage.
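
The symptom can be modeled outside GFQL with plain Python. This is only a sketch: the row layout, the column names, and the `aggregate_then_project` helper are assumptions for illustration, not the library's internals.

```python
from collections import defaultdict

# Hypothetical bindings rows: one dict per matched row, alias-prefixed keys.
rows = [
    {"tag.id": 1, "tag.name": "python", "cd": 10},
    {"tag.id": 1, "tag.name": "python", "cd": 5},
    {"tag.id": 2, "tag.name": "graphs", "cd": 7},
]


def aggregate_then_project(rows):
    """Model WITH tag, sum(cd) AS total followed by a projection that keeps
    only the group key and the aggregate (illustrative helper)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["tag.id"]] += row["cd"]
    return [{"tag.id": key, "total": total} for key, total in totals.items()]


projected = aggregate_then_project(rows)

# tag.name did not survive the projection, so a later RETURN tag.name
# has no column to resolve against.
name_missing = all("tag.name" not in row for row in projected)
```

In this model the aggregate itself is correct; the property column is lost only because the projection keeps nothing beyond the key and the aggregate.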

Fix

In _lower_match_alias_aggregate_stage, on the bindings-row path (scope.allowed_match_aliases non-empty), preserve alias.* property columns through the aggregate stage so the next stage can resolve alias.property references.

Status

🚧 WIP — investigation and implementation in progress.

Test

  • test_string_cypher_multi_alias_with_four_stage_chain (currently xfail → will pass)

🤖 Generated with Claude Code

lmeyerov and others added 2 commits April 4, 2026 15:46
…regate on bindings-row tables (#1054)

Three root causes fixed in `_lower_match_alias_aggregate_stage`:
1. Group key used bare `"id"` instead of `"tag.id"` on the bindings-row path
   (each row had a unique entity blob, making grouping per-row instead of per-alias)
2. Entity blob (`__node_entity__`) was added to `key_names`, serializing all
   columns and making every row unique; skip entity blob on bindings-row path
3. `projection_fn(visible_projection_items)` dropped `tag.*` property columns
   after `group_by`; replaced with `group_by(key_prefixes=[f"{alias}."])` +
   `drop_cols(hidden_keys)` to keep alias-prefixed columns for the next stage
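
Root causes 1 and 2 can be illustrated with a small stdlib-only sketch. The rows, `sum_cd_by`, and `blob_key` are hypothetical stand-ins: `blob_key` only imitates a key that serializes every column of a row, and is not the actual `__node_entity__` encoding.

```python
from collections import defaultdict

# Hypothetical bindings rows with alias-prefixed columns.
rows = [
    {"tag.id": 1, "tag.name": "python", "cd": 10},
    {"tag.id": 1, "tag.name": "python", "cd": 5},
    {"tag.id": 2, "tag.name": "graphs", "cd": 7},
]


def sum_cd_by(rows, key_fn):
    """Sum the cd column per group, where key_fn picks the group key."""
    totals = defaultdict(int)
    for row in rows:
        totals[key_fn(row)] += row["cd"]
    return dict(totals)


def blob_key(row):
    """Stand-in for a key that serializes all columns of the row."""
    return tuple(sorted(row.items()))


# Keying on the serialized row makes every row its own group.
by_blob = sum_cd_by(rows, blob_key)

# Keying on the alias-qualified id groups per tag, as intended.
by_alias_id = sum_cd_by(rows, lambda row: row["tag.id"])
```

Here `by_blob` ends up with three groups (one per row), while `by_alias_id` has two groups with totals 15 and 7.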

New runtime ops: `drop_cols` (ast/pipeline/validation) and `key_prefixes`
parameter on `group_by` to dynamically add alias-prefixed columns as group keys.
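
As a rough model of what these two ops are described as doing, the following sketch expands group keys by prefix and then drops hidden columns. The function bodies, the `expand_key_prefixes` name, and the `__hidden_key__` column are assumptions made for illustration; the real ops live in the GFQL row pipeline and may differ in signature and behavior.

```python
def expand_key_prefixes(columns, keys, key_prefixes=None):
    """Return keys plus every column that starts with one of key_prefixes,
    preserving column order (hypothetical model of the key_prefixes option)."""
    expanded = list(keys)
    for col in columns:
        if col in expanded:
            continue
        if any(col.startswith(prefix) for prefix in (key_prefixes or [])):
            expanded.append(col)
    return expanded


def drop_cols(row, cols):
    """Return a copy of row without the named columns; names that are not
    present are ignored (hypothetical model of the drop_cols op)."""
    hidden = set(cols)
    return {name: value for name, value in row.items() if name not in hidden}


columns = ["tag.id", "tag.name", "cd", "__hidden_key__"]

# With the "tag." prefix, tag.name joins the group keys and survives group_by.
keys = expand_key_prefixes(columns, ["tag.id"], key_prefixes=["tag."])

# Without key_prefixes the keys are unchanged.
unchanged = expand_key_prefixes(columns, ["tag.id"])

# Hidden helper keys are removed afterwards; dotted names are plain strings.
aggregated_row = {"tag.id": 1, "tag.name": "python", "total": 15, "__hidden_key__": 1}
visible = drop_cols(aggregated_row, ["__hidden_key__", "not_present"])
```

Grouping on the prefix-expanded keys keeps the alias property columns in the output, so removing only the hidden keys leaves `tag.name` available to the next stage.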

Also fixed a follow-on issue in `_lower_row_column_stage`: alias-prefixed
short-circuit expressions (e.g., `"tag.name"`) were being recorded in
`next_projected_columns`, causing subsequent RETURN stages to re-inline them via
`_rewrite_expr_to_projected_sources` and fail when `_row_expr_arg` tried to parse
`tag` as a Cypher identifier.

Adds 3 amplification tests: multiple agg functions in non-final WITH, non-final
agg + scalar stage accessing alias property, ORDER BY on alias.property after agg.
Full lowering suite: 686 passed, 56 skipped.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@lmeyerov lmeyerov marked this pull request as ready for review April 4, 2026 23:29
…d non-final WITH aggregate (#1054)

Unit tests (test_row_pipeline_ops.py):
- TestDropCols: basic drop, multiple cols, ignore-missing, empty list, dotted column names
- TestGroupByKeyPrefixes: prefix expansion, multiple prefixes, key_prefixes=None unchanged
- TestRowPipelineSafelist: drop_cols and key_prefixes validation (valid params, type errors)

Cypher integration tests (test_lowering.py):
- non_final_agg_two_aliases_survive: tag.id accessible alongside tag.name after non-final agg
- non_final_agg_where_on_alias_property: alias.property accessible in RETURN after non-final agg

845 passed, 58 skipped.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


Development

Successfully merging this pull request may close these issues.

GFQL: non-final WITH aggregate uses entity blob as group key, losing per-alias grouping and property access in subsequent RETURN
