
Fix relationship-row aggregate multiplicity for Cypher collect/count/sum (#1343) #1348

Merged
lmeyerov merged 9 commits into master from issue-1343-rel-multiplicity-aggregates
May 8, 2026
@lmeyerov lmeyerov commented May 7, 2026

Summary

Fixes #1343 by routing multiplicity-sensitive Cypher aggregates on relationship-pattern MATCH queries through bindings-row execution so row multiplicity is preserved before aggregation.
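To illustrate why row multiplicity matters here, the following is a minimal sketch in plain Python, not the GFQL runtime. The toy edge list and both helper functions are hypothetical. It assumes the pre-fix behavior amounted to de-duplicating matched rows before aggregation, which is one plausible reading of the linked issue rather than a description of the actual code path.

```python
from collections import Counter

# Hypothetical toy data: (source, target) relationship rows, including a
# parallel edge between "a" and "b".
edges = [("a", "b"), ("a", "b"), ("a", "c"), ("d", "c")]

def count_after_collapse(edge_rows):
    """Count per source after de-duplicating rows (multiplicity is lost)."""
    return dict(Counter(src for src, _ in set(edge_rows)))

def count_per_binding_row(edge_rows):
    """Count per source with one row per matched relationship (multiplicity kept)."""
    return dict(Counter(src for src, _ in edge_rows))

collapsed = count_after_collapse(edges)
preserved = count_per_binding_row(edges)
print(collapsed)
print(preserved)
```

In this sketch, the two parallel a-to-b edges contribute one row when collapsed and two rows when each binding is kept, so a query shaped like MATCH (a)-[r]->(b) RETURN a, count(r) would report different counts for "a" depending on the path taken.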

What changed

  • Added aggregate-shape detection in lowering for relationship-pattern MATCH + multiplicity-sensitive aggregate usage.
  • For those shapes, force bindings-row path (rows(binding_ops=...)) so repeated relationship rows survive to group_by.
  • Scoped the old repeated MATCH rows aggregate guard to non-bindings paths.
  • Added admission constraints so forced bindings routing is limited to safe lanes (preserving existing fail-fast behavior for known unsupported multi-source/whole-row overlap lanes).
  • Added regression coverage for:
    • grouped/global relationship-row count(*), count(alias), sum(1), avg(edge_prop)
    • OPTIONAL MATCH ... WITH p, collect(uni.name)
    • OPTIONAL MATCH ... WITH p, collect(CASE ...)
    • grouped relationship-row WITH a, count/sum/avg(...)
  • Added changelog entry in Development.
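The OPTIONAL MATCH ... WITH p, collect(uni.name) shape in the regression list can be sketched in plain Python as follows. This models standard Cypher semantics only: an unmatched OPTIONAL MATCH yields one row with a null, and collect() skips nulls while keeping duplicates. The data and the helper names are hypothetical and do not come from the test suite or from lowering.py.

```python
# Hypothetical data: people and (person, university) relationship rows.
people = ["ann", "bob", "cy"]
attended = [("ann", "MIT"), ("ann", "MIT"), ("bob", "ETH")]

def optional_match_rows(people, rels):
    """Emit one binding row per match, or one (person, None) row if unmatched."""
    rows = []
    for person in people:
        matches = [uni for src, uni in rels if src == person]
        if matches:
            rows.extend((person, uni) for uni in matches)
        else:
            rows.append((person, None))
    return rows

def collect_by_person(rows):
    """Group rows by person and collect non-null values, keeping duplicates."""
    collected = {}
    for person, uni in rows:
        bucket = collected.setdefault(person, [])
        if uni is not None:  # Cypher's collect() skips nulls
            bucket.append(uni)
    return collected

binding_rows = optional_match_rows(people, attended)
result = collect_by_person(binding_rows)
print(result)
```

Here "ann" keeps both "MIT" entries because each relationship row survives to the grouping step, and "cy" still appears with an empty list because the optional row is retained.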

CI guard update

  • Updated bin/ci_cypher_surface_guard_baseline.json to match intentional lowering.py growth from this feature and follow-up safety constraints.
  • Verified guard locally with python bin/ci_cypher_surface_guard.py (pass).
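As a rough sketch of how a baseline-backed size guard of this kind might work, consider the following. It assumes the baseline maps file paths to allowed line counts, which is inferred from the follow-up commit message on this page ("8688 → 8700") and not from the guard script itself. The function name and the dictionary layout are hypothetical, and the numbers are borrowed from that later commit purely for illustration.

```python
def find_growth(baseline, current):
    """Return {path: (allowed, actual)} for files larger than their baseline."""
    return {
        path: (baseline[path], lines)
        for path, lines in current.items()
        if path in baseline and lines > baseline[path]
    }

# Hypothetical baseline layout: file path -> allowed line count.
path = "graphistry/compute/gfql/cypher/lowering.py"
old_baseline = {path: 8688}
new_baseline = {path: 8700}
current = {path: 8700}

stale = find_growth(old_baseline, current)  # growth not yet acknowledged
fresh = find_growth(new_baseline, current)  # baseline updated to match
print(stale)
print(fresh)
```

Under this assumption, intentional growth fails the guard until the baseline file is updated in the same change, which is the kind of update the bullets above describe.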

Validation

  • ./bin/ruff.sh graphistry/compute/gfql/cypher/lowering.py graphistry/tests/compute/gfql/cypher/test_lowering.py
  • python -m pytest -q graphistry/tests/compute/gfql/cypher/test_lowering.py
  • python -m pytest -q tests/gfql/ref/test_df_executor_core.py -k "missing_alias_raises or missing_where_column_raises_during_input_build or where_missing_column_after_prior_call_still_rejected"
  • python bin/ci_cypher_surface_guard.py

Notes

  • This PR focuses on relationship-row multiplicity for aggregate expressions over scalar/property projections.
  • Whole-row carrier grouping shape (RETURN a, count(*)) remains on the legacy representation path and is not expanded in this change.

@lmeyerov lmeyerov merged commit 6440056 into master May 8, 2026
137 checks passed
@lmeyerov lmeyerov deleted the issue-1343-rel-multiplicity-aggregates branch May 8, 2026 00:33
lmeyerov added a commit that referenced this pull request May 8, 2026
Rebased onto master after #1348 (#1343 aggregate fix) landed; the
narrow-shape flatten wiring at compile_cypher_query adds 12 lines to
lowering.py, so 8688 → 8700.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Cypher aggregates (collect, count(*), sum) over relationship-pattern rows rejected: runtime collapses rows before aggregation
