Cypher/GFQL: introduce a first-class row-carrier IR for multi-stage vectorized row semantics #989

## Problem

The current local Cypher/GFQL execution model does not have a first-class row-carrier IR.

As a result, row-seeded semantics are handled in feature-specific ways instead of through one reusable contract. The bounded reentry work in PR #975 is the clearest example, but the same architectural pressure will recur for:

- multi-stage `MATCH ... WITH ... MATCH ...`
- multi-alias `WITH` / `RETURN` / `ORDER BY`
- `OPTIONAL MATCH` null extension
- grouped row-preserving aggregation
- future vectorized GPU/backend audits

Today those semantics are spread across compiler rewrites, projection metadata, hidden columns, and runtime stitching. That is workable for a bounded slice, but it is not a good long-term model for a graph query compiler/runtime.

## Why This Matters

Without a shared row model, the current approach:

- makes row semantics harder to extend cleanly
- encourages feature-by-feature protocol glue instead of a reusable IR
- complicates vectorization reasoning and GPU/backend parity review
- increases the chance that future Cypher support will grow `lowering.py` and `gfql_unified.py` in ad hoc ways

## Proposed Direction

Introduce a first-class row-carrier / seeded-row IR for local Cypher/GFQL execution.

Core ideas:

1. Represent carried row state explicitly.
   - row ids / seed ids
   - bound aliases
   - carried scalar columns
   - ordering contract
   - null-extension contract where applicable

2. Lower row-seeded features into that IR instead of feature-specific side channels.

3. Keep the implementation columnar/vectorized.
   - pandas/cudf-friendly
   - no generic Python row-loop fallback

4. Let specific features become clients of the same row model.
   - bounded reentry
   - later multi-alias `WITH`
   - later `OPTIONAL MATCH`
   - later multiplicity-preserving grouped aggregation
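
To make the core ideas above concrete, here is a minimal sketch of what a columnar row carrier could look like on pandas. Everything in it is an assumption for illustration only: `RowCarrier`, `extend`, the `__seed_id__` column, and the `src`/`dst` edge columns are hypothetical names, not existing GFQL APIs, and the real IR may look quite different.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple

import pandas as pd

SEED_ID = "__seed_id__"  # hypothetical reserved column tracking the originating seed row


@dataclass(frozen=True)
class RowCarrier:
    """Hypothetical carrier: one DataFrame row per carried row."""
    frame: pd.DataFrame                      # SEED_ID + alias id columns + scalar columns
    aliases: Dict[str, str]                  # bound alias -> column holding its node ids
    scalars: Tuple[str, ...] = ()            # carried scalar column names
    order_by: Tuple[str, ...] = ()           # ordering contract; empty means unordered
    nullable_aliases: Tuple[str, ...] = ()   # aliases that may be null-extended


def extend(carrier: RowCarrier, edges: pd.DataFrame, src_alias: str,
           dst_alias: str, optional: bool = False) -> RowCarrier:
    """Bind dst_alias by joining carried rows to an edge table with 'src'/'dst' columns.

    optional=True keeps unmatched rows with a null dst_alias (null extension).
    """
    src_col = carrier.aliases[src_alias]
    step = edges[["src", "dst"]].rename(columns={"src": src_col, "dst": dst_alias})
    # A single columnar merge; no per-row Python loop.
    out = carrier.frame.merge(step, on=src_col, how="left" if optional else "inner")
    if carrier.order_by:
        # Re-establish the ordering contract explicitly rather than relying on merge order.
        out = out.sort_values(list(carrier.order_by), kind="stable")
    out = out.reset_index(drop=True)
    nullable = carrier.nullable_aliases + ((dst_alias,) if optional else ())
    return replace(carrier, frame=out,
                   aliases={**carrier.aliases, dst_alias: dst_alias},
                   nullable_aliases=nullable)


# Seed rows as they might arrive from an earlier `MATCH ... WITH a, score` stage.
seeds = RowCarrier(
    frame=pd.DataFrame({SEED_ID: [0, 1, 2],
                        "a": ["n1", "n2", "n3"],
                        "score": [10, 20, 30]}),
    aliases={"a": "a"},
    scalars=("score",),
    order_by=(SEED_ID,),
)
edges = pd.DataFrame({"src": ["n1", "n1", "n2"], "dst": ["n4", "n5", "n6"]})

matched = extend(seeds, edges, "a", "b")                       # MATCH-style: drops seed 2
optional_rows = extend(seeds, edges, "a", "b", optional=True)  # OPTIONAL MATCH-style
```

In this sketch the seed id column preserves multiplicity when one carried row fans out into several, and the same `extend` step serves both the reentry-style inner join and the null-extending case. Since only `merge`, `rename`, and `sort_values` are used, the same shape is plausibly portable to cudf, though that would need to be verified against the backend.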

## Relationship To Other Issues

- #987 is the smaller, immediate cleanup issue for the current bounded-reentry implementation.
- This issue is the broader architecture lane that goes beyond bounded reentry and aims at a reusable row-semantics model.
- The two should inform each other, but this issue is intentionally larger in scope.

## Non-Goals

- not a request to reopen PR #975
- not a demand for immediate semantic expansion
- not a license to add non-vectorized fallback execution

## Success Criteria

- bounded reentry can be expressed as a normal client of the row IR
- future row-seeded Cypher features stop requiring bespoke hidden-column / metadata handshakes
- vectorization/backend expectations are clearer to audit
- `lowering.py` and runtime orchestration can shrink over time instead of accumulating one-off row mechanics

## Context

PR #975 is landing the bounded-reentry feature/hardening slice.
Issue #987 tracks the narrower follow-on cleanup for that implementation.
This issue tracks the broader architectural direction beyond that one feature.
