fix: Phase 7 — pattern predicates, var-length exact, OPTIONAL MATCH correctness#376
Conversation
…orrectness (#306, #302) - Add PatternPredicate AST node and grammar rule for WHERE (n)-[:R]->() as boolean existence checks - Add var_length_exact (*N) grammar for fixed-count var-length patterns - Add _evaluate_pattern_predicate and _var_length_reachable in evaluator for pattern predicate evaluation - Add UndefinedVariable validation for named vars in pattern predicates - Add UnexpectedSyntax validation blocking pattern predicates in RETURN/WITH/SET contexts - Add ApplyOptional operator for atomic multi-hop OPTIONAL MATCH left-join semantics - Add optional flag to ExpandVariableLength for null-src propagation in OPTIONAL context - Add dst_labels field to OptionalExpandEdges to inline label filter (prevents post-filter from dropping null rows) - Fix _null_fill to preserve pre-bound variables (triadic friend-of-friend regression fix) - Fix _execute_variable_expand to respect bound dst_var (only yield paths to specific target node) - Fix var-length OPTIONAL MATCH to null-fill when no paths found (not drop the row) - Fix conftest raw Python list handling in comparable value helper Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
WalkthroughAdds a PatternPredicate AST node and end-to-end support: parser + transformer, evaluator traversal (including var-length BFS, direction/type/label checks), planner/operator changes for OPTIONAL semantics (ApplyOptional, null-filling), and executor adaptations for optional/variable-length expansion and ApplyOptional execution. Changes
Sequence DiagramsequenceDiagram
participant Client
participant Parser
participant Planner
participant Evaluator
participant Executor
participant Graph
Client->>Parser: Parse query with WHERE pattern predicate
Parser->>Planner: Emit PatternPredicate AST
Planner->>Evaluator: Include PatternPredicate in plan/WHERE
Evaluator->>Executor: Request predicate evaluation with ExecutionContext
Executor->>Graph: Resolve source variable binding
Graph-->>Executor: Return node / NULL / unbound
alt source resolved and not NULL
Executor->>Graph: Traverse hops (direction/type/var-length)
Graph-->>Executor: Return candidate paths
Executor->>Evaluator: Apply dst label / pre-bound dst checks
alt matching full path found
Evaluator-->>Client: CypherBool(True)
else
Evaluator-->>Client: CypherBool(False)
end
else source NULL
Evaluator-->>Client: CypherNull()
else source unbound
Evaluator-->>Client: CypherBool(False)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 60 minutes.Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/graphforge/planner/planner.py (1)
1226-1263:⚠️ Potential issue | 🟠 Major | ⚡ Quick winTrack only newly introduced vars for
ApplyOptional, and includepath_var.
introduced_varscurrently appends everyrel_var/dst_vareven when that name was already bound in the outer scope, whilepath_varis never added before theApplyOptionalis built. On a failed multi-hop optional, that can both clobber outer bindings and leave the optional path variable missing entirely.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/planner/planner.py` around lines 1226 - 1263, Only newly-introduced variables should be passed to ApplyOptional and the path_var must be included if it was bound by this pattern: when creating introduced_vars, check self._type_context.has_variable(var) before appending rel_var or dst_var (bind/validate as you do, but only add to introduced_vars when it was not previously bound), and ensure path_var is added to introduced_vars (or added to the list right after you bind it with self._type_context.bind_variable(path_var)) so ApplyOptional(subplan=subplan, null_fill_vars=introduced_vars) contains the newly introduced rel/dst vars and the path_var.
🧹 Nitpick comments (1)
src/graphforge/ast/expression.py (1)
486-497: Consider usingtupleforpartsto fully enforce immutability and prevent unintended mutations.While
frozen=Trueprevents attribute reassignment, thepartslist field can still be mutated in place. The guideline requires immutable frozen models for AST nodes. Switching toparts: tuple[Any, ...]would fully enforce immutability and be more explicit. Note that other collection fields in nearby AST nodes (e.g.,FunctionCall.args,CaseExpression.when_clauses) currently uselist, so this change would diverge from the existing pattern—consider aligning similar collection fields across AST nodes if pursuing this refactoring.♻️ Suggested adjustment
- parts: list[Any] = Field(..., description="Pattern parts (alternating NodePattern/RelationshipPattern)") + parts: tuple[Any, ...] = Field(..., description="Pattern parts (alternating NodePattern/RelationshipPattern)")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/ast/expression.py` around lines 486 - 497, PatternPredicate currently declares parts as a mutable list which can be mutated even though model_config sets frozen=True; change the field type on PatternPredicate (the parts attribute) from list[Any] to an immutable tuple type (tuple[Any, ...]) and update its Field description accordingly, then update any code that constructs PatternPredicate instances to pass a tuple (and consider adjusting related AST nodes like FunctionCall.args and CaseExpression.when_clauses for consistency if you choose to standardize on tuples) so the model is truly immutable at runtime.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/graphforge/executor/evaluator.py`:
- Around line 1705-1741: The current DFS uses a node-based `visited` set (stack
initialized with `(start, 0, {start.id})` and checks `if nxt_id not in visited`)
which prevents revisiting the same node via different edges and breaks cyclic
variable-length path matches; change the visitation tracking to be
path/edge-based instead: replace the node-id `visited` set with a set of edge
ids (or edge tuples) and update the checks and pushes accordingly (use `edge.id`
or `(edge.src.id, edge.dst.id)` when checking/adding instead of `nxt_id`, and
propagate `visited | {edge_id}` when pushing to `stack`), leaving the rest of
the traversal logic (edge selection via `executor.graph.get_outgoing_edges` /
`get_incoming_edges`, direction handling, and depth checks) intact so branches
can revisit nodes via different edges while still preventing infinite edge-reuse
loops.
- Around line 1652-1659: The var-length branch currently extends next_nodes with
the raw results of _var_length_reachable(executor, node, rel_pattern.types,
dir_val, min_h, max_h) without applying the destination label and
bound-destination checks used in the fixed-hop path; update the var-length
handling so you filter the reachable candidates through the same
destination-constraint logic (the same predicates applied in the fixed-hop
branch around the fixed-hop path handling) before calling next_nodes.extend,
i.e. call _var_length_reachable, then for each candidate apply the destination
label checks and bound-destination validation (using the same helper/condition
used in the fixed-hop code) and only append those that pass.
- Around line 1606-1608: The current logic in the pattern predicate evaluator
treats an unbound source variable the same as an explicit NULL; change it so an
unbound variable (i.e., src_var not in ctx.bindings) returns CypherBool(False)
while an explicitly bound NULL (src_node is None or isinstance(src_node,
CypherNull)) still returns CypherNull(); update the block that reads src_node =
ctx.get(src_var) if src_var in ctx.bindings else None to first check membership
in ctx.bindings and return CypherBool(False) for unbound, and only after
confirming bound, treat None/CypherNull as CypherNull() (adjust analogous logic
for any dst/target handling if present), referencing src_var, src_node,
ctx.bindings, CypherNull and CypherBool in evaluator.py.
In `@src/graphforge/parser/parser.py`:
- Around line 1119-1127: The transformer now emits PatternPredicate from
pattern_predicate, but _parse_rel_parts still filters only certain expression
classes and thus drops embedded predicates; update _parse_rel_parts to recognize
graphforge.ast.expression.PatternPredicate (in addition to existing
node/relationship expression types) when harvesting pattern_where and to attach
such a PatternPredicate into the RelationshipPattern parts (or predicate slot)
so constructs like ()-[:R WHERE (a)-[:S]->()]->() are preserved; locate
_parse_rel_parts and add PatternPredicate to its accepted-types check and the
code path that appends/assigns parts/predicate to RelationshipPattern.
In `@src/graphforge/planner/planner.py`:
- Around line 1312-1319: Replace the shallow isinstance check over
with_clause.items that only catches top-level PatternPredicate with a call to
the recursive validator: remove the current loop that checks
isinstance(item.expression, PatternPredicate) and instead call
_validate_no_pattern_predicate(with_clause) (the existing recursive validator
used by _validate_return_clause()) so nested pattern predicates (e.g. WITH NOT
((n)-[:R]->()) AS ok) are rejected as well.
- Around line 1190-1198: The current code flattens dst_pattern.labels into
flat_dst_labels which collapses OR-of-ANDs into a single list and causes
patterns like (:B|C) to be treated as (:B:C); instead preserve the disjunctive
groups and pass a list-of-conjunctions to OptionalExpandEdges and ScanNodes.
Replace the flat_dst_labels logic with code that keeps each conj as a list (e.g.
for each conj in dst_pattern.labels use conj if it's already a list else [conj])
and set inline_dst_labels to that list-of-lists when dst_was_prebound is false
(otherwise keep it as []). Ensure you update references to inline_dst_labels and
remove flat_dst_labels so the executor still receives the OR-of-ANDs structure
it expects.
- Around line 1154-1165: The planner currently only folds relationship
properties into combined_where and never applies destination-node constraints,
causing OPTIONAL MATCH to drop dst_pattern.properties and label filters for
newly introduced destination variables; update the block around combined_where
(and the analogous block at the other occurrence) to also handle
dst_pattern.properties: if dst_pattern.properties exists ensure dst_var is
created (generate one if None), build dst_prop_pred via
self._properties_to_predicate(dst_var, dst_pattern.properties), and then combine
that predicate with the existing combined_where using the same BinaryOp("AND")
logic so both rel and destination node predicates are enforced for single-hop,
multi-hop and var-length optional planning paths (also ensure any label
filtering for the destination var is applied consistently regardless of
is_single_hop).
- Around line 1111-1123: When src_var is already bound the code currently skips
type checks and label/property predicates; update the branch where
self._type_context.has_variable(src_var) is True to still call
self._type_context.validate_compatible(src_var, VariableType.NODE) and, if
src_pattern.labels or src_pattern.properties exist, build the same predicate via
self._properties_to_predicate(src_var, src_pattern.properties) (and include
label checks as done for unbound vars) and append a Filter(predicate=...) so
existing variables are validated against the pattern rather than silently
expanding (affecting OPTIONAL MATCH null-filling semantics); keep existing
binding logic only in the unbound branch but mirror validation/predicate
application for the bound case.
---
Outside diff comments:
In `@src/graphforge/planner/planner.py`:
- Around line 1226-1263: Only newly-introduced variables should be passed to
ApplyOptional and the path_var must be included if it was bound by this pattern:
when creating introduced_vars, check self._type_context.has_variable(var) before
appending rel_var or dst_var (bind/validate as you do, but only add to
introduced_vars when it was not previously bound), and ensure path_var is added
to introduced_vars (or added to the list right after you bind it with
self._type_context.bind_variable(path_var)) so ApplyOptional(subplan=subplan,
null_fill_vars=introduced_vars) contains the newly introduced rel/dst vars and
the path_var.
---
Nitpick comments:
In `@src/graphforge/ast/expression.py`:
- Around line 486-497: PatternPredicate currently declares parts as a mutable
list which can be mutated even though model_config sets frozen=True; change the
field type on PatternPredicate (the parts attribute) from list[Any] to an
immutable tuple type (tuple[Any, ...]) and update its Field description
accordingly, then update any code that constructs PatternPredicate instances to
pass a tuple (and consider adjusting related AST nodes like FunctionCall.args
and CaseExpression.when_clauses for consistency if you choose to standardize on
tuples) so the model is truly immutable at runtime.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 318014d4-390e-40bd-9b68-ac50186fc2b1
📒 Files selected for processing (8)
src/graphforge/ast/expression.pysrc/graphforge/executor/evaluator.pysrc/graphforge/executor/executor.pysrc/graphforge/parser/cypher.larksrc/graphforge/parser/parser.pysrc/graphforge/planner/operators.pysrc/graphforge/planner/planner.pytests/tck/conftest.py
| if rel_pattern.min_hops is not None or rel_pattern.max_hops is not None: | ||
| min_h = rel_pattern.min_hops if rel_pattern.min_hops is not None else 1 | ||
| max_h = rel_pattern.max_hops # None = unbounded | ||
| reachable = _var_length_reachable( | ||
| executor, node, rel_pattern.types, dir_val, min_h, max_h | ||
| ) | ||
| next_nodes.extend(reachable) | ||
| continue |
There was a problem hiding this comment.
Apply destination constraints to var-length candidates before appending
Lines 1652-1659 append _var_length_reachable(...) results directly, skipping the destination label and bound-destination checks used in the fixed-hop path (Lines 1671-1682). This can produce false positives for constrained predicates.
Suggested fix
if rel_pattern.min_hops is not None or rel_pattern.max_hops is not None:
min_h = rel_pattern.min_hops if rel_pattern.min_hops is not None else 1
max_h = rel_pattern.max_hops # None = unbounded
reachable = _var_length_reachable(
executor, node, rel_pattern.types, dir_val, min_h, max_h
)
- next_nodes.extend(reachable)
+ for dst_node in reachable:
+ if dst_pattern.labels:
+ node_labels = executor.graph.get_node_labels(dst_node.id)
+ if not all(
+ lbl in node_labels
+ for lbl_group in dst_pattern.labels
+ for lbl in lbl_group
+ ):
+ continue
+ if dst_pattern.variable and dst_pattern.variable in ctx.bindings:
+ bound_dst = ctx.bindings[dst_pattern.variable]
+ if isinstance(bound_dst, NodeRef) and bound_dst.id != dst_node.id:
+ continue
+ next_nodes.append(dst_node)
continue🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/executor/evaluator.py` around lines 1652 - 1659, The
var-length branch currently extends next_nodes with the raw results of
_var_length_reachable(executor, node, rel_pattern.types, dir_val, min_h, max_h)
without applying the destination label and bound-destination checks used in the
fixed-hop path; update the var-length handling so you filter the reachable
candidates through the same destination-constraint logic (the same predicates
applied in the fixed-hop branch around the fixed-hop path handling) before
calling next_nodes.extend, i.e. call _var_length_reachable, then for each
candidate apply the destination label checks and bound-destination validation
(using the same helper/condition used in the fixed-hop code) and only append
those that pass.
| stack = [(start, 0, {start.id})] | ||
| while stack: | ||
| node, depth, visited = stack.pop() | ||
| if min_hops <= depth: | ||
| result.append(node) | ||
| if max_hops is not None and depth >= max_hops: | ||
| continue | ||
|
|
||
| if direction == "OUT": | ||
| edges = executor.graph.get_outgoing_edges(node.id) | ||
| elif direction == "IN": | ||
| edges = executor.graph.get_incoming_edges(node.id) | ||
| else: | ||
| outgoing = executor.graph.get_outgoing_edges(node.id) | ||
| incoming = executor.graph.get_incoming_edges(node.id) | ||
| seen_ids = set() | ||
| edges = [] | ||
| for e in outgoing + incoming: | ||
| if e.id not in seen_ids: | ||
| edges.append(e) | ||
| seen_ids.add(e.id) | ||
|
|
||
| if edge_types: | ||
| edges = [e for e in edges if e.type in edge_types] | ||
|
|
||
| for edge in edges: | ||
| if direction == "IN": | ||
| nxt_id = edge.src.id | ||
| elif direction == "OUT": | ||
| nxt_id = edge.dst.id | ||
| else: | ||
| nxt_id = edge.dst.id if edge.src.id == node.id else edge.src.id | ||
|
|
||
| if nxt_id not in visited: | ||
| nxt_node = executor.graph.get_node(nxt_id) | ||
| if nxt_node: | ||
| stack.append((nxt_node, depth + 1, visited | {nxt_id})) |
There was a problem hiding this comment.
Node-based visited tracking can miss valid cyclic var-length paths
The traversal state uses visited node ids (Line 1705 / Line 1738), which blocks revisiting a node via different edges. That can incorrectly fail existence checks for cyclic patterns at exact/deeper hop counts.
A safer approach is to track path edge usage (or path state) instead of global node-visited pruning for each branch.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/executor/evaluator.py` around lines 1705 - 1741, The current
DFS uses a node-based `visited` set (stack initialized with `(start, 0,
{start.id})` and checks `if nxt_id not in visited`) which prevents revisiting
the same node via different edges and breaks cyclic variable-length path
matches; change the visitation tracking to be path/edge-based instead: replace
the node-id `visited` set with a set of edge ids (or edge tuples) and update the
checks and pushes accordingly (use `edge.id` or `(edge.src.id, edge.dst.id)`
when checking/adding instead of `nxt_id`, and propagate `visited | {edge_id}`
when pushing to `stack`), leaving the rest of the traversal logic (edge
selection via `executor.graph.get_outgoing_edges` / `get_incoming_edges`,
direction handling, and depth checks) intact so branches can revisit nodes via
different edges while still preventing infinite edge-reuse loops.
| def pattern_predicate(self, items): | ||
| """Transform pattern predicate expression. | ||
|
|
||
| Syntax: (n)-[:REL]->() used as a boolean existence check in WHERE. | ||
| Items alternate: NodePattern, RelationshipPattern, NodePattern, ... | ||
| """ | ||
| from graphforge.ast.expression import PatternPredicate | ||
|
|
||
| return PatternPredicate(parts=list(items)) |
There was a problem hiding this comment.
Thread PatternPredicate through relationship-pattern predicate parsing.
This transformer now emits PatternPredicate, but _parse_rel_parts() still only recognizes a fixed set of expression classes when harvesting pattern_where. A predicate like ()-[:R WHERE (a)-[:S]->()]->() will be parsed and then silently dropped instead of being attached to the RelationshipPattern. Please add PatternPredicate to that path before shipping this feature.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/parser/parser.py` around lines 1119 - 1127, The transformer
now emits PatternPredicate from pattern_predicate, but _parse_rel_parts still
filters only certain expression classes and thus drops embedded predicates;
update _parse_rel_parts to recognize graphforge.ast.expression.PatternPredicate
(in addition to existing node/relationship expression types) when harvesting
pattern_where and to attach such a PatternPredicate into the RelationshipPattern
parts (or predicate slot) so constructs like ()-[:R WHERE (a)-[:S]->()]->() are
preserved; locate _parse_rel_parts and add PatternPredicate to its
accepted-types check and the code path that appends/assigns parts/predicate to
RelationshipPattern.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…306/#302) - Only add newly-introduced vars to ApplyOptional null_fill_vars (pre-bound dst/rel were incorrectly included, causing spurious nulls) - Include path_var in null_fill_vars before building ApplyOptional (was bound after ApplyOptional creation, so never null-filled on miss) - Unbound src in pattern predicate returns CypherBool(False) not CypherNull (correct semantics: unbound var → pattern cannot exist → False) - WITH clause pattern predicate check uses recursive _validate_no_pattern_predicate instead of shallow isinstance (catches nested pattern predicates in expressions) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (4)
src/graphforge/planner/planner.py (4)
1154-1170:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDestination node property predicates are still not planned in OPTIONAL MATCH.
This block only combines relationship-side predicates;
dst_pattern.propertiesare not incorporated, so destination inline node predicates can be dropped in optional plans.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/planner/planner.py` around lines 1154 - 1170, The code currently only folds relationship properties into combined_where but omits destination node inline predicates (dst_pattern.properties), causing them to be lost for OPTIONAL MATCH; update the same block in planner.py so that if dst_pattern.properties exists you generate/ensure a destination variable (e.g., dst_var similar to rel_var or via self._generate_anonymous_variable()), call self._properties_to_predicate(dst_var, dst_pattern.properties) to get a dst_prop_pred, and then AND it into combined_where (using the same BinaryOp op="AND" pattern used for rel_prop_pred) so both rel_prop_pred and dst_prop_pred are preserved in combined_where when hop_idx == 0 and is_single_hop.
1111-1123:⚠️ Potential issue | 🟠 Major | ⚡ Quick winValidate pre-bound
src_varagainst source pattern constraints.When
src_varis already in scope, this branch skips type compatibility and source label/property predicates, soOPTIONAL MATCH (a:Label {…})-...can expand from non-matchingainstead of null-filling the optional pattern.Suggested fix
- # Emit ScanNodes only if src is not already bound - if not self._type_context.has_variable(src_var): + # Always validate type compatibility for src + self._type_context.validate_compatible(src_var, VariableType.NODE) + if not self._type_context.has_variable(src_var): operators.append( ScanNodes( variable=src_var, labels=src_pattern.labels if src_pattern.labels else None, ) ) - self._type_context.validate_compatible(src_var, VariableType.NODE) self._type_context.bind_variable(src_var, VariableType.NODE) - if src_pattern.properties: - predicate = self._properties_to_predicate(src_var, src_pattern.properties) - operators.append(Filter(predicate=predicate)) + elif src_pattern.labels: + operators.append(ScanNodes(variable=src_var, labels=src_pattern.labels)) + if src_pattern.properties: + predicate = self._properties_to_predicate(src_var, src_pattern.properties) + operators.append(Filter(predicate=predicate))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/planner/planner.py` around lines 1111 - 1123, When src_var is already bound (i.e., self._type_context.has_variable(src_var) is True) ensure you still validate it against the source pattern: call self._type_context.validate_compatible(src_var, VariableType.NODE) to enforce node type, check that the existing variable's labels include src_pattern.labels and if not raise or handle a mismatch, and if src_pattern.properties exists build the predicate via self._properties_to_predicate(src_var, src_pattern.properties) and append a Filter(predicate=...) so pre-bound variables are filtered by the pattern’s labels/properties just like newly scanned variables; use the same helpers (_type_context, validate_compatible, _properties_to_predicate, Filter) and do not skip these checks when has_variable returns True.
2279-2285:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse recursive pattern-predicate validation in RETURN items.
This top-level
isinstance(..., PatternPredicate)check still misses nested cases;_validate_no_pattern_predicate()already exists for recursive traversal.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/planner/planner.py` around lines 2279 - 2285, The current top-level isinstance check on return_clause.items misses nested PatternPredicate cases; instead call the recursive validator _validate_no_pattern_predicate on each item's expression (e.g., inside the loop over return_clause.items call self._validate_no_pattern_predicate(item.expression) or the equivalent validator function) so nested pattern predicates are discovered and a SyntaxError is raised consistently for RETURN projections; remove the plain isinstance(..., _PPType) check and use _validate_no_pattern_predicate for each item.expression.
1199-1205:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon’t flatten disjunctive destination label groups.
Flattening
dst_pattern.labelschanges OR-of-AND semantics (e.g.(:B|C)becomes(:B:C)downstream).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/planner/planner.py` around lines 1199 - 1205, The current logic flattens dst_pattern.labels into flat_dst_labels which destroys OR-of-AND semantics; instead preserve the original list-of-conjunctions structure and stop merging inner lists. Replace the flat_dst_labels construction with using dst_pattern.labels (or [] if None) directly and set inline_dst_labels = dst_pattern.labels if not dst_was_prebound else []; update any downstream uses that expect flat lists to handle the conjunction-list shape or convert there with an explicit deliberate transform rather than here.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/graphforge/executor/executor.py`:
- Around line 3403-3406: The loop over op.subplan passes a constant op_index of
0 to each sub-operator, causing position-dependent operators (e.g.,
_execute_aggregate which checks op_index < total_ops - 1) to behave incorrectly;
update the loop in Executor._execute_operator where it iterates "for sub_op in
op.subplan" to use enumerate and pass the actual index and total_ops (e.g.,
replace the constant 0 with the enumerated index and keep len(op.subplan) as
total_ops) so each sub-operator receives its correct op_index (affecting
ApplyOptional subplans and any operator relying on position).
In `@src/graphforge/planner/planner.py`:
- Around line 1155-1156: The absorbed WHERE (where_predicate) is only being
passed into combined_where for single-hop (combined_where = where_predicate if
(hop_idx == 0 and is_single_hop) else None) so var-length and multi-hop
OptionalExpandEdges lose the predicate; fix by either gating absorption at the
caller (stop absorbing into where_predicate unless caller confirms single-hop)
or by threading combined_where (or an equivalent predicate) into the var-length
and multi-hop branches and into the OptionalExpandEdges construction so the
predicate is applied for all branches; update usages in _plan_simple_query, the
OptionalExpandEdges creation sites, and any logic that checks
hop_idx/is_single_hop to ensure predicates aren’t silently dropped.
---
Duplicate comments:
In `@src/graphforge/planner/planner.py`:
- Around line 1154-1170: The code currently only folds relationship properties
into combined_where but omits destination node inline predicates
(dst_pattern.properties), causing them to be lost for OPTIONAL MATCH; update the
same block in planner.py so that if dst_pattern.properties exists you
generate/ensure a destination variable (e.g., dst_var similar to rel_var or via
self._generate_anonymous_variable()), call
self._properties_to_predicate(dst_var, dst_pattern.properties) to get a
dst_prop_pred, and then AND it into combined_where (using the same BinaryOp
op="AND" pattern used for rel_prop_pred) so both rel_prop_pred and dst_prop_pred
are preserved in combined_where when hop_idx == 0 and is_single_hop.
- Around line 1111-1123: When src_var is already bound (i.e.,
self._type_context.has_variable(src_var) is True) ensure you still validate it
against the source pattern: call self._type_context.validate_compatible(src_var,
VariableType.NODE) to enforce node type, check that the existing variable's
labels include src_pattern.labels and if not raise or handle a mismatch, and if
src_pattern.properties exists build the predicate via
self._properties_to_predicate(src_var, src_pattern.properties) and append a
Filter(predicate=...) so pre-bound variables are filtered by the pattern’s
labels/properties just like newly scanned variables; use the same helpers
(_type_context, validate_compatible, _properties_to_predicate, Filter) and do
not skip these checks when has_variable returns True.
- Around line 2279-2285: The current top-level isinstance check on
return_clause.items misses nested PatternPredicate cases; instead call the
recursive validator _validate_no_pattern_predicate on each item's expression
(e.g., inside the loop over return_clause.items call
self._validate_no_pattern_predicate(item.expression) or the equivalent validator
function) so nested pattern predicates are discovered and a SyntaxError is
raised consistently for RETURN projections; remove the plain isinstance(...,
_PPType) check and use _validate_no_pattern_predicate for each item.expression.
- Around line 1199-1205: The current logic flattens dst_pattern.labels into
flat_dst_labels which destroys OR-of-AND semantics; instead preserve the
original list-of-conjunctions structure and stop merging inner lists. Replace
the flat_dst_labels construction with using dst_pattern.labels (or [] if None)
directly and set inline_dst_labels = dst_pattern.labels if not dst_was_prebound
else []; update any downstream uses that expect flat lists to handle the
conjunction-list shape or convert there with an explicit deliberate transform
rather than here.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 52976807-6243-4eda-8d9a-5364a242fe97
📒 Files selected for processing (4)
src/graphforge/ast/expression.pysrc/graphforge/executor/evaluator.pysrc/graphforge/executor/executor.pysrc/graphforge/planner/planner.py
🚧 Files skipped from review as they are similar to previous changes (2)
- src/graphforge/executor/evaluator.py
- src/graphforge/ast/expression.py
- executor: use enumerate in ApplyOptional subplan loop so each sub-op receives its correct op_index (fixes Aggregate position detection) - planner: pre-bound src_var in OPTIONAL MATCH now validates type and applies label/property filters (previously skipped entirely) - planner: change OptionalExpandEdges.dst_labels to list[list[str]] to preserve disjunction-of-conjunctions semantics; fix executor check to use _node_matches_labels instead of naive all() - planner: thread absorbed WHERE predicate into var-length and multi-hop OPTIONAL MATCH branches (previously only applied to single-hop) - planner: fold dst_pattern.properties into combined_where for single-hop OptionalExpandEdges; emit as Filter for var-length/multi-hop so dst_var is bound before evaluation - planner: use recursive _validate_no_pattern_predicate for RETURN clause (catches nested PatternPredicates, not just top-level) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/graphforge/executor/executor.py (1)
3286-3338:⚠️ Potential issue | 🟠 Major | ⚡ Quick winApply
dst_labelsbefore accepting a pre-bound destination.This branch filters by edge reachability when
dst_varis already bound, but it never validatesop.dst_labelsagainst that bound node. A pattern like(a)-[:R]->(b:Label)will therefore match any reachable pre-boundb, even whenbdoes not satisfy the inline destination labels.🐛 Proposed fix
if dst_bound: from graphforge.types.graph import NodeRef as _NodeRef bound_dst = ctx.get(op.dst_var) if isinstance(bound_dst, _NodeRef): + if op.dst_labels and not self._node_matches_labels(bound_dst, op.dst_labels): + result.append(_null_fill(ctx)) + continue bound_dst_id = bound_dst.id🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/executor/executor.py` around lines 3286 - 3338, When dst_var is pre-bound (the branch starting with dst_bound = op.dst_var in ctx.bindings), you must enforce op.dst_labels against the bound_dst before accepting matches: after obtaining bound_dst = ctx.get(op.dst_var) and confirming it's a NodeRef, check that the node's labels include all labels in op.dst_labels (if any); if the bound node does not satisfy op.dst_labels, treat it as no match (append the same _null_fill(ctx) and skip further edge/path handling), otherwise proceed with the existing matching_edges, where_predicate evaluation, CypherPath binding, and result population. Ensure you reference op.dst_var, op.dst_labels, ctx.get, bound_dst, _null_fill, edges, op.direction, src_node, evaluate_expression, ExecutionContext and CypherPath when locating and updating the code.
🧹 Nitpick comments (2)
src/graphforge/planner/operators.py (1)
706-709: ⚡ Quick winReject empty
ApplyOptional.subplanat construction.An empty
subplancurrently becomes a no-op branch in_execute_apply_optional, so a malformed plan would silently drop OPTIONAL null-filling instead of failing fast.♻️ Proposed fix
- subplan: list[Any] = Field(..., description="Inner operators to execute per outer row") + subplan: list[Any] = Field( + ..., min_length=1, description="Inner operators to execute per outer row" + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/planner/operators.py` around lines 706 - 709, ApplyOptional currently allows an empty subplan which makes _execute_apply_optional silently skip null-filling; add validation to reject empty subplan at construction by enforcing that ApplyOptional.subplan is non-empty (e.g., a pydantic `@validator` or __init__ check) and raise a clear ValueError if len(subplan) == 0; update any tests that construct ApplyOptional to provide at least one inner operator. Ensure the check references the ApplyOptional.subplan field so malformed plans fail fast.src/graphforge/planner/planner.py (1)
1296-1307: 💤 Low value
dst_was_prebound_checkmay incorrectly identify anonymous destinations as pre-bound.When
dst_pattern.variableisNone(anonymous node), the checkdst_pattern.variable not in introduced_vars[-1:]evaluates toTrue(sinceNoneis never in the list), causing the code to treat anonymous destinations as pre-bound and emit a redundantScanNodesfor label filtering.Consider guarding with
dst_pattern.variable is not None:dst_was_prebound_check = ( dst_pattern.variable is not None - and dst_pattern.variable not in introduced_vars[-1:] + and dst_pattern.variable not in introduced_vars )The simplified check
not in introduced_vars(without slicing) is clearer and handles the case where the variable was introduced in an earlier hop of the same pattern.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/planner/planner.py` around lines 1296 - 1307, The dst_was_prebound_check logic can misclassify anonymous destinations as pre-bound; change the guard so you only consider a destination pre-bound when dst_pattern.variable is not None and that variable truly wasn't introduced earlier (use dst_pattern.variable is not None and dst_pattern.variable not in introduced_vars rather than the current slice check). Update the condition around ScanNodes creation (referencing dst_was_prebound_check, dst_pattern.variable, introduced_vars, ScanNodes, OptionalExpandEdges) so anonymous nodes are skipped and pre-bound detection correctly checks the full introduced_vars history; keep the same append behavior to operators or subplan depending on is_single_hop.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/graphforge/executor/executor.py`:
- Around line 1111-1142: When short-circuiting OPTIONAL var-length hops (in the
src_node is CypherNull branch and the "Bound dst is null" branch), preserve
existing bindings from ctx.bindings and only inject CypherNull for those
variables that were not already bound: copy ctx.bindings into null_ctx.bindings,
then if op.dst_var not already in ctx.bindings bind op.dst_var -> CypherNull(),
if op.edge_var exists and op.edge_var not already in ctx.bindings bind
op.edge_var -> CypherNull(), and if op.path_var exists bind op.path_var ->
CypherNull(); return that null_ctx. This ensures ExecutionContext, CypherNull,
op.dst_var, op.edge_var, op.path_var and ctx.bindings behave consistently with
the later found_rows/optional fallback.
---
Outside diff comments:
In `@src/graphforge/executor/executor.py`:
- Around line 3286-3338: When dst_var is pre-bound (the branch starting with
dst_bound = op.dst_var in ctx.bindings), you must enforce op.dst_labels against
the bound_dst before accepting matches: after obtaining bound_dst =
ctx.get(op.dst_var) and confirming it's a NodeRef, check that the node's labels
include all labels in op.dst_labels (if any); if the bound node does not satisfy
op.dst_labels, treat it as no match (append the same _null_fill(ctx) and skip
further edge/path handling), otherwise proceed with the existing matching_edges,
where_predicate evaluation, CypherPath binding, and result population. Ensure
you reference op.dst_var, op.dst_labels, ctx.get, bound_dst, _null_fill, edges,
op.direction, src_node, evaluate_expression, ExecutionContext and CypherPath
when locating and updating the code.
---
Nitpick comments:
In `@src/graphforge/planner/operators.py`:
- Around line 706-709: ApplyOptional currently allows an empty subplan which
makes _execute_apply_optional silently skip null-filling; add validation to
reject empty subplan at construction by enforcing that ApplyOptional.subplan is
non-empty (e.g., a pydantic `@validator` or __init__ check) and raise a clear
ValueError if len(subplan) == 0; update any tests that construct ApplyOptional
to provide at least one inner operator. Ensure the check references the
ApplyOptional.subplan field so malformed plans fail fast.
In `@src/graphforge/planner/planner.py`:
- Around line 1296-1307: The dst_was_prebound_check logic can misclassify
anonymous destinations as pre-bound; change the guard so you only consider a
destination pre-bound when dst_pattern.variable is not None and that variable
truly wasn't introduced earlier (use dst_pattern.variable is not None and
dst_pattern.variable not in introduced_vars rather than the current slice
check). Update the condition around ScanNodes creation (referencing
dst_was_prebound_check, dst_pattern.variable, introduced_vars, ScanNodes,
OptionalExpandEdges) so anonymous nodes are skipped and pre-bound detection
correctly checks the full introduced_vars history; keep the same append behavior
to operators or subplan depending on is_single_hop.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3d934b35-2a07-4aa8-ac5a-cca8842b44fe
📒 Files selected for processing (3)
src/graphforge/executor/executor.pysrc/graphforge/planner/operators.pysrc/graphforge/planner/planner.py
- executor: guard null-ctx bindings in var-length null-src and bound-null-dst branches — only inject CypherNull for vars not already in ctx.bindings - executor: enforce op.dst_labels against pre-bound dst node in _execute_optional_expand before filtering edges; null-fill if label check fails - operators: add @field_validator on ApplyOptional.subplan to reject empty subplans at construction time (fail-fast, prevents silent null-fill skip) - planner: fix dst_was_prebound_check slice bug — was checking only the last element of introduced_vars; now checks the full list Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/graphforge/executor/executor.py`:
- Around line 1255-1262: The null-fill block creates a copy ExecutionContext
(null_ctx) but always overwrites op.path_var with CypherNull(), which clobbers
outer bindings; change the logic to only call null_ctx.bind(op.path_var,
CypherNull()) if op.path_var is not already present in ctx.bindings (same check
used for op.dst_var and op.edge_var). Update the identical pattern elsewhere in
the file (the other null-fill block) so both places preserve pre-bound path_var
before binding CypherNull().
- Around line 1168-1170: The code binds var-length relationship results as a raw
Python list (edge_path) via new_ctx.bind(op.edge_var, edge_path); change this to
wrap the list in the CypherList wrapper so downstream expressions see a proper
Cypher list (e.g., new_ctx.bind(op.edge_var, CypherList(edge_path))). Also audit
places in executor functions where query results are bound (search for
new_ctx.bind usage and op.edge_var handling in src/graphforge/executor/*.py) and
ensure all scalar and collection results are wrapped in the appropriate
CypherValue subclasses (CypherString, CypherInt, CypherFloat, CypherBool,
CypherNull, CypherList, CypherMap) before binding.
- Around line 1130-1145: The code currently only propagates prior CypherNull
bindings when op.optional is true, but when op.dst_var is bound to CypherNull
and op.optional is false you must drop that row so the DFS/ExpandEdges logic
doesn't treat it as an unbound node; inside the dst_bound handling in
executor.py (the block that inspects ctx.get(op.dst_var) and sets
bound_dst_node) add an explicit branch for elif isinstance(_bound_dst,
CypherNull) and not getattr(op, "optional", False): simply continue to skip
emitting/processing this context (i.e., drop the NULL-constrained row) so
bound_dst_node remains not set but the path is not explored.
In `@src/graphforge/planner/planner.py`:
- Around line 1205-1211: Optional variable-length relationship patterns drop
inline predicates when properties are folded into combined_where because the
predicate argument uses combined_where or rel_pattern.predicate as a fallback;
update the predicate construction (where this call is made for optional=True in
planner.py) to AND the two when both exist: if combined_where is not None and
rel_pattern.predicate is not None create a _BinaryOp(op="AND",
left=rel_pattern.predicate, right=combined_where), otherwise use combined_where
or rel_pattern.predicate; ensure you change the predicate= argument accordingly
so inline predicates are preserved for optional variable-length expansions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1c5907aa-021a-4cda-82e6-30e864dcdebb
📒 Files selected for processing (3)
src/graphforge/executor/executor.pysrc/graphforge/planner/operators.pysrc/graphforge/planner/planner.py
| dst_bound = op.dst_var in ctx.bindings | ||
| bound_dst_node = None | ||
| if dst_bound: | ||
| _bound_dst = ctx.get(op.dst_var) | ||
| if isinstance(_bound_dst, _NodeRefCls): | ||
| bound_dst_node = _bound_dst | ||
| elif isinstance(_bound_dst, CypherNull) and getattr(op, "optional", False): | ||
| # Bound dst is null from prior optional — propagate null | ||
| null_ctx = ExecutionContext() | ||
| null_ctx.bindings = dict(ctx.bindings) | ||
| if op.edge_var and op.edge_var not in ctx.bindings: | ||
| null_ctx.bind(op.edge_var, CypherNull()) | ||
| if op.path_var and op.path_var not in ctx.bindings: | ||
| null_ctx.bind(op.path_var, CypherNull()) | ||
| result.append(null_ctx) | ||
| continue |
There was a problem hiding this comment.
Drop NULL-constrained rows in regular var-length MATCH.
When op.dst_var is already bound to CypherNull() and op.optional is false, this falls through with bound_dst_node = None, so the DFS can emit paths and overwrite the existing NULL binding. ExpandEdges treats that case as "no match"; this path should too.
Suggested fix
if dst_bound:
_bound_dst = ctx.get(op.dst_var)
if isinstance(_bound_dst, _NodeRefCls):
bound_dst_node = _bound_dst
- elif isinstance(_bound_dst, CypherNull) and getattr(op, "optional", False):
- # Bound dst is null from prior optional — propagate null
- null_ctx = ExecutionContext()
- null_ctx.bindings = dict(ctx.bindings)
- if op.edge_var and op.edge_var not in ctx.bindings:
- null_ctx.bind(op.edge_var, CypherNull())
- if op.path_var and op.path_var not in ctx.bindings:
- null_ctx.bind(op.path_var, CypherNull())
- result.append(null_ctx)
+ elif isinstance(_bound_dst, CypherNull):
+ if getattr(op, "optional", False):
+ # Bound dst is null from prior optional — propagate null
+ null_ctx = ExecutionContext()
+ null_ctx.bindings = dict(ctx.bindings)
+ if op.edge_var and op.edge_var not in ctx.bindings:
+ null_ctx.bind(op.edge_var, CypherNull())
+ if op.path_var and op.path_var not in ctx.bindings:
+ null_ctx.bind(op.path_var, CypherNull())
+ result.append(null_ctx)
+ # A regular MATCH cannot satisfy a NULL-constrained destination.
continue🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/executor/executor.py` around lines 1130 - 1145, The code
currently only propagates prior CypherNull bindings when op.optional is true,
but when op.dst_var is bound to CypherNull and op.optional is false you must
drop that row so the DFS/ExpandEdges logic doesn't treat it as an unbound node;
inside the dst_bound handling in executor.py (the block that inspects
ctx.get(op.dst_var) and sets bound_dst_node) add an explicit branch for elif
isinstance(_bound_dst, CypherNull) and not getattr(op, "optional", False):
simply continue to skip emitting/processing this context (i.e., drop the
NULL-constrained row) so bound_dst_node remains not set but the path is not
explored.
| # Bind edge list if variable provided | ||
| if op.edge_var: | ||
| new_ctx.bind(op.edge_var, edge_path) |
There was a problem hiding this comment.
Wrap var-length relationship bindings in CypherList.
edge_path is being stored as a raw Python list here, so the relationship variable for [*] paths has a different runtime type than other executor outputs and can bypass Cypher list semantics in downstream expressions.
Suggested fix
# Bind edge list if variable provided
if op.edge_var:
- new_ctx.bind(op.edge_var, edge_path)
+ new_ctx.bind(op.edge_var, CypherList(list(edge_path)))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/executor/executor.py` around lines 1168 - 1170, The code binds
var-length relationship results as a raw Python list (edge_path) via
new_ctx.bind(op.edge_var, edge_path); change this to wrap the list in the
CypherList wrapper so downstream expressions see a proper Cypher list (e.g.,
new_ctx.bind(op.edge_var, CypherList(edge_path))). Also audit places in executor
functions where query results are bound (search for new_ctx.bind usage and
op.edge_var handling in src/graphforge/executor/*.py) and ensure all scalar and
collection results are wrapped in the appropriate CypherValue subclasses
(CypherString, CypherInt, CypherFloat, CypherBool, CypherNull, CypherList,
CypherMap) before binding.
| null_ctx = ExecutionContext() | ||
| null_ctx.bindings = dict(ctx.bindings) | ||
| if op.dst_var not in ctx.bindings: | ||
| null_ctx.bind(op.dst_var, CypherNull()) | ||
| if op.edge_var and op.edge_var not in ctx.bindings: | ||
| null_ctx.bind(op.edge_var, CypherNull()) | ||
| if op.path_var: | ||
| null_ctx.bind(op.path_var, CypherNull()) |
There was a problem hiding this comment.
Preserve pre-bound path_var on OPTIONAL null-fill.
Both null-fill paths still overwrite an already-bound path_var with CypherNull(). That loses outer-scope bindings even though these helpers are otherwise preserving pre-bound vars.
Suggested fix
- if op.path_var:
+ if op.path_var and op.path_var not in ctx.bindings:
null_ctx.bind(op.path_var, CypherNull())- if op.path_var:
+ if op.path_var and op.path_var not in base_ctx.bindings:
null_ctx.bind(op.path_var, CypherNull())Also applies to: 3249-3250
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/executor/executor.py` around lines 1255 - 1262, The null-fill
block creates a copy ExecutionContext (null_ctx) but always overwrites
op.path_var with CypherNull(), which clobbers outer bindings; change the logic
to only call null_ctx.bind(op.path_var, CypherNull()) if op.path_var is not
already present in ctx.bindings (same check used for op.dst_var and
op.edge_var). Update the identical pattern elsewhere in the file (the other
null-fill block) so both places preserve pre-bound path_var before binding
CypherNull().
| predicate=( | ||
| combined_where | ||
| if combined_where is not None | ||
| else rel_pattern.predicate | ||
| ), | ||
| optional=True, | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if RelationshipPattern can have both properties and predicate
rg -n "class RelationshipPattern" -A 30 src/graphforge/ast/pattern.py | head -50Repository: DecisionNerd/graphforge
Length of output: 1660
🏁 Script executed:
cd src/graphforge/planner && sed -n '1160,1220p' planner.py | cat -nRepository: DecisionNerd/graphforge
Length of output: 3981
🏁 Script executed:
cd src/graphforge/planner && sed -n '975,995p' planner.py | cat -nRepository: DecisionNerd/graphforge
Length of output: 1481
🏁 Script executed:
cd src/graphforge/planner && sed -n '940,1000p' planner.py | grep -A 5 -B 5 "properties"Repository: DecisionNerd/graphforge
Length of output: 49
🏁 Script executed:
cd src/graphforge/planner && sed -n '900,980p' planner.py | cat -nRepository: DecisionNerd/graphforge
Length of output: 4739
🏁 Script executed:
cd src/graphforge/planner && sed -n '945,1010p' planner.py | cat -nRepository: DecisionNerd/graphforge
Length of output: 4284
Fix inline relationship predicates being dropped in optional variable-length patterns.
When a relationship pattern has both inline properties and an inline predicate (e.g., -[r:TYPE {prop: value} WHERE r.x > 5]->), the predicate is silently dropped in OPTIONAL MATCH variable-length expansions. The properties are folded into combined_where, but rel_pattern.predicate is only used as a fallback when combined_where is None—dropping the predicate when properties exist.
In regular MATCH, properties and predicates are handled independently (properties via Filter, predicates passed to the operator). For consistency and correctness, they should also be ANDed together in the optional path:
predicate=(
_BinaryOp(op="AND", left=rel_pattern.predicate, right=combined_where)
if combined_where is not None and rel_pattern.predicate is not None
else combined_where or rel_pattern.predicate
),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/planner/planner.py` around lines 1205 - 1211, Optional
variable-length relationship patterns drop inline predicates when properties are
folded into combined_where because the predicate argument uses combined_where or
rel_pattern.predicate as a fallback; update the predicate construction (where
this call is made for optional=True in planner.py) to AND the two when both
exist: if combined_where is not None and rel_pattern.predicate is not None
create a _BinaryOp(op="AND", left=rel_pattern.predicate, right=combined_where),
otherwise use combined_where or rel_pattern.predicate; ensure you change the
predicate= argument accordingly so inline predicates are preserved for optional
variable-length expansions.
Summary
WHERE (n)-[:R]->()as boolean existence checks) — fixes issues fix: filter node patterns in WHERE clause #306 and fix: triadic friend-of-friend pattern matching #302var_length_exactgrammar support (*2for fixed-count traversals)ApplyOptionaloperator for atomically left-joining multi-hop OPTIONAL MATCH patterns_null_fillin optional expand to preserve pre-bound variables (triadic regressions)OptionalExpandEdges.dst_labelsto prevent post-filter from dropping null rowsTCK Impact
~+13 net tests fixed in pattern/optional/triadic categories. Zero new regressions introduced. All 4,257 unit + integration tests pass.
Known Deferred (tracked in #309)
var_length_optional_relationships: edge-tracking cycle detection needed (B→LOOP→B returns B once instead of twice)optional_match_and_bound_nodes: unbound src OPTIONAL MATCH needs ApplyOptional wrappingoptionally_matching_relationship_with_a_relationship_that_is_already_bound: bound-edge lookup not yet implemented{ bare_pattern }(no MATCH keyword): grammar alternative neededCloses #306
Closes #302
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Behavior / Validation
Tests