fix: Phase 5 write clause edge cases — SET, DELETE, MERGE#372
Conversation
SET fixes: - SET n.prop = null now removes the property (openCypher semantics) - SET on a null entity is now a no-op instead of crashing - Setting a list of maps as a property now raises InvalidPropertyType DELETE fixes: - DELETE now passes rows through for RETURN clauses (was returning []) - Track deleted entity IDs in ExecutionContext.deleted_ids - Property access on deleted entities raises DeletedEntityAccess - labels() on deleted nodes raises DeletedEntityAccess - type() on deleted relationships still works (returns type string) Grammar fixes: - UNWIND ... CREATE ... RETURN ... SKIP/LIMIT now parses as single query - CALL ... CREATE ... RETURN ... SKIP/LIMIT now parses as single query Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…#303) - Support expression operands in DELETE (subscript/property access) - Fix DELETE+WITH no-RETURN result suppression for write ops after aggregation - Fix MERGE to match ALL existing nodes (not just first), enabling correct row multiplication semantics per openCypher spec - Fix MERGE relationship pattern to return tuple (src, rel, dst) for path binding - Preserve original clause order in write-op pipeline (DELETE before MERGE) - Fix grammar: allow MERGE after UNWIND, standalone DELETE+RETURN, CREATE in update queries - Fix TCK conftest: add map literal support in parameter substitution (_parse_param_value, _split_list_items now tracks {}, _param_to_cypher_literal handles dicts) - Delete operator now accepts list[Any] (expressions) not just list[str] - 3,697 TCK scenarios passing (+50 vs Phase 4), all unit tests passing Closes #341, #342, #303 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
WalkthroughThis PR extends DELETE to accept expressions beyond variable names, implements Changes
Sequence Diagram(s)sequenceDiagram
participant Executor as QueryExecutor
participant Context as ExecutionContext
participant Evaluator as Evaluator
participant PropertyAccess as PropertyAccess<br/>(Expression)
rect rgba(200, 220, 255, 0.5)
Note over Executor,PropertyAccess: DELETE with deleted_ids tracking
Executor->>Executor: _execute_delete(delete_targets, ctx)
Executor->>Executor: process each target (variable or expression)
Executor->>Context: record target ID to deleted_ids
Context->>Context: deleted_ids.add(entity_id)
end
rect rgba(255, 220, 200, 0.5)
Note over Executor,PropertyAccess: Subsequent PropertyAccess evaluation
Executor->>Evaluator: _evaluate_expression(node_ref.prop, ctx)
Evaluator->>Evaluator: evaluate PropertyAccess(node_ref, 'prop')
Evaluator->>Context: check if node_ref.id in deleted_ids
alt Entity was deleted
Context-->>Evaluator: True (deleted)
Evaluator-->>Executor: raise EntityNotFoundError
else Entity exists
Context-->>Evaluator: False (not deleted)
Evaluator-->>Executor: return property_value
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related issues
Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 44 minutes and 36 seconds.Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/graphforge/planner/planner.py (1)
1110-1117:⚠️ Potential issue | 🔴 Critical
OPTIONAL MATCHpath binding is planner-only here.
path_varis added to the type context, butOptionalExpandEdgesinsrc/graphforge/planner/operators.pystill has nopath_varfield, so nothing in the plan can actually populatepforOPTIONAL MATCH p = (a)-[r]->(b). That makes the variable look valid during planning and then disappear at execution time.Based on learnings "Implement Cypher features across all four layers: Parser (cypher.lark + parser.py), AST (ast/ dataclasses), Planner (planner.py + operators.py), Executor (executor.py)".
Also applies to: 1127-1130
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/planner/planner.py` around lines 1110 - 1117, The planner adds a path variable (path_var) for OPTIONAL MATCH but the operator OptionalExpandEdges (referenced in planner.py) does not accept or store path_var, so the plan cannot populate the path at execution; update the operator usage and definition: modify the OptionalExpandEdges constructor call(s) in planner.py (the occurrences around the current diff and the other block at 1127-1130) to pass path_var, and modify the OptionalExpandEdges class in src/graphforge/planner/operators.py to accept, store, and expose a path_var field (and propagate it through any serialization/execution hooks) so the executor can receive and populate the bound path.
🧹 Nitpick comments (2)
src/graphforge/executor/executor.py (2)
2753-2762: Docstring is outdated after return type change.The docstring says "Returns: True if pattern was created, False if matched" but the method now returns a 4-tuple
(was_created, src, rel, dst). Update the docstring to reflect the new return signature.📝 Proposed docstring fix
def _merge_relationship_pattern(self, pattern_parts: list, ctx: ExecutionContext) -> tuple: """Merge a relationship pattern (node-rel-node). Args: pattern_parts: List of pattern parts (nodes and relationships) ctx: Execution context Returns: - True if pattern was created, False if matched + Tuple of (was_created, src_node, rel_edge, dst_node): + - was_created: True if pattern was created, False if matched + - src_node: Source NodeRef + - rel_edge: Relationship EdgeRef + - dst_node: Destination NodeRef """🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/executor/executor.py` around lines 2753 - 2762, Update the _merge_relationship_pattern docstring to reflect the new return value: replace the old "Returns: True if pattern was created, False if matched" with a description that the method returns a 4-tuple (was_created, src, rel, dst), where was_created is a boolean indicating whether the pattern was created, src is the source node object, rel is the relationship object, and dst is the destination node object; keep existing Args and brief purpose intact and ensure the return section documents types and semantics for each tuple element.
2707-2719: ON MATCH SET correctly applied, but code flow is non-obvious.The extra matched nodes (beyond the first) have ON MATCH SET applied immediately (line 2718) and are appended directly to
result. The first match'snew_ctxhas ON MATCH SET applied later at line 2746. While functionally correct, this dual-path handling makes the code harder to follow.Consider adding a comment clarifying that the first match is handled by the common ON MATCH path at line 2746.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/executor/executor.py` around lines 2707 - 2719, The loop over matched_nodes applies ON MATCH SET to extra contexts immediately (extra_node -> extra_ctx via ExecutionContext and _execute_set_items when op.on_match) while the first match's new_ctx has ON MATCH SET applied later at new_ctx handling; make this explicit by adding a concise inline comment above the loop (or above the extra-node branch) stating that extra matches are processed here but the first match is handled by the common ON MATCH path later (referencing matched_nodes, extra_node, ExecutionContext, op.on_match, _execute_set_items, new_ctx, node_pattern.variable, path_variable, CypherPath, and result) so readers understand the dual-path flow.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/graphforge/executor/evaluator.py`:
- Around line 80-81: ExecutionContext initializes self.deleted_ids but
cloned/derived contexts created with ExecutionContext(...) do not copy it,
allowing deleted-entity checks to be bypassed; update every
ExecutionContext(...) construction in this file (e.g., places creating child
contexts for comprehensions/subqueries and any use sites near evaluator methods)
to propagate the parent.deleted_ids into the new context by assigning
new_ctx.deleted_ids = parent.deleted_ids (or pass it through the constructor if
available) so that DeletedEntityAccess checks remain effective in derived
contexts.
- Around line 4686-4690: The deleted-entity guard currently checks
ctx.deleted_ids only for LABELS; update the same guard logic wherever graph
property-style functions read entity data so PROPERTIES and KEYS also reject
access to deleted entities. Locate the evaluator code paths handling the
constants/operations LABELS, PROPERTIES, and KEYS (and any helper functions that
read entity properties or keys) and add the same conditional: if ctx is not None
and hasattr(ctx, "deleted_ids") and arg.id in ctx.deleted_ids: raise
EntityNotFoundError("DeletedEntityAccess: Tried to access properties/keys of a
deleted entity"), ensuring the error message mentions properties/keys
appropriately and reusing the existing EntityNotFoundError type.
In `@src/graphforge/executor/executor.py`:
- Around line 2521-2526: The DELETE handler currently evaluates expressions via
evaluate_expression(var_ref, ctx, self) and then silently skips when element is
neither CypherNull nor None; change this to validate the runtime type and raise
an error for non-element values: after element = evaluate_expression(var_ref,
ctx, self) keep the CypherNull/None skip, but if element is not an instance of
NodeRef or EdgeRef raise a TypeError (or the project’s standard executor error)
with a clear message indicating DELETE received a non-node/non-relationship
value (include the evaluated type and the var_ref for context); ensure this
check is implemented in the same block handling deletion so DELETE semantics
follow the openCypher spec.
In `@src/graphforge/parser/cypher.lark`:
- Around line 61-63: update_query is over-permissive because all write/return
clauses are optional, allowing a bare MATCH to validate; update the grammar for
update_query (referencing update_query, match_clause, optional_match_clause,
set_clause, remove_clause, create_clause, merge_clause, delete_clause,
return_clause) so that at least one "write or output" clause is required after
MATCH/OPTIONAL MATCH — e.g., replace the trailing optional groups (* and ?
combinations) with a required alternative that includes at least one of
set_clause|remove_clause|create_clause|merge_clause|delete_clause|return_clause
(or make the final return/delete/merge mandatory) so a bare MATCH no longer
matches.
In `@src/graphforge/planner/planner.py`:
- Around line 292-296: The current pre-pass loops over create_clauses and calls
_validate_create_patterns(create.patterns) followed by
_bind_pattern_types(create.patterns), which binds CREATE variables too early and
makes them visible to earlier write validations; change this so you only
pre-validate CREATE patterns (call _validate_create_patterns) in the global
pre-pass and remove the early _bind_pattern_types call there, and instead
perform _bind_pattern_types(create.patterns) inline when handling each write
clause in the ordered write-clause pre-pass (the same pass that validates
SET/DELETE/MERGE), ensuring CREATE bindings happen in order and not before
earlier write validations (update all occurrences where _bind_pattern_types is
currently invoked during the global pre-pass for create_clauses, including the
similar block later).
In `@tests/tck/conftest.py`:
- Around line 1051-1061: The map-parsing loop in conftest.py (inside the block
that detects value_str as a map) can call stripped.index(":") and raise a
generic ValueError for malformed items; update the loop in the map branch to
check for a colon (e.g., if ":" not in stripped) before calling index/slicing
and raise a clear ValueError mentioning the offending raw_item and context
(e.g., "invalid map entry: '<item>'"), then continue parsing valid entries using
the existing key extraction and _parse_param_value call so failures are explicit
and debuggable.
- Around line 298-300: The map key is emitted verbatim in
_param_to_cypher_literal, which can produce invalid Cypher when keys are not
valid identifiers; update the dict branch in _param_to_cypher_literal to check
each key: if it matches a valid Cypher identifier pattern (e.g.
/^[A-Za-z_][A-Za-z0-9_]*$/) emit as-is, otherwise emit the key as a properly
escaped single-quoted string (escape internal single quotes) so non-identifier
keys produce valid Cypher map literals.
---
Outside diff comments:
In `@src/graphforge/planner/planner.py`:
- Around line 1110-1117: The planner adds a path variable (path_var) for
OPTIONAL MATCH but the operator OptionalExpandEdges (referenced in planner.py)
does not accept or store path_var, so the plan cannot populate the path at
execution; update the operator usage and definition: modify the
OptionalExpandEdges constructor call(s) in planner.py (the occurrences around
the current diff and the other block at 1127-1130) to pass path_var, and modify
the OptionalExpandEdges class in src/graphforge/planner/operators.py to accept,
store, and expose a path_var field (and propagate it through any
serialization/execution hooks) so the executor can receive and populate the
bound path.
---
Nitpick comments:
In `@src/graphforge/executor/executor.py`:
- Around line 2753-2762: Update the _merge_relationship_pattern docstring to
reflect the new return value: replace the old "Returns: True if pattern was
created, False if matched" with a description that the method returns a 4-tuple
(was_created, src, rel, dst), where was_created is a boolean indicating whether
the pattern was created, src is the source node object, rel is the relationship
object, and dst is the destination node object; keep existing Args and brief
purpose intact and ensure the return section documents types and semantics for
each tuple element.
- Around line 2707-2719: The loop over matched_nodes applies ON MATCH SET to
extra contexts immediately (extra_node -> extra_ctx via ExecutionContext and
_execute_set_items when op.on_match) while the first match's new_ctx has ON
MATCH SET applied later at new_ctx handling; make this explicit by adding a
concise inline comment above the loop (or above the extra-node branch) stating
that extra matches are processed here but the first match is handled by the
common ON MATCH path later (referencing matched_nodes, extra_node,
ExecutionContext, op.on_match, _execute_set_items, new_ctx,
node_pattern.variable, path_variable, CypherPath, and result) so readers
understand the dual-path flow.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 70bbfa5c-d9ad-4cf0-b45c-34278c30d0bd
📒 Files selected for processing (9)
src/graphforge/ast/clause.pysrc/graphforge/executor/evaluator.pysrc/graphforge/executor/executor.pysrc/graphforge/parser/cypher.larksrc/graphforge/parser/parser.pysrc/graphforge/planner/operators.pysrc/graphforge/planner/planner.pytests/tck/conftest.pytests/unit/executor/test_executor_error_paths.py
| # Track deleted entity IDs (set of int) so property access raises DeletedEntityAccess | ||
| self.deleted_ids: set[int | str] = set() |
There was a problem hiding this comment.
Propagate deleted_ids into derived execution contexts.
deleted_ids is initialized on ExecutionContext, but derived contexts created during expression evaluation don’t copy it. That allows deleted-entity access checks to be bypassed inside comprehensions/subqueries (e.g., property access evaluated in a child context).
Suggested fix
class ExecutionContext:
@@
def __init__(self):
@@
self.deleted_ids: set[int | str] = set()
+
+ `@classmethod`
+ def from_parent(cls, parent: "ExecutionContext") -> "ExecutionContext":
+ child = cls()
+ child.bindings = dict(parent.bindings)
+ child.aggregate_cache = parent.aggregate_cache
+ child.deleted_ids = set(parent.deleted_ids)
+ return child- new_ctx = ExecutionContext()
- new_ctx.bindings = dict(ctx.bindings)
- new_ctx.aggregate_cache = ctx.aggregate_cache
+ new_ctx = ExecutionContext.from_parent(ctx)Apply the same replacement to all other ExecutionContext() cloning sites in this file.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/executor/evaluator.py` around lines 80 - 81, ExecutionContext
initializes self.deleted_ids but cloned/derived contexts created with
ExecutionContext(...) do not copy it, allowing deleted-entity checks to be
bypassed; update every ExecutionContext(...) construction in this file (e.g.,
places creating child contexts for comprehensions/subqueries and any use sites
near evaluator methods) to propagate the parent.deleted_ids into the new context
by assigning new_ctx.deleted_ids = parent.deleted_ids (or pass it through the
constructor if available) so that DeletedEntityAccess checks remain effective in
derived contexts.
| # Check if entity was deleted in this execution context | ||
| if ctx is not None and hasattr(ctx, "deleted_ids") and arg.id in ctx.deleted_ids: | ||
| raise EntityNotFoundError( | ||
| "DeletedEntityAccess: Tried to access labels of a deleted entity" | ||
| ) |
There was a problem hiding this comment.
Deleted-entity guard is incomplete for graph property-style functions.
You added the deleted check for LABELS, but PROPERTIES and KEYS still read from deleted entities. That creates inconsistent deleted-access behavior for equivalent property-oriented access patterns.
Suggested fix
def _evaluate_graph_function(
@@
) -> CypherValue:
+ def _ensure_not_deleted(arg: NodeRef | EdgeRef) -> None:
+ if ctx is not None and arg.id in ctx.deleted_ids:
+ raise EntityNotFoundError(
+ "DeletedEntityAccess: Tried to access data of a deleted entity"
+ )
@@
if func_name == "LABELS":
@@
if isinstance(arg, NodeRef):
- # Check if entity was deleted in this execution context
- if ctx is not None and hasattr(ctx, "deleted_ids") and arg.id in ctx.deleted_ids:
- raise EntityNotFoundError(
- "DeletedEntityAccess: Tried to access labels of a deleted entity"
- )
+ _ensure_not_deleted(arg)
labels_list: list[CypherValue] = [CypherString(label) for label in sorted(arg.labels)]
return CypherList(labels_list)
@@
if func_name == "PROPERTIES":
@@
if isinstance(arg, (NodeRef, EdgeRef)):
+ _ensure_not_deleted(arg)
return CypherMap(dict(arg.properties))
@@
if func_name == "KEYS":
@@
if isinstance(arg, (NodeRef, EdgeRef)):
+ _ensure_not_deleted(arg)
keys_list: list[CypherValue] = [CypherString(k) for k in sorted(arg.properties)]
return CypherList(keys_list)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/executor/evaluator.py` around lines 4686 - 4690, The
deleted-entity guard currently checks ctx.deleted_ids only for LABELS; update
the same guard logic wherever graph property-style functions read entity data so
PROPERTIES and KEYS also reject access to deleted entities. Locate the evaluator
code paths handling the constants/operations LABELS, PROPERTIES, and KEYS (and
any helper functions that read entity properties or keys) and add the same
conditional: if ctx is not None and hasattr(ctx, "deleted_ids") and arg.id in
ctx.deleted_ids: raise EntityNotFoundError("DeletedEntityAccess: Tried to access
properties/keys of a deleted entity"), ensuring the error message mentions
properties/keys appropriately and reusing the existing EntityNotFoundError type.
| else: | ||
| element = evaluate_expression(var_ref, ctx, self) | ||
|
|
||
| # Delete from graph | ||
| if isinstance(element, NodeRef): | ||
| # Get all edges connected to this node | ||
| outgoing = self.graph.get_outgoing_edges(element.id) | ||
| incoming = self.graph.get_incoming_edges(element.id) | ||
| all_edges = outgoing + incoming | ||
|
|
||
| # Check if node has relationships | ||
| if all_edges and not op.detach: | ||
| raise ValueError( | ||
| "Cannot delete node with relationships. " | ||
| "Use DETACH DELETE to delete relationships first." | ||
| ) | ||
| # Skip NULL values - they don't exist in the graph | ||
| if isinstance(element, CypherNull) or element is None: | ||
| continue |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Does openCypher DELETE clause raise an error when attempting to delete a non-node/non-relationship value like a property?
💡 Result:
Yes, the openCypher DELETE clause raises an error when attempting to delete a non-node/non-relationship value like a property.
Citations:
- 1: https://stackoverflow.com/questions/18010551/delete-property-from-neo4j-graph
- 2: https://neo4j.com/docs/cypher-manual/current/clauses/delete/
- 3: https://neo4j.com/docs/cypher-manual/current/clauses/remove/
DELETE should raise an error for non-element types, not silently skip them.
Per the openCypher specification (Neo4j Cypher manual), attempting to delete a non-node/non-relationship value (e.g., a property accessed via DELETE n.name) must raise an error. The current code silently skips evaluation when element is neither NodeRef, EdgeRef, CypherNull, nor None, which violates the spec. Add error handling to raise a TypeError or equivalent when a non-element type is provided to DELETE.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/executor/executor.py` around lines 2521 - 2526, The DELETE
handler currently evaluates expressions via evaluate_expression(var_ref, ctx,
self) and then silently skips when element is neither CypherNull nor None;
change this to validate the runtime type and raise an error for non-element
values: after element = evaluate_expression(var_ref, ctx, self) keep the
CypherNull/None skip, but if element is not an instance of NodeRef or EdgeRef
raise a TypeError (or the project’s standard executor error) with a clear
message indicating DELETE received a non-node/non-relationship value (include
the evaluated type and the var_ref for context); ensure this check is
implemented in the same block handling deletion so DELETE semantics follow the
openCypher spec.
| update_query: match_clause where_clause? optional_match_clause+ where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause? | ||
| | match_clause where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause? | ||
| | optional_match_clause where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause? |
There was a problem hiding this comment.
update_query now accepts no-op MATCH endings (over-permissive grammar).
At Line 61 / Line 62 / Line 63, all update parts are optional (* + ?), so a bare MATCH ... can match update_query without any write clause or RETURN. That weakens syntax validation and can mask invalid query endings.
Suggested grammar tightening
-update_query: match_clause where_clause? optional_match_clause+ where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause?
- | match_clause where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause?
- | optional_match_clause where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause?
+update_query: match_clause where_clause? optional_match_clause+ where_clause? ((set_clause | remove_clause | create_clause | merge_clause)+ delete_clause? merge_clause* | delete_clause merge_clause*) return_clause? order_by_clause? skip_clause? limit_clause?
+ | match_clause where_clause? ((set_clause | remove_clause | create_clause | merge_clause)+ delete_clause? merge_clause* | delete_clause merge_clause*) return_clause? order_by_clause? skip_clause? limit_clause?
+ | optional_match_clause where_clause? ((set_clause | remove_clause | create_clause | merge_clause)+ delete_clause? merge_clause* | delete_clause merge_clause*) return_clause? order_by_clause? skip_clause? limit_clause?📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| update_query: match_clause where_clause? optional_match_clause+ where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause? | |
| | match_clause where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause? | |
| | optional_match_clause where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause? | |
| update_query: match_clause where_clause? optional_match_clause+ where_clause? ((set_clause | remove_clause | create_clause | merge_clause)+ delete_clause? merge_clause* | delete_clause merge_clause*) return_clause? order_by_clause? skip_clause? limit_clause? | |
| | match_clause where_clause? ((set_clause | remove_clause | create_clause | merge_clause)+ delete_clause? merge_clause* | delete_clause merge_clause*) return_clause? order_by_clause? skip_clause? limit_clause? | |
| | optional_match_clause where_clause? ((set_clause | remove_clause | create_clause | merge_clause)+ delete_clause? merge_clause* | delete_clause merge_clause*) return_clause? order_by_clause? skip_clause? limit_clause? |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/parser/cypher.lark` around lines 61 - 63, update_query is
over-permissive because all write/return clauses are optional, allowing a bare
MATCH to validate; update the grammar for update_query (referencing
update_query, match_clause, optional_match_clause, set_clause, remove_clause,
create_clause, merge_clause, delete_clause, return_clause) so that at least one
"write or output" clause is required after MATCH/OPTIONAL MATCH — e.g., replace
the trailing optional groups (* and ? combinations) with a required alternative
that includes at least one of
set_clause|remove_clause|create_clause|merge_clause|delete_clause|return_clause
(or make the final return/delete/merge mandatory) so a bare MATCH no longer
matches.
| # 3. Pre-validate CREATE patterns (binding happens inline in step 5) | ||
| for create in create_clauses: | ||
| self._validate_create_patterns(create.patterns) | ||
| self._bind_pattern_types(create.patterns) | ||
|
|
There was a problem hiding this comment.
Don't bind future CREATE variables before validating earlier writes.
This makes later CREATE bindings visible to earlier SET/DELETE/MERGE validation. For example, MATCH (a) SET b.name = 1 CREATE (b) will now pass planning even though b is not in scope at the SET. CREATE validation/binding needs to happen inside the ordered write-clause pre-pass, not in a global pre-pass.
Suggested fix
- # 3. Pre-validate CREATE patterns (binding happens inline in step 5)
- for create in create_clauses:
- self._validate_create_patterns(create.patterns)
- self._bind_pattern_types(create.patterns)
-
- # 4. WHERE (before write operations so filter applies to reading results)
+ # 3. WHERE (before write operations so filter applies to reading results)
if where_clause:
...
- # 5. Write operations (MERGE, SET, REMOVE, DELETE, CREATE) in original clause order
+ # 4. Write operations (MERGE, SET, REMOVE, DELETE, CREATE) in original clause order
for clause in clauses:
if isinstance(clause, MergeClause):
self._validate_merge_patterns(clause.patterns)
self._validate_pattern_types(clause.patterns)
if clause.on_create:
self._validate_set_clause_variables(clause.on_create)
if clause.on_match:
self._validate_set_clause_variables(clause.on_match)
elif isinstance(clause, CreateClause):
- pass # already validated above
+ self._validate_create_patterns(clause.patterns)
+ self._bind_pattern_types(clause.patterns)
elif isinstance(clause, DeleteClause):
self._validate_delete_clause_variables(clause)
elif isinstance(clause, SetClause):
self._validate_set_clause_variables(clause)Also applies to: 313-327
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/graphforge/planner/planner.py` around lines 292 - 296, The current
pre-pass loops over create_clauses and calls
_validate_create_patterns(create.patterns) followed by
_bind_pattern_types(create.patterns), which binds CREATE variables too early and
makes them visible to earlier write validations; change this so you only
pre-validate CREATE patterns (call _validate_create_patterns) in the global
pre-pass and remove the early _bind_pattern_types call there, and instead
perform _bind_pattern_types(create.patterns) inline when handling each write
clause in the ordered write-clause pre-pass (the same pass that validates
SET/DELETE/MERGE), ensuring CREATE bindings happen in order and not before
earlier write validations (update all occurrences where _bind_pattern_types is
currently invoked during the global pre-pass for create_clauses, including the
similar block later).
| if value_str.startswith("{") and value_str.endswith("}"): | ||
| result: dict = {} | ||
| for raw_item in _split_list_items(value_str[1:-1]): | ||
| stripped = raw_item.strip() | ||
| if not stripped: | ||
| continue | ||
| colon_idx = stripped.index(":") | ||
| key = stripped[:colon_idx].strip() | ||
| val = _parse_param_value(stripped[colon_idx + 1 :].strip()) | ||
| result[key] = val | ||
| return result |
There was a problem hiding this comment.
Handle invalid map entries explicitly instead of crashing on index(':').
Line 1057 can throw a generic ValueError for malformed items; prefer explicit validation with a clear failure message.
Suggested fix
if value_str.startswith("{") and value_str.endswith("}"):
result: dict = {}
for raw_item in _split_list_items(value_str[1:-1]):
stripped = raw_item.strip()
if not stripped:
continue
- colon_idx = stripped.index(":")
- key = stripped[:colon_idx].strip()
- val = _parse_param_value(stripped[colon_idx + 1 :].strip())
+ if ":" not in stripped:
+ raise ValueError(f"Invalid map entry (missing ':'): {stripped!r}")
+ key, _, raw_val = stripped.partition(":")
+ key = key.strip()
+ if len(key) >= 2 and key[0] == key[-1] and key[0] in ("'", '"'):
+ key = key[1:-1]
+ val = _parse_param_value(raw_val.strip())
result[key] = val
return result🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/tck/conftest.py` around lines 1051 - 1061, The map-parsing loop in
conftest.py (inside the block that detects value_str as a map) can call
stripped.index(":") and raise a generic ValueError for malformed items; update
the loop in the map branch to check for a colon (e.g., if ":" not in stripped)
before calling index/slicing and raise a clear ValueError mentioning the
offending raw_item and context (e.g., "invalid map entry: '<item>'"), then
continue parsing valid entries using the existing key extraction and
_parse_param_value call so failures are explicit and debuggable.
- Add path_var field to OptionalExpandEdges operator and bind CypherPath in _execute_optional_expand for all match/no-match branches - Pass path_var from planner when creating OptionalExpandEdges - Update _merge_relationship_pattern docstring to document 4-tuple return - Add inline comment clarifying dual-path ON MATCH SET flow in _execute_merge - Fix _param_to_cypher_literal: escape non-identifier dict keys with single quotes (e.g. keys with spaces or special chars) in TCK conftest parameter substitution Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
DELETE nodes[0],DELETE map.key, subscript/property-access patterns now parse, plan, and execute correctly.Deleteoperator andDeleteClauseAST node changed fromlist[str]tolist[Any].[]even when an intermediateWITHaggregate is in the pipeline.MERGE (b)with no constraints now matches ALL existing nodes (producing one row per match), per openCypher spec. This fixes row-count tests likeMATCH (a) MERGE (b) WITH * OPTIONAL MATCH (a)--(b) RETURN count(*)→ 6._merge_relationship_patternnow returns(was_created, src, rel, dst)tuple, enabling path bindingMERGE p = (a)-[r:R]->(b).UNWIND+MERGE, standaloneDELETE+RETURN,CREATEinupdate_queryinterleaved writes.{login: 'x'}) —_parse_param_value,_split_list_items,_param_to_cypher_literalnow handle dicts correctly.Test plan
uv run pytest tests/tck/ -k "Delete or Set or Merge"→ 199/199 passinguv run pytest tests/unit/→ 2418/2418 passinguv run mypy src/graphforge/→ no issuesuv run ruff format --check && uv run ruff check→ cleanCloses #341, #342, #303
🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
New Features
DELETE nodes.key)Bug Fixes
Improvements