Skip to content

fix: Phase 5 write clause edge cases — SET, DELETE, MERGE#372

Merged
DecisionNerd merged 3 commits into
mainfrom
fix/341-342-303-write-clause-edge-cases
Apr 29, 2026
Merged

fix: Phase 5 write clause edge cases — SET, DELETE, MERGE#372
DecisionNerd merged 3 commits into
mainfrom
fix/341-342-303-write-clause-edge-cases

Conversation

@DecisionNerd
Copy link
Copy Markdown
Owner

@DecisionNerd DecisionNerd commented Apr 29, 2026

Summary

  • DELETE expression operands: DELETE nodes[0], DELETE map.key, subscript/property-access patterns now parse, plan, and execute correctly. Delete operator and DeleteClause AST node changed from list[str] to list[Any].
  • DELETE after WITH no result: Queries ending in a write op (DELETE/SET/MERGE/etc.) now correctly return [] even when an intermediate WITH aggregate is in the pipeline.
  • MERGE multi-match semantics: MERGE (b) with no constraints now matches ALL existing nodes (producing one row per match), per openCypher spec. This fixes row-count tests like MATCH (a) MERGE (b) WITH * OPTIONAL MATCH (a)--(b) RETURN count(*) → 6.
  • MERGE path variable for relationships: _merge_relationship_pattern now returns (was_created, src, rel, dst) tuple, enabling path binding MERGE p = (a)-[r:R]->(b).
  • Write-op ordering: Planner now emits write ops in original clause order, fixing DELETE-before-MERGE semantics.
  • Grammar: Added UNWIND+MERGE, standalone DELETE+RETURN, CREATE in update_query interleaved writes.
  • TCK conftest: Fixed parameter substitution for map literal params ({login: 'x'}) — _parse_param_value, _split_list_items, _param_to_cypher_literal now handle dicts correctly.

Test plan

  • uv run pytest tests/tck/ -k "Delete or Set or Merge" → 199/199 passing
  • uv run pytest tests/unit/ → 2418/2418 passing
  • Full TCK: 3,697 passing (up from 3,647 after Phase 4, +50 new passes)
  • uv run mypy src/graphforge/ → no issues
  • uv run ruff format --check && uv run ruff check → clean

Closes #341, #342, #303

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • DELETE operations now support deleting expressions and properties, not just variables (e.g., DELETE nodes.key)
    • Enhanced error detection when accessing previously deleted entities
  • Bug Fixes

    • Improved null and type safety handling in property assignments
    • Better support for matching multiple nodes and path variables in MERGE operations
    • Corrected row flow behavior in delete operations
  • Improvements

    • More flexible query syntax supporting complex DELETE targets and optional trailing clauses

DecisionNerd and others added 2 commits April 28, 2026 11:13
SET fixes:
- SET n.prop = null now removes the property (openCypher semantics)
- SET on a null entity is now a no-op instead of crashing
- Setting a list of maps as a property now raises InvalidPropertyType

DELETE fixes:
- DELETE now passes rows through for RETURN clauses (was returning [])
- Track deleted entity IDs in ExecutionContext.deleted_ids
- Property access on deleted entities raises DeletedEntityAccess
- labels() on deleted nodes raises DeletedEntityAccess
- type() on deleted relationships still works (returns type string)

Grammar fixes:
- UNWIND ... CREATE ... RETURN ... SKIP/LIMIT now parses as single query
- CALL ... CREATE ... RETURN ... SKIP/LIMIT now parses as single query

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…#303)

- Support expression operands in DELETE (subscript/property access)
- Fix DELETE+WITH no-RETURN result suppression for write ops after aggregation
- Fix MERGE to match ALL existing nodes (not just first), enabling correct row
  multiplication semantics per openCypher spec
- Fix MERGE relationship pattern to return tuple (src, rel, dst) for path binding
- Preserve original clause order in write-op pipeline (DELETE before MERGE)
- Fix grammar: allow MERGE after UNWIND, standalone DELETE+RETURN, CREATE in update queries
- Fix TCK conftest: add map literal support in parameter substitution (_parse_param_value,
  _split_list_items now tracks {}, _param_to_cypher_literal handles dicts)
- Delete operator now accepts list[Any] (expressions) not just list[str]
- 3,697 TCK scenarios passing (+50 vs Phase 4), all unit tests passing

Closes #341, #342, #303

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 29, 2026

Warning

Rate limit exceeded

@DecisionNerd has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 44 minutes and 36 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d77d7806-6ca9-406d-88d7-d85788a19214

📥 Commits

Reviewing files that changed from the base of the PR and between a134ee6 and 741fc2f.

📒 Files selected for processing (4)
  • src/graphforge/executor/executor.py
  • src/graphforge/planner/operators.py
  • src/graphforge/planner/planner.py
  • tests/tck/conftest.py

Walkthrough

This PR extends DELETE to accept expressions beyond variable names, implements EntityNotFoundError for accessing deleted entities, updates MERGE to collect all matching nodes with path binding support, refines SET semantics for nulls and type safety, and reorders write-clause execution to preserve clause order during planning.

Changes

Cohort / File(s) Summary
Grammar & Parser
src/graphforge/parser/cypher.lark, src/graphforge/parser/parser.py
Extended grammar to permit broader clause sequences in write_query, update_query, unwind_query, and call_query. DELETE now accepts property_access and subscript expressions via new delete_expr rule. Parser transformer updated to handle non-Variable delete operands and normalize them into DeleteClause.
AST & Operator Definitions
src/graphforge/ast/clause.py, src/graphforge/planner/operators.py
DeleteClause.variables and Delete.variables broadened from list[str] to list[Any] with updated descriptions. Both classes updated with arbitrary_types_allowed=True in model configuration to permit non-primitive types.
Execution Context & Error Handling
src/graphforge/executor/evaluator.py
Added EntityNotFoundError exception. ExecutionContext now tracks deleted_ids: set[int | str]. _evaluate_graph_function signature updated to accept optional ctx parameter. PropertyAccess and LABELS graph function now raise EntityNotFoundError when accessing entities present in ctx.deleted_ids.
Core Execution Logic
src/graphforge/executor/executor.py
Significant updates to DELETE, MERGE, and SET semantics: _execute_delete now accepts expressions (not just variables), skips CypherNull/None, records deleted IDs to ctx.deleted_ids, and returns input rows instead of empty list. _execute_merge collects all matching nodes with per-match ON MATCH SET application and optional path-variable binding. _execute_set_items implements openCypher null/type-safety behavior—null entity writes are no-ops, non-property assignments skipped, and assigning null removes properties. Output suppression logic updated to detect final write operators more accurately.
Query Planning
src/graphforge/planner/planner.py
Write-clause execution changed from group-and-append to order-preserving emission over original clause sequence. Pre-pass validates MERGE patterns and SET/DELETE variable scopes. OPTIONAL MATCH planning updated to bind path_var into type context. Delete-clause validation extended to handle expression operands.
Test Updates
tests/unit/executor/test_executor_error_paths.py, tests/tck/conftest.py
Unit tests updated to reflect _execute_delete row-through semantics (assertions changed from == [] to len(...) == 1). TCK parameter handling extended to support map literals via enhanced _parse_param_value and _split_list_items to handle {...} nested structures.

Sequence Diagram(s)

sequenceDiagram
    participant Executor as QueryExecutor
    participant Context as ExecutionContext
    participant Evaluator as Evaluator
    participant PropertyAccess as PropertyAccess<br/>(Expression)

    rect rgba(200, 220, 255, 0.5)
        Note over Executor,PropertyAccess: DELETE with deleted_ids tracking
        Executor->>Executor: _execute_delete(delete_targets, ctx)
        Executor->>Executor: process each target (variable or expression)
        Executor->>Context: record target ID to deleted_ids
        Context->>Context: deleted_ids.add(entity_id)
    end

    rect rgba(255, 220, 200, 0.5)
        Note over Executor,PropertyAccess: Subsequent PropertyAccess evaluation
        Executor->>Evaluator: _evaluate_expression(node_ref.prop, ctx)
        Evaluator->>Evaluator: evaluate PropertyAccess(node_ref, 'prop')
        Evaluator->>Context: check if node_ref.id in deleted_ids
        alt Entity was deleted
            Context-->>Evaluator: True (deleted)
            Evaluator-->>Executor: raise EntityNotFoundError
        else Entity exists
            Context-->>Evaluator: False (not deleted)
            Evaluator-->>Executor: return property_value
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related issues

  • fix: SET property and label edge cases #341 — This PR implements property and null-handling semantics for SET operations via _execute_set_items (null-to-remove, type-safety checks), directly addressing the SET edge-case requirements.
  • fix: DELETE edge cases #342 — Extends DELETE semantics to accept expressions and adds EntityNotFoundError for deleted-entity access, aligning with deleted-entity-handling objectives.
  • fix: MERGE edge cases #303 — Updates to MERGE path binding (path_variable construction) and multi-match collection in _execute_merge directly implement the MERGE structural validation and multi-match handling.

Possibly related PRs

Suggested labels

enhancement, parser

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically summarizes the primary change: fixing edge cases in write clauses (SET, DELETE, MERGE), which aligns with the substantial changes across multiple files and systems.
Description check ✅ Passed The description covers the required structure with clear summary of changes, test results (unit tests, TCK tests passing), type-of-change indication, and related issue references, meeting repository template expectations.
Linked Issues check ✅ Passed Code changes fully address the linked issue #341 objectives: SET property/label operations are properly implemented, executor correctly handles null entities and property removal, and all 199 TCK test cases for Delete/Set/Merge now pass.
Out of Scope Changes check ✅ Passed All code modifications are directly related to the Phase 5 write-clause objectives: DELETE expression operands, MERGE multi-match semantics, set-property behaviors, grammar updates, and TCK parameter handling—no extraneous changes detected.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/341-342-303-write-clause-edge-cases

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 0/1 reviews remaining, refill in 44 minutes and 36 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/graphforge/planner/planner.py (1)

1110-1117: ⚠️ Potential issue | 🔴 Critical

OPTIONAL MATCH path binding is planner-only here.

path_var is added to the type context, but OptionalExpandEdges in src/graphforge/planner/operators.py still has no path_var field, so nothing in the plan can actually populate p for OPTIONAL MATCH p = (a)-[r]->(b). That makes the variable look valid during planning and then disappear at execution time.

Based on learnings "Implement Cypher features across all four layers: Parser (cypher.lark + parser.py), AST (ast/ dataclasses), Planner (planner.py + operators.py), Executor (executor.py)".

Also applies to: 1127-1130

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/graphforge/planner/planner.py` around lines 1110 - 1117, The planner adds
a path variable (path_var) for OPTIONAL MATCH but the operator
OptionalExpandEdges (referenced in planner.py) does not accept or store
path_var, so the plan cannot populate the path at execution; update the operator
usage and definition: modify the OptionalExpandEdges constructor call(s) in
planner.py (the occurrences around the current diff and the other block at
1127-1130) to pass path_var, and modify the OptionalExpandEdges class in
src/graphforge/planner/operators.py to accept, store, and expose a path_var
field (and propagate it through any serialization/execution hooks) so the
executor can receive and populate the bound path.
🧹 Nitpick comments (2)
src/graphforge/executor/executor.py (2)

2753-2762: Docstring is outdated after return type change.

The docstring says "Returns: True if pattern was created, False if matched" but the method now returns a 4-tuple (was_created, src, rel, dst). Update the docstring to reflect the new return signature.

📝 Proposed docstring fix
     def _merge_relationship_pattern(self, pattern_parts: list, ctx: ExecutionContext) -> tuple:
         """Merge a relationship pattern (node-rel-node).

         Args:
             pattern_parts: List of pattern parts (nodes and relationships)
             ctx: Execution context

         Returns:
-            True if pattern was created, False if matched
+            Tuple of (was_created, src_node, rel_edge, dst_node):
+            - was_created: True if pattern was created, False if matched
+            - src_node: Source NodeRef
+            - rel_edge: Relationship EdgeRef
+            - dst_node: Destination NodeRef
         """
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/graphforge/executor/executor.py` around lines 2753 - 2762, Update the
_merge_relationship_pattern docstring to reflect the new return value: replace
the old "Returns: True if pattern was created, False if matched" with a
description that the method returns a 4-tuple (was_created, src, rel, dst),
where was_created is a boolean indicating whether the pattern was created, src
is the source node object, rel is the relationship object, and dst is the
destination node object; keep existing Args and brief purpose intact and ensure
the return section documents types and semantics for each tuple element.

2707-2719: ON MATCH SET correctly applied, but code flow is non-obvious.

The extra matched nodes (beyond the first) have ON MATCH SET applied immediately (line 2718) and are appended directly to result. The first match's new_ctx has ON MATCH SET applied later at line 2746. While functionally correct, this dual-path handling makes the code harder to follow.

Consider adding a comment clarifying that the first match is handled by the common ON MATCH path at line 2746.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/graphforge/executor/executor.py` around lines 2707 - 2719, The loop over
matched_nodes applies ON MATCH SET to extra contexts immediately (extra_node ->
extra_ctx via ExecutionContext and _execute_set_items when op.on_match) while
the first match's new_ctx has ON MATCH SET applied later at new_ctx handling;
make this explicit by adding a concise inline comment above the loop (or above
the extra-node branch) stating that extra matches are processed here but the
first match is handled by the common ON MATCH path later (referencing
matched_nodes, extra_node, ExecutionContext, op.on_match, _execute_set_items,
new_ctx, node_pattern.variable, path_variable, CypherPath, and result) so
readers understand the dual-path flow.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/graphforge/executor/evaluator.py`:
- Around line 80-81: ExecutionContext initializes self.deleted_ids but
cloned/derived contexts created with ExecutionContext(...) do not copy it,
allowing deleted-entity checks to be bypassed; update every
ExecutionContext(...) construction in this file (e.g., places creating child
contexts for comprehensions/subqueries and any use sites near evaluator methods)
to propagate the parent.deleted_ids into the new context by assigning
new_ctx.deleted_ids = parent.deleted_ids (or pass it through the constructor if
available) so that DeletedEntityAccess checks remain effective in derived
contexts.
- Around line 4686-4690: The deleted-entity guard currently checks
ctx.deleted_ids only for LABELS; update the same guard logic wherever graph
property-style functions read entity data so PROPERTIES and KEYS also reject
access to deleted entities. Locate the evaluator code paths handling the
constants/operations LABELS, PROPERTIES, and KEYS (and any helper functions that
read entity properties or keys) and add the same conditional: if ctx is not None
and hasattr(ctx, "deleted_ids") and arg.id in ctx.deleted_ids: raise
EntityNotFoundError("DeletedEntityAccess: Tried to access properties/keys of a
deleted entity"), ensuring the error message mentions properties/keys
appropriately and reusing the existing EntityNotFoundError type.

In `@src/graphforge/executor/executor.py`:
- Around line 2521-2526: The DELETE handler currently evaluates expressions via
evaluate_expression(var_ref, ctx, self) and then silently skips when element is
neither CypherNull nor None; change this to validate the runtime type and raise
an error for non-element values: after element = evaluate_expression(var_ref,
ctx, self) keep the CypherNull/None skip, but if element is not an instance of
NodeRef or EdgeRef raise a TypeError (or the project’s standard executor error)
with a clear message indicating DELETE received a non-node/non-relationship
value (include the evaluated type and the var_ref for context); ensure this
check is implemented in the same block handling deletion so DELETE semantics
follow the openCypher spec.

In `@src/graphforge/parser/cypher.lark`:
- Around line 61-63: update_query is over-permissive because all write/return
clauses are optional, allowing a bare MATCH to validate; update the grammar for
update_query (referencing update_query, match_clause, optional_match_clause,
set_clause, remove_clause, create_clause, merge_clause, delete_clause,
return_clause) so that at least one "write or output" clause is required after
MATCH/OPTIONAL MATCH — e.g., replace the trailing optional groups (* and ?
combinations) with a required alternative that includes at least one of
set_clause|remove_clause|create_clause|merge_clause|delete_clause|return_clause
(or make the final return/delete/merge mandatory) so a bare MATCH no longer
matches.

In `@src/graphforge/planner/planner.py`:
- Around line 292-296: The current pre-pass loops over create_clauses and calls
_validate_create_patterns(create.patterns) followed by
_bind_pattern_types(create.patterns), which binds CREATE variables too early and
makes them visible to earlier write validations; change this so you only
pre-validate CREATE patterns (call _validate_create_patterns) in the global
pre-pass and remove the early _bind_pattern_types call there, and instead
perform _bind_pattern_types(create.patterns) inline when handling each write
clause in the ordered write-clause pre-pass (the same pass that validates
SET/DELETE/MERGE), ensuring CREATE bindings happen in order and not before
earlier write validations (update all occurrences where _bind_pattern_types is
currently invoked during the global pre-pass for create_clauses, including the
similar block later).

In `@tests/tck/conftest.py`:
- Around line 1051-1061: The map-parsing loop in conftest.py (inside the block
that detects value_str as a map) can call stripped.index(":") and raise a
generic ValueError for malformed items; update the loop in the map branch to
check for a colon (e.g., if ":" not in stripped) before calling index/slicing
and raise a clear ValueError mentioning the offending raw_item and context
(e.g., "invalid map entry: '<item>'"), then continue parsing valid entries using
the existing key extraction and _parse_param_value call so failures are explicit
and debuggable.
- Around line 298-300: The map key is emitted verbatim in
_param_to_cypher_literal, which can produce invalid Cypher when keys are not
valid identifiers; update the dict branch in _param_to_cypher_literal to check
each key: if it matches a valid Cypher identifier pattern (e.g.
/^[A-Za-z_][A-Za-z0-9_]*$/) emit as-is, otherwise emit the key as a properly
escaped single-quoted string (escape internal single quotes) so non-identifier
keys produce valid Cypher map literals.

---

Outside diff comments:
In `@src/graphforge/planner/planner.py`:
- Around line 1110-1117: The planner adds a path variable (path_var) for
OPTIONAL MATCH but the operator OptionalExpandEdges (referenced in planner.py)
does not accept or store path_var, so the plan cannot populate the path at
execution; update the operator usage and definition: modify the
OptionalExpandEdges constructor call(s) in planner.py (the occurrences around
the current diff and the other block at 1127-1130) to pass path_var, and modify
the OptionalExpandEdges class in src/graphforge/planner/operators.py to accept,
store, and expose a path_var field (and propagate it through any
serialization/execution hooks) so the executor can receive and populate the
bound path.

---

Nitpick comments:
In `@src/graphforge/executor/executor.py`:
- Around line 2753-2762: Update the _merge_relationship_pattern docstring to
reflect the new return value: replace the old "Returns: True if pattern was
created, False if matched" with a description that the method returns a 4-tuple
(was_created, src, rel, dst), where was_created is a boolean indicating whether
the pattern was created, src is the source node object, rel is the relationship
object, and dst is the destination node object; keep existing Args and brief
purpose intact and ensure the return section documents types and semantics for
each tuple element.
- Around line 2707-2719: The loop over matched_nodes applies ON MATCH SET to
extra contexts immediately (extra_node -> extra_ctx via ExecutionContext and
_execute_set_items when op.on_match) while the first match's new_ctx has ON
MATCH SET applied later at new_ctx handling; make this explicit by adding a
concise inline comment above the loop (or above the extra-node branch) stating
that extra matches are processed here but the first match is handled by the
common ON MATCH path later (referencing matched_nodes, extra_node,
ExecutionContext, op.on_match, _execute_set_items, new_ctx,
node_pattern.variable, path_variable, CypherPath, and result) so readers
understand the dual-path flow.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 70bbfa5c-d9ad-4cf0-b45c-34278c30d0bd

📥 Commits

Reviewing files that changed from the base of the PR and between 1a08006 and a134ee6.

📒 Files selected for processing (9)
  • src/graphforge/ast/clause.py
  • src/graphforge/executor/evaluator.py
  • src/graphforge/executor/executor.py
  • src/graphforge/parser/cypher.lark
  • src/graphforge/parser/parser.py
  • src/graphforge/planner/operators.py
  • src/graphforge/planner/planner.py
  • tests/tck/conftest.py
  • tests/unit/executor/test_executor_error_paths.py

Comment on lines +80 to +81
# Track deleted entity IDs (set of int) so property access raises DeletedEntityAccess
self.deleted_ids: set[int | str] = set()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Propagate deleted_ids into derived execution contexts.

deleted_ids is initialized on ExecutionContext, but derived contexts created during expression evaluation don’t copy it. That allows deleted-entity access checks to be bypassed inside comprehensions/subqueries (e.g., property access evaluated in a child context).

Suggested fix
 class ExecutionContext:
@@
     def __init__(self):
@@
         self.deleted_ids: set[int | str] = set()
+
+    `@classmethod`
+    def from_parent(cls, parent: "ExecutionContext") -> "ExecutionContext":
+        child = cls()
+        child.bindings = dict(parent.bindings)
+        child.aggregate_cache = parent.aggregate_cache
+        child.deleted_ids = set(parent.deleted_ids)
+        return child
-            new_ctx = ExecutionContext()
-            new_ctx.bindings = dict(ctx.bindings)
-            new_ctx.aggregate_cache = ctx.aggregate_cache
+            new_ctx = ExecutionContext.from_parent(ctx)

Apply the same replacement to all other ExecutionContext() cloning sites in this file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/graphforge/executor/evaluator.py` around lines 80 - 81, ExecutionContext
initializes self.deleted_ids but cloned/derived contexts created with
ExecutionContext(...) do not copy it, allowing deleted-entity checks to be
bypassed; update every ExecutionContext(...) construction in this file (e.g.,
places creating child contexts for comprehensions/subqueries and any use sites
near evaluator methods) to propagate the parent.deleted_ids into the new context
by assigning new_ctx.deleted_ids = parent.deleted_ids (or pass it through the
constructor if available) so that DeletedEntityAccess checks remain effective in
derived contexts.

Comment on lines +4686 to +4690
# Check if entity was deleted in this execution context
if ctx is not None and hasattr(ctx, "deleted_ids") and arg.id in ctx.deleted_ids:
raise EntityNotFoundError(
"DeletedEntityAccess: Tried to access labels of a deleted entity"
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Deleted-entity guard is incomplete for graph property-style functions.

You added the deleted check for LABELS, but PROPERTIES and KEYS still read from deleted entities. That creates inconsistent deleted-access behavior for equivalent property-oriented access patterns.

Suggested fix
 def _evaluate_graph_function(
@@
 ) -> CypherValue:
+    def _ensure_not_deleted(arg: NodeRef | EdgeRef) -> None:
+        if ctx is not None and arg.id in ctx.deleted_ids:
+            raise EntityNotFoundError(
+                "DeletedEntityAccess: Tried to access data of a deleted entity"
+            )
@@
     if func_name == "LABELS":
@@
         if isinstance(arg, NodeRef):
-            # Check if entity was deleted in this execution context
-            if ctx is not None and hasattr(ctx, "deleted_ids") and arg.id in ctx.deleted_ids:
-                raise EntityNotFoundError(
-                    "DeletedEntityAccess: Tried to access labels of a deleted entity"
-                )
+            _ensure_not_deleted(arg)
             labels_list: list[CypherValue] = [CypherString(label) for label in sorted(arg.labels)]
             return CypherList(labels_list)
@@
     if func_name == "PROPERTIES":
@@
         if isinstance(arg, (NodeRef, EdgeRef)):
+            _ensure_not_deleted(arg)
             return CypherMap(dict(arg.properties))
@@
     if func_name == "KEYS":
@@
         if isinstance(arg, (NodeRef, EdgeRef)):
+            _ensure_not_deleted(arg)
             keys_list: list[CypherValue] = [CypherString(k) for k in sorted(arg.properties)]
             return CypherList(keys_list)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/graphforge/executor/evaluator.py` around lines 4686 - 4690, The
deleted-entity guard currently checks ctx.deleted_ids only for LABELS; update
the same guard logic wherever graph property-style functions read entity data so
PROPERTIES and KEYS also reject access to deleted entities. Locate the evaluator
code paths handling the constants/operations LABELS, PROPERTIES, and KEYS (and
any helper functions that read entity properties or keys) and add the same
conditional: if ctx is not None and hasattr(ctx, "deleted_ids") and arg.id in
ctx.deleted_ids: raise EntityNotFoundError("DeletedEntityAccess: Tried to access
properties/keys of a deleted entity"), ensuring the error message mentions
properties/keys appropriately and reusing the existing EntityNotFoundError type.

Comment on lines +2521 to +2526
else:
element = evaluate_expression(var_ref, ctx, self)

# Delete from graph
if isinstance(element, NodeRef):
# Get all edges connected to this node
outgoing = self.graph.get_outgoing_edges(element.id)
incoming = self.graph.get_incoming_edges(element.id)
all_edges = outgoing + incoming

# Check if node has relationships
if all_edges and not op.detach:
raise ValueError(
"Cannot delete node with relationships. "
"Use DETACH DELETE to delete relationships first."
)
# Skip NULL values - they don't exist in the graph
if isinstance(element, CypherNull) or element is None:
continue
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Does openCypher DELETE clause raise an error when attempting to delete a non-node/non-relationship value like a property?

💡 Result:

Yes, the openCypher DELETE clause raises an error when attempting to delete a non-node/non-relationship value like a property.

Citations:


DELETE should raise an error for non-element types, not silently skip them.

Per the openCypher specification (Neo4j Cypher manual), attempting to delete a non-node/non-relationship value (e.g., a property accessed via DELETE n.name) must raise an error. The current code silently skips evaluation when element is neither NodeRef, EdgeRef, CypherNull, nor None, which violates the spec. Add error handling to raise a TypeError or equivalent when a non-element type is provided to DELETE.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/graphforge/executor/executor.py` around lines 2521 - 2526, The DELETE
handler currently evaluates expressions via evaluate_expression(var_ref, ctx,
self) and then silently skips when element is neither CypherNull nor None;
change this to validate the runtime type and raise an error for non-element
values: after element = evaluate_expression(var_ref, ctx, self) keep the
CypherNull/None skip, but if element is not an instance of NodeRef or EdgeRef
raise a TypeError (or the project’s standard executor error) with a clear
message indicating DELETE received a non-node/non-relationship value (include
the evaluated type and the var_ref for context); ensure this check is
implemented in the same block handling deletion so DELETE semantics follow the
openCypher spec.

Comment on lines +61 to +63
update_query: match_clause where_clause? optional_match_clause+ where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause?
| match_clause where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause?
| optional_match_clause where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause?
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

update_query now accepts no-op MATCH endings (over-permissive grammar).

At Line 61 / Line 62 / Line 63, all update parts are optional (* + ?), so a bare MATCH ... can match update_query without any write clause or RETURN. That weakens syntax validation and can mask invalid query endings.

Suggested grammar tightening
-update_query: match_clause where_clause? optional_match_clause+ where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause?
-            | match_clause where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause?
-            | optional_match_clause where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause?
+update_query: match_clause where_clause? optional_match_clause+ where_clause? ((set_clause | remove_clause | create_clause | merge_clause)+ delete_clause? merge_clause* | delete_clause merge_clause*) return_clause? order_by_clause? skip_clause? limit_clause?
+            | match_clause where_clause? ((set_clause | remove_clause | create_clause | merge_clause)+ delete_clause? merge_clause* | delete_clause merge_clause*) return_clause? order_by_clause? skip_clause? limit_clause?
+            | optional_match_clause where_clause? ((set_clause | remove_clause | create_clause | merge_clause)+ delete_clause? merge_clause* | delete_clause merge_clause*) return_clause? order_by_clause? skip_clause? limit_clause?
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
update_query: match_clause where_clause? optional_match_clause+ where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause?
| match_clause where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause?
| optional_match_clause where_clause? (set_clause | remove_clause | create_clause | merge_clause)* delete_clause? merge_clause* return_clause? order_by_clause? skip_clause? limit_clause?
update_query: match_clause where_clause? optional_match_clause+ where_clause? ((set_clause | remove_clause | create_clause | merge_clause)+ delete_clause? merge_clause* | delete_clause merge_clause*) return_clause? order_by_clause? skip_clause? limit_clause?
| match_clause where_clause? ((set_clause | remove_clause | create_clause | merge_clause)+ delete_clause? merge_clause* | delete_clause merge_clause*) return_clause? order_by_clause? skip_clause? limit_clause?
| optional_match_clause where_clause? ((set_clause | remove_clause | create_clause | merge_clause)+ delete_clause? merge_clause* | delete_clause merge_clause*) return_clause? order_by_clause? skip_clause? limit_clause?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/graphforge/parser/cypher.lark` around lines 61 - 63, update_query is
over-permissive because all write/return clauses are optional, allowing a bare
MATCH to validate; update the grammar for update_query (referencing
update_query, match_clause, optional_match_clause, set_clause, remove_clause,
create_clause, merge_clause, delete_clause, return_clause) so that at least one
"write or output" clause is required after MATCH/OPTIONAL MATCH — e.g., replace
the trailing optional groups (* and ? combinations) with a required alternative
that includes at least one of
set_clause|remove_clause|create_clause|merge_clause|delete_clause|return_clause
(or make the final return/delete/merge mandatory) so a bare MATCH no longer
matches.

Comment on lines +292 to 296
# 3. Pre-validate CREATE patterns (binding happens inline in step 5)
for create in create_clauses:
self._validate_create_patterns(create.patterns)
self._bind_pattern_types(create.patterns)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't bind future CREATE variables before validating earlier writes.

This makes later CREATE bindings visible to earlier SET/DELETE/MERGE validation. For example, MATCH (a) SET b.name = 1 CREATE (b) will now pass planning even though b is not in scope at the SET. CREATE validation/binding needs to happen inside the ordered write-clause pre-pass, not in a global pre-pass.

Suggested fix
-        # 3. Pre-validate CREATE patterns (binding happens inline in step 5)
-        for create in create_clauses:
-            self._validate_create_patterns(create.patterns)
-            self._bind_pattern_types(create.patterns)
-
-        # 4. WHERE (before write operations so filter applies to reading results)
+        # 3. WHERE (before write operations so filter applies to reading results)
         if where_clause:
             ...
 
-        # 5. Write operations (MERGE, SET, REMOVE, DELETE, CREATE) in original clause order
+        # 4. Write operations (MERGE, SET, REMOVE, DELETE, CREATE) in original clause order
         for clause in clauses:
             if isinstance(clause, MergeClause):
                 self._validate_merge_patterns(clause.patterns)
                 self._validate_pattern_types(clause.patterns)
                 if clause.on_create:
                     self._validate_set_clause_variables(clause.on_create)
                 if clause.on_match:
                     self._validate_set_clause_variables(clause.on_match)
             elif isinstance(clause, CreateClause):
-                pass  # already validated above
+                self._validate_create_patterns(clause.patterns)
+                self._bind_pattern_types(clause.patterns)
             elif isinstance(clause, DeleteClause):
                 self._validate_delete_clause_variables(clause)
             elif isinstance(clause, SetClause):
                 self._validate_set_clause_variables(clause)

Also applies to: 313-327

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/graphforge/planner/planner.py` around lines 292 - 296, The current
pre-pass loops over create_clauses and calls
_validate_create_patterns(create.patterns) followed by
_bind_pattern_types(create.patterns), which binds CREATE variables too early and
makes them visible to earlier write validations; change this so you only
pre-validate CREATE patterns (call _validate_create_patterns) in the global
pre-pass and remove the early _bind_pattern_types call there, and instead
perform _bind_pattern_types(create.patterns) inline when handling each write
clause in the ordered write-clause pre-pass (the same pass that validates
SET/DELETE/MERGE), ensuring CREATE bindings happen in order and not before
earlier write validations (update all occurrences where _bind_pattern_types is
currently invoked during the global pre-pass for create_clauses, including the
similar block later).

Comment thread tests/tck/conftest.py Outdated
Comment thread tests/tck/conftest.py
Comment on lines +1051 to +1061
if value_str.startswith("{") and value_str.endswith("}"):
result: dict = {}
for raw_item in _split_list_items(value_str[1:-1]):
stripped = raw_item.strip()
if not stripped:
continue
colon_idx = stripped.index(":")
key = stripped[:colon_idx].strip()
val = _parse_param_value(stripped[colon_idx + 1 :].strip())
result[key] = val
return result
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Handle invalid map entries explicitly instead of crashing on index(':').

Line 1057 can throw a generic ValueError for malformed items; prefer explicit validation with a clear failure message.

Suggested fix
     if value_str.startswith("{") and value_str.endswith("}"):
         result: dict = {}
         for raw_item in _split_list_items(value_str[1:-1]):
             stripped = raw_item.strip()
             if not stripped:
                 continue
-            colon_idx = stripped.index(":")
-            key = stripped[:colon_idx].strip()
-            val = _parse_param_value(stripped[colon_idx + 1 :].strip())
+            if ":" not in stripped:
+                raise ValueError(f"Invalid map entry (missing ':'): {stripped!r}")
+            key, _, raw_val = stripped.partition(":")
+            key = key.strip()
+            if len(key) >= 2 and key[0] == key[-1] and key[0] in ("'", '"'):
+                key = key[1:-1]
+            val = _parse_param_value(raw_val.strip())
             result[key] = val
         return result
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/tck/conftest.py` around lines 1051 - 1061, The map-parsing loop in
conftest.py (inside the block that detects value_str as a map) can call
stripped.index(":") and raise a generic ValueError for malformed items; update
the loop in the map branch to check for a colon (e.g., if ":" not in stripped)
before calling index/slicing and raise a clear ValueError mentioning the
offending raw_item and context (e.g., "invalid map entry: '<item>'"), then
continue parsing valid entries using the existing key extraction and
_parse_param_value call so failures are explicit and debuggable.

- Add path_var field to OptionalExpandEdges operator and bind CypherPath
  in _execute_optional_expand for all match/no-match branches
- Pass path_var from planner when creating OptionalExpandEdges
- Update _merge_relationship_pattern docstring to document 4-tuple return
- Add inline comment clarifying dual-path ON MATCH SET flow in _execute_merge
- Fix _param_to_cypher_literal: escape non-identifier dict keys with single quotes
  (e.g. keys with spaces or special chars) in TCK conftest parameter substitution

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@DecisionNerd DecisionNerd merged commit ab52507 into main Apr 29, 2026
32 checks passed
@DecisionNerd DecisionNerd deleted the fix/341-342-303-write-clause-edge-cases branch April 29, 2026 04:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: SET property and label edge cases

1 participant