feat: implement MERGE enhancements - after MATCH, relationship patterns, and SET support#183
Conversation
…ns, and SET support (#163, #164, #165) - Grammar: Allow MERGE after MATCH (issue #163) - Grammar: Allow SET after MERGE without ON CREATE/ON MATCH (issue #165) - Executor: Implement relationship MERGE patterns (issue #164) - Executor: Add _merge_relationship_pattern() for node-rel-node patterns - Executor: Add _merge_node_in_context() helper for node resolution - Tests: Unskip 4 previously failing test cases All tests passing, coverage at 90.38% total. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WalkthroughAdds executor support for MERGE on node-relationship-node patterns and node-creation helpers, extends the grammar to allow MERGE (and optional SET) after MATCH, and enables previously skipped integration tests for complex MERGE patterns. Changes
Sequence Diagram(s)sequenceDiagram
participant Parser
participant Executor
participant NodeOps as Node Operations
participant RelOps as Relationship Operations
participant Context
Parser->>Executor: Execute MERGE pattern (a)-[r]->(b)
Executor->>NodeOps: Resolve or create source node (a)
NodeOps->>Context: Bind source node
Executor->>NodeOps: Resolve or create destination node (b)
NodeOps->>Context: Bind destination node
Executor->>RelOps: Search for relationship by type/properties between nodes
alt Relationship found
RelOps->>Context: Bind existing relationship (was_created=false)
else Relationship not found
RelOps->>Executor: Create relationship
RelOps->>Context: Bind new relationship (was_created=true)
end
Executor->>Context: Apply ON CREATE / ON MATCH SET based on was_created
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 6✅ Passed checks (6 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@src/graphforge/executor/executor.py`:
- Around line 2264-2315: The _merge_relationship_pattern logic only inspects
outgoing edges and ignores rel_pattern.direction; update it to branch on
rel_pattern.direction (Direction.OUT, Direction.IN, Direction.UNDIRECTED): for
OUT continue using get_outgoing_edges(src_node.id) and check edge.dst.id, for IN
use get_incoming_edges(src_node.id) and check edge.src.id, and for UNDIRECTED
iterate both incoming and outgoing (merge/deduplicate edges before property
checks) to find found_rel; finally ensure when creating a new relationship for
the IN direction you swap src_node and dst_node in the creation call so the
stored edge direction is correct.
🧹 Nitpick comments (4)
src/graphforge/executor/executor.py (3)
2236-2252: UseTypeErrorfor type validation checks.Lines 2248, 2250, and 2252 validate that pattern elements are the correct type (NodePattern, RelationshipPattern). Per Python conventions and the Ruff TRY004 hint, these should raise
TypeErrorrather thanValueError.♻️ Proposed fix
- if not isinstance(src_pattern, NodePattern): - raise ValueError("First element of relationship pattern must be a node") - if not isinstance(rel_pattern, RelationshipPattern): - raise ValueError("Second element of relationship pattern must be a relationship") - if not isinstance(dst_pattern, NodePattern): - raise ValueError("Third element of relationship pattern must be a node") + if not isinstance(src_pattern, NodePattern): + raise TypeError("First element of relationship pattern must be a node") + if not isinstance(rel_pattern, RelationshipPattern): + raise TypeError("Second element of relationship pattern must be a relationship") + if not isinstance(dst_pattern, NodePattern): + raise TypeError("Third element of relationship pattern must be a node")
2332-2427: Significant code duplication with_execute_mergenode-finding logic.
_merge_node_in_context(lines 2354–2416) is nearly identical to the node-matching block in_execute_merge(lines 2128–2195). Consider extracting the shared "find node by labels+properties" logic into a private helper (e.g.,_find_node_by_pattern) that both call sites invoke. This would reduce maintenance burden and the risk of the two paths drifting apart.
2224-2330: MERGE relationship semantics differ from Neo4j's atomic pattern matching.In Neo4j,
MERGE (a:X)-[r:R]->(b:Y)treats the entire pattern atomically: it looks for the full triple and, if not found, creates all three elements fresh (even if matching nodes already exist). This implementation resolves nodes independently first (_merge_node_in_contextfor src, then dst), then checks only the relationship — which reuses existing nodes. This is a deliberate simplification but diverges from standard openCypher semantics.If this is intentional, a brief doc-comment would help future maintainers. If full openCypher fidelity is a goal, this needs to be revisited.
src/graphforge/parser/cypher.lark (1)
48-48: Standalone MERGE path lacks ORDER BY / SKIP / LIMIT support.Line 48 (
merge_clause set_clause? return_clause?) does not allow ORDER BY, SKIP, or LIMIT after RETURN, while the MATCH+MERGE alternative on line 50 does. A query likeMERGE (n:Person) RETURN n ORDER BY n.name LIMIT 5will fail to parse. This is consistent with the existingcreate_clause return_clause?on line 47, so it's pre-existing, but now that you're touching this line (addingset_clause?), it might be worth aligning:- | merge_clause set_clause? return_clause? + | merge_clause set_clause? return_clause? order_by_clause? skip_clause? limit_clause?
Codecov Report❌ Patch coverage is ❌ Your patch status has failed because the patch coverage (50.00%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #183 +/- ##
==========================================
- Coverage 88.18% 87.37% -0.81%
==========================================
Files 34 34
Lines 5892 6022 +130
Branches 1563 1610 +47
==========================================
+ Hits 5196 5262 +66
- Misses 396 430 +34
- Partials 300 330 +30
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
|
- Handle Direction.IN by checking incoming edges and swapping src/dst on creation - Handle Direction.UNDIRECTED by checking both directions - Add 3 new tests for incoming and undirected MERGE patterns - Fix linter warning with simplified comparison All tests passing (13/13 complex MERGE, 76/76 total MERGE tests) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (4)
tests/integration/test_complex_merge_patterns.py (1)
9-320: Consider extracting a shared fixture for test isolation.All tests in this class manually instantiate
GraphForge(). A@pytest.fixtureyielding a fresh instance would reduce boilerplate and make the isolation guarantee explicit. This is a pre-existing pattern, so not blocking.Example fixture
import pytest `@pytest.fixture` def gf(): return GraphForge()Then each test method would accept
gfas a parameter instead of creating its own.As per coding guidelines: "Use test fixtures to ensure test isolation with fresh GraphForge instances for each test."
src/graphforge/executor/executor.py (3)
2247-2252: UseTypeErrorfor type-validation checks.These guard invalid pattern types, not invalid values. Ruff (TRY004) correctly flags this.
Proposed fix
- if not isinstance(src_pattern, NodePattern): - raise ValueError("First element of relationship pattern must be a node") - if not isinstance(rel_pattern, RelationshipPattern): - raise ValueError("Second element of relationship pattern must be a relationship") - if not isinstance(dst_pattern, NodePattern): - raise ValueError("Third element of relationship pattern must be a node") + if not isinstance(src_pattern, NodePattern): + raise TypeError("First element of relationship pattern must be a node") + if not isinstance(rel_pattern, RelationshipPattern): + raise TypeError("Second element of relationship pattern must be a relationship") + if not isinstance(dst_pattern, NodePattern): + raise TypeError("Third element of relationship pattern must be a node")
2366-2461: Significant duplication with_execute_mergenode-matching logic.The node-finding logic in
_merge_node_in_context(lines 2388–2450) is nearly identical to the node-matching block in_execute_merge(lines 2128–2206). Consider extracting a shared_find_matching_node(node_pattern, ctx)helper that both paths call, to reduce the maintenance surface.This isn't blocking since the code is correct, but the ~60 duplicated lines will diverge over time.
2224-2241: TODO left for multi-hop MERGE support.Line 2237 has a TODO for multi-hop pattern support. This is fine for now since 3-part patterns cover the common case, but consider tracking this as an issue.
Would you like me to open a GitHub issue to track multi-hop MERGE pattern support?
Closes #163, #164, #165
Summary
MATCH (a) MERGE (a)-[r]->(b))MERGE (a)-[r:KNOWS]->(b))MERGE (n) SET n.prop = value)Changes
Grammar (cypher.lark)
merge_clause set_clause? return_clause?to allow SET after MERGEmatch_clause where_clause? merge_clause+ set_clause? return_clause? ...to allow MERGE after MATCHExecutor (executor.py)
_execute_merge()to handle relationship patterns (3+ part patterns)_merge_relationship_pattern()to match/create node-rel-node patterns_merge_node_in_context()helper to resolve or create nodes with variable binding supportTests
Test Results
All 76 MERGE-related integration tests pass:
Coverage: 90.38% total
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes / Improvements
Tests