feat: implement MERGE ON MATCH SET syntax#66
Conversation
## Summary Implements `MERGE ... ON MATCH SET` syntax to complete MERGE enhancements and fully unblock Neo4j example datasets. This follows Issue #57 (ON CREATE SET) and completes Phase 1 of the v0.3.0 roadmap. ## Changes ### Grammar - Extended Lark grammar to support `on_match_clause` rule - Updated `merge_action` to accept both ON CREATE and ON MATCH - Case-insensitive keywords (`ON MATCH SET`) ### AST - Added `on_match` field to `MergeClause` dataclass - Type: `SetClause | None` for optional ON MATCH clause - Updated examples to show combined usage ### Parser - Added `on_match_clause()` transformer - Updated `merge_clause()` to handle both on_create and on_match - Maintains backward compatibility ### Planner - Updated `Merge` operator with `on_match` field - Planner passes both `on_create` and `on_match` to operator ### Executor - Enhanced `_execute_merge()` to track create vs match state - Conditionally executes SET based on state: - `if was_created and op.on_create` → Execute ON CREATE SET - `elif not was_created and op.on_match` → Execute ON MATCH SET - Updated docstring ## Testing ### Test Coverage | Category | Tests | Status | |----------|-------|--------| | Parser | 10 | ✅ All passing | | Executor | 11 | ✅ All passing | | Integration | 11 | ✅ All passing | | **Total** | **32** | **✅ All passing** | ### Test Categories **Parser Tests:** - Single and multiple property assignments - Both ON CREATE and ON MATCH together - Case-insensitive keywords - Backward compatibility **Executor Tests:** - ON MATCH executes when matching existing nodes - ON MATCH does NOT execute when creating new nodes - Both ON CREATE and ON MATCH in same statement - State tracking across multiple MERGE calls **Integration Tests:** - Neo4j timestamp tracking patterns - Counter increment patterns - Status workflow patterns - Complex queries with WHERE, RETURN - Edge cases and performance tests ## Examples ### ON MATCH SET Only ```cypher MERGE (n:Person {id: 1}) ON MATCH SET n.updated = true ``` ### Both ON CREATE and ON MATCH ```cypher MERGE (n:Person {id: 1}) ON CREATE SET n.created = timestamp() ON MATCH SET n.updated = timestamp() ``` ### Neo4j Timestamp Pattern ```cypher MERGE (u:User {id: 'user123'}) ON CREATE SET u.created = 100 ON MATCH SET u.lastSeen = 200 ``` ## Verification - ✅ All 32 new tests pass - ✅ All 1018 total tests pass (backward compatibility verified) - ✅ 95.68% coverage (meets threshold) - ✅ Grammar correctly parses both clauses - ✅ Executor correctly implements conditional logic - ✅ No regressions ## Impact Completes MERGE enhancement from Issue #57. Together with ON CREATE SET, this fully supports the Neo4j dataset patterns: - neo4j-movie-graph - neo4j-northwind - neo4j-game-of-thrones - neo4j-fincen-files - neo4j-twitter ## Related - Part of v0.3.0 roadmap (Phase 1) - Follows #57 (MERGE ON CREATE SET) - Closes #58 Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
WalkthroughThis PR implements MERGE ... ON MATCH SET syntax throughout the GraphForge pipeline. It extends the grammar to recognize ON MATCH clauses, updates the AST to include an on_match field in MergeClause, modifies the parser to handle the new clause, updates the planner to propagate on_match to the Merge operator, and adjusts executor logic to conditionally execute ON MATCH SET only when an existing pattern is matched. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Parser
participant AST
participant Planner
participant Executor
participant Graph
Client->>Parser: MERGE (...) ON CREATE SET ... ON MATCH SET ...
Parser->>AST: Parse ON MATCH clause → MergeClause(patterns, on_create, on_match)
AST->>Planner: Build plan with Merge(patterns, on_create, on_match)
Planner->>Executor: Execute Merge operator
alt Pattern exists
Executor->>Graph: Query pattern
Graph-->>Executor: Match found
Executor->>Graph: Execute ON MATCH SET properties
Graph-->>Executor: Properties updated on existing node
else Pattern does not exist
Executor->>Graph: Query pattern
Graph-->>Executor: No match
Executor->>Graph: Create node + Execute ON CREATE SET properties
Graph-->>Executor: Node created with properties
end
Executor-->>Client: Return result
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #66 +/- ##
==========================================
+ Coverage 93.74% 93.82% +0.08%
==========================================
Files 15 15
Lines 1998 2009 +11
Branches 498 501 +3
==========================================
+ Hits 1873 1885 +12
+ Misses 45 44 -1
Partials 80 80
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
|
Version Bump: - Bump version to 0.2.1 in pyproject.toml, __init__.py, and uv.lock - Add comprehensive v0.2.1 changelog entry Documentation Updates: - Update README.md with dataset loading examples and quickstart - Add "Load Real-World Datasets" section to main README - Update docs/index.md with dataset features and examples - Complete rewrite of docs/datasets/snap.md: - Mark as available in v0.2.1 (5 datasets) - Add detailed dataset table with stats - Add comprehensive usage examples and query patterns - Document download/caching behavior - Add performance tips for large datasets - Update docs/datasets/overview.md: - Reorganize to show SNAP as "Available Now" - Mark other sources as "Coming Soon" - List all 5 available SNAP datasets - Update docs/getting-started/quickstart.md: - Add "Load a Dataset" section with examples - Add dataset browsing examples - Update navigation links Release Contents (v0.2.1): - Dataset loading infrastructure with caching (#68) - CSV loader for edge-list datasets (#69) - 5 SNAP datasets available - MERGE ON CREATE SET syntax (#65) - MERGE ON MATCH SET syntax (#66) - WITH clause variable passing fix (#67) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: prepare v0.2.1 release with dataset documentation Version Bump: - Bump version to 0.2.1 in pyproject.toml, __init__.py, and uv.lock - Add comprehensive v0.2.1 changelog entry Documentation Updates: - Update README.md with dataset loading examples and quickstart - Add "Load Real-World Datasets" section to main README - Update docs/index.md with dataset features and examples - Complete rewrite of docs/datasets/snap.md: - Mark as available in v0.2.1 (5 datasets) - Add detailed dataset table with stats - Add comprehensive usage examples and query patterns - Document download/caching behavior - Add performance tips for large datasets - Update docs/datasets/overview.md: - Reorganize to show SNAP as "Available Now" - Mark other sources as "Coming Soon" - List all 5 available SNAP datasets - Update docs/getting-started/quickstart.md: - Add "Load a Dataset" section with examples - Add dataset browsing examples - Update navigation links Release Contents (v0.2.1): - Dataset loading infrastructure with caching (#68) - CSV loader for edge-list datasets (#69) - 5 SNAP datasets available - MERGE ON CREATE SET syntax (#65) - MERGE ON MATCH SET syntax (#66) - WITH clause variable passing fix (#67) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * chore: update uv.lock after version bump --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Summary
Implements
MERGE ... ON MATCH SETsyntax to complete MERGE enhancements and fully unblock Neo4j example datasets.This follows PR #65 (ON CREATE SET) and completes Phase 1 of the v0.3.0 roadmap.
Changes
🔧 Grammar
on_match_clauserulemerge_actionto accept both ON CREATE and ON MATCHON MATCH SET)📦 AST
on_matchfield toMergeClausedataclassSetClause | Nonefor optional ON MATCH clause🔄 Parser
on_match_clause()transformermerge_clause()to handle both on_create and on_match📋 Planner
Mergeoperator withon_matchfieldon_createandon_matchto operator⚙️ Executor
_execute_merge()to track create vs match stateif was_created and op.on_create→ Execute ON CREATE SETelif not was_created and op.on_match→ Execute ON MATCH SETTesting
✅ Test Coverage
📊 Test Categories
Parser Tests:
Executor Tests:
Integration Tests:
Examples
ON MATCH SET Only
Both ON CREATE and ON MATCH
Neo4j Timestamp Pattern
Counter Increment Pattern
Verification
Impact
🎯 Completes MERGE Enhancement
Together with PR #65 (ON CREATE SET), this fully supports all Neo4j dataset patterns:
neo4j-movie-graph(170 nodes, 250 edges)neo4j-northwind(1K nodes, 3K edges)neo4j-game-of-thrones(800 nodes, 3K edges)neo4j-fincen-files(500 nodes, 1.5K edges)neo4j-twitter(2K nodes, 8K edges)✨ Real-World Use Cases
Related
Notes
🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
New Features
Tests