feat: implement REMOVE clause for properties and labels#42
Conversation
Add REMOVE clause support to allow removing properties from nodes/relationships and labels from nodes. Changes: - Grammar: Add remove_clause with support for property and label removal - AST: Add RemoveClause and RemoveItem classes - Parser: Add transformers for remove_clause, remove_property, remove_label - Planner: Add Remove operator and integrate into query planning - Executor: Implement _execute_remove with proper NodeRef immutability handling - Grammar: Allow multiple SET/REMOVE clauses in any order in update_query - Grammar: Support RETURN with ORDER BY/SKIP/LIMIT after REMOVE Features: - Remove single or multiple properties: REMOVE n.age, n.city - Remove single or multiple labels: REMOVE n:Person, n:Employee - Mix property and label removal: REMOVE n.age, n:Label - Works with WHERE filtering and RETURN projection - Compatible with SET and other update operations - Properly handles NodeRef immutability by creating new instances Tests: - 16 comprehensive integration tests covering: - Property removal (single, multiple, nonexistent, from relationships) - Label removal (single, multiple, nonexistent) - Mixed property/label removal - Integration with WHERE, SET, RETURN, ORDER BY, SKIP, LIMIT - Edge cases (all properties, no matches, relationship properties) All tests passing (723 passed, 14 skipped). Coverage: 94.87% (above 85% threshold). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
WalkthroughThis PR implements the REMOVE clause feature, enabling removal of properties and labels from nodes and relationships. Changes span the Cypher parser (grammar and transformer), query planner (operator definition and planning logic), executor (removal implementation), and comprehensive test coverage across parser, planner, and integration layers. Changes
Sequence DiagramsequenceDiagram
participant Parser
participant Planner
participant Executor
participant Graph
Parser->>Parser: Parse REMOVE clause<br/>(variable.property or variable:Label)
Parser->>Planner: Emit RemoveClause AST<br/>(contains list of RemoveItem)
Planner->>Planner: Plan query: create Remove operator<br/>(from RemoveClause items)
Executor->>Executor: _execute_remove(Remove, input_rows)
loop For each input row and item
alt Property Removal
Executor->>Graph: Get element (NodeRef/EdgeRef)
Graph-->>Executor: Return element with properties
Executor->>Graph: Delete property from element
Graph->>Graph: Update properties
else Label Removal
Executor->>Graph: Get NodeRef with labels
Graph-->>Executor: Return NodeRef
Executor->>Graph: Create new NodeRef<br/>(with updated labels)
Graph->>Graph: Add new node to graph
Executor->>Executor: Update binding to<br/>reference new NodeRef
end
end
Executor-->>Executor: Return updated contexts
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #42 +/- ##
==========================================
+ Coverage 92.59% 92.68% +0.08%
==========================================
Files 15 15
Lines 1810 1858 +48
Branches 446 456 +10
==========================================
+ Hits 1676 1722 +46
Misses 50 50
- Partials 84 86 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
|
Add parser and planner unit tests plus additional integration tests to improve patch coverage from 31.25% to target >80%. Added tests: - 12 parser unit tests for all REMOVE grammar patterns - 5 planner unit tests for Remove operator creation - 6 additional integration tests for edge cases New tests cover: - Parser: single/multiple properties, labels, mixed, with WHERE/RETURN/SET/ORDER BY/LIMIT - Planner: operator creation, ordering, mixed SET/REMOVE - Integration: multiple variables, idempotency, same property twice, label removal edge cases Test results: - 38 REMOVE-specific tests (all passing) - 745 total tests passing (14 skipped) - Coverage maintained at 94.87% Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/graphforge/parser/cypher.lark`:
- Around line 56-60: The grammar's remove_item currently only allows a single
label via "variable label", so update the remove_item rule to accept multiple
labels by replacing the second alternative with "variable labels" (keeping the
existing property_access -> remove_property and remove_label action names).
Modify the remove_item production so remove_clause, remove_item, remove_label,
labels and property_access are used consistently with the labels+ rule (same
pattern as node_pattern) to support REMOVE n:Person:Employee.
In `@src/graphforge/planner/planner.py`:
- Around line 108-109: The planner currently overwrites remove_clause for each
RemoveClause instance (the elif branch handling RemoveClause), discarding
earlier REMOVE clauses; change the collector to append each RemoveClause into a
list (e.g., remove_clauses) similar to
match_clauses/create_clauses/merge_clauses, and then update the
operator-building code that consumes remove_clause to iterate over
remove_clauses and construct/apply an operator per clause; ensure you update any
variable names and any code paths that expected a single remove_clause to handle
the list (e.g., places that referenced remove_clause when creating the
logical/remove operator).
🧹 Nitpick comments (1)
src/graphforge/ast/clause.py (1)
58-70: LGTM! Consider using an Enum foritem_typefor better type safety.The
RemoveItemdataclass is well-documented and follows the existing pattern. Using a string literal foritem_typeis pragmatic, but an Enum (e.g.,RemoveItemType.PROPERTY,RemoveItemType.LABEL) would provide better IDE autocompletion and catch typos at parse time rather than runtime.
| // REMOVE clause | ||
| remove_clause: "REMOVE"i remove_item ("," remove_item)* | ||
|
|
||
| remove_item: property_access -> remove_property | ||
| | variable label -> remove_label |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the full cypher.lark file around the remove_clause
cat -n src/graphforge/parser/cypher.lark | head -100Repository: DecisionNerd/graphforge
Length of output: 4064
🏁 Script executed:
# Look for the label rules to understand how labels are defined
rg -n "^label" src/graphforge/parser/cypher.lark -A 2Repository: DecisionNerd/graphforge
Length of output: 156
🏁 Script executed:
# Check if there's a plural 'labels' rule
rg -n "labels" src/graphforge/parser/cypher.larkRepository: DecisionNerd/graphforge
Length of output: 140
🏁 Script executed:
# Look for the RemoveClause in clause.py to see the docstring
rg -n "class RemoveClause" src/graphforge/ast/clause.py -A 20Repository: DecisionNerd/graphforge
Length of output: 556
🏁 Script executed:
# Search for tests with multi-label removal
rg -n "Person:Employee\|remove.*:" tests/ --type py -B 2 -A 2Repository: DecisionNerd/graphforge
Length of output: 49
Grammar does not support multi-label removal syntax shown in docstring.
The remove_item rule uses variable label (singular), which only handles single label removal like REMOVE n:Person. However, the RemoveClause docstring in src/graphforge/ast/clause.py lists REMOVE n:Person:Employee as a valid example.
The grammar defines a labels: label+ rule that supports multiple labels, but remove_item doesn't use it. Update line 60 in src/graphforge/parser/cypher.lark:
remove_item: property_access -> remove_property
| variable labels -> remove_labelThis aligns the grammar with the documented syntax and the existing labels rule pattern used elsewhere (e.g., node_pattern).
🤖 Prompt for AI Agents
In `@src/graphforge/parser/cypher.lark` around lines 56 - 60, The grammar's
remove_item currently only allows a single label via "variable label", so update
the remove_item rule to accept multiple labels by replacing the second
alternative with "variable labels" (keeping the existing property_access ->
remove_property and remove_label action names). Modify the remove_item
production so remove_clause, remove_item, remove_label, labels and
property_access are used consistently with the labels+ rule (same pattern as
node_pattern) to support REMOVE n:Person:Employee.
| elif isinstance(clause, RemoveClause): | ||
| remove_clause = clause |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n src/graphforge/planner/planner.py | sed -n '80,160p'Repository: DecisionNerd/graphforge
Length of output: 3388
🏁 Script executed:
find . -type f -name "*.py" -o -name "*.lark" -o -name "*.g4" | grep -i "gram\|parse" | head -20Repository: DecisionNerd/graphforge
Length of output: 359
🏁 Script executed:
cat src/graphforge/parser/cypher.lark | grep -A 5 -B 5 "remove_clause\|RemoveClause"Repository: DecisionNerd/graphforge
Length of output: 1006
🏁 Script executed:
cat src/graphforge/parser/cypher.lark | head -40Repository: DecisionNerd/graphforge
Length of output: 1782
🏁 Script executed:
cat src/graphforge/parser/test_remove_parser.pyRepository: DecisionNerd/graphforge
Length of output: 142
🏁 Script executed:
cat tests/unit/parser/test_remove_parser.pyRepository: DecisionNerd/graphforge
Length of output: 5218
🏁 Script executed:
rg -n "test.*multiple.*set.*clause" tests/ -A 10Repository: DecisionNerd/graphforge
Length of output: 49
🏁 Script executed:
grep -r "SET.*SET" tests/unit/parser/ --include="*.py" -A 5 -B 5Repository: DecisionNerd/graphforge
Length of output: 49
🏁 Script executed:
rg "MATCH.*SET.*SET" tests/Repository: DecisionNerd/graphforge
Length of output: 49
🏁 Script executed:
cat tests/unit/parser/test_parser.py | grep -A 10 "def test"Repository: DecisionNerd/graphforge
Length of output: 15544
🏁 Script executed:
grep -A 15 "test_parse_multiple_set" tests/unit/parser/test_parser.pyRepository: DecisionNerd/graphforge
Length of output: 49
🏁 Script executed:
rg "SET.*SET" tests/ -t pyRepository: DecisionNerd/graphforge
Length of output: 49
🏁 Script executed:
web_search
Does Neo4j Cypher support multiple REMOVE clauses in a single queryRepository: DecisionNerd/graphforge
Length of output: 160
🌐 Web query:
Does Neo4j Cypher support multiple REMOVE clauses in a single query
💡 Result:
Yes. Cypher queries are built by chaining clauses together, so you can use more than one REMOVE clause in the same query (e.g., ... REMOVE n.prop REMOVE n:Label ...). [1]
That said, you can often do it in a single REMOVE by removing multiple labels at once (e.g., REMOVE n:German:Swedish). [2]
Example
MATCH (n:Person {id: $id})
REMOVE n.tempFlag
REMOVE n:Deprecated
RETURN n;Or (single REMOVE for multiple labels)
MATCH (n:Person {id: $id})
REMOVE n:Deprecated:Legacy
RETURN n;Sources: [1] [2]
🏁 Script executed:
cat -n tests/unit/parser/test_remove_parser.py | sed -n '95,110p'Repository: DecisionNerd/graphforge
Length of output: 753
🏁 Script executed:
grep -n "test_parse.*multiple.*set" tests/unit/parser/ -r -iRepository: DecisionNerd/graphforge
Length of output: 49
Multiple REMOVE clauses are silently discarded — only the last one is preserved in the logical plan.
The grammar and parser correctly support multiple REMOVE clauses (e.g., MATCH (n) REMOVE n.age REMOVE n:Temp), and Neo4j Cypher officially supports this syntax. However, the planner's collector pattern overwrites remove_clause on each iteration (lines 108–109), keeping only the final clause. This differs from how match_clauses, create_clauses, and merge_clauses are handled—those use lists to preserve all occurrences.
Refactor to collect remove_clause instances into a list and iterate over all of them when building operators, matching the pattern used for other repeatable clauses.
🤖 Prompt for AI Agents
In `@src/graphforge/planner/planner.py` around lines 108 - 109, The planner
currently overwrites remove_clause for each RemoveClause instance (the elif
branch handling RemoveClause), discarding earlier REMOVE clauses; change the
collector to append each RemoveClause into a list (e.g., remove_clauses) similar
to match_clauses/create_clauses/merge_clauses, and then update the
operator-building code that consumes remove_clause to iterate over
remove_clauses and construct/apply an operator per clause; ensure you update any
variable names and any code paths that expected a single remove_clause to handle
the list (e.g., places that referenced remove_clause when creating the
logical/remove operator).
Update the workflow documentation to match the actual development process that has been successfully used for all v0.2 features. Changes: - Update issue status (7/9 features complete) - Add comprehensive `make pre-push` step (critical requirement) - Update commit message format with actual examples - Add detailed test requirements (45-70 tests per feature) - Include implementation patterns for all layers - Add pre-push checklist for verification - Document coverage requirements (>85% overall, >80% patch) - Add reference to PR #44 as example implementation - Include CI/CD pipeline details - Add common pitfalls and debugging tips - Update PR creation process with actual gh CLI commands - Document test file structure and locations - Add comprehensive summary section The updated workflow emphasizes: 1. Comprehensive testing (unit + integration) 2. Running `make pre-push` before all PRs 3. Proper commit message format 4. Clear PR documentation with examples 5. Maintaining high test coverage Reference implementations: - PR #44 (Arithmetic operators): 68 tests, 95% coverage - PR #42 (REMOVE clause): 38 tests, 94% coverage - PR #43 (String matching): 68 tests, 95% coverage Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update the workflow documentation to match the actual development process, making it version-agnostic and suitable for all future development. Changes: - Remove v0.2-specific content and status tracking - Make document version-agnostic (no milestone references) - Add comprehensive `make pre-push` step (critical requirement) - Update commit message format with actual examples - Add detailed test requirements (45-70 tests per feature) - Include implementation patterns for all layers - Add pre-push checklist for verification - Document coverage requirements (>85% overall, >80% patch) - Add reference implementations (PR #44, #43, #42) - Include CI/CD pipeline details - Add common pitfalls and debugging tips - Update PR creation process with actual gh CLI commands - Document test file structure and locations - Add comprehensive summary section - Include branch type prefixes (feature/, fix/, docs/, etc.) The updated workflow emphasizes: 1. Comprehensive testing (unit + integration) 2. Running `make pre-push` before all PRs 3. Proper commit message format 4. Clear PR documentation with examples 5. Maintaining high test coverage Reference implementations: - PR #44 (Arithmetic operators): 68 tests, 95% coverage - PR #43 (String matching): 68 tests, 95% coverage - PR #42 (REMOVE clause): 38 tests, 95% coverage Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: update ISSUE_WORKFLOW.md to reflect current process Update the workflow documentation to match the actual development process, making it version-agnostic and suitable for all future development. Changes: - Remove v0.2-specific content and status tracking - Make document version-agnostic (no milestone references) - Add comprehensive `make pre-push` step (critical requirement) - Update commit message format with actual examples - Add detailed test requirements (45-70 tests per feature) - Include implementation patterns for all layers - Add pre-push checklist for verification - Document coverage requirements (>85% overall, >80% patch) - Add reference implementations (PR #44, #43, #42) - Include CI/CD pipeline details - Add common pitfalls and debugging tips - Update PR creation process with actual gh CLI commands - Document test file structure and locations - Add comprehensive summary section - Include branch type prefixes (feature/, fix/, docs/, etc.) The updated workflow emphasizes: 1. Comprehensive testing (unit + integration) 2. Running `make pre-push` before all PRs 3. Proper commit message format 4. Clear PR documentation with examples 5. Maintaining high test coverage Reference implementations: - PR #44 (Arithmetic operators): 68 tests, 95% coverage - PR #43 (String matching): 68 tests, 95% coverage - PR #42 (REMOVE clause): 38 tests, 95% coverage Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix: improve test directory structure formatting Change tree structure from box-drawing characters to simple indentation for better markdown rendering across different viewers. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Summary
Implements issue #25: REMOVE clause for removing properties from nodes/relationships and labels from nodes.
Changes
Grammar (cypher.lark)
AST (ast/clause.py)
Parser (parser.py)
Planner (planner.py, operators.py)
Executor (executor.py)
Features
Tests
Added 16 comprehensive integration tests in test_remove_clause.py:
Property Removal Tests (5)
Label Removal Tests (3)
Mixed Tests (2)
Integration Tests (3)
Edge Cases (3)
Test Results
Examples
Remove property:
Remove label:
Remove multiple properties and labels:
Remove with filtering and ordering:
Notes
Checklist
Generated with Claude Code
Summary by CodeRabbit