fix: accept WITH after SET/REMOVE/DELETE and in chained multi-part queries#310
Conversation
…eries (#257) Restructures the multi_part_query grammar rule to support all openCypher-valid WITH clause positions: - Grammar: Replace reading_or_writing_clauses+ with_clause+ with multi_part_segment+ pattern, where each segment is (segment_clause* with_clause) - Grammar: segment_clause now includes set_clause, remove_clause, delete_clause in addition to reading and writing clauses, allowing patterns like: MATCH (n) WITH n SET n.x = 1 WITH n RETURN n MATCH (n) WITH n REMOVE n:T WITH n RETURN n MATCH (n) SET n.x = 1 WITH n RETURN n - Grammar: exists_expr and count_expr now accept query (not single_part_query) inside {}, enabling WITH inside EXISTS/COUNT subqueries - Transformer: Add multi_part_segment and segment_clause handlers that return flat clause lists for consistent flattening in multi_part_query - Tests: 23 new parse + execution tests covering all new patterns Fixes patterns where WITH appeared after update clauses (SET, REMOVE, DELETE) in multi-part pipelines, which previously raised grammar parse errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
WalkthroughRestructures the Cypher grammar to support WITH in more clause positions via segment-based multi-part queries; updates expressions (EXISTS, COUNT, quantifiers) and FLOAT token; adds transformer methods for segments; and adds unit and integration tests validating WITH clause positions and execution. Changes
Sequence Diagram(s)mermaid Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
tests/unit/parser/test_with_clause_positions.py (2)
11-79: Consider using@pytest.mark.parametrizeto reduce duplication.The parse-level tests follow a repetitive pattern: call
parse_cypher()and assert the result is notNone. This could be consolidated into a single parametrized test for better maintainability.♻️ Example refactor using parametrize
import pytest class TestWithClausePositionParsing: """Parse-level tests — verify grammar accepts all WITH positions.""" `@pytest.mark.parametrize`("query,description", [ ("MATCH (n) SET n.x = 1 WITH n RETURN n", "MATCH SET WITH RETURN"), ("MATCH (n:T) REMOVE n:T WITH n RETURN n", "MATCH REMOVE WITH RETURN"), ("MATCH (n:T) DELETE n WITH 1 AS done RETURN done", "MATCH DELETE WITH RETURN"), ("MATCH (n) WITH n SET n.x = 1 WITH n RETURN n", "two WITH segments"), ("MATCH (n:T) WITH n REMOVE n:T WITH n RETURN n", "WITH REMOVE WITH"), ("MATCH (a) WITH a MATCH (b) WITH a, b RETURN a, b", "triple part"), ("MATCH (a) WITH a MATCH (b) WITH a, b MATCH (c) WITH a, b, c RETURN a, b, c", "three WITH segments"), ("CREATE (n:T) WITH n SET n.x = 1 WITH n RETURN n.x AS val", "CREATE WITH SET WITH"), ("MATCH (n) SET n.x = 1 RETURN n", "SET without subsequent WITH"), ("MATCH (n) WITH n WITH n AS m RETURN m", "consecutive WITH clauses"), ("MATCH (n) WHERE EXISTS { MATCH (n)-[]->(m) WITH m RETURN m } RETURN n", "EXISTS with WITH inside"), ("MATCH (n) WHERE n.x > 1 WITH n RETURN n", "WHERE WITH"), ("UNWIND [1, 2, 3] AS x WITH x MATCH (n {id: x}) RETURN n", "UNWIND WITH MATCH"), ]) def test_with_clause_positions(self, query, description): """Verify grammar accepts WITH in various positions.""" ast = parse_cypher(query) assert ast is not None, f"Failed to parse: {description}"As per coding guidelines: "Use
@pytest.mark.parametrizefor testing the same logic with different inputs to avoid code duplication."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/parser/test_with_clause_positions.py` around lines 11 - 79, Tests in TestWithClausePositionParsing duplicate the same parse_cypher() + assert pattern across many test_... methods; consolidate them by replacing the individual test_* methods with a single parametrized test using pytest.mark.parametrize that iterates over the query strings and descriptions and calls parse_cypher(query) with assert ast is not None (include a helpful message), referencing the existing test class TestWithClausePositionParsing and the parse_cypher function to locate where to change.
82-84: Execution tests may belong intests/integration/.These tests exercise the full stack via
GraphForge.execute(), which is end-to-end query execution rather than isolated unit testing. Per the test organization guidelines, integration tests that verify end-to-end behavior should reside intests/integration/.As per coding guidelines: "Organize tests into unit/ (< 1ms, isolated), integration/ (end-to-end queries)..." and "Integration tests must test end-to-end Cypher query execution with GraphForge.execute()."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/parser/test_with_clause_positions.py` around lines 82 - 84, Tests under the TestWithClausePositionExecution class exercise end-to-end query execution via GraphForge.execute() and should be moved from unit to integration tests; relocate the TestWithClausePositionExecution class and its tests to tests/integration/, update any imports or test helpers (e.g., ensure GraphForge and fixtures are imported from the integration test utilities), and adjust test module/package names or pytest markers as needed so the tests run with the integration suite rather than the fast unit suite.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/graphforge/parser/cypher.lark`:
- Around line 31-33: Remove the unused grammar rule writing_clause (which simply
aliases create_clause and merge_clause) from cypher.lark and delete the
corresponding unused transformer method in parser.py (the writing_clause
transformer), since segment_clause already references create_clause and
merge_clause directly and nothing calls writing_clause; also remove the
misleading "backward compatibility" comment. Ensure no other rules or tests
reference writing_clause before deleting.
In `@src/graphforge/parser/parser.py`:
- Around line 148-150: The branch checking for isinstance(item, WithClause) in
parser.py is dead code because WithClause instances are only produced nested
inside lists by multi_part_segment and never appear directly in
multi_part_query; remove the elif block that appends item to all_clauses (the
WithClause branch) and ensure all_clauses continues to be built from
list-unpacked clauses produced by multi_part_segment so the remaining logic
handles WithClause instances when present.
---
Nitpick comments:
In `@tests/unit/parser/test_with_clause_positions.py`:
- Around line 11-79: Tests in TestWithClausePositionParsing duplicate the same
parse_cypher() + assert pattern across many test_... methods; consolidate them
by replacing the individual test_* methods with a single parametrized test using
pytest.mark.parametrize that iterates over the query strings and descriptions
and calls parse_cypher(query) with assert ast is not None (include a helpful
message), referencing the existing test class TestWithClausePositionParsing and
the parse_cypher function to locate where to change.
- Around line 82-84: Tests under the TestWithClausePositionExecution class
exercise end-to-end query execution via GraphForge.execute() and should be moved
from unit to integration tests; relocate the TestWithClausePositionExecution
class and its tests to tests/integration/, update any imports or test helpers
(e.g., ensure GraphForge and fixtures are imported from the integration test
utilities), and adjust test module/package names or pytest markers as needed so
the tests run with the integration suite rather than the fast unit suite.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/graphforge/parser/cypher.larksrc/graphforge/parser/parser.pytests/unit/parser/test_with_clause_positions.py
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #310 +/- ##
==========================================
+ Coverage 90.09% 90.11% +0.01%
==========================================
Files 38 38
Lines 10512 10537 +25
Branches 2155 2158 +3
==========================================
+ Hits 9471 9495 +24
- Misses 726 731 +5
+ Partials 315 311 -4
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
|
- Remove dead `writing_clause` grammar rule and its transformer methods (`writing_clause`, `reading_or_writing_clauses`) — superseded by `segment_clause` which covers create/merge directly - Remove unreachable `isinstance(item, WithClause)` branch in `multi_part_query` transformer — `multi_part_segment` always returns lists - Parametrize `TestWithClausePositionParsing` with `@pytest.mark.parametrize` to reduce repetition - Move `TestWithClausePositionExecution` from `tests/unit/` to `tests/integration/test_with_clause_positions.py` per project conventions Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (4)
tests/integration/test_with_clause_positions.py (2)
42-53:test_three_segment_chainshould validate result values, not only row count.At Line 52, a length check alone can miss projection/scoping regressions across the chained
WITHsegments.As per coding guidelines `tests/integration/**/*.py`: Integration tests must test end-to-end Cypher query execution with GraphForge.execute() and verify result values.🧪 Suggested assertion upgrade
def test_three_segment_chain(self): """Three-segment multi-part query works correctly.""" gf = GraphForge() - gf.execute("CREATE (:A)-[:R]->(:B)-[:R]->(:C)") + gf.execute("CREATE (:A {v: 1})-[:R]->(:B {v: 2})-[:R]->(:C {v: 3})") result = gf.execute( "MATCH (a:A) WITH a " "MATCH (a)-[:R]->(b:B) WITH a, b " "MATCH (b)-[:R]->(c:C) WITH a, b, c " - "RETURN a, b, c" + "RETURN a.v AS av, b.v AS bv, c.v AS cv" ) assert len(result) == 1 + assert result[0]["av"].value == 1 + assert result[0]["bv"].value == 2 + assert result[0]["cv"].value == 3🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_with_clause_positions.py` around lines 42 - 53, Update test_three_segment_chain to assert the returned row contains the expected node values rather than only checking len(result). After running gf.execute in test_three_segment_chain, inspect result[0] and assert that keys 'a', 'b', and 'c' exist and have the expected properties (e.g., labels or unique identifiers) produced by the CREATE call; use the GraphForge.execute return shape (result[0]['a'], result[0]['b'], result[0]['c']) to compare against expected node identities/labels so the test validates projection/scoping across the chained WITH segments.
10-108: Add an end-to-endDELETE ... WITH ...execution test.
DELETEis part of the feature scope, but this integration suite currently validatesSETandREMOVEflows only. Adding one delete case would close that gap.🧪 Suggested additional integration test
class TestWithClausePositionExecution: @@ + def test_match_delete_with_return(self): + """MATCH ... DELETE ... WITH ... RETURN executes correctly.""" + gf = GraphForge() + gf.execute("CREATE (:T {id: 1})") + result = gf.execute("MATCH (n:T) DELETE n WITH 1 AS done RETURN done") + assert len(result) == 1 + assert result[0]["done"].value == 1 + remaining = gf.execute("MATCH (n:T) RETURN n") + assert len(remaining) == 0🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/test_with_clause_positions.py` around lines 10 - 108, Add an integration test in TestWithClausePositionExecution (e.g., def test_delete_with_with_return) that covers DELETE combined with WITH: create two :T nodes (distinct v values), run a multi-segment query that MATCHes one node and DELETEs it then uses WITH to continue the query and MATCH remaining :T nodes (for example "MATCH (n:T {v:1}) DELETE n WITH 1 AS dummy MATCH (m:T) RETURN count(m) AS cnt"), and assert the returned count equals 1 to confirm the deleted node is gone; reference the class TestWithClausePositionExecution and add the new test method name (test_delete_with_with_return) to locate where to add it.src/graphforge/parser/parser.py (1)
145-152: Guard against silent clause loss inmulti_part_query.At Line 145, items that are neither
listnorCypherQueryare dropped silently. This can hide grammar/transformer drift and produce truncated ASTs without a clear failure.♻️ Proposed hardening
for item in items: if isinstance(item, list): # multi_part_segment returns a list of clauses all_clauses.extend(item) elif isinstance(item, CypherQuery): # final_query_part returns a CypherQuery all_clauses.extend(item.clauses) + else: + raise TypeError( + f"Unexpected item in multi_part_query transformer: {type(item).__name__}" + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/graphforge/parser/parser.py` around lines 145 - 152, The loop over items in multi_part_query silently drops any element that is not a list or CypherQuery (variables: items, all_clauses, CypherQuery), so add a defensive else branch that surfaces unexpected types: detect the unexpected item, raise a clear exception (or at minimum log an error with the item's type and repr) referencing the multi_part_query context and include the offending item and its type to fail fast and avoid silent clause loss; ensure the message mentions the unexpected type and the variable name (item) so downstream callers can trace grammar/transformer drift.tests/unit/parser/test_with_clause_positions.py (1)
11-31: Strengthen parse assertions and include aCOUNT { ... WITH ... }case.At Line 40,
assert ast is not Noneonly proves “no exception.” It won’t catch AST-shape regressions from the new segment/count paths. Also, the changedcount_exprgrammar path is not covered in this test set.As per coding guidelines `tests/unit/**/*.py`: Unit tests must test a single component in isolation using fixtures like empty_graph and assertions on AST/operator types.🧪 Suggested test hardening
+from graphforge.ast.query import CypherQuery from graphforge.parser.parser import parse_cypher _WITH_POSITION_CASES = [ ("MATCH (n) SET n.x = 1 WITH n RETURN n", "MATCH SET WITH RETURN"), @@ ( "MATCH (n) WHERE EXISTS { MATCH (n)-[]->(m) WITH m RETURN m } RETURN n", "EXISTS subquery with WITH inside", ), + ( + "MATCH (n) WHERE COUNT { MATCH (n)-[]->(m) WITH m RETURN m } > 0 RETURN n", + "COUNT subquery with WITH inside", + ), @@ def test_with_clause_positions(self, query: str, description: str) -> None: """Verify grammar accepts WITH in various positions.""" ast = parse_cypher(query) - assert ast is not None, f"Failed to parse: {description}" + assert isinstance(ast, CypherQuery), f"Unexpected AST type for: {description}" + assert ast.clauses, f"Empty AST clauses for: {description}"Also applies to: 37-41
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/parser/test_with_clause_positions.py` around lines 11 - 31, The test currently only checks `ast is not None` (involving `_WITH_POSITION_CASES`) which won't detect AST-shape regressions or the new `count_expr` grammar path; update the test to (1) add a case covering a COUNT with an inner subquery that contains a WITH (e.g. a "COUNT { ... WITH ... }" scenario added to `_WITH_POSITION_CASES`), (2) replace or augment the weak `assert ast is not None` with concrete assertions against the parsed AST/operator types (use the same parsing entry used in the test to produce `ast` and assert expected node types/structure for the top-level statement and the WITH segment), and (3) use the prescribed fixture (e.g. `empty_graph`) so the unit test isolates parsing behavior rather than execution context.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/graphforge/parser/parser.py`:
- Around line 145-152: The loop over items in multi_part_query silently drops
any element that is not a list or CypherQuery (variables: items, all_clauses,
CypherQuery), so add a defensive else branch that surfaces unexpected types:
detect the unexpected item, raise a clear exception (or at minimum log an error
with the item's type and repr) referencing the multi_part_query context and
include the offending item and its type to fail fast and avoid silent clause
loss; ensure the message mentions the unexpected type and the variable name
(item) so downstream callers can trace grammar/transformer drift.
In `@tests/integration/test_with_clause_positions.py`:
- Around line 42-53: Update test_three_segment_chain to assert the returned row
contains the expected node values rather than only checking len(result). After
running gf.execute in test_three_segment_chain, inspect result[0] and assert
that keys 'a', 'b', and 'c' exist and have the expected properties (e.g., labels
or unique identifiers) produced by the CREATE call; use the GraphForge.execute
return shape (result[0]['a'], result[0]['b'], result[0]['c']) to compare against
expected node identities/labels so the test validates projection/scoping across
the chained WITH segments.
- Around line 10-108: Add an integration test in TestWithClausePositionExecution
(e.g., def test_delete_with_with_return) that covers DELETE combined with WITH:
create two :T nodes (distinct v values), run a multi-segment query that MATCHes
one node and DELETEs it then uses WITH to continue the query and MATCH remaining
:T nodes (for example "MATCH (n:T {v:1}) DELETE n WITH 1 AS dummy MATCH (m:T)
RETURN count(m) AS cnt"), and assert the returned count equals 1 to confirm the
deleted node is gone; reference the class TestWithClausePositionExecution and
add the new test method name (test_delete_with_with_return) to locate where to
add it.
In `@tests/unit/parser/test_with_clause_positions.py`:
- Around line 11-31: The test currently only checks `ast is not None` (involving
`_WITH_POSITION_CASES`) which won't detect AST-shape regressions or the new
`count_expr` grammar path; update the test to (1) add a case covering a COUNT
with an inner subquery that contains a WITH (e.g. a "COUNT { ... WITH ... }"
scenario added to `_WITH_POSITION_CASES`), (2) replace or augment the weak
`assert ast is not None` with concrete assertions against the parsed
AST/operator types (use the same parsing entry used in the test to produce `ast`
and assert expected node types/structure for the top-level statement and the
WITH segment), and (3) use the prescribed fixture (e.g. `empty_graph`) so the
unit test isolates parsing behavior rather than execution context.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/graphforge/parser/cypher.larksrc/graphforge/parser/parser.pytests/integration/test_with_clause_positions.pytests/unit/parser/test_with_clause_positions.py
Grammar and parser changes were already included in #310 (squashed). This commit adds the corresponding test suite and TCK harness fix: - Tests: 31 unit tests covering all float literal forms: - Standard: 1.0, 3.14 (regression) - Leading-dot: .1, .5, .3405892687 - Exponent notation: 1e9, 1E9, 1e-5, 1e308, 1.5e10, 2.3E-4 - Mixed: .1e9, .1e-5, .1E-5 - Overflow: 1.34E999 raises ValueError - Negative (unary minus + float): -1e9, -.5, etc. - TCK harness: fix _parse_value in conftest.py to handle exponent-notation floats like 1e308 and 1e-305 that have no decimal point (previously fell through to CypherString instead of CypherFloat) All 30 Literals5 TCK scenarios now pass. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…311) Grammar and parser changes were already included in #310 (squashed). This commit adds the corresponding test suite and TCK harness fix: - Tests: 31 unit tests covering all float literal forms: - Standard: 1.0, 3.14 (regression) - Leading-dot: .1, .5, .3405892687 - Exponent notation: 1e9, 1E9, 1e-5, 1e308, 1.5e10, 2.3E-4 - Mixed: .1e9, .1e-5, .1E-5 - Overflow: 1.34E999 raises ValueError - Negative (unary minus + float): -1e9, -.5, etc. - TCK harness: fix _parse_value in conftest.py to handle exponent-notation floats like 1e308 and 1e-305 that have no decimal point (previously fell through to CypherString instead of CypherFloat) All 30 Literals5 TCK scenarios now pass. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Closes #257
Summary
multi_part_queryfromreading_or_writing_clauses+ with_clause+into a propermulti_part_segment+pattern where each segment is(segment_clause* with_clause), matching the openCypher specsegment_clausecovers all clause types before WITH: reading clauses (MATCH, OPTIONAL MATCH, UNWIND, CALL), writing clauses (CREATE, MERGE), and updating clauses (SET, REMOVE, DELETE)exists_expr/count_exprnow acceptqueryinside{}, enabling WITH inside EXISTS/COUNT subqueriesmulti_part_segmentandsegment_clausehandlers flatten clauses into the existingCypherQueryaccumulatorPatterns Now Supported
```cypher
-- SET before WITH
MATCH (n) SET n.x = 1 WITH n RETURN n
-- REMOVE before WITH
MATCH (n:T) REMOVE n:T WITH n RETURN n
-- SET after a prior WITH (chained segment)
MATCH (n) WITH n SET n.x = 99 WITH n RETURN n.x AS val
-- Three-segment chains
MATCH (a) WITH a MATCH (b) WITH a, b MATCH (c) WITH a, b, c RETURN a, b, c
-- EXISTS containing WITH
MATCH (n) WHERE EXISTS { MATCH (n)-[]->(m) WITH m RETURN m } RETURN n
```
Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests