fix: [cypher] label disjunction (n:A|B) returned 0 rows#4221
Conversation
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 26 |
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
There was a problem hiding this comment.
Code Review
This pull request introduces support for label disjunction node patterns in Cypher queries, such as (n:A|B). A new physical operator, NodeByLabelDisjunctionScan, has been implemented to scan multiple vertex types and return matching nodes. The logical plan and index selection rules were updated to identify disjunctions and apply the new operator accordingly. Comprehensive regression tests were also added in CypherLabelDisjunctionTest. I have no feedback to provide as no review comments were submitted.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4221 +/- ##
==========================================
+ Coverage 64.62% 65.00% +0.38%
==========================================
Files 1643 1644 +1
Lines 126818 126894 +76
Branches 27149 27167 +18
==========================================
+ Hits 81955 82490 +535
+ Misses 33418 32844 -574
- Partials 11445 11560 +115 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@claude review this |
|
Claude finished @robfrank's task in 2m 42s —— View job Code Review
SummaryThe fix is correct and the root cause analysis is accurate. The optimizer path was always folding Issues Found1.
2. // LogicalNode.java:82
sb.append(":").append(String.join(":", labels));This always renders labels as sb.append(":").append(String.join(labelDisjunction ? "|" : ":", labels));3. The // IndexSelectionRule.java:111
if (anchor.getNode().isLabelDisjunction() && labels.size() > 1) {If Observations (Not Blocking)4.
This is consistent with how 5. Test coverage gaps The 8 regression tests are well-targeted. Two scenarios worth adding:
6. Coverage report discrepancy Codecov shows 0% diff coverage on VerdictThe fix is clean, well-scoped, and consistent with the existing codebase patterns. Item 2 ( |
The optimizer treated multiple labels as AND conjunction unconditionally, calling Labels.getCompositeTypeName which returned "A~B" — a type that does not exist in the schema, so NodeByLabelScan returned 0 rows. LogicalNode now carries the labelDisjunction flag from NodePattern, and IndexSelectionRule routes disjunction patterns to a new NodeByLabelDisjunctionScan operator that iterates each VertexType once and includes types that are instanceOf any disjunction label. Closes #4211
feccdee to
b0b8d6e
Compare
Code ReviewOverviewThis PR adds optimizer-path support for Cypher label disjunction patterns ( Critical: New operator is dead code - planner gate not updated
// Multi-label nodes not yet supported in optimizer
if (node.getLabels().size() > 1)
return false;This check does not distinguish between conjunction ( The tests pass, but they exercise the legacy path - not the new operator. Codecov confirms this: 0% coverage on The planner gate needs to be updated to allow disjunctions through: // Multi-label conjunction (n:A:B) not yet supported in optimizer.
// Disjunction (n:A|B) is handled by NodeByLabelDisjunctionScan.
if (node.getLabels().size() > 1 && !node.isLabelDisjunction())
return false;Without this change, the PR's stated root cause (optimizer calling Test coverage gaps
Code quality notes
The three comments that were removed ("// FULL SCAN", "// INDEX SEEK (equality)", "// RANGE SCAN - pass predicates for runtime parameter resolution") served as navigation markers for the branching logic. Consider keeping them since the code is routing between meaningfully different physical strategies.
The two-constructor approach (3-arg delegating to 4-arg) is clean and backward-compatible. Using Summary
The architecture and operator implementation are sound. The missing piece is wiring the planner to allow disjunctions through to the optimizer, and adding a test that actually exercises the new operator end-to-end. |
Code review revealed the new operator was dead code: the planner gate at CypherExecutionPlanner.shouldUseOptimizer falls back to the legacy path for every multi-label node, so NodeByLabelDisjunctionScan was never invoked. The disjunction tests passed via the legacy MatchNodeStep (already fixed for issue #4105). The gate now allows disjunction patterns through to the optimizer when the node has no incident relationships, since ExpandAll/ExpandInto carry a single targetLabel and cannot represent OR semantics in non-anchor position. Adds a subtype/inheritance regression test (Animal|Pet matches A:Dog node via instanceOf hierarchy traversal). Renames the internal cursor variable for clarity.
|
The critical finding is correct. Verified with a probe ( Fixed in 4845bd6:
Skipped the target-side disjunction test: it exposes a pre-existing gap in Skipped restoring the The commit message on the original commit (b0b8d6e) overstated the root cause; the new commit explains the actual wiring. |
Code Review - PR #4221: fix: [cypher] label disjunction (n:A|B) returned 0 rowsOverviewThis PR fixes a real, reproducible bug: Overall this is a clean, well-structured fix. A few things worth addressing before merging: Issues1. Missing WHERE predicate pushdown (moderate)
MATCH (n:A|B) WHERE n.status = 'active'all matching nodes must be fully materialized before the filter step runs, rather than filtering during the scan. For large datasets this will be noticeably slower than the single-label case. At minimum this should be documented as a known limitation, ideally with a follow-up ticket. 2.
|
Adds regression test for MATCH (n:A|B)-[:REL]->(m) routing through the legacy MatchNodeStep (planner gate falls back when relationships are incident on a disjunction node). Documents the polymorphic=false choice in NodeByLabelDisjunctionScan.buildMatchingIterators.
|
Per-item evaluation: #1 WHERE predicate pushdown - Skipped. Pushdown is a perf optimization, not a correctness gap, and the existing tests show WHERE works as a separate filter step. Worth a follow-up issue but out of scope for fixing #4211. #2 Eager iterator construction - Skipped. #3 #4 Relationship-pattern fallback test - Added in 159204b: Minor: Minor: removed inline comments in |
Code Review - fix: [cypher] label disjunction (n:A|B) returned 0 rowsOverviewGood, targeted fix for a clear bug. The root cause (optimizer collapsing Code QualityStrengths
Issues 1. // NodeByLabelDisjunctionScan.java
public List<String> getLabels() {
return labels;
}Callers can modify the internal 2. No predicate pushdown support
This does not block the PR, but is worth a TODO comment or follow-up issue. 3. Duplicate-label edge case produces duplicate rows
for (final String label : labels) {
if (type.instanceOf(label)) {
iterators.add(iter);
break; // correct: avoids duplicate for a type matching multiple labels
}
}The 4. Overly conservative optimizer fallback if (path.getRelationshipCount() > 0)
return false;This applies to the whole path whenever any node in the path has a multi-label disjunction. For the pattern Test Coverage
database.command("sql", "CREATE VERTEX TYPE Animal");
database.command("sql", "CREATE VERTEX TYPE Dog EXTENDS Animal");
database.command("opencypher", "CREATE (:Dog {id: 10})");
Minor Nits
SummaryThe fix is correct and the approach is sound. The main actionable items before merge are:
Items 2 (predicate pushdown) and 4 (relaxing the relationship fallback) are good follow-up issues but do not block correctness. |
Documents the deliberate scope decisions raised in code review: no inline WHERE pushdown, eager iterator construction is cheap due to MultiIterator lazy page reads, and getLabels does not defensively copy.
Code ReviewOverviewThis PR correctly fixes a long-standing bug where Strengths
Issues1. In if (!iterator.hasNext()) {
finished = true;
}
2. The test creates 3. Cost estimation for disjunction is inaccurate
4. No WHERE pushdown for The Javadoc already acknowledges this, so it's clearly a known gap. It just means that Minor / Style
SummaryThe fix is correct and the test suite is solid. Two things worth addressing before merging:
The cost estimation and WHERE pushdown gaps are reasonable deferred items. |
- fetchMore now sets finished eagerly when the last type iterator drains in the same call that fills the buffer, matching NodeByLabelScan's pattern and avoiding one redundant fetchMore round-trip per query. - Adds a regression test using a real schema parent chain (Dog extends Animal via addSuperType) to exercise type.instanceOf walking the schema hierarchy, distinct from the existing multi-label conjunction test (:Animal:Dog). - Documents the cost-estimation gap for disjunctions as a known limitation (estimatedCardinality reflects only the first label).
|
Per-item evaluation (4e632de): #1 #2 Subtype-hierarchy coverage - Fixed. Added #3 Cost estimation gap - Documented in the operator Javadoc as a known limitation: #4 WHERE pushdown - Already documented in the operator Javadoc; flagged in PR description for follow-up. |
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 29 |
🟢 Coverage 78.75% diff coverage · -7.11% coverage variation
Metric Results Coverage variation ✅ -7.11% coverage variation Diff coverage ✅ 78.75% diff coverage Coverage variation details
Coverable lines Covered lines Coverage Common ancestor commit (89e0d16) 126818 93429 73.67% Head commit (4e632de) 158572 (+31754) 105542 (+12113) 66.56% (-7.11%) Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch:
<coverage of head commit> - <coverage of common ancestor commit>Diff coverage details
Coverable lines Covered lines Diff coverage Pull request (#4221) 80 63 78.75% Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified:
<covered lines added or modified>/<coverable lines added or modified> * 100%
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
Code Review: PR #4221 - fix: [cypher] label disjunction (n:A|B) returned 0 rowsSummaryThis PR fixes issue #4211 where The fix introduces a new What Works Well
IssuesMajorM1: Cardinality underestimation in
M2: No test verifying the optimizer path is actually taken Commit history shows the new operator was dead code in the first commit (the planner gate was not updated). The tests still passed because they were testing the legacy @Test
public void testOptimizerPlanUsesDisjunctionScan() {
// EXPLAIN returns the execution plan as a string
try (ResultSet rs = database.query("cypher", "EXPLAIN MATCH (n:A|B) RETURN n")) {
String plan = rs.next().getProperty("plan").toString();
assertThat(plan).contains("NodeByLabelDisjunctionScan");
}
}Minorm1: None of the try (ResultSet rs = database.query("cypher", "MATCH (n:A|B) RETURN n")) {
// assertions
}m2: Queries run outside transactions All queries in m3: Stale Javadoc on The m4: Ordering assumption in assertThat(ids).containsExactly(2);
m5:
Summary RecommendationThe bug fix is correct and the design is sound. The main actionable items are:
The PR is otherwise ready to merge - the fix is correct, the scope is appropriately narrow, and the test coverage is good for the functional cases. |
Summary
MATCH (n:A|B)returned 0 rows even when nodes with both labels exist.Labels.getCompositeTypeName(["A","B"])→"A~B", treating multiple labels as AND.NodeByLabelScanthen short-circuited onexistsType("A~B")→ false.LogicalNodenow carries thelabelDisjunctionflag propagated fromNodePattern.IndexSelectionRule.createAnchorOperator()routes disjunction patterns to a newNodeByLabelDisjunctionScanoperator that walks eachVertexTypeonce and includes types matching any disjunction label.Files
engine/src/main/java/com/arcadedb/query/opencypher/optimizer/plan/LogicalNode.java—labelDisjunctionfield + getterengine/src/main/java/com/arcadedb/query/opencypher/optimizer/plan/LogicalPlan.java— propagates flag fromNodePatternengine/src/main/java/com/arcadedb/query/opencypher/executor/operators/NodeByLabelDisjunctionScan.java— new operatorengine/src/main/java/com/arcadedb/query/opencypher/optimizer/rules/IndexSelectionRule.java— disjunction routing increateAnchorOperatorengine/src/test/java/com/arcadedb/query/opencypher/CypherLabelDisjunctionTest.java— 8 regression testsTest plan
CypherLabelDisjunctionTest— 8/8 pass (A|B, A|C, B|C, A|B|C, count variants, single-label sanity, disjunction + WHERE)CypherMultiLabelTest,CypherMultiLabelPreExistingTypeTest,CypherLabelFilteringTest,CypherLabelCheckInWhereTest,CypherTest— 40/40 pass (no regressions)CountEdgesOptimizationTest,CypherInlinePropertyFilterTest,CypherRangeIndexTest,CypherCountSubqueryTest,CypherPatternPredicateTest,CypherPolymorphicEdgeTraversalTest,CypherRelationDirectionTest— 50/50 passCypherCallYieldWithVariablesTest,CollectDistinctTest,CypherCaseTest,CypherExistsTest,CypherMissingFunctionsTest,CypherCountNonExistingLabelTest— 82/82 pass