
Fixes #3244 : label filtering in Cypher MATCH relationship patterns#3252

Merged
lvca merged 3 commits into ArcadeData:main from ExtReMLapin:claude/fix-arcadedb-cypher-8z1Ym
Jan 27, 2026

Conversation

@ExtReMLapin
Contributor

What does this PR do?

This PR fixes two related bugs in Cypher query execution: target node labels in relationship patterns were not applied as filters, and bound variables that repeat their label in a subsequent MATCH clause were not recognized as already bound. This caused:

  1. Incorrect result sets: Queries like (chunk:CHUNK)<-[r:in]-(target:NER) would return vertices of all types connected via the "in" edge, not just NER vertices
  2. Cartesian products: Repeating labels on already-bound variables (e.g., searchedChunk:CHUNK in multiple clauses) was not recognized as referring to the same bound variable, causing unnecessary cross-joins
  3. Query failures: Some valid query patterns would return null or incorrect results

Motivation

This fix addresses a user-reported issue where complex Cypher queries with multiple OPTIONAL MATCH clauses and repeated labels on bound variables were returning incorrect data. The root causes were:

  1. MatchRelationshipStep was not filtering target vertices by their labels
  2. The bound variable tracking across MATCH clauses didn't account for variables with labels
  3. The heuristic for detecting already-bound variables (a variable counted as bound only if its pattern had no labels or properties) was too simplistic

Related issues

This fixes #3244, the issue reported in the user's query pattern, where searchedChunks incorrectly contained NER nodes instead of only CHUNK nodes.

Changes Made

CypherExecutionPlan.java

  • Added boundVariables tracking across MATCH clauses to detect already-bound variables
  • Modified buildMatchStep() to skip creating new MatchNodeStep for variables already bound in previous MATCH clauses
  • Updated source node binding detection to check both the new boundVariables set and the old heuristic
  • Modified MatchRelationshipStep instantiation to pass target node pattern and bound variables for filtering
  • Applied same fixes to the legacy execution path
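
To illustrate the bound-variable tracking described above, here is a minimal sketch in plain Java. It is not the code from CypherExecutionPlan.java: the class BoundVariablePlanningSketch, the NodePattern record, the plan() method and the "SCAN"/"REUSE" step names are all hypothetical, and the old heuristic is modeled only loosely (a variable counts as bound when it was seen before and its pattern carries no label). Requires Java 16 or later for records.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified planner model; not ArcadeDB's actual classes.
final class BoundVariablePlanningSketch {

  /** A source node pattern such as (searchedChunk:CHUNK); label may be null. */
  record NodePattern(String variable, String label) {
    boolean hasLabel() {
      return label != null;
    }
  }

  /**
   * Emits one step per source node pattern, in clause order.
   * With trackBoundVariables = false, a labeled repeat is treated as unbound
   * (loose model of the old heuristic). With true, any variable seen in an
   * earlier clause is reused regardless of its label.
   */
  static List<String> plan(List<NodePattern> sourceNodes, boolean trackBoundVariables) {
    Set<String> boundVariables = new HashSet<>();
    List<String> steps = new ArrayList<>();
    for (NodePattern node : sourceNodes) {
      boolean seenBefore = boundVariables.contains(node.variable());
      boolean alreadyBound = trackBoundVariables ? seenBefore : seenBefore && !node.hasLabel();
      if (alreadyBound) {
        steps.add("REUSE " + node.variable());
      } else {
        steps.add("SCAN " + node.variable() + (node.hasLabel() ? ":" + node.label() : ""));
      }
      boundVariables.add(node.variable());
    }
    return steps;
  }

  public static void main(String[] args) {
    // Two clauses that both start from (searchedChunk:CHUNK).
    List<NodePattern> sourceNodes = List.of(
        new NodePattern("searchedChunk", "CHUNK"),
        new NodePattern("searchedChunk", "CHUNK"));
    System.out.println(plan(sourceNodes, false)); // second clause scans again
    System.out.println(plan(sourceNodes, true));  // second clause reuses the binding
  }
}
```

In this toy model the second full scan is what produces the cross-join: every row from the first clause is paired with every CHUNK vertex instead of the one it already bound.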

MatchRelationshipStep.java

  • Added targetNodePattern and boundVariableNames fields to support label filtering and bound variable checking
  • Implemented matchesTargetLabel() method to filter target vertices by their labels
  • Added identity check for already-bound target variables to prevent incorrect matches
  • Created backward-compatible constructor overload
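
The label filter listed above can be sketched as follows. Only the method name matchesTargetLabel() comes from this PR; its signature here, the Vertex record, and filterTargets() are assumptions made for illustration. The sketch compares type names for exact equality and does not model type inheritance, which the real implementation may handle differently. Requires Java 16 or later for records.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for the filtering step; not ArcadeDB's actual classes.
final class TargetLabelFilterSketch {

  /** Minimal vertex model: an identifier plus the name of its type. */
  record Vertex(String id, String typeName) {}

  /** True when the pattern has no label constraint or the vertex type equals it. */
  static boolean matchesTargetLabel(Vertex vertex, String requiredLabel) {
    return requiredLabel == null || requiredLabel.equals(vertex.typeName());
  }

  /** Keeps only the traversed neighbours that satisfy the target label. */
  static List<Vertex> filterTargets(List<Vertex> neighbours, String requiredLabel) {
    return neighbours.stream()
        .filter(v -> matchesTargetLabel(v, requiredLabel))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Vertices reached from a CHUNK via "in" edges, of mixed types.
    List<Vertex> neighbours = List.of(
        new Vertex("#1:0", "NER"),
        new Vertex("#2:0", "CHUNK"),
        new Vertex("#1:1", "NER"));
    // (chunk:CHUNK)<-[r:in]-(target:NER): only the NER vertices remain.
    System.out.println(filterTargets(neighbours, "NER"));
    // (chunk:CHUNK)<-[r:in]-(target): no label, so nothing is filtered out.
    System.out.println(filterTargets(neighbours, null));
  }
}
```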

CypherLabelFilteringTest.java (new)

  • Comprehensive test suite covering:
    • Target node label filtering in relationship patterns
    • Bound variables with labels in subsequent MATCH clauses
    • Full query pattern from the bug report
    • Verification that searchedChunks doesn't contain NER nodes

Issue3218Test.java

  • Updated existing tests to use labels on bound variables (now supported)
  • Removed workaround comments about the label filtering bug

Additional Notes

  • The changes are backward compatible: MatchRelationshipStep has a constructor overload that maintains the old behavior
  • Both the modern and legacy execution paths have been updated to ensure consistent behavior
  • The fix properly handles OPTIONAL MATCH clauses with label constraints on bound variables

Checklist

  • I have run the build using the mvn clean package command
  • My unit tests cover both failure and success scenarios

https://claude.ai/code/session_017hhsQf7fzBvXcGSeg48hfP

Two bugs caused incorrect results in multi-MATCH/OPTIONAL MATCH Cypher queries:

1. MatchRelationshipStep never filtered target vertices by label. A pattern
   like (a:CHUNK)<-[r:in]-(b:NER) would return ANY connected vertex as b,
   not just NER vertices. Fixed by passing the target NodePattern and checking
   vertex type against label constraints.

2. Already-bound variables with labels were treated as unbound. The
   sourceAlreadyBound check required !sourceNode.hasLabels(), so writing
   (searchedChunk:CHUNK) in a subsequent MATCH created a full type scan
   instead of using the already-bound value. Fixed by tracking bound
   variable names across MATCH clauses.

Also adds identity checking: when a target variable is already bound from a
previous step, the traversed vertex must match the bound vertex identity.
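
A minimal sketch of that identity check, under stated assumptions: a result row is modeled as a map from variable name to a vertex identifier string, and the class BoundTargetIdentitySketch and method acceptTarget() are hypothetical names, not part of MatchRelationshipStep.

```java
import java.util.Map;

// Hypothetical model of the identity check; not ArcadeDB's actual classes.
final class BoundTargetIdentitySketch {

  /**
   * Decides whether a traversed vertex may be assigned to targetVariable.
   * If the variable is unbound in the current row, any vertex is accepted.
   * If it is already bound, only the very same vertex is accepted.
   */
  static boolean acceptTarget(Map<String, String> row, String targetVariable, String traversedId) {
    String boundId = row.get(targetVariable);
    return boundId == null || boundId.equals(traversedId);
  }

  public static void main(String[] args) {
    // Row produced by a previous step: variable b is bound to vertex #1:0.
    Map<String, String> row = Map.of("b", "#1:0");
    System.out.println(acceptTarget(row, "b", "#1:0")); // same vertex
    System.out.println(acceptTarget(row, "b", "#1:1")); // different vertex
    System.out.println(acceptTarget(row, "c", "#1:1")); // c is not bound yet
  }
}
```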

https://claude.ai/code/session_017hhsQf7fzBvXcGSeg48hfP

AssertJ's assertThat() has ambiguous overloads for Object return types
from getProperty(). Switched to JUnit assertEquals/assertTrue/assertFalse.

https://claude.ai/code/session_017hhsQf7fzBvXcGSeg48hfP
@mergify
Contributor

mergify bot commented Jan 27, 2026

🧪 CI Insights

Here's what we observed from your CI run for 1fc22ba.

🟢 All jobs passed!

But CI Insights is watching 👀

@ExtReMLapin
Contributor Author

Claude Code did what Gemini could not do.

Claude Code wrote all of this.

@ExtReMLapin ExtReMLapin changed the title Fix label filtering in Cypher MATCH relationship patterns Fixex #3244 : label filtering in Cypher MATCH relationship patterns Jan 27, 2026
@ExtReMLapin ExtReMLapin changed the title Fixex #3244 : label filtering in Cypher MATCH relationship patterns Fixes #3244 : label filtering in Cypher MATCH relationship patterns Jan 27, 2026
@robfrank
Collaborator

@claude review this pr

@claude

claude bot commented Jan 27, 2026

Claude encountered an error —— View job

Command failed: git fetch origin --depth=20 claude/fix-arcadedb-cypher-8z1Ym

I'll analyze this and get back to you.

@lvca
Contributor

lvca commented Jan 27, 2026

@claude please check this again

@claude

claude bot commented Jan 27, 2026

Claude encountered an error —— View job

Command failed: git fetch origin --depth=20 claude/fix-arcadedb-cypher-8z1Ym

I'll analyze this and get back to you.

@ExtReMLapin
Contributor Author

Claude encountered an error —— View job

Command failed: git fetch origin --depth=20 claude/fix-arcadedb-cypher-8z1Ym

I'll analyze this and get back to you.

@claude time to reconsider your career choices

@lvca
Contributor

lvca commented Jan 27, 2026

@ExtReMLapin could you please rebase from main? Maybe claude is confused because it's too far apart?

@ExtReMLapin
Contributor Author

Claude is confused because it's not a branch on your repo but on mine.

@lvca
Contributor

lvca commented Jan 27, 2026

@claude please review this PR

@claude

claude bot commented Jan 27, 2026

Claude encountered an error —— View job

Command failed: git fetch origin --depth=20 claude/fix-arcadedb-cypher-8z1Ym

I'll analyze this and get back to you.

@lvca
Contributor

lvca commented Jan 27, 2026

I'm checking it manually, the old-fashioned way.

@lvca lvca self-requested a review January 27, 2026 16:06
@lvca lvca added this to the 26.2.1 milestone Jan 27, 2026
@lvca lvca merged commit def0349 into ArcadeData:main Jan 27, 2026
7 of 14 checks passed
@lvca
Contributor

lvca commented Jan 27, 2026

Merged, thanks!

4 participants