fix(opencypher): MATCH WHERE ID(n) = <expr> falls back to full scan when expr is dynamic#3865
Conversation
…hen expr is dynamic
|
Gemini wrote that, claude checked it |
There was a problem hiding this comment.
Code Review
This pull request implements dynamic ID filtering within MatchNodeStep to optimize OpenCypher queries where the node ID is specified in a WHERE clause. It introduces logic to extract ID filters from boolean expressions and refactors the ExpressionEvaluator into a class member for reuse. Review feedback highlights a potential NullPointerException when a vertex is not found by its RID and suggests optimizing the AST traversal by pre-analyzing the filter expression.
| final RID rid = new RID(context.getDatabase(), effectiveIdFilter); | ||
| final Identifiable vertex = context.getDatabase().lookupByRID(rid, true); | ||
| // Return single-element iterator for the matched vertex | ||
| return List.of(vertex).iterator(); |
There was a problem hiding this comment.
The List.of(vertex) call will throw a NullPointerException if the RID is valid but the vertex does not exist in the database (i.e., lookupByRID returns null). Since List.of does not allow null elements, you should check for null and return an empty iterator instead.
final RID rid = new RID(context.getDatabase(), effectiveIdFilter);
final Identifiable vertex = context.getDatabase().lookupByRID(rid, true);
if (vertex == null)
return Collections.emptyIterator();
// Return single-element iterator for the matched vertex
return List.of(vertex).iterator();| // Check for dynamic ID filter from WHERE clause if static idFilter is not present | ||
| String effectiveIdFilter = this.idFilter; | ||
| if ((effectiveIdFilter == null || effectiveIdFilter.isEmpty()) && whereFilter != null) { | ||
| effectiveIdFilter = extractDynamicIdFilter(whereFilter, currentInputResult); |
There was a problem hiding this comment.
The extractDynamicIdFilter method is called for every input row and performs a recursive traversal of the whereFilter AST. Since the structure of the whereFilter is constant for the duration of this execution step, this traversal is redundant.
It would be more efficient to pre-analyze the whereFilter once (e.g., in the constructor) to identify the Expression that provides the ID value, and then simply evaluate that expression here. Additionally, consider supporting elementId() in addition to id() for broader Cypher compatibility.
Up to standards ✅🟢 Issues
|
|
According to claude (but I don't have any tokens left) There might be other cases where it's not fixed, for example UNWIND $batch AS BatchEntry
MATCH (b:CHUNK {someKey: BatchEntry.value}) -- inline props, not WHERE clause
|
|
It actually makes sense! Merged, thanks!! I'm going to write some test cases to avoid future regressions. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3865 +/- ##
==========================================
+ Coverage 64.66% 64.94% +0.28%
==========================================
Files 1579 1579
Lines 116503 116618 +115
Branches 24707 24749 +42
==========================================
+ Hits 75335 75742 +407
+ Misses 30871 30504 -367
- Partials 10297 10372 +75 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Fixes #3864
Long story short, in
MATCH (b:CHUNK) WHERE ID(b) = BatchEntry.destRIDis doing a full scan on CHUNK