Conversation
This reverts commit be84308.
…penCypherAggregatingFunctionsComprehensiveTest.java Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…penCypherAggregatingFunctionsComprehensiveTest.java Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…tion.java Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Up to standards ✅🟢 Issues
|
| Metric | Results |
|---|---|
| Complexity | 80 |
🟢 Coverage 88.74% diff coverage · -8.63% coverage variation
Metric Results Coverage variation ✅ -8.63% coverage variation Diff coverage ✅ 88.74% diff coverage Coverage variation details
Coverable lines Covered lines Coverage Common ancestor commit (4cb5061) 118612 87094 73.43% Head commit (e8507a7) 150156 (+31544) 97293 (+10199) 64.79% (-8.63%) Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch:
<coverage of head commit> - <coverage of common ancestor commit>Diff coverage details
Coverable lines Covered lines Diff coverage Pull request (#3949) 373 331 88.74% Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified:
<covered lines added or modified>/<coverable lines added or modified> * 100%
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes. Give us feedback
There was a problem hiding this comment.
Code Review
This pull request significantly expands OpenCypher function support, adding geospatial, vector, string, and temporal capabilities alongside comprehensive tests. Feedback focuses on aligning implementations with Cypher specifications, specifically for substring(), split(), and point.distance(). The review also identifies potential NullPointerException and NumberFormatException risks in edge cases and cautions against modifying shared SQL functions, like standard deviation, which could break standard SQL semantics.
| final Number z1n = (Number) p1.get("z"); | ||
| final Number z2n = (Number) p2.get("z"); | ||
| if (z1n != null && z2n != null) { | ||
| final double dz = z2n.doubleValue() - z1n.doubleValue(); | ||
| sumSq += dz * dz; | ||
| } |
There was a problem hiding this comment.
According to Cypher specifications (e.g., Neo4j), point.distance() should return null if the points have different dimensions (e.g., comparing a 2D point with a 3D point). The current implementation falls back to a 2D distance calculation if one of the points is missing the 'z' coordinate, which may lead to inconsistent results compared to standard Cypher behavior.
There was a problem hiding this comment.
Fixed. Added if ((z1n == null) != (z2n == null)) return null; before the z-distance calculation. Mixed 2D/3D Cartesian points now return null per Cypher spec. Added regression test pointDistanceMixedDimensions.
| if (variance != null) | ||
| return Math.sqrt((Double) variance); | ||
| return null; | ||
| return 0.0; |
There was a problem hiding this comment.
Changing the return value from null to 0.0 when the variance is null aligns with Cypher's stDev() behavior for empty sets, but it breaks standard SQL semantics where aggregate functions over an empty set (except COUNT) should return NULL. Since this class is part of the SQL function package, this change might cause regressions in SQL queries. Consider handling the null to 0.0 conversion in the Cypher-specific layer instead.
There was a problem hiding this comment.
The 0.0 return is intentional. The Cypher tests stDevNull and stDevPNull in OpenCypherAggregatingFunctionsComprehensiveTest explicitly assert 0.0 for stDev([null, null]), matching Neo4j's documented behavior. No SQL test currently asserts that stddev returns null on an empty/all-null set - the SQL test suite only covers non-empty datasets. Moving this to a Cypher layer would be a larger refactor beyond the scope of this PR.
| if (vectorObj instanceof String s) { | ||
| final String trimmed = s.trim(); | ||
| final String inner = trimmed.startsWith("[") && trimmed.endsWith("]") ? trimmed.substring(1, trimmed.length() - 1) : trimmed; | ||
| final String[] parts = inner.split(","); | ||
| final float[] result = new float[parts.length]; | ||
| for (int i = 0; i < parts.length; i++) | ||
| result[i] = Float.parseFloat(parts[i].trim()); | ||
| return result; | ||
| } |
There was a problem hiding this comment.
Parsing a vector from a string will fail with a NumberFormatException if the string represents an empty array (e.g., "[]"). In this case, inner becomes an empty string, and split(",") returns an array containing one empty string, which Float.parseFloat() cannot handle. An explicit check for an empty string after trimming and removing brackets should be added.
There was a problem hiding this comment.
Fixed. Added an inner.isEmpty() check before splitting - returns new float[0] for "[]" input, avoiding the NumberFormatException.
| final String str = args[0].toString(); | ||
| final int start = ((Number) args[1]).intValue(); | ||
| if (start < 0) |
There was a problem hiding this comment.
In Cypher, if the start index for substring() is negative, it is treated as 0. The current implementation throws a CommandExecutionException, which deviates from standard Cypher behavior. You should use Math.max(0, ...) for the start index instead of throwing an error.
| final String str = args[0].toString(); | |
| final int start = ((Number) args[1]).intValue(); | |
| if (start < 0) | |
| final int start = Math.max(0, ((Number) args[1]).intValue()); |
There was a problem hiding this comment.
Pushing back. The test substringNegativeStartRaisesError in OpenCypherStringFunctionsComprehensiveTest explicitly verifies that a negative start index raises an error. Neo4j 5.x also throws InvalidArgumentException for negative start values in substring(). The current behavior is intentional and tested.
- CypherPointDistanceFunction: null-check WGS-84 coordinate values before
calling doubleValue() to avoid NPE; return null for mixed 2D/3D Cartesian
points per Cypher spec
- VectorUtils: return empty float[] for empty array string "[]" instead of
throwing NumberFormatException
- CypherSplitFunction: use split limit -1 to preserve trailing empty strings
per Cypher spec (e.g. split('a,b,',',') -> ['a','b',''])
- Add regression tests for mixed-dimension point.distance() and trailing
delimiter split()
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
elementId() was implemented in createCypherSpecificExecutor but missing from isCypherSpecificFunction, causing an "Unknown function" error at runtime. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #3949 +/- ##
==========================================
- Coverage 64.46% 63.85% -0.62%
==========================================
Files 1587 1591 +4
Lines 118612 118917 +305
Branches 25194 25271 +77
==========================================
- Hits 76468 75932 -536
- Misses 31548 32508 +960
+ Partials 10596 10477 -119 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
What does this PR do?
pick #3428 , rebase it,