Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix duplicate results and failures of queries with indexes and query parameters and improve index scan generation [HZ-3014] [HZ-3013] #25527

Merged
merged 4 commits into from Oct 12, 2023

Conversation

k-jamroz
Copy link
Collaborator

@k-jamroz k-jamroz commented Sep 25, 2023

In order to generate correct results for queries with query parameters normalisation of index iteration pointers set is performed to make it disjoint and ordered. This has to be done late, when query parameter values are known (during execution not optimisation), for literals this happens during query optimisation. This process uses special treatment of NULL value which can be compared and satisfies the following ordering: -Inf < NULL < non null value < +Inf. This ordering is compatible with SQL ORDER BY x NULLS FIRST and ORDER BY x DESC NULLS LAST - this is the only currently available NULL placement in ordered results.

After that some obsolete restrictions for index usage (e.g. disjunction of ranges) could be removed and IS NOT NULL predicates can also be correctly executed using index.

Fixes HZ-3013, HZ-3014

Checklist:

  • Labels (Team:, Type:, Source:, Module:) and Milestone set
  • Label Add to Release Notes or Not Release Notes content set
  • Request reviewers if possible
  • Send backports/forwardports if fix needs to be applied to past/future releases

@k-jamroz k-jamroz changed the title Normalize index iteration pointers to make them disjoint and ordered Normalize index iteration pointers to make them disjoint and ordered [HZ-3014] Sep 25, 2023
@k-jamroz k-jamroz added this to the 5.4.0 milestone Sep 25, 2023
@hz-devops-test
Copy link

The job Hazelcast-pr-builder of your PR failed. (Hazelcast internal details: build log, artifacts).
Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log file
---------ERRORS-----------
--------------------------
[ERROR] /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-builder_2/hazelcast/src/main/java/com/hazelcast/internal/iteration/IndexIterationPointer.java:149:84: '&&' should be on a new line. [OperatorWrap]
--------------------------
[ERROR] /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-builder_2/hazelcast/src/main/java/com/hazelcast/internal/iteration/IndexIterationPointer.java:272:13: Name 'rf_lt' must match pattern '^[a-z][a-zA-Z0-9]*$'. [LocalVariableName]
--------------------------
[ERROR] /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-builder_2/hazelcast/src/test/java/com/hazelcast/internal/iteration/IndexIterationPointerTest.java:1: File does not end with a newline. [NewlineAtEndOfFile]
--------------------------
[ERROR] /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-builder_2/hazelcast/src/test/java/com/hazelcast/internal/iteration/IndexIterationPointerTest.java:1: Line does not match expected header line of '/*'. [Header]
--------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:3.3.0:checkstyle (default) on project hazelcast: An error has occurred in Checkstyle report generation. Failed during checkstyle execution: There are 4 errors reported by Checkstyle 9.3 with /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-builder_2/checkstyle/checkstyle.xml ruleset. -> [Help 1]
--------------------------
[ERROR] 
--------------------------
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
--------------------------
[ERROR] 
--------------------------
[ERROR] For more information about the errors and possible solutions, please read the following articles:
--------------------------
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
--------------------------
[ERROR] 
--------------------------
[ERROR] After correcting the problems, you can resume the build with the command
--------------------------
[ERROR]   mvn  -rf :hazelcast
--------------------------

@hz-devops-test
Copy link

The job Hazelcast-pr-compiler of your PR failed. (Hazelcast internal details: build log, artifacts).
Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log file
---------ERRORS-----------
--------------------------
[ERROR] /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-compiler/hazelcast/src/main/java/com/hazelcast/internal/iteration/IndexIterationPointer.java:149:84: '&&' should be on a new line. [OperatorWrap]
--------------------------
[ERROR] /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-compiler/hazelcast/src/main/java/com/hazelcast/internal/iteration/IndexIterationPointer.java:272:13: Name 'rf_lt' must match pattern '^[a-z][a-zA-Z0-9]*$'. [LocalVariableName]
--------------------------
[ERROR] /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-compiler/hazelcast/src/test/java/com/hazelcast/internal/iteration/IndexIterationPointerTest.java:1: File does not end with a newline. [NewlineAtEndOfFile]
--------------------------
[ERROR] /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-compiler/hazelcast/src/test/java/com/hazelcast/internal/iteration/IndexIterationPointerTest.java:1: Line does not match expected header line of '/*'. [Header]
--------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:3.3.0:checkstyle (default) on project hazelcast: An error has occurred in Checkstyle report generation. Failed during checkstyle execution: There are 4 errors reported by Checkstyle 9.3 with /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-compiler/checkstyle/checkstyle.xml ruleset. -> [Help 1]
--------------------------
[ERROR] 
--------------------------
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
--------------------------
[ERROR] 
--------------------------
[ERROR] For more information about the errors and possible solutions, please read the following articles:
--------------------------
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
--------------------------
[ERROR] 
--------------------------
[ERROR] After correcting the problems, you can resume the build with the command
--------------------------
[ERROR]   mvn  -rf :hazelcast
--------------------------

@hz-devops-test
Copy link

The job Hazelcast-pr-builder of your PR failed. (Hazelcast internal details: build log, artifacts).
Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log file
---------ERRORS-----------
--------------------------
[ERROR] /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-builder/hazelcast-sql/src/main/java/com/hazelcast/jet/sql/impl/connector/map/QueryUtil.java:174:110: ')' is preceded with whitespace. [ParenPad]
--------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:3.3.0:checkstyle (default) on project hazelcast-sql: An error has occurred in Checkstyle report generation. Failed during checkstyle execution: There is 1 error reported by Checkstyle 9.3 with /home/jenkins/jenkins_slave/workspace/Hazelcast-pr-builder/checkstyle/checkstyle.xml ruleset. -> [Help 1]
--------------------------
[ERROR] 
--------------------------
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
--------------------------
[ERROR] 
--------------------------
[ERROR] For more information about the errors and possible solutions, please read the following articles:
--------------------------
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
--------------------------
[ERROR] 
--------------------------
[ERROR] After correcting the problems, you can resume the build with the command
--------------------------
[ERROR]   mvn  -rf :hazelcast-sql
--------------------------

@k-jamroz k-jamroz changed the title Normalize index iteration pointers to make them disjoint and ordered [HZ-3014] Fix duplicate results and failures of queries with indexes and query parameters and improve index scan generation [HZ-3014] [HZ-3013] Sep 26, 2023
@hazelcast hazelcast deleted a comment from hz-devops-test Sep 26, 2023
@hazelcast hazelcast deleted a comment from hz-devops-test Sep 26, 2023
@hazelcast hazelcast deleted a comment from hz-devops-test Sep 26, 2023
@hazelcast hazelcast deleted a comment from hz-devops-test Sep 26, 2023
// both ends are equal, but might differ in inclusiveness,
// use one which covers bigger range
return tuple2(left, leftInclusive || rightInclusive);
} else if (result < 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: here you can use ternary operator, if you like it.

// both ends are equal, but might differ in inclusiveness,
// use one which covers bigger range
return tuple2(left, leftInclusive || rightInclusive);
} else if (result < 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: same as above

@@ -142,9 +149,14 @@ private static IndexComponentFilter convertFromRangeFilters(
toInclusive = candidateFilter.isToInclusive();
expressions.add(candidate.getExpression());
}

if (from == null && to == null && candidateFilter.getFrom() == null && candidateFilter.getTo() == null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to this PR, but feels like all these specific condition checks to handle NULL should live somewhere outside of this class or alternatively be moved to their own separate methods to convey logic more clearly, although maybe comments would work too.

Copy link
Collaborator Author

@k-jamroz k-jamroz Oct 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the null/NULL checks generally feel very inconsistent, especially in different layers (optimizer, DAG, Index and IndexStore, Predicate API has also its own logic). I had lot's of headaches because of that but I did not feel confident enough to make it more consistent. Sometimes null means NULL, sometimes unbounded (including NULL or not), sometimes it is invalid... (probably somewhere is also means NOT NULL - for range filters)

@k-jamroz
Copy link
Collaborator Author

run-ee-tests

@k-jamroz k-jamroz merged commit 3a0107a into hazelcast:master Oct 12, 2023
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants