PHOENIX-6751 Force using range scan vs skip scan when using large IN clause #1495
@tkhurana Please review
// is below the configured max (maxInListSkipScanSize).
// We shall force a range scan if the configured max is exceeded.
// cnfStartPos => is the start slot of this IN list
if (checkMaxSkipScanCardinality) {
Can this conversion from skip scan to range scan be made configurable? Range scans on bigger data sets are slow.
There's already a config param that controls how many elements trigger the conversion, and setting it very high effectively turns the conversion off. However, when we get into this state (high-cardinality skip scans), we find that we get OOM exceptions even with large client-side heaps, which is worse than a slow query.
@chrajeshbabu - do you still have concerns about this patch or is it ready to be merged?
@chrajeshbabu For some queries, especially those with RVC expressions and mixed sort orders, the optimization results in huge memory allocations and sometimes even exceeds the number of KEY_RANGES allowed. This JIRA provides a framework for opting out of the optimization path once a certain threshold is reached. We will be working towards an algorithm that is closer to linear than the combinatorial one we have today.
@gjacoby126 The difference is the use of BigInteger to avoid overflow when determining whether to use a skip scan or a range scan.
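To illustrate the overflow concern, here is a minimal sketch and not the code from this patch. It assumes the skip scan cardinality is the product of the number of keys in each slot, and that this product is compared against the configured max. The class name, method names, and the `slotSizes` parameter are hypothetical; only `maxInListSkipScanSize` is taken from the diff comment above.

```java
import java.math.BigInteger;
import java.util.List;

// Hypothetical illustration; names other than maxInListSkipScanSize are assumptions.
final class SkipScanCardinalityCheck {

    // Returns true when the cross product of the per-slot key counts
    // is larger than the configured max, i.e. a range scan would be forced.
    static boolean exceedsMaxSkipScanCardinality(List<Integer> slotSizes,
                                                 long maxInListSkipScanSize) {
        BigInteger product = BigInteger.ONE;
        for (int size : slotSizes) {
            // BigInteger grows as needed, so the product cannot wrap around.
            product = product.multiply(BigInteger.valueOf(size));
        }
        return product.compareTo(BigInteger.valueOf(maxInListSkipScanSize)) > 0;
    }

    // Same product with a plain long, shown only to demonstrate the wrap-around.
    static long naiveProduct(List<Integer> slotSizes) {
        long product = 1L;
        for (int size : slotSizes) {
            product *= size; // silently overflows past Long.MAX_VALUE
        }
        return product;
    }
}
```

With four slots of 65536 keys each, the true product is 2^64. The `long` version wraps around to 0 and would look like it is below any threshold, while the `BigInteger` version correctly reports that the limit is exceeded.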
+1
No description provided.