-
Notifications
You must be signed in to change notification settings - Fork 994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PHOENIX-6960 Scan range is incorrect when query desc columns #1663
Conversation
boolean lowerInclusive = column.getSortOrder() == SortOrder.ASC; | ||
boolean upperInclusive = column.getSortOrder() == SortOrder.DESC; | ||
KeyRange range = type.getKeyRange(lowerRange, lowerInclusive, upperRange, | ||
upperInclusive, SortOrder.ASC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have debugged into this, and after inverting twice, we end up returning a non-inverted range, which eventually gets inverted in WhereOptimizer.pushKeyExpressionsToScan().
Do we even need to take SortOrder into account here, and if yes, couldn't we simplify this logic ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm talking the whole method, not the specific line above.
// TODO: is there a case where we'd need to go through the childPart to calculate the key range? | ||
PColumn column = childSlot.getKeyPart().getColumn(); | ||
PDataType type = column.getDataType(); | ||
byte[] key = PVarchar.INSTANCE.toBytes(startsWith, column.getSortOrder()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where we get an inverted range
boolean lowerInclusive = column.getSortOrder() == SortOrder.ASC; | ||
boolean upperInclusive = column.getSortOrder() == SortOrder.DESC; | ||
KeyRange range = type.getKeyRange(lowerRange, lowerInclusive, upperRange, | ||
upperInclusive, SortOrder.ASC); | ||
if (column.getSortOrder() == SortOrder.DESC) { | ||
range = range.invert(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And this is wehere we re-invert, getting a normal range back.
Thank you for taking a look @stoty, and yes you are correct that we have scope for optimization here, i just addressed your review with the latest revision. Also, @jinggou found some interesting test case failures, i tried to address them. Thank you for running additional tests, Jing. Created 5.1 backport PR: #1668 @stoty @jinggou could you please take a look again? (not urgent, i might likely be able to come online after 5 days) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 LGTM
@virajjasani can you also add some unit tests in |
LGTM. The test case that failed before passes now (e.g., descending columns with LIKE x%). |
done |
// TODO: is there a case where we'd need to go through the childPart to calculate the key range? | ||
PColumn column = childSlot.getKeyPart().getColumn(); | ||
PDataType type = column.getDataType(); | ||
byte[] key = PVarchar.INSTANCE.toBytes(startsWith, SortOrder.ASC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might work in most of the cases but I'd be a little worried in the admittedly weird case of where the LIKE contains a DESC column reference. Maybe add a test here and see? Likely it won't extract any keys but worth a look. @virajjasani @stoty?
SELECT * FROM table WHERE col1 LIKE ('abc' || col2)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting! do we support col1 LIKE ('xy%' || col2)
case?
i tried this and saw that we return right here as null KeySlots due to empty childParts:
@Override
public KeySlots visitLeave(LikeExpression node, List<KeySlots> childParts) {
// TODO: optimize ILIKE by creating two ranges for the literal prefix: one with lower case, one with upper case
if (childParts.isEmpty()) {
return null;
}
...
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even col1 LIKE col2
doesn't seem to be working as childParts are coming empty, resulting in null KeySlots
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, we cannot use values from the table for constructing scan filters on the rowkey (unless they are coming from uncorrelated subqueries, but those are effectively constants)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the test, @virajjasani I do think that in theory 'xy%' || col2
would be extractable as the prefix here is a constant though I was guessing we don't try to optimize this currently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah that seems the case, this also got me to look into what postgres has, whether they allow col reference but seems like they also support constant or expression (bit wider variety of regular expressions): https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-LIKE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review @dbwong
Jira: PHOENIX-6960