fix issue with auto column grouping #16489
changes:
* fixes bug where AutoTypeColumnIndexer reports incorrect cardinality, allowing it to incorrectly use the array grouper algorithm for realtime queries, producing incorrect results for strings
* fixes bug where auto LONG and DOUBLE type columns incorrectly report not having null values, resulting in incorrect null handling when grouping
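To illustrate the second bullet, here is a minimal, self-contained sketch of how a column that wrongly claims to have no nulls can distort grouping. It is not Druid code: `NullAwareGroupingSketch`, `countByLong`, and the `columnReportsNulls` flag are hypothetical names, and the assumption that a reader which trusts "no nulls" sees a primitive 0 for a missing value is only a stand-in for the real behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names are Druid APIs.
final class NullAwareGroupingSketch {

    // Counts rows per LONG key. When the column claims it has no nulls,
    // the reader skips the null check and reads a primitive, so a missing
    // value is assumed here to surface as 0.
    static Map<Long, Integer> countByLong(Long[] rows, boolean columnReportsNulls) {
        Map<Long, Integer> counts = new HashMap<>();
        for (Long row : rows) {
            Long key;
            if (columnReportsNulls) {
                key = row;                      // null keeps its own group
            } else {
                key = (row == null) ? 0L : row; // null is folded into the 0 group
            }
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }
}
```

With rows `1, null, 0`, the null-aware path yields separate groups for `null` and `0`, while the path that trusts an incorrect "no nulls" report merges them into a single `0` group.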
Thank you very much for the PR. Do we still have an underlying issue in the array groupers when they treat a single-value field?
No, I think it's fine. The issue here was that the indexer was reporting a cardinality through the storage adapter, which made the array grouper think the column was dictionary encoded in a way that let it use the dictionary ids. The indexer was not providing a dictionary-encoded selector that fit that contract, so correcting the cardinality reporting solves the problem.
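The contract described in this reply can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual Druid implementation: `GroupingStrategySketch`, `Selector`, `CARDINALITY_UNKNOWN`, and the helper methods are hypothetical names chosen for the example. The idea is that an array-based grouper indexes its buckets by dictionary id, so it is only safe when the selector both reports a cardinality and can actually supply ids; a selector that cannot supply ids should report an unknown cardinality so the caller falls back to grouping by value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names are Druid APIs.
final class GroupingStrategySketch {
    static final int CARDINALITY_UNKNOWN = -1;

    interface Selector {
        int cardinality();       // number of dictionary ids, or CARDINALITY_UNKNOWN
        int rowCount();
        int idAt(int row);       // valid only when cardinality() is known
        String valueAt(int row);
        String lookup(int id);   // valid only when cardinality() is known
    }

    // A selector backed by a real dictionary: ids and cardinality agree.
    static Selector dictionaryEncoded(List<String> values) {
        List<String> dict = new ArrayList<>(new TreeSet<>(values));
        return new Selector() {
            public int cardinality() { return dict.size(); }
            public int rowCount() { return values.size(); }
            public int idAt(int row) { return dict.indexOf(values.get(row)); }
            public String valueAt(int row) { return values.get(row); }
            public String lookup(int id) { return dict.get(id); }
        };
    }

    // A selector with no usable dictionary: it must not report a cardinality.
    static Selector valuesOnly(List<String> values) {
        return new Selector() {
            public int cardinality() { return CARDINALITY_UNKNOWN; }
            public int rowCount() { return values.size(); }
            public int idAt(int row) { throw new UnsupportedOperationException("no dictionary ids"); }
            public String valueAt(int row) { return values.get(row); }
            public String lookup(int id) { throw new UnsupportedOperationException("no dictionary ids"); }
        };
    }

    static boolean canUseArrayGrouper(Selector s) {
        return s.cardinality() >= 0;
    }

    static Map<String, Integer> countByValue(Selector s) {
        Map<String, Integer> result = new HashMap<>();
        if (canUseArrayGrouper(s)) {
            // Array path: one bucket per dictionary id.
            int[] counts = new int[s.cardinality()];
            for (int row = 0; row < s.rowCount(); row++) {
                counts[s.idAt(row)]++;
            }
            for (int id = 0; id < counts.length; id++) {
                if (counts[id] > 0) {
                    result.put(s.lookup(id), counts[id]);
                }
            }
        } else {
            // Fallback path: group by the value itself.
            for (int row = 0; row < s.rowCount(); row++) {
                result.merge(s.valueAt(row), 1, Integer::sum);
            }
        }
        return result;
    }
}
```

In this sketch both selectors produce the same counts for the same input. The failure mode discussed above corresponds to a `valuesOnly`-style selector that nevertheless reports a non-negative cardinality, which would send it down the array path without valid ids.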
* fix issue with auto column grouping
changes:
* fixes bug where AutoTypeColumnIndexer reports incorrect cardinality, allowing it to incorrectly use array grouper algorithm for realtime queries producing incorrect results for strings
* fixes bug where auto LONG and DOUBLE type columns incorrectly report not having null values, resulting in incorrect null handling when grouping
* fix test
* fix issue with auto column grouping (#16489)
* fix issue with auto column grouping
changes:
* fixes bug where AutoTypeColumnIndexer reports incorrect cardinality, allowing it to incorrectly use array grouper algorithm for realtime queries producing incorrect results for strings
* fixes bug where auto LONG and DOUBLE type columns incorrectly report not having null values, resulting in incorrect null handling when grouping
* fix test
* Fix checkstyle
Co-authored-by: Clint Wylie <cwylie@apache.org>
changes:
* fix issue similar to apache#16489 but for NestedDataColumnIndexerV4, which can report STRING type if it only processes a single type of values. This should be less common than the auto indexer problem
* fix some issues with sql benchmarks
* fix NestedDataColumnIndexerV4 to not report cardinality
changes:
* fix issue similar to #16489 but for NestedDataColumnIndexerV4, which can report STRING type if it only processes a single type of values. This should be less common than the auto indexer problem
* fix some issues with sql benchmarks