Fix bug in distinctCountRawHLL on SQL path #5494
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
DISTINCTCOUNTRAWHLL() aggregation function returns serialized HLL bytes as STRINGn for users to do some aggregation on their side. While the final return type of the function is STRING, the final return value (SerializedHll object) is never transformed in the broker reducer (for SQL path) to hex encoded string.
So the ResultTable has the column type as STRING but the data is object of type SerializedHll. We need to format the final return value to a serializable value on the SQL path just like we have done on PQL path.
Enhanced the tests.
NOTE: Ideally each query execution unit test under
src/test/java/org/apache/pinot/queries
should be enhanced to run its SQL counterpart. Right now there are multiple InterSegment files with potentially duplicate tests and not sure if all of them are indeed exercising the entire SQL execution path. Quite a bit of refactoring is needed which I will do.This fix is needed soon for our next internal build to let folks continue SQL testing.
Upgrade Notes
Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)
No
Does this PR fix a zero-downtime upgrade introduced earlier?
No
Does this PR otherwise need attention when creating release notes? Things to consider:
No
Release Notes
Documentation
None