Commit
Implement DISTINCT in aggregation operator
This makes it possible to execute DISTINCT aggregation queries without relying on the MarkDistinct operator, which requires one shuffle for each unique combination of DISTINCT arguments to aggregations.

A new config option (optimizer.use-mark-distinct) and session property (use_mark_distinct) control whether the MultipleDistinctToMarkDistinct optimizer fires.

When the MarkDistinct optimization is disabled and aggregations contain DISTINCT inputs, aggregations are planned as SINGLE-step. A future improvement could introduce a partial aggregation step that produces deduped subsets as intermediates.
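As a rough illustration of the idea (not the code in this commit), the Python sketch below handles DISTINCT inside the aggregation itself by keeping, per group, one set of already-seen values for each DISTINCT argument. It assumes COUNT(DISTINCT ...) as the aggregate and rows represented as dicts. The function names `dedupe`, `merge_partials`, and `count_distinct` are hypothetical. Running `dedupe` once over all input corresponds to the SINGLE-step plan; running it per input split and merging the resulting sets mimics the possible partial step with deduped intermediates.

```python
from collections import defaultdict


def dedupe(rows, group_key, distinct_args):
    """Collect, per group, the set of distinct non-NULL values of each argument.

    Hypothetical helper: the sets play the role of aggregation state, so no
    separate marking pass (and no per-argument shuffle) is needed.
    """
    seen = defaultdict(lambda: {arg: set() for arg in distinct_args})
    for row in rows:
        per_arg = seen[row[group_key]]
        for arg in distinct_args:
            value = row[arg]
            if value is not None:  # SQL aggregates ignore NULL inputs
                per_arg[arg].add(value)
    return dict(seen)


def merge_partials(partials):
    """Union deduped subsets produced independently (assumed partial step)."""
    merged = {}
    for partial in partials:
        for group, per_arg in partial.items():
            target = merged.setdefault(group, {})
            for arg, values in per_arg.items():
                target.setdefault(arg, set()).update(values)
    return merged


def count_distinct(state):
    """Final step: COUNT(DISTINCT arg) is the size of each deduped set."""
    return {
        group: {arg: len(values) for arg, values in per_arg.items()}
        for group, per_arg in state.items()
    }
```

For a query shaped like SELECT k, count(DISTINCT x), count(DISTINCT y) ... GROUP BY k, `count_distinct(dedupe(rows, "k", ["x", "y"]))` gives the single-step result, and `count_distinct(merge_partials([...]))` over per-split `dedupe` outputs should give the same counts.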
Showing 18 changed files with 411 additions and 72 deletions.