Optimization for selection only queries: Allow early termination #5163

Merged
Jackie-Jiang merged 5 commits into master from early_termination_for_select_only
Mar 18, 2020

@xiangfu0
Contributor

In Pinot's CombineOperator, queries are scheduled with at most 10 threads. For a table with many segments (say 10k), that means each thread processes about 1k segments.
For selection-only queries (e.g. select * from myTable limit 10), each thread may collect enough results to return after scanning only a few segments, so there is no need to wait for all the segments to be scanned.
This is extremely useful for people who want to browse a big table at random with a few clicks in the query console.
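
The idea can be illustrated with a minimal, self-contained sketch. This is not the CombineOperator code from this PR: the class `EarlyTerminationSketch`, the methods `scanSegment` and `selectWithLimit`, the round-robin assignment of segments to threads, and the `SEGMENTS_SCANNED` counter are all hypothetical names and choices made for illustration. Each worker stops scanning its share of segments once it holds `limit` rows, and the merge step keeps only the first `limit` rows overall.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of early termination for a selection-only query.
// None of these names come from Pinot.
class EarlyTerminationSketch {
  // Counts how many segments were actually scanned (for illustration only).
  static final AtomicInteger SEGMENTS_SCANNED = new AtomicInteger();

  // Stand-in for a per-segment selection: returns at most `limit` rows of one segment.
  static List<String> scanSegment(int segmentId, int docsPerSegment, int limit) {
    SEGMENTS_SCANNED.incrementAndGet();
    List<String> rows = new ArrayList<>();
    for (int doc = 0; doc < docsPerSegment && rows.size() < limit; doc++) {
      rows.add("segment" + segmentId + "-doc" + doc);
    }
    return rows;
  }

  static List<String> selectWithLimit(int numSegments, int docsPerSegment, int numThreads, int limit) {
    ExecutorService executor = Executors.newFixedThreadPool(numThreads);
    try {
      List<Future<List<String>>> futures = new ArrayList<>();
      for (int t = 0; t < numThreads; t++) {
        final int threadId = t;
        futures.add(executor.submit(() -> {
          List<String> collected = new ArrayList<>();
          // Assumed assignment: thread t handles segments t, t + numThreads, ...
          for (int s = threadId; s < numSegments; s += numThreads) {
            if (collected.size() >= limit) {
              break; // Early termination: this thread already has enough rows.
            }
            collected.addAll(scanSegment(s, docsPerSegment, limit - collected.size()));
          }
          return collected;
        }));
      }
      // Merge per-thread results, keeping only the first `limit` rows.
      List<String> merged = new ArrayList<>();
      for (Future<List<String>> future : futures) {
        for (String row : future.get()) {
          if (merged.size() < limit) {
            merged.add(row);
          }
        }
      }
      return merged;
    } catch (InterruptedException | ExecutionException e) {
      throw new RuntimeException(e);
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) {
    List<String> rows = selectWithLimit(10_000, 100, 10, 10);
    System.out.println(rows.size() + " rows returned, "
        + SEGMENTS_SCANNED.get() + " of 10000 segments scanned");
  }
}
```

With the numbers from the description (10 threads, 10k segments, limit 10) and segments that each hold at least 10 matching rows, every worker in this sketch is done after its first segment instead of scanning roughly 1k segments.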

@codecov-io

codecov-io commented Mar 18, 2020

Codecov Report

Merging #5163 into master will decrease coverage by 0.05%.
The diff coverage is 94.00%.

@@             Coverage Diff              @@
##             master    #5163      +/-   ##
============================================
- Coverage     66.02%   65.97%   -0.06%     
  Complexity       12       12              
============================================
  Files          1050     1050              
  Lines         54117    54112       -5     
  Branches       8071     8067       -4     
============================================
- Hits          35729    35698      -31     
- Misses        15734    15765      +31     
+ Partials       2654     2649       -5     
| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| ...rg/apache/pinot/core/operator/CombineOperator.java | 65.62% <72.72%> (-0.67%) | 0.00 <0.00> (ø) |
| ...he/pinot/core/operator/CombineGroupByOperator.java | 84.44% <100.00%> (+0.74%) | 0.00 <0.00> (ø) |
| ...t/core/operator/CombineGroupByOrderByOperator.java | 85.32% <100.00%> (+0.63%) | 0.00 <0.00> (ø) |
| ...g/apache/pinot/core/operator/DocIdSetOperator.java | 93.54% <100.00%> (+0.44%) | 0.00 <0.00> (ø) |
| ...ore/operator/query/AggregationGroupByOperator.java | 95.23% <100.00%> (-0.42%) | 0.00 <0.00> (ø) |
| ...rator/query/AggregationGroupByOrderByOperator.java | 91.66% <100.00%> (-0.65%) | 0.00 <0.00> (ø) |
| ...pinot/core/operator/query/AggregationOperator.java | 94.44% <100.00%> (-0.56%) | 0.00 <0.00> (ø) |
| ...ator/query/DictionaryBasedAggregationOperator.java | 88.88% <100.00%> (-0.40%) | 0.00 <0.00> (ø) |
| ...erator/query/MetadataBasedAggregationOperator.java | 87.50% <100.00%> (ø) | 0.00 <0.00> (ø) |
| ...not/core/operator/query/SelectionOnlyOperator.java | 97.29% <100.00%> (-0.08%) | 0.00 <0.00> (ø) |
| ... and 27 more | | |

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bbaa1d9...3f568b5. Read the comment docs.

Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Very good optimization, thanks

Comment thread pinot-core/src/main/java/org/apache/pinot/core/operator/CombineOperator.java Outdated
Comment thread pinot-core/src/test/java/org/apache/pinot/queries/BaseSingleValueQueriesTest.java Outdated
@xiangfu0 xiangfu0 force-pushed the early_termination_for_select_only branch from c98b96e to 5ad1acf Compare March 18, 2020 02:56
@xiangfu0 xiangfu0 force-pushed the early_termination_for_select_only branch from d8eee19 to 12f7b9f Compare March 18, 2020 19:11
@Jackie-Jiang
Contributor

@fx19880617 I pushed a commit to address the metadata issue with early termination. The fixed metadata fields are numSegmentsProcessed, which is the number of segments after segment pruning, and numTotalDocs, which is the total number of documents in the table. Please take a look.
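
One possible reading of these two definitions, as a hedged sketch: if the counts were accumulated only from segments that were actually scanned, early termination would under-report them, so they can instead be derived from the segment list itself. The class `EarlyTerminationMetadataSketch`, the `Segment` type with its `numDocs` and `pruned` fields, and the choice to include pruned segments in `numTotalDocs` are assumptions made for illustration, not the code pushed in this PR.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: these types are not Pinot classes.
class EarlyTerminationMetadataSketch {
  // Minimal stand-in for a segment: its document count and whether pruning removed it.
  static final class Segment {
    final int numDocs;
    final boolean pruned;

    Segment(int numDocs, boolean pruned) {
      this.numDocs = numDocs;
      this.pruned = pruned;
    }
  }

  // Number of segments left after pruning, independent of how many were scanned.
  static int numSegmentsProcessed(List<Segment> tableSegments) {
    int count = 0;
    for (Segment segment : tableSegments) {
      if (!segment.pruned) {
        count++;
      }
    }
    return count;
  }

  // Total documents in the table (assumed here to include pruned segments).
  static long numTotalDocs(List<Segment> tableSegments) {
    long total = 0;
    for (Segment segment : tableSegments) {
      total += segment.numDocs;
    }
    return total;
  }

  public static void main(String[] args) {
    List<Segment> segments = Arrays.asList(
        new Segment(100, false), new Segment(200, true), new Segment(300, false));
    System.out.println("numSegmentsProcessed=" + numSegmentsProcessed(segments)
        + ", numTotalDocs=" + numTotalDocs(segments));
  }
}
```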

@xiangfu0
Contributor Author

> @fx19880617 I pushed a commit to address the metadata issue with early termination. The fixed metadata fields are numSegmentsProcessed, which is the number of segments after segment pruning, and numTotalDocs, which is the total number of documents in the table. Please take a look.

lgtm! 👍

@xiangfu0 xiangfu0 requested a review from Jackie-Jiang March 18, 2020 20:50
@xiangfu0 xiangfu0 requested review from kishoreg and npawar March 18, 2020 22:27
@Jackie-Jiang Jackie-Jiang merged commit 917493f into master Mar 18, 2020
@Jackie-Jiang Jackie-Jiang deleted the early_termination_for_select_only branch March 18, 2020 22:54
3 participants