Optimization for selection only queries: Allow early termination #5163
Jackie-Jiang merged 5 commits into master
Codecov Report
@@ Coverage Diff @@
## master #5163 +/- ##
============================================
- Coverage 66.02% 65.97% -0.06%
Complexity 12 12
============================================
Files 1050 1050
Lines 54117 54112 -5
Branches 8071 8067 -4
============================================
- Hits 35729 35698 -31
- Misses 15734 15765 +31
+ Partials 2654 2649 -5
Jackie-Jiang
left a comment
Very good optimization, thanks
@fx19880617 I pushed a commit to address the metadata issue with the early termination. The fixed metadata are:
lgtm! 👍
In Pinot's CombineOperator, queries are scheduled with at most 10 threads. For tables with many segments (say 10k), this means each thread will process about 1k segments.
For selection-only queries (e.g. select * from myTable limit 10), each thread may collect enough results to return after scanning only a few segments. There is no need to wait for all the segments to be scanned.
This is extremely useful for people who want to randomly browse a big table with a few clicks in the query console.
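The idea can be illustrated with a minimal, self-contained sketch. This is not the code from this PR or Pinot's actual CombineOperator; the class `EarlyTerminationSketch`, the method `selectWithLimit`, and the modeling of a segment as a plain list of integers are all assumptions made for illustration. Each worker thread stops picking up segments as soon as it has collected `limit` rows on its own, and the merge step trims the combined result back to `limit`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only: segments are modeled as in-memory lists of rows.
public class EarlyTerminationSketch {

  /** Hypothetical result holder: the selected rows and the number of segments actually scanned. */
  public static final class Result {
    public final List<Integer> rows;
    public final int segmentsScanned;

    Result(List<Integer> rows, int segmentsScanned) {
      this.rows = rows;
      this.segmentsScanned = segmentsScanned;
    }
  }

  public static Result selectWithLimit(List<List<Integer>> segments, int limit, int numThreads)
      throws Exception {
    ExecutorService executor = Executors.newFixedThreadPool(numThreads);
    AtomicInteger segmentsScanned = new AtomicInteger();
    try {
      List<Future<List<Integer>>> futures = new ArrayList<>();
      for (int t = 0; t < numThreads; t++) {
        final int threadId = t;
        futures.add(executor.submit(() -> {
          List<Integer> collected = new ArrayList<>();
          // Thread t handles segments t, t + numThreads, t + 2 * numThreads, ...
          for (int i = threadId; i < segments.size(); i += numThreads) {
            if (collected.size() >= limit) {
              break; // Early termination: this thread already has enough rows.
            }
            segmentsScanned.incrementAndGet();
            for (int row : segments.get(i)) {
              if (collected.size() >= limit) {
                break;
              }
              collected.add(row);
            }
          }
          return collected;
        }));
      }
      // Merge the per-thread results, keeping at most `limit` rows overall.
      List<Integer> merged = new ArrayList<>();
      for (Future<List<Integer>> future : futures) {
        for (int row : future.get()) {
          if (merged.size() < limit) {
            merged.add(row);
          }
        }
      }
      return new Result(merged, segmentsScanned.get());
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    // 1000 segments of 100 rows each, queried with LIMIT 10 on 10 threads.
    List<List<Integer>> segments = new ArrayList<>();
    for (int s = 0; s < 1000; s++) {
      List<Integer> segment = new ArrayList<>();
      for (int r = 0; r < 100; r++) {
        segment.add(s * 100 + r);
      }
      segments.add(segment);
    }
    Result result = selectWithLimit(segments, 10, 10);
    System.out.println(result.rows.size() + " rows from " + result.segmentsScanned
        + " of " + segments.size() + " segments");
  }
}
```

In this toy setup each thread is satisfied after its first segment, so only 10 of the 1000 segments are scanned. Without the early-termination check, every thread would walk through all 100 of its assigned segments before returning.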