Description
Search before asking
- I had searched in the issues and found no similar issues.
Description
After the SegmentIterator vectorization PR was merged, some follow-up work remains.
This issue addresses several of the remaining performance problems.
Solution
Test SQL
SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
FROM lineorder_flat
WHERE LO_ORDERDATE >= 19930101 AND LO_ORDERDATE <= 19931231 AND LO_DISCOUNT BETWEEN 1 AND 3 AND LO_QUANTITY < 25;
Initial performance test:
code version: SegmentIterator row version
- BlockLoadTime: 3s687ms
- VectorPredEvalTime: 778.640ms
- BlockSeekCount: 5.36M
code version: SegmentIterator vectorization
- BlockLoadTime: 4s140ms
- VectorPredEvalTime: 256.926ms
- BlockSeekCount: 5.36M
Analysis
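1 After SegmentIterator is vectorized, overall performance drops: BlockLoadTime rises from 3s687ms to 4s140ms (about 12%).
2 Predicate evaluation does improve (VectorPredEvalTime falls from 778.640ms to 256.926ms, roughly 3x), but it is a small share of the total time, so the overall impact is limited.
3 BlockSeekCount (5.36M) is too high and can be reduced.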
Optimization 1: remove the BlockSeekTime timer
- BlockLoadTime: 3s512ms
Optimization 2 (based on opt1): batch-insert into the column vector in BitShufflePageDecoder.next_batch
- BlockLoadTime: 3s105ms
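To illustrate the idea behind Optimization 2, here is a minimal sketch of per-value versus batched insertion into a column vector. `Int32Column`, `DecodedPage`, `insert_one`, `insert_many`, `next_batch_per_value`, and `next_batch_batched` are hypothetical names made up for this example; they are not the actual Doris classes or the code in the PR. The sketch assumes a fixed-width INT32 column whose page has already been decoded into a contiguous buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a fixed-width column vector (not the Doris class).
struct Int32Column {
    std::vector<int32_t> data;

    // Per-value path: one call and one capacity check per element.
    void insert_one(const char* src) {
        int32_t v;
        std::memcpy(&v, src, sizeof(v));
        data.push_back(v);
    }

    // Batched path: grow once, then copy the contiguous run with one memcpy.
    void insert_many(const char* src, std::size_t n) {
        if (n == 0) return;
        const std::size_t old_size = data.size();
        data.resize(old_size + n);
        std::memcpy(data.data() + old_size, src, n * sizeof(int32_t));
    }
};

// Hypothetical decoder over a page whose values are already decoded
// into a contiguous buffer.
struct DecodedPage {
    std::vector<int32_t> values;
    std::size_t cur = 0;

    // Limit the request to the number of values left in the page.
    std::size_t clamp(std::size_t n) const {
        const std::size_t left = values.size() - cur;
        return n < left ? n : left;
    }

    // Row-at-a-time style: n separate inserts into the destination column.
    std::size_t next_batch_per_value(std::size_t n, Int32Column* dst) {
        n = clamp(n);
        for (std::size_t i = 0; i < n; ++i) {
            dst->insert_one(reinterpret_cast<const char*>(&values[cur + i]));
        }
        cur += n;
        return n;
    }

    // Batched style: a single insert covering the whole run.
    std::size_t next_batch_batched(std::size_t n, Int32Column* dst) {
        n = clamp(n);
        if (n > 0) {
            dst->insert_many(reinterpret_cast<const char*>(&values[cur]), n);
        }
        cur += n;
        return n;
    }
};
```

Both paths produce the same column contents; the batched one replaces n function calls and n capacity checks with one resize and one copy, which is the kind of saving this optimization is after.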
Optimization 3 (based on opt1, opt2): eliminate lazy materialization
- BlockLoadTime: 2s641ms
- BlockSeekCount: 175.02K
BlockSeekCount drops sharply, from 5.36M to 175.02K.
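One way to see why dropping lazy materialization can cut the seek count is a toy model, sketched below. It is an illustration under stated assumptions, not Doris code: it assumes that the lazy path seeks once per contiguous run of row ids that passed the predicate, and that the eager path seeks once per batch and then reads sequentially. `seeks_lazy` and `seeks_eager` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed lazy path: non-predicate columns are read only for selected rows,
// so each contiguous run of selected row ids costs one seek.
// `selected` must be sorted in ascending order.
std::size_t seeks_lazy(const std::vector<uint32_t>& selected) {
    std::size_t seeks = 0;
    for (std::size_t i = 0; i < selected.size(); ++i) {
        if (i == 0 || selected[i] != selected[i - 1] + 1) {
            ++seeks;  // a gap in row ids starts a new run
        }
    }
    return seeks;
}

// Assumed eager path: every column is read sequentially, one seek per batch.
std::size_t seeks_eager(std::size_t total_rows, std::size_t batch_size) {
    if (batch_size == 0) return 0;
    return (total_rows + batch_size - 1) / batch_size;  // ceiling division
}
```

Under this model a predicate that selects scattered rows makes the lazy path seek almost once per selected row, while the eager path's seek count depends only on the number of batches. Whether that is a net win depends on how many extra values the eager path has to decode.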
Optimization 4 (based on opt1, opt2, opt3): set doris_scanner_thread_pool_thread_num = 1
- BlockLoadTime: 1s665ms
BlockLoadTime improves further, but the SQL as a whole may take longer.
This raised the question of whether the original version has the same problem.
Original version test: doris_scanner_thread_pool_thread_num = 1 vs. the default value
set doris_scanner_thread_pool_thread_num = default value
- BlockLoadTime: 3s571ms
set doris_scanner_thread_pool_thread_num = 1
- BlockLoadTime: 2s232ms
The original version shows the same behavior. This may be related to memory allocation under multithreading and needs further investigation.
I will submit a PR for opt1, opt2, opt3
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct