perf(expr): further optimize performance #744

Merged: 15 commits, Dec 20, 2022

Conversation

wangrunji0408
Member

This PR includes several optimizations to expression evaluation.

  • Modify the Array trait.
    • Introduce is_null and get_raw (originally get_unchecked) as low-level methods.
    • Remove the iterator structures and use RPITIT (return-position impl Trait in traits) to simplify the code.
  • Optimize the filter operation on arrays.
  • Optimize bitvec operations, since its built-in operations perform poorly. 😕
  • Optimize casting to string arrays by introducing a writer for Utf8Array, which avoids generating an intermediate String.
  • Avoid using zip_eq on the critical path.
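
To illustrate the Utf8Array writer idea from the list above, here is a minimal sketch, not the code from this PR. It assumes a simplified string array that stores all values in one byte buffer plus an offsets vector; the names `Utf8ArrayBuilder`, `ValueWriter`, and `cast_i32_to_string` are hypothetical and chosen only for this example. The point is that each value is formatted straight into the shared buffer through `std::fmt::Write`, so no temporary `String` is allocated per row.

```rust
use std::fmt::{self, Write};

/// Simplified string array (an assumption for this sketch): all values share
/// one buffer, and `offsets[i]..offsets[i + 1]` is the byte range of value `i`.
pub struct Utf8Array {
    offsets: Vec<usize>,
    data: Vec<u8>,
}

impl Utf8Array {
    pub fn len(&self) -> usize {
        self.offsets.len() - 1
    }

    pub fn get(&self, i: usize) -> &str {
        let bytes = &self.data[self.offsets[i]..self.offsets[i + 1]];
        // Bytes are only ever appended via `write_str`, so they are valid UTF-8.
        std::str::from_utf8(bytes).expect("buffer holds valid UTF-8")
    }
}

/// Hypothetical builder that hands out a writer for each new value.
pub struct Utf8ArrayBuilder {
    offsets: Vec<usize>,
    data: Vec<u8>,
}

impl Utf8ArrayBuilder {
    pub fn with_capacity(n: usize) -> Self {
        let mut offsets = Vec::with_capacity(n + 1);
        offsets.push(0); // the first value starts at byte 0
        Self { offsets, data: Vec::new() }
    }

    /// Start writing one value directly into the shared buffer.
    pub fn writer(&mut self) -> ValueWriter<'_> {
        ValueWriter { builder: self }
    }

    pub fn finish(self) -> Utf8Array {
        Utf8Array { offsets: self.offsets, data: self.data }
    }
}

/// Appends formatted text to the builder's buffer.
pub struct ValueWriter<'a> {
    builder: &'a mut Utf8ArrayBuilder,
}

impl fmt::Write for ValueWriter<'_> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        self.builder.data.extend_from_slice(s.as_bytes());
        Ok(())
    }
}

impl ValueWriter<'_> {
    /// Seal the current value by recording where it ends.
    /// Skipping this call would merge the bytes into the next value.
    pub fn finish(self) {
        let end = self.builder.data.len();
        self.builder.offsets.push(end);
    }
}

/// Cast integers to strings without building a `String` per element.
pub fn cast_i32_to_string(values: &[i32]) -> Utf8Array {
    let mut builder = Utf8ArrayBuilder::with_capacity(values.len());
    for v in values {
        let mut w = builder.writer();
        write!(w, "{}", v).expect("appending to a Vec<u8> does not fail");
        w.finish();
    }
    builder.finish()
}
```

Calling `cast_i32_to_string(&[1, -20, 300])` yields an array whose `get(1)` returns `"-20"`. The naive alternative, `v.to_string()` followed by a copy into the array, allocates and frees one heap string per element, which is the cost this kind of writer is meant to avoid.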

Bench Results

| bench | old | new | change |
| --- | --- | --- | --- |
| add(i32,i32) | 1.2296 µs | 365.56 ns | -70.906% |
| mul(i32,i32) | 1.2369 µs | 373.86 ns | -70.162% |
| eq(i32,i32) | 1.1316 µs | 904.02 ns | -20.275% |
| gt(i32,i32) | 1.1296 µs | 910.48 ns | -20.115% |
| add(f64,f64) | 1.4912 µs | 648.14 ns | -56.439% |
| mul(f64,f64) | 1.5012 µs | 647.52 ns | -57.020% |
| div(f64,f64) | 1.5708 µs | 728.73 ns | -53.531% |
| eq(f64,f64) | 2.8126 µs | 2.5688 µs | -8.5452% |
| gt(f64,f64) | 3.7635 µs | 3.5239 µs | -6.2045% |
| add(decimal,decimal) | 14.215 µs | 13.384 µs | -5.8360% |
| mul(decimal,decimal) | 12.364 µs | 11.544 µs | -6.5028% |
| eq(decimal,decimal) | 13.599 µs | 13.805 µs | +1.4521% |
| gt(decimal,decimal) | 13.692 µs | 13.849 µs | +1.1735% |
| and(bool,bool) | 8.4668 µs | 2.3852 µs | -71.878% |
| or(bool,bool) | 8.4402 µs | 1.2443 µs | -85.264% |
| not(bool) | 163.67 ns | 772.42 ns | +368.99% |
| sum(i32) | 298.17 ns | 295.43 ns | -0.8082% |
| max(i32) | 9.6741 µs | 6.3693 µs | -34.099% |
| first(i32) | 5.7060 ns | 4.7957 ns | -15.959% |
| count(i32) | 13.459 ns | 13.465 ns | -0.1833% |
| sum(f64) | 3.6906 µs | 3.6942 µs | -0.0145% |
| max(f64) | 17.019 µs | 16.243 µs | -4.6018% |
| first(f64) | 5.9944 ns | 4.8889 ns | -18.660% |
| count(f64) | 13.486 ns | 13.462 ns | +0.2111% |
| sum(decimal) | 29.358 µs | 29.617 µs | -0.2345% |
| max(decimal) | 23.269 µs | 23.141 µs | +0.1118% |
| first(decimal) | 6.4851 ns | 5.1673 ns | -19.684% |
| count(decimal) | 13.448 ns | 13.462 ns | +0.0345% |
| cast(i32->f64) | 560.58 ns | 556.22 ns | +0.0332% |
| cast(f64->decimal) | 123.47 µs | 118.41 µs | -4.0403% |
| cast(i32->string) | 83.432 µs | 52.788 µs | -36.677% |
| cast(f64->string) | 181.72 µs | 165.45 µs | -8.8352% |
| cast(decimal->string) | 134.69 µs | 92.167 µs | -31.467% |
| filter(i32) | 17.656 µs | 7.2464 µs | -58.961% |

TPC-H Results

| query | old (s) | new (s) | change |
| --- | --- | --- | --- |
| Q1 | 1.580 | 1.467 | -7% |
| Q3 | 0.434 | 0.314 | -28% |
| Q5 | 0.853 | 0.827 | -3% |
| Q6 | 0.469 | 0.158 | -66% |
| Q10 | 0.672 | 0.527 | -22% |

Signed-off-by: Runji Wang <wangrunji0408@163.com>
wangrunji0408 merged commit 0265c50 into main on Dec 20, 2022
wangrunji0408 deleted the byte-array-writer branch on December 20, 2022 07:03
MingjiHan99 pushed a commit that referenced this pull request Dec 22, 2022
* optimize to string array

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* optimize bitvec

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* avoid zip_eq for performance

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* array: add `is_null` and `get_raw`

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* add bench for array filter

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* optimize filter -30%

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* optimize filter from bool array

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* clear null data

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* fix cardinality error

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* remove array iterator

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* introduce non-null iterator

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* optimize bitmap &&

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* optimize BitVec operations

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* fix clippy and test

Signed-off-by: Runji Wang <wangrunji0408@163.com>

Signed-off-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: MingjiHan <mjhan@bu.edu>
wangrunji0408 added a commit that referenced this pull request Dec 23, 2022
* fix(storage): compaction type error (#737)

* Fix compaction type error

Signed-off-by: Yue Yin <yueyin.dev@gmail.com>

* assert empty

Signed-off-by: Alex Chi <iskyzh@gmail.com>

Signed-off-by: Yue Yin <yueyin.dev@gmail.com>
Signed-off-by: Alex Chi <iskyzh@gmail.com>
Co-authored-by: Alex Chi <iskyzh@gmail.com>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* chore: bump sqllogictest to 0.9.0 (#736)

Signed-off-by: MingjiHan <mjhan@bu.edu>

* feat(storage): Dict encoding for compaction (#740)

* Dict encoding for compaction

Signed-off-by: Yue Yin <yueyin.dev@gmail.com>

* CI

Signed-off-by: Yue Yin <yueyin.dev@gmail.com>

* Add tests

Signed-off-by: Yue Yin <yueyin.dev@gmail.com>

Signed-off-by: Yue Yin <yueyin.dev@gmail.com>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* perf(expr): apply auto-vectorization and remove explicit SIMDs (#741)

* remove explicit simd

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* apply auto-vectorization for all binary/unary ops

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* add more bench for ops

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* SIMD accelerate &[bool] to BitVec

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* optimize const expression evaluation

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* fix clippy

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* recover push

Signed-off-by: Runji Wang <wangrunji0408@163.com>

Signed-off-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* release: v0.2 (#742)

Signed-off-by: Alex Chi <iskyzh@gmail.com>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* perf(expr): further optimize performance (#744)

* optimize to string array

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* optimize bitvec

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* avoid zip_eq for performance

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* array: add `is_null` and `get_raw`

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* add bench for array filter

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* optimize filter -30%

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* optimize filter from bool array

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* clear null data

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* fix cardinality error

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* remove array iterator

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* introduce non-null iterator

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* optimize bitmap &&

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* optimize BitVec operations

Signed-off-by: Runji Wang <wangrunji0408@163.com>

* fix clippy and test

Signed-off-by: Runji Wang <wangrunji0408@163.com>

Signed-off-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* add path option (#747)

Signed-off-by: MingjiHan <mjhan@bu.edu>

Signed-off-by: MingjiHan <mjhan@bu.edu>

* wtf

Signed-off-by: MingjiHan <mjhan@bu.edu>

* updates

Signed-off-by: Mingji Han <mjhan@bu.edu>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* change python compile configurations

Signed-off-by: Mingji Han <mjhan@bu.edu>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* updates

Signed-off-by: Mingji Han <mjhan@bu.edu>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* change python compile configurations

Signed-off-by: Mingji Han <mjhan@bu.edu>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* add python type conversion

Signed-off-by: Mingji Han <mjhan@bu.edu>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* add docs

Signed-off-by: Mingji Han <mjhan@bu.edu>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* fix format

Signed-off-by: Mingji Han <mjhan@bu.edu>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* fix macos

Signed-off-by: MingjiHan <mjhan@bu.edu>

* fix linux...

Signed-off-by: MingjiHan <mjhan@bu.edu>

* support macOS compilation

Signed-off-by: MingjiHan <mjhan@bu.edu>

* Update docs/07-python-extension.md

Co-authored-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* Update docs/07-python-extension.md

Co-authored-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* Update src/lib.rs

Co-authored-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* Update src/lib.rs

Co-authored-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: MingjiHan <mjhan@bu.edu>

* move files

Signed-off-by: MingjiHan <mjhan@bu.edu>

Signed-off-by: Yue Yin <yueyin.dev@gmail.com>
Signed-off-by: Alex Chi <iskyzh@gmail.com>
Signed-off-by: MingjiHan <mjhan@bu.edu>
Signed-off-by: Runji Wang <wangrunji0408@163.com>
Signed-off-by: Mingji Han <mjhan@bu.edu>
Co-authored-by: Yue Yin <41224888+yinfredyue@users.noreply.github.com>
Co-authored-by: Alex Chi <iskyzh@gmail.com>
Co-authored-by: xxchan <xxchan22f@gmail.com>
Co-authored-by: Runji Wang <wangrunji0408@163.com>