Add extra inlining to speed up take #226

Dandandan · 2021-07-25T06:19:33Z

The methods already had some inline hints, but the hints did not reach all the way down to `extend_from_trusted_len_iter_unchecked` and `try_from_trusted_len_iter_unchecked`.
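As a rough illustration of the idea (not the actual code touched by this PR): when a public constructor carries `#[inline]` but the helper it delegates to does not, the hint stops short of the loop that does the work. The sketch below uses a hypothetical `MiniBuffer` type with made-up method names; `#[inline]` is only a hint to the compiler, so any gain has to be confirmed by benchmarking.

```rust
/// Hypothetical stand-in for a growable buffer; not the real `MutableBuffer`.
pub struct MiniBuffer {
    data: Vec<i32>,
}

impl MiniBuffer {
    pub fn new() -> Self {
        Self { data: Vec::new() }
    }

    /// Public entry point, which in this sketch already had an inline hint.
    #[inline]
    pub fn from_iter_with_len<I: Iterator<Item = i32>>(iter: I, len: usize) -> Self {
        let mut buffer = Self::new();
        buffer.extend_with_len(iter, len);
        buffer
    }

    /// Inner helper. Adding the hint here as well lets the whole call chain,
    /// down to the loop, be considered for inlining into the caller.
    #[inline]
    pub fn extend_with_len<I: Iterator<Item = i32>>(&mut self, iter: I, len: usize) {
        self.data.reserve(len);
        for item in iter {
            self.data.push(item);
        }
    }

    pub fn as_slice(&self) -> &[i32] {
        &self.data
    }
}

/// A toy `take`: gathers `values[i]` for every index in `indices`.
pub fn take(values: &[i32], indices: &[usize]) -> MiniBuffer {
    let iter = indices.iter().map(|&i| values[i]);
    MiniBuffer::from_iter_with_len(iter, indices.len())
}

fn main() {
    let values = [10, 20, 30, 40];
    let taken = take(&values, &[3, 0, 2]);
    println!("{:?}", taken.as_slice()); // prints [40, 10, 30]
}
```

The closure passed through `map` is the part that benefits most: once the helper is inlined, the compiler can see the closure body inside the loop instead of calling through an opaque function.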

This speeds up most of the `take` benchmarks by roughly 20-50%; the `i32` cases with nulls improve less (about 1% and 13%):

take i32 512            time:   [300.35 ns 301.92 ns 303.62 ns]
                        change: [-51.942% -51.683% -51.411%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

take i32 1024           time:   [674.49 ns 675.96 ns 677.54 ns]
                        change: [-47.687% -47.558% -47.445%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  8 (8.00%) high mild

take i32 nulls 512      time:   [511.26 ns 511.65 ns 512.11 ns]
                        change: [-1.1965% -0.9889% -0.7859%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

take i32 nulls 1024     time:   [796.57 ns 797.00 ns 797.49 ns]
                        change: [-12.886% -12.768% -12.652%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  3 (3.00%) high mild
  8 (8.00%) high severe

take str 512            time:   [3.0675 us 3.0691 us 3.0705 us]
                        change: [-19.937% -19.801% -19.674%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

take str 1024           time:   [4.9393 us 4.9453 us 4.9512 us]
                        change: [-38.380% -38.271% -38.141%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

take str null indices 512
                        time:   [2.9738 us 2.9800 us 2.9856 us]
                        change: [-33.972% -33.766% -33.544%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

take str null indices 1024
                        time:   [5.2084 us 5.2177 us 5.2279 us]
                        change: [-35.800% -35.646% -35.513%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild

take str null values 1024
                        time:   [11.351 us 11.362 us 11.375 us]
                        change: [-32.987% -32.304% -31.477%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

take str null values null indices 1024
                        time:   [10.549 us 10.584 us 10.617 us]
                        change: [-22.214% -21.878% -21.539%] (p = 0.00 < 0.05)
                        Performance has improved.
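To read the `change` figures as speedups, one can back out the approximate baseline time. The sketch below assumes the reported change is `(new - old) / old` applied to the point estimates, which is an approximation of how the benchmark harness derives it; the helper name `baseline_ns` is made up for illustration.

```rust
/// Recover an approximate baseline time from the new time and the relative
/// change in percent, assuming change = (new - old) / old.
fn baseline_ns(new_ns: f64, change_percent: f64) -> f64 {
    new_ns / (1.0 + change_percent / 100.0)
}

fn main() {
    // Point estimates copied from the `take i32 512` entry above.
    let new_ns = 301.92;
    let old_ns = baseline_ns(new_ns, -51.683);
    println!(
        "estimated baseline: {:.1} ns, speedup: {:.2}x",
        old_ns,
        old_ns / new_ns
    );
}
```

Under that assumption, the `take i32 512` case went from roughly 625 ns to 302 ns, or about a 2x speedup.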

codecov · 2021-07-25T06:28:56Z

Codecov Report

Merging #226 (558bfa2) into main (eaa9be9) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main     #226   +/-   ##
=======================================
  Coverage   76.94%   76.94%           
=======================================
  Files         229      229           
  Lines       19536    19536           
=======================================
  Hits        15031    15031           
  Misses       4505     4505

Impacted Files	Coverage Δ
src/buffer/mutable.rs	`91.76% <ø> (ø)`
src/compute/take/generic_binary.rs	`98.86% <ø> (ø)`
src/compute/take/primitive.rs	`93.87% <ø> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

jorgecarleitao · 2021-07-25T06:48:28Z

Uf, amazing. Thanks a lot, @Dandandan. cc @ritchie46, since this is relevant to Polars.

jorgecarleitao

I made two small suggestions just so that we remember why they were added.

src/buffer/mutable.rs

Co-authored-by: Jorge Leitao <jorgecarleitao@gmail.com>

Dandandan added 4 commits July 25, 2021 07:01

add inlining for from_trusted_len_iter functions

7948ffe

add inlining for bitmap

ac2d64d

Remove try_from_trusted_len_iter_unchecked for now

2decfcc

Remove inline from bitmap for now

39c093f

jorgecarleitao reviewed Jul 25, 2021

src/buffer/mutable.rs (resolved)

src/buffer/mutable.rs (resolved)

Dandandan and others added 2 commits July 25, 2021 09:39

Update src/buffer/mutable.rs

2951022

Co-authored-by: Jorge Leitao <jorgecarleitao@gmail.com>

Update src/buffer/mutable.rs

558bfa2

Co-authored-by: Jorge Leitao <jorgecarleitao@gmail.com>

jorgecarleitao merged commit b03c906 into jorgecarleitao:main Jul 25, 2021

jorgecarleitao added the "enhancement" label (An improvement to an existing feature) Jul 29, 2021
