This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Add extra inlining to speed up take #226

Merged
merged 6 commits into from
Jul 25, 2021

Dandandan
Collaborator

The methods already had some inline hints, but the hints did not reach all the way down to extend_from_trusted_len_iter_unchecked and try_from_trusted_len_iter_unchecked.
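
As a rough illustration of the kind of change involved, here is a minimal sketch. `MiniBuffer` and `extend_inner` are hypothetical names made up for this example and are not the actual code in src/buffer/mutable.rs. It shows an outer method that carries `#[inline]` delegating to an inner helper that also needs the hint for the whole chain to be considered for inlining. Note that `#[inline]` is only a hint; the compiler may still decline.

```rust
/// Hypothetical stand-in for a mutable buffer (assumption: not the real type).
pub struct MiniBuffer {
    data: Vec<i32>,
}

impl MiniBuffer {
    pub fn new() -> Self {
        Self { data: Vec::new() }
    }

    /// Public entry point that already carries an inline hint.
    #[inline]
    pub fn extend_from_trusted_len_iter<I: Iterator<Item = i32>>(&mut self, iter: I) {
        // Assumption of this sketch: the iterator reports an exact upper bound.
        let (_, upper) = iter.size_hint();
        let len = upper.expect("trusted-len iterators report an upper bound");
        self.extend_inner(iter, len);
    }

    /// Innermost helper. Without the hint here, the hint on the caller
    /// stops one level short of the loop that does the actual work.
    #[inline]
    fn extend_inner<I: Iterator<Item = i32>>(&mut self, iter: I, len: usize) {
        // Reserve once up front, then push every item.
        self.data.reserve(len);
        for v in iter {
            self.data.push(v);
        }
    }

    pub fn as_slice(&self) -> &[i32] {
        &self.data
    }
}
```

Calling `buf.extend_from_trusted_len_iter((0..4).map(|x| x * 10))` on a fresh `MiniBuffer` leaves `buf.as_slice()` equal to `[0, 10, 20, 30]`.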

This improves most of the take benchmarks by roughly 20-50%; take i32 nulls 512 stays within the noise threshold:

take i32 512            time:   [300.35 ns 301.92 ns 303.62 ns]
                        change: [-51.942% -51.683% -51.411%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

take i32 1024           time:   [674.49 ns 675.96 ns 677.54 ns]
                        change: [-47.687% -47.558% -47.445%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  8 (8.00%) high mild

take i32 nulls 512      time:   [511.26 ns 511.65 ns 512.11 ns]
                        change: [-1.1965% -0.9889% -0.7859%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

take i32 nulls 1024     time:   [796.57 ns 797.00 ns 797.49 ns]
                        change: [-12.886% -12.768% -12.652%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  3 (3.00%) high mild
  8 (8.00%) high severe

take str 512            time:   [3.0675 us 3.0691 us 3.0705 us]
                        change: [-19.937% -19.801% -19.674%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

take str 1024           time:   [4.9393 us 4.9453 us 4.9512 us]
                        change: [-38.380% -38.271% -38.141%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

take str null indices 512
                        time:   [2.9738 us 2.9800 us 2.9856 us]
                        change: [-33.972% -33.766% -33.544%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

take str null indices 1024
                        time:   [5.2084 us 5.2177 us 5.2279 us]
                        change: [-35.800% -35.646% -35.513%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild

take str null values 1024
                        time:   [11.351 us 11.362 us 11.375 us]
                        change: [-32.987% -32.304% -31.477%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

take str null values null indices 1024
                        time:   [10.549 us 10.584 us 10.617 us]
                        change: [-22.214% -21.878% -21.539%] (p = 0.00 < 0.05)
                        Performance has improved.
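
For readers unfamiliar with the operation being benchmarked, here is a minimal sketch of what a take does: it gathers values at the given indices. `take_sketch` is a hypothetical helper written for this illustration, not the kernel in src/compute/take. Modelling nulls as `Option` is an assumption made for brevity, and the "null values" and "null indices" benchmark variants presumably correspond to `None` entries in the two inputs.

```rust
/// Gathers `values[i]` for each index `i`; a `None` index yields a `None` slot.
/// Hypothetical illustration only. Panics if an index is out of bounds.
pub fn take_sketch<T: Copy>(values: &[Option<T>], indices: &[Option<usize>]) -> Vec<Option<T>> {
    indices
        .iter()
        // A null index propagates as null; otherwise copy the (possibly null) value.
        .map(|&idx| idx.and_then(|i| values[i]))
        .collect()
}
```

For example, taking indices `[Some(2), None, Some(0), Some(1)]` from values `[Some(10), None, Some(30)]` gives `[Some(30), None, Some(10), None]`.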

@codecov
codecov bot commented Jul 25, 2021

Codecov Report

Merging #226 (558bfa2) into main (eaa9be9) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main     #226   +/-   ##
=======================================
  Coverage   76.94%   76.94%           
=======================================
  Files         229      229           
  Lines       19536    19536           
=======================================
  Hits        15031    15031           
  Misses       4505     4505           

Impacted Files                       Coverage Δ
src/buffer/mutable.rs                91.76% <ø> (ø)
src/compute/take/generic_binary.rs   98.86% <ø> (ø)
src/compute/take/primitive.rs        93.87% <ø> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update eaa9be9...558bfa2.

@jorgecarleitao
Owner

Uf, amazing. Thanks a lot, @Dandandan. cc @ritchie46, since this is relevant to Polars.

Owner

@jorgecarleitao jorgecarleitao left a comment

I made two small suggestions just so that we remember why they were added.

src/buffer/mutable.rs (resolved)
src/buffer/mutable.rs (resolved)
Dandandan and others added 2 commits July 25, 2021 09:39
Co-authored-by: Jorge Leitao <jorgecarleitao@gmail.com>
Co-authored-by: Jorge Leitao <jorgecarleitao@gmail.com>
@jorgecarleitao jorgecarleitao merged commit b03c906 into jorgecarleitao:main Jul 25, 2021
@jorgecarleitao jorgecarleitao added the enhancement An improvement to an existing feature label Jul 29, 2021