This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Optimized null_count #442

Merged: jorgecarleitao merged 4 commits into jorgecarleitao:main on Sep 26, 2021

Conversation

ritchie46 (Collaborator)

Every time we slice an array, we count the null values. This PR adds a small optimization so that we only count the null values in the smallest chunk of memory.
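For context, a minimal sketch of the idea; the struct layout and the naive `count_zeros` helper here are illustrative assumptions, not arrow2's actual code. When slicing a bitmap whose null count is already known, we can count the unset bits in the removed prefix and suffix and subtract from the known count, instead of recounting the (larger) retained range:

    // Illustrative sketch only; arrow2's real `Bitmap::slice` differs in detail.
    struct Bitmap {
        bytes: Vec<u8>,    // LSB-first validity bits; 0 = null
        offset: usize,     // bit offset into `bytes`
        length: usize,     // length in bits
        null_count: usize, // number of unset bits in [offset, offset + length)
    }

    /// Count unset bits in the bit range [offset, offset + len) (naive version).
    fn count_zeros(bytes: &[u8], offset: usize, len: usize) -> usize {
        (offset..offset + len)
            .filter(|&i| bytes[i / 8] & (1 << (i % 8)) == 0)
            .count()
    }

    impl Bitmap {
        fn slice(mut self, offset: usize, length: usize) -> Self {
            let removed = self.length - length;
            self.null_count = if length <= removed {
                // The retained range is the smaller chunk: count it directly.
                count_zeros(&self.bytes, self.offset + offset, length)
            } else {
                // The removed prefix + suffix are smaller: count those and subtract.
                let prefix = count_zeros(&self.bytes, self.offset, offset);
                let suffix_start = self.offset + offset + length;
                let suffix_len = self.length - offset - length;
                let suffix = count_zeros(&self.bytes, suffix_start, suffix_len);
                self.null_count - prefix - suffix
            };
            self.offset += offset;
            self.length = length;
            self
        }
    }

    fn main() {
        // 16 bits with a zero (null) at bits 0, 4, 8, 12: null_count = 4.
        let bitmap = Bitmap {
            bytes: vec![0b1110_1110, 0b1110_1110],
            offset: 0,
            length: 16,
            null_count: 4,
        };
        // Retain bits [2, 12): removed prefix has 1 null, removed suffix has 1.
        let sliced = bitmap.slice(2, 10);
        assert_eq!(sliced.null_count, 2);
    }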

codecov bot commented Sep 23, 2021

Codecov Report

Merging #442 (2a6093d) into main (235b7f5) will increase coverage by 0.00%.
The diff coverage is 100.00%.


@@           Coverage Diff           @@
##             main     #442   +/-   ##
=======================================
  Coverage   79.87%   79.87%           
=======================================
  Files         371      371           
  Lines       22753    22758    +5     
=======================================
+ Hits        18174    18178    +4     
- Misses       4579     4580    +1     
Impacted Files                    Coverage Δ
src/bitmap/immutable.rs           86.11% <100.00%> (+1.03%) ⬆️
src/compute/arithmetics/time.rs   44.89% <0.00%> (-2.05%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

jorgecarleitao (Owner)

Cool idea!

I think it is worth benchmarking: we now have 2 (smaller) iterations instead of one. I can PR a bench for slicing bitmaps.

ritchie46 (Collaborator, Author)

> I think it is worth benchmarking: we now have 2 (smaller) iterations instead of one. I can PR a bench for slicing bitmaps.

Yes, I was also wondering about this. Perhaps there is some extra instruction-level parallelism, since the two counts are independent. But you are right, let's benchmark.

ritchie46 (Collaborator, Author)

ritchie46 commented Sep 23, 2021

I ran this benchmark:

    // Excerpt from a criterion bench. `c: &mut Criterion`, `size = 1 << log2_size`,
    // and `bitmap` (a Bitmap of `size` bits containing some nulls) are set up above.
    let offset = ((size as f64) * 0.2) as usize;
    let len = ((size as f64) * 0.55) as usize;

    c.bench_function(&format!("bitmap_count_zeros {}", log2_size), |b| {
        b.iter(|| {
            let r = bitmap.clone().slice(offset, len);
            assert!(r.null_count() > 0);
        })
    });
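For anyone wanting to reproduce this, a self-contained harness the snippet could sit in; the bitmap construction (a null every 4th value) and the bench registration are my assumptions, not necessarily the PR's actual bench code:

    use arrow2::bitmap::Bitmap;
    use criterion::{criterion_group, criterion_main, Criterion};

    fn bench_count_zeros(c: &mut Criterion) {
        for log2_size in [10u32, 12, 14, 16, 18, 20] {
            let size = 2usize.pow(log2_size);
            // Validity bitmap with an unset (null) bit every 4th value (assumed).
            let bitmap: Bitmap = (0..size).map(|i| i % 4 != 0).collect();

            let offset = ((size as f64) * 0.2) as usize;
            let len = ((size as f64) * 0.55) as usize;

            c.bench_function(&format!("bitmap_count_zeros {}", log2_size), |b| {
                b.iter(|| {
                    let r = bitmap.clone().slice(offset, len);
                    assert!(r.null_count() > 0);
                })
            });
        }
    }

    criterion_group!(benches, bench_count_zeros);
    criterion_main!(benches);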

Worst-case side of the spectrum

Here the slice leaves a small chunk at both the start and the end of the array. We slice only 55% because that is close to the worst case for this optimization (51% would be the worst). The results get better as we slice bigger chunks. It seems the optimization only pays off once the bitmap no longer fits in cache.
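To put numbers on the trade-off: if the slice retains a fraction f of the bitmap's n bits, the naive approach scans f·n bits, while the optimized path scans min(f, 1 − f)·n. For f ≤ 0.5 nothing changes; at f = 0.55 it scans 45% of the bits instead of 55% (a small win, eaten into by the overhead of two separate counts); at f = 0.85 it scans only 15% instead of 85%.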

bitmap_count_zeros 10   time:   [31.303 ns 31.316 ns 31.329 ns]                                   
                        change: [+2.2863% +3.3347% +4.2056%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild
  2 (2.00%) high severe

bitmap_count_zeros 12   time:   [46.442 ns 46.496 ns 46.555 ns]                                   
                        change: [+13.943% +14.534% +15.153%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

bitmap_count_zeros 14   time:   [91.293 ns 91.322 ns 91.370 ns]                                  
                        change: [-14.881% -14.622% -14.426%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  4 (4.00%) high severe

bitmap_count_zeros 16   time:   [304.45 ns 305.34 ns 306.20 ns]                                  
                        change: [-14.044% -13.760% -13.461%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 21 outliers among 100 measurements (21.00%)
  8 (8.00%) low severe
  1 (1.00%) low mild
  8 (8.00%) high mild
  4 (4.00%) high severe

bitmap_count_zeros 18   time:   [1.1123 us 1.1126 us 1.1130 us]                                   
                        change: [-24.429% -23.034% -21.759%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

bitmap_count_zeros 20   time:   [4.3841 us 4.3865 us 4.3888 us]                                   
                        change: [-19.887% -19.629% -19.434%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Best-case side of the spectrum

Here we slice a large chunk (85% of the array), so only the removed 15% needs to be counted.

bitmap_count_zeros 10   time:   [34.895 ns 34.906 ns 34.918 ns]                                   
                        change: [+9.0526% +9.3320% +9.5184%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

bitmap_count_zeros 12   time:   [37.946 ns 37.990 ns 38.037 ns]                                   
                        change: [-27.212% -26.650% -26.237%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 18 outliers among 100 measurements (18.00%)
  7 (7.00%) low mild
  7 (7.00%) high mild
  4 (4.00%) high severe

bitmap_count_zeros 14   time:   [55.512 ns 55.536 ns 55.567 ns]                                  
                        change: [-63.246% -63.169% -63.098%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

bitmap_count_zeros 16   time:   [121.03 ns 121.45 ns 122.02 ns]                                  
                        change: [-79.464% -79.020% -78.621%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 18 outliers among 100 measurements (18.00%)
  1 (1.00%) low severe
  7 (7.00%) low mild
  5 (5.00%) high mild
  5 (5.00%) high severe

bitmap_count_zeros 18   time:   [399.81 ns 399.89 ns 400.01 ns]                                  
                        change: [-82.211% -82.129% -82.045%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  9 (9.00%) high severe

bitmap_count_zeros 20   time:   [1.4925 us 1.4930 us 1.4937 us]                                   
                        change: [-82.449% -82.356% -82.301%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) low mild
  2 (2.00%) high mild
  5 (5.00%) high severe

jorgecarleitao changed the title from "small null_count optimization" to "Optimized null_count" on Sep 23, 2021
jorgecarleitao added the "enhancement" label on Sep 23, 2021
jorgecarleitao (Owner) left a comment

Awesome; very good result.

It may be worth committing the bench you used, in case someone else would like to use it for future improvements.

ritchie46 (Collaborator, Author)

> It may be worth committing the bench you used, in case someone else would like to use it for future improvements.

Added 👍

jorgecarleitao (Owner)

Could you resolve the conflict?

ritchie46 (Collaborator, Author)

> Could you resolve the conflict?

Good to go.

jorgecarleitao merged commit e27ff27 into jorgecarleitao:main on Sep 26, 2021