Add bloom filter benchmark for parquet writer #3323

viirya · 2022-12-10T06:10:14Z

Which issue does this PR close?

Related to #3320.

Rationale for this change

What changes are included in this PR?

Are there any user-facing changes?

viirya · 2022-12-10T06:11:27Z

write_batch primitive/4096 values primitive                                                                        
                        time:   [619.38 µs 621.82 µs 624.41 µs]
                        thrpt:  [282.70 MiB/s 283.88 MiB/s 285.00 MiB/s]
Found 12 outliers among 100 measurements (12.00%)
  12 (12.00%) high severe                    
write_batch primitive/4096 values primitive with bloom filter  
                        time:   [4.8945 ms 4.9131 ms 4.9326 ms]         
                        thrpt:  [35.786 MiB/s 35.928 MiB/s 36.065 MiB/s]
Found 1 outliers among 100 measurements (1.00%)

write_batch primitive/4096 values primitive non-null                                                               
                        time:   [540.92 µs 541.26 µs 541.54 µs]                                                                                                                                                                        
                        thrpt:  [319.65 MiB/s 319.82 MiB/s 320.02 MiB/s]
Found 1 outliers among 100 measurements (1.00%)                                                                    
  1 (1.00%) high severe                                                                                            
write_batch primitive/4096 values primitive non-null with bloom filter
                        time:   [4.3944 ms 4.4298 ms 4.4636 ms]
                        thrpt:  [38.781 MiB/s 39.077 MiB/s 39.392 MiB/s]

write_batch primitive/4096 values string                
                        time:   [272.10 µs 272.33 µs 272.59 µs]
                        thrpt:  [292.14 MiB/s 292.42 MiB/s 292.68 MiB/s]
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) low mild                  
  5 (5.00%) high mild                                                                                              
  5 (5.00%) high severe                             
write_batch primitive/4096 values string with bloom filter           
                        time:   [1.1696 ms 1.1821 ms 1.1913 ms]
                        thrpt:  [66.850 MiB/s 67.368 MiB/s 68.090 MiB/s]

write_batch primitive/4096 values string non-null
                        time:   [332.30 µs 332.95 µs 333.64 µs]
                        thrpt:  [235.76 MiB/s 236.25 MiB/s 236.71 MiB/s]
Benchmarking write_batch primitive/4096 values string non-null with bloom filter: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.1s, enable flat sampling, or reduce sample count to 60.
write_batch primitive/4096 values string non-null with bloom filter
                        time:   [1.1043 ms 1.1553 ms 1.2063 ms]
                        thrpt:  [65.204 MiB/s 68.087 MiB/s 71.232 MiB/s]
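
To make the overhead in the numbers above easier to read, the following is a minimal sketch that divides each bloom filter median by its baseline median. It is not part of the PR; the helper name `slowdown` and the `cases` table are illustrative assumptions, and the inputs are the middle values of the `time:` lines above, converted to microseconds.

```rust
/// Ratio of the bloom filter run time to the baseline run time.
/// Both arguments must use the same unit (microseconds here).
/// Hypothetical helper for illustration; not part of the parquet crate.
fn slowdown(baseline_us: f64, with_bloom_filter_us: f64) -> f64 {
    with_bloom_filter_us / baseline_us
}

fn main() {
    // (case, baseline median in µs, bloom filter median in µs),
    // copied from the middle value of each `time:` line in the output above.
    let cases = [
        ("primitive", 621.82, 4913.1),
        ("primitive non-null", 541.26, 4429.8),
        ("string", 272.33, 1182.1),
        ("string non-null", 332.95, 1155.3),
    ];

    for &(name, baseline_us, bloom_us) in cases.iter() {
        println!(
            "{}: {:.1}x slower with bloom filter",
            name,
            slowdown(baseline_us, bloom_us)
        );
    }
}
```

On these medians the primitive cases come out at roughly 8x slower and the string cases at roughly 3.5x to 4.3x slower. The string non-null figure is the least stable, since that run has the widest interval (1.1043 ms to 1.2063 ms) and triggered the sample-count warning.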

alamb · 2022-12-10T11:12:22Z

Thank you @viirya

viirya · 2022-12-10T17:41:49Z

Thank you @alamb

ursabot · 2022-12-13T16:24:50Z

Benchmark runs are scheduled for baseline = 9e39f96 and contender = ad94368. ad94368 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-rs-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Add bloom filter benchmark

9714d3f

github-actions bot added the parquet label (Changes to the parquet crate) on Dec 10, 2022

alamb approved these changes Dec 10, 2022

viirya merged commit ad94368 into apache:master Dec 10, 2022
