Automatically run and track arrow / parquet benchmarks over time #1274

alamb · 2022-02-05T11:23:03Z

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
As discussed by @jonkeane in apache/arrow-site#188 (comment)

It would be nice to track the speed of the Rust benchmarks over time so that:

- We can ensure performance doesn't regress (get worse) accidentally
- We can see the results of improvements over time (and link to them publicly)

Describe the solution you'd like
Figure out how to run the benchmarks (cargo bench ...) in the framework described here: https://lists.apache.org/thread/n98tr38moqvbf1rlqd7n7912mfrqv8ps



kazuk · 2022-05-12T01:08:13Z

I wrote a bash script for this.

Usage example: ./dev/bench/run_bench.sh 11.0.0 12.0.0

This saves four baselines: arrow-11.0.0, arrow-12.0.0, parquet-11.0.0, and parquet-12.0.0.

These can then be compared with critcmp.

#!/bin/bash
#
# run bench for 2 git hash/branch/tag
#
# ./dev/bench/run_bench.sh [baseline] [compare]

set -e

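function patch_cargo_toml() {
    # Add "bench = false" under [lib] so that extra arguments such as
    # --save-baseline are not passed to the default libtest harness.
    if grep -q "bench = false" arrow/Cargo.toml; then
        echo "Patch not required"
    else
        echo "Patch required"
        cp arrow/Cargo.toml arrow/Cargo.toml.org
        sed '/^\[lib\]/a bench = false' arrow/Cargo.toml.org > arrow/Cargo.toml
        # Assumes parquet/Cargo.toml has no existing [lib] section
        echo '[lib]' >> parquet/Cargo.toml
        echo 'bench = false' >> parquet/Cargo.toml
    fi
}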

function run_bench() {
    # Save Criterion baselines named arrow-<rev> and parquet-<rev>
    (cd arrow && cargo bench -- --save-baseline "arrow-$1")
    (cd parquet && cargo bench -- --save-baseline "parquet-$1")
}


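# Check out the baseline revision and run the benchmarks
git checkout "$1"

patch_cargo_toml
run_bench "$1"

# Discard the Cargo.toml patch, then check out the revision to compare
# and run the benchmarks again
git reset --hard
git checkout "$2"

patch_cargo_toml
run_bench "$2"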
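
The comparison step mentioned above is not part of the script. A minimal sketch of it might look like the following, assuming critcmp is installed and is run from a directory where it can find Criterion's output. The helper names baseline_names and compare_baselines are hypothetical and only for illustration; the baseline names follow the arrow-<rev> / parquet-<rev> pattern that run_bench saves.

```shell
#!/bin/bash
# Hypothetical helper (not part of run_bench.sh): print the two baseline
# names that run_bench.sh saves for one crate.
baseline_names() {
    local crate="$1" base="$2" compare="$3"
    echo "${crate}-${base} ${crate}-${compare}"
}

# Hypothetical helper: compare the saved baselines for arrow and parquet.
compare_baselines() {
    local base="$1" compare="$2" crate names
    for crate in arrow parquet; do
        names="$(baseline_names "$crate" "$base" "$compare")"
        if command -v critcmp >/dev/null 2>&1; then
            # Intentionally unquoted: split into the two baseline names,
            # which critcmp takes as positional arguments.
            critcmp $names
        else
            echo "critcmp not installed; would run: critcmp $names"
        fi
    done
}
```

For the usage example above, this would be called as compare_baselines 11.0.0 12.0.0.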

kazuk · 2022-05-12T01:12:59Z

Why use critcmp?

Criterion's --save-baseline and --baseline options crash when a new benchmark was added somewhere in the base..compare history.

This is fixed in bheisler/criterion.rs#532, but the fix has not been released yet.

alamb added the enhancement Any new improvement worthy of a entry in the changelog label Feb 5, 2022

alamb mentioned this issue Feb 5, 2022

[Website] Blog post for Rust arrow 9 release apache/arrow-site#188

Closed

tustvold added the help wanted label May 6, 2022

tustvold mentioned this issue May 6, 2022

4% over performance degrade on upgrading v11 to v12 #1660

Open

tustvold mentioned this issue Jun 10, 2022

Larger CI Runners to Prevent MIRI OOMing and Improve CI Times #1833

Closed

alamb mentioned this issue Apr 26, 2023

[EPIC] Improved DataFusion Benchmarking apache/datafusion#6126

Closed

1 task

alamb changed the title ~~Automatically run and track arrow / parquet benchmarks overtime~~ Automatically run and track arrow / parquet benchmarks over time Apr 26, 2023
