Automatically run and track arrow / parquet benchmarks over time #1274

Open
Tracked by #6126
alamb opened this issue Feb 5, 2022 · 2 comments
Labels
enhancement (Any new improvement worthy of an entry in the changelog), help wanted

Comments

Contributor

alamb commented Feb 5, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
As discussed by @jonkeane in apache/arrow-site#188 (comment)

It would be nice to track the speed of the Rust benchmarks over time so that:

  1. Performance doesn't regress (get worse) accidentally
  2. We can see the results of improvements over time (and link to them publicly)

Describe the solution you'd like
Figure out how to run the benchmarks (cargo bench ...) in the framework described here: https://lists.apache.org/thread/n98tr38moqvbf1rlqd7n7912mfrqv8ps

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Contributor

kazuk commented May 12, 2022

I wrote a bash script for this.

Usage example: ./dev/bench/run_bench.sh 11.0.0 12.0.0

This saves four baselines: arrow-11.0.0, arrow-12.0.0, parquet-11.0.0, and parquet-12.0.0.

These can then be compared with critcmp.

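#!/bin/bash
#
# Run the benchmarks for two git hashes/branches/tags.
#
# ./dev/bench/run_bench.sh [baseline] [compare]

set -e

if [ $# -ne 2 ]; then
    echo "usage: $0 [baseline] [compare]" >&2
    exit 1
fi

# Set "bench = false" for the library targets so that the default libtest
# bench harness does not reject criterion options such as --save-baseline.
function patch_cargo_toml() {
    if grep -q "bench = false" arrow/Cargo.toml; then
        echo "Patch not required"
    else
        echo "Patch required"
        cp arrow/Cargo.toml arrow/Cargo.toml.org
        # Note: the one-line "a" form is GNU sed syntax.
        sed '/^\[lib\]/a bench = false' arrow/Cargo.toml.org > arrow/Cargo.toml
        echo '[lib]' >> parquet/Cargo.toml
        echo 'bench = false' >> parquet/Cargo.toml
    fi
}

# Run the arrow and parquet benchmarks, saving one baseline per crate.
function run_bench() {
    cd arrow
    cargo bench -- --save-baseline "arrow-$1"
    cd ..
    cd parquet
    cargo bench -- --save-baseline "parquet-$1"
    cd ..
}


# Check out the baseline and run the benchmarks
git checkout "$1"

patch_cargo_toml
run_bench "$1"

# Discard the Cargo.toml patches, then check out compare and run the benchmarks
git reset --hard
git checkout "$2"

patch_cargo_toml
run_bench "$2"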
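
Once both runs have finished, the saved baselines could be compared roughly as follows. This is only a sketch and not part of the script above: the function name compare_baselines and the CRITCMP override variable are hypothetical, and it assumes critcmp is installed (for example with cargo install critcmp) and is run from a directory where it can find the criterion output.

```shell
#!/bin/bash
# Sketch: compare the baselines saved by run_bench.sh, one crate at a time.
# compare_baselines and CRITCMP are illustrative names, not part of the
# original script. CRITCMP lets you substitute another command for critcmp.

# $1 = baseline ref, $2 = compare ref (the same arguments as run_bench.sh)
compare_baselines() {
    local crate
    for crate in arrow parquet; do
        # Baseline names follow the "<crate>-<ref>" pattern used above.
        "${CRITCMP:-critcmp}" "$crate-$1" "$crate-$2"
    done
}

# Example: compare_baselines 11.0.0 12.0.0
```

Comparing per crate keeps the arrow and parquet results in separate tables, matching how the baselines were saved.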

Contributor

kazuk commented May 12, 2022

Why use critcmp?

Criterion's --save-baseline and --baseline options crash when a new benchmark has been added somewhere in the base..compare history.

This is fixed in bheisler/criterion.rs#532, but the fix has not been released yet.
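
Since critcmp reads the saved baseline data rather than re-running the benchmarks, one way to keep results around for tracking over time would be to export each baseline to a JSON file. The sketch below assumes critcmp's --export option; the function name export_baseline, the CRITCMP override variable, and the output directory layout are hypothetical.

```shell
#!/bin/bash
# Sketch: archive a saved criterion baseline as JSON using critcmp --export.
# export_baseline and CRITCMP are illustrative names, not an existing tool.

# $1 = baseline name (for example arrow-11.0.0), $2 = output directory
export_baseline() {
    mkdir -p "$2"
    # critcmp writes the exported baseline to stdout.
    "${CRITCMP:-critcmp}" --export "$1" > "$2/$1.json"
}

# Example: export_baseline arrow-11.0.0 bench-results
```

The exported files can later be passed to critcmp in place of baseline names, so older results stay comparable even after target/ has been cleaned.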

alamb changed the title from "Automatically run and track arrow / parquet benchmarks overtime" to "Automatically run and track arrow / parquet benchmarks over time" on Apr 26, 2023