
Continuous benchmark tracking #27284

Open · dergoegge opened this issue Mar 20, 2023 · 19 comments
@dergoegge (Member)

It would be beneficial to have continuous tracking of our benchmark tests, because regressions (or unexpected improvements) otherwise go undetected (at least for a while). As far as I can tell, the only current use of our benchmarks is to evaluate changes as they are being proposed, but in my opinion that captures only about half of the value that benchmarks can provide.

I am imagining this to be a separate service (maybe integrated with @DrahtBot) that regularly runs the benchmarks in an environment configured for benchmarking. Regressions could be reported by the service through opening issues or sending emails. Additionally, a website that presents the benchmark data with some pretty graphs would be nice (example from firefox's infra).

Setting this up in a way that it is easy to replicate would be very beneficial.

@maflcko (Member) commented Mar 20, 2023

I think @jamesob set something up at one point, but it had to be queried manually, as there were no notifications. Also, I am not sure if it is running at all. See https://codespeed.bitcoinperf.com/timeline/

@jonatack (Contributor) commented Mar 20, 2023

Last week (Thu/Fri/Sat) I proposed to @LarryRuane that he check in with @jamesob about picking up https://bitcoinperf.com/, check with @0xB10C about potential cross-fertilization with tracepoints and their dashboards, and potentially hook it up to the CI or DrahtBot. Also #26957 (comment).

@jonatack (Contributor)

See also #26957 (comment) by @martinus for one nice way, with an example, to create and share detailed benchmark results.

@dergoegge (Member, Author)

Honestly, I think https://codespeed.bitcoinperf.com/ is pretty close to what we want here. It seems like it hasn't been running for a while, but getting it running again and adding some kind of notification system is probably all we need.

@LarryRuane (Contributor)

Yes, this would be very valuable. I'd like to attempt to get this going; @dergoegge, would that be okay? I made a related comment last week before I was aware of these websites (which are definitely better than what I suggested).

@dergoegge (Member, Author)

@LarryRuane cool, please do!

@epompeii

If using https://codespeed.bitcoinperf.com doesn't work out, I have created a continuous benchmarking tool for doing exactly this, Bencher: https://github.com/bencherdev/bencher

Bencher tracks changes over time. It can easily be run in CI as a GitHub Action, and it has statistical thresholds to detect deviations.

@aureleoules (Member)

I was not aware that this issue existed, but I've started working on monitoring benchmark results for pull requests on corecheck. For example: https://corecheck.dev/bitcoin/bitcoin/pulls/28674.
It is still experimental and I am still working on reducing the noise between runs, but as of today I usually don't see more than a 5–6% difference between identical bench runs.
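One consequence of that run-to-run noise is that a raw before/after comparison will produce false alarms; a regression should only be flagged once the slowdown exceeds the observed noise floor. A minimal sketch of that check (the ~6% threshold mirrors the noise level mentioned above; the function and parameter names are hypothetical):

```python
# Flag a benchmark as a regression only if it slows down beyond the
# noise floor observed between identical runs (~6% here, per the
# corecheck numbers above).
NOISE_FLOOR = 0.06  # relative run-to-run noise between identical runs

def is_regression(base_ns: float, pr_ns: float, noise: float = NOISE_FLOOR) -> bool:
    """Return True if the PR result is slower than base beyond the noise floor."""
    if base_ns <= 0:
        raise ValueError("baseline must be positive")
    return (pr_ns - base_ns) / base_ns > noise

print(is_regression(100.0, 104.0))  # → False: a 4% slowdown is within noise
print(is_regression(100.0, 110.0))  # → True: a 10% slowdown is not
```

A fixed threshold like this is the crudest option; tools such as Bencher replace it with statistical thresholds computed from the historical distribution of results.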

@epompeii

@aureleoules that looks really nice!

Would you be interested in plotting that data over time? If so, I can work on ingesting your results into Bencher, similar to how rustls does it: https://bencher.dev/perf/rustls-821705769

@aureleoules (Member)

> Would you be interested in plotting that data over time?

Yes, I plan to display on the homepage a plot of the benchmarks and the test coverage ratio of master over time!

@maflcko (Member) commented Dec 12, 2023

Agree that a plot over time would be useful. There were plots on https://codespeed.bitcoinperf.com/timeline/, but it hasn't run for some years now.

@epompeii

Sounds great!

If you want them to be live-updating, you can embed Bencher plots: click the Share button on the Perf Page and copy the Embed Perf Plot Link for the current plot. This is an example of what that could look like.


@0xB10C (Contributor) commented Apr 8, 2024

> If using https://codespeed.bitcoinperf.com doesn't work out, I have created a continuous benchmarking tool for doing exactly this, Bencher: https://github.com/bencherdev/bencher
>
> Bencher tracks changes over time. It can easily be run in CI as a GitHub Action, and it has statistical thresholds to detect deviations.

For Bitcoin Core, it would be useful to have an adapter for the nanobench JSON output. To track this, I've opened bencherdev/bencher#361.
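For context, running Bencher in CI typically looks like the following GitHub Actions sketch. The project slug, token secret name, and benchmark binary path are placeholders, not anything from this thread; the action and `bencher run` flags follow Bencher's upstream documentation:

```yaml
name: continuous-benchmarking
on:
  push:
    branches: [master]

jobs:
  bench:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Installs the bencher CLI (per Bencher's docs)
      - uses: bencherdev/bencher@main
      - name: Track benchmarks on master
        run: |
          bencher run \
            --project my-project-slug \
            --token "${{ secrets.BENCHER_API_TOKEN }}" \
            --branch master \
            --testbed ubuntu-latest \
            --adapter json \
            "./build/src/bench/bench_bitcoin"
```

The `--adapter json` line is where a nanobench adapter (or a template that emits Bencher's JSON format) would plug in.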

@0xB10C (Contributor) commented Apr 8, 2024

I just learned that nanobench is able to fill in an output format template. It might make sense to try that route first.

@epompeii commented Apr 8, 2024

> For Bitcoin Core, it would be useful to have an adapter for the nanobench JSON output. To track this, I've opened bencherdev/bencher#361.

@0xB10C I would be more than happy to implement a nanobench JSON output adapter. It is going to take me a couple of weeks or so to get to it, though. So you could either:

  1. Use the nanobench output format template to emit Bencher Metric Format
  2. Implement the adapter in Bencher and open a PR
  3. Wait a few weeks and I'll take care of it 😃
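As a rough illustration of what such a conversion involves, here is a standalone Python sketch. The nanobench JSON shape (a `results` array whose entries carry `name` and `median(elapsed)` in seconds) and the Bencher Metric Format keys (`latency` with a `value` in nanoseconds) are assumptions based on each project's documentation, and the sample benchmark name is made up:

```python
import json

def nanobench_to_bmf(nanobench_json: str) -> str:
    """Convert nanobench JSON output to Bencher Metric Format (BMF).

    Assumes nanobench's JSON template output:
      {"results": [{"name": ..., "median(elapsed)": <seconds>, ...}, ...]}
    BMF maps each benchmark name to measures; here "latency" in nanoseconds.
    """
    results = json.loads(nanobench_json)["results"]
    bmf = {
        r["name"]: {"latency": {"value": r["median(elapsed)"] * 1e9}}
        for r in results
    }
    return json.dumps(bmf, indent=2)

# Example with a made-up benchmark result (10 microseconds median):
sample = '{"results": [{"name": "ExampleBench", "median(elapsed)": 1e-05}]}'
print(nanobench_to_bmf(sample))
```

A proper adapter would also carry the error bounds (e.g. nanobench's median absolute percent error) into BMF's `lower_value`/`upper_value` fields rather than just the point estimate.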

@0xB10C (Contributor) commented Apr 10, 2024

I've been playing around with Bencher, running the bench_bitcoin binary in a GitHub Action as a PoC. A sample dashboard is here (however, it takes a while to load for me). While my branch needs a bit of cleanup, it works out of the box, without modifications to nanobench, using a nanobench output template and a custom Bencher metric (seconds instead of the default nanoseconds).
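For reference, nanobench's mustache-style output templates can emit custom formats directly, which is what makes the no-adapter route possible. A hypothetical template along these lines (tag names as in nanobench's template documentation; the output is shaped like Bencher Metric Format entries) could look like:

```
{{#result}}"{{name}}": { "latency": { "value": {{median(elapsed)}} } },
{{/result}}
```

The per-result trailing comma would still need trimming (and the whole thing wrapping in braces) in a post-processing step to form valid JSON.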

@epompeii

> A sample dashboard is here

For others who haven't created an account yet, this is the public perf page.
I have also created a tracking issue to make this sort of redirect the default behavior going forward: bencherdev/bencher#364

> (however, it takes a while till it loads for me)

Yes, my apologies for the long load times. I'm still trying to figure out, design-wise, how I want to handle displaying reports with a lot of benchmarks 😃
This has prompted me to create a tracking issue for this as well: bencherdev/bencher#363

@0xB10C (Contributor) commented Apr 29, 2024

I was made aware of https://bencher.dev/learn/engineering/sqlite-performance-tuning/ recently, and the dashboard seems to load nearly instantly now! Cool. I have this on my list to work on further (at some point). My WIP branch is here if someone else wants to give this a shot.

The next thing to look into is probably adding instruction counts as a measurement. nanobench supports this on Linux, but I'm not sure it is possible in our CI. Wall-clock time might not be an ideal metric to track on a public GitHub runner that might also be running other jobs in parallel and whose hardware may change over time. After that, probably setting up a master job for Statistical Continuous Benchmarking and a PR job for Relative Continuous Benchmarking, à la https://bencher.dev/docs/how-to/track-benchmarks/.
