Replace std's unmaintained bench with criterion #2235

TheBlueMatt · 2023-04-26T16:48:25Z

Rather than using the std benchmark framework (which isn't
maintained and is unlikely to get any further maintenance), we swap
for criterion, which at least gets us a variable number of test
runs so our benchmarks don't take forever.

We also fix the RGS benchmark to pass now that the file in use is
stale compared to today's date.
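
As a rough illustration of what "a variable number of test runs" can mean, here is a minimal std-only sketch that keeps calling a workload until a time budget is used up, instead of running a fixed iteration count. The `bench_with_budget` helper and the summing workload are hypothetical and are not criterion's API; criterion's own sampling and statistics are more involved than this.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper, not part of criterion: calls `f` until `budget`
/// has elapsed (always at least once) and returns the iteration count
/// together with the mean wall-clock time per call in nanoseconds.
fn bench_with_budget<F: FnMut()>(budget: Duration, mut f: F) -> (u64, f64) {
    let start = Instant::now();
    let mut iters: u64 = 0;
    loop {
        f();
        iters += 1;
        if start.elapsed() >= budget {
            break;
        }
    }
    let mean_ns = start.elapsed().as_nanos() as f64 / iters as f64;
    (iters, mean_ns)
}

fn main() {
    // Stand-in workload; a real benchmark would exercise the code under test here.
    let (iters, mean_ns) = bench_with_budget(Duration::from_millis(50), || {
        let sum: u64 = (0..1_000u64).map(black_box).sum();
        black_box(sum);
    });
    println!("{iters} iterations, ~{mean_ns:.0} ns per iteration");
}
```

With a budget like this, a slow workload runs only a handful of times while a fast one runs many, so total benchmark time stays bounded either way.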

TheBlueMatt · 2023-04-28T17:18:05Z

Tagging this for the 0.0.116 milestone, as it is the first step towards a large scorer refactor, which I think we should prioritize because it substantially improves success rates.

dunxen · 2023-05-01T18:26:05Z

Oh this is nice. Actually was unrelatedly looking at criterion. I believe it’s also trivial to generate graphs of benchmark changes with it (or maybe that needs a separate crate locally).

Anyway, concept ACK.

codecov-commenter · 2023-05-01T19:36:55Z

Codecov Report

Patch coverage: 30.00% and project coverage change: +0.61 🎉

Comparison is base (7b64527) 90.91% compared to head (3debd6a) 91.53%.

❗ Current head 3debd6a differs from pull request most recent head 4b27cc4. Consider uploading reports for the commit 4b27cc4 to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2235      +/-   ##
==========================================
+ Coverage   90.91%   91.53%   +0.61%     
==========================================
  Files         104      104              
  Lines       52760    52034     -726     
  Branches    52760    52034     -726     
==========================================
- Hits        47969    47628     -341     
+ Misses       4791     4406     -385

Impacted Files	Coverage Δ
lightning-rapid-gossip-sync/src/lib.rs	`85.13% <ø> (ø)`
lightning/src/chain/keysinterface.rs	`88.75% <ø> (ø)`
lightning/src/lib.rs	`67.74% <ø> (ø)`
lightning/src/ln/channelmanager.rs	`88.80% <ø> (+1.67%)`	⬆️
lightning/src/ln/functional_test_utils.rs	`92.79% <ø> (+0.13%)`	⬆️
lightning/src/routing/gossip.rs	`89.77% <ø> (ø)`
lightning/src/sync/mod.rs	`100.00% <ø> (ø)`
lightning/src/util/test_utils.rs	`76.87% <ø> (+4.49%)`	⬆️
lightning/src/routing/router.rs	`93.68% <28.20%> (-0.75%)`	⬇️
lightning-persister/src/lib.rs	`89.25% <100.00%> (ø)`

... and 23 files with indirect coverage changes


TheBlueMatt · 2023-05-01T21:05:10Z

Addressed comments originally on #2176.

dunxen

Done some pretty extensive test runs (beyond those of last week) with this and not running into any issues. The stats are pretty handy.


tnull

Nice, generally looks good after first pass!

tnull · 2023-05-02T12:42:44Z

lightning/src/routing/router.rs

+	use crate::ln::features::InvoiceFeatures;
+	use crate::routing::gossip::NetworkGraph;
+	use crate::util::config::UserConfig;
+	use crate::util::ser::ReadableArgs;


Seems this shadows the ReadableArgs import on L5660, which is hence unused in cargo test?

They're in separate modules, but the fixup I pushed to switch to the other util makes the one on L5660 unnecessary now.
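
For readers unfamiliar with the scoping rule behind this reply, a small sketch: a `use` declaration is local to the module it appears in, so two sibling modules importing the same trait do not shadow each other, and each needs its own import. All names below (`shapes`, `Area`, `test_like`, `bench_like`) are hypothetical stand-ins, not items from router.rs.

```rust
mod shapes {
    pub trait Area {
        fn area(&self) -> u32;
    }
    pub struct Square(pub u32);
    impl Area for Square {
        fn area(&self) -> u32 {
            self.0 * self.0
        }
    }
}

// Stand-in for a test module: it needs its own import of the trait
// to call `area()`.
mod test_like {
    use super::shapes::{Area, Square};
    pub fn check() -> u32 {
        Square(3).area()
    }
}

// Stand-in for a benchmark module: this import is independent of the
// one in `test_like` and neither shadows the other.
mod bench_like {
    use super::shapes::{Area, Square};
    pub fn run() -> u32 {
        Square(4).area()
    }
}

fn main() {
    assert_eq!(test_like::check(), 9);
    assert_eq!(bench_like::run(), 16);
    println!("both modules resolved their own imports");
}
```

If one of the modules stopped calling the trait method, only that module's import would trigger an unused-import warning.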


dunxen · 2023-05-08T09:21:10Z

LGTM and works for me. Happy for squash

TheBlueMatt · 2023-05-08T16:32:30Z

Let's land #2237 first cause it'll probably conflict a good bit and I'll take the rebase hit.

There's a few route tests which do the same thing as the benchmarks as they're also a good test. However, they didn't share code, which is somewhat wasteful, so we fix that here.

When benchmarking our router, we previously only ever tested with amounts under 1,000 sats, which is an incredibly small amount. While this ensures we have the maximal number of available channels to consider, it prevents our scorer from getting exercise across its range. Further, we only score the immediate path we are expecting to send over, and not randomly but rather based on the amount sent. Here we try to make the benchmarks a bit more realistic by adding a new benchmark which attempts to send around 100K sats, which is a reasonable amount to send over a channel today. We also convert the scoring data to be randomized based on the seed as well as attempt to (possibly) find a new route for a much larger value and score based on that. This potentially allows us to score multiple potential paths between the source and destination as the large route-find may return an MPP result.
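
A minimal sketch of seed-based randomization, assuming nothing about the generator or value ranges the PR actually uses: a tiny xorshift64* PRNG draws amounts around 100K sats, so the same seed always reproduces the same sequence of benchmark inputs. `SeededRng`, `amount_msat`, and the ±50% spread are hypothetical choices made for illustration only.

```rust
/// Hypothetical tiny PRNG (xorshift64*); not the generator used by the PR.
struct SeededRng(u64);

impl SeededRng {
    fn new(seed: u64) -> Self {
        // xorshift state must be nonzero.
        SeededRng(seed.max(1))
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.0 = x;
        x.wrapping_mul(0x2545_F491_4F6C_DD1D)
    }
}

/// Draws an amount in millisatoshis between 50_000 and 150_000 sats
/// (an assumed spread around the 100K sats mentioned above).
fn amount_msat(rng: &mut SeededRng) -> u64 {
    const BASE_SAT: u64 = 100_000;
    let sat = BASE_SAT / 2 + rng.next_u64() % (BASE_SAT + 1);
    sat * 1_000
}

fn main() {
    let mut rng = SeededRng::new(42);
    for _ in 0..3 {
        println!("{} msat", amount_msat(&mut rng));
    }
}
```

Because the sequence depends only on the seed, two benchmark runs with the same seed see identical amounts, which keeps results comparable across runs.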


TheBlueMatt · 2023-05-11T06:12:11Z

Squashed and rebased, more mechanical changes but it changed the patchset a good bit.

TheBlueMatt · 2023-05-16T21:00:41Z

Would love to land this, do y'all have time to re-review this one @dunxen @tnull or should I assign other reviewers? (No problem either way).

dunxen · 2023-05-16T21:07:25Z

> Would love to land this, do y'all have time to re-review this one @dunxen @tnull or should I assign other reviewers? (No problem either way).

I can do a more thorough re-review in the morning. Did you want to land it today still? :)

TheBlueMatt · 2023-05-16T21:31:52Z

No lol, next week is fine, we just are starting to pile up the PRs so trying to move things forward.

dunxen

reACK 4b27cc4

LGTM. Re-reviewed and no objections.

tnull

LGTM

TheBlueMatt mentioned this pull request Apr 26, 2023

Move the historical bucket tracker to 32 unequal sized buckets #2176

Merged

TheBlueMatt added this to the 0.0.116 milestone Apr 28, 2023

TheBlueMatt force-pushed the 2023-04-criterion branch from a9f44aa to b08541f Compare May 1, 2023 18:17

TheBlueMatt force-pushed the 2023-04-criterion branch from b08541f to 51d7777 Compare May 1, 2023 21:04

dunxen reviewed May 2, 2023


tnull reviewed May 2, 2023


TheBlueMatt force-pushed the 2023-04-criterion branch from 51d7777 to 3debd6a Compare May 2, 2023 17:13

TheBlueMatt added 5 commits May 11, 2023 05:42

Unify route benchmarking with route tests

fbaa3c4

There's a few route tests which do the same thing as the benchmarks as they're also a good test. However, they didn't share code, which is somewhat wasteful, so we fix that here.

Add trivial README to bench to describe how to run them.

6ddc88b

Update .gitignore to ignore benchmark data files

4b27cc4

TheBlueMatt force-pushed the 2023-04-criterion branch from 3debd6a to 4b27cc4 Compare May 11, 2023 06:12

TheBlueMatt mentioned this pull request May 15, 2023

Use RUSTFLAGS instead of feature to gate benching #1882

Closed

dunxen approved these changes May 18, 2023


tnull approved these changes May 19, 2023


TheBlueMatt merged commit bada713 into lightningdevkit:main May 20, 2023
14 checks passed
