Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace std's unmaintained bench with criterion #2235

Merged
merged 5 commits into from
May 20, 2023

Conversation

TheBlueMatt
Copy link
Collaborator

Rather than using the std benchmark framework (which isn't
maintained and is unlikely to get any further maintenance), we swap
for criterion, which at least gets us a variable number of test
runs so our benchmarks don't take forever.

We also fix the RGS benchmark to pass now that the file in use is
stale compared to today's date.

@TheBlueMatt
Copy link
Collaborator Author

Tagging 116 as this is the first step towards a large scorer refactor which I think we should prioritize as it substantially improves success rates.

@dunxen
Copy link
Contributor

dunxen commented May 1, 2023

Oh this is nice. Actually was unrelatedly looking at criterion. I believe it’s also trivial to generate graphs of benchmark changes with it (or maybe that needs a separate crate locally).

Anyway, concept ACK.

@codecov-commenter
Copy link

codecov-commenter commented May 1, 2023

Codecov Report

Patch coverage: 30.00% and project coverage change: +0.61 🎉

Comparison is base (7b64527) 90.91% compared to head (3debd6a) 91.53%.

❗ Current head 3debd6a differs from pull request most recent head 4b27cc4. Consider uploading reports for the commit 4b27cc4 to get more accurate results

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2235      +/-   ##
==========================================
+ Coverage   90.91%   91.53%   +0.61%     
==========================================
  Files         104      104              
  Lines       52760    52034     -726     
  Branches    52760    52034     -726     
==========================================
- Hits        47969    47628     -341     
+ Misses       4791     4406     -385     
Impacted Files Coverage Δ
lightning-rapid-gossip-sync/src/lib.rs 85.13% <ø> (ø)
lightning/src/chain/keysinterface.rs 88.75% <ø> (ø)
lightning/src/lib.rs 67.74% <ø> (ø)
lightning/src/ln/channelmanager.rs 88.80% <ø> (+1.67%) ⬆️
lightning/src/ln/functional_test_utils.rs 92.79% <ø> (+0.13%) ⬆️
lightning/src/routing/gossip.rs 89.77% <ø> (ø)
lightning/src/sync/mod.rs 100.00% <ø> (ø)
lightning/src/util/test_utils.rs 76.87% <ø> (+4.49%) ⬆️
lightning/src/routing/router.rs 93.68% <28.20%> (-0.75%) ⬇️
lightning-persister/src/lib.rs 89.25% <100.00%> (ø)

... and 23 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@TheBlueMatt
Copy link
Collaborator Author

Addressed comments originally on #2176.

Copy link
Contributor

@dunxen dunxen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done some pretty extensive test runs (beyond those of last week) with this and not running into any issues. The stats are pretty handy.

lightning-persister/Cargo.toml Outdated Show resolved Hide resolved
lightning/src/routing/router.rs Show resolved Hide resolved
Copy link
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, generally looks good after first pass!

use crate::ln::features::InvoiceFeatures;
use crate::routing::gossip::NetworkGraph;
use crate::util::config::UserConfig;
use crate::util::ser::ReadableArgs;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems this shadows the ReadableArgs in L. 5660, which hence unused in cargo test?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're in separate modules, but the fixup i pushed to switch to the other util makes the one on L5660 uneccessary now.

lightning/src/routing/router.rs Outdated Show resolved Hide resolved
.github/workflows/build.yml Show resolved Hide resolved
@dunxen
Copy link
Contributor

dunxen commented May 8, 2023

LGTM and works for me. Happy for squash

@TheBlueMatt
Copy link
Collaborator Author

Let's land #2237 first cause it'll probably conflict a good bit and I'll take the rebase hit.

There's a few route tests which do the same thing as the benchmarks
as they're also a good test. However, they didn't share code, which
is somewhat wasteful, so we fix that here.
When benchmarking our router, we previously only ever tested with
amounts under 1,000 sats, which is an incredibly small amount.
While this ensures we have the maximal number of available channels
to consider, it prevents our scorer from getting exercise across
its range. Further, we only score the immediate path we are
expecting to to send over, and not randomly but rather based on the
amount sent.

Here we try to make the benchmarks a bit more realistic by adding
a new benchmark which attempts to send around 100K sats, which is
a reasonable amount to send over a channel today. We also convert
the scoring data to be randomized based on the seed as well as
attempt to (possibly) find a new route for a much larger value and
score based on that. This potentially allows us to score multiple
potential paths between the source and destination as the large
route-find may return an MPP result.
Rather than using the std benchmark framework (which isn't
maintained and is unlikely to get any further maintenance), we swap
for criterion, which at least gets us a variable number of test
runs so our benchmarks don't take forever.

We also fix the RGS benchmark to pass now that the file in use is
stale compared to today's date.
@TheBlueMatt
Copy link
Collaborator Author

Squashed and rebased, more mechanical changes but it changed the patchset a good bit.

@TheBlueMatt
Copy link
Collaborator Author

Would love to land this, do y'all have time to re-review this one @dunxen @tnull or should I assign other reviewers? (No problem either way).

@dunxen
Copy link
Contributor

dunxen commented May 16, 2023

Would love to land this, do y'all have time to re-review this one @dunxen @tnull or should I assign other reviewers? (No problem either way).

I can do a more thorough re-review in the morning. Did you want to land it it today still? :)

@TheBlueMatt
Copy link
Collaborator Author

No lol, next week is fine, we just are starting to pile up the PRs so trying to move things forward.

Copy link
Contributor

@dunxen dunxen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reACK 4b27cc4

LGTM. Re-reviewed and no objections.

Copy link
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@TheBlueMatt TheBlueMatt merged commit bada713 into lightningdevkit:main May 20, 2023
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants