
Smooth out channel liquidity bounds decay #1789

Merged
merged 1 commit into lightningdevkit:main on Aug 9, 2023

Conversation

jkczyz
Contributor

@jkczyz jkczyz commented Oct 21, 2022

Decaying the channel liquidity bounds by a half life can result in a large decrease, which may have an oscillating effect on whether a channel is retried. Approximate an additional three-quarter life when half of the next half life has passed to help smooth out the decay.

Fixes #1752.
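In code terms, the smoothing works roughly as follows (a minimal sketch of the approach with illustrative names, not the exact patch): count elapsed time in units of half a half life, decay by whole half lives with a right shift, and apply one extra factor of ~1/sqrt(2) when an odd half-unit remains.

fn decayed_offset_msat(offset_msat: u64, elapsed_secs: u64, half_life_secs: u64) -> u64 {
	// Count elapsed time in units of half a half life.
	let half_decays = match elapsed_secs.checked_div(half_life_secs / 2) {
		None => return 0, // zero half life: treat as fully decayed
		Some(decays) => decays,
	};
	// Halve the offset once per whole half life via a right shift.
	let shifts = (half_decays / 2).min(64) as u32;
	let decayed = offset_msat.checked_shr(shifts).unwrap_or(0);
	if half_decays % 2 == 0 {
		decayed
	} else {
		// Half of the next half life has passed: scale by ~1/sqrt(2).
		(decayed as u128 * 71 / 100) as u64
	}
}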

@jkczyz
Contributor Author

jkczyz commented Oct 21, 2022

Still need to benchmark and probably add some more robust testing. But figured I'd open this as a draft to get some feedback.

@codecov-commenter

codecov-commenter commented Oct 21, 2022

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.01% ⚠️

Comparison is base (dba3e8f) 90.31% compared to head (9a855be) 90.31%.
Report is 130 commits behind head on main.


Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1789      +/-   ##
==========================================
- Coverage   90.31%   90.31%   -0.01%     
==========================================
  Files         106      106              
  Lines       55170    55180      +10     
==========================================
+ Hits        49827    49836       +9     
- Misses       5343     5344       +1     
Files Changed Coverage Δ
lightning/src/routing/scoring.rs 94.36% <100.00%> (+0.26%) ⬆️

... and 1 file with indirect coverage changes


@jkczyz
Contributor Author

jkczyz commented Oct 21, 2022

> Still need to benchmark and probably add some more robust testing. But figured I'd open this as a draft to get some feedback.

I'm seeing no discernible difference in the generate_routes_with_probabilistic_scorer benchmark, even when forcing the code to take the new branch. ~48ms on my machine over 5 runs.

Some(decays) => match elapsed_time.checked_div(half_life / 2) {
None => 0,
Some(half_decays) => {
// Decay the offset by the appropriate number of half lives. If half of the next
// half life has passed, approximate an additional three-quarter life to help
// smooth out the decay.
Contributor

Why are we doing this in such a complicated manner if we're trying to increase precision? We can do things much simpler if we permit floating points:

let halving_count = elapsed_time as f64 / half_life as f64;
let factor = 0.5f64.powf(halving_count);
let decayed_msat = (factor * offset_msat as f64) as u64;

Contributor Author

This isn't about increasing precision -- right shift divides by two, so there would only be a half-msat loss of precision. It's about having less noticeable jumps between two half lives.
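For reference, the plain half-life decay being discussed is just a shift (using the names from the surrounding snippets):

// Each elapsed half life halves the offset. The jump between
// consecutive shift counts is what this PR smooths out.
let decayed_offset_msat = offset_msat >> decays;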

Contributor Author

Ah, I see. Use floating-point division. I can benchmark it, though previously we wanted to avoid floating-point operations.

Contributor

I think you should give it a try. It's the easiest way to avoid these discrete jumps and smooth out the decay.

Collaborator

Division is a pain; I don't think we should divide here, if at all possible. Sadly, just benchmarking it on x86 isn't really sufficient, IMO. We'll want to benchmark, at least, on (a) mobile, ideally both Apple and Samsung/more-default ARM, (b) WASM, again ideally both Apple/ARM and x86, and (c) few-generations-old x86. If we really want to go this route, we should consider just using a timer, see discussion at #1752, which would remove all the complexity or need to consider the performance here.

Collaborator

Oh, I was mostly curious whether the "move to a timer approach" would still be needed with the above changes. Moving to a HashTable should reduce time waiting for memory loads walking the BTree, and I was curious whether the "well, we can do anything and it never shows up in the benchmarks" effect might go away. Should probably also check with the hashbrown feature, which swaps the hash function to ahash, hopefully putting more pressure on the scorer to be faster.

Contributor Author

Getting back to this: I'm seeing 30-32ms when running with your branch and using an identity function for decayed_offset_msat. Same takeaway as before.

Collaborator

Wait, I'm confused, 30-32ms is higher than "Simulating a fixed time with SinceEpoch brings it down to 25-28ms.". That seems like a strange result?

Contributor Author

Even with the identity function, there's still a call to T::now() in as_directed. When T is SinceEpoch, T::now() is just wrapping a Duration.

Collaborator

Okay, last request: can you run with both SinceEpoch and my branch and an identity function? My goal is to understand what would happen with my branch plus replacing the decaying with doing it in a timer, which would imply no time-fetching in the scorer and no decaying. I.e., if we assume my branch lands, what is the performance difference between switching to a timer-based decay and removing both the time and the decay logic, compared to doing more decay logic (but maybe pre-fetching time)?

Ultimately that should inform whether we go the timer-based approach or just pre-fetch time.
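(For context, the timer approach under discussion means something like the sketch below. The struct and function here are hypothetical, not rust-lightning's API: a background task applies the decay periodically, so the hot scoring path never fetches time or decays.)

use std::collections::HashMap;

struct ChannelLiquidity { min_liquidity_offset_msat: u64, max_liquidity_offset_msat: u64 }

// Hypothetical timer-driven decay: run once per half life from a
// background task, halving every offset in place.
fn on_decay_timer(liquidities: &mut HashMap<u64, ChannelLiquidity>) {
	for liquidity in liquidities.values_mut() {
		liquidity.min_liquidity_offset_msat >>= 1;
		liquidity.max_liquidity_offset_msat >>= 1;
	}
}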

@tnull tnull self-requested a review October 21, 2022 17:37
@jkczyz
Contributor Author

jkczyz commented Oct 21, 2022

Pushed the alternative version using floating-point operations for consideration. Could consider feature guarding it if there's a performance concern on specific devices.

arik-so
arik-so previously approved these changes Oct 21, 2022
Contributor

@arik-so arik-so left a comment


looks great!

let elapsed_time = self.now.duration_since(*self.last_updated).as_secs() as f64;
let half_life = self.params.liquidity_offset_half_life.as_secs() as f64;
let decays = elapsed_time / half_life;
(0.5f64.powf(decays) * offset_msat as f64) as u64
Contributor

FWIW, I guess you could also do 2f64.powf() and then use that as the divisor, with some special checked_div function if there's any benefit to that, but I don't know whether there is one.
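If I read the suggestion right, that variant would look roughly like this (illustrative only; per the reply below, it benchmarked slightly slower):

let divisor = 2f64.powf(decays);
let decayed_msat = (offset_msat as f64 / divisor) as u64;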

Contributor Author

Looks a little slower that way (49-50ms).

Contributor

oh wow, that's actually a bunch. Thanks for checking

Contributor Author

That was the absolute runtime, FWIW. So 1-2ms slower than baseline (i.e., shifting).

@TheBlueMatt
Collaborator

I'm super concerned about doing pow and div operations in the scoring logic. I'd much prefer we go the timer approach before we do that.

Contributor Author

@jkczyz jkczyz left a comment

Pushed some fixups based on the original approximation. Was able to remove some checked operations that weren't necessary. Also used a more accurate approximation, using an integer multiplication and division vs. two shifts and an addition. The older approximation was off by 6%.
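Concretely, reconstructing from the description above (the old code isn't shown in this excerpt): the old approximation used two shifts and an addition, the new one a single multiply and divide:

// Old: (x >> 1) + (x >> 2) = 0.75 * x, but the true factor is
// 1/sqrt(2) ~= 0.7071, so this was about 6% too high.
let old_approx = (decayed_offset_msat >> 1) + (decayed_offset_msat >> 2);
// New: 71/100 = 0.71, much closer to 1/sqrt(2).
let new_approx = (decayed_offset_msat as u128 * 71 / 100) as u64;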

Some(decays) => match elapsed_time.checked_div(half_life / 2) {
None => 0,
Some(half_decays) => {
// Decay the offset by the appropriate number of half lives. If half of the next
// half life has passed, approximate an additional three-quarter life to help
// smooth out the decay.
Contributor Author

Need is a strong word. 😛 There is a tradeoff with the added dependency. I'm not entirely convinced it's warranted yet, at least without evidence of a problem.

@TheBlueMatt
Collaborator

Did you benchmark the new div-based solution?

@jkczyz
Contributor Author

jkczyz commented Oct 24, 2022

> Did you benchmark the new div-based solution?

Little more variance on my machine today, but seeing 48-52ms for both versions with the upfront sleep and forcing the branch.

decayed_offset_msat
} else {
// 71 / 100 ~= core::f64::consts::FRAC_1_SQRT_2
(decayed_offset_msat as u128 * 71 / 100) as u64
Collaborator

Can we make this a fraction of a power of two?

Contributor Author

Sure thing.
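With a power-of-two denominator the compiler can reduce the division to a shift; for instance (constant chosen for illustration, approximating the same 1/sqrt(2)):

// 11_585 / 16_384 ~= core::f64::consts::FRAC_1_SQRT_2; the denominator
// is 2^14, so dividing by it is just a right shift.
(decayed_offset_msat as u128 * 11_585 / 16_384) as u64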

@TheBlueMatt
Collaborator

Any desire to move this forward? Would probably be good, though needs rebase.

@jkczyz
Contributor Author

jkczyz commented Jul 10, 2023

> Any desire to move this forward? Would probably be good, though needs rebase.

Forgot that this was lying around still. Would this be for 0.0.117?

@jkczyz jkczyz marked this pull request as ready for review July 10, 2023 19:41
@TheBlueMatt
Collaborator

Yea much too late for 116, would love to make good progress on scoring improvements in 117.

Collaborator

@TheBlueMatt TheBlueMatt left a comment


LGTM, mind squashing?

lightning/src/routing/scoring.rs (outdated review thread, resolved)
Decaying the channel liquidity bounds by a half life can result in a
large decrease, which may have an oscillating effect on whether a
channel is retried. Approximate an additional three-quarter life when
half of the next half life has passed to help smooth out the decay.
Collaborator

@TheBlueMatt TheBlueMatt left a comment


Opened #2483 to track moving to a better time-caching logic.

@TheBlueMatt TheBlueMatt merged commit 7a63ab7 into lightningdevkit:main Aug 9, 2023
13 of 14 checks passed

Successfully merging this pull request may close these issues.

Make ProbabilisticScorer half-life more linear
4 participants