Add an option to make the success probability estimation nonlinear #2547
Conversation
/// min_zero_implies_no_successes signals that an `amount_msat` of 0 means we've not (recently)
/// seen an HTLC successfully complete over this channel.
fn success_probability(
We should think about making this pub so users can figure out what the probability is in the liquidity bounds model. Or we can expose the probability from that model directly.
Force-pushed from 210c193 to 01985b7.
Rebased, now free-standing.
Force-pushed from 57a607e to 6af62f1.
lightning/src/routing/scoring.rs
Outdated
let num = max_pow - (amount - 0.5).powi(3);
let den = max_pow - (min - 0.5).powi(3);

// Because our numerator and denominator max out at 2^-3 we need to multiply them by
both the numerator and denominator max out at 1/8th?
I showed the math instead - 0.5^3.
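For intuition, here is a minimal free-standing sketch of the cubic-CDF probability under discussion. This is illustrative, not the LDK implementation: the function name and the pre-scaling of the liquidity bounds and amount into [0.0, 1.0] are assumptions. As the thread notes, both numerator and denominator max out at 0.5^3 = 1/8 on each side of the midpoint.

```rust
/// Hypothetical sketch: success probability with a PDF proportional to
/// (x - 0.5)^2, i.e. a CDF built from differences of cubes. `min`, `amount`,
/// and `max` are assumed already scaled to the channel capacity in [0.0, 1.0].
fn success_probability(min: f64, amount: f64, max: f64) -> f64 {
    let max_pow = (max - 0.5).powi(3);
    // Probability mass above `amount`, over the mass between the bounds.
    let num = max_pow - (amount - 0.5).powi(3);
    let den = max_pow - (min - 0.5).powi(3);
    num / den
}

fn main() {
    // Sending the known-liquid minimum always succeeds...
    assert!((success_probability(0.0, 0.0, 1.0) - 1.0).abs() < 1e-9);
    // ...and sending the full upper bound never does.
    assert!(success_probability(0.0, 1.0, 1.0).abs() < 1e-9);
    // The midpoint of symmetric bounds lands at 0.5, but the curve between
    // the endpoints is nonlinear (flat in the middle, steep at the edges).
    assert!((success_probability(0.0, 0.5, 1.0) - 0.5).abs() < 1e-9);
    println!("ok");
}
```

Because the antiderivative is cubic, most of the probability mass sits near the channel's edges, matching the "liquidity is pushed towards the edges" assumption this PR encodes.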
Force-pushed from 85514dc to 972b965.
Still reviewing 99dc821.
@@ -1887,14 +1906,13 @@ mod bucketed_history {
	}
	let max_bucket_end_pos = BUCKET_START_POS[32 - highest_max_bucket_with_points] - 1;
	if payment_pos < max_bucket_end_pos {
		let (numerator, denominator) = success_probability(0, payment_pos as u64,
			max_bucket_end_pos as u64, POSITION_TICKS as u64 - 1, params, true);
What's the reason for using POSITION_TICKS for the capacity parameter?
All the parameters here are scaled to POSITION_TICKS (i.e. because they're now positions, rather than msat amounts).
if min_zero_implies_no_successes && min_liquidity_msat == 0 &&
	denominator < u64::max_value() / 21
{
	// If we have no knowledge of the channel, scale probability down by ~75%
	// Note that we prefer to increase the denominator rather than decrease the numerator as
	// the denominator is more likely to be larger and thus provide greater precision. This is
	// mostly an overoptimization but makes a large difference in tests.
	denominator = denominator * 21 / 16
}
Could we instead have a separate function that delegates to success_probability rather than passing a bool?
I was trying to avoid it, since the singular method is really "the model", and how the "no data" bit is interpreted may change in the future to be a part of the greater math.
But it sounds like the parameter is only true for historical probabilities, right? We pass false for the current liquidity estimate.
Currently, yea. I think we probably want to change that in the future, but I didn't want to change everything in one go in the same model change.
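To make the "no data" scaling concrete, here is a hedged sketch (the function name and signature are hypothetical, not LDK's) of the denominator inflation quoted above: multiplying the denominator by 21/16 caps the success probability at 16/21, roughly 76%, which is the "scale down to ~75%" the code comment describes.

```rust
/// Hypothetical helper: when we have never seen a success over a channel and
/// its lower liquidity bound is zero, inflate the denominator by 21/16 so the
/// resulting probability numerator/denominator is capped at 16/21 (~76%).
fn scale_for_no_successes(
    numerator: u64, mut denominator: u64, min_liquidity_msat: u64, saw_success: bool,
) -> (u64, u64) {
    if !saw_success && min_liquidity_msat == 0 && denominator < u64::MAX / 21 {
        // Growing the denominator (rather than shrinking the numerator)
        // preserves more precision, since the denominator is the larger value.
        denominator = denominator * 21 / 16;
    }
    (numerator, denominator)
}

fn main() {
    // A previously-certain success (16/16) is capped at 16/21 with no data...
    assert_eq!(scale_for_no_successes(16, 16, 0, false), (16, 21));
    // ...but left untouched once we've seen a success, or if min > 0.
    assert_eq!(scale_for_no_successes(16, 16, 0, true), (16, 16));
    assert_eq!(scale_for_no_successes(16, 16, 1, false), (16, 16));
}
```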
@@ -1273,13 +1264,12 @@ impl<G: Deref<Target = NetworkGraph<L>>, L: Deref, T: Time> ScoreLookUp for Prob
		_ => {},
	}

-	let amount_msat = usage.amount_msat;
+	let amount_msat = usage.amount_msat.saturating_add(usage.inflight_htlc_msat);
Won't this affect calculations for parameters like base_penalty_amount_multiplier_msat, liquidity_penalty_amount_multiplier_msat, etc? Those help avoid fees dominating the channel cost, but those fees will be calculated using amount_msat.
It would, but the other way around - it would result in score penalties dominating fees, which I think is what we want - if we already have a pending payment for X over a channel, we want to be careful paying more.
Here, cost is penalty plus fees, so I think what I said was accurate, no?
Anyhow, we'll need to update the configuration docs now that this includes in-flight HTLCs.
Then I'm not quite sure what you're saying, I think. The docs don't look out of date given they don't mention how we handle in-flight HTLCs, though we could consider describing it.
There are a few places where we say "payment amount" and sometimes explicitly amount_msat, e.g.:
rust-lightning/lightning/src/routing/scoring.rs
Lines 457 to 464 in a7e3575
/// A multiplier used with the payment amount to calculate a fixed penalty applied to each
/// channel, in excess of the [`base_penalty_msat`].
///
/// The purpose of the amount penalty is to avoid having fees dominate the channel cost (i.e.,
/// fees plus penalty) for large payments. The penalty is computed as the product of this
/// multiplier and `2^30`ths of the payment amount.
///
/// ie `base_penalty_amount_multiplier_msat * amount_msat / 2^30`
I updated a few of them.
Force-pushed from 972b965 to e424cd6.
Codecov Report

@@           Coverage Diff            @@
##             main    #2547     +/-   ##
==========================================
+ Coverage   90.63%   91.15%   +0.52%
==========================================
  Files         113      115        +2
  Lines       59057    90304    +31247
  Branches    59057    90304    +31247
==========================================
+ Hits        53524    82321    +28797
- Misses       5533     7497     +1964
- Partials        0      486      +486

View full report in Codecov by Sentry.
lightning/src/routing/scoring.rs
Outdated

// Because the integral is simply a constant times (x - 0.5)^3, this means we simply
// subtract the two bounds mius 0.5 to the 3rd power.
I did a little refresher on integrals, but I don't understand this sentence. What is the constant here? And how does the second half of the sentence follow from the first?
My understanding is that a 1/3 constant after integrating disappears because we are dividing two integrals.
Errr, sorry, I had a constant multiple (to normalize to a total probability of 1) in my head but not in the above comment. I've removed it and reworded the comment to be both correct and clearer.
Force-pushed from e424cd6 to fa20962.
@@ -1051,8 +1051,10 @@ fn success_probability(
 	// the amount, divided by the same integral from the minimum to the maximum liquidity
 	// bounds.
 	//
-	// Because the integral is simply a constant times (x - 0.5)^3, this means we simply
-	// subtract the two bounds mius 0.5 to the 3rd power.
+	// Because the integral from x to y is simply (y - 0.5)^3 - (x - 0.5)^3, we can
Isn't the integral technically divided by 3? But then dividing the numerator by denominator eliminates it?
Yeah, but in practice the actual CDF is N*((y - 0.5)^3 - (x - 0.5)^3) where N is a function of x and y, because we have to have a total from min to max of 100%/1.
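Spelling out the normalization being discussed (my own worked derivation, with m, M, and a denoting the scaled minimum bound, maximum bound, and amount): the quadratic PDF implied by the cubic antiderivative in this thread normalizes so that both the 1/3 from integration and the constant N cancel in the ratio.

```latex
% Fix N by requiring total probability 1 over the liquidity bounds [m, M]:
\int_m^M N \left(x - \tfrac{1}{2}\right)^2 dx = 1
\;\Longrightarrow\;
N = \frac{3}{\left(M - \tfrac{1}{2}\right)^3 - \left(m - \tfrac{1}{2}\right)^3}
% The success probability for amount a is the mass above a; the 1/3 from the
% antiderivative combines with N, leaving only the difference of cubes:
P(\text{success} \mid a) = N \int_a^M \left(x - \tfrac{1}{2}\right)^2 dx
  = \frac{\left(M - \tfrac{1}{2}\right)^3 - \left(a - \tfrac{1}{2}\right)^3}
         {\left(M - \tfrac{1}{2}\right)^3 - \left(m - \tfrac{1}{2}\right)^3}
```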
Force-pushed from fa20962 to 2057cf0.
Commit message for b9e132a is wrong as it mixes up how linear_success_probability is interpreted.
Fixed the commit message and no-std.
Force-pushed from cff9d5d to 901e721.
Force-pushed from 901e721 to 77342c4.
Feel free to squash the outstanding fixups.
@@ -1017,9 +1035,9 @@ impl<L: Deref<Target = u64>, BRT: Deref<Target = HistoricalBucketRangeTracker>,
		score_params.liquidity_penalty_amount_multiplier_msat)
		.saturating_add(score_params.considered_impossible_penalty_msat)
} else {
	let numerator = (max_liquidity_msat - amount_msat).saturating_add(1);
Could you mention in the commit message why the plus one is removed?
Updated the commit message!
// Add an additional one in the divisor as the payment bucket has been
// rounded down.
(max_bucket_end_pos + 1) as u64;
Why is this no longer required?
There is still a + 1 in the denominator, so it's actually still there for the linear scoring, but really we don't need it anymore because we have way more positions in the buckets. It doesn't really make sense now that we aren't doing the crazy sub-bucket positioning thing, but rather just making everything discrete against 16000 points.
Our "what is the success probability of paying over a channel with the given liquidity bounds" calculation is reused in a few places, and is a key assumption across our main score calculation and the historical bucket score calculations. Here we break it out into a function to make it easier to experiment with different success probability calculations. Note that this drops the numerator +1 in the liquidity scorer, which was added to compensate for the divisor + 1 (which exists to avoid divide-by-zero), making the new math slightly less correct but not by any material amount.
If we are examining a channel for which we have no information at all, we traditionally assume the HTLC success probability is proportional to the channel's capacity. While this may be the case, it is not the case that a tiny payment over a huge channel is guaranteed to succeed, as we assume. Rather, the probability of such success is likely closer to 50% than 100%. Here we try to capture this by simply scaling the success probability for channels where we have no information down linearly. We pick 75% as the upper bound rather arbitrarily - while 50% may be more accurate, it's possible it would lead to an over-reliance on channels which we have paid through in the past, which aren't necessarily always the best candidates. Note that we only do this scaling for the historical bucket tracker, as there we can be confident we've never seen a successful HTLC completion on the given channel. If we were to apply the same scaling to the simple liquidity bounds based scoring we'd penalize channels we've never tried over those we've only ever failed to pay over, which is obviously not a good outcome.
Force-pushed from 77342c4 to 689c2bc.
LGTM. Feel free to squash.
Force-pushed from 689c2bc to f25535b.
Squashed without further changes.
When we started considering the in-flight amounts when scoring, we took the approach of considering the in-flight amount as an effective reduction in the channel's total capacity. When we were scoring using a flat success probability PDF, that was fine; however, in the next commit we'll move to a highly nonlinear one, which makes this a pretty confusing heuristic. Here, instead, we move to considering the in-flight amount as simply an extension of the amount we're trying to send over the channel, which is equivalent for the flat success probability PDF, but makes much more sense in a nonlinear world.
Our "what is the success probability of paying over a channel with the given liquidity bounds" calculation currently assumes the probability of where the liquidity lies in a channel is constant across the entire capacity of a channel. This is obviously a somewhat dubious assumption given most nodes don't materially rebalance and flows within the network often push liquidity "towards the edges". Here we add an option to consider this when scoring channels during routefinding. Specifically, if a new `linear_success_probability` flag is unset on `ProbabilisticScoringFeeParameters`, rather than assuming a PDF of `1` (across the channel's capacity scaled from 0 to 1), we use `(x - 0.5)^2`. This assumes liquidity is likely to be near the edges, which matches experimental results. Further, calculating the CDF (i.e. integral) between arbitrary points on the PDF is trivial, which we do as our main scoring function. While this (finally) introduces floats in our scoring, it's not practical to exponentiate using fixed-precision, and benchmarks show this is a performance regression, but not a huge one, more than made up for by the increase in payment success rates.
Force-pushed from f25535b to f2b2920.
Corrected the docs one more time, but went ahead and squashed: $ git diff-tree -U2 f25535b7 f2b2920b
diff --git a/lightning/src/routing/scoring.rs b/lightning/src/routing/scoring.rs
index 638e8ddf6..6bdf59e85 100644
--- a/lightning/src/routing/scoring.rs
+++ b/lightning/src/routing/scoring.rs
@@ -494,6 +494,6 @@ pub struct ProbabilisticScoringFeeParameters {
/// The purpose of the amount penalty is to avoid having fees dominate the channel cost (i.e.,
/// fees plus penalty) for large payments. The penalty is computed as the product of this
- /// multiplier and `2^20`ths of the payment amount, weighted by the negative `log10` of the
- /// success probability.
+ /// multiplier and `2^20`ths of the amount flowing over this channel, weighted by the negative
+ /// `log10` of the success probability.
///
/// `-log10(success_probability) * liquidity_penalty_amount_multiplier_msat * amount_msat / 2^20`
@@ -530,6 +530,7 @@ pub struct ProbabilisticScoringFeeParameters {
///
/// The purpose of the amount penalty is to avoid having fees dominate the channel cost for
- /// large payments. The penalty is computed as the product of this multiplier and the `2^20`ths
- /// of the payment amount, weighted by the negative `log10` of the success probability.
+ /// large payments. The penalty is computed as the product of this multiplier and `2^20`ths
+ /// of the amount flowing over this channel, weighted by the negative `log10` of the success
+ /// probability.
///
/// This penalty is similar to [`liquidity_penalty_amount_multiplier_msat`], however, instead
$
Our "what is the success probability of paying over a channel with the given liquidity bounds" calculation currently assumes the probability of where the liquidity lies in a channel is constant across the entire capacity of a channel. This is obviously a somewhat dubious assumption given most nodes don't materially rebalance and flows within the network often push liquidity "towards the edges".

Here we add an option to consider this when scoring channels during routefinding. Specifically, if a new `linear_success_probability` flag is set on `ProbabilisticScoringFeeParameters`, rather than assuming a PDF of `1` (across the channel's capacity scaled from 0 to 1), we use `(x - 0.5)^6`. This assumes liquidity is very likely to be near the edges, which matches experimental results. Further, calculating the CDF (i.e. integral) between arbitrary points on the PDF is trivial, which we do as our main scoring function.

While this (finally) introduces floats in our scoring, it's not practical to exponentiate using fixed-precision, and benchmarks show this is a performance regression, but not a huge one, more than made up for by the increase in payment success rates.
Depends on #2176