Skip to content

Conversation

@thomash-acinq
Copy link
Contributor

@thomash-acinq thomash-acinq commented May 16, 2022

Currently when a payment fails we ban the faulty channel for some time (60 seconds by default). There are several problems with that:

  • If we try to relay a huge payment and it fails, we may still be able to relay a smaller payment, but we have temporarily banned the channel
  • If the ban duration is too short, it still won't be able to relay payments when we unban it. Many channels are unbalanced and inactive and if they are considered attractive by our path-finding algorithm (because they're big and cheap), they'll keep coming back and make our payment attempts fail.

In this PR, we introduce a probabilistic estimation of the channel balance to replace this binary behavior. We use information from every payment attempt to update the probability distribution of channels balances. We use a half-life relaxation model to relax our estimations towards the uniform balance distribution baseline when we don't have new data to feed into the estimation engine.

@thomash-acinq thomash-acinq requested a review from t-bast May 16, 2022 16:06
@thomash-acinq thomash-acinq force-pushed the balance-estimate-per-side branch 2 times, most recently from 06e3c41 to 64ccad4 Compare May 17, 2022 07:22
@t-bast
Copy link
Member

t-bast commented May 18, 2022

It make the balance information not symmetrical.

I don't understand why, can you detail this statement?
I believe we can easily keep the information symmetrical, I really don't see why we would lose that property.

Also, the PR doesn't even build, you probably want to fix that.

@thomash-acinq
Copy link
Contributor Author

If you consider the total capacity, being able to send x means that you can't send totalCapacity - x in the other direction. However with maxCapacity, you don't know anything about sending maxCapacity - x in the other direction.

@t-bast
Copy link
Member

t-bast commented May 18, 2022

I understand what you mean, the missing information is that you may send x in one direction through one channel, the other side then cannot send maxCapacity - x through this same channel but could send it through another channel. You really need to detail that in comments in that file otherwise readers will waste understanding this.

I think this really is the right model. Even with totalCapacity, the information actually wasn't symmetrical either because there may be unannounced channels between nodes. I think this PR is going in the right direction, but the build needs to be fixed and this comment should be addressed. Once that's done, I can review it (now that I'm out of my dual funding rabbit hole) and we should be able to integrate this quickly. How does that sound?

EDIT:

Actually on second though, it is more complex than that, I see different limitations to each model:

  • when using the sum of channel capacities, we strongly over-estimate canSend (because we actually can not send more than the biggest channel's capacity in one HTLC)
  • when using the max of channel capacities, we strongly under-estimate canSend (because if we empty one channel, we will think that canSend is now 0, whereas we could potentially still send on other channels)

I think the second model still works better, but on didSend we should not decrease our canSend value (I haven't checked whether that's the behavior of the PR, let me know if it already is).

@thomash-acinq thomash-acinq force-pushed the balance-estimate-per-side branch 4 times, most recently from 6bf5ee2 to e32f06d Compare May 24, 2022 15:40
@thomash-acinq thomash-acinq force-pushed the balance-estimate-per-side branch from e32f06d to 60488df Compare May 25, 2022 16:55
@thomash-acinq thomash-acinq marked this pull request as ready for review May 25, 2022 16:59
@thomash-acinq thomash-acinq force-pushed the balance-estimate-per-side branch from 60488df to 53fa189 Compare May 25, 2022 17:04
@thomash-acinq
Copy link
Contributor Author

Fixed a bunch of things and added some tests. Should be ready now.

@codecov-commenter
Copy link

codecov-commenter commented May 25, 2022

Codecov Report

Merging #2272 (53fa189) into master (9c7ddcd) will increase coverage by 0.11%.
The diff coverage is 96.18%.

@@            Coverage Diff             @@
##           master    #2272      +/-   ##
==========================================
+ Coverage   84.32%   84.43%   +0.11%     
==========================================
  Files         194      195       +1     
  Lines       14411    14508      +97     
  Branches      567      587      +20     
==========================================
+ Hits        12152    12250      +98     
+ Misses       2259     2258       -1     
Impacted Files Coverage Δ
...r/acinq/eclair/payment/send/PaymentLifecycle.scala 87.79% <90.00%> (+0.06%) ⬆️
...main/scala/fr/acinq/eclair/router/Validation.scala 91.22% <90.00%> (+0.78%) ⬆️
...scala/fr/acinq/eclair/router/BalanceEstimate.scala 97.29% <97.29%> (ø)
...re/src/main/scala/fr/acinq/eclair/NodeParams.scala 92.80% <100.00%> (+0.02%) ⬆️
...cinq/eclair/remote/EclairInternalsSerializer.scala 97.77% <100.00%> (ø)
...main/scala/fr/acinq/eclair/router/Monitoring.scala 100.00% <100.00%> (ø)
...src/main/scala/fr/acinq/eclair/router/Router.scala 94.33% <100.00%> (+0.58%) ⬆️
...main/scala/fr/acinq/eclair/io/PeerConnection.scala 85.60% <0.00%> (-0.38%) ⬇️
...cala/fr/acinq/eclair/payment/relay/NodeRelay.scala 97.56% <0.00%> (+1.62%) ⬆️

t-bast added 2 commits June 8, 2022 10:31
We refactor the balance estimates to be bundled with the graph in router data.
This lets the compiler find for us places where we forgot to update
balances, such as when pruning channels.

The changes to Validation.scala are probably easier to compare against
master than comparing them to the base PR, they make the change more
obvious compared to master.

We almost never use the `a max b` syntax in the codebase, we always use a
more explicit function call, so I updated it in BalanceEstimate for
consistency.

This commit doesn't contain meaningful business logic changes and should
be easy to review.
* Fix division by zero
* Add didReceive
* Add tests
@thomash-acinq thomash-acinq merged commit 4722755 into master Jun 8, 2022
@t-bast t-bast deleted the balance-estimate-per-side branch June 8, 2022 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants