Stuck channels because of small fee increase #728

Closed
t-bast opened this issue Jan 17, 2020 · 23 comments · Fixed by #740

Comments

@t-bast
Collaborator

t-bast commented Jan 17, 2020

I've been banging my head because of fee management and stuck channels. I've found a not so unreasonable situation where channels can become completely unusable and I don't see a way to unblock them. I'm probably re-discovering something people already know, and I'm interested in your feedback and potential mitigations @cdecker @rustyrussell @pm47 @Roasbeef @cfromknecht @joostjager. I've tested eclair, c-lightning and lnd and they all behave the same (it seems lnd goes into an infinite loop though, might be worth investigating).

Problem statement

Imagine we have Alice open a 150.000 sat channel to Bob, a feerateperkw of 10.000 sat/kw and a 1% reserve.
The CommitTx fee is 724*10 = 7240 sat when there's no pending HTLC.
The additional fee per HTLC is 172*10 = 1720 sat.

The balance of the channel can then be:

  • 7240+1500 = 8740 sat on Alice's side
  • 141260 sat on Bob's side

At that point the channel is completely valid but is now unusable.
Alice can't send any HTLC to Bob: that's expected, it's not an issue.
However Bob can't send any HTLC to Alice either: when Bob prepares adding this HTLC, the CommitTx weight becomes 724+172. Bob notices that sending this HTLC would make Alice unable to afford the fee for this updated CommitTx, so Bob aborts and doesn't send the HTLC (regardless of the HTLC amount, unless it's below dust). The only way to unblock the channel is to wait for a feerate decrease of at least 19.2% (172/(172+724)). And when that feerate decrease happens, you need to send an HTLC Bob -> Alice before the fee increases again, otherwise you missed your fix window.
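The arithmetic above can be checked with a small sketch. It assumes the simplified model used in this issue: the funder pays `weight * feerate_per_kw / 1000` (rounded down), every HTLC is untrimmed, and the helper names are hypothetical rather than taken from any implementation.

```python
COMMIT_WEIGHT = 724   # weight of a CommitTx without HTLCs (figure from the issue)
HTLC_WEIGHT = 172     # extra weight per untrimmed HTLC (figure from the issue)


def commit_fee_sat(feerate_per_kw: int, num_htlcs: int) -> int:
    """Fee paid by the funder for a CommitTx carrying num_htlcs untrimmed HTLCs."""
    weight = COMMIT_WEIGHT + HTLC_WEIGHT * num_htlcs
    return weight * feerate_per_kw // 1000


def funder_can_afford(balance_sat: int, reserve_sat: int,
                      feerate_per_kw: int, num_htlcs: int) -> bool:
    """True if the funder keeps its reserve after paying the CommitTx fee."""
    return balance_sat - commit_fee_sat(feerate_per_kw, num_htlcs) >= reserve_sat


capacity = 150_000
reserve = capacity // 100                      # 1% reserve
feerate = 10_000                               # sat/kw
alice = commit_fee_sat(feerate, 0) + reserve   # Alice sits exactly at fee + reserve
bob = capacity - alice

valid_now = funder_can_afford(alice, reserve, feerate, 0)     # channel state is valid
bob_can_add = funder_can_afford(alice, reserve, feerate, 1)   # one HTLC from Bob
shortfall = commit_fee_sat(feerate, 1) + reserve - alice      # what Alice is missing
required_decrease = HTLC_WEIGHT / (COMMIT_WEIGHT + HTLC_WEIGHT)

print(alice, bob, valid_now, bob_can_add, shortfall, f"{required_decrease:.1%}")
```

In this model Alice is short by exactly the fee of one HTLC (1720 sat), which is why the amount Bob wants to send does not matter.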

Proposed solution

Now why doesn't Bob send the HTLC? In my opinion Bob can safely send the HTLC; however Alice takes a risk accepting it (and currently Alice would reject the HTLC in all 3 implementations). The only risk I see is that if Bob broadcasts the updated CommitTx while the HTLC is pending, the fee will be lower than the channel's feerate, so the transaction may take a while to confirm.

If the HTLC is fulfilled, but the CommitTx doesn't confirm, there will be an on-chain race condition between Alice's HTLC-success and Bob's HTLC timeout; if Bob wins the race, Bob will have stolen the HTLC amount from Alice.

It seems to me that Alice could easily take this risk; the estimated fee is supposed to be quite high already (the spec says "the current fee rate is sufficient (by a significant margin) for timely processing of the commitment transaction"), so even with an added 23.8% weight (172/724) to the commit tx it should confirm in time to avoid the race condition. But maybe we should restrict that case to a single added HTLC (otherwise the weight can grow unbounded); that would be sufficient to unblock the channel. Would that be too reckless in your opinion? Or is it a reasonable trade-off to avoid this problem?

Notes (readers may skip)

CPFP / Anchor outputs

With CPFP on the CommitTx (anchor outputs) we should allow such HTLCs; if Bob broadcasts the CommitTx and it doesn't confirm, just CPFP it before the race condition is reached.

But I'd like to fix this in the shorter term too if possible :).

Reaching the faulty state

It's easy to reach that faulty state for testing/repro by using the push_msat in the open_channel.
Just set push_msat to 141260000 (with the values used in this issue) and you should be in that state.

In real life though that's not how channels will end up in this state. Alice won't be able to send this amount via HTLCs because before sending, we take into account the weight this HTLC adds to the CommitTx (172*feerate).

However imagine the feerate decreases to 8.000 sat/kw. Alice can now send an HTLC of 141260 sat to Bob. Then the fee increases back to 10.000 sat/kw after the HTLC has been fulfilled and removed from the CommitTx: the update_fee is thus valid, and now Alice and Bob have a stuck channel.
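The sequence described above (feerate drops, Alice sends, feerate rises again) can be replayed step by step. This is a sketch under the same simplified fee model as the rest of the issue; the function names are hypothetical.

```python
COMMIT_WEIGHT, HTLC_WEIGHT = 724, 172   # weights quoted in the issue


def commit_fee_sat(feerate_per_kw, num_htlcs):
    """Funder's CommitTx fee, rounded down, with num_htlcs untrimmed HTLCs."""
    return (COMMIT_WEIGHT + HTLC_WEIGHT * num_htlcs) * feerate_per_kw // 1000


def funder_balance_ok(balance_sat, reserve_sat, feerate_per_kw, num_htlcs):
    """True if the funder keeps its reserve after paying the CommitTx fee."""
    return balance_sat - commit_fee_sat(feerate_per_kw, num_htlcs) >= reserve_sat


reserve = 1_500
alice = 150_000      # Alice funded the channel; Bob starts with nothing
htlc = 141_260

# At 10.000 sat/kw Alice cannot offer this HTLC: she could not pay for its weight.
can_send_at_10k = funder_balance_ok(alice - htlc, reserve, 10_000, 1)
# After a decrease to 8.000 sat/kw the same HTLC fits.
can_send_at_8k = funder_balance_ok(alice - htlc, reserve, 8_000, 1)

# The HTLC is fulfilled and removed, then update_fee restores 10.000 sat/kw.
alice -= htlc
update_fee_valid = funder_balance_ok(alice, reserve, 10_000, 0)
bob_can_add_htlc = funder_balance_ok(alice, reserve, 10_000, 1)

print(can_send_at_10k, can_send_at_8k, update_fee_valid, bob_can_add_htlc)
```

Every individual step passes its own check, yet the final state is the stuck one from the problem statement.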

@TheBlueMatt
Collaborator

I may be missing some details of the issue here, but it seems like an alternative and somewhat simpler solution would be for Alice to artificially limit herself from getting that close to the reserve value. Ignoring fee updates, if Alice ensures that she always has enough local balance to pay for, eg, five more HTLCs then Bob would have no problem sending. It also would be neatly backwards compatible protocol-wise.

@halseth
Contributor

halseth commented Jan 20, 2020

This is something we've had users reporting before (lightningnetwork/lnd#3429) and the workaround has been to "increase" the fee balance available to Alice by having Bob send dust HTLCs. This is obviously not a good long-term solution, but I was hoping this edge case was rare enough to wait for the nondeterminism of update_fee to be a thing of the past, which would fix this by itself.

This together with the failure case of having simultaneous update_adds leading to an invalid channel state were two big motivation factors behind the work on anchor outputs.

(it seems lnd goes into an infinite loop though, might be worth investigating).

Are you talking about a loop when attempting a payment here? If yes, this is something we are working on in lightningnetwork/lnd#3787

@halseth
Contributor

halseth commented Jan 20, 2020

I may be missing some details of the issue here, but it seems like an alternative and somewhat simpler solution would be for Alice to artificially limit herself from getting that close to the reserve value. Ignoring fee updates, if Alice ensures that she always has enough local balance to pay for, eg, five more HTLCs then Bob would have no problem sending. It also would be neatly backwards compatible protocol-wise.

Won't this just make the problem appear earlier? Instead of Alice rejecting new HTLCs because it will dip her below the reserve, she'll reject HTLCs since it will dip her below this reserve+5*htlcFee limit.

Maybe I'm just not understanding exactly how this soft limit would work from Alice's POV.

@TheBlueMatt
Collaborator

Won't this just make the problem appear earlier? Instead of Alice rejecting new HTLCs because it will dip her below the reserve, she'll reject HTLCs since it will dip her below this reserve+5*htlcFee limit.

Nono, the opposite - the issue is that Alice the initiator gets herself into too little reserve by sending too much towards Bob. If Alice stops sending HTLCs towards Bob before reaching the point where she's close to her own reserve, then Bob can always send back towards Alice.

@halseth
Contributor

halseth commented Jan 21, 2020

Ah, makes sense! So Alice will keep another "reserve" for fees for Bob's HTLCs. That's quite elegant, I think that would work :) Still it won't handle massive feerate changes in all cases, but the larger you make this buffer the less likely this situation is to occur.
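A minimal sketch of the extra funder-side buffer discussed above, assuming the issue's simplified fee model and a buffer of five HTLCs (the "eg, five" from the suggestion). The function names and the exact form of the rule are hypothetical, not taken from any implementation.

```python
COMMIT_WEIGHT, HTLC_WEIGHT = 724, 172   # weights quoted in the issue


def commit_fee_sat(feerate_per_kw, num_htlcs):
    """Funder's CommitTx fee, rounded down, with num_htlcs untrimmed HTLCs."""
    return (COMMIT_WEIGHT + HTLC_WEIGHT * num_htlcs) * feerate_per_kw // 1000


def funder_may_offer(balance_sat, amount_sat, reserve_sat, feerate_per_kw,
                     pending_htlcs=0, extra_htlcs=5):
    """Hypothetical funder-side rule: after offering the HTLC, still be able
    to pay for extra_htlcs more HTLCs without touching the reserve."""
    fee = commit_fee_sat(feerate_per_kw, pending_htlcs + 1 + extra_htlcs)
    return balance_sat - amount_sat - fee >= reserve_sat


def peer_can_add_htlc(funder_balance_sat, reserve_sat, feerate_per_kw):
    """Fundee-side check from the issue: can the funder pay for one more HTLC?"""
    return funder_balance_sat - commit_fee_sat(feerate_per_kw, 1) >= reserve_sat


alice, reserve, feerate = 150_000, 1_500, 10_000

# The payment that led to the lockup (sent at 8.000 sat/kw) is now refused.
lockup_payment_allowed = funder_may_offer(alice, 141_260, reserve, 8_000)

# Largest amount the rule lets through at 10.000 sat/kw, and Alice's balance
# once that HTLC has been fulfilled and removed.
largest_ok = alice - reserve - commit_fee_sat(feerate, 6)
alice_after = alice - largest_ok

usable = peer_can_add_htlc(alice_after, reserve, feerate)
usable_if_fees_double = peer_can_add_htlc(alice_after, reserve, 2 * feerate)

print(lockup_payment_allowed, largest_ok, alice_after, usable, usable_if_fees_double)
```

With these numbers Bob can still send after Alice has drained as much as the rule allows, but a doubling of the feerate would lock the channel again, which is the caveat about massive feerate changes.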

@t-bast
Collaborator Author

t-bast commented Jan 21, 2020

Are you talking about a loop when attempting a payment here? If yes, this is something we are working on in lightningnetwork/lnd#3787

Yes that's exactly this. Good to know it's being worked on, nothing new then.

@t-bast
Collaborator Author

t-bast commented Jan 21, 2020

Thanks for the feedback guys! An additional reserve is indeed a good idea for a short-term fix without protocol changes, that's probably good enough to mitigate the issue until CPFP on CommitTx lands.

It feels a bit sad though because isn't that what the existing reserve could be used for?

t-bast added a commit to ACINQ/eclair that referenced this issue Jan 27, 2020
Otherwise we may run into the following issue:
lightning/bolts#728
@t-bast
Collaborator Author

t-bast commented Feb 4, 2020

I'd like to point out that @TheBlueMatt's suggestion only works if all implementations do this, and only if they do this with the same hard-coded number of htlcs.

Imagine for example that Alice's node implements this, but Bob's does not.
Alice will avoid sending too much to Bob, which is good: the channel won't be stuck with all the balance on Bob's side.
However Bob may try to send too much to Alice, which Alice will reject because of the new rule.
Bob's software won't understand why the HTLC is rejected and won't know that it needs to send slightly less. This is particularly visible when Bob does an "empty wallet" where he sends all of his balance out for a swap-out for example. Bob's software will display a balance of X, but sending those X will always fail. Then it's all downhill with tweet-shaming, "what did you do with my money"-kind of thing, and all types of ugliness which we'd like to avoid.

If we do agree that a mitigation to this issue needs to be implemented by all lightning implementations, I believe we currently have two solutions:

  1. Artificial new reserve (5*htlcFee)
  2. When stuck, allow exactly one HTLC: this could be an issue if fees are rising too much, because after the HTLC-fulfill the sender of the HTLC could force close (instead of signing the new commit-tx) and hope to win the on-chain race with their HTLC-timeout

I'm leaning towards the safer option 1.
Please vote by liking this comment with 🚀 for option 1 or 👀 for option 2.
I will wait for at least one vote from each implementation before submitting a spec PR @halseth @cdecker @rustyrussell @TheBlueMatt

@ZmnSCPxj
Collaborator

ZmnSCPxj commented Feb 11, 2020

@m-schmoock has discovered an even worse way to trigger this, remotely, by having Alice send trimmed HTLCs to send over the fee that the first HTLC used to have. This triggers this even without an onchain feerate increase I think. ElementsProject/lightning#3498

rustyrussell added a commit to rustyrussell/lightning that referenced this issue Feb 11, 2020
Add new check if we're funder trying to add HTLC, keeping us
with enough extra funds to pay for another HTLC the peer might add.

Changelog-Fixed: Corner case where channel could become unusable (lightning/bolts#728)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@t-bast
Collaborator Author

t-bast commented Feb 11, 2020

Good catch, that's another way to lock the channels. We should definitely do something about that, sooner than later IMHO.
I was waiting for feedback before opening a spec PR, but looking at the discussions you had on the c-lightning repo I believe you're also leaning more towards the "additional reserve" solution (where we would just need to bikeshed the multiplier).
I'll open a PR today to get things moving.

rustyrussell added a commit to rustyrussell/lightning that referenced this issue Feb 11, 2020
Add new check if we're funder trying to add HTLC, keeping us
with enough extra funds to pay for another HTLC the peer might add.

We also need to adjust the spendable_msat calculation, and update
various tests which try to unbalance channels.  We eliminate
the now-redundant test_channel_drainage entirely.

Changelog-Fixed: Corner case where channel could become unusable (lightning/bolts#728)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
cdecker pushed a commit to ElementsProject/lightning that referenced this issue Feb 11, 2020
Add new check if we're funder trying to add HTLC, keeping us
with enough extra funds to pay for another HTLC the peer might add.

We also need to adjust the spendable_msat calculation, and update
various tests which try to unbalance channels.  We eliminate
the now-redundant test_channel_drainage entirely.

Changelog-Fixed: Corner case where channel could become unusable (lightning/bolts#728)
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
t-bast added a commit that referenced this issue Feb 11, 2020
Add an additional "reserve" for funders on top of the real reserve to
avoid getting in a state where the channel is unusable because
of the increased commit tx cost of a new HTLC.

Requirements are only added for the funder sending an HTLC.
Fundee receiving HTLCs may choose to verify that funders apply
this, but it may lead to an unusable UX.

Fixes #728.
t-bast added a commit to ACINQ/eclair that referenced this issue Feb 11, 2020
See lightning/bolts#728

Add an additional reserve on the funder to prevent emptying and then
being stuck with an unusable channel.

As fundee we don't verify funders comply with that change.
We may enforce it in the future when we're confident the network as a
whole enforces that.
t-bast added a commit to ACINQ/eclair that referenced this issue Feb 12, 2020
See lightning/bolts#728

Add an additional reserve on the funder to prevent emptying and then
being stuck with an unusable channel.

As fundee we don't verify funders comply with that change.
We may enforce it in the future when we're confident the network as a
whole enforces that.
@TheBlueMatt
Collaborator

@t-bast I don't see a reason why Bob needs to care - HTLCs can be rejected for any reason by Alice (including "I'm feeling sleepy today, I don't feel like dealing with a forward"). Bob either has to close in response or deal with it.

@ZmnSCPxj correct. The meeting discussion centered around the fact that this really has little to do with fees, and more to do with reserves.

@t-bast
Collaborator Author

t-bast commented Feb 13, 2020

I don't see a reason why Bob needs to care

I agree that we can definitely do without it, especially at first to unblock the situation (and this is what I proposed in #740) but that provides a poorer UX.

If Bob's software tells him "hey your balance is X, you can send up to X to Alice" and then sending X to Alice consistently fails, that's weird. We've had support tickets for less than that 😄. Right now lnd, c-lightning and eclair all apply fee and reserve conditions symmetrically, so when it says "you can send X" you can really send that amount regardless of what implementation your counter-party runs, which is nice.

@m-schmoock
Collaborator

m-schmoock commented Feb 13, 2020

I really like Rusty's proposal that allows fundee to dip into funder reserves (once). (ElementsProject/lightning#3501)

This will not add additional reserve margins and does not require assumptions on how fees might behave. However, this behavior must be known to and accepted by the peer at a protocol level.

Edit: We have to make sure he does not dip too deep into reserves :/ Ideas?

@t-bast
Collaborator Author

t-bast commented Feb 13, 2020

I really like Rusty's proposal that allows fundee to dip into funder reserves (once). (ElementsProject/lightning#3501)

I also believe the reserve could be used for that (it was mentioned earlier in this issue).
But right now this would require a change on all implementations, because lnd, c-lightning and eclair will not even send such an HTLC because they know the remote cannot afford the increased fee.
So it would need a change on all implementations (which can be ok IMHO but needs others to ACK).

@halseth
Contributor

halseth commented Feb 13, 2020

It sounds like it could open another can of worms, since you would need to validate the size of the "remaining" channel reserve. It would also make it impossible to have channels with zero reserve (because then you could easily get into this situation again). In high fee scenarios you could even risk that the reserve won't be enough to cover the HTLC fee.

Not saying it won't work, but if we have to start altering the spec to mitigate this issue, maybe it's worth starting to think about how to fundamentally fix it? (the party offering the HTLC also pays fees)
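To make the "dip into the reserve once" idea and the concern about reserve size concrete, here is a rough sketch. It is one possible reading of the proposal, not the code from ElementsProject/lightning#3501: the rule that a dip is only allowed when no other HTLC is pending, and all names, are assumptions.

```python
COMMIT_WEIGHT, HTLC_WEIGHT = 724, 172   # weights quoted in the issue


def commit_fee_sat(feerate_per_kw, num_htlcs):
    """Funder's CommitTx fee, rounded down, with num_htlcs untrimmed HTLCs."""
    return (COMMIT_WEIGHT + HTLC_WEIGHT * num_htlcs) * feerate_per_kw // 1000


def fundee_may_offer(funder_balance_sat, reserve_sat, feerate_per_kw, pending_htlcs):
    """Hypothetical fundee-side rule for offering one more HTLC to the funder."""
    fee_after = commit_fee_sat(feerate_per_kw, pending_htlcs + 1)
    left = funder_balance_sat - fee_after
    if left >= reserve_sat:
        return True   # normal case: the funder's reserve is untouched
    # Assumed exception: a single dip, only when nothing else is pending,
    # and only if the funder can still pay the whole fee.
    return pending_htlcs == 0 and left >= 0


feerate = 10_000

# State from the issue: 1% reserve (1500 sat) is smaller than one HTLC's fee (1720 sat).
stuck_1pct = 7_240 + 1_500
dip_1pct = fundee_may_offer(stuck_1pct, 1_500, feerate, 0)

# Same situation with a 2% reserve (3000 sat): the dip is enough.
stuck_2pct = 7_240 + 3_000
dip_2pct = fundee_may_offer(stuck_2pct, 3_000, feerate, 0)
reserve_left = stuck_2pct - commit_fee_sat(feerate, 1)

# A second HTLC while the first is still pending is refused.
second_dip = fundee_may_offer(stuck_2pct, 3_000, feerate, 1)

print(dip_1pct, dip_2pct, reserve_left, second_dip)
```

With the numbers from the problem statement the whole reserve is still 220 sat short of the increased fee, so under this reading the dip alone would not unblock that particular channel.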

@t-bast
Collaborator Author

t-bast commented Feb 13, 2020

you would need to validate the size of the "remaining" channel reserve.

That can indeed get nasty, as @ZmnSCPxj mentioned in ElementsProject/lightning#3501. It does require more thought and experiments before we can safely do that.

It would also make it impossible to have channels with zero reserve (because then you could easily get into this situation again).

Only for the funder, you can still have 0-reserve if you're the fundee, right?

I think that the fundamental change is that once there's a way to do CPFP and RBF (anchor outputs), we can simply allow HTLCs regardless of whether the increased commit tx fee can be paid or not. If no-one closes the channel the HTLCs will resolve and the situation gets back to normal, and if someone broadcasts the commit tx hoping to steal funds with an on-chain race-condition, the attacked party can CPFP to prevent the attacker from stealing funds.

@halseth
Contributor

halseth commented Feb 13, 2020

I think that the fundamental change is that once there's a way to do CPFP and RBF (anchor outputs), we can simply allow HTLCs regardless of whether the increased commit tx fee can be paid or not. If no-one closes the channel the HTLCs will resolve and the situation gets back to normal, and if someone broadcasts the commit tx hoping to steal funds with an on-chain race-condition, the attacked party can CPFP to prevent the attacker from stealing funds.

This is true in a no-commitfee world, since we would still need to keep the fee above the min-relay fee. With package relay and anchor outputs this will be much simplified indeed :)

@t-bast
Collaborator Author

t-bast commented Feb 13, 2020

we would still need to keep the fee above the min-relay fee.

Arf I tend to forget about this, thanks for reminding me about this annoying detail :)

@TheBlueMatt
Collaborator

We would lose the current assumption that your honest peer never should send you updates you won't accept

I don't know how we could do that? There is no way to reject an update your counterparty sent you without first accepting the commitment transaction that does so.

It sounds like it could open another can of worms, since you would need to validate the size of the "remaining" channel reserve

Agreed here. Trying to reduce the effective reserve value from today doesn't make sense to me - it's there and selected to be a value that has specific meaning. If we aren't able to meet the security requirements of said meaning, we should add a second reserve, not remove the existing one.

@joostjager
Collaborator

I agree with @halseth that we can have a transition period in which the protocol isn't changed and also no hard rejection of certain messages is implemented.

If all implementations make sure that they never send htlcs or fee updates that would push either one of the parties into the unusable state, isn't the problem going away when the network upgrades?

@m-schmoock
Collaborator

@joostjager

make sure that they never send htlcs or fee updates that would push either one of the parties into the unusable state...

Yes, but fee updates are there for a reason; we can't just suppress or delay them indefinitely if the remote happens to run very low on capacity or fees happen to have changed in an unfortunate way. Not accepting a (trimmed or untrimmed) HTLC when already very low is an option, but then you have to make assumptions about when you think an HTLC would still be 'acceptable', which again depends on how the fees will behave in the future.

As interim mitigation, latest c-lightning will now raise CAPACITY_EXCEEDED on a new HTLC if a fee increase of 50% would lead to a lockup situation, see: ElementsProject/lightning@86c28b2 . Not elegant, but it reduces the risk of running into this situation for now.
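As an illustration only, the interim check could look roughly like the sketch below. It is a reading of this thread (the funder keeps enough to pay for one more HTLC from the peer, evaluated at a feerate 50% higher), not c-lightning's actual code; the function name, parameters and fee model are assumptions.

```python
COMMIT_WEIGHT, HTLC_WEIGHT = 724, 172   # weights quoted in the issue


def commit_fee_sat(feerate_per_kw, num_htlcs):
    """Funder's CommitTx fee, rounded down, with num_htlcs untrimmed HTLCs."""
    return (COMMIT_WEIGHT + HTLC_WEIGHT * num_htlcs) * feerate_per_kw // 1000


def funder_add_htlc_allowed(balance_sat, amount_sat, reserve_sat, feerate_per_kw,
                            pending_htlcs=0, fee_margin_percent=50):
    """Hypothetical funder-side check: refuse the HTLC if, after a feerate
    increase of fee_margin_percent, the funder could no longer pay for this
    HTLC plus one more HTLC added by the peer."""
    stressed_feerate = feerate_per_kw * (100 + fee_margin_percent) // 100
    fee = commit_fee_sat(stressed_feerate, pending_htlcs + 2)
    return balance_sat - amount_sat - fee >= reserve_sat


alice, reserve = 150_000, 1_500

# Largest amount that passes at 10.000 sat/kw in this model.
max_amount = alice - reserve - commit_fee_sat(15_000, 2)
at_limit = funder_add_htlc_allowed(alice, max_amount, reserve, 10_000)
over_limit = funder_add_htlc_allowed(alice, max_amount + 1, reserve, 10_000)

# The payment from the issue, sent after the drop to 8.000 sat/kw, is refused.
issue_payment = funder_add_htlc_allowed(alice, 141_260, reserve, 8_000)

print(max_amount, at_limit, over_limit, issue_payment)
```

The margin only moves the threshold: a feerate increase larger than the chosen percentage can still lead to the lockup, which is why it is described as an interim mitigation.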

@joostjager
Collaborator

Creating more leeway is fine. But then still you have to decide whether to update or not update the fee if the update would push you into the lockup situation. Do you choose a fee that is too low or do you lock yourself up?

t-bast added a commit that referenced this issue Feb 24, 2020
Allow funders to dip into their channel reserve once to be able
to pay the increased commit tx fee for a pending HTLC.

This prevents channels from getting in a state where the channel is
unusable because of the increased commit tx cost of a new HTLC.

Fixes #728.
t-bast added a commit to ACINQ/eclair that referenced this issue Feb 24, 2020
See lightning/bolts#728

Allow a funder to dip into its channel reserve to pay the increased
commit tx fee for one incoming HTLC.
This prevents being stuck with an unusable channel.
t-bast added a commit to ACINQ/eclair that referenced this issue Mar 3, 2020
See lightning/bolts#728

Add an additional reserve on the funder to prevent emptying and then
being stuck with an unusable channel.

As fundee we don't verify funders comply with that change.
We may enforce it in the future when we're confident the network as a
whole enforces that.
t-bast added a commit that referenced this issue Mar 3, 2020
Add an additional "reserve" for funders on top of the real reserve to
avoid getting in a state where the channel is unusable because
of the increased commit tx cost of a new HTLC.

Requirements are only added for the funder sending an HTLC.
Fundee receiving HTLCs may choose to verify that funders apply
this, but it may lead to an unusable UX.

Fixes #728.
t-bast added a commit to ACINQ/eclair that referenced this issue Mar 11, 2020
See lightning/bolts#728

Add an additional reserve on the funder to prevent emptying and then
being stuck with an unusable channel.

As fundee we don't verify funders comply with that change.
We may enforce it in the future when we're confident the network as a
whole enforces that.
t-bast added a commit that referenced this issue Apr 27, 2020
Add an additional "reserve" for funders on top of the real reserve to
avoid getting in a state where the channel is unusable because
of the increased commit tx cost of a new HTLC.

Requirements are only added for the funder sending an HTLC.
Fundee receiving HTLCs may choose to verify that funders apply
this, but it may lead to an unusable UX.

Fixes #728.
remyers added a commit to remyers/eclair that referenced this issue Aug 18, 2023
remyers added a commit to remyers/eclair that referenced this issue Aug 29, 2023