Stuck channels because of small fee increase #728

t-bast · 2020-01-17T14:37:41Z

I've been banging my head because of fee management and stuck channels. I've found a not so unreasonable situation where channels can become completely unusable and I don't see a way to unblock them. I'm probably re-discovering something people already know, and I'm interested in your feedback and potential mitigations @cdecker @rustyrussell @pm47 @Roasbeef @cfromknecht @joostjager. I've tested eclair, c-lightning and lnd and they all behave the same (it seems lnd goes into an infinite loop though, might be worth investigating).

Problem statement

Imagine we have Alice open a 150.000 sat channel to Bob, a feerateperkw of 10.000 sat/kw and a 1% reserve.
The CommitTx fee is 724*10 = 7240 sat when there's no pending HTLC.
The additional fee per HTLC is 172*10 = 1720 sat.

The balance of the channel can then be:

7240+1500 = 8740 sat on Alice's side
141260 sat on Bob's side

At that point the channel is completely valid but is now unusable.
Alice can't send any HTLC to Bob: that's expected, it's not an issue.
However Bob can't send any HTLC to Alice either: when Bob prepares adding this HTLC, the CommitTx weight becomes 724+172. Bob notices that sending this HTLC would make Alice unable to afford the fee for this updated CommitTx, so Bob aborts and doesn't send the HTLC (regardless of the HTLC amount, unless it's below dust). The only way to unblock the channel is to wait for a feerate decrease of at least 19.2% (172/(172+724)). And when that feerate decrease happens, you need to send an HTLC Bob -> Alice before the fee increases again, otherwise you missed your fix window.
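
The arithmetic above can be checked with a small sketch. It only reuses the numbers from this issue (724 and 172 weight units, 10.000 sat/kw, 1% reserve); the function names are hypothetical and do not come from any implementation.

```python
COMMIT_WEIGHT = 724   # commitment tx weight with no HTLCs (value used in the issue)
HTLC_WEIGHT = 172     # extra weight per untrimmed HTLC (value used in the issue)


def commit_fee_sat(feerate_per_kw: int, num_htlcs: int) -> int:
    """Commitment tx fee in satoshis, rounded down."""
    weight = COMMIT_WEIGHT + HTLC_WEIGHT * num_htlcs
    return weight * feerate_per_kw // 1000


def funder_can_afford(funder_balance_sat: int, reserve_sat: int,
                      feerate_per_kw: int, num_htlcs: int) -> bool:
    """True if the funder can pay the fee and still keep the reserve."""
    fee = commit_fee_sat(feerate_per_kw, num_htlcs)
    return funder_balance_sat - fee >= reserve_sat


capacity = 150_000
reserve = capacity // 100                      # 1% reserve = 1500 sat
feerate = 10_000                               # sat per 1000 weight units
alice = commit_fee_sat(feerate, 0) + reserve   # 8740 sat
bob = capacity - alice                         # 141260 sat

# An HTLC offered by Bob leaves Alice's balance unchanged but raises the fee she pays.
ok_without_htlc = funder_can_afford(alice, reserve, feerate, 0)    # fee 7240 sat
ok_with_bob_htlc = funder_can_afford(alice, reserve, feerate, 1)   # fee 8960 sat
ok_at_8000 = funder_can_afford(alice, reserve, 8_000, 1)           # fee 7168 sat
print(ok_without_htlc, ok_with_bob_htlc, ok_at_8000)
```

With these values the check passes with no HTLC, fails as soon as Bob adds one, and passes again once the feerate has dropped to 8.000 sat/kw.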

Proposed solution

Now why doesn't Bob send the HTLC? In my opinion Bob can safely send the HTLC; however Alice takes a risk accepting it (and currently Alice would reject the HTLC in all 3 implementations). The only risk I see is that if Bob broadcasts the updated CommitTx while the HTLC is pending, the fee will be lower than the channel's feerate, so the transaction may take a while to confirm.

If the HTLC is fulfilled, but the CommitTx doesn't confirm, there will be an on-chain race condition between Alice's HTLC-success and Bob's HTLC-timeout; if Bob wins the race, Bob will have stolen the HTLC amount from Alice.

It seems to me that Alice could easily take this risk; the estimated fee is supposed to be quite high already (the spec says "the current fee rate is sufficient (by a significant margin) for timely processing of the commitment transaction"), so even with an added 23.7% weight to the commit tx it should confirm in time to avoid the race condition. But maybe we should restrict that case to a single added HTLC (otherwise the weight can grow unbounded); that would be sufficient to unblock the channel. Would that be too reckless in your opinion? Or is it a reasonable trade-off to avoid this problem?
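
To put a number on that risk, here is a rough sketch of the feerate the CommitTx would effectively pay if one HTLC were added while the fee stayed at 7240 sat. That the fee stays unchanged is an assumption made for illustration; the variable names are hypothetical.

```python
COMMIT_WEIGHT = 724   # weight with no HTLCs (value used in the issue)
HTLC_WEIGHT = 172     # extra weight for one HTLC (value used in the issue)
feerate_per_kw = 10_000

# Assumption: the fee stays at what Alice can afford today, with no HTLC pending.
fee_paid_sat = COMMIT_WEIGHT * feerate_per_kw // 1000          # 7240 sat
weight_with_htlc = COMMIT_WEIGHT + HTLC_WEIGHT                 # 896 weight units

effective_feerate = fee_paid_sat * 1000 // weight_with_htlc    # sat/kw actually paid
extra_weight_pct = 100 * HTLC_WEIGHT / COMMIT_WEIGHT           # weight added by the HTLC
feerate_drop_pct = 100 * (1 - effective_feerate / feerate_per_kw)

print(effective_feerate, round(extra_weight_pct, 2), round(feerate_drop_pct, 1))
```

Under that assumption the transaction pays roughly 8080 sat/kw instead of 10.000, which is the same 19.2% gap as the feerate decrease mentioned in the problem statement.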

Notes (readers may skip)

CPFP / Anchor outputs

With CPFP on the CommitTx (anchor outputs) we should allow such HTLCs; if Bob broadcasts the CommitTx and it doesn't confirm, just CPFP it before the race condition is reached.

But I'd like to fix this in the shorter term too if possible :).

Reaching the faulty state

It's easy to reach that faulty state for testing/repro by using the push_msat in the open_channel.
Just set push_msat to 141260000 (with the values used in this issue) and you should be in that state.

In real life though that's not how channels will end up in this state. Alice won't be able to send this amount via HTLCs because before sending, we take into account the weight this HTLC adds to the CommitTx (172*feerate).

However imagine the feerate decreases to 8.000 sat/kw. Alice can now send an HTLC of 141260 sat to Bob. Then the fee increases back to 10.000 sat/kw after the HTLC has been fulfilled and removed from the CommitTx: the update_fee is thus valid, and now Alice and Bob have a stuck channel.
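
The sequence above can be replayed step by step. This is a sketch under the assumptions of this issue (Alice funds the whole 150.000 sat channel, 1% reserve, 724/172 weight units); the helper name is hypothetical.

```python
COMMIT_WEIGHT, HTLC_WEIGHT = 724, 172   # values used in the issue


def commit_fee_sat(feerate_per_kw: int, num_htlcs: int) -> int:
    """Commitment tx fee in satoshis, rounded down."""
    return (COMMIT_WEIGHT + HTLC_WEIGHT * num_htlcs) * feerate_per_kw // 1000


reserve = 1_500
alice, bob = 150_000, 0   # assumption: Alice funded the channel and pushed nothing

# Step 1: the feerate drops to 8000 sat/kw. Alice must afford the fee for her own HTLC.
low = 8_000
max_send = alice - reserve - commit_fee_sat(low, 1)
amount = 141_260
can_send = amount <= max_send

# Step 2: the HTLC is fulfilled and removed from the CommitTx.
alice -= amount
bob += amount

# Step 3: update_fee back to 10000 sat/kw with no HTLC pending.
high = 10_000
update_fee_valid = alice - commit_fee_sat(high, 0) >= reserve

# Step 4: Bob tries to add an HTLC; Alice would have to pay for the extra weight.
bob_can_add = alice - commit_fee_sat(high, 1) >= reserve

print(max_send, can_send, alice, update_fee_valid, bob_can_add)
```

Every step is individually valid, yet the final state is the stuck channel described in the problem statement.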

TheBlueMatt · 2020-01-20T20:07:38Z

I may be missing some details of the issue here, but it seems like an alternative and somewhat simpler solution would be for Alice to artificially limit herself from getting that close to the reserve value. Ignoring fee updates, if Alice ensures that she always has enough local balance to pay for, eg, five more HTLCs then Bob would have no problem sending. It also would be neatly backwards compatible protocol-wise.

halseth · 2020-01-20T21:00:14Z

This is something we've had users reporting before (lightningnetwork/lnd#3429), and the workaround has been to "increase" the fee balance available to Alice by having Bob send dust HTLCs. This is obviously not a good long-term solution, but I was hoping this edge case was rare enough to wait for the nondeterminism of update_fee to be a thing of the past to fix this by itself.

This together with the failure case of having simultaneous update_adds leading to an invalid channel state were two big motivation factors behind the work on anchor outputs.

> (it seems lnd goes into an infinite loop though, might be worth investigating).

Are you talking about a loop when attempting a payment here? If yes, this is something we are working on in lightningnetwork/lnd#3787

halseth · 2020-01-20T21:03:30Z

> I may be missing some details of the issue here, but it seems like an alternative and somewhat simpler solution would be for Alice to artificially limit herself from getting that close to the reserve value. Ignoring fee updates, if Alice ensures that she always has enough local balance to pay for, eg, five more HTLCs then Bob would have no problem sending. It also would be neatly backwards compatible protocol-wise.

Won't this just make the problem appear earlier? Instead of Alice rejecting new HTLCs because it will dip her below the reserve, she'll reject HTLCs since it will dip her below this reserve+5*htlcFee limit.

Maybe I'm just not understanding exactly how this soft limit would work from Alice's POV.

TheBlueMatt · 2020-01-20T21:28:18Z

> Won't this just make the problem appear earlier? Instead of Alice rejecting new HTLCs because it will dip her below the reserve, she'll reject HTLCs since it will dip her below this reserve+5*htlcFee limit.

Nono, the opposite - the issue is that Alice the initiator gets herself into too little reserve by sending too much towards Bob. If Alice stops sending HTLCs towards Bob before reaching the point where she's close to her own reserve, then Bob can always send back towards Alice.

halseth · 2020-01-21T08:16:54Z

Ah, makes sense! So Alice will keep another "reserve" for fees for Bob's HTLCs. That's quite elegant, I think that would work :) Still it won't handle massive feerate changes in all cases, but the larger you make this buffer the less likely this situation is to occur.
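
A minimal sketch of this extra "reserve", assuming the funder checks it before offering an HTLC. The function names and the `buffer_htlcs` parameter are hypothetical; the value 5 is the "eg, five more HTLCs" figure from the suggestion above, not a spec constant.

```python
COMMIT_WEIGHT, HTLC_WEIGHT = 724, 172   # values used in the issue


def commit_fee_sat(feerate_per_kw: int, num_htlcs: int) -> int:
    """Commitment tx fee in satoshis, rounded down."""
    return (COMMIT_WEIGHT + HTLC_WEIGHT * num_htlcs) * feerate_per_kw // 1000


def funder_may_send(balance_sat: int, amount_sat: int, reserve_sat: int,
                    feerate_per_kw: int, pending_htlcs: int,
                    buffer_htlcs: int = 5) -> bool:
    """Funder-side check: after offering this HTLC, keep enough balance to pay
    the fee for `buffer_htlcs` more HTLCs on top of the channel reserve."""
    fee = commit_fee_sat(feerate_per_kw, pending_htlcs + 1 + buffer_htlcs)
    return balance_sat - amount_sat - fee >= reserve_sat


reserve, feerate = 1_500, 10_000
alice = 150_000

rejected = funder_may_send(alice, 141_260, reserve, feerate, pending_htlcs=0)
accepted = funder_may_send(alice, 130_940, reserve, feerate, pending_htlcs=0)

# After the 130940 sat HTLC settles, Bob can still add one at the same feerate.
alice_after = alice - 130_940
bob_can_add = alice_after - commit_fee_sat(feerate, 1) >= reserve
print(rejected, accepted, alice_after, bob_can_add)
```

The buffer is sized at the current feerate, so, as noted above, a large enough feerate increase can still consume it.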

t-bast · 2020-01-21T08:53:53Z

> Are you talking about a loop when attempting a payment here? If yes, this is something we are working on in lightningnetwork/lnd#3787

Yes that's exactly this. Good to know it's being worked on, nothing new then.

t-bast · 2020-01-21T08:55:48Z

Thanks for the feedback guys! An additional reserve is indeed a good idea for a short-term fix without protocol changes, that's probably good enough to mitigate the issue until CPFP on CommitTx lands.

It feels a bit sad though because isn't that what the existing reserve could be used for?

t-bast · 2020-02-04T15:42:08Z

I'd like to point out that @TheBlueMatt's suggestion only works if all implementations do this, and only if they do this with the same hard-coded number of htlcs.

Imagine for example that Alice's node implements this, but Bob's does not.
Alice will avoid sending too much to Bob, which is good: the channel won't be stuck with all the balance on Bob's side.
However Bob may try to send too much to Alice, which Alice will reject because of the new rule.
Bob's software won't understand why the HTLC is rejected and won't know that it needs to send slightly less. This is particularly visible when Bob does an "empty wallet" where he sends all of his balance out for a swap-out for example. Bob's software will display a balance of X, but sending those X will always fail. Then it's all downhill with tweet-shaming, "what did you do with my money"-kind of thing, and all types of ugliness which we'd like to avoid.

If we do agree that a mitigation to this issue needs to be implemented by all lightning implementations, I believe we currently have two solutions:

1. Artificial new reserve (5*htlcFee)
2. When stuck, allow exactly one HTLC: this could be an issue if fees are rising too much, because after the HTLC-fulfill the sender of the HTLC could force close (instead of signing the new commit-tx) and hope to win the on-chain race condition with the HTLC-timeout

I'm leaning towards the safer option 1.
Please vote by liking this comment with 🚀 for option 1 or 👀 for option 2.
I will wait for at least one vote from each implementation before submitting a spec PR @halseth @cdecker @rustyrussell @TheBlueMatt

ZmnSCPxj · 2020-02-11T02:18:14Z

@m-schmoock has discovered an even worse way to trigger this, remotely, by having Alice send trimmed HTLCs to send over the fee that the first HTLC used to have. This triggers this even without an onchain feerate increase I think. ElementsProject/lightning#3498

Add new check if we're funder trying to add HTLC, keeping us with enough extra funds to pay for another HTLC the peer might add. Changelog-Fixed: Corner case where channel could become unusable (lightning/bolts#728) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

t-bast · 2020-02-11T08:26:48Z

Good catch, that's another way to lock the channels. We should definitely do something about that, sooner than later IMHO.
I was waiting for feedback before opening a spec PR, but looking at the discussions you had on the c-lightning repo I believe you're also leaning more towards the "additional reserve" solution (where we would just need to bikeshed the multiplier).
I'll open a PR today to get things moving.

Add new check if we're funder trying to add HTLC, keeping us with enough extra funds to pay for another HTLC the peer might add. We also need to adjust the spendable_msat calculation, and update various tests which try to unbalance channels. We eliminate the now-redundant test_channel_drainage entirely. Changelog-Fixed: Corner case where channel could become unusable (lightning/bolts#728) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

Add an additional "reserve" for funders on top of the real reserve to avoid getting in a state where the channel is unusable because of the increased commit tx cost of a new HTLC. Requirements are only added for the funder sending an HTLC. Fundee receiving HTLCs may choose to verify that funders apply this, but it may lead to an unusable UX. Fixes #728.

See lightning/bolts#728 Add an additional reserve on the funder to prevent emptying and then being stuck with an unusable channel. As fundee we don't verify funders comply with that change. We may enforce it in the future when we're confident the network as a whole enforces that.

TheBlueMatt · 2020-02-12T19:24:38Z

@t-bast I don't see a reason why Bob needs to care - HTLCs can be rejected for any reason by Alice (including "I'm feeling sleepy today, I don't feel like dealing with a forward"). Bob either has to close in response or deal with it.

@ZmnSCPxj correct. The meeting discussion centered around the fact that this really has little to do with fees, and more to do with reserves.

t-bast · 2020-02-13T08:44:12Z

> I don't see a reason why Bob needs to care

I agree that we can definitely do without it, especially at first to unblock the situation (and this is what I proposed in #740) but that provides a poorer UX.

If Bob's software tells him "hey your balance is X, you can send up to X to Alice" and then sending X to Alice consistently fails, that's weird. We've had support tickets for less than that 😄. Right now lnd, c-lightning and eclair all apply fee and reserve conditions symmetrically, so when it says "you can send X" you can really send that amount regardless of what implementation your counter-party runs, which is nice.

m-schmoock · 2020-02-13T12:45:25Z

I really like Rusty's proposal that allows fundee to dip into funder reserves (once). (ElementsProject/lightning#3501)

This will not add additional reserve margins and does not require assumptions on how fees might behave. However, this behavior must be known and accepted by the peer at a protocol level.

Edit: We have to make sure he does not dip too deep into reserves :/ Ideas?

t-bast · 2020-02-13T13:16:01Z

> I really like Rusty's proposal that allows fundee to dip into funder reserves (once). (ElementsProject/lightning#3501)

I also believe the reserve could be used for that (it was mentioned earlier in this issue).
But right now this would require a change on all implementations, because lnd, c-lightning and eclair will not even send such an HTLC because they know the remote cannot afford the increased fee.
So it would need a change on all implementations (which can be ok IMHO but needs others to ACK).

halseth · 2020-02-13T13:47:55Z

It sounds like it could open another can of worms, since you would need to validate the size of the "remaining" channel reserve. It would also make it impossible to have channels with zero reserve (because then you could easily get into this situation again). In high fee scenarios you could even risk that the reserve won't be enough to cover the HTLC fee.

Not saying it won't work, but if we have to start altering the spec to mitigate this issue, maybe it's worth starting to think about how to fundamentally fix it? (e.g. the party offering the HTLC also pays fees)

t-bast · 2020-02-13T13:55:23Z

> you would need to validate the size of the "remaining" channel reserve.

That can indeed get nasty, as @ZmnSCPxj mentioned in ElementsProject/lightning#3501. It does require more thoughts and experiments before we can safely do that.

> It would also make it impossible to have channels with zero reserve (because then you could easily get into this situation again).

Only for the funder, you can still have 0-reserve if you're the fundee, right?

I think that the fundamental change is that once there's a way to do CPFP and RBF (anchor outputs), we can simply allow HTLCs regardless of whether the increased commit tx fee can be paid or not. If no-one closes the channel the HTLCs will resolve and the situation gets back to normal, and if someone broadcasts the commit tx hoping to steal funds with an on-chain race-condition, the attacked party can CPFP to prevent the attacker from stealing funds.

halseth · 2020-02-13T14:09:17Z

> I think that the fundamental change is that once there's a way to do CPFP and RBF (anchor outputs), we can simply allow HTLCs regardless of whether the increased commit tx fee can be paid or not. If no-one closes the channel the HTLCs will resolve and the situation gets back to normal, and if someone broadcasts the commit tx hoping to steal funds with an on-chain race-condition, the attacked party can CPFP to prevent the attacker from stealing funds.

This is true in a no-commitfee world, since we would still need to keep the fee above the min-relay fee. With package relay and anchor outputs this will be much simplified indeed :)

t-bast · 2020-02-13T14:18:25Z

> we would still need to keep the fee above the min-relay fee.

Arf I tend to forget about this, thanks for reminding me about this annoying detail :)

TheBlueMatt · 2020-02-13T17:37:36Z

> We would lose the current assumption that your honest peer never should send you updates you won't accept

I don't know how we could do that? There is no way to reject an update your counterparty sent you without first accepting the commitment transaction that does so.

> It sounds like it could open another can of worms, since you would need to validate the size of the "remaining" channel reserve

Agreed here. Trying to reduce the effective reserve value from today doesn't make sense to me - it's there and selected to be a value that has specific meaning. If we aren't able to meet the security requirements of said meaning, we should add a second reserve, not remove the existing one.

joostjager · 2020-02-14T13:19:36Z

I agree with @halseth that we can have a transition period in which the protocol isn't changed and also no hard rejection of certain messages is implemented.

If all implementations make sure that they never send htlcs or fee updates that would push either one of the parties into the unusable state, isn't the problem going away when the network upgrades?

m-schmoock · 2020-02-14T16:56:32Z

@joostjager

> make sure that they never send htlcs or fee updates that would push either one of the parties into the unusable state...

Yes, but fee updates are there for a reason; we can't just suppress or delay them indefinitely if the remote happens to run very low on capacity or fees happen to have changed in an unfortunate way. Not accepting a (trimmed or untrimmed) HTLC when already very low is an option, but then you have to make assumptions about when an HTLC would still be 'acceptable', which again depends on how the fees will behave in the future.

As interim mitigation, latest c-lightning will now raise CAPACITY_EXCEEDED on a new HTLC if a fee increase of 50% would lead to a lockup situation, see: ElementsProject/lightning@86c28b2 . Not elegant, but it reduces the risk of running into this situation for now.
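
A rough sketch of what such a check could look like. This is one interpretation of the sentence above, not the c-lightning code: the function name, the choice to count one extra HTLC from the peer, and the integer rounding are all assumptions.

```python
COMMIT_WEIGHT, HTLC_WEIGHT = 724, 172   # values used in the issue


def commit_fee_sat(feerate_per_kw: int, num_htlcs: int) -> int:
    """Commitment tx fee in satoshis, rounded down."""
    return (COMMIT_WEIGHT + HTLC_WEIGHT * num_htlcs) * feerate_per_kw // 1000


def would_lock_up(funder_balance_after_sat: int, reserve_sat: int,
                  feerate_per_kw: int, htlcs_after: int,
                  increase_pct: int = 50) -> bool:
    """Hypothetical funder-side check: would a feerate increase of `increase_pct`
    percent leave the funder unable to pay for one more HTLC from the peer?"""
    bumped = feerate_per_kw * (100 + increase_pct) // 100
    fee = commit_fee_sat(bumped, htlcs_after + 1)
    return funder_balance_after_sat - fee < reserve_sat


reserve, feerate = 1_500, 10_000

# Funder left with 19060 sat and one HTLC pending: fee at 15000 sat/kw is 16020 sat.
safe_case = would_lock_up(19_060, reserve, feerate, htlcs_after=1)

# Funder left with 8740 sat (the stuck state from this issue): the HTLC is refused.
stuck_case = would_lock_up(8_740, reserve, feerate, htlcs_after=1)
print(safe_case, stuck_case)
```

A node using a check like this would refuse to add the HTLC whenever the function returns true.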

joostjager · 2020-02-14T18:06:38Z

Creating more leeway is fine. But then still you have to decide whether to update or not update the fee if the update would push you into the lockup situation. Do you choose a fee that is too low or do you lock yourself up?

Allow funders to dip into their channel reserve once to be able to pay the increased commit tx fee for a pending HTLC. This prevents channels from getting in a state where the channel is unusable because of the increased commit tx cost of a new HTLC. Fixes #728.

See lightning/bolts#728 Allow a funder to dip into its channel reserve to pay the increased commit tx fee for one incoming HTLC. This prevents being stuck with an unusable channel.

t-bast added bug help wanted labels Jan 17, 2020

t-bast added a commit to ACINQ/eclair that referenced this issue Jan 27, 2020

Fix fuzz test

1628192

Otherwise we may run into the following issue: lightning/bolts#728

t-bast mentioned this issue Jan 27, 2020

Fix availableForSend/Receive ACINQ/eclair#1293

Merged

cdecker mentioned this issue Feb 3, 2020

Lightning Specification Meeting 2020/02/03 #731

Closed

17 tasks

rustyrussell mentioned this issue Feb 11, 2020

test: poc that locks up a channel ElementsProject/lightning#3498

Closed

This was referenced Feb 11, 2020

Channel lockup corner case workaround ElementsProject/lightning#3500

Merged

[DRAFT] channeld: allow fundee dip into funder reserves, once. ElementsProject/lightning#3501

Closed

t-bast mentioned this issue Feb 11, 2020

Avoid stuck channels after fee increase with additional reserve #740

Merged

t-bast mentioned this issue Feb 11, 2020

Funder reserve for future fee increase ACINQ/eclair#1319

Merged

halseth mentioned this issue Feb 18, 2020

lnwallet: Make available balance HTLC fee aware lightningnetwork/lnd#3691

Merged

t-bast mentioned this issue Feb 24, 2020

Avoid stuck channels after fee increase by dipping into reserve #750

Closed

t-bast mentioned this issue Feb 24, 2020

Allow funder to dip into reserve if increased fee ACINQ/eclair#1331

Closed

t-bast closed this as completed in #740 Apr 27, 2020

Roasbeef mentioned this issue Nov 4, 2022

[bug]: unable to forward, wrong "insufficient bandwidth" lightningnetwork/lnd#7108

Closed

ziggie1984 mentioned this issue Nov 6, 2022

Failing rapid rebalance rkfg/regolancer#32

Closed

C-Otto mentioned this issue May 23, 2023

[bug]: implement "fee spike buffer" from spec lightningnetwork/lnd#7721

Closed

t-bast mentioned this issue May 24, 2023

Allow HTLC receiver to dip into channel reserve #1083

Closed

remyers added a commit to remyers/eclair that referenced this issue Aug 18, 2023

Check that splice out does not reduce local balance below the funder …

bd6e76e

…fee buffer - see lightning/bolts#728

remyers added a commit to remyers/eclair that referenced this issue Aug 29, 2023

Check that splice out does not reduce local balance below the funder …

3b9cbcd

…fee buffer - see lightning/bolts#728

YusukeShimizu mentioned this issue Nov 24, 2023

probe payment as sanity check ElementsProject/peerswap#260

Merged

cdecker mentioned this issue Jan 3, 2024

Tweaking the feerate security margin ElementsProject/lightning#6974

Open

grubles mentioned this issue Mar 28, 2024

Failed peerswap from 63 days ago blocking two peers doing any further swapping ElementsProject/peerswap#290

Closed
