
htlcswitch: final settle signal #6517

Merged · 4 commits · Oct 27, 2022

Conversation

@joostjager (Contributor) commented May 9, 2022

This PR adds a new final settle state for incoming htlcs. An htlc reaches this state after the update_settle_htlc update has been fully locked in, off-chain or on-chain.

A new rpc LookupHtlc is added that can be used to check whether an htlc reached the new state.

Additionally an event is added to the existing SubscribeHtlcEvents stream. This allows users to wait for final settlement without polling LookupHtlc. It is important to have a non-polling mechanism because waiting for final settlement is likely to be part of the critical path of a lightning payment.

There is no FINAL_SETTLE state on the invoice level yet. To use this api to wait for final settlement of an invoice, the following steps can be used (a rough code sketch follows the list):

  1. Wait for invoice to be settled on invoices subscription stream
  2. Extract htlc ids from settle update event.
  3. Subscribe to the htlc notifier api SubscribeHtlcEvents.
  4. Wait for the htlc notifier SubscribedEvent to be received. Otherwise we may not really be subscribed yet. It is important to do this before looking up htlcs to avoid a race condition.
  5. Lookup all htlcs retrieved in step 2 via LookupHtlc
  6. If they are all settled, the invoice is settled
  7. If some are not yet settled, wait for the final settle events for those to come in through the htlc notifier api.
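For illustration, here is a minimal Go sketch of the flow above. The htlcKey type, finalSettleAPI interface and waitForFinalSettle helper are hypothetical stand-ins for the real LookupHtlc and SubscribeHtlcEvents gRPC calls; the actual lnd request/response types are deliberately not spelled out here.

```go
package finalsettle

import "fmt"

// htlcKey identifies an incoming htlc by channel and htlc index, matching the
// identifiers reported in the invoice settle update (step 2).
type htlcKey struct {
	ChanID    uint64
	HtlcIndex uint64
}

// finalSettleAPI is a hypothetical thin wrapper around the new LookupHtlc rpc
// and the SubscribeHtlcEvents stream.
type finalSettleAPI interface {
	// LookupHtlc reports whether the htlc has reached the final settle
	// state (step 5).
	LookupHtlc(key htlcKey) (settled bool, err error)

	// FinalSettleEvents subscribes to final settle events. The ready
	// channel is closed once the SubscribedEvent has been received, i.e.
	// once the subscription is actually active (steps 3 and 4).
	FinalSettleEvents() (events <-chan htlcKey, ready <-chan struct{}, err error)
}

// waitForFinalSettle blocks until every htlc of a settled invoice has been
// finally settled, following the steps above.
func waitForFinalSettle(api finalSettleAPI, htlcs []htlcKey) error {
	// Subscribe first and wait for the subscription to be confirmed, so
	// that no final settle event can be missed between lookup and
	// subscription.
	events, ready, err := api.FinalSettleEvents()
	if err != nil {
		return err
	}
	<-ready

	// Look up the current state of every htlc from the settle update.
	pending := make(map[htlcKey]struct{})
	for _, h := range htlcs {
		settled, err := api.LookupHtlc(h)
		if err != nil {
			return err
		}
		if !settled {
			pending[h] = struct{}{}
		}
	}

	// Wait for the remaining htlcs to show up as final settle events
	// (steps 6 and 7).
	for len(pending) > 0 {
		h, ok := <-events
		if !ok {
			return fmt.Errorf("htlc event stream closed")
		}
		delete(pending, h)
	}

	return nil
}
```

The important ordering is subscribe first, confirm the subscription, then look up; otherwise a final settle event can land between the lookup and the subscription and be missed.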

More background info and design discussion: https://groups.google.com/a/lightning.engineering/g/lnd/c/Sfy_M9avQoU

Fixes #6208

@otech47 (Contributor) commented May 25, 2022

Concept ACK

Is there perhaps any unconsidered complexity for situations where multiple HTLCs exist in the commitment transaction for a single channel? Or would the invoice status not depend at all on any other HTLCs in the channel?

@joostjager (Contributor, Author) commented May 25, 2022

> Is there perhaps any unconsidered complexity for situations where multiple HTLCs exist in the commitment transaction for a single channel?

There should not be such complexity. The code as is should record all final settlements of htlcs, regardless of whether they are grouped together on the same commitment.

> Or would the invoice status not depend at all on any other HTLCs in the channel?

For a multi-part payment, the invoice status could depend on multiple htlcs in the same channel. But the scope of this PR does not include invoices at all.

To determine whether a received payment is final, one should retrieve all the settled htlcs from the invoice database and then look them up one by one via for example the api that is added in this PR to see if the settlements are final.

Obviously this can be streamlined later on for invoices specifically.

@joostjager force-pushed the lookup-htlc branch 4 times, most recently from e03c8d1 to 0be85ca on July 11, 2022 14:08
@joostjager marked this pull request as ready for review on July 11, 2022 14:13
@joostjager requested a review from @Roasbeef on July 11, 2022 14:19
@Roasbeef added this to the v0.16.0 milestone on Jul 14, 2022
@Roasbeef added the safety, htlcswitch, and invoices labels on Jul 14, 2022
@joostjager (Contributor, Author) commented Jul 20, 2022

Waiting for design ack for the direction set out in this PR. In particular storing the settled htlc ids in a new bucket settled-htlcs and doing so atomically in the channel state update transaction.
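A rough sketch of that direction (illustrative only: bucket layout and names are not this PR's final schema, and it uses bbolt directly rather than lnd's kvdb wrapper). The settled htlc id is written inside the same write transaction that applies the channel state update, so the marker is committed atomically with the settle:

```go
package channeldbsketch

import (
	"encoding/binary"

	"go.etcd.io/bbolt"
)

// settledHtlcsBucketKey is an illustrative top-level bucket name.
var settledHtlcsBucketKey = []byte("settled-htlcs")

// markHtlcSettled records a finally settled incoming htlc inside an already
// open write transaction, so the write is atomic with the channel state
// update that locked in the settle.
func markHtlcSettled(tx *bbolt.Tx, chanID, htlcIndex uint64) error {
	top, err := tx.CreateBucketIfNotExists(settledHtlcsBucketKey)
	if err != nil {
		return err
	}

	// One sub-bucket per channel keeps per-channel lookups cheap.
	var chanKey [8]byte
	binary.BigEndian.PutUint64(chanKey[:], chanID)
	chanBucket, err := top.CreateBucketIfNotExists(chanKey[:])
	if err != nil {
		return err
	}

	var htlcKey [8]byte
	binary.BigEndian.PutUint64(htlcKey[:], htlcIndex)

	// The value could later carry more detail (e.g. settled vs failed);
	// a single byte is enough for a pure "settled" marker.
	return chanBucket.Put(htlcKey[:], []byte{1})
}
```

The caller would invoke this from whatever update transaction already processes the revocation, so no additional write transaction or lock is needed.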

channeldb/channel.go (outdated review thread, resolved)
@Crypt-iQ (Collaborator) commented Aug 25, 2022

I think this patch shouldn't use the shortchanid, and should instead use the funding outpoint - we shouldn't add more code that needs reorg handling. I also think there should be benchmarks for lookup latency with N settles over M channels.

In the revocation PR, you brought up the idea of looking at the resolvers instead of doing this. Why would you need this if you weren't the exit hop? Maybe this is because I don't understand the htlc interceptor settle flow where you intercept and settle back, already knowing the preimage. Another case is maybe you want to see if you've lost money as an intermediate node due to not being able to claim on the incoming?

Another thought is: could this use some sort of hash accumulator? It seems that you're only performing lookups on the set... Not sure of a golang library that can help here.

@joostjager (Contributor, Author) commented Aug 25, 2022

> I think this patch shouldn't use the shortchanid, and should instead use the funding outpoint - we shouldn't add more code that needs reorg handling.

Are reorgs even supported generally for channels that are fully confirmed?

But I get the point. I use short channel id because we also use that in the htlc interception api. These two apis are related and can be used together. Short chan id may also not have been the best decision for htlc interception?

I could convert the PR to outpoints and then leave it up to the user to do the mapping from short chan ids if necessary?

> I also think there should be benchmarks for lookup latency with N settles over M channels.

What scenario are you thinking of specifically when performance would be critical? LND internally doesn't do any lookups, it would just be load generated by a connected application.

> In the revocation PR, you brought up the idea of looking at the resolvers instead of doing this. Why would you need this if you weren't the exit hop? Maybe this is because I don't understand the htlc interceptor settle flow where you intercept and settle back, already knowing the preimage. Another case is maybe you want to see if you've lost money as an intermediate node due to not being able to claim on the incoming?

Linking the resolver idea here: #6347 (comment)

I think just looking at the resolvers is not sufficient, because you don't get a signal when an htlc settles off-chain in the happy flow. Without that signal, an application still doesn't know when the payment truly cleared.

One scenario where you need this as a routing node is if you use the htlc interceptor to create virtual channels. We use it with lnmux. I believe LSPs for mobile users have similar requirements for just-in-time opened channels. And indeed, more generally for exact accounting on a routing node you probably want to know if you were able to claim all your incoming forwards. For that though, the resolver information might be sufficient as it is not time critical. Maybe there is an edge case if the counterparty doesn't go to chain (yet) to timeout the htlc.

> Another thought is: could this use some sort of hash accumulator? It seems that you're only performing lookups on the set... Not sure of a golang library that can help here.

Not familiar with hash accumulators, but isn't that a probabilistic structure that can lead to false positives? False positives wouldn't be good for this application.

@Crypt-iQ (Collaborator):

> Are reorgs even supported generally for channels that are fully confirmed?

No, my point is that (I think) eventually we'll be moving away from SCIDs to identify a channel, so it may not make sense to make another place where that's done.

> Short chan id may also not have been the best decision for htlc interception?

I don't think you have a choice, the switch is pretty tied to SCIDs at the moment.

> What scenario are you thinking of specifically when performance would be critical? LND internally doesn't do any lookups, it would just be load generated by a connected application.

What about the rpcserver call to check if something is settled? Additionally, some nodes will be inserting frequently into this global settled bucket so that's where the concern comes from. I'm basically wondering if a large bucket here makes insertion/lookups very slow due to bbolt traversing the b+ tree and having to cache a lot of leaves.

> Not familiar with hash accumulators, but isn't that a probabilistic structure that can lead to false positives? False positives wouldn't be good for this application.

I was thinking of something like utreexo, which has inclusion proofs, but after glancing at the paper, that doesn't actually help with storage here.

@joostjager (Contributor, Author):

> No, my point is that (I think) eventually we'll be moving away from SCIDs to identify a channel, so it may not make sense to make another place where that's done.

Can you get confirmation on that before I convert this PR?

> > What scenario are you thinking of specifically when performance would be critical? LND internally doesn't do any lookups, it would just be load generated by a connected application.

> What about the rpcserver call to check if something is settled? Additionally, some nodes will be inserting frequently into this global settled bucket so that's where the concern comes from. I'm basically wondering if a large bucket here makes insertion/lookups very slow due to bbolt traversing the b+ tree and having to cache a lot of leaves.

My thinking here is that that single db lookup should be negligible compared to all the updates that are required to just lock in and settle an htlc on the channel. It can also be a batch lookup for more efficiency. The insert itself is part of an existing transaction, so at least that wouldn't add another write lock.

I can do some benchmarking once the shape of this PR is more final. Linking related PR lightninglabs/neutrino#213, which @Roasbeef brought up in the review club.
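A benchmark along these lines (illustrative only, against a plain bbolt bucket rather than lnd's actual schema) could quantify the concern: pre-fill a bucket with a large number of settle records and time point lookups.

```go
package channeldbsketch

import (
	"encoding/binary"
	"fmt"
	"path/filepath"
	"testing"

	"go.etcd.io/bbolt"
)

// BenchmarkFinalHtlcLookup measures point lookups in a bucket pre-filled with
// numSettles records, approximating a node that has settled many htlcs.
func BenchmarkFinalHtlcLookup(b *testing.B) {
	const numSettles = 100_000

	db, err := bbolt.Open(filepath.Join(b.TempDir(), "test.db"), 0600, nil)
	if err != nil {
		b.Fatal(err)
	}
	defer db.Close()

	bucketKey := []byte("settled-htlcs")

	// Pre-fill the bucket with numSettles dummy settle records.
	err = db.Update(func(tx *bbolt.Tx) error {
		bucket, err := tx.CreateBucketIfNotExists(bucketKey)
		if err != nil {
			return err
		}
		var key [8]byte
		for i := uint64(0); i < numSettles; i++ {
			binary.BigEndian.PutUint64(key[:], i)
			if err := bucket.Put(key[:], []byte{1}); err != nil {
				return err
			}
		}
		return nil
	})
	if err != nil {
		b.Fatal(err)
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		err := db.View(func(tx *bbolt.Tx) error {
			var key [8]byte
			binary.BigEndian.PutUint64(key[:], uint64(i)%numSettles)
			if tx.Bucket(bucketKey).Get(key[:]) == nil {
				return fmt.Errorf("missing settle record")
			}
			return nil
		})
		if err != nil {
			b.Fatal(err)
		}
	}
}
```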

@Crypt-iQ (Collaborator):

> Can you get confirmation on that before I convert this PR?

Yes, I confirmed it

> The insert itself is part of an existing transaction, so at least that wouldn't add another write lock.

I am concerned about the insert here

@joostjager (Contributor, Author) commented Aug 26, 2022

@Crypt-iQ if you talk about the funding outpoint, do you mean txid:index, or the "compressed" 32-byte format that xors the index into the txid? I don't think we use the 32-byte format on the rpc interface anywhere yet.

@Crypt-iQ (Collaborator):

> @Crypt-iQ if you talk about the funding outpoint, do you mean txid:index, or the "compressed" 32-byte format that xors the index into the txid? I don't think we use the 32-byte format on the rpc interface anywhere yet.

the outpoint, not the channel id (which is the 32-byte representation)
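For reference, the "compressed" 32-byte format mentioned above is the BOLT 2 channel_id: the funding output index XORed into the last two bytes of the funding txid (lnd has its own helper for this, lnwire.NewChanIDFromOutPoint). A minimal sketch of the idea, glossing over txid byte-order details:

```go
package sketch

// chanIDFromOutPoint derives the 32-byte channel id from a funding outpoint
// by folding the output index into the last two bytes of the txid. Only the
// low 16 bits of the index fit, per BOLT 2.
func chanIDFromOutPoint(fundingTxid [32]byte, outputIndex uint32) [32]byte {
	chanID := fundingTxid
	chanID[30] ^= byte(outputIndex >> 8)
	chanID[31] ^= byte(outputIndex)
	return chanID
}
```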

@joostjager (Contributor, Author) commented Aug 26, 2022

Another thing to be aware of is that we need to keep the mapping around for closed channels too. For example, when you call LookupInvoice and want to get the final status of the htlc, you still need to map the short chan ids in that output to outpoints, and the channels may be closed by that time. Do you foresee any problems with this, or is it just a matter of downloading the mapping from ClosedChannels, or looking up on-chain based on the short chan id (see the sketch below)? It's not the most convenient for the caller though.
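On the on-chain route specifically: a short channel id is just a packed (block height, tx index, output index) triple, so it can always be resolved back to the funding outpoint from chain data, even for channels long gone from the channel db. A small sketch of the unpacking, mirroring the standard BOLT 7 layout that lnd's lnwire.ShortChannelID also uses:

```go
package sketch

// scidParts splits a short channel id into the block height, transaction
// index and output index of the funding output.
func scidParts(scid uint64) (blockHeight, txIndex uint32, outputIndex uint16) {
	blockHeight = uint32(scid >> 40)
	txIndex = uint32(scid>>16) & 0xFFFFFF
	outputIndex = uint16(scid)
	return blockHeight, txIndex, outputIndex
}
```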

@saubyk removed this from the v0.16.0 milestone on Aug 27, 2022
@joostjager force-pushed the lookup-htlc branch 2 times, most recently from 923633e to 9339943 on October 26, 2022 12:57
@joostjager (Contributor, Author):

> And making this opt-in can complicate things. So I guess I'm not sure, tho leaning towards making it opt-in, at least give users an option.

Given the advanced state of this PR, would you be okay with discussing and implementing the opt-in in another PR, possibly combined with delete functionality?

@yyforyongyu (Collaborator) left a comment:

> Given the advanced state of this PR, would you be okay with discussing and implementing the opt-in in another PR, possibly combined with delete functionality?

Sure, let's continue there then. Again good work!

channeldb/db.go Outdated

finalHtlcsBucket := tx.ReadBucket(finalHtlcsBucket)
if finalHtlcsBucket == nil {
        return nil, kvdb.ErrBucketNotFound
}
Collaborator:

nit: this should be a defined error, something like ErrNoChanDBExists, otherwise we won't know which bucket is missing by looking at the error alone.

@joostjager (Author):

The caller is wrapping it:

return fmt.Errorf("cannot fetch final htlcs bucket: %w", err)

Do you think that is insufficient?

Collaborator:

That's sufficient for the caller's caller, not the method itself. So whoever calls LookupFinalHtlc understands it, but another method using fetchFinalHtlcsBucket won't, unless we wrap it there again. Plus I won't be able to tell whether it's the root bucket or the nested bucket that is missing here. This is a nit, but may give us a headache in the long run.

@joostjager (Author) commented Oct 26, 2022:

Yes, so this is getting back to #6517 (comment). Does every error need to contain an indication of the function that generated it? I am not sure if this is done generally.

@joostjager (Author):

I see there is also a problem with the pr currently, because in rpcserver.go, I still match against ErrHtlcUnknown. Bubbling up those very granular errors also requires them to be matched higher up, which can be considered a layering violation.

@joostjager (Author):

I did add specific errors for the two missing bucket cases, replacing kvdb.ErrBucketNotFound. Btw, the linter forcing errors.Is in switch statements doesn't feel completely right to me.

Collaborator:

> It looks like the inverse of how %w is commonly used. Don't you think that there is a limit to how much low-level error information you can carry across package boundaries? I'd think that logging may be a better way to keep the layers separated cleanly.

That's also true. I don't like the error chain becoming too long, and logging is definitely very helpful. But it seems like we don't do any logging in channel.go. As for the new pattern, I think we need to gather more opinions. Also I'll try it out to see if it's really better in my other PRs.

> Btw, the linter forcing errors.Is in switch statements doesn't feel completely right to me.

I get it. But I just tell myself errors.Is is just like == with the ability to compare wrapped errors.
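To make the comparison with == concrete, a small standalone example: plain equality only matches the outermost error value, while errors.Is walks the %w wrap chain, which is what matters once the bucket error gets wrapped by callers such as LookupFinalHtlc.

```go
package main

import (
	"errors"
	"fmt"
)

var errBucketNotFound = errors.New("bucket not found")

func fetchBucket() error {
	// Low-level failure.
	return errBucketNotFound
}

func lookup() error {
	if err := fetchBucket(); err != nil {
		// Wrapping with %w preserves the underlying error.
		return fmt.Errorf("cannot fetch final htlcs bucket: %w", err)
	}
	return nil
}

func main() {
	err := lookup()

	// Plain equality fails because err is the wrapping error, not the
	// sentinel itself.
	fmt.Println(err == errBucketNotFound) // false

	// errors.Is unwraps the chain and still matches the sentinel.
	fmt.Println(errors.Is(err, errBucketNotFound)) // true
}
```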

@joostjager (Author):

My personal opinion is that errors.Is everywhere does not always make it better. Also, for the common check if ctx.Err() == context.Canceled, it feels a bit weird. Almost like you don't really know what you're checking and instead use a catch-all to be sure. But ofc a minor nit in the grand scheme of things.

For this PR, shall I leave it as is now?

Collaborator:

Yeah I think it's good enough! Thanks for the work and discussion, it's been very helpful to me!

@joostjager force-pushed the lookup-htlc branch 2 times, most recently from cf8b018 to 642ee7e on October 27, 2022 06:53
@joostjager (Contributor, Author) commented Oct 27, 2022

Removed dev cli command. Saved here: bottlepay@5292416

@casey-bowman commented Nov 6, 2022

I'd like to clarify what to expect in the API of lnd 0.16 regarding this issue. I saw @joostjager's tweet:

> To address this, lnd 0.16 will contain a new api that provides insight into the final resolution of htlcs. It's another step in the maturation process of lightning. Use it to your advantage.

I'm currently using Alex Bosworth's lightning library, checking for settlement on an invoice subscription. Is there or will there be an event in the invoices subscription stream from lnd for final settlement in v0.16?

@joostjager (Contributor, Author) commented Nov 7, 2022

> Is there or will there be an event in the invoices subscription stream from lnd for final settlement in v0.16?

Currently there is not. Added steps for how to do it at the moment to the pr description.

@JssDWt (Contributor) commented Jan 16, 2023

@joostjager I seem to end up in this piece of code here for every new block for a channel that's still open. The htlc is apparently not handled.

This PR appears to fix the problem by finalizing sub dust incoming htlcs. So I'm considering applying this PR. But the following comment suggests that the channel should already be closed: https://github.com/lightningnetwork/lnd/pull/6517/files#diff-a0b8064876b1b1d6085fa7ffdbfd38c81cb06c1ca3f34a08dbaacba203cda3ebR2227-R2228

Do you have an idea whether it will be problematic that the channel is still open when handling the HtlcIncomingDustFinalAction?

@joostjager (Contributor, Author):

> I seem to end up in this piece of code here for every new block for a channel that's still open.

What is it that you try to do exactly? I could use a bit more context to answer the question.

If the channel is still open and there are no htlcs at risk (or htlcs at risk are dust), there is no action to be taken for the channel arbitrator.

@JssDWt (Contributor) commented Jan 16, 2023

> If the channel is still open and there are no htlcs at risk (or htlcs at risk are dust), there is no action to be taken for the channel arbitrator.

@joostjager I've got expired htlcs, some are even 520 days old. I get "go to chain for incoming htlc" logs for these htlcs on every new block. So it seems these htlcs are never resolved. Then I noticed that these HTLCs never get any action assigned if they're sub dust. So it never goes to chain, but also never ends up in a final state. I currently have the suspicion that many of these sub dust incoming htlcs are clogging up the work LND is doing and that this is causing new htlcs to expire. Hence I thought applying this PR, where the htlcs do end up in a final state, could help.

@joostjager (Contributor, Author) commented Jan 16, 2023

I don't think that this PR is going to help with that. It only adds reporting of the on-chain resolution to the final resolution table, but doesn't change the actual resolution / mark as complete process itself.

@JssDWt (Contributor) commented Jan 16, 2023

> I don't think that this PR is going to help with that. It only adds reporting of the on-chain resolution to the final resolution table, but doesn't change the actual resolution / mark as complete process itself.

Thanks for the quick reply.

Labels: htlcswitch, invoices, safety
Successfully merging this pull request may close these issues: Reliable HTLC outcome lookup