htlcswitch: final settle signal #6517
Conversation
Force-pushed from 9c0c0d9 to 98bde69.
Concept ACK. Is there perhaps any unconsidered complexity for situations where multiple HTLCs exist in the commitment transaction for a single channel? Or would the invoice status not depend at all on any other HTLCs in the channel?
There should not be such complexity. The code as is should record all final settlements of htlcs, regardless of whether they are grouped together on the same commitment.
For a multi-part payment, the invoice status could depend on multiple htlcs in the same channel. But the scope of this PR does not include invoices at all. To determine whether a received payment is final, one should retrieve all the settled htlcs from the invoice database and then look them up one by one, for example via the API added in this PR, to see if the settlements are final. Obviously this can be streamlined later on for invoices specifically.
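Since the PR leaves invoices untouched, the glue described above (fetch the invoice's settled htlcs, then check each one for finality) would live in the application for now. Below is a minimal, hypothetical sketch of that check in Go; the HtlcKey type and the lookup callback are placeholders standing in for the LookupHtlc RPC, not this PR's actual types.

package main

import "fmt"

// HtlcKey identifies an accepted htlc by channel and htlc index. The field
// names are illustrative assumptions.
type HtlcKey struct {
	ChanID    uint64
	HtlcIndex uint64
}

// lookupFinal stands in for the LookupHtlc RPC added in this PR: it reports
// whether the settle for the given htlc has been fully locked in.
type lookupFinal func(HtlcKey) (bool, error)

// invoiceFullySettled checks every settled htlc of a (possibly multi-part)
// invoice against the final-settlement index. Only if all htlcs are final can
// the received payment be considered final.
func invoiceFullySettled(htlcs []HtlcKey, lookup lookupFinal) (bool, error) {
	for _, h := range htlcs {
		final, err := lookup(h)
		if err != nil {
			return false, fmt.Errorf("lookup htlc %v: %w", h, err)
		}
		if !final {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	// Fake lookup for demonstration: pretend every htlc is already final.
	allFinal, _ := invoiceFullySettled(
		[]HtlcKey{{ChanID: 123, HtlcIndex: 0}, {ChanID: 456, HtlcIndex: 2}},
		func(HtlcKey) (bool, error) { return true, nil },
	)
	fmt.Println("invoice finally settled:", allFinal)
}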
Force-pushed from e03c8d1 to 0be85ca.
Waiting for design ack for the direction set out in this PR, in particular storing the settled htlc ids in a new bucket.
I think this patch shouldn't use the shortchanid, and should instead use the funding outpoint - we shouldn't add more code that needs reorg handling. I also think there should be benchmarks for lookup latency with N settles over M channels. In the revocation PR, you brought up the idea of looking at the resolvers instead of doing this. Why would you need this if you weren't the exit hop? Maybe this is because I don't understand the htlc interceptor settle flow where you intercept and settle back, already knowing the preimage. Another case is maybe you want to see if you've lost money as an intermediate node due to not being able to claim on the incoming? Another thought is: could this use some sort of hash accumulator? It seems that you're only performing lookups on the set... Not sure of a golang library that can help here.
Are reorgs even supported generally for channels that are fully confirmed? But I get the point. I use the short channel id because we also use that in the htlc interception API. These two APIs are related and can be used together. Short chan id may also not have been the best decision for htlc interception? I could convert the PR to outpoints and then leave it up to the user to do the mapping from short chan ids if necessary?
What scenario are you thinking of specifically when performance would be critical? LND internally doesn't do any lookups, it would just be load generated by a connected application.
Linking the resolver idea here: #6347 (comment). I think just looking at the resolvers is not sufficient, because you don't get a signal when an htlc settles off-chain in the happy flow. Without that signal, an application still doesn't know when the payment truly cleared. One scenario where you need this as a routing node is if you use the htlc interceptor to create virtual channels. We use it with lnmux. I believe LSPs for mobile users have similar requirements for just-in-time opened channels. And indeed, more generally, for exact accounting on a routing node you probably want to know if you were able to claim all your incoming forwards. For that though, the resolver information might be sufficient as it is not time critical. Maybe there is an edge case if the counterparty doesn't go to chain (yet) to time out the htlc.
I'm not familiar with hash accumulators, but isn't that a probabilistic structure that can lead to false positives? False positives wouldn't be good for this application.
No, my point is that (I think) eventually we'll be moving away from SCIDs to identify a channel, so it may not make sense to make another place where that's done.
I don't think you have a choice, the switch is pretty tied to SCIDs at the moment.
What about the rpcserver call to check if something is settled? Additionally, some nodes will be inserting frequently into this global settled bucket so that's where the concern comes from. I'm basically wondering if a large bucket here makes insertion/lookups very slow due to bbolt traversing the b+ tree and having to cache a lot of leaves.
I was thinking of something like utreexo, which has inclusion proofs, but after glancing at the paper, that doesn't actually help with storage here.
Can you get confirmation on that before I convert this PR?
My thinking here is that the single db lookup should be negligible compared to all the updates that are required to just lock in and settle an htlc on the channel. It can also be a batch lookup for more efficiency. The insert itself is part of an existing transaction, so at least that wouldn't add another write lock. I can do some benchmarking once the shape of this PR is more final. Linking related PR lightninglabs/neutrino#213, which @Roasbeef brought up in the review club.
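For reference, a benchmark along the lines suggested above could look roughly like the sketch below. The bucket name, the key layout (chanID || htlcIndex, big-endian), and the table sizes are illustrative assumptions rather than the schema used in this PR; it only measures single-key bbolt lookups after bulk-populating one bucket.

package channeldb_test

import (
	"encoding/binary"
	"testing"

	"go.etcd.io/bbolt"
)

// BenchmarkFinalHtlcLookup measures lookup latency with N settled htlcs
// spread over M channels, all stored in a single bucket.
func BenchmarkFinalHtlcLookup(b *testing.B) {
	db, err := bbolt.Open(b.TempDir()+"/bench.db", 0600, nil)
	if err != nil {
		b.Fatal(err)
	}
	defer db.Close()

	const numChannels, htlcsPerChannel = 100, 1000

	// Keys are chanID || htlcIndex, both big-endian.
	key := func(chanID, htlcIndex uint64) []byte {
		k := make([]byte, 16)
		binary.BigEndian.PutUint64(k[:8], chanID)
		binary.BigEndian.PutUint64(k[8:], htlcIndex)
		return k
	}

	// Populate the bucket with M*N settled htlcs before timing lookups.
	err = db.Update(func(tx *bbolt.Tx) error {
		bucket, err := tx.CreateBucketIfNotExists([]byte("final-htlcs"))
		if err != nil {
			return err
		}
		for c := uint64(0); c < numChannels; c++ {
			for h := uint64(0); h < htlcsPerChannel; h++ {
				if err := bucket.Put(key(c, h), []byte{1}); err != nil {
					return err
				}
			}
		}
		return nil
	})
	if err != nil {
		b.Fatal(err)
	}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_ = db.View(func(tx *bbolt.Tx) error {
			bucket := tx.Bucket([]byte("final-htlcs"))
			_ = bucket.Get(key(uint64(i)%numChannels, uint64(i)%htlcsPerChannel))
			return nil
		})
	}
}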
Yes, I confirmed it.
I am concerned about the insert here.
@Crypt-iQ if you talk about the funding outpoint, do you mean txid:index, or the "compressed" 32-byte format that XORs the index into the txid? I don't think we use the 32-byte format on the rpc interface anywhere yet.
The outpoint, not the channel id (which is the 32-byte representation).
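For context, the 32-byte "compressed" format referred to here is the BOLT #2 channel_id: the funding txid with the big-endian funding output index XORed into its last two bytes. A tiny illustrative sketch of that derivation (not code from this PR or from lnwire):

package main

import (
	"encoding/hex"
	"fmt"
)

// channelID derives the 32-byte channel id from a funding outpoint by XORing
// the big-endian output index into the last two bytes of the funding txid.
func channelID(fundingTxid [32]byte, outputIndex uint16) [32]byte {
	cid := fundingTxid
	cid[30] ^= byte(outputIndex >> 8)
	cid[31] ^= byte(outputIndex)
	return cid
}

func main() {
	var txid [32]byte // all-zero txid, purely for demonstration
	cid := channelID(txid, 1)
	fmt.Println(hex.EncodeToString(cid[:])) // ends in ...0001
}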
Another thing to be aware of is that we need to keep the mapping around for closed channels too. For example when you…
Force-pushed from 923633e to 9339943.
Given the advanced state of this PR, would you be okay with discussing and implementing the opt-in in another PR, possibly combined with delete functionality?
Sure, let's continue there then. Again good work!
channeldb/db.go (Outdated):

	finalHtlcsBucket := tx.ReadBucket(finalHtlcsBucket)
	if finalHtlcsBucket == nil {
		return nil, kvdb.ErrBucketNotFound
	}
nit: this should be a defined error, something like ErrNoChanDBExists, otherwise we won't know which bucket is missing by looking at the error alone.
The caller is wrapping it:
return fmt.Errorf("cannot fetch final htlcs bucket: %w", err)
Do you think that is insufficient?
That's sufficient for the caller's caller, not the method itself. So whoever calls LookupFinalHtlc understands it, but another method using fetchFinalHtlcsBucket won't, unless we wrap it there again. Plus I won't be able to tell whether it's the root bucket that is missing or the nested bucket. This is a nit, but it may give us headaches in the long run.
Yes, so this is getting back to #6517 (comment). Does every error need to contain an indication of the function that generated it? I am not sure if this is done generally.
I see there is also a problem with the PR currently, because in rpcserver.go I still match against ErrHtlcUnknown. Bubbling up those very granular errors also requires them to be matched higher up, which can be considered a layering violation.
I did add specific errors for the two missing bucket cases, replacing kvdb.ErrBucketNotFound. Btw, the linter forcing errors.Is in switch statements doesn't feel completely right to me.
It looks like the inverse of how %w is commonly used. Don't you think that there is a limit to how much low-level error information you can carry across package boundaries? I'd think that logging may be a better way to keep the layers separated cleanly.
That's also true. I don't like the error chain becoming too long, and logging is definitely very helpful. But it seems like we don't do any logging in channel.go. As for the new pattern, I think we need to gather more opinions. I'll also try it out in my other PRs to see if it's really better.
Btw, the linter forcing errors.Is in switch statements doesn't feel completely right to me.
I get it. But I just tell myself errors.Is is just like == with the ability to compare wrapped errors.
My personal opinion is that errors.Is everywhere does not always make it better. Also for the common check if ctx.Err() == context.Canceled it feels a bit weird. Almost like you don't really know what you're checking and instead use a catch-all to be sure. But of course a minor nit in the grand scheme of things.
For this PR, shall I leave it as is now?
Yeah I think it's good enough! Thanks for the work and discussion, it's been very helpful to me!
Definitely not in love with errorlint: https://github.com/lightningnetwork/lnd/pull/7067/files#r1007218950
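To make the error-handling thread above concrete, here is a small self-contained sketch. The sentinel names are placeholders rather than the identifiers added in this PR; it shows how a defined error per missing bucket, wrapped with %w, can still be told apart with errors.Is where a plain == comparison cannot:

package main

import (
	"errors"
	"fmt"
)

// Placeholder sentinels for the two missing-bucket cases discussed above;
// the names used in the PR itself may differ.
var (
	ErrFinalHtlcsBucketNotFound   = errors.New("final htlcs bucket not found")
	ErrFinalChannelBucketNotFound = errors.New("final htlcs channel bucket not found")
)

// lookupFinalHtlc simulates the lookup path: the low-level error is wrapped
// with %w so callers still learn which bucket was missing.
func lookupFinalHtlc(haveRootBucket, haveChanBucket bool) error {
	if !haveRootBucket {
		return fmt.Errorf("cannot fetch final htlcs bucket: %w",
			ErrFinalHtlcsBucketNotFound)
	}
	if !haveChanBucket {
		return fmt.Errorf("cannot fetch final htlcs bucket: %w",
			ErrFinalChannelBucketNotFound)
	}
	return nil
}

func main() {
	err := lookupFinalHtlc(true, false)

	// A plain == comparison misses the wrapped sentinel...
	fmt.Println(err == ErrFinalChannelBucketNotFound) // false

	// ...while errors.Is unwraps and matches it, so a switch built on
	// errors.Is can distinguish the two missing-bucket cases.
	fmt.Println(errors.Is(err, ErrFinalChannelBucketNotFound)) // true
	fmt.Println(errors.Is(err, ErrFinalHtlcsBucketNotFound))   // false
}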
Force-pushed from cf8b018 to 642ee7e.
Force-pushed from 642ee7e to 5292416.
Removed dev cli command. Saved here: bottlepay@5292416
Force-pushed from 5292416 to 3a89a84.
I'd like to clarify what to expect in the API of lnd 0.16 regarding this issue. I saw @joostjager's tweet.
I'm currently using Alex Bosworth's lightning library, checking for settlement on an invoice subscription. Is there or will there be an event in the invoices subscription stream from lnd for final settlement in v0.16?
Currently there is not. I added steps to the PR description for how to do it at the moment.
@joostjager I seem to end up in this piece of code here for every new block for a channel that's still open. The htlc is apparently not handled. This PR appears to fix the problem by finalizing sub-dust incoming htlcs, so I'm considering applying it. But the following comment suggests that the channel should already be closed: https://github.com/lightningnetwork/lnd/pull/6517/files#diff-a0b8064876b1b1d6085fa7ffdbfd38c81cb06c1ca3f34a08dbaacba203cda3ebR2227-R2228 Do you have an idea whether it will be problematic that the channel is still open when handling the HtlcIncomingDustFinalAction?
What is it that you're trying to do exactly? I could use a bit more context to answer the question. If the channel is still open and there are no htlcs at risk (or the htlcs at risk are dust), there is no action to be taken by the channel arbitrator.
@joostjager I've got expired htlcs, some are even 520 days old. I get…
I don't think that this PR is going to help with that. It only adds reporting of the on-chain resolution to the final resolution table, but doesn't change the actual resolution / mark-as-complete process itself.
Thanks for the quick reply.
This PR adds a new final settle state for incoming htlcs. An htlc reaches this state after the update_settle_htlc update has been fully locked in, off-chain or on-chain.

A new rpc LookupHtlc is added that can be used to check whether an htlc reached the new state.

Additionally an event is added to the existing SubscribeHtlcEvents stream. This allows users to wait for final settlement without polling LookupHtlc. It is important to have a non-polling mechanism because waiting for final settlement is likely to be part of the critical path of a lightning payment.

There is no FINAL_SETTLE state on the invoice level yet. To use this api to wait for final settlement of an invoice, the following steps can be used:
1. Subscribe to SubscribeHtlcEvents.
2. Wait for a SubscribedEvent to be received. Otherwise we may not really be subscribed yet. It is important to do this before looking up htlcs to avoid a race condition.
3. Look up the invoice's settled htlcs one by one via LookupHtlc (a minimal client sketch of these steps follows below).
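To make those steps concrete, here is a minimal, hypothetical client sketch. The event and lookup types below are placeholders rather than lnd's generated gRPC types; the sketch only illustrates the subscribe-first ordering and the lookup-then-wait flow, on the assumption that already-final settles are learned via the lookup and later ones via the event stream.

package main

import "fmt"

// Placeholder event types standing in for the SubscribeHtlcEvents stream;
// the real lnd types differ.
type event interface{}

type subscribedEvent struct{}

type finalHtlcEvent struct {
	ChanID    uint64
	HtlcIndex uint64
	Settled   bool
}

// waitForFinalSettle follows the ordering from the steps above: first consume
// events until the SubscribedEvent arrives, then do the LookupHtlc-style
// lookup, and only then wait for a matching final-settle event. This avoids
// the race where an htlc becomes final between the lookup and the start of
// the subscription.
func waitForFinalSettle(events <-chan event, lookup func(chanID, idx uint64) bool,
	chanID, idx uint64) bool {

	// 1. Wait until the subscription is confirmed.
	for ev := range events {
		if _, ok := ev.(subscribedEvent); ok {
			break
		}
	}

	// 2. The htlc may already be final; check via the lookup RPC.
	if lookup(chanID, idx) {
		return true
	}

	// 3. Otherwise wait for the matching final settle event.
	for ev := range events {
		if fe, ok := ev.(finalHtlcEvent); ok &&
			fe.ChanID == chanID && fe.HtlcIndex == idx && fe.Settled {

			return true
		}
	}
	return false
}

func main() {
	// Simulated event stream: subscription confirmation, then a final settle.
	events := make(chan event, 2)
	events <- subscribedEvent{}
	events <- finalHtlcEvent{ChanID: 1, HtlcIndex: 7, Settled: true}
	close(events)

	ok := waitForFinalSettle(events, func(uint64, uint64) bool { return false }, 1, 7)
	fmt.Println("finally settled:", ok)
}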
More background info and design discussion: https://groups.google.com/a/lightning.engineering/g/lnd/c/Sfy_M9avQoU
Fixes #6208