Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[htlcnotifier 3/4]: Add HTLC Notifier #3781

Merged
merged 10 commits into from
Feb 20, 2020

Conversation

carlaKC
Copy link
Collaborator

@carlaKC carlaKC commented Dec 2, 2019

This PR is change 3 of 4 proposed to add a htlc event notifier to lnd.
1: #3843
2: #3844
3: This PR
4: #3848

Since the PR is dependent on the preceding PRs, it has been rebased on #3844.
Only the last 9 commits are eligible for review.


This PR adds a htlcnotifier to the htlcswitch which provides a stream of notifications with the following types of events:

  • Forwards: when we forward a HTLC or initiate a send from our node
  • Forward Failures: when a HTLC we forwarded fails down the line
  • Settles: when a HTLC we forwarded is settled, or we receive a payment
  • Link Failures: when we fail a HTLC on our incoming or outgoing link

@carlaKC carlaKC added htlcswitch rpc Related to the RPC interface v0.9.0 labels Dec 2, 2019
@carlaKC carlaKC added HTLC P3 might get fixed, nice to have labels Dec 2, 2019
@carlaKC carlaKC removed the request for review from cfromknecht December 2, 2019 07:26
lntest/itest/lnd_test.go Outdated Show resolved Hide resolved
Copy link
Contributor

@joostjager joostjager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass done, I like the direction this is going. My two main thoughts:

  • Push the refactor of ForwardingError further. Get rid of ExtraMsg completely and make setting FailureDetail mandatory when instantiating the ForwardingError, possibly by creating a constructor function that takes both FailureMessage and FailureDetail. Otherwise it remains a bit loose and bugs may sneak in.
  • Check carefully whether the notification interface is consistent. I am a little worried about mixing up fail+settle (link level) with forwarding (two links involved), but maybe there is no reason to be worried. I think that going through the various scenarios and validating that the produced event(s) are logical will help (if you haven't already done so)

htlcswitch/switch.go Outdated Show resolved Hide resolved
htlcswitch/failure.go Outdated Show resolved Hide resolved
htlcswitch/htlcnotifier/htlcnotifier.go Outdated Show resolved Hide resolved
htlcswitch/link.go Outdated Show resolved Hide resolved
@carlaKC
Copy link
Collaborator Author

carlaKC commented Dec 2, 2019

Running through some different scenarios, as suggested by @joostjager, to make sure we can distinguish between different types of events.

✅ = Value set ; ❌ = Value not set

Forward Events: Ok

Forward Event Incoming Key Outgoing Key
Send
Forward

Forward Failure: Ok

Forward Event Incoming Key Outgoing Key
Send
Forward

Settle Event: Potential overlap

Settle Event Incoming Key Outgoing Key
Own Receive
Own Send
Forward

In the ideal case, we anything with a zero outgoing key belonging to us then check for matching forward events to determine whether the settle is for a send or receive. However, we will not have a forward event for sends which were forwarded before restart and settled afterwards.

Link Failures: Overlap

Link Failure Incoming Key Outgoing Key IncomingFailed Detail
Receive fails true Invoice specific
HODL Cancel true HODL Cancel
Send fails false failure specific
Fwd fails incoming true failure specific
Fwd Fails outgoing false failure specific

For this set of events, there is an overlap issue between forwarding failures which occur on the incoming link and failures paying our own invoices. Receives have a specific set of failure details, which we could use to distinguish, but that is not forward thinking because we may have a failure which is common for receives and forwards.

Solution:
The overlap issues here are mostly due to the inclusion of local sends/receives. A naive solution would be to match on payment hashes from listinvoices/payments, but that is clunky and not suitable in the case of reused payment hashes. Given that we aren't guaranteed to have forward events for local sends (which differentiate them from receives), I would say that a type field with forward, send and receive would be the simplest way to go? This will solve all the overlap issues that the current event structs have.

htlcswitch/switch_test.go Outdated Show resolved Hide resolved
@joostjager
Copy link
Contributor

With regards to the event sequences, would it make sense to have only a single event per htlc as it is handled by our system? So a forward will generate just a single even, even though there are an incoming and an outgoing link involved. Or is that what you suggest with the send/receive/fwd event type?

@carlaKC
Copy link
Collaborator Author

carlaKC commented Dec 3, 2019

With regards to the event sequences, would it make sense to have only a single event per htlc as it is handled by our system? So a forward will generate just a single even, even though there are an incoming and an outgoing link involved.

As is we have a single event for a forward, which has the information for the incoming and outgoing links in HTLCInfo. Running through scenarios is a great idea, this is how events will be generated for the scenarios you've provided:

Incoming exit hop htlc rejected
Link failure {Incoming link key, rejection reason, incomingLink = true}

Incoming exit hop htlc settled
Settle {Incoming link key, payment hash}

Incoming exit hop htlc accepted but not yet resolved (hodl invoice), then settled
No event on acceptance
Settle {Incoming link key, payment hash}

Incoming exit hop htlc accepted but not yet resolved (hodl invoice), then rejected
No event on acceptance
Link failure {Incoming link key, fail detail = hodlCancel, incomingLink = true}

Outgoing origin htlc (our payment) not handed out (didn't even leave the link)
Link failure {Incoming & outgoing link key, failure detail, incomingLink = false}

Outgoing origin htlc (our payment) failed back by peer
Forward {Incoming & outgoing key, payment hash}
Forward Failure {Incoming & outgoing key, payment hash}

Outgoing origin htlc (our payment) settled
Forward {Incoming & outgoing key, payment hash}
Settle {Incoming link key, payment hash} + TBD: would set type = receive

Incoming htlc to forward not accepted (assuming this is on incoming link?):
Link failure {Incoming key, fail detail, incomingLink = true}

Incoming htlc to forward accepted, but rejected by outgoing link
Link failure {Incoming key, fail detail, incomingLink = false}

Htlc forwarded, but failed back by downstream peer
Forward {Incoming & outgoing key, payment hash}
Forward Failure {Incoming & outgoing key, payment hash}

Htlc forwarded and settled
Forward {Incoming & outgoing key, payment hash}
Settle {Incoming link key, payment hash}

Or is that what you suggest with the send/receive/fwd event type?

A type enum would go in the settle events to distinguish our own sends being settled from our receives (since they have the same data right now). Unrelated to the four main event types. I will update the PR with this today, which will hopefully clarify what I mean by it.

@joostjager
Copy link
Contributor

Ok, nice overview. I see that there are multiple events fired for a single forward, one for the forward-going path and one for the backward path. This is also the case for an outgoing origin htlc. For incoming there is only one event. Maybe it is more consistent to always do two events and cover the hodl invoice accept too? So one event that describes what is requested from us by the htlc metadata and the second event what happened to the request.

Btw, don't forget the on-chain resolution flow where htlc decision are made too.

htlcswitch/htlcnotifier/htlcnotifier.go Outdated Show resolved Hide resolved
htlcswitch/htlcnotifier/htlcnotifier.go Outdated Show resolved Hide resolved
htlcswitch/htlcnotifier/htlcnotifier.go Outdated Show resolved Hide resolved
htlcswitch/htlcnotifier/htlcnotifier.go Outdated Show resolved Hide resolved
htlcswitch/htlcnotifier/htlcnotifier.go Outdated Show resolved Hide resolved
htlcswitch/link.go Outdated Show resolved Hide resolved
htlcswitch/failure.go Outdated Show resolved Hide resolved
htlcswitch/htlcnotifier/htlcnotifier.go Outdated Show resolved Hide resolved
htlcswitch/link.go Outdated Show resolved Hide resolved
htlcswitch/link.go Outdated Show resolved Hide resolved
@joostjager joostjager added this to the 0.9.0 milestone Dec 3, 2019
@carlaKC
Copy link
Collaborator Author

carlaKC commented Dec 4, 2019

PR needs some work adding on chain events, not ready for review right now.

@manreo
Copy link
Contributor

manreo commented Dec 8, 2019

as specified here: #3420

@carlaKC
Copy link
Collaborator Author

carlaKC commented Dec 11, 2019

make setting FailureDetail mandatory when instantiating the ForwardingError, possibly by creating a constructor function that takes both FailureMessage and FailureDetail. Otherwise it remains a bit loose and bugs may sneak in.

This is a little tricky because the point of FailureDetail was for it to be optional (otherwise we end up with duplicate messages to those in lnwire), and there are some legacy cases where we don't even set a failure message (SphinxErrorDecrypter.DecryptError).

One approach for this would be to implement an internal error interface htlcSwitch.linkError which will map to wire messages with somegetWireMessage which does not implement lnwire.FailureMessage itself, so there are no accidental overlaps. You suggested something of this sort in an earlier comment?

Check carefully whether the notification interface is consistent.

Updated in this version. There is some ambiguity for certain events (eg an incoming link failure for a forward and an invoice look the same), so I have gone with a event type enum which indicates whether an event is linked to a send/receive/forward. This makes it clear to the user, and is easily implemented because forwards and our own sends/receives are mostly handled by different code.

@carlaKC
Copy link
Collaborator Author

carlaKC commented Dec 11, 2019

Requesting review on this, although I suspect there's more discussion to be had around ForwardingError, so after a conceptual pass at this stage rather than in depth review.

htlcswitch/htlcnotifier.go Outdated Show resolved Hide resolved
htlcswitch/htlcnotifier.go Outdated Show resolved Hide resolved
HtlcKey: newHtlcKey(pkt),
HtlcInfo: newHtlcInfo(pkt, hash, getEventType(pkt)),
Timestamp: time.Now(),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is much cleaner now that you keep the packet in the link/switch.

htlcswitch/htlcnotifier.go Show resolved Hide resolved
htlcswitch/link.go Outdated Show resolved Hide resolved
}

assertCircuit(t, zeroCircuit, settle.IncomingCircuit)
assertCircuit(t, aliceToBob, settle.OutgoingCircuit)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some method extraction is possible to reduce duplication

// checkSuccessful is a function that checks that Alice, Bob and Carol
// have the correct set of event for a payment that is successfully
// routed over Alice -> Bob -> Carol.
func checkSuccessful(t *testing.T, alice, bob, carol <-chan interface{},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is quite a bit of assertion code here. Can't this test be written by just having three (one for every node) lists of fully-specified events that are expected? The actual events are then just read from the channels and deep-equaled with the expectations.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched over to a set of results, still quite a lot of setup code, but I think it's better. Less duplication at least.

@carlaKC carlaKC force-pushed the 3420-htlcForwarding branch 3 times, most recently from dc28cdf to 5bd23a2 Compare February 18, 2020 17:13
// SettleEvent represents a htlc that was settled.
type SettleEvent struct {
// HtlcKey uniquely identifies the htlc.
HtlcKey
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No HtlcInfo here?

// because errors returned down the route are encrypted.
type ForwardingFailEvent struct {
// HtlcKey uniquely identifies the htlc.
HtlcKey
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No HtlcInfo here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasnt't it also suggested earlier to merge HlcJey and HtlcInfo?

case *lnwire.UpdateFailHTLC:
s.cfg.HtlcNotifier.NotifyForwardingFailEvent(key, eventType)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newline

HtlcEventType: HtlcEventTypeReceive,
Timestamp: ts,
},
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice set of expectations

Copy link
Contributor

@halseth halseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

echoing @joostjager's question about HtlcKey/HtlcInfo, otherwise this looks good to me now :=

// because errors returned down the route are encrypted.
type ForwardingFailEvent struct {
// HtlcKey uniquely identifies the htlc.
HtlcKey
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasnt't it also suggested earlier to merge HlcJey and HtlcInfo?

incomingHTLCID: packet.incomingHTLCID,
circuit: packet.circuit,
linkFailure: failure,
sourceRef: packet.sourceRef,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this change could be made its own commit with an explaining commit message. Reason being this is touching existing code, so it would be nice (IMO) to see it in isolation to deem whether it is safe.

@@ -811,6 +782,47 @@ func (s *Switch) handleLocalDispatch(pkt *htlcPacket) error {
return nil
}

// handleLocalAddHTLC handles the addition of a htlc for a send that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice to do code moves in separate commits 👍

htlcswitch/switch_test.go Outdated Show resolved Hide resolved
carol *HtlcNotifier) serverOption {

return func(aliceServer, bobServer, carolServer *mockServer) {
aliceServer.htlcSwitch.cfg.HtlcNotifier = alice
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we add the htlc notifier by default when creating the mockServers? It should not interfere with existing test cases,

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gets a bit more involved, because we then need to pass mocked time into newThreeHopNetwork (which would probably need to be with functional options like this anyway, so all the other calls don't need to be updated), thread the notifier down to initSwithcWithDB (more functional opts so we don't need to update every caller) and either expand the htlcNotifier interface to include Start/Stop/SubscribeEvents or do a type assertion in the test to get the event subscription. If it's not a blocker, I think I'd rather leave it like this?

@carlaKC
Copy link
Collaborator Author

carlaKC commented Feb 19, 2020

Some clarification on HtlcKey/HtlcInfo: we do not have the information we need for HtlcInfo reliably around for settles/fails, particularly on startup so I thought it would be better to split them out and exclude them entirely from settle and fail events. I see this being used in rate limiting/ performance metrics rather than exact accounting of htlc movement. If people need to look up additional information for events, they can look in forwarding log/payments/invoices.

This commit adds a HTLCNotifier to htlcswitch which HTLC events
will be piped through to provide clients with subscriptions to
HTLC level events.

The event types added are forward events (which occur for sends
from and forwards through our node), forward failues (when a
send or forward fails down the line), settles for forwards or
receives to our node and link failures which occur when a htlc
is failed at our node (which may occur for a send, receive or
foreward).
In this commit, a htlcNotifier interface is added to allow for easy
unit testing. Instances of the HtlcNotifier are added to the server,
switch and link.
This commit sets more fields on the htlcPacket created to fail adding
a htlc packet to the switch for notification purposes. This new data is
copied by value from the original packet. The packet is then failed
back to the peer that forwarded us the packet, which is handled by
handledownstream packet. The values added to the packet are not used
in the handling of a failed packet.
This commit adds notifications for htlcs which are forwarded through
our node. Forwards are notified when the htlc is added on our ougoing
link, settles when we send a settle message to the downstream peer.
If a failure occurs, we check whether it occurred at our node, then
notify a link or forwarding failure accordingly.

Note that this change also adds forward event notifications for sends
which are initiated by our node because the handling code for adding
a htlc which originates from our node is the same as that for handling
forwards. Htlcs for our locally initiated sends have our internal pid
set in the incoming htlcs id field, so we extract this value and notify
with a zero htlc id to be consistent with receives (which have zero
outgoing circuits). Subsequent settles or failures are not noitfied
for local sends in this commit, and will be handled in a follow up.
Split handleLocalDispatch into an extra handleLocalAddHTLC function so
we can easily notify an error should one occur adding the htlc.
Notify link failures for our own payments. Separate handling code is
required for local payment link failures because we do not pass these
failures back through the switch (like we do for link failures for
forwards), but rather send them straight back to the router. Our own
sends have the payment ID saved in the incoming htlc ID of the packet's
incoming circuit. This change replaces that value with for the sake
of consistent notifying of sends and receives from our node.
Add notifications for local initiated sends settles and forwarding
failures. As with link failures, local send settles and forwarding
failures are reported directly to the router so must have their own
notification handling.
This commit adds link failure notifications for failures which occur
on our incoming link. These failures may be receives which we failed or
forwards which we could not parse.
Copy link
Contributor

@joostjager joostjager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has been quite an evolution to design the event structure and get access to all the required data in a clean way, but I think it was worth it. Not only are those events exposed now, but also technical debt was cleaned up along the way. That will benefit further development in this area.

Final question: did you try running a three hop regtest network and sending some payments back and forth manually with log on trace? As a rough sanity check to see that the log messages make sense.

Copy link
Contributor

@halseth halseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very happy about the final outcome of this PR, I agree that the extra effort put into it was worth it in the end. LGTM ✅

@halseth halseth mentioned this pull request Feb 20, 2020
@carlaKC
Copy link
Collaborator Author

carlaKC commented Feb 20, 2020

Thank you for all the review @halseth and @joostjager! The saga is almost complete 🎢

Final question: did you try running a three hop regtest network and sending some payments back and forth manually with log on trace? As a rough sanity check to see that the log messages make sense.

Indeed, I have a very rich local forwarding node after all the iterations of this PR.

Send from Alice -> Bob -> Carol looks like this in logs:

Alice:

Notifying forward event: send over (Chan ID=6160:1:1, HTLC ID=4), outgoing amount: 2000 mSAT, outgoing timelock: 6260

Notifying settle event: send over (Chan ID=6160:1:1, HTLC ID=4)

Bob:

Notifying forward event: forward over (Chan ID=6160:1:1, HTLC ID=4) -> (Chan ID=6172:1:1, HTLC ID=0), incoming amount: 2000 mSAT, incoming timelock: 6260, outgoing amount: 1000 mSAT, outgoing timelock: 6220

Notifying settle event: forward over (Chan ID=6160:1:1, HTLC ID=4) -> (Chan ID=6172:1:1, HTLC ID=0)

Carol:

Notifying settle event: receive over (Chan ID=6172:1:1, HTLC ID=0)

@carlaKC carlaKC merged commit 80af295 into lightningnetwork:master Feb 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
HTLC htlcswitch P3 might get fixed, nice to have
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants