htlcswitch: implement strict forwarding for locally dispatched payments #1527
Conversation
@wpaulino this fixes only the first hop channel, not the rest of the channels. Is that correct?

@federicobond Correct. We might consider adding a flag that enables strict forwarding all throughout.
Force-pushed 71d40bf to 7e00637
The BOLT for Onion Routing mentions a short_channel_id in each hop's payload. Is the specification outdated or am I misreading it? I'm working on a simulation using lnd and need such strict forwarding to compare results.
@federicobond it uses the pubkey as it's possible that we have multiple links to a particular node. In that instance, we know better than the source w.r.t which link we want to actually forward on. AFAIK, we're the only implementation that even supports having multiple channels with a particular peer.
Yes, and I agree it’s the most sensible thing to do for each hop, though I would be more careful with the answer from SendToRoute, which assumes that those are the actual channels taken along the route. Do you think I might be able to provide optional channel hints in the extra data field? Would you be interested in incorporating that changeset upstream using a feature flag? We need it for some rebalancing simulations we are doing at Muun.
> For

I'm not sure what you're referring to here.
For end users, that's true. But our use case is different: we are simulating flows of payments for rebalancing research, and we need those payments to be routed exactly through certain channels. As we were running these simulations, we were thrown off by the response of SendToRoute, because it returns a list of specific channels when those channels may not be the ones that were actually used. I think those channel IDs should be removed from the response, as they can only be guaranteed to be true for the first hop.
We need some way to hint the next hop as to which specific channel should be chosen (the node can be configured to either read or ignore it).
Reads well to me at first read-through!
@@ -608,16 +611,18 @@ func TestSendPaymentErrorPathPruning(t *testing.T) {
 		t.Fatalf("unable to fetch source node pub: %v", err)
 	}

+	roasbeefLuoji := lnwire.NewShortChanIDFromInt(689530843)
Where are these IDs coming from?
From basic_graph.json
😛
Would recommend using genIDs() to ensure we don't reuse short chan ids.
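The genIDs() suggestion amounts to handing each test channel a fresh short channel ID instead of hard-coding constants. A minimal Go sketch of the idea (hypothetical type; lnd's actual genIDs() helper may differ):

```go
package main

import "fmt"

// chanIDGen hands out monotonically increasing IDs so no two test channels
// ever share a short channel ID. This is an illustrative stand-in, not
// lnd's actual genIDs() implementation.
type chanIDGen struct {
	next uint64
}

// genID returns the next unused short channel ID.
func (g *chanIDGen) genID() uint64 {
	g.next++
	return g.next
}

func main() {
	g := &chanIDGen{}
	// Each call yields a distinct ID, so tests can't collide on chan IDs.
	fmt.Println(g.genID(), g.genID(), g.genID())
}
```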
This is different, as the point of the test is to skip the hop between roasbeef and luoji and take an alternate path within the graph in basic_graph.json.
Gotcha, I missed that this test also uses that graph. I had assumed you were reusing those chan ids in a diff test. I think it should be okay then.
Force-pushed 1bae284 to df7a0ec
Nice work! Changes look good to me apart from minor comments in the tests.
@@ -370,7 +371,7 @@ func TestSendPaymentErrorRepeatedFeeInsufficient(t *testing.T) {
 	// We'll also fetch the first outgoing channel edge from roasbeef to
 	// son goku. We'll obtain this as we'll need to to generate the
 	// FeeInsufficient error that we'll send back.
-	chanID := uint64(3495345)
+	chanID := uint64(12345)
similar comment wrt randomizing sids
I am working on the routing retry behaviour in #1706. If a route fails, it may be that edges are either marked as "failed once" or (on second failure) pruned out completely (not attempted anymore for this payment). In the light of the discussion in this PR, it could be useful to also acknowledge in the retry behaviour that the actual channel being used isn't known. It may be better to associate failures not with specific channels, but with the two endpoints (pubkeys) of the channel. So instead of pruning channel 123, we prune the set of channels from node A to node B. (In the case of a single channel, the result is the same.)
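The endpoint-based pruning idea above can be sketched in a few lines of Go. All names here are illustrative, not lnd's actual router types: failures are keyed by the (from, to) pubkey pair, so a second failure prunes every channel between that pair at once:

```go
package main

import "fmt"

// nodePair keys routing failures by a channel's two endpoints rather than by
// a specific short channel ID, since the channel actually used isn't knowable.
type nodePair struct {
	from, to string // node pubkeys; plain strings here for brevity
}

// pruneSet counts directed failures between endpoint pairs.
type pruneSet map[nodePair]int

// recordFailure bumps the failure count for the from->to direction.
func (p pruneSet) recordFailure(from, to string) {
	p[nodePair{from, to}]++
}

// pruned reports whether every channel from one node to the other should be
// skipped for the rest of this payment attempt, mirroring the "failed once,
// then pruned on second failure" behavior described above.
func (p pruneSet) pruned(from, to string) bool {
	return p[nodePair{from, to}] >= 2
}

func main() {
	p := make(pruneSet)
	p.recordFailure("nodeA", "nodeB")
	fmt.Println(p.pruned("nodeA", "nodeB")) // failed once: still eligible
	p.recordFailure("nodeA", "nodeB")
	fmt.Println(p.pruned("nodeA", "nodeB")) // second failure: whole pair pruned
}
```

Whether a single channel or several connect the pair, the bookkeeping is identical, which matches the "(in the case of a single channel, the result is the same)" observation.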
htlcswitch/switch.go (outdated)

@@ -364,8 +365,8 @@ func (s *Switch) SendHTLC(nextNode [33]byte, htlc *lnwire.UpdateAddHTLC,
 	// system and something wrong happened.
 	packet := &htlcPacket{
 		incomingChanID: sourceHop,
+		outgoingChanID: firstHop,
Would move this to be after incomingHTLCID. For clarity, each chanid/htlcid pair corresponds to one circuit key; maintaining those groupings has proved helpful.
Fixed.
Force-pushed df7a0ec to 864abd6
@joostjager most certainly, that's something we've discussed before as it's very related to this PR. Since we don't have certainty about what channel the other hops try, the feedback to the router is lossy (and potentially malicious/incorrect). With sufficient fees, multiple channels between the same peers can be effectively treated as one channel from the source's perspective, so the suggestion to prune all channels connecting the pubkeys makes sense to me. It also shouldn't be an issue if we assume that a node will always try the "best" (or at least a sufficient) channel first, as then there'd be no point retrying if the others are worse. However, this change could still cause us to miss a possible route if they maintain a strict forwarding policy. At the moment, we will try to find the hop with the most bandwidth. I think c-lightning has a max of one channel per peer, so it wouldn't be an issue there. Eclair looks like it abides by the channel requested in the payload, though as a disclaimer, my scala is a little rusty. Just based on that knowledge, it may produce less-than-optimal results when routing through an eclair node that has multiple outgoing channels with the next peer.
Don't we encapsulate which channels to use? If we just specify the nodes, we could end up with problems. Say I have two channels from A to B: I then set the fees of channel 1 to zero, and channel 2 to 5 BTC. If the sender doesn't know which channel it's going to use, wouldn't this be a problem if others tried to route through that node? If the sender is leaving the determination of which channel to use to the hops, couldn't this be exploited in some way, having different fees for different channels?

Multiple channels definitely should not be treated the same. The best way to get liquidity is to open a channel to a peer, and then have them open a channel back. You start out with two channels that are excessively unbalanced, but as a pair they are perfectly balanced. C-lightning allows one channel to be opened from each side, so two channels total between peers.

We don't try to find the hop with the most bandwidth; we try to find one with enough bandwidth that costs the least in fees. And there's no way to tell what the local balance is on a remote node's channel, so you have to try all of them that fit. And from my understanding (which definitely could be wrong), eclair nodes explicitly do not allow "being routed through." But that could just be the phone app; I haven't looked into their desktop app.

As far as pruning an entire route, what exactly do you mean? Do you mean removing all edges along that path from ever being tried again, individually? Or just the route as a collection? How would that affect channels with rapidly shifting balances that just happened to be insufficient that millisecond you tried to send through them, but are fine the next? Is there no way to get feedback on exactly which channel failed?
@ZapUser77 I'd recommend taking a look at the code. All those questions can be answered at a quick glance. All the information is available; it just may not always be bubbled up to the RPC layer. There's a PR to return more accurate failure information back to each peer. The sender always knows which channel it's using. Note that this PR is for local dispatch, so we always traverse the channel specified.
@cfromknecht I opened a new PR with some additional analysis on the pruning question, #1734
"I'd recommend taking a look at the code. All those questions can be answered at a quick glance." I was replying specifically to "Since we don't have certainty about what channel the other hops try", which is directly contradicted by what you're saying now: "The sender always knows which channel it's using." And from the bits of code that I had seen, I believed the sender always knew... but didn't want to be a **** about it.
@ZapUser77 if you're attempting to join discussions concerning code architecture and design tradeoffs, you're expected to do your homework such that you can positively contribute to the conversation. This is open source, resources are scarce, and we can't sit down and explain all prior decisions or manually walk through the code; you're expected to have a certain level of knowledge if you jump into a discussion. The sender always knows. We use source routing.
@joostjager thanks, I will take a look! @ZapUser77 the sender always knows which channels it wants to use. The sender also knows the outgoing channel they actually use because they choose a channel they own and explicitly update that channel with the direct peer. Once the HTLC travels beyond the first channel, the other nodes aren't required to follow the requested channel and there's no way for us to verify which channel they used.
Nope, a node other than your direct peer could lie about which channel it tried to update, or even if it tried at all.
Thank you @cfromknecht. @Roasbeef: I have looked at the code, but no, I haven't read and walked through all 100,000+ lines. Rather than being condescending, perhaps you can just ignore my comments that aren't directed towards you? I'm not asking you to walk me through anything. I know you're busy. And I intentionally do not direct any of my statements to you, as every time you communicate with me, I get this distinct feeling of extreme disdain you have for me. [And if you treat all newcomers to the project like that, that may be why you only have 3-5 people doing 95% of the work.] "The sender always knows. We use source routing." I know that. However, what you are saying directly contradicts what @cfromknecht is saying: "Once the HTLC travels beyond the first channel, the other nodes aren't required to follow the requested channel". Anyway, I'm trying to lighten your workload, not add more to it. It'll take some time for me to work through all the code and understand it fully.
@ZapUser77 if you treat us with respect, we'll do the same towards you. Your veiled jabs at those that actively contribute to the project are unwarranted, and unproductive.
They aren't required, but they can only send the HTLC to the target node. Therefore, they can't deviate beyond the path aside from cancelling or just holding onto the HTLC. Based on the discussion here, I think we're going to go with strict forwarding everywhere, so even when we forward remote HTLCs.
@Roasbeef o_0 If you could point me towards anything that you felt was disrespectful to anyone, I'll apologize for it. "They aren't required." So what Conner said is correct, and your statement "The sender always knows" is incorrect? I was under the impression that the source picked the channels to use, but you're saying it only picks the nodes and 'recommends' channels.
Force-pushed 864abd6 to b24551b
This PR has been amended to enable strict forwarding all throughout.
In this commit, we address an issue that could arise when using the SendToRoute RPC. In this RPC, we specify the exact hops that a payment should take. However, within the switch, we would set a constraint for the first hop to be any hop as long as the first peer was at the end of it. This would cause discrepancies when attempting to use the RPC as the payment would actually go through another hop with the same peer. We fix this by explicitly specifying the channel ID of the first hop.
In this commit, we constrain our packet forwarding so that the packet is _deterministically_ forwarded to the specified outgoing channel; previously, we would attempt to select the best link out of all possible links to the destination peer. This is needed as users begin to manually rebalance their channels with the SendToRoute RPC, expecting their payment to take the _exact_ route they crafted.
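As a rough illustration of the behavioral change this commit describes, here is a hedged Go sketch (toy types, not lnd's actual htlcswitch API): the old path picks whichever link to the peer has the most bandwidth, while strict forwarding resolves only the exact channel named in the route:

```go
package main

import (
	"errors"
	"fmt"
)

// link is a toy stand-in for an htlcswitch channel link to a peer.
type link struct {
	chanID    uint64 // short channel ID
	bandwidth int64  // spendable balance, msat
}

// bestLink models the old behavior: among all links to the destination
// peer, pick the one with the most bandwidth.
func bestLink(links []link) link {
	best := links[0]
	for _, l := range links[1:] {
		if l.bandwidth > best.bandwidth {
			best = l
		}
	}
	return best
}

// strictLink models the new behavior: forward only over the exact channel
// the crafted route names, failing if that link is unavailable.
func strictLink(links []link, chanID uint64) (link, error) {
	for _, l := range links {
		if l.chanID == chanID {
			return l, nil
		}
	}
	return link{}, errors.New("requested outgoing channel not found")
}

func main() {
	peerLinks := []link{
		{chanID: 1, bandwidth: 5_000},
		{chanID: 2, bandwidth: 9_000},
	}
	fmt.Println(bestLink(peerLinks).chanID) // old: highest bandwidth wins
	if l, err := strictLink(peerLinks, 1); err == nil {
		fmt.Println(l.chanID) // new: the exact channel from the route
	}
}
```

With two links to the same peer, the two policies can pick different channels, which is exactly the SendToRoute discrepancy this PR fixes.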
b24551b
to
5f5e44b
Compare
@wpaulino Is this really a good idea? If a forwarding node has multiple channels with varying balances open to the same next hop, the source node needs to try them all out until it finds the one that can carry the payment. To me it looks like sacrificing the performance of normal payments to benefit the specialized operation of rebalancing. Besides, you can never be sure that every node is really forwarding strictly. A routing node might reverse this change to generate fewer failed payments and improve its reputation once there is some kind of reputation management for nodes in lnd. Also, as far as I know, the discussion hasn't been closed on whether rebalancing should be an active operation or can also be carried out passively by manipulating fees. And for rebalancing, how exactly does it hurt to have the forwarding node select the channel? And if it does hurt and makes the operation less efficient, I am quite sure that it is still possible to come up with an algorithm that takes into account the uncertainty of the chosen channel. For execution of payments, a similar challenge exists: how to best process the failure message to maximize the chance of the next route succeeding.
Balancing cannot be performed by fees. Fees only affect 'outgoing' transactions, so there'd be no benefit for channels that you need incoming transactions on. "how exactly does it hurt to have the forwarding node select the channel?" Can they not set the fee of one channel really low to get it to show up as the "cheapest hop", then refuse to use it and instead pick a much more expensive channel to the same node? Is there some mechanism already to prevent that?
The sender will select that cheapest hop, and provide the exact amount required to traverse that link. So all fees are already pre-computed; if one node takes more than advertised, then there won't be enough fees to distribute to the rest of the nodes, so the route will fail downstream.
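The fee pre-computation described here can be sketched as follows; the types and numbers are illustrative, not lnd's actual routing code. Working backwards from the final amount, each hop is offered exactly what it must forward plus its advertised fee, so an over-charging hop leaves downstream hops short:

```go
package main

import "fmt"

// hopPolicy holds the advertised fee policy of one intermediate hop.
// Names are illustrative, not lnd's actual routing types.
type hopPolicy struct {
	baseFeeMsat int64 // flat fee per forwarded HTLC, msat
	feeRatePPM  int64 // proportional fee, parts per million
}

// amountsToForward works backwards from the final payment amount: the
// amount offered to each hop is what it forwards plus exactly the fee it
// advertises. amts[0] is what the sender offers the first hop; the last
// entry is what the destination receives.
func amountsToForward(finalAmtMsat int64, hops []hopPolicy) []int64 {
	amts := make([]int64, len(hops)+1)
	amts[len(hops)] = finalAmtMsat
	for i := len(hops) - 1; i >= 0; i-- {
		fwd := amts[i+1]
		fee := hops[i].baseFeeMsat + fwd*hops[i].feeRatePPM/1_000_000
		amts[i] = fwd + fee
	}
	return amts
}

func main() {
	// Two intermediate hops, each charging 1000 msat base + 1000 ppm.
	hops := []hopPolicy{{1000, 1000}, {1000, 1000}}
	fmt.Println(amountsToForward(1_000_000, hops))
}
```

If a hop keeps more than its advertised fee, the next hop receives less than its entry in this schedule and the HTLC fails downstream, which is the mechanism the comment above relies on.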
It's one piece of the solution. Consider that you can increase fees for a link in order to prevent the bandwidth from being further depleted. Also consider that by decreasing fees on certain links, I can incentivize HTLCs to be routed towards me, thereby giving me higher bandwidth on a select set of links. I tend to agree that it's better network-wide for us to select the best link to our ability for the case of remote HTLCs.
Scenario: P.S. Thank you for being respectful with your reply.
@federicobond there are no actual guarantees w.r.t which link a node will select. As a result, any rebalancing code should itself be written to account for nodes having leeway w.r.t which channel they forward over.
Simplified version (no strict remote) committed as 7535371.
In this PR, we address an issue that could arise when using the
SendToRoute RPC. In this RPC, we specify the exact hops that a payment
should take. However, within the switch, we would set a constraint for
the first hop to be any hop as long as the first peer was at the end of
it. This would cause discrepancies when attempting to use the RPC as the
payment would actually go through another hop with the same peer. We fix
this by explicitly specifying the channel ID of the first hop.
Fixes #1500 and #1515.