htlcswitch: non-strict forwarding via lowest capacity channel by joostjager · Pull Request #4646 · lightningnetwork/lnd

joostjager · 2020-09-24T06:43:47Z

This is a security measure that increases the cost of channel jamming when two nodes have multiple channels open between them with non-equal channel capacities.

By always selecting the lowest capacity channel, an attacker will need to send more htlcs to start tapping into the htlc slots of the bigger (wumbo) channel(s).
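As a rough illustration of the selection rule described above, here is a minimal Go sketch. The `link` type, its fields, and `pickLowestCapacity` are hypothetical names made up for this example; they are not lnd's actual `htlcswitch` types, and the real change may be structured differently.

```go
package main

import "fmt"

// link is a hypothetical stand-in for a channel to the next-hop peer.
// It is NOT lnd's ChannelLink interface.
type link struct {
	id          string
	capacitySat int64 // total channel capacity in satoshis
	eligible    bool  // true if the link's policy accepts this HTLC
}

// pickLowestCapacity returns the eligible link with the smallest capacity.
// The second return value is false when no link is eligible.
func pickLowestCapacity(links []link) (link, bool) {
	var best link
	found := false
	for _, l := range links {
		if !l.eligible {
			continue
		}
		if !found || l.capacitySat < best.capacitySat {
			best, found = l, true
		}
	}
	return best, found
}

func main() {
	links := []link{
		{id: "wumbo", capacitySat: 1000000000, eligible: true},
		{id: "small", capacitySat: 100000000, eligible: true},
	}
	if l, ok := pickLowestCapacity(links); ok {
		fmt.Println("forward over:", l.id)
	}
}
```

With this rule, the slots of the "small" channel have to be exhausted before any HTLC is placed on the "wumbo" channel.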

Crypt-iQ · 2020-09-24T10:39:02Z

I think that a better solution would be to have some sort of pluggable way for a user to specify their non-strict forwarding policy. Then we can have a reasonable default, they can just plug-in different strategies, and we don't have to worry too much about this. I think your approach does make a sane default though.

joostjager · 2020-09-24T11:08:46Z

Yes, a plug-in would be nice via an HtlcInterceptor-like bidirectional stream. Not sure if users have specific requirements for non-strict forwarding for their daily operation though. In any case, this PR is intended as a quick improvement with no downsides as far as I can see.

Crypt-iQ · 2020-09-24T12:13:58Z

But this is only used as a tie-breaker if the channel policies are all the same? Ideally the source node sending the payment would send according to some fee schedules. So the idea is that the intermediate node will charge more for smaller htlcs and less for bigger htlcs, on the assumption that channel jamming won't occur with big htlcs. Perhaps @halseth can illuminate if that last bit didn't translate?

joostjager · 2020-09-24T12:43:58Z

Yes, only as a tie breaker. Using the minimum htlc policy is a different way to force small htlcs through the small channels, but that doesn't seem to be implemented widely yet.

The 'charging more' part may not be so relevant, because with jamming nothing is paid anyway.

Crypt-iQ · 2020-09-24T12:49:45Z

> Yes, only as a tie breaker. Using the minimum htlc policy is a different way to force small htlcs through the small channels, but that doesn't seem to be implemented widely yet.
>
> The 'charging more' part may not be so relevant, because with jamming nothing is paid anyway.

I think channel jamming does come with some risk for the attacker: your funds could get locked up, and your direct channel peers could start throttling your traffic or even close the channel.

My general approach to code changes is to keep them minimal and necessary, especially if they touch parts of the code that could lead to nasty bugs. This change is minimal, but I'm not sure it is necessary. I think I would want to see some spec change that can actually deal with this once and for all.

If a fee scheduling system were widely implemented, or some web-of-trust system (think opt-in rate-limiting feature bits), then this could be completely mitigated. Just my 2 cents on the threat of channel jamming. Ping @alexbosworth for thoughts.

joostjager · 2020-09-24T17:20:27Z

Eclair PR: ACINQ/eclair#1539

alexbosworth · 2020-09-24T17:24:44Z

> If a fee scheduling system were widely implemented, or some web-of-trust system (think opt-in rate-limiting feature bits), then this could be completely mitigated. Just my 2 cents on the threat of channel jamming. Ping @alexbosworth for thoughts.

I'd also like to see a more interactive HTLC interception system that gives you the power to shape where HTLCs execute. Management overlay systems that leverage APIs can react quickly to attacks and can present a diverse culture that is complex to attack. Daemon upgrades are necessarily slower and more of a monoculture, so I think those types of changes need to be considered carefully, and probably across the entire ecosystem, to leverage the strength of a common set of behaviors.

cfromknecht · 2020-09-25T01:32:57Z

@joostjager this is interesting, though it's not clear to me that there are "no downsides" or that this is universally better than the status quo.

Consider nodes A and B with two channels, AB1 and AB2, with capacities 10 BTC and 1 BTC respectively. Assume it costs 20 sats to fill up the slots, and that the channels are mostly unbalanced: A has 20 sats on AB1 and 1 BTC - 20 sats on AB2. It's easy to imagine this scenario occurring shortly after AB1 was funded by B and AB2 was funded by A.

Now, the proposed rule says that A should forward malicious HTLCs over AB2 rather than AB1, since AB2 has less capacity. However, this inherently locks up all but 20 sats of A's outbound capacity. Had A forwarded the HTLCs over AB1, it would still have 1 BTC of outbound.

If B is forwarding HTLCs, the proposed rule makes sense. It wants to forward them over AB2 where it only has 20 sats of outbound to preserve its ~10 BTC on AB1. However, this has more to do with the outgoing bandwidth and less to do with the channel capacity (which, in this unbalanced example, is guaranteed to coincide with the capacity tie breaker in one direction and not the other).

To me, the most immediate question is: why not use outgoing bandwidth as the tie breaker, rather than capacity? This has the effect of protecting the instantaneous "potential energy" held by the forwarding node, rather than protecting the maximum potential energy the node could have in theory. Just because you have a wumbo channel doesn't necessarily mean that you will ever have a wumbo balance on your side. If each side does this, they are able to keep more of their existing outbound further into the attack (which is the goal, no?).
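To make the A/B example concrete, the following Go sketch compares the two tie-breakers from node A's point of view. The `chanView` type and the `lowestBy` helper are hypothetical and exist only for this illustration; the balances are the ones assumed in the example above.

```go
package main

import "fmt"

const satPerBTC = 100000000

// chanView is a hypothetical view of one channel from the forwarding
// node's side; it is not an lnd type.
type chanView struct {
	name        string
	capacitySat int64
	outboundSat int64 // local balance available for forwarding
}

// lowestBy returns the channel that can carry amtSat and has the smallest
// value of key. The second return value is false if no channel qualifies.
func lowestBy(chans []chanView, amtSat int64, key func(chanView) int64) (chanView, bool) {
	var best chanView
	found := false
	for _, c := range chans {
		if c.outboundSat < amtSat {
			continue // not enough outbound for this HTLC
		}
		if !found || key(c) < key(best) {
			best, found = c, true
		}
	}
	return best, found
}

func byCapacity(c chanView) int64 { return c.capacitySat }
func byOutbound(c chanView) int64 { return c.outboundSat }

func main() {
	// Node A's view: 20 sats on AB1 (10 BTC), 1 BTC - 20 sats on AB2 (1 BTC).
	a := []chanView{
		{name: "AB1", capacitySat: 10 * satPerBTC, outboundSat: 20},
		{name: "AB2", capacitySat: 1 * satPerBTC, outboundSat: satPerBTC - 20},
	}
	capPick, _ := lowestBy(a, 1, byCapacity)
	bwPick, _ := lowestBy(a, 1, byOutbound)
	fmt.Println("capacity tie-breaker picks:", capPick.name)
	fmt.Println("bandwidth tie-breaker picks:", bwPick.name)
}
```

For A, the capacity rule selects AB2 and so exposes almost all of A's outbound to the attack, while the bandwidth rule selects AB1 and leaves the ~1 BTC on AB2 untouched.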

IMO tie-breaking on capacity seems to make more complex assumptions about the net flow of funds between the nodes, how static or dynamic the balances are over the lock up period, etc. which probably require some sort of projective modeling based on txn history to fully capture the opportunity costs accurately.

Then again, given we can't possibly predict the future flow of funds across the channel, it could be argued that the current heuristic (which just selects at random) is actually quite suitable. This has the effect of smoothing out the worst case scenario, since in expectation it requires jamming both channels completely in order to prevent a node from being able to forward the sum of both outbounds. It also captures some uncertainty about the future, since sometimes it selects the wumbo channel that currently has little outbound, but whose outgoing could become larger than the instantaneous outbound bandwidth of the smaller channel if the larger one can remain unjammed.

FWIW similar approaches are taken to onchain UTXO selection, where UTXOs are selected at random until the desired amount is reached, to account for uncertainty about the user's future spending habits or changes in the fee market. I believe there are parallels here that give this argument some legs, but some more research on this front would be desirable.

Of these three proposals, my initial reaction is that capacity tie-breaking is in fact the least beneficial. Whether random or bandwidth tie-breaking is superior depends, I think, on how you define the worst case and how you model future usage of the channel. Interested to hear your thoughts.

joostjager · 2020-09-25T09:20:50Z

Good points. One thing I found is that iterating through a map, which is done at the moment, isn't really random but just undefined: https://twitter.com/CAFxX/status/1135190309514620928

If we indeed think that random is the best, it could be made truly random.

Another direction I was considering is to take into account:

remaining number of htlc slots
channel balance
htlc amount

With these, an estimation could be made of the 'liquidity hit'. I was thinking of max(amount, balance/slots)
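A minimal Go sketch of that estimate, assuming integer satoshi amounts and integer division. The function name `liquidityHit` and its parameters are invented for this example.

```go
package main

import "fmt"

// liquidityHit estimates how much liquidity an HTLC effectively takes from
// a channel as max(amount, balance/slots), where freeSlots is the remaining
// number of htlc slots. It returns false when no slot is left, since the
// estimate is undefined in that case.
func liquidityHit(amtSat, balanceSat, freeSlots int64) (int64, bool) {
	if freeSlots <= 0 {
		return 0, false
	}
	perSlot := balanceSat / freeSlots // balance "attached" to each free slot
	if amtSat > perSlot {
		return amtSat, true
	}
	return perSlot, true
}

func main() {
	// A 1 sat htlc on a channel with 1,000,000 sat balance and 100 free slots.
	small, _ := liquidityHit(1, 1000000, 100)
	// A 50,000 sat htlc on the same channel.
	large, _ := liquidityHit(50000, 1000000, 100)
	fmt.Println(small, large)
}
```

Under this estimate a tiny htlc is charged the full per-slot share of the balance, because occupying a slot makes that share unusable once the slots run out.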

cfromknecht · 2020-09-25T20:04:52Z

> One thing I found is that iterating through a map, which is done at the moment, isn't really random but just undefined:

Yes, it is well known to many lnd developers that, in practice, selection via map iteration does not produce a uniformly random distribution of the map elements.

> With these, an estimation could be made of the 'liquidity hit'. I was thinking of max(amount, balance/slots)

Indeed metrics like these seem interesting, but it is difficult to evaluate their robustness without some sort of framework or concrete objective. Under certain conditions the above behaves just like the outgoing-bandwidth tie-breaker, which we already agreed does not account for future liquidity. Perhaps there is some hybrid algorithm that stacks one or more of these metrics and also introduces an element of randomness, but now we are getting into research paper territory...

> If we indeed think that random is the best, it could be made truly random.

It is also interesting to consider whether having some coefficient of bias is useful, in the sense of what a node optimizes for along the bandwidth-capacity spectrum. Some nodes may prefer to weight more heavily towards bandwidth now rather than future capacity.

That said, a uniformly random selection seems like a good starting point and an improvement over the biased selection we have now. I would welcome such a PR.
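One way a uniformly random selection could look in Go is sketched below, assuming the eligible links have first been collected from the map into a slice. `pickUniform` and the injected `intn` function are hypothetical names for this example, and this is not necessarily how lnd implements it.

```go
package main

import (
	"fmt"
	"math/rand"
)

// pickUniform returns one candidate chosen with equal probability.
// intn must return a uniform integer in [0, n), e.g. math/rand's Intn.
// The second return value is false when there are no candidates.
func pickUniform(candidates []string, intn func(n int) int) (string, bool) {
	if len(candidates) == 0 {
		return "", false
	}
	return candidates[intn(len(candidates))], true
}

func main() {
	// Count how often each of two channels is selected.
	counts := map[string]int{}
	for i := 0; i < 10000; i++ {
		c, _ := pickUniform([]string{"AB1", "AB2"}, rand.Intn)
		counts[c]++
	}
	fmt.Println(counts)
}
```

Indexing a slice with a uniform random integer avoids relying on map iteration order, which Go leaves unspecified.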

joostjager · 2020-09-28T07:04:41Z

> it could be argued that the current heuristic (which just selects at random) is actually quite suitable

This comment prompted me to think about how random the map order is. I am sure many developers (including myself) know that it isn't uniformly random, but how bad is it? I hadn't seen the chart in that tweet before, so I thought it could be nice background information for some.

Also I was wondering how deterministic the order is. Maybe it would be the same every time we add the same channel links to the map (which happens in the switch). Then it would still always pick the same channel first, even though it may not be the one that was added first.

I ran a quick test where I repeatedly built a 2-element int map (having two channels with a peer is probably the most common configuration after having just one channel). For that map, the element that was added first is returned 90% of the time. So a uniformly random selection should have a noticeable impact.
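A test along those lines might look like the Go sketch below. It is a reconstruction under assumptions, not the code that was actually run: the helper names are made up, and the observed split depends on the Go version and runtime map implementation, so no specific ratio is asserted here.

```go
package main

import "fmt"

// firstKeyOfFreshMap builds a new 2-element int map, inserting key 1 and
// then key 2, and returns whichever key range yields first.
func firstKeyOfFreshMap() int {
	m := make(map[int]struct{})
	m[1] = struct{}{}
	m[2] = struct{}{}
	for k := range m {
		return k
	}
	return 0 // unreachable: the map is never empty
}

// countFirst repeats the experiment and counts how often the key inserted
// first (1) or second (2) comes out first.
func countFirst(trials int) (first, second int) {
	for i := 0; i < trials; i++ {
		if firstKeyOfFreshMap() == 1 {
			first++
		} else {
			second++
		}
	}
	return first, second
}

func main() {
	first, second := countFirst(100000)
	fmt.Printf("first-inserted key returned first: %d, second: %d\n", first, second)
}
```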

joostjager · 2020-09-29T09:07:22Z

Randomization PR: #4659

joostjager · 2020-10-11T10:36:20Z

Closing PR until there is more support for a forwarding strategy that is driven more by heuristics.

joostjager requested review from Roasbeef and cfromknecht as code owners September 24, 2020 06:43

htlcswitch: non-strict forwarding via lowest capacity channel

6e17e3a

This is a security measure that increases the cost of channel jamming when two nodes have multiple channels open between them with non-equal channel capacities.

joostjager force-pushed the fwd-smallest-chan branch from 4b281db to 6e17e3a Compare September 24, 2020 06:56

Crypt-iQ added the dos/hardening (Related to the resilience of LND against denial of service or other related attacks) and htlcswitch labels Sep 24, 2020

joostjager mentioned this pull request Sep 25, 2020

[ChannelRelay] Prioritize lowest capacity channels ACINQ/eclair#1539

Merged

joostjager closed this Oct 11, 2020
