htlcswitch: non-strict forwarding via lowest capacity channel #4646
joostjager wants to merge 1 commit
This is a security measure that increases the cost of channel jamming when two nodes have multiple channels open between them with non-equal channel capacities.
|
I think that a better solution would be to have some sort of pluggable way for a user to specify their non-strict forwarding policy. Then we can have a reasonable default, users can plug in different strategies, and we don't have to worry too much about this. I think your approach does make a sane default though.
|
Yes, a plug-in would be nice via an HtlcInterceptor-like bidirectional stream. Not sure if users have specific requirements for non-strict forwarding for their daily operation though. In any case, this PR is intended as a quick improvement with no downsides as far as I can see. |
|
But this is only used as a tie-breaker if the channel policies are all the same? Ideally the source node sending the payment would send according to some fee schedules. So the idea is that the intermediate node will charge more for smaller-sized htlcs and less for bigger htlcs, with the assumption that channel jamming won't occur with big htlcs. Perhaps @halseth can illuminate if that last bit didn't translate?
|
Yes, only as a tie breaker. Using the minimum htlc policy is a different way to force small htlcs through the small channels, but that doesn't seem to be implemented widely yet. The 'charging more' part may not be so relevant, because with jamming nothing is paid anyway. |
I think channel jamming does come with some risk: your funds could get locked up, or your direct channel peers could start throttling your traffic or even close the channel. My general approach to code changes is to keep them minimal and necessary, especially if they touch a part of the code that could lead to nasty bugs. This change is minimal, but I'm not sure it is necessary. I think I would want to see some spec change that can actually deal with this once and for all. If a fee scheduling system were widely implemented, or some web-of-trust system (think opt-in rate-limiting feature bits), then this could be completely mitigated. Just my 2 cents on the threat of channel jamming. Ping @alexbosworth for thoughts.
|
Eclair PR: ACINQ/eclair#1539 |
I'd also like to see a more interactive HTLC interception system that gives you the power to shape where HTLCs execute. Management overlay systems that leverage APIs can react quickly to attacks and can present a diverse culture that is complex to attack. Daemon upgrades are necessarily slower and more of a monoculture, so I think those types of changes need to be considered carefully, and probably across the entire ecosystem, to leverage the strength of a common set of behaviors.
|
@joostjager this is interesting, though it's not clear to me that there are "no downsides" or that this is universally better than the status quo.

Consider nodes A and B with two channels, AB1 and AB2, with capacities 10 BTC and 1 BTC respectively. Assume it costs 20 sats to fill up the slots, and that the channels are mostly unbalanced: A has 20 sats on AB1 and 1 BTC - 20 sats on AB2. It's easy to imagine this scenario occurring shortly after AB1 was funded by B and AB2 was funded by A.

Now, the proposed rule says that A should forward malicious HTLCs over AB2 rather than AB1, since AB2 has less capacity. However, this inherently locks up all but 20 sats of A's outbound capacity. Had A forwarded the HTLCs over AB1, it would still have nearly 1 BTC of outbound.

If B is forwarding HTLCs, the proposed rule makes sense. It wants to forward them over AB2, where it only has 20 sats of outbound, to preserve its ~10 BTC on AB1. However, this has more to do with the outgoing bandwidth and less to do with the channel capacity (in this unbalanced example, a bandwidth tie-breaker is guaranteed to agree with the capacity tie-breaker in one direction and not the other).

To me, the most immediate question is: why not use outgoing bandwidth as the tie breaker, rather than capacity? This has the effect of protecting the instantaneous "potential energy" held by the forwarding node, rather than protecting the maximum potential energy the node could have in theory. Just because you have a wumbo channel doesn't necessarily mean that you will ever have a wumbo balance on your side. If each side does this, they are able to keep more of their existing outbound further into the attack (which is the goal, no?).

IMO tie-breaking on capacity seems to make more complex assumptions about the net flow of funds between the nodes, how static or dynamic the balances are over the lock-up period, and so on, all of which probably require some sort of projective modeling based on transaction history to fully capture the opportunity costs accurately.

Then again, given we can't possibly predict the future flow of funds across the channel, it could be argued that the current heuristic (which just selects at random) is actually quite suitable. This has the effect of smoothing out the worst-case scenario, since in expectation it requires jamming both channels completely in order to prevent a node from being able to forward the sum of both outbounds. It also captures some uncertainty about the future, since sometimes it selects the wumbo channel that currently has little outbound, but whose outgoing bandwidth could become larger than the instantaneous outbound bandwidth of the smaller channel if the larger one can remain unjammed.

FWIW similar approaches are taken to on-chain UTXO selection, where UTXOs are selected at random until the desired amount is reached, to account for uncertainty about the user's future spending habits or changes in the fee market. I believe there are parallels here that give this argument some legs, but some more research on this front would be desirable.

Of these three proposals, my initial reaction is that capacity tie-breaking is in fact the least beneficial. Whether random or bandwidth tie-breaking is superior depends, I think, on how you define the worst case and how you model future usage of the channel. Interested to hear your thoughts.
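To make the AB1/AB2 example concrete, here is a small Go sketch that compares the two tie-breakers from each node's point of view. It is only an illustration under simplifying assumptions: the `channel` type and the helper names are hypothetical (not lnd types), channel reserves and fees are ignored, and a "jammed" channel is treated as contributing no usable outbound at all.

```go
package main

import "fmt"

const btc = int64(100_000_000) // satoshis per BTC

// channel is a hypothetical, simplified view of one channel as seen by
// the forwarding node. It is not an lnd type.
type channel struct {
	Name        string
	CapacitySat int64 // total channel capacity
	OutboundSat int64 // forwarding node's local (outgoing) balance
}

// lowestBy returns the index of the channel that minimizes key.
// It assumes chans is non-empty.
func lowestBy(chans []channel, key func(channel) int64) int {
	best := 0
	for i := range chans {
		if key(chans[i]) < key(chans[best]) {
			best = i
		}
	}
	return best
}

// usableAfterJam sums the outbound balance of every channel except the
// jammed one.
func usableAfterJam(chans []channel, jammed int) int64 {
	var total int64
	for i, c := range chans {
		if i != jammed {
			total += c.OutboundSat
		}
	}
	return total
}

func byCapacity(c channel) int64  { return c.CapacitySat }
func byBandwidth(c channel) int64 { return c.OutboundSat }

func main() {
	// Node A's view: 20 sats on AB1, 1 BTC - 20 sats on AB2.
	a := []channel{
		{Name: "AB1", CapacitySat: 10 * btc, OutboundSat: 20},
		{Name: "AB2", CapacitySat: 1 * btc, OutboundSat: 1*btc - 20},
	}
	// Node B's view is the mirror image of A's balances.
	b := []channel{
		{Name: "AB1", CapacitySat: 10 * btc, OutboundSat: 10*btc - 20},
		{Name: "AB2", CapacitySat: 1 * btc, OutboundSat: 20},
	}

	fmt.Println("A, capacity rule: ", usableAfterJam(a, lowestBy(a, byCapacity)))
	fmt.Println("A, bandwidth rule:", usableAfterJam(a, lowestBy(a, byBandwidth)))
	fmt.Println("B, capacity rule: ", usableAfterJam(b, lowestBy(b, byCapacity)))
	fmt.Println("B, bandwidth rule:", usableAfterJam(b, lowestBy(b, byBandwidth)))
}
```

Under these assumptions, A keeps only 20 sats of usable outbound with the capacity rule but 99,999,980 sats with the bandwidth rule, while B keeps 999,999,980 sats under either rule, since both rules pick AB2 in B's direction.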
|
Good points. One thing I found is that iterating through a map, which is done at the moment, isn't really random but just undefined: https://twitter.com/CAFxX/status/1135190309514620928 If we indeed think that random is the best, it could be made truly random. Another direction I was considering is to take into account:
With these, an estimation could be made of the 'liquidity hit'. I was thinking of |
Yes, it is well known to many lnd developers that, in practice, selection via map iteration does not produce a uniformly random distribution of the map elements.
Indeed metrics like these seem interesting, but it is difficult to evaluate their robustness without some sort of framework or concrete objective. Under certain conditions the above behaves just like the outgoing-bandwidth tie-breaker, which we already agreed does not account for future liquidity. Perhaps there is some hybrid algorithm that stacks one or more of these metrics and also introduces an element of randomness, but now we are getting into research paper territory...
It is also interesting to consider whether having some coefficient of bias is useful, in the sense of what a node optimizes for along the bandwidth-capacity spectrum. Some nodes may prefer to weight more heavily towards bandwidth now rather than future capacity. That said, a uniformly random selection seems like a good starting point and an improvement over the biased selection we have now. I would welcome such a PR. |
This comment triggered me to start thinking about how random the map order is. I am sure many developers (including myself) know that it isn't uniformly random, but how bad is it? I hadn't seen the chart in that tweet before, so I thought it could be nice background information for some. I was also wondering how deterministic the order is. Maybe it would be the same every time we add the same channel links to the map (which happens in the switch). Then it would still always pick the same channel first, even though it may not be the one that was added first. I ran a quick test where I repeatedly built a 2-element int map (having two channels with a peer is probably the most common configuration after having just one channel). For that map, the element that was added first is returned 90% of the time. So a uniformly random selection should have a noticeable impact.
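A test along these lines can be reproduced with a few lines of Go. The sketch below is a reconstruction, not the original test: `firstKeyCounts` is a hypothetical helper, and the split it reports may differ between Go versions, since the language leaves map iteration order unspecified.

```go
package main

import "fmt"

// firstKeyCounts builds a fresh 2-element map in every trial, inserting
// key 1 before key 2, and counts which key a range loop yields first.
func firstKeyCounts(trials int) map[int]int {
	counts := map[int]int{}
	for i := 0; i < trials; i++ {
		m := make(map[int]struct{})
		m[1] = struct{}{} // added first
		m[2] = struct{}{} // added second
		for k := range m {
			counts[k]++
			break // only the first key returned matters
		}
	}
	return counts
}

func main() {
	const trials = 100000
	counts := firstKeyCounts(trials)
	fmt.Printf("first-added key returned first in %.1f%% of trials\n",
		100*float64(counts[1])/float64(trials))
}
```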
|
Randomization PR: #4659 |
|
Closing PR until there is more support for a forwarding strategy that is driven more by heuristics. |
This is a security measure that increases the cost of channel jamming when two nodes have multiple channels open between them with non-equal channel capacities.
By always selecting the lowest capacity channel, an attacker will need to send more htlcs to start tapping into the htlc slots of the bigger (wumbo) channel(s).
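As a rough illustration of the rule described above, the following Go sketch picks the lowest-capacity channel among those that still have a free HTLC slot. The `link` type, its fields, and `pickLowestCapacity` are hypothetical and do not mirror lnd's `ChannelLink` interface; the sketch also assumes all candidate channels share the same forwarding policy, so capacity acts purely as a tie-breaker. The slot count of 483 is used only as an example value.

```go
package main

import "fmt"

// link is a hypothetical stand-in for a channel to the next-hop peer.
type link struct {
	ID          string
	CapacitySat int64 // total channel capacity in satoshis
	FreeSlots   int   // HTLC slots still available
}

// pickLowestCapacity returns the usable link with the smallest capacity,
// or false if no link has a free HTLC slot.
func pickLowestCapacity(links []link) (link, bool) {
	var best link
	found := false
	for _, l := range links {
		if l.FreeSlots == 0 {
			continue // a full channel cannot carry another HTLC
		}
		if !found || l.CapacitySat < best.CapacitySat {
			best, found = l, true
		}
	}
	return best, found
}

func main() {
	links := []link{
		{ID: "wumbo", CapacitySat: 1_000_000_000, FreeSlots: 483},
		{ID: "small", CapacitySat: 100_000_000, FreeSlots: 483},
	}
	chosen, _ := pickLowestCapacity(links)
	fmt.Println(chosen.ID) // small

	// Once the small channel's slots are exhausted, the wumbo channel
	// is the only remaining option.
	links[1].FreeSlots = 0
	chosen, _ = pickLowestCapacity(links)
	fmt.Println(chosen.ID) // wumbo
}
```

With this ordering, an attacker has to fill every slot of the smaller channel before any HTLC reaches the larger one.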