draft: HTLC Endorsement to Mitigate Channel Jamming #1071
04-onion-routing.md
Outdated
### Rationale
If a HTLC is endorsed by a peer they have signaled that they expect the HTLC
to resolve honestly, so will be held accountable for the manner in which they
Reading this again now, it makes me think of https://lists.linuxfoundation.org/pipermail/lightning-dev/2023-February/003842.html. But that was of course for the sender.
04-onion-routing.md
Outdated
@@ -1407,6 +1438,88 @@ The _origin node_:
- MAY use the data specified in the various failure types for debugging
purposes.

## Recommendations for Reputation
This section is helpful, makes it much more concrete what to think of when talking about a reputation system in the context of a routing node.
04-onion-routing.md
Outdated
for `resolution_time` incurred.
- `fees`: the fees paid by a forwarded HTLC (as described in [BOLT #7](07-routing-gossip.md#htlc-fees),
equal to 0 if the HTLC was not fulfilled).
- `opportunity_cost`: `ceil ( (resolution_time - resolution_period) / resolution_period) * fees` |
For a worst-case `resolution_time` that is far larger than `resolution_period`, I can create a big enough negative `opportunity_cost` along a route.
Given that LN payments unfold over a big route, if I control the source & destination of a payment I can decide how long to hold the payment for.
Given this layout:
A -- B -- C -- D
- I control `A` & `D`
- `B` and `C` are some very big routing nodes
We consider the past relationship of `B` and `C` to be very good (i.e. when `B` forwards to `C` then they will be endorsed because they have accumulated a lot of `effective_fees` over the window of interest).
I can follow these steps:
- I make a few good payments from `A` to `D` until I notice that `endorsed` is turned on (effectively meaning that `B` now endorses me).
- Given access to the endorsed slots, I now create one or multiple payments from `A` to `D` that upon success would pay a big amount of `fees`.
- After I receive the `endorsed` HTLCs (that correspond to the above payments) on `D`, I hold on to them for as long as possible by not releasing the preimage.
- Just before the CLTV timeout I fail the payment.
- `A -- B`, `B -- C`, `C -- D` all damage their local reputation by `fees - opportunity_cost`, where `opportunity_cost` can be a big enough multiple of `fees`.
If the above flow is feasible, then a node that just earned the `endorsed` flag from their peer can now cause them reputation damage far greater than what it cost them (in fees) to earn that flag.
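To put rough numbers on the scenario above, here is a minimal Python sketch of the `opportunity_cost` formula as quoted from the draft. The function name, the 1,000 msat fee and the two-hour hold time are illustrative assumptions, not values taken from the PR. The quoted draft also sets `fees` to 0 for unfulfilled HTLCs, so the sketch takes `fees` as an explicit input rather than deciding that question.

```python
import math

def opportunity_cost(resolution_time: float, resolution_period: float, fees: int) -> int:
    """Formula as quoted in the draft:
    ceil((resolution_time - resolution_period) / resolution_period) * fees
    """
    periods_over = math.ceil((resolution_time - resolution_period) / resolution_period)
    return periods_over * fees

# Assumed values, for illustration only.
RESOLUTION_PERIOD = 60  # seconds, the default mentioned in the draft
FEES = 1_000            # msat, hypothetical

fast = opportunity_cost(10, RESOLUTION_PERIOD, FEES)           # resolves within the period
held = opportunity_cost(2 * 60 * 60, RESOLUTION_PERIOD, FEES)  # held for two hours
print(fast, held)
```

Under these assumptions, an HTLC resolved within the period costs nothing, while a two-hour hold is charged 119 times the fee.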
To mitigate this possible attack, perhaps the `opportunity_cost` formula can have an upper limit?
Something that can still allow it to create damage on `effective_fees` (in order to maintain the earn slow / lose fast attribute) but not large enough to cause significant damage on other links further into the route, links with much stronger reputation relationships.
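One way to read the suggested upper limit is as a cap on the multiplier in the quoted formula. The sketch below is only an illustration of that reading: the `max_multiple` parameter and the function name are hypothetical and do not appear in the draft.

```python
import math

def capped_opportunity_cost(resolution_time: float, resolution_period: float,
                            fees: int, max_multiple: int) -> int:
    """Quoted opportunity_cost formula with a hypothetical cap on the multiplier."""
    periods_over = math.ceil((resolution_time - resolution_period) / resolution_period)
    # max_multiple is an assumed tuning knob, not part of the proposal.
    return min(periods_over, max_multiple) * fees

# A two-hour hold is now charged at most 10x the fee instead of 119x.
print(capped_opportunity_cost(2 * 60 * 60, 60, 1_000, max_multiple=10))
```

A slow HTLC still costs more than it earns, which keeps the "lose fast" property, but the damage passed along the route is bounded.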
I think this can be reduced to two variables per (outgoing) channel: reputation and exhaustion cost.
So, if incoming `endorsed` and outgoing reputation is greater than exhaustion cost of channel*:
- outgoing endorsed = 1
Otherwise:
- outgoing endorsed = 0
If outgoing endorsed is 0:
- reduce the effective max_htlc_value_in_flight_msat and max_accepted_htlcs of the outgoing channel by 50% for purposes of this htlc.
Otherwise:
- reduce outgoing reputation by `fee` * (`cltv_expiry` - current block height) * 600,
- Record the start time of the HTLC.
When the HTLC is resolved:
- Record the end time of the HTLC.
- If outgoing endorsed was 0:
  - If the HTLC was successful, and the end time - start time was less than 60 seconds
    - Increase the outgoing reputation by 50% of the htlc fee
- Otherwise (endorsed = 1):
  - Increase the reputation by `fee` * (`cltv_expiry` - current block height) * 600
  - If the end time - start time < 60 seconds and the HTLC was successful
    - Increase reputation by `(end time - start time - 60) / 60 * fee`
Every 1 day (or X blocks?)
- Reduce all outgoing reputations by 1% (? depends on how long you're aiming for, see below)
(*) I didn't see this clearly spelled out in the draft, but on the call you used these terms. Ideally, it's "how much money did this channel make in the last max-cltv-delta-allowed blocks", but practically it's probably a decaying average:
- If HTLC is successful:
  - Add fee to exhaustion cost of outgoing channel
- Every block:
  - Multiply the exhaustion cost of the outgoing channel by `1 - (1 - n)^(1/n)` where `n` is the max cltv-delta you allow.
  - (ChatGPT tells me that's how you calculate the exp decay factor, but I haven't run tests to check...)
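The forwarding decision at the start of the scheme above can be sketched in a few lines of Python. This covers only the endorsement check and the 50% limit reduction. The class and function names are hypothetical, and the reputation updates and the decay factor (which the comment itself flags as unverified) are deliberately left out.

```python
from dataclasses import dataclass

@dataclass
class OutgoingChannel:
    reputation: float                   # reputation tracked for the outgoing channel
    exhaustion_cost: float              # roughly, recent fee revenue of the channel
    max_htlc_value_in_flight_msat: int
    max_accepted_htlcs: int

def forward_decision(chan: OutgoingChannel, incoming_endorsed: bool):
    """Return (outgoing_endorsed, effective_value_limit, effective_slot_limit)."""
    outgoing_endorsed = incoming_endorsed and chan.reputation > chan.exhaustion_cost
    if outgoing_endorsed:
        return True, chan.max_htlc_value_in_flight_msat, chan.max_accepted_htlcs
    # Unendorsed HTLCs only see half of the channel's limits.
    return False, chan.max_htlc_value_in_flight_msat // 2, chan.max_accepted_htlcs // 2

chan = OutgoingChannel(reputation=100, exhaustion_cost=50,
                       max_htlc_value_in_flight_msat=1_000, max_accepted_htlcs=30)
print(forward_decision(chan, incoming_endorsed=True))
print(forward_decision(chan, incoming_endorsed=False))
```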
04-onion-routing.md
Outdated
`max_accepted_htlcs`.
- MUST choose `unknown_allocation_liquidity` <= the remote channel peer's
`max_htlc_value_in_flight_msat`.
- If `endorsed` is set to 1 in the incoming `update_add_htlc` AND the HTLC
We should probably explicitly allow (and ignore!) the other bits for future use. So this should be:
"If `endorsed` is non-zero in the incoming..."
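A receiver following this suggestion might interpret the field as in the sketch below. The helper name is hypothetical. Treating an absent field as unendorsed follows the later observation in this thread that a missing TLV is interpreted the same as 0.

```python
from typing import Optional

def is_endorsed(endorsed_value: Optional[int]) -> bool:
    """Treat any non-zero value as endorsed; a missing field counts as unendorsed."""
    return endorsed_value is not None and endorsed_value != 0

print(is_endorsed(None), is_endorsed(0), is_endorsed(1), is_endorsed(5))
```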
04-onion-routing.md
Outdated
Peers build reputation by forwarding successful HTLCs that resolve quickly, and
lose reputation if they endorse failing or slow-resolving HTLCs. Reputation is
only _negatively_ affected if an endorsed HTLC resolves undesirably, to hold
nodes accountable for their endorsement signal while still allowing them to
forward unendorsed HTLCs that they are not certain about.
I'm not sure this statement is true, because of the following scenario (let me know if I'm missing the point entirely):
- A wants to send a payment to D: A -> B -> C -> D
- the A -> B and B -> C channels are empty: A and B both endorse the HTLC
- the C -> D channel is full (because of payments coming from unrelated nodes, e.g. E -> C -> D)
- when C receives the HTLC, even if it's fully endorsed and reputable, C has to fail the payment
- when B receives the failure, B will then decrease A's reputation
Is the negative impact on A an issue? Is it something that can be abused? Can we do something about it (should we)?
A quick failure will not harm your reputation.
Only slow-resolving endorsed HTLCs can harm your reputation.
Ok, I believe that's one of the main differences between @thomash-acinq's proposal and this one, it's hard to evaluate which is the right choice.
Some comments/questions:
- if I'm an honest new node and the network is jammed, I think you can have a trust-based pay-for-endorsement scheme so they could send payments. I don't think it needs to be described here though
- if I have the topology A---B---C it seems like C can grief A's reputation with B?:
  - A and B are both honest, C is malicious
  - A sends an endorsed payment through B to C
  - C holds the payment for as long as possible and then fails it back
  - B punishes A for sending this endorsed payment that turned out to be jam-like
Hi, I find this method of mitigating HTLC jamming quite interesting, however, I have one question. Previously a node could achieve a higher payment success rate if it had more channels to more nodes in the network and it would possibly achieve more privacy if it utilized different nodes to route its payments through.
04-onion-routing.md
Outdated
which capture the fees that it paid and the opportunity cost that holding it
for `resolution_time` incurred.
- `fees`: the fees paid by a forwarded HTLC (as described in [BOLT #7](07-routing-gossip.md#htlc-fees),
equal to 0 if the HTLC was not fulfilled).
I think I'm not understanding something. If `fees` are equal to zero for unfulfilled HTLCs, then it means that `opportunity_cost` is also zero. Does this mean failed HTLCs won't result in losing reputation?
For zero fees a default `ppm` will be assumed (100ppm?) in order to bypass this.
Also, if we're talking about a fast failing HTLC then even if we have fees the loss is going to be zero (can be seen in the `opportunity_cost` formula).
Is that for when the payment is fulfilled but fees sent by the sender are zero? Or for the case mentioned here: `equal to 0 if the HTLC was not fulfilled`?
A lot of the discussion revolves around the specific reputation scheme proposed here. However, I don't think that this should be part of the BOLTs, which only describe rules for communication between peers. While it is crucial to find a good way to compute reputation, this topic is already discussed elsewhere (mailing list, meetings); we should focus here on the actual spec change: a way to signal to the next node how confident we are that this HTLC will succeed. The questions that need answering here are:
I personally think that it is useful to transmit our confidence to the next peer and that the more precision we give, the more useful it is. However too much precision could be a privacy leak (if you receive two HTLCs with the same confidence, it probably means that they followed the same path and came from the same sender) so I think that having 8 confidence buckets (3 bits of information) would be a good compromise.
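As a rough illustration of what 8 buckets could look like, the sketch below quantizes a confidence estimate into a 3-bit value. Representing confidence as a number in [0, 1] and the function name are assumptions made for this example only. The comment does not specify how confidence would be computed or encoded.

```python
def confidence_bucket(confidence: float, bits: int = 3) -> int:
    """Map a confidence in [0, 1] (an assumed representation) to one of 2**bits buckets."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be within [0, 1]")
    buckets = 1 << bits
    # A confidence of exactly 1.0 falls into the top bucket rather than overflowing.
    return min(buckets - 1, int(confidence * buckets))

print([confidence_bucket(c) for c in (0.0, 0.5, 1.0)])
```

Coarse buckets mean many distinct confidence estimates collapse onto the same wire value, which is the privacy trade-off the comment describes.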
While I think that resource bucketing can make sense as an MVP for how to interpret the endorsement mechanic laid out in BOLT2, I find myself resistant to this being in the main BOLT sections. Even with the designation of "MAY", I think this is better suited to be an extension BOLT or perhaps even a BLIP.
The reason for this is that you state in the proposal that reputation is a local phenomenon. Each node not only gets to make a decision for how to measure reputation and how to update the priors based on activity, but also probably ought to be free to select among a nigh infinite number of slot/liquidity allocation strategies between endorsed and unendorsed HTLCs.
In a prior conversation you had explained that the endorsement mechanic requires a strategy for how that endorsement can be used to mitigate jamming to demonstrate the utility of endorsements at all. I agree with this assessment, but I still find the particular strategy in the proposal to be lacking (more on that below). However, there's nothing intrinsically broken about the resource bucketing strategy you present, it just is probably far more rudimentary than a mature deployment of this would look like.
Further, because this decision is ultimately a local one and the notion of reputation is also local, no matter what strategy you present here, even if it's one that all of us are delighted by, we should expect nodes on the network to experiment and deploy their own solutions. Because they can, and because better solutions to this problem will yield better risk-adjusted returns, we can also expect that a portion of these strategies will be proprietary as well.
Considering all of these factors, I think it is more appropriate to consider the resource allocation strategy as a recommendation, and should probably be placed into an appendix of some kind, be it an extension BOLT, a BLIP or otherwise. I could be misunderstanding the scope and responsibilities of the main BOLTs but if I was trying to bootstrap another implementation, I would be required to understand the endorsement specification to be compatible with the rest of the network, but I would have no need whatsoever to implement the exact reputation and resource allocation strategies to remain compatible with the network.
With the more organizational critiques out of the way, I will motivate where I'm coming from with my exact concerns with resource bucketing in general. All resources in economics have marginal value, meaning that each additional unit of that resource you consume costs more than the last one (in real terms). The resources you've identified here (slots and liquidity) are no different. As a result, the "real cost" of allocating the last slot or sat of liquidity is greater than the first. There is no way to set the parameters described in this section to accurately model this phenomenon. What if I want to have the required reputation for forwarding increase as my available resources decrease?
Ok, that's the cost/risk side of the equation, but what about the benefit/reward?
As a routing node, every decision to accept an HTLC is essentially granting an option on a liquidity trade that trades liquidity on the downstream link for liquidity on the upstream link, with a probable increase in the total liquidity (taken as fees). The jamming problem represents a risk in being able to execute that trade to completion. In any scenario where we are taking risks for potential benefits, it makes little sense to analyze the risks (which this proposal does with the endorsement mechanic and reputation recommendations) without also considering the potential benefits. This proposal ignores the potential benefits of forwarding an HTLC (chiefly the fees).
It can make sense for me as a node operator to let a node with lower reputation offer an HTLC forward with a large fee, when I'd be hesitant to do so at a lower fee. Similar to the way that higher interest rates are charged for borrowers with lower credit scores, we need not deny a forwarding request simply because the upstream link doesn't have the reputation we'd want.
So to summarize my criticisms of the resource bucketing strategy, it comes down to two things: 1. It does not account for the continuously variable nature of the costs of offering the slots/sats, 2. It does not account for the potential benefits of forwarding the HTLC. That said, I don't think it's reasonable to require an airtight algorithm that takes these things into account for the endorsement mechanic to be a useful improvement to the status quo. I also don't have any issue with large swaths of the network deploying this strategy and seeing if it improves jamming incidence rates. Despite its incompleteness in modeling the incentives of the operator, it may be a dramatic improvement over today, I don't know. However, because of this incompleteness, I don't think it should be in the part of the spec that I view to be required for interop with the rest of the network.
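The first criticism can be made concrete with a toy policy in which the reputation required to forward grows as free slots shrink. This is purely a sketch: the function, its inverse-proportional shape and the `base_requirement` parameter are hypothetical and are not proposed anywhere in this thread.

```python
import math

def required_reputation(base_requirement: float, slots_in_use: int, total_slots: int) -> float:
    """Hypothetical policy: the bar rises as the fraction of free slots falls."""
    if total_slots <= 0 or not 0 <= slots_in_use <= total_slots:
        raise ValueError("invalid slot counts")
    free_fraction = (total_slots - slots_in_use) / total_slots
    if free_fraction == 0:
        return math.inf  # nothing left to allocate
    return base_requirement / free_fraction

for used in (0, 5, 9):
    print(used, required_reputation(100, used, 10))
```

A fixed two-bucket split cannot express this kind of continuously rising requirement, which is the gap the comment points at.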
Agreed.
That's very dangerous as an attacker can trivially exploit this: they just need to offer very high fees to compensate for their bad reputation (it doesn't cost them anything because they don't intend to actually pay the fees, they will just fail the HTLC).
That's only a limitation of this specific algorithm to assign reputation, which as you said should not be part of the spec. However even when using a continuous reputation scheme, the binary endorsement forces you to discretize to 0 or 1. That's why I'm suggesting to replace the binary endorsement with a confidence value on 3 bits. A fully continuous value could be a privacy leak but I think that 3 bits is a good balance between the 1 bit of this proposal and a fully continuous value.
This is far from a trivial exploit. It is already the case that the attacker has no way to know what their reputation is with respect to their peers. For them to be able to exploit it, they would need to know what your threshold for endorsement is, which isn't a publicly knowable thing. Additionally, even while offering high fees for offered HTLCs does not guarantee the loss of those sats, it is still a capital outlay requirement that can reduce the reach of these attacks as well as reduce the attacker's bandwidth to accomplish them. That said, I'd imagine the reduction in effectiveness of the attack as a result of this increased cost is probably marginal at best, but this was also not suggested as a security scheme, I was simply pointing out that we cannot ignore the reward side of the incentive scheme when considering a node operator's interests.
I actually think that this is a good thing. By forcing nodes to make a decision between 0 or 1 at the protocol level, you force the inputs to that decision to be a private matter, which ultimately it is. The node operator can either choose to tie its reputation to an HTLC or not.
I think that this convolutes things in a way that conceals the real dynamic in play. It is not the role of the endorser to "proxy" the reputation of its peers. The role of the endorser is to tie its own reputation to the HTLC it is offering. It is hard to understand how else to interpret the endorsement mechanic if it is allowed to have any more than 1 bit of signaling. Let's say we have 3 bits as you suggest, what happens if I endorse it to a level of 001 (000 being lowest and 111 being highest), and then the HTLC fails? What if the HTLC succeeds? What is my peer even trying to tell me when it gives a "partial endorsement"?
The other issue with a continuous value is that it can basically be used as a measurement for how close to the payment source you are. Why would I endorse someone else's HTLC at a higher level than the upstream link did? Why wouldn't I ever endorse my own HTLC as 111?
Ultimately I believe the forced discretization of the endorsement is a good thing. In fact I believe that simply specifying that and having some discussion and recommendations around possible ways of interpreting endorsement (or non-endorsement), is enough for this proposal to be self-justifying and complete. I believe that the specifics of how to measure reputation and how to allocate HTLC slots/sats based off of reputation is beyond the scope of what this specification should offer. Very often when we provide libraries we may also provide code examples to demonstrate how to use it, and I believe the resource bucketing scheme and ideas on how to measure and update reputation should not be viewed as anything more significant than a spec level code example. Compliance with these suggested schemes is neither enforceable nor can we expect nodes to adopt the same behaviors, so it really ought to be considered as a demo use of that endorsement bit.
Concept ACK 9b97e28
The problem is so big that a deep evaluation of the solution is very time-consuming for people who are not thinking about this problem daily (at least for me), so I like the approach of starting with the peer's reputation and seeing how the network reacts to this.
So, I will ack the approach and start to write some code for this in cln and then evaluate some real data. I guess the more difficult part here is the reputation algorithm.
P.S': Some small nits found while reading have been reported.
P.S'': I agree that the reputation should be separate from the BOLTs. I have started the lnmetrics.rfc for this particular reason.
04-onion-routing.md
Outdated
HTLC resolution time is assessed relative to a threshold that the node
considers to be a reasonable amount of time for a HTLC to resolve:
- `resolution_period`: the amount of time a HTLC is allowed to resolve in that
is classified as "good" behavior, expressed in seconds (default: 60 seconds).
We should mention why 60 seconds; iirc this is the default MPP timeout! We should report it there.
04-onion-routing.md
Outdated
successful, fast resolving HTLCs during the `resolution_time` the HTLC was
locked in the channel.

For every resolved incoming HLTC a peer has forwarded through a node, its
Suggested change:
- For every resolved incoming HLTC a peer has forwarded through a node, its
+ For every resolved incoming HTLC a peer has forwarded through a node, its
nit: there are a few others around the doc
This commit introduces the wire change to the wire system of core lightning in order to implement [1]. [1] lightning/bolts#1071 Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com>
@@ -995,7 +995,10 @@ is destined, is described in [BOLT #4](04-onion-routing.md).
1. type: 0 (`blinding_point`)
2. data:
* [`point`:`blinding`]

1. type: 1 (`endorsed`)
Suggested change:
- 1. type: 1 (`endorsed`)
+ 1. type: 3 (`endorsed`)
Can we move this to a new optional/required pair (type 2/3)?
(see [Local Reputation](recommendations/local-resource-conservation.md#local-reputation)):
- SHOULD set `endorsed` to `1`
- otherwise:
- SHOULD set `endorsed` to `0`.
Is there a reason to prefer including the tlv with a non-endorsement as opposed to omitting it? As far as I can tell, the downstream node interprets a missing endorsement tlv exactly the same as setting it to 0. I suppose I'm wondering if there's any reason to track upstream endorsement as a ternary option. Does it help peers to signal that the node is actively resource bucketing / tracking reputation with endorsements even if the payments carry no endorsement?
iirc the reason was that we want to know who is using `endorsed` and who is not.
I also remember that we were discussing a feature bit that makes `endorsed` optional when it is 0.
> iirc the reason was that we want to know who is using `endorsed` and who is not.
This is the case for the experiment outlined in lightning/blips#27, but for the spec proposal I think @endothermicdev is correct and we can just omit the field completely to save a few bytes if it's zero-value 👍
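On the sending side, omitting the zero-value field could look like the sketch below. It emits a type-length-value record only when the HTLC is endorsed. The one-byte value and the function name are assumptions for illustration, and the type number is passed in because it is still being debated above (1 versus 3).

```python
def encode_endorsed_tlv(endorsed: bool, tlv_type: int) -> bytes:
    """Encode the `endorsed` record as type-length-value, or nothing if unendorsed."""
    if not endorsed:
        return b""  # omit the record entirely instead of sending a zero value
    if not 0 <= tlv_type < 0xFD:
        raise ValueError("sketch only handles single-byte BigSize types")
    value = bytes([1])  # assumed one-byte payload
    return bytes([tlv_type, len(value)]) + value

print(encode_endorsed_tlv(True, 3).hex())
print(encode_endorsed_tlv(False, 3).hex())
```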
Mh @carlaKC I think that the blips were overriding this, what is the correlation between the two proposals?
This PR introduces an `endorsed` TLV to `update_add_htlc` as a way for nodes to indicate whether they expect a HTLC to resolve "honestly". Nodes are advised to allocate a limited portion of their outbound liquidity and slots to HTLCs that are not endorsed by peers that they consider to have high reputation.
Opening early for discussion on structure, not ready for review - discussions around recommendations for local reputation scoring are ongoing.
Slides for the visually-minded here