Direct connect for OnionMessage sending #2723
Conversation
```rust
	secp_ctx: Secp256k1<secp256k1::All>,
	message_router: MR,
	offers_handler: OMH,
	custom_handler: CMH,
}

/// [`OnionMessage`]s buffered to be sent.
enum OnionMessageBuffer {
```
Was thinking `OnionMessageRecipient` may be a better name.
Renamed as mentioned in a separate commit.
```rust
@@ -516,18 +583,23 @@ where
	pub fn send_onion_message<T: OnionMessageContents>(
```
Wondering if this should still be `pub`? We send through the handlers, so this is not used internally outside of this module. Internally, we need to examine the return value to decide if a connection is needed. So users calling this won't trigger a connection request, nor will the message be buffered for them, except if we previously decided that a connection is needed.
It's public so users can send custom messages easily, which I think we should support, if possible. We'll need to handle the needs-connection case internally here, though, if we want to keep it.
The interface here isn't congruent with other sending, FWIW. This takes an `OnionMessagePath` instead of a `Destination`, so the assumption is that the `MessageRouter` has already been called prior to calling `send_onion_message`. We'd probably want to do message routing internally, I'd imagine?
Mmm, yea, I think so. I mean, we don't really need the interface at all, but it's nice to have, as long as it is easy to use.
lightning/src/ln/peer_handler.rs
Outdated
```rust
let nodes_to_connect =
	self.message_handler.onion_message_handler.release_pending_connections();
let mut pending_events = self.pending_events.lock().unwrap();
for node_announcement in nodes_to_connect {
	pending_events.push(Event::ConnectionNeeded { node_announcement });
}
```
Originally, I had this in `process_events`, but I moved it to `timer_tick_occurred` so the timeout could be in terms of ticks. However, it has occurred to me that having it here means we may wait up to 10 seconds before even generating the events, which would result in bad UX. We could instead simply generate events directly in `process_pending_events` and `process_pending_events_async`, dropping the intermediary `pending_events` `Vec`. But, as before, we'd have to define the timeout in something other than ticks, or possibly add another method to `OnionMessageHandler` for calling in `timer_tick_occurred`.
Updated the `OnionMessageHandler` trait to have a `timer_tick_occurred` method so that dropping buffered messages is independent of providing events.
Some high level design questions I'd like to understand first.
```rust
/// [`MessageRouter`]: crate::onion_message::MessageRouter
/// [`Destination`]: crate::onion_message::Destination
/// [`OnionMessageHandler`]: crate::ln::msgs::OnionMessageHandler
ConnectionNeeded {
```
Why did you end up going the event route vs. having a "start opening a connection async" trait (or even just doing it via the onion message router)? I guess an event could let us reuse it later in `ChannelManager` to just tell the user to reconnect every time a peer we care about disconnects, rather than users having to poll that, but it does seem like a lot of indirection to get the needs-connection event out.
It would require adding another parameterization for users to implement (i.e., either a new trait on `OnionMessenger` or one on `DefaultMessageRouter`), which would then need to bubble up to the `Simple*` aliases.

And since it needs to be async, the trait implementation wouldn't be enough. Users would need to write code to process connection requests in the background from what their trait enqueued. If we wrote some of this logic for them, then we'd need to do it for each language's IO layer. But that wouldn't fit well into any peer management logic that they may have (e.g., LDK Node's `PeerStore`).

Seems much simpler just to surface an event for them so that no additional parameterization and processing code is needed, other than handling one more event variant.
Yea, fair, I guess. I think the current piping through of the needs-connection nodes is also a bit gross, but I'm not sure it's more gross. One further thing to consider is what we want the eventual end state to be. If we want to generate `ConnectionNeeded` events for peers with which we have a channel, we should stick with the event route (and probably start generating such events in the same release). If we don't, adding it via the router is easier to "walk back", and is at least only gross temporarily.
Could we pipe the needs-connect nodes through `OffersMessageHandler`/`CustomOMHandler` with a new trait method? Then the `ChannelManager` could generate `ConnectionNeeded` events and we wouldn't have to trouble the background processor. I do see Jeff's argument that the current design aligns with the peer manager doing peer stuff, though.
Hmm... that wouldn't work for uses of `send_onion_message` or anyone using something other than `ChannelManager` to implement one of those traits.
The custom trait implementations would need to support connecting to peers outbound; would it not work in some other way? I'm not convinced it's cleaner than the current approach though :/ just nicer in terms of the `PeerMan`/`BgProcessor`.
Not sure that I follow. All custom implementations should work with the current approach of using `OnionMessenger`. Or are you saying a custom implementation of `OnionMessageHandler` (i.e., something other than `OnionMessenger`)? Note that `CustomOnionMessageHandler` and `OffersMessageHandler` are specific to `OnionMessenger`. Any implementations of those work fine with the current approach.
Discussed offline. Updated to have events processed directly from any `OnionMessageHandler` by requiring implementations to also implement `EventsProvider`. I was able to do so without adding another type parameter to `BackgroundProcessor`, since the handler can be fetched from `PeerManager`.
```rust
return;
match self.send_onion_message(path, contents, reply_path) {
	Err(SendError::InvalidFirstHop(first_node_id, onion_message)) => {
		let network_graph = self.network_graph.deref().read_only();
```
Hmm, is this the right place for this? We could avoid the network graph reference by pushing this into the onion message routing interface, and have it be allowed to return some kind of "my route requires you to connect to `(Vec<SocketAddress>, PublicKey)`" flag. Alternatively, we could push it into the event handler, which probably already has such a reference, but some users implement that themselves so they'd be stuck doing it.
> Hmm, is this the right place for this? We could avoid the network graph reference by pushing this into the onion message routing interface, and have it be allowed to return some kind of "my route requires you to connect to `(Vec<SocketAddress>, PublicKey)`" flag.

Eh, I suppose if `DefaultMessageRouter` will ultimately need a `NetworkGraph` reference, perhaps it would be better to have it there?

Having it in `OnionMessenger` was a check against malicious actors sending onion messages containing bogus reply paths. Otherwise, we'd buffer messages that could never be delivered until the timeout drops them, which could be problematic if it causes us to exceed `MAX_TOTAL_BUFFER_SIZE`. So I figured that logic shouldn't be left to implementors of `MessageRouter`.

Also note that currently a `MessageRouter` is allowed to return a partial path if it can't find a full one back from the destination, and we'll direct connect to the start of that path. So returning a "needs connection" from there would at the very least complicate the result type, if we want to keep supporting that.

Further, it's a little roundabout to have `MessageRouter` tell `OnionMessenger` that it needs a connection to a node when `OnionMessenger` is the one already keeping track of the peers. And note that currently `send_onion_message` takes the path and contents and creates an `OnionMessage` from them. So if `MessageRouter` is responsible for saying "needs connection", then we still need to call `send_onion_message` in order to have something to buffer. That at the very least complicates the processing of `MessageRouter`'s return value.

I'm a little torn on the right approach here.

> Alternatively, we could push it into the event handler, which probably already has such a reference, but some users implement that themselves so they'd be stuck doing it.

Not sure I understand this. Are you saying to just have the event say needs connection to some node id and have the handler look it up in the `NetworkGraph`? We'd blindly buffer messages in that case, too.
> Eh, I suppose if `DefaultMessageRouter` will ultimately need a `NetworkGraph` reference, perhaps it would be better to have it there?

Makes sense to me.

> Having it in `OnionMessenger` was a check against malicious actors sending onion messages containing bogus reply paths. Otherwise, we'd buffer messages that could never be delivered until the timeout drops them, which could be problematic if it causes us to exceed `MAX_TOTAL_BUFFER_SIZE`. So I figured that logic shouldn't be left to implementors of `MessageRouter`.

Hmm, that's a fair point. That said, I'm not sure how an implementor of `MessageRouter` could screw that up: if they're required to return a `NetAddress`, they either give us something we concretely should try to connect to, or not.

> Also note that currently a `MessageRouter` is allowed to return a partial path if it can't find a full one back from the destination, and we'll direct connect to the start of that path. So returning a "needs connection" from there would at the very least complicate the result type, if we want to keep supporting that.

True, though only in the sense that we'd have to add a `Vec<NetAddress>` to `OnionMessagePath`, which doesn't seem insane, even if a bit strange.

> Further, it's a little roundabout to have `MessageRouter` tell `OnionMessenger` that it needs a connection to a node when `OnionMessenger` is the one already keeping track of the peers.

Mmm, true. I mean, we do already pass the list of connected peers to the router, so it's not like it shouldn't/won't have that info on hand, and for good reason. I guess in my head it doesn't seem wrong at all: the router's job is to look at the network graph plus the peers we have and where we want to get to, and decide the best way to get there. If that means connecting, it can make that decision, but it's up to the router. And once it's decided, it seems like it should be able to communicate that fully to the messenger, which is responsible for actually sending the message (and, awkwardly, passing some event through the peer manager to the user to trigger the connection, but really that is a router-originated event).

> And note that currently `send_onion_message` takes the path and contents and creates an `OnionMessage` from them. So if `MessageRouter` is responsible for saying "needs connection", then we still need to call `send_onion_message` in order to have something to buffer. That at the very least complicates the processing of `MessageRouter`'s return value.

Yea, as discussed above ISTM we should change the `send_onion_message` API.

> Not sure I understand this. Are you saying to just have the event say needs connection to some node id and have the handler look it up in the `NetworkGraph`? We'd blindly buffer messages in that case, too.

Yea, I hadn't considered that we're DC'ing in response to `invoice_request`s, not just `send_payment_for_offer`s, so we can't really do that.
Ok, let me go ahead and move `NetworkGraph` to `DefaultMessageRouter`. I guess the idea would be to add a `Vec<SocketAddress>` field to `OnionMessagePath`? I used `NodeAnnouncement` since it would give the user more information on deciding whether they want to connect to the node. But perhaps `SocketAddress`es are sufficient. And I suppose a `MessageRouter` may want to connect to a private node, so we should support that.

> (and, awkwardly, passing some event through the peer manager to the user to trigger the connection, but really that is a router-originated event)

An alternative could be to parameterize `BackgroundProcessor` with `OnionMessenger` or `MessageRouter`, but I figured we'd want to avoid adding more parameterization there. Seemed to make sense to have `PeerManager` be responsible as a higher-level abstraction for peer-related needs.
> Ok, let me go ahead and move `NetworkGraph` to `DefaultMessageRouter`. I guess the idea would be to add a `Vec<SocketAddress>` field to `OnionMessagePath`?

Yea, that was my thinking. I mean, ideally we wouldn't require returning anything at all; we can figure out if we need to connect without it. But to your point, we kinda need to here for DoS reasons. One issue: if we want to generate the new events in the `ChannelManager`, we have no idea what the network address is, which makes it hard to generate the events there.

> I used `NodeAnnouncement` since it would give the user more information on deciding whether they want to connect to the node. But perhaps `SocketAddress`es are sufficient.

Yea, it just seemed like excess information that wasn't really required, and socket addresses are optional inside that, so we kinda want to force them.

> And I suppose a `MessageRouter` may want to connect to a private node, so we should support that.

I wasn't actually even thinking about that, but, yea, fair point.

> An alternative could be to parameterize `BackgroundProcessor` with `OnionMessenger` or `MessageRouter`, but I figured we'd want to avoid adding more parameterization there. Seemed to make sense to have `PeerManager` be responsible as a higher-level abstraction for peer-related needs.

Yea, seems fine, just a lot of piping stuff through to get from the onion message router to the background processor.
Related, we'll need to handle `sciddir_or_pubkey` somewhere. That is, going from a `Destination` containing a `BlindedPath` with an SCID to an `OnionMessagePath` having a `Destination` containing a `BlindedPath` with a pubkey.

I suppose doing this in a `MessageRouter` is fine. But when wanting to construct such a blinded path, we'll need the `NetworkGraph`, too. This responsibility then may want to also live in `MessageRouter`, but the current draft is to have it in `Router`, partly because `ChannelManager` already has one. So this may need to be rethought.
Hmm, yea, good point. We can do that when we get there on the receiving side, though.
Force-pushed from 1915feb to 3117b94.
Updated per offline discussion. Let me know if this new approach looks good, and I'll write up some tests.
Force-pushed from 3117b94 to 74a53a5.
Codecov Report:

```
@@            Coverage Diff             @@
##             main    #2723      +/-   ##
==========================================
+ Coverage   88.63%   89.49%   +0.85%
==========================================
  Files         115      115
  Lines       90446    97229    +6783
  Branches    90446    97229    +6783
==========================================
+ Hits        80170    87017    +6847
+ Misses       7875     7829      -46
+ Partials     2401     2383      -18
==========================================
```
Force-pushed from 74a53a5 to 7774248.
Latest push is to fix CI. I'll need to rebase to fix the fuzz test.
Looks good I think. Will give it another glance later.
```rust
use lightning::events::EventsProvider;

let events = core::cell::RefCell::new(Vec::new());
peer_manager.onion_message_handler().process_pending_events(&|e| events.borrow_mut().push(e));
```
Maybe after async functions in traits are stabilized, we can add an async event processing method to `EventsProvider` with a default implementation 🤔 I'm just wondering if this could be a temporary workaround.
Yeah, possibly. `EventsProvider` is kinda weird in that up until now it never really needed to be a trait.
Not sure I understand the reason for this now? Can we not just add a `process_pending_events_async` method on `OnionMessenger` just like we do for `ChannelManager`?
`BackgroundProcessor` gets the `OnionMessenger` through `PeerManager`, which only knows it is something that implements `OnionMessageHandler` and `EventsProvider`. If we instead pass `OnionMessenger` to `BackgroundProcessor`, then we would need to add three more type parameters to `BackgroundProcessor`.
lightning/src/ln/msgs.rs (Outdated)

```rust
@@ -1580,6 +1580,9 @@ pub trait OnionMessageHandler {
	/// drop and refuse to forward onion messages to this peer.
	fn peer_disconnected(&self, their_node_id: &PublicKey);

	/// Performs actions that should happen on startup and roughly every ten seconds thereafter.
```
FWIW, docs tend to be more extensive on this method elsewhere. You may have been waiting for a concept ACK first though.
Expanded a bit, but note that other `timer_tick_occurred` methods are on structs, whereas this is on a trait and thus is not associated with any concrete implementation.
Concept ACK. I think it all looks good to me; can do a thorough review when Val's comments are addressed.
Force-pushed from 7774248 to ad36f13.
I'd be good with a rebase to fix CI.
```rust
match message_buffers.entry(first_node_id) {
	hash_map::Entry::Vacant(e) => match addresses {
		None => Err(SendError::InvalidFirstHop(first_node_id)),
```
Error message docs need updating if we want to use this here.
Hmm... this is the only place it is used. What's missing / wrong with the docs?
The docs say "The first hop does not support onion message forwarding." But we don't really know that they don't advertise OM support at this point, we only know that they're not connected to us.
Oh, right! This should now mean we don't know how to connect to the first hop.
```rust
match message_buffers.entry(first_node_id) {
	hash_map::Entry::Vacant(e) => match addresses {
		None => Err(SendError::InvalidFirstHop(first_node_id)),
```
If we get to this point, it means that the peer was previously connected and known to support OMs but was disconnected since `find_path` was called, right? Given that, would it be better (or is there some way) to buffer it here instead?
`first_node_id` isn't necessarily a peer anymore. `DefaultMessageRouter` will look up the node in the `NetworkGraph` if it isn't one. More generally, a `MessageRouter` can return a partial path back from the `Destination`, and we'll happily generate a `ConnectionNeeded` event with the first node in such a path.
I meant that if someone is using `DefaultMessageRouter`, there's a race where we are connected to a peer when `find_path` is initially called, but disconnected by the time we go to actually send the OM to that first-hop peer. A fix would be to always poll the network graph for socket addresses in `find_path`, but this may be too rare to bother.
Yeah, that can occur because we don't hold the lock while finding a path. Seems like we might as well populate `addresses` unconditionally. Hmmm... but that won't work if the peer is unannounced (as we don't have the `SocketAddress`es handy) or if another implementation of `MessageRouter` fails to populate the field, for whatever reason.

I think we'd have to store the addresses from the `Init` message on connection. Then upon disconnection, we could transition to a third `DisconnectedPeer` state to hold on to the addresses until some timer ticks have passed.
@TheBlueMatt Any thoughts on this?
Not really; also not sure it's worth worrying about too much, since it should be incredibly rare timing-wise, and in general we're supposed to be tolerant of onion messages getting lost. If we're really worried about it, we can unconditionally populate the addresses (we should really always be sending OMs to public nodes or they're not reliable anyway), but the extra graph lookup to send a message to our peer kinda sucks.
The alternative to the graph lookup is storing them on connection from the `Init` message. Either way, we can easily do it later, given we don't persist any of this data.
We don't have an address list in the `Init`, though? (Well, our address, but not theirs.)
Oh, right. I incorrectly remembered the purpose of those addresses.
Force-pushed from ad36f13 to 707bd48.
Added tests and fixed a bug that was preventing the timer from ticking once the `ConnectionNeeded` event was generated. 🤦♂️
Force-pushed from 707bd48 to 3868cc9.
Needs rebase.
Force-pushed from 3868cc9 to 415d1cf.
Rebased.
Only really one comment, basically LGTM. Feel free to squash.
```rust
let mut new_pending_messages = VecDeque::new();
core::mem::swap(pending_messages, &mut new_pending_messages);
*self = OnionMessageBuffer::ConnectedPeer(new_pending_messages);
}
```
nit: `else { debug_assert!(false); }`, right?
This is chained after `.or_insert_with(|| OnionMessageBuffer::ConnectedPeer(VecDeque::new()))`, so it would cause test failures. I think it's fine without the check, given that in that case it is a no-op.
```rust
// Buffer an onion message for a disconnected peer
disconnect_peers(&nodes[0], &nodes[1]);
assert!(nodes[0].messenger.next_onion_message_for_peer(nodes[1].node_id).is_none());
nodes[0].messenger.send_onion_message(message, destination, None).unwrap();
```
Suggested change:

```diff
 nodes[0].messenger.send_onion_message(message, destination, None).unwrap();
+assert!(nodes[0].messenger.next_onion_message_for_peer(nodes[1].node_id).is_none());
```
This would cause us to hit the debug assertion in `dequeue_message`. We don't prevent someone from calling `next_onion_message_for_peer` for a disconnected peer. Perhaps we should return `None`? Calling it means the `PeerManager` state is out of sync with the `OnionMessenger` state.
Will address the nits while waiting on responses for some outstanding comments.
Force-pushed from 415d1cf to 7530750.
Needs rebase, but would be happy to see it squashed.
Onion messages are buffered for sending to the next node. Since the network has limited adoption, connecting directly to a peer may be necessary. Add an OnionMessageBuffer abstraction that can differentiate between connected peers and those that are pending a connection. This allows for buffering messages before a connection is established and applying different buffer policies for peers yet to be connected.
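The connected/pending distinction this commit introduces can be sketched in plain Rust. The types below are simplified stand-ins for illustration, not LDK's actual API; the key idea is that both variants accept enqueued messages, but only a connected peer's buffer releases them:

```rust
use std::collections::VecDeque;

// Hypothetical message type standing in for LDK's OnionMessage.
#[derive(Debug, PartialEq)]
struct OnionMessage(u8);

/// Messages for an already-connected peer vs. messages held back until a
/// connection to the node is established.
enum OnionMessageBuffer {
    ConnectedPeer(VecDeque<OnionMessage>),
    PendingConnection(VecDeque<OnionMessage>),
}

impl OnionMessageBuffer {
    /// Both variants buffer messages; the policy applied may differ.
    fn enqueue_message(&mut self, message: OnionMessage) {
        let msgs = match self {
            OnionMessageBuffer::ConnectedPeer(msgs) => msgs,
            OnionMessageBuffer::PendingConnection(msgs) => msgs,
        };
        msgs.push_back(message);
    }

    /// On connection, promote a pending buffer so its queued messages become
    /// available for sending.
    fn mark_connected(&mut self) {
        if let OnionMessageBuffer::PendingConnection(msgs) = self {
            let msgs = std::mem::take(msgs);
            *self = OnionMessageBuffer::ConnectedPeer(msgs);
        }
    }

    /// Only a connected peer's messages are released.
    fn dequeue_message(&mut self) -> Option<OnionMessage> {
        match self {
            OnionMessageBuffer::ConnectedPeer(msgs) => msgs.pop_front(),
            OnionMessageBuffer::PendingConnection(_) => None,
        }
    }
}
```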
7530750 to e8afba4
Rebased and squashed. Had to refactor the sending methods slightly to log onion message contents.
let sender = self.node_signer
	.get_node_id(Recipient::Node)
	.map_err(|_| SendError::GetNodeIdFailed)?;

let peers = self.message_buffers.lock().unwrap()
	.iter()
	.filter(|(_, buffer)| matches!(buffer, OnionMessageBuffer::ConnectedPeer(_)))
	.map(|(node_id, _)| *node_id)
	.collect();

let path = self.message_router
	.find_path(sender, peers, destination)
	.map_err(|_| SendError::PathNotFound)?;

log_trace!(self.logger, "Constructing onion message {}: {:?}", log_suffix, contents);

let result = self.send_onion_message_using_path(path, contents, reply_path);

match result.as_ref() {
	Err(SendError::GetNodeIdFailed) => {
		log_warn!(self.logger, "Unable to retrieve node id {}", log_suffix);
	},
	Err(SendError::PathNotFound) => {
		log_trace!(self.logger, "Failed to find path {}", log_suffix);
	},
	Err(e) => {
		log_trace!(self.logger, "Failed sending onion message {}: {:?}", log_suffix, e);
	},
	Ok(SendSuccess::Buffered) => {
		log_trace!(self.logger, "Buffered onion message {}", log_suffix);
	},
}
Ugh, this needs to be refactored a little more. Otherwise, not all the errors will be logged.
This should be fixed now.
OnionMessenger::send_onion_message takes an OnionMessagePath. This isn't very useful as it requires finding a path manually. Instead, have the method take a Destination and use OnionMessenger's MessageRouter to construct the path. Later, this will allow for buffering messages where the first node in the path isn't a direct connection.
MessageRouter::find_path returns a path to use when sending an onion message. If the first node on the path is not connected or does not support onion messages, sending will fail with InvalidFirstHop. Instead of failing outright, buffer the message for later sending once the first node is a connected peer.
When buffering onion messages for a node that is not connected as a peer, it's possible that the node does not exist. Include a NetworkGraph reference in DefaultMessageRouter so that it can be used to check whether the node actually exists. Otherwise, a malicious node may send an onion message where the reply path's introduction node doesn't exist, which would result in buffering messages that may never be delivered.
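The existence check described in this commit amounts to the following. This is a sketch with simplified stand-in types (a set of announced node ids in place of LDK's NetworkGraph), not the library's actual API:

```rust
use std::collections::HashSet;

// Hypothetical stand-in: the real router consults LDK's NetworkGraph; a set
// of announced node ids captures the check being added.
struct NetworkGraph {
    announced_nodes: HashSet<u8>,
}

#[derive(Debug, PartialEq)]
enum RouteError {
    NodeNotFound,
}

/// Refuse to build a path (and thus buffer messages) toward a first hop that
/// is neither a connected peer nor a node known to exist in the graph.
fn validate_first_hop(
    graph: &NetworkGraph, connected_peers: &HashSet<u8>, first_hop: u8,
) -> Result<(), RouteError> {
    if connected_peers.contains(&first_hop) || graph.announced_nodes.contains(&first_hop) {
        Ok(())
    } else {
        Err(RouteError::NodeNotFound)
    }
}
```

This keeps an attacker-supplied reply path from pinning memory for a node that was never announced.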
MessageRouter::find_path is given a Destination to reach via a set of peers. If a path cannot be found, it may return a partial path such that OnionMessenger can signal a direct connection to the first node in the path is needed. Include a list of socket addresses in the returned OnionMessagePath to allow OnionMessenger to know how to connect to the node. This allows DefaultMessageRouter to use its NetworkGraph to return socket addresses for gossiped nodes.
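How a caller might act on such a partial path can be sketched as follows. The type and variant names here (SendOutcome, first_node_addresses) are illustrative stand-ins rather than LDK's exact API:

```rust
// Simplified stand-ins for lightning's types, for illustration only.
type NodeId = u8;
type SocketAddress = &'static str;

/// A path whose first hop may require a direct connection; the addresses are
/// only populated when the router had to return a partial path.
struct OnionMessagePath {
    intermediate_nodes: Vec<NodeId>,
    destination: NodeId,
    first_node_addresses: Option<Vec<SocketAddress>>,
}

#[derive(Debug, PartialEq)]
enum SendOutcome {
    SentToPeer(NodeId),
    BufferedAwaitingConnection(NodeId, Vec<SocketAddress>),
    InvalidFirstHop(NodeId),
}

/// Send directly when the first hop is already a peer; otherwise buffer and
/// surface the addresses so a connection can be initiated.
fn handle_path(path: OnionMessagePath, connected_peers: &[NodeId]) -> SendOutcome {
    let first_hop = *path.intermediate_nodes.first().unwrap_or(&path.destination);
    if connected_peers.contains(&first_hop) {
        SendOutcome::SentToPeer(first_hop)
    } else {
        match path.first_node_addresses {
            Some(addresses) => SendOutcome::BufferedAwaitingConnection(first_hop, addresses),
            None => SendOutcome::InvalidFirstHop(first_hop),
        }
    }
}
```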
When there isn't a direct connection with the Destination of an OnionMessage, look up socket addresses from the NetworkGraph. This is used to signal to OnionMessenger that a direct connection is needed to send the message.
A MessageRouter may be unable to find a complete path to an onion message's destination. This could be because no such path exists or because nodes on a potential path don't support onion messages. Add an event that indicates a connection with a node is needed in order to send the message.
An OnionMessageHandler may buffer messages that can't be sent because the recipient is not a peer. Have the trait extend EventsProvider so that implementations can generate an Event::ConnectionNeeded for any nodes that fall into this category. Also, implement EventsProvider for OnionMessenger and IgnoringMessageHandler.
e8afba4 to 3e87b3c
OnionMessenger buffers onion messages for nodes that are pending a connection. To prevent DoS concerns, add a timer_tick_occurred method to OnionMessageHandler so that buffered messages can be dropped. This will be called in lightning-background-processor every 10 seconds.
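The DoS guard this commit describes can be sketched with a per-node tick counter. The tick limit and field names below are assumptions for illustration, not LDK's actual constants:

```rust
use std::collections::HashMap;

// Assumed tick limit; with ticks every 10 seconds this bounds how long
// messages for a never-connecting node are kept.
const MAX_TIMER_TICKS: usize = 2;

struct PendingBuffer {
    messages: Vec<u8>,
    ticks: usize,
}

/// Every timer tick ages pending buffers and drops any whose node has not
/// connected within MAX_TIMER_TICKS ticks, freeing their messages.
fn timer_tick_occurred(pending: &mut HashMap<u8, PendingBuffer>) {
    pending.retain(|_node_id, buffer| {
        buffer.ticks += 1;
        buffer.ticks < MAX_TIMER_TICKS
    });
}
```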
Some code hygiene before another parameter is added and rustfmt is eventually used.
Simply to avoid excessive wrapping when possible.
OnionMessageHandler implementations now also implement EventsProvider. Update lightning-background-processor to also process any events the PeerManager's OnionMessageHandler provides.
lightning-background-processor processes events provided by the PeerManager's OnionMessageHandler for when a connection is needed. If a connection is not established in a reasonable amount of time, drop any buffered onion messages by calling timer_tick_occurred.
Additional tests will be added needing a similar node struct, so consolidate its usage.
Add tests for onion message buffering checking that messages are cleared upon disconnection and timed out after MAX_TIMER_TICKS. Also, checks that ConnectionNeeded events are generated.
3e87b3c to 0b83116
Latest push should fix the
	Ok(())
}
e.get_mut().enqueue_message(onion_message);
Ok(SendSuccess::Buffered)
Should this return value depend on the buffer's connected/pending state?
Yeah, I'll fix it in a follow-up.
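The follow-up being suggested could distinguish the two buffer states in the return value. A minimal sketch, with illustrative names (the BufferedAwaitingConnection variant is an assumption, not confirmed API):

```rust
// Distinguish queuing for a connected peer from queuing while a connection
// is still pending.
#[derive(Debug, PartialEq)]
enum SendSuccess {
    Buffered,
    BufferedAwaitingConnection,
}

enum OnionMessageRecipient {
    ConnectedPeer(Vec<u8>),
    PendingConnection(Vec<u8>),
}

/// The return value now reflects the recipient's connected/pending state.
fn enqueue_message(recipient: &mut OnionMessageRecipient, message: u8) -> SendSuccess {
    match recipient {
        OnionMessageRecipient::ConnectedPeer(msgs) => {
            msgs.push(message);
            SendSuccess::Buffered
        },
        OnionMessageRecipient::PendingConnection(msgs) => {
            msgs.push(message);
            SendSuccess::BufferedAwaitingConnection
        },
    }
}
```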
///
/// Only needs to be set if a connection to the node is required. [`OnionMessenger`] may use
/// this to initiate such a connection.
pub addresses: Option<Vec<SocketAddress>>,
nit: can we say, like, first_hop_addresses or first_hop_socket_addresses? Just addresses doesn't tell me much in the context of an onion message path.
Fair point. Will address.
Not all nodes support onion messages, so it may not be possible for a MessageRouter to find a complete path to an OnionMessage's destination. A direct connection to the first node in an OnionMessagePath is therefore required. This PR adds a ConnectionNeeded event to signal to users to connect to a node.

When sending an OnionMessage or replying to one -- but not when forwarding -- OnionMessenger will buffer messages for immediate recipients that are not peers and, upon connection, make them available for processing as normal. A new release_pending_connections method is added to the OnionMessageHandler trait for the PeerManager to call and provide the necessary ConnectionNeeded events. Thus, PeerManager now implements EventsProvider and is called in BackgroundProcessor accordingly. PeerManager calls the new method in its own timer_ticks_occurred method. This allows OnionMessenger to drop any buffered messages for nodes that haven't been connected in a certain number of timer ticks.
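From a user's perspective, the flow ends with handling the new event. A sketch of what that handling might look like, using a hypothetical event shape and a caller-supplied dialing callback (the real variant lives in lightning's events module and carries real node ids and socket addresses):

```rust
// Hypothetical event shape mirroring the ConnectionNeeded event described
// above; field types are simplified stand-ins.
enum Event {
    ConnectionNeeded { node_id: u8, addresses: Vec<&'static str> },
}

/// React by dialing the node at one of the provided addresses so that
/// buffered onion messages can be released once the connection completes.
fn handle_event(event: &Event, dial: &mut dyn FnMut(u8, &'static str)) {
    match event {
        Event::ConnectionNeeded { node_id, addresses } => {
            if let Some(addr) = addresses.first() {
                dial(*node_id, *addr);
            }
        },
    }
}
```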