
Direct connect for OnionMessage sending #2723

Merged: 18 commits, Dec 8, 2023

Conversation


@jkczyz jkczyz commented Nov 9, 2023

Not all nodes support onion messages, so it may not be possible for a MessageRouter to find a complete path to an OnionMessage's destination. A direct connection to the first node in an OnionMessagePath is therefore required. This PR adds a ConnectionNeeded event to signal to users to connect to a node.

When sending an OnionMessage or replying to one -- but not when forwarding -- OnionMessenger will buffer messages for immediate recipients that are not peers and upon connection make them available for processing as normal. A new release_pending_connections method is added to the OnionMessageHandler trait for the PeerManager to call and provide the necessary ConnectionNeeded events. Thus, PeerManager now implements EventsProvider and is called in BackgroundProcessor accordingly.

PeerManager calls the new method in its own timer_tick_occurred method. This allows OnionMessenger to drop any buffered messages for nodes that haven't connected within a certain number of timer ticks.
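As a rough sketch of how a user might consume the proposed event, with stand-in types in place of the real LDK ones (the variant and field names below are assumptions based on this discussion, not the final API):

```rust
// Hypothetical stand-ins for illustration only; the real LDK types differ.
#[derive(Debug, Clone, PartialEq)]
struct NodeAnnouncement {
    node_id: String,
    addresses: Vec<String>,
}

#[derive(Debug, Clone, PartialEq)]
enum Event {
    ConnectionNeeded { node_announcement: NodeAnnouncement },
}

// In an event handler, a user would extract the node to connect to and hand
// it to their networking layer so the buffered onion messages can be sent.
fn node_to_connect(event: &Event) -> Option<(String, Vec<String>)> {
    match event {
        Event::ConnectionNeeded { node_announcement } => Some((
            node_announcement.node_id.clone(),
            node_announcement.addresses.clone(),
        )),
    }
}
```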

secp_ctx: Secp256k1<secp256k1::All>,
message_router: MR,
offers_handler: OMH,
custom_handler: CMH,
}

/// [`OnionMessage`]s buffered to be sent.
enum OnionMessageBuffer {
Contributor Author

Was thinking OnionMessageRecipient may be a better name.

Contributor Author

Renamed as mentioned in a separate commit.

lightning/src/events/mod.rs (resolved)
@@ -516,18 +583,23 @@ where
pub fn send_onion_message<T: OnionMessageContents>(
Contributor Author (@jkczyz, Nov 9, 2023)

Wondering if this should still be pub? We send through the handlers, so this is not used internally outside of this module. Internally, we need to examine the return value to decide if a connection is needed. So users calling this won't trigger a connection request nor will the message be buffered for them, except if we previously decided that a connection is needed.

Collaborator

It's public so users can send custom messages easily, which I think we should support, if possible. We'll need to handle the needs-connection case internally here, though, if we want to keep it.

Contributor Author

The interface here isn't congruent with other sending, FWIW. This takes an OnionMessagePath instead of a Destination, so the assumption is that the MessageRouter has already been called prior to calling send_onion_message. We'd probably want to do message routing internally, I'd imagine?

Collaborator

Mmm, yea, I think so. I mean we don't really need the interface at all, but it's nice to have, as long as it is easy to use.

Comment on lines 2404 to 2409
let nodes_to_connect =
self.message_handler.onion_message_handler.release_pending_connections();
let mut pending_events = self.pending_events.lock().unwrap();
for node_announcement in nodes_to_connect {
pending_events.push(Event::ConnectionNeeded { node_announcement });
}
Contributor Author

Originally, I had this in process_events, but I moved it to timer_tick_occurred so the timeout could be in terms of ticks. However, it has occurred to me that having it here means we may wait up to 10 seconds before even generating the events, which would result in bad UX. We could instead simply generate events directly in process_pending_events and process_pending_events_async, dropping the intermediary pending_events Vec. But, as before, we'd have to define the timeout in something other than ticks, or possibly add another method to OnionMessageHandler for calling in timer_tick_occurred.

Contributor Author

Updated the OnionMessageHandler trait to have a timer_tick_occurred method so that dropping buffered messages is independent of providing events.
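The tick-driven expiry described here can be sketched as follows; the struct, field names, and MAX_TIMER_TICKS constant are illustrative assumptions (not LDK's actual definitions), with roughly ten seconds per tick as noted above:

```rust
// Illustrative sketch: a buffer for a not-yet-connected peer that is dropped
// after a fixed number of timer ticks. Names are assumptions, not LDK's.
const MAX_TIMER_TICKS: usize = 2;

struct PendingConnection {
    pending_messages: Vec<Vec<u8>>, // serialized onion messages awaiting the peer
    ticks: usize,
}

impl PendingConnection {
    fn new() -> Self {
        Self { pending_messages: Vec::new(), ticks: 0 }
    }

    // Called roughly every ten seconds; returns true once the buffered
    // messages should be dropped because the peer never connected.
    fn timer_tick_occurred(&mut self) -> bool {
        self.ticks += 1;
        self.ticks >= MAX_TIMER_TICKS
    }
}
```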

Collaborator @TheBlueMatt left a comment

Some high level design questions I'd like to understand first.

lightning/src/onion_message/messenger.rs (resolved)
lightning/src/onion_message/messenger.rs (resolved)

/// [`MessageRouter`]: crate::onion_message::MessageRouter
/// [`Destination`]: crate::onion_message::Destination
/// [`OnionMessageHandler`]: crate::ln::msgs::OnionMessageHandler
ConnectionNeeded {
Collaborator

Why did you end up going the event route vs having a "start opening a connection async" trait (or even just doing it via the onion message router)? I guess an event could let us reuse it later in ChannelManager to just tell the user to reconnect every time a peer we care about disconnects, rather than users having to poll that, but it does seem like a lot of indirection to get the needs-connection event out.

Contributor Author

It would require adding another parameterization for users to implement (i.e., either a new trait on OnionMessenger or one on DefaultMessageRouter), which would then need to bubble up to the Simple* aliases.

And since it needs to be async, the trait implementation wouldn't be enough. Users would need to write code to process connection requests in the background from what their trait enqueued. If we wrote some of this logic for them, then we'd need to do it for each language's IO layer. But that wouldn't fit well into any peer management logic that they may have (e.g., LDK Node's PeerStore).

Seems much simpler just to surface an event for them so that no additional parameterization and processing code is needed, other than handling one more event variant.

Collaborator

Yea, fair, I guess. I think the current piping through of the needs-connection nodes is also a bit gross, but I'm not sure it's more gross. One further thing to consider is what we want the eventual end state to be. If we want to generate ConnectionNeeded events for peers with which we have a channel, we should stick with the event route (and probably start generating such events in the same release). If we don't, adding it via the router is easier to "walk back", and is at least only gross temporarily.

Contributor

Could we pipe the needs-connect nodes through OffersMessageHandler / CustomOMHandler with a new trait method? Then the ChannelManager could generate ConnectionNeeded events and we wouldn't have to trouble the background processor. I do see Jeff's argument that the current design aligns with the peer manager doing peer stuff though.

Contributor Author

Hmm... that wouldn't work for uses of send_onion_message or anyone using something other than ChannelManager to implement one of those traits.

Contributor

The custom trait implementations would need to support connecting to peers outbound; would it not work in some other way? I'm not convinced it's cleaner than the current approach though :/ just nicer in terms of the PeerMan/BgProcessor.

Contributor Author

Not sure that I follow. All custom implementations should work with the current approach of using OnionMessenger. Or are you saying a custom implementation of OnionMessageHandler (i.e., something other than OnionMessenger)? Note that CustomOnionMessageHandler and OffersMessageHandler are specific to OnionMessenger. Any implementations of those work fine with the current approach.

Contributor Author

Discussed offline. Updated to have events processed directly from any OnionMessageHandler by requiring implementations to also implement EventsProvider. I was able to do so without adding another type parameter to BackgroundProcessor since the handler can be fetched from PeerManager.

return;
match self.send_onion_message(path, contents, reply_path) {
Err(SendError::InvalidFirstHop(first_node_id, onion_message)) => {
let network_graph = self.network_graph.deref().read_only();
Collaborator

Hmm, is this the right place for this? We could avoid the network graph reference by pushing this into the onion message routing interface, and have it be allowed to return some kind of "My route requires you to connect to (Vec<SocketAddress>, PublicKey) flag". Alternatively, we could push it into the event handler, which probably already has such a reference, but some users implement that themselves so they'd be stuck doing it.

Contributor Author

Hmm, is this the right place for this? We could avoid the network graph reference by pushing this into the onion message routing interface, and have it be allowed to return some kind of "My route requires you to connect to (Vec<SocketAddress>, PublicKey) flag".

Eh, I suppose if DefaultMessageRouter will ultimately need a NetworkGraph reference, perhaps it would be better to have it there?

Having it in OnionMessenger was meant as a check against malicious actors sending onion messages containing bogus reply paths. Otherwise, we'd buffer messages that could never be delivered until the timeout drops them, which could be problematic if it causes us to exceed MAX_TOTAL_BUFFER_SIZE. So I figured that logic shouldn't be left to implementors of MessageRouter.

Also note that currently a MessageRouter is allowed to return a partial path if it can't find a full one back from the destination. And we'll direct connect to the start of that path. So returning a "needs connection" from there would at very least complicate the result type, if we want to support that still.

Further, it's a little roundabout to have MessageRouter tell OnionMessenger that it needs a connection to a node when OnionMessenger is the one that is already keeping track of the peers. And note that currently send_onion_message takes the path and contents, from which it creates an OnionMessage. So if MessageRouter is responsible for saying "needs connection" then we still need to call send_onion_message in order to have something to buffer. That at the very least complicates the processing of MessageRouter's return value.

I'm a little torn on the right approach here.

Alternatively, we could push it into the event handler, which probably already has such a reference, but some users implement that themselves so they'd be stuck doing it.

Not sure I understand this. Are you saying to just have the event say needs connection to some node id and have the handler look it up in the NetworkGraph? We'd blindly buffer messages in that case, too.
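The buffer-size concern in this thread can be illustrated with a minimal sketch; the MAX_TOTAL_BUFFER_SIZE constant and the byte accounting here are assumptions for illustration, not LDK's actual logic:

```rust
// Sketch of the DoS guard being discussed: refuse to queue further onion
// messages once a total cap is hit, so bogus reply paths from malicious
// senders cannot grow memory unboundedly. Values are illustrative.
const MAX_TOTAL_BUFFER_SIZE: usize = 4096;

fn try_buffer(buffers: &mut Vec<Vec<u8>>, msg: Vec<u8>) -> Result<(), ()> {
    let total: usize = buffers.iter().map(|m| m.len()).sum();
    if total + msg.len() > MAX_TOTAL_BUFFER_SIZE {
        Err(()) // drop instead of buffering a possibly undeliverable message
    } else {
        buffers.push(msg);
        Ok(())
    }
}
```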

Collaborator

Eh, I suppose if DefaultMessageRouter will ultimately need a NetworkGraph reference, perhaps it would be better to have it there?

Makes sense to me.

Having it in OnionMessenger was meant as a check against malicious actors sending onion messages containing bogus reply paths. Otherwise, we'd buffer messages that could never be delivered until the timeout drops them, which could be problematic if it causes us to exceed MAX_TOTAL_BUFFER_SIZE. So I figured that logic shouldn't be left to implementors of MessageRouter.

Hmm, that's a fair point. That said, I'm not sure how an implementor of MessageRouter could screw that up - like if they're required to return a NetAddress they either give us something we concretely should try to connect to, or not.

Also note that currently a MessageRouter is allowed to return a partial path if it can't find a full one back from the destination. And we'll direct connect to the start of that path. So returning a "needs connection" from there would at very least complicate the result type, if we want to support that still.

True, though only in the sense that we'd have to add a Vec<NetAddress> to OnionMessagePath, which doesn't seem insane, even if a bit strange.

Further, it's a little roundabout to have MessageRouter tell OnionMessenger that it needs a connection to a node when OnionMessenger is the one that already is keeping track of the peers.

Mmm, true. I mean we do already pass the list of connected peers to the router, so it's not like it shouldn't/won't have that info on hand, and for good reason. I guess in my head it doesn't seem wrong at all - the router's job is to look at the network graph plus the peers we have and where we want to get to, and decide the best way to get there. If that means connecting, it can make that decision, but it's up to the router, and once it's decided, it seems like it should be able to communicate that fully to the messenger, which is responsible for actually sending the message (and, awkwardly, passing some event through the peermanager to the user to trigger the connection, but really that is a router-originated event).

And note that currently send_onion_message takes the path and contents, from which it creates an OnionMessage. So if MessageRouter is responsible for saying "needs connection" then we still need to call send_onion_message in order to have something to buffer. That at the very least complicates the processing of MessageRouter's return value.

Yea, as discussed above, it seems to me we should change the send_onion_message API.

Not sure I understand this. Are you saying to just have the event say needs connection to some node id and have the handler look it up in the NetworkGraph? We'd blindly buffer messages in that case, too.

Yea, I hadn't considered that we're DC'ing in response to invoice_requests, not just send_payment_for_offers, so we can't really do that.

Contributor Author

Ok, let me go ahead and move NetworkGraph to DefaultMessageRouter. I guess the idea would be to add a Vec<SocketAddress> field to OnionMessagePath? I used NodeAnnouncement since it would give the user more information on deciding whether they want to connect to the node. But perhaps SocketAddresses are sufficient. And I suppose a MessageRouter may want to connect to a private node, so we should support that.

(and, awkwardly, passing some event through the peermanager to the user to trigger the connection, but really that is a router-originated event)

An alternative could be to parameterize BackgroundProcessor with OnionMessenger or MessageRouter, but I figured we'd want to avoid adding more parameterization there. Seemed to make sense to have PeerManager be responsible as a higher-level abstraction for peer-related needs.

Collaborator

Ok, let me go ahead and move NetworkGraph to DefaultMessageRouter. I guess the idea would be to add a Vec<SocketAddress> field to OnionMessagePath?

Yea, that was my thinking. I mean ideally we wouldn't require returning anything at all; we can figure out if we need to connect without it. But to your point we kinda need to here for DoS reasons. One issue is that in the ChannelManager, if we want to generate the new events, we have no idea what the network address is, which makes it hard to generate the events there.

I used NodeAnnouncement since it would give the user more information on deciding whether they want to connect to the node. But perhaps SocketAddresses are sufficient.

Yea, it just seemed like excess information that wasn't really required, and socket addresses are optional inside that so we kinda want to force them.

And I suppose a MessageRouter may want to connect to a private node, so we should support that.

I wasn't actually even thinking about that, but, yea, fair point.

An alternative could be to parameterize BackgroundProcessor with OnionMessenger or MessageRouter, but I figured we'd want to avoid adding more parameterization there. Seemed to make sense to have PeerManager be responsible as a higher-level abstraction for peer-related needs.

Yea, seems fine, just a lot of piping stuff through to get from the onion message router to the background processor.

Contributor Author

Related, we'll need to handle sciddir_or_pubkey somewhere. That is, going from a Destination containing a BlindedPath with an scid to an OnionMessagePath having a Destination containing a BlindedPath with a pubkey.

I suppose doing this in a MessageRouter is fine. But when wanting to construct such a blinded path, we'll need the NetworkGraph, too. This responsibility then may want to also live in MessageRouter, but the current draft is to have it in Router partly because ChannelManager already has one. So this may need to be rethought.

Collaborator

Hmm, yea, good point. We can do that when we get there on the receiving side, though.

Contributor Author

jkczyz commented Nov 16, 2023

Updated per offline discussion. Let me know if this new approach looks good, and I'll write up some tests.


codecov-commenter commented Nov 16, 2023

Codecov Report

Attention: 55 lines in your changes are missing coverage. Please review.

Comparison is base (37150b4) 88.63% compared to head (0b83116) 89.49%.
Report is 22 commits behind head on main.

Files Patch % Lines
lightning/src/onion_message/messenger.rs 74.44% 40 Missing and 6 partials ⚠️
lightning-background-processor/src/lib.rs 86.84% 1 Missing and 4 partials ⚠️
lightning/src/events/mod.rs 0.00% 2 Missing ⚠️
lightning/src/onion_message/functional_tests.rs 98.71% 2 Missing ⚠️


@@            Coverage Diff             @@
##             main    #2723      +/-   ##
==========================================
+ Coverage   88.63%   89.49%   +0.85%     
==========================================
  Files         115      115              
  Lines       90446    97229    +6783     
  Branches    90446    97229    +6783     
==========================================
+ Hits        80170    87017    +6847     
+ Misses       7875     7829      -46     
+ Partials     2401     2383      -18     


Contributor Author

jkczyz commented Nov 16, 2023

Latest push is to fix CI. I'll need to rebase to fix the fuzz test.

Contributor @valentinewallace left a comment

Looks good I think. Will give it another glance later.

lightning-background-processor/src/lib.rs (resolved)
lightning/src/events/mod.rs (resolved)
use lightning::events::EventsProvider;

let events = core::cell::RefCell::new(Vec::new());
peer_manager.onion_message_handler().process_pending_events(&|e| events.borrow_mut().push(e));
Contributor

Maybe after async functions in traits are stabilized, we can add an async event processing method to EventsProvider with a default implementation 🤔 I'm just wondering if this could be a temporary workaround.

Contributor Author

Yeah, possibly. EventsProvider is kinda weird in that up until now it never really needed to be a trait.

Collaborator

Not sure I understand the reason for this now? Can we not just add a process_pending_events_async method on OnionMessenger just like we do for ChannelManager?

Contributor Author

BackgroundProcessor gets the OnionMessenger through PeerManager, which only knows it is something that implements OnionMessageHandler and EventsProvider. If we instead pass OnionMessenger to BackgroundProcessor, then we would need to add three more type parameters to BackgroundProcessor.

@@ -1580,6 +1580,9 @@ pub trait OnionMessageHandler {
/// drop and refuse to forward onion messages to this peer.
fn peer_disconnected(&self, their_node_id: &PublicKey);

/// Performs actions that should happen on startup and roughly every ten seconds thereafter.
Contributor

FWIW, docs tend to be more extensive on this method elsewhere. You may have been waiting for a concept ACK first though.

Contributor Author

Expanded a bit, but note that other timer_tick_occurred methods are on structs whereas this is on a trait and thus is not associated with any concrete implementations.

Collaborator @TheBlueMatt

Concept ACK, I think it all looks good to me, can do a thorough review when val's comments are addressed.

@TheBlueMatt TheBlueMatt added this to the 0.0.119 milestone Nov 27, 2023
Contributor @valentinewallace left a comment

I'd be good with a rebase to fix CI

lightning/src/onion_message/messenger.rs (resolved)

match message_buffers.entry(first_node_id) {
hash_map::Entry::Vacant(e) => match addresses {
None => Err(SendError::InvalidFirstHop(first_node_id)),
Contributor

Error message docs need updating if we want to use this here.

Contributor Author

Hmm... this is the only place it is used. What's missing / wrong with the docs?

Contributor

The docs say "The first hop does not support onion message forwarding." But we don't really know that they don't advertise OM support at this point, we only know that they're not connected to us.

Contributor Author

Oh, right! This should now mean we don't know how to connect to the first hop.


match message_buffers.entry(first_node_id) {
hash_map::Entry::Vacant(e) => match addresses {
None => Err(SendError::InvalidFirstHop(first_node_id)),
Contributor

If we get to this point, it means that the peer was previously connected+known to support OMs but was disconnected since find_path was called, right? Given that, would it be better or is there some way to buffer it here instead?

Contributor Author (@jkczyz, Nov 30, 2023)

first_node_id isn't necessarily a peer anymore. DefaultMessageRouter will look up the node in the NetworkGraph if it isn't one. More generally, a MessageRouter can return a partial path back from the Destination, and we'll happily generate a ConnectionNeeded event with the first node in such a path.

Contributor

I meant that if someone is using DefaultMessageRouter, there's a race where we are connected to a peer when find_path is initially called, but disconnected by the time we go to actually send the OM to that first-hop peer. A fix would be to always poll the network graph for socket addresses in find_path, but this may be too rare to bother.

Contributor Author

Yeah, that can occur because we don't hold the lock while finding a path. Seems like we might as well populate addresses unconditionally. Hmmm... but that won't work if the peer is unannounced (as we don't have the SocketAddresses handy) or if another implementation of MessageRouter fails to populate the field, for whatever reason.

I think we'd have to store the addresses from the Init message on connection. Then upon disconnection, we could transition to a third DisconnectedPeer state to hold on to the addresses until some timer ticks have passed.

Contributor Author

@TheBlueMatt Any thoughts on this?

Collaborator

Not really; also not sure it's worth worrying about too much, as it should be incredibly rare timing-wise; in general we're supposed to be tolerant of onion messages getting lost. If we're really worried about it we can unconditionally populate the addresses (we should really always be sending OMs to public nodes, or they're not reliable anyway), but the extra graph lookup to send a message to our peer kinda sucks.

Contributor Author

The alternative to the graph lookup is storing them on connection from the Init message. Either way, we can easily do it later given we don't persist any of this data.

Collaborator

We don't have an address list in the Init, though? (Well, our address, but not theirs.)

Contributor Author

Oh, right. I incorrectly remembered the purpose of those addresses.

Contributor Author @jkczyz left a comment

Added tests and fixed a bug that was preventing the timer from ticking once the ConnectionNeeded event was generated. 🤦‍♂️

Collaborator @TheBlueMatt

Needs rebase.

Contributor Author

jkczyz commented Nov 30, 2023

Needs rebase.

Rebased

Collaborator @TheBlueMatt left a comment

Only really one comment, basically LGTM. Feel free to squash.

lightning/src/onion_message/messenger.rs (resolved)
let mut new_pending_messages = VecDeque::new();
core::mem::swap(pending_messages, &mut new_pending_messages);
*self = OnionMessageBuffer::ConnectedPeer(new_pending_messages);
}
Collaborator

nit: else { debug_assert!(false); }, right?

Contributor Author

This is chained after .or_insert_with(|| OnionMessageBuffer::ConnectedPeer(VecDeque::new())), so it would cause test failures. I think it's fine without the check, given that in that case it is a no-op.
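A minimal sketch of the chaining being described, with names loosely following the snippets in this thread (assumed, not exact LDK code):

```rust
use std::collections::{HashMap, VecDeque};

// `mark_connected` is chained after `or_insert_with`, so a freshly inserted
// ConnectedPeer hits the no-op arm rather than a debug assertion.
enum OnionMessageBuffer {
    ConnectedPeer(VecDeque<Vec<u8>>),
    PendingConnection(VecDeque<Vec<u8>>),
}

impl OnionMessageBuffer {
    fn mark_connected(&mut self) {
        if let OnionMessageBuffer::PendingConnection(pending_messages) = self {
            let mut new_pending_messages = VecDeque::new();
            core::mem::swap(pending_messages, &mut new_pending_messages);
            *self = OnionMessageBuffer::ConnectedPeer(new_pending_messages);
        }
        // Already ConnectedPeer: intentionally a no-op, not an assertion.
    }
}

fn peer_connected(buffers: &mut HashMap<u64, OnionMessageBuffer>, node_id: u64) {
    buffers
        .entry(node_id)
        .or_insert_with(|| OnionMessageBuffer::ConnectedPeer(VecDeque::new()))
        .mark_connected();
}
```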

lightning/src/events/mod.rs (resolved)

// Buffer an onion message for a disconnected peer
disconnect_peers(&nodes[0], &nodes[1]);
assert!(nodes[0].messenger.next_onion_message_for_peer(nodes[1].node_id).is_none());
nodes[0].messenger.send_onion_message(message, destination, None).unwrap();
Collaborator

Suggested change
nodes[0].messenger.send_onion_message(message, destination, None).unwrap();
nodes[0].messenger.send_onion_message(message, destination, None).unwrap();
assert!(nodes[0].messenger.next_onion_message_for_peer(nodes[1].node_id).is_none());

Contributor Author:
This would cause us to hit the debug assertion in dequeue_message. We don't prevent someone from calling next_onion_message_for_peer for a disconnected peer. Perhaps we should return None? Calling it means the PeerManager state is out-of-sync with the OnionMessenger state.

@jkczyz (Contributor Author) left a comment:
Will address the nits while waiting on responses for some outstanding comments.

use lightning::events::EventsProvider;

let events = core::cell::RefCell::new(Vec::new());
peer_manager.onion_message_handler().process_pending_events(&|e| events.borrow_mut().push(e));
Contributor Author:
BackgroundProcessor gets the OnionMessenger through PeerManager, which only knows it is something that implements OnionMessageHandler and EventsProvider. If we instead pass OnionMessenger to BackgroundProcessor, then we would need to add three more type parameters to BackgroundProcessor.


match message_buffers.entry(first_node_id) {
hash_map::Entry::Vacant(e) => match addresses {
None => Err(SendError::InvalidFirstHop(first_node_id)),
Contributor Author:
@TheBlueMatt Any thoughts on this?

@TheBlueMatt (Collaborator):

Needs rebase, but would be happy to see it squashed.

Onion messages are buffered for sending to the next node. Since the
network has limited adoption, connecting directly to a peer may be
necessary. Add an OnionMessageBuffer abstraction that can differentiate
between connected peers and those that are pending a connection. This allows
for buffering messages before a connection is established and applying
different buffer policies for peers yet to be connected.
@jkczyz (Contributor Author) commented Dec 6, 2023:

Needs rebase, but would be happy to see it squashed.

Rebased and squashed. Had to refactor the sending methods slightly to log onion message contents.

Comment on lines 599 to 680
let sender = self.node_signer
.get_node_id(Recipient::Node)
.map_err(|_| SendError::GetNodeIdFailed)?;

let peers = self.message_buffers.lock().unwrap()
.iter()
.filter(|(_, buffer)| matches!(buffer, OnionMessageBuffer::ConnectedPeer(_)))
.map(|(node_id, _)| *node_id)
.collect();

let path = self.message_router
.find_path(sender, peers, destination)
.map_err(|_| SendError::PathNotFound)?;

log_trace!(self.logger, "Constructing onion message {}: {:?}", log_suffix, contents);

let result = self.send_onion_message_using_path(path, contents, reply_path);

match result.as_ref() {
Err(SendError::GetNodeIdFailed) => {
log_warn!(self.logger, "Unable to retrieve node id {}", log_suffix);
},
Err(SendError::PathNotFound) => {
log_trace!(self.logger, "Failed to find path {}", log_suffix);
},
Err(e) => {
log_trace!(self.logger, "Failed sending onion message {}: {:?}", log_suffix, e);
},
Ok(SendSuccess::Buffered) => {
log_trace!(self.logger, "Buffered onion message {}", log_suffix);
},
}
Contributor Author:
Ugh, this needs to be refactored a little more. Otherwise, not all the errors will be logged.

Contributor Author:
This should be fixed now.

OnionMessenger::send_onion_message takes an OnionMessagePath. This isn't
very useful as it requires finding a path manually. Instead, have the
method take a Destination and use OnionMessenger's MessageRouter to
construct the path. Later, this will allow for buffering messages where
the first node in the path isn't a direct connection.
MessageRouter::find_path returns a path to use when sending an onion
message. If the first node on the path is not connected or does not
support onion messages, sending will fail with InvalidFirstHop. Instead
of failing outright, buffer the message for later sending once the first
node is a connected peer.
When buffering onion messages for a node that is not connected as a
peer, it's possible that the node does not exist. Include a NetworkGraph
reference in DefaultMessageRouter so that it can be used to check if the
node actually exists. Otherwise, a malicious node may send an onion
message where the reply path's introduction node doesn't exist. This
would result in buffering messages that may never be delivered.
MessageRouter::find_path is given a Destination to reach via a set of
peers. If a path cannot be found, it may return a partial path such that
OnionMessenger can signal a direct connection to the first node in the
path is needed. Include a list of socket addresses in the returned
OnionMessagePath to allow OnionMessenger to know how to connect to the
node.

This allows DefaultMessageRouter to use its NetworkGraph to return
socket addresses for gossiped nodes.
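The shape described above can be sketched as follows. Field and type names are illustrative stand-ins for the LDK API (the PR later renames the field along the lines of `first_node_addresses`), not its exact definitions:

```rust
// Hypothetical simplification of an onion message path whose first hop may
// require a new direct connection.
#[derive(Debug, PartialEq)]
struct SocketAddress(String);

#[allow(dead_code)]
struct OnionMessagePath {
    intermediate_nodes: Vec<[u8; 33]>,
    destination: [u8; 33],
    // Only set when a direct connection to the first node is required; this
    // lets OnionMessenger know how to connect (e.g. via a ConnectionNeeded
    // signal) instead of failing with InvalidFirstHop.
    first_node_addresses: Option<Vec<SocketAddress>>,
}

fn needs_connection(path: &OnionMessagePath) -> bool {
    path.first_node_addresses.is_some()
}

fn main() {
    let first_hop = [2u8; 33];
    let path = OnionMessagePath {
        intermediate_nodes: vec![],
        destination: first_hop,
        first_node_addresses: Some(vec![SocketAddress("203.0.113.1:9735".into())]),
    };
    assert!(needs_connection(&path));
}
```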
When there isn't a direct connection with the Destination of an
OnionMessage, look up socket addresses from the NetworkGraph. This is
used to signal to OnionMessenger that a direct connection is needed to
send the message.
A MessageRouter may be unable to find a complete path to an onion
message's destination. This could be because no such path exists or nodes
on a potential path don't support onion messages. Add an event
that indicates a connection with a node is needed in order to send the
message.
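Such an event might be consumed along these lines. The variant and field names follow the PR's description but are stand-ins for the real lightning Event type:

```rust
// Hypothetical model of a ConnectionNeeded-style event; not LDK's actual
// Event enum.
#[allow(dead_code)]
enum Event {
    ConnectionNeeded {
        node_id: [u8; 33],
        addresses: Vec<String>,
    },
}

// A user's event handler would attempt an outbound connection to one of the
// provided addresses; here we just pick the first.
fn handle_event(event: Event) -> Option<String> {
    match event {
        Event::ConnectionNeeded { addresses, .. } => addresses.into_iter().next(),
    }
}

fn main() {
    let event = Event::ConnectionNeeded {
        node_id: [3u8; 33],
        addresses: vec!["198.51.100.7:9735".into()],
    };
    assert_eq!(handle_event(event), Some("198.51.100.7:9735".to_string()));
}
```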
An OnionMessageHandler may buffer messages that can't be sent because
the recipient is not a peer. Have the trait extend EventsProvider so that
implementations can generate an Event::ConnectionNeeded for any nodes that
fall into this category. Also, implement
EventsProvider for OnionMessenger and IgnoringMessageHandler.
OnionMessenger buffers onion messages for nodes that are pending a
connection. To prevent DoS concerns, add a timer_tick_occurred method to
OnionMessageHandler so that buffered messages can be dropped. This will
be called in lightning-background-processor every 10 seconds.
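The buffer-expiry policy can be modeled minimally as below. The constant value and type names are illustrative assumptions, not LDK's actual implementation:

```rust
use std::collections::{HashMap, VecDeque};

// Illustrative value; the real MAX_TIMER_TICKS may differ.
const MAX_TIMER_TICKS: usize = 2;

// Minimal model of per-recipient buffers: connected peers keep their
// messages, pending peers are dropped after too many timer ticks.
enum Recipient {
    Connected(VecDeque<String>),
    Pending { messages: VecDeque<String>, ticks: usize },
}

fn timer_tick_occurred(buffers: &mut HashMap<u8, Recipient>) {
    buffers.retain(|_, recipient| match recipient {
        Recipient::Connected(_) => true,
        Recipient::Pending { ticks, .. } => {
            *ticks += 1;
            // Drop buffered messages for peers that never connected.
            *ticks < MAX_TIMER_TICKS
        }
    });
}

fn main() {
    let mut buffers = HashMap::new();
    buffers.insert(
        1,
        Recipient::Pending { messages: VecDeque::from(vec!["m".to_string()]), ticks: 0 },
    );
    timer_tick_occurred(&mut buffers); // tick 1: still buffered
    assert!(buffers.contains_key(&1));
    timer_tick_occurred(&mut buffers); // tick 2: reached MAX_TIMER_TICKS, dropped
    assert!(!buffers.contains_key(&1));
}
```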
Some code hygiene before another parameter is added and rustfmt is
eventually used.
Simply to avoid excessive wrapping when possible.
OnionMessageHandler implementations now also implement EventsProvider.
Update lightning-background-processor to also process any events the
PeerManager's OnionMessageHandler provides.
lightning-background-processor processes events provided by the
PeerManager's OnionMessageHandler for when a connection is needed. If a
connection is not established in a reasonable amount of time, drop any
buffered onion messages by calling timer_tick_occurred.
Additional tests will be added needing a similar node struct, so
consolidate its usage.
Add tests for onion message buffering checking that messages are cleared
upon disconnection and timed out after MAX_TIMER_TICKS. Also, check
that ConnectionNeeded events are generated.
@jkczyz (Contributor Author) commented Dec 6, 2023:

Latest push should fix the cargo doc failure in CI.

Ok(())
}
e.get_mut().enqueue_message(onion_message);
Ok(SendSuccess::Buffered)
Collaborator:
Should this return value depend on the buffer's connected/pending state?

Contributor Author:
Yeah, I'll fix it in a follow-up.

///
/// Only needs to be set if a connection to the node is required. [`OnionMessenger`] may use
/// this to initiate such a connection.
pub addresses: Option<Vec<SocketAddress>>,
Collaborator:
nit: can we say, like, first_hop_addresses or first_hop_socket_addresses? Just addresses doesn't tell me much in the context of an onion message path.

Contributor Author:
Fair point. Will address.

@TheBlueMatt TheBlueMatt merged commit 2d26679 into lightningdevkit:main Dec 8, 2023
15 checks passed
@jkczyz jkczyz mentioned this pull request Dec 8, 2023
TheBlueMatt added a commit that referenced this pull request Dec 8, 2023