Improved local discovery using local edge addresses #26

dechamps · 2014-06-28T20:22:24Z

This branch implements an idea that I published a year ago about improving the local discovery feature and make it much more reliable. There are several significant changes:

Add a local_address field to edge_t. The new field contains the local address of from's metaconnection socket to to (in contrast to address which contains the peer address).
Introduce a protocol change to transmit this new information in ADD_EDGE messages, in a backward-compatible way. The minor protocol number is bumped accordingly, and the new message format is only sent to nodes that understand them.
Finally, use this new information to send MTU probes to the local addresses of the destination nodes instead of sending broadcasts.

This new way of doing local discovery provides numerous advantages compared to using broadcasts:

No broadcast packets "polluting" the local network;
Reliable even if the sending host has multiple network interfaces (in contrast, broadcasts will only be sent through one unpredictable interface)
Works even if the two hosts are not on the same broadcast domain. One example is a large LAN where the two hosts might be on different local subnets. In fact, thanks to UDP hole punching this might even work if there is a NAT sitting in the middle of the LAN between the two nodes!
Sometimes a node is reachable through its "normal" address, and via a local subnet as well. One might think the local subnet is the best route to the node in this case, but more often than not it's actually worse - one example is where the local segment is a third party VPN running in parallel, or ironically it can be the local segment formed by the tinc VPN itself! Because this new algorithm only checks the addresses for which an edge is already established, it is less likely to fall into these traps.

The new local discovery behavior is controlled by a new LocalDiscoveryBroadcast option (edge local addresses, however, are always transmitted if possible). It defaults to the old behavior. @gsliepen: the reason it defaults to the old behavior is to keep the branch "neutral", but it probably makes more sense to enable it by default, since this new mechanism is vastly superior to using broadcasts IMHO. In fact I believe the old mechanism should probably be deleted once there is a good penetration of local edge address aware clients.

Testing: I tested this between two local nodes connected to a non-local third node, it seems to work perfectly. Note that currently this is broken with SPTPS because sptps_verify_datagram() is not implemented which breaks try_harder(), preventing the discovered address from being used. To be clear, this is not a regression: local discovery was already broken because of this. I verified that this works even with SPTPS by writing a quick & dirty implementation of sptps_verify_datagram(). I intend on cleaning up this fix in another pull request.

gsliepen · 2014-06-28T21:04:34Z

On Sat, Jun 28, 2014 at 01:22:25PM -0700, Etienne Dechamps wrote:

This branch implements [an idea that I published a year ago][idea] about improving the local discovery feature and make it much more reliable. There are several significant changes:

Add a local_address field to edge_t. The new field contains the local address of from's metaconnection socket to to (in contrast to address which contains the peer address).

Introduce a protocol change to transmit this new information in ADD_EDGE messages, in a backward-compatible way. The minor protocol number is bumped accordingly, and the new message format is only sent to nodes that understand them.

Actually, you don't need to bump the protocol or worry about older
nodes, since you can add items to messages without breaking
compatibility, as long as you only add stuff at the end. So just
unconditionally add the local address and just broadcast it.

In send_add_edge() you do:

x = send_request(c, "%d %x %s %s %s %s %x %d %s %s", ADD_EDGE, rand(), e->from->name, e->to->name, address, port, e->options, e->weight, local_address, local_port);

At the moment I don't have time to verify it, but I believe that it will
always be the case that the variable port is the same as local_port, so
it's redundant. Maybe that also allows you to skip having to do
sockaddr2str() directly followed by str2sockaddr(..., myport).

The rest looks fine, and I think you can remove the
LocalDiscoveryBroadcast option and make the new behaviour always
enabled.

Met vriendelijke groet / with kind regards,
Guus Sliepen guus@tinc-vpn.org

dechamps · 2014-06-29T10:22:22Z

@gsliepen

Actually, you don't need to bump the protocol or worry about older nodes, since you can add items to messages without breaking compatibility, as long as you only add stuff at the end. So just unconditionally add the local address and just broadcast it.

Nice, thanks. I updated the code and verified everything works properly when communicating with nodes that don't have these commits. I also made a subtler change to send "unknown" addresses instead of using the old ADD_EDGE message format when we don't know the local address (otherwise it would have made further extension of the ADD_EDGE message more complicated in the future).

In send_add_edge() you do: x = send_request(c, "%d %x %s %s %s %s %x %d %s %s", ADD_EDGE, rand(), e->from->name, e->to->name, address, port, e->options, e->weight, local_address, local_port);

At the moment I don't have time to verify it, but I believe that it will always be the case that the variable port is the same as local_port, so it's redundant. Maybe that also allows you to skip having to do sockaddr2str() directly followed by str2sockaddr(..., myport).

I don't follow. In your example port and local_port are clearly not the same (the first one refers to the port on the remote host, the second one on the local host, and they can of course be different). However, I agree there is technically some redundancy because with this code e->local_address.port == e->reverse->address.port should always be true. However, I believe it makes sense not to bake this assumption into the protocol itself, so that it stays flexible.

I think you can remove the LocalDiscoveryBroadcast option and make the new behaviour always enabled.

Done. I also added two additional commits that make more "controversial" changes. Rationale can be found in the commit messages. Feel free to ignore these commits if you disagree with the changes.

In addition to the remote address, each edge now stores the local address from the point of view of the "from" node. This information is then made available to other nodes through a backwards-compatible extension to ADD_EDGE messages. This information can be used in future code to improve packet routing.

This introduces a new way of doing local discovery: when tinc has local address information for the recipient node, it will send local discovery packets directly to the local address of that node, instead of using broadcast packets. This new way of doing local discovery provides numerous advantages compared to using broadcasts: - No broadcast packets "polluting" the local network; - Reliable even if the sending host has multiple network interfaces (in contrast, broadcasts will only be sent through one unpredictable interface) - Works even if the two hosts are not on the same broadcast domain. One example is a large LAN where the two hosts might be on different local subnets. In fact, thanks to UDP hole punching this might even work if there is a NAT sitting in the middle of the LAN between the two nodes! - Sometimes a node is reachable through its "normal" address, and via a local subnet as well. One might think the local subnet is the best route to the node in this case, but more often than not it's actually worse - one example is where the local segment is a third party VPN running in parallel, or ironically it can be the local segment formed by the tinc VPN itself! Because this new algorithm only checks the addresses for which an edge is already established, it is less likely to fall into these traps.

The new local address based local discovery mechanism is technically superior to the old broadcast-based one. In fact, the old algorithm can technically make things worse by e.g. sending broadcasts over the VPN itself and then selecting the VPN address as the node's UDP address. This cannot happen with the new mechanism. Note that this means old nodes that don't send their local addresses in ADD_EDGE messages can't be discovered, because there is no address to send discovery packets to. Old nodes can still discover new nodes by sending them broadcasts, though.

Recent improvements to the local discovery mechanism makes it cheaper, more network-friendly, and now it cannot make things worse (as opposed to the old mechanism). Thus there is no reason not to enable it by default.

dechamps mentioned this pull request Jun 29, 2014

Implement sptps_verify_datagram() #27

Closed

dechamps added 4 commits June 29, 2014 11:23

Enable LocalDiscovery by default.

498f1b1

Recent improvements to the local discovery mechanism makes it cheaper, more network-friendly, and now it cannot make things worse (as opposed to the old mechanism). Thus there is no reason not to enable it by default.

gsliepen merged commit 498f1b1 into gsliepen:1.1 Jun 29, 2014

dechamps deleted the localdiscovery branch July 13, 2014 08:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improved local discovery using local edge addresses #26

Improved local discovery using local edge addresses #26

dechamps commented Jun 28, 2014

gsliepen commented Jun 28, 2014

dechamps commented Jun 29, 2014

Improved local discovery using local edge addresses #26

Improved local discovery using local edge addresses #26

Conversation

dechamps commented Jun 28, 2014

gsliepen commented Jun 28, 2014

dechamps commented Jun 29, 2014