Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved local discovery using local edge addresses #26

Merged
merged 4 commits into from Jun 29, 2014

Conversation

dechamps
Copy link
Contributor

This branch implements an idea that I published a year ago about improving the local discovery feature and make it much more reliable. There are several significant changes:

  • Add a local_address field to edge_t. The new field contains the local address of from's metaconnection socket to to (in contrast to address which contains the peer address).
  • Introduce a protocol change to transmit this new information in ADD_EDGE messages, in a backward-compatible way. The minor protocol number is bumped accordingly, and the new message format is only sent to nodes that understand them.
  • Finally, use this new information to send MTU probes to the local addresses of the destination nodes instead of sending broadcasts.

This new way of doing local discovery provides numerous advantages compared to using broadcasts:

  • No broadcast packets "polluting" the local network;
  • Reliable even if the sending host has multiple network interfaces (in contrast, broadcasts will only be sent through one unpredictable interface)
  • Works even if the two hosts are not on the same broadcast domain. One example is a large LAN where the two hosts might be on different local subnets. In fact, thanks to UDP hole punching this might even work if there is a NAT sitting in the middle of the LAN between the two nodes!
  • Sometimes a node is reachable through its "normal" address, and via a local subnet as well. One might think the local subnet is the best route to the node in this case, but more often than not it's actually worse - one example is where the local segment is a third party VPN running in parallel, or ironically it can be the local segment formed by the tinc VPN itself! Because this new algorithm only checks the addresses for which an edge is already established, it is less likely to fall into these traps.

The new local discovery behavior is controlled by a new LocalDiscoveryBroadcast option (edge local addresses, however, are always transmitted if possible). It defaults to the old behavior. @gsliepen: the reason it defaults to the old behavior is to keep the branch "neutral", but it probably makes more sense to enable it by default, since this new mechanism is vastly superior to using broadcasts IMHO. In fact I believe the old mechanism should probably be deleted once there is a good penetration of local edge address aware clients.

Testing: I tested this between two local nodes connected to a non-local third node, it seems to work perfectly. Note that currently this is broken with SPTPS because sptps_verify_datagram() is not implemented which breaks try_harder(), preventing the discovered address from being used. To be clear, this is not a regression: local discovery was already broken because of this. I verified that this works even with SPTPS by writing a quick & dirty implementation of sptps_verify_datagram(). I intend on cleaning up this fix in another pull request.

@gsliepen
Copy link
Owner

On Sat, Jun 28, 2014 at 01:22:25PM -0700, Etienne Dechamps wrote:

This branch implements [an idea that I published a year ago][idea] about improving the local discovery feature and make it much more reliable. There are several significant changes:

  • Add a local_address field to edge_t. The new field contains the local address of from's metaconnection socket to to (in contrast to address which contains the peer address).
  • Introduce a protocol change to transmit this new information in ADD_EDGE messages, in a backward-compatible way. The minor protocol number is bumped accordingly, and the new message format is only sent to nodes that understand them.

Actually, you don't need to bump the protocol or worry about older
nodes, since you can add items to messages without breaking
compatibility, as long as you only add stuff at the end. So just
unconditionally add the local address and just broadcast it.

In send_add_edge() you do:

x = send_request(c, "%d %x %s %s %s %s %x %d %s %s", ADD_EDGE, rand(), e->from->name, e->to->name, address, port, e->options, e->weight, local_address, local_port);

At the moment I don't have time to verify it, but I believe that it will
always be the case that the variable port is the same as local_port, so
it's redundant. Maybe that also allows you to skip having to do
sockaddr2str() directly followed by str2sockaddr(..., myport).

The rest looks fine, and I think you can remove the
LocalDiscoveryBroadcast option and make the new behaviour always
enabled.

Met vriendelijke groet / with kind regards,
Guus Sliepen guus@tinc-vpn.org

@dechamps
Copy link
Contributor Author

@gsliepen

Actually, you don't need to bump the protocol or worry about older nodes, since you can add items to messages without breaking compatibility, as long as you only add stuff at the end. So just unconditionally add the local address and just broadcast it.

Nice, thanks. I updated the code and verified everything works properly when communicating with nodes that don't have these commits. I also made a subtler change to send "unknown" addresses instead of using the old ADD_EDGE message format when we don't know the local address (otherwise it would have made further extension of the ADD_EDGE message more complicated in the future).

In send_add_edge() you do: x = send_request(c, "%d %x %s %s %s %s %x %d %s %s", ADD_EDGE, rand(), e->from->name, e->to->name, address, port, e->options, e->weight, local_address, local_port);

At the moment I don't have time to verify it, but I believe that it will always be the case that the variable port is the same as local_port, so it's redundant. Maybe that also allows you to skip having to do sockaddr2str() directly followed by str2sockaddr(..., myport).

I don't follow. In your example port and local_port are clearly not the same (the first one refers to the port on the remote host, the second one on the local host, and they can of course be different). However, I agree there is technically some redundancy because with this code e->local_address.port == e->reverse->address.port should always be true. However, I believe it makes sense not to bake this assumption into the protocol itself, so that it stays flexible.

I think you can remove the LocalDiscoveryBroadcast option and make the new behaviour always enabled.

Done. I also added two additional commits that make more "controversial" changes. Rationale can be found in the commit messages. Feel free to ignore these commits if you disagree with the changes.

In addition to the remote address, each edge now stores the local address from
the point of view of the "from" node. This information is then made available
to other nodes through a backwards-compatible extension to ADD_EDGE messages.

This information can be used in future code to improve packet routing.
This introduces a new way of doing local discovery: when tinc has
local address information for the recipient node, it will send local
discovery packets directly to the local address of that node, instead
of using broadcast packets.

This new way of doing local discovery provides numerous advantages compared to
using broadcasts:

 - No broadcast packets "polluting" the local network;

 - Reliable even if the sending host has multiple network interfaces (in
   contrast, broadcasts will only be sent through one unpredictable
   interface)

 - Works even if the two hosts are not on the same broadcast domain. One
   example is a large LAN where the two hosts might be on different local
   subnets. In fact, thanks to UDP hole punching this might even work if
   there is a NAT sitting in the middle of the LAN between the two nodes!

 - Sometimes a node is reachable through its "normal" address, and via a
   local subnet as well. One might think the local subnet is the best route
   to the node in this case, but more often than not it's actually worse -
   one example is where the local segment is a third party VPN running in
   parallel, or ironically it can be the local segment formed by the tinc
   VPN itself! Because this new algorithm only checks the addresses for
   which an edge is already established, it is less likely to fall into
   these traps.
The new local address based local discovery mechanism is technically
superior to the old broadcast-based one. In fact, the old algorithm
can technically make things worse by e.g. sending broadcasts over the
VPN itself and then selecting the VPN address as the node's UDP
address. This cannot happen with the new mechanism.

Note that this means old nodes that don't send their local addresses in
ADD_EDGE messages can't be discovered, because there is no address to
send discovery packets to. Old nodes can still discover new nodes by
sending them broadcasts, though.
Recent improvements to the local discovery mechanism makes it cheaper,
more network-friendly, and now it cannot make things worse (as opposed
to the old mechanism). Thus there is no reason not to enable it by
default.
@gsliepen gsliepen merged commit 498f1b1 into gsliepen:1.1 Jun 29, 2014
@dechamps dechamps deleted the localdiscovery branch July 13, 2014 08:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants