Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve multicast defaults #222

Merged
merged 1 commit into from Jul 25, 2019

Conversation

@eboasson
Copy link
Contributor

commented Jul 16, 2019

This PR changes two defaults related to the use of multicast:

  • It checks the kernel's information on the network interface to determine whether the chosen interface is a WiFi or any other type. On a WiFi network the use of multicast is (by default) restricted to participant discovery only.
    WiFi pretends to be like an Ethernet, but the reliability of multicast is dramatically lower. As far as I know, pretty much any interference will cause packet loss, and often the loss comes in bursts. Experience is that behaviour on a WiFi is much, much better when avoiding multicast except for that one purpose.
  • The second is to not assume that a retransmit request is likely to be followed in very short order by retransmit requests from other nodes, and that therefore the retransmits should not be multicast by default.

I've managed to check the WiFi/wired detection on macOS and Linux (though only with straightforwad network configuration). The code it uses on Windows seems like it ought to be correct, but it would be much appreciated if someone with a Windows box could give it a try.

The other commit is just a slight lenthening of a timeout in a test and the removal of a fixed port number. Those are the only two obvious things that may have caused the intermittent failure of the "select timeout" test (#212).

@rotu

This comment has been minimized.

Copy link
Contributor

commented Jul 16, 2019

What happens if the publisher is not WiFi but the recipient is? I think you'll have the same performance degradation.

@eboasson

This comment has been minimized.

Copy link
Contributor Author

commented Jul 18, 2019

What happens if the publisher is not WiFi but the recipient is? I think you'll have the same performance degradation.

That should actually be ok: the protocol works by advertising addresses that it is willing to receive data on (for the WiFi ones that would just be unicast) and by selecting the ones it is willing to send data to (for the WiFi ones, gain just unicast). Thus, the network should end up relying only on unicast for communicating with the WiFi ones.

It is a good point, though: by default the multicast address used for data is the same as for discovery. So, the WiFi nodes will get some packets via multicast, and it would be wise to verify that those do not trigger retransmit requests.

The other edge case is if there are nodes that are connected over Ethernet to a WiFi device that integrates into the larger network. That seems rather unusual, and so requiring a configuration file to overrule the default seems reasonable to me.

@rotu

This comment has been minimized.

Copy link
Contributor

commented Jul 18, 2019

Great! I'm glad you understand this, because I don't :-). I think it would be nice to detect questionable network characteristics, like long round-trip-times on a per-client basis.

@eboasson

This comment has been minimized.

Copy link
Contributor Author

commented Jul 19, 2019

I think it would be nice to detect questionable network characteristics, like long round-trip-times on a per-client basis.

Absolutely. Detecting and gracefully handling can do with some TLC ...

@eboasson eboasson force-pushed the eboasson:multicast-defaults branch from 7a14916 to a0fc7b5 Jul 19, 2019
@eboasson

This comment has been minimized.

Copy link
Contributor Author

commented Jul 23, 2019

I did a few more checks:

  • On Windows, too, detecting WiFi/not-WiFi works as expected;
  • A mix of AllowMulticast=spdp and AllowMulticast=true works fine with the limited equipment I have at hand;
  • A Linux VM on VMware on macOS using bridged networking will show the WiFi as a wired network, so that results in the "wrong" setting (but with the other WiFi nodes all set correctly, the system still behaves as desired);
  • A Linux VM on VMware on macOS using NAT networking will mess up things quite significantly becasue its (discovery) multicasts make it into the real world, but its advertised unicast address is unreachable ...

It looks to me like changing the defaults in this way does constitute an improvement. It doesn't change the need for improvements in the handling of problems, but I think that should be considered another problem.

@k0ekk0ek : are you ok with my changes?

@k0ekk0ek

This comment has been minimized.

Copy link
Contributor

commented Jul 24, 2019

@eboasson, no real comments. Personally, I would've checked for [ '\t'] instead of using isspace as it's locale dependent and you check for '\n' in other places, but that doesn't matter in practice and would be nitpicking. Also, it might fail on Linux if an alias interface is used (I haven't tried)? But that too, would be nitpicking.

I would however just skip the first two header lines in the Linux code. I think currently you're only skipping one(?). Would make the while loop somewhat simpler too.

Maybe replace the strncpy by ddsrt_strlcpy for macOS?

@eboasson

This comment has been minimized.

Copy link
Contributor Author

commented Jul 24, 2019

Personally, I would've checked for [ '\t'] instead of using isspace as it's locale dependent and you check for '\n' in other places, but that doesn't matter in practice and would be nitpicking.

Locale dependent stuff is nasty indeed — I'd better change it.

I would however just skip the first two header lines in the Linux code. I think currently you're only skipping one(?). Would make the while loop somewhat simpler too.

It actually skips the first two lines: first it reads to the newline in state SKIP_HEADER_1, then it skips to the second newline in SKIP_TO_EOL, and only then does it start checking for names. Why Linux doesn't offer an easy way is a mystery to me ...

Perhaps I'll add a comment line :)

Also, it might fail on Linux if an alias interface is used (I haven't tried)

I would assume the kernel would list aliases as well, but I too am inclined to ignore that detail. After all, it only affects the default setting.

Maybe replace the strncpy by ddsrt_strlcpy for macOS

Might as well do that, too.

* use multicast only for participant discovery if using a WiFi network
* default to using unicast for retransmits

Signed-off-by: Erik Boasson <eb@ilities.com>
@eboasson eboasson force-pushed the eboasson:multicast-defaults branch from a0fc7b5 to 72de86d Jul 25, 2019
@eboasson eboasson merged commit 4e80559 into eclipse-cyclonedds:master Jul 25, 2019
2 checks passed
2 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
eclipsefdn/eca The author(s) of the pull request is covered by necessary legal agreements in order to proceed!
Details
@eboasson eboasson deleted the eboasson:multicast-defaults branch Jul 25, 2019
@joespeed joespeed referenced this pull request Aug 15, 2019
27 of 29 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.