-
Notifications
You must be signed in to change notification settings - Fork 5k
Improve SO_REUSE{ADDR,PORT} xplat behavior for TCP #24809
Conversation
|
||
using (Socket b = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp)) | ||
{ | ||
Assert.ThrowsAny<SocketException>(() => b.Bind(new IPEndPoint(IPAddress.Loopback, port))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this test also verify that the ex.SocketErrorCode == SocketError.AddressAlreadyInUse
?
if (protocolType == PAL_PT_TCP) | ||
{ | ||
int optionValue = 1; | ||
setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &optionValue, sizeof(int)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if setsockopt fails?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently it's silently ignored. I don't expect it to fail, but I wouldn't want to cause it to fail the bind either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment to highlight that we're ignoring any error on purpose, and why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, is doing this during bind the right place? i.e. why here rather than after the socket is created? What's the expected behavior if, for example, after accept is used to get a connected socket on a particular port, that socket is then closed, and code then tries to listen on the same port that accepted socket was on?
// On Windows, SO_REUSEADDR allows the address *and* port to be reused. It's equivalent to | ||
// SO_REUSEADDR + SO_REUSEPORT other systems. Se we only return "true" if both of those options are true. | ||
// | ||
// Return true when SO_REUSEADDR and SO_REUSEPORT are true. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You deleted why we do this from the comment. How come?
// SO_REUSEADDR + SO_REUSEPORT other systems. | ||
// | ||
// SO_REUSEPORT on Linux will do load-balancing accross multiple TCP sockets. | ||
// This is different from Windows and BSD-Unix, so we won't set this for TCP. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you validated these behaviors on macOS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re: "This is different from Windows". The docs for SO_REUSEADDR state: "Once the second socket has successfully bound, the behavior for all sockets bound to that port is indeterminate. For example, if all of the sockets on the same port provide TCP service, any incoming TCP connection requests over the port cannot be guaranteed to be handled by the correct socket"... so it seems like the Windows behavior is undefined and load-balancing would be an appropriate result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you validated these behaviors on macOS?
Only by the CI tests.
Re: "This is different from Windows". The docs for SO_REUSEADDR state: "Once the second socket has successfully bound, the behavior for all sockets bound to that port is indeterminate. For example, if all of the sockets on the same port provide TCP service, any incoming TCP connection requests over the port cannot be guaranteed to be handled by the correct socket"... so it seems like the Windows behavior is undefined and load-balancing would be an appropriate result.
On Unix, ReuseAddress controls both TCP_WAIT binding AND socket address sharing.
On Windows, ReuseAddress only controls socket address sharing. TCP_WAIT binding is always allowed.
It's not possible to express on Unix, allow bind during TCP_WAIT without allowing socket address sharing.
On Windows (and BSD Unix) I believe the socket address sharing is used to hand over a socket address from one process to another. So you either get the old one, or the new one.
On Linux, by allowing the socket address sharing, a newly bound process jumps in immediately and starts handing part of the requests, so you get a mix.
So, to align the behavior regarding TCP_WAIT, this PR is setting SO_REUSEADDR on Bind.
To avoid unintentional socket address sharing, it no longer sets SO_REUSEPORT for ReuseAddress.
To have some backwards compatibility it uses SO_REUSEADDR as a 'fake' toggle for ReuseAddress (but it gets forced to 1 on Bind).
I think if we want to add SO_REUSEPORT behavior to Unix, we should do it only when ExclusiveAddressUse=false. This will give explicit control if sharing is desired.
int err = setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, optionValue, static_cast<socklen_t>(optionLen)); | ||
if (err == 0) | ||
if (err == 0 && protocolType != PAL_PT_TCP) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if I do:
using System;
using System.Net.Sockets;
class Program
{
static void Main()
{
var s = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
Console.WriteLine(s.GetSocketOption(SocketOptionLevel.Socket, SocketOptionName.ReuseAddress));
s.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.ReuseAddress, 0);
Console.WriteLine(s.GetSocketOption(SocketOptionLevel.Socket, SocketOptionName.ReuseAddress));
}
}
I'll get back 1 1
on Linux (as opposed to 0 0
on Windows), since on Linux with this change we're just ignoring it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Linux we are using SO_REUSEADDR as a 'fake' toggle for TCP.
So this should return 0 0.
After a Bind, GetSocketOption(ReuseAddress) will return 1 always.
{ | ||
client.Connect(new IPEndPoint(IPAddress.Loopback, port)); | ||
using (Socket acceptedClient = a.Accept()) | ||
{ } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a.Accept().Dispose();
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be explicitly setting a linger time rather than relying on whatever the OS' default is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Yes, some OSes will block close during the linger time.
@@ -284,6 +284,52 @@ public void ReuseAddress_Windows(bool? exclusiveAddressUse, bool? firstSocketReu | |||
ReuseAddress(exclusiveAddressUse, firstSocketReuseAddress, secondSocketReuseAddress, expectFailure); | |||
} | |||
|
|||
[Fact] | |||
public void BindDuringTcpWait() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BindDuringTcpWait_Succeeds?
} | ||
|
||
[Fact] | ||
public void ExclusiveAddressUseTcp() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this test doing differently from
public void ReuseAddress(bool? exclusiveAddressUse, bool? firstSocketReuseAddress, bool? secondSocketReuseAddress, bool expectFailure) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is using TCP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Ok.
@stephentoub I'm considering to extend this PR as follows:
This matches the behavior of Windows for non-multicast addresses. Windows doesn't allow binding to a multicast address. On Unix, for multicast addresses, setting SO_REUSEADDR for UDP implies SO_REUSEPORT. I think such sharing should be explicit.
If people had added the ReuseAddress = true to deal with TCP_WAIT on Unix, but do not want to share their address. They will need to remove that line again.
WDYT? |
That didn't work. Setting SO_REUSEADDR makes it possible to bind to the same address for UDP.
I've added these changes to the PR. |
// - not sharing (SO_REUSEPORT=0) | ||
// - explicit sharing (SO_REUSEPORT=1) | ||
// We make both PAL_SO_REUSEADDR and PAL_SO_EXCLUSIVEADDRUSE control SO_REUSEPORT. | ||
if (socketOptionName == PAL_SO_EXCLUSIVEADDRUSE || socketOptionName == PAL_SO_REUSEADDR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're now allowing ExclusiveAddressUse to be set, should we also update the multi-connect support to track it? e.g.
internal void ReplaceHandleIfNecessaryAfterFailedConnect() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we should. I'll add that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this a bit more. Connects don't fail for UDP. And for TCP, the ExclusiveAddress/ReuseAddress is an option for listen sockets (which don't connect). So there is no compelling use-case to add this to the tracked socket options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
{ | ||
using (Socket a = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp)) | ||
{ | ||
Assert.Equal(1, (int)a.GetSocketOption(SocketOptionLevel.Socket, SocketOptionName.ExclusiveAddressUse)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this now defaults to true on Unix and false on Windows? I guess it did before this change, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. This hasn't changed. This test is verifying the defaults on Unix.
} | ||
} | ||
|
||
// Bind a socket to the same address we just used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we run any risk here that another test running concurrently might end up getting assigned this port in the short window between the above and here? Seems like it could result in random CI failures now and then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's possible. I don't think there's a way to deal with that. Perhaps the allocation strategy in the dynamic port range makes it very unlikely this will happen. We can see if it causes issues in CI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do this for some of our Kestrel tests. We run this regularly on various versions of Windows, Linux and macOS without that being a source of flakiness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We run this regularly on various versions of Windows, Linux and macOS without that being a source of flakiness.
Do you run lots of tests in parallel? I seem to remember this actually being an issue for us in the past, but I could be misremembering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. All the tests that use the GetNextPort() method I linked to are in the same test class, so xunit does not parallelize them. We're only just now looking into running multiple test projects in parallel.
{ | ||
err = setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, optionValue, static_cast<socklen_t>(optionLen)); | ||
if ((value != 0) && (value != 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does ReuseAddress behave when passing in a value other than 0 or 1? Does it fail, or does it treat anything non-zero as being true? I'm wondering if we want/need this if block at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While doing this PR I learned mac returns 512 for getsockopt SO_REUSEPORT when doing setsockopt to 1 (cdac9c6).
So behavior for getsockopt and setsockopt is platform dependent. Perhaps we should iron out some of those differences for other options too. I don't think we are limiting what the user can do by restricting to 0 and 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
@geoffkizer, any thoughts on this one? |
I think that adding SO_REUSEADDR by default for TCP on Unix probably makes sense. My understanding is that libuv does this by default. Would be interesting to see what other frameworks/languages do here as well, but my expectation is that they probably use SO_REUSEADDR as well. I'll look at the code and see if I have any comments. |
FYI, looks like Go sets SO_REUSEADDR for listen sockets: https://golang.org/src/net/sockopt_bsd.go |
I do wonder whether it makes more sense to do this as part of listen rather than bind. I believe the case we care about here is reusing a specific listen port, right? If so, seems like we should limit this behavior to listen sockets unless there's some other problem we're trying to solve. |
|
Yeah, that makes sense. Thanks. |
@dotnet-bot test Outerloop Linux x64 Debug Build please |
@tmds, the TcpClientTest.ExclusiveAddressUse_Set_False_NotSupported test is failing across distros:
Looks like it just needs to be fixed now that ExclusiveAddressUse can be set to false. |
@dotnet-bot test Outerloop Linux x64 Debug Build please |
@dotnet-bot test Windows x86 Release Build please (https://github.com/dotnet/corefx/issues/24874) |
@tmds, is 953d7e6#diff-0f682f473db4a003b3babf4e061c3987R275 still Windows-specific? |
@dotnet-bot test Outerloop Linux x64 Debug Build please |
int value = *reinterpret_cast<int32_t*>(optionValue); | ||
|
||
// macOS returns non-zero values other than 1. | ||
value = value == 0 ? 0 : 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I just noticed this. Should this be in an else for the if block below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Can be addressed in a subsequent cleanup PR if you agree, along with my other nit.)
else | ||
{ | ||
value = value == 0 ? 1 : 0; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe:
if (value == 0) { value = 1; }
else if (value == 1) { value = 0; }
else { return PAL_EINVAL; }
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Thanks for pushing this through.
@stephentoub thank you for reviewing and follow-up! I'm going to add a test that checks how 'boolean' (0/1) socket options behave on different platforms and see if we can iron out some of the platform differences. I'll include your suggestions in that PR. |
* Improve SO_REUSE{ADDR,PORT} behavior for TCP * Assert SocketErrorCode is AddressAlreadyInUse * Make PAL_SO_EXCLUSIVEADDRUSE and PAL_SO_REUSEADDR control SO_REUSEPORT. * PR feedback * Fix macOS * Remove ExclusiveAddressUse_Set_False_NotSupported test * Test Roundtrip_ExclusiveAddressUse_GetEqualsSet_False on all platforms. Commit migrated from dotnet/corefx@7eba4bf
Fixes #24562
The test cases added would fail on Unix and pass on Windows.
The ProtocolType is passed to Bind, GetSockOpt and SetSockOpt to change the behavior for TCP in pal_networking.cpp.
CC @wfurt @geoffkizer @Priya91 @halter73 @stephentoub