Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Improve SO_REUSE{ADDR,PORT} xplat behavior for TCP #24809

Merged
merged 7 commits into from
Nov 15, 2017

Conversation

tmds
Copy link
Member

@tmds tmds commented Oct 23, 2017

Fixes #24562

The test cases added would fail on Unix and pass on Windows.

The ProtocolType is passed to Bind, GetSockOpt and SetSockOpt to change the behavior for TCP in pal_networking.cpp.

CC @wfurt @geoffkizer @Priya91 @halter73 @stephentoub


using (Socket b = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
{
Assert.ThrowsAny<SocketException>(() => b.Bind(new IPEndPoint(IPAddress.Loopback, port)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this test also verify that the ex.SocketErrorCode == SocketError.AddressAlreadyInUse?

@karelz karelz assigned tmds, Priya91 and wfurt and unassigned tmds Oct 28, 2017
@karelz karelz added this to the 2.1.0 milestone Oct 28, 2017
if (protocolType == PAL_PT_TCP)
{
int optionValue = 1;
setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &optionValue, sizeof(int));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if setsockopt fails?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently it's silently ignored. I don't expect it to fail, but I wouldn't want to cause it to fail the bind either.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment to highlight that we're ignoring any error on purpose, and why?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is doing this during bind the right place? i.e. why here rather than after the socket is created? What's the expected behavior if, for example, after accept is used to get a connected socket on a particular port, that socket is then closed, and code then tries to listen on the same port that accepted socket was on?

// On Windows, SO_REUSEADDR allows the address *and* port to be reused. It's equivalent to
// SO_REUSEADDR + SO_REUSEPORT other systems. Se we only return "true" if both of those options are true.
//
// Return true when SO_REUSEADDR and SO_REUSEPORT are true.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You deleted why we do this from the comment. How come?

// SO_REUSEADDR + SO_REUSEPORT other systems.
//
// SO_REUSEPORT on Linux will do load-balancing accross multiple TCP sockets.
// This is different from Windows and BSD-Unix, so we won't set this for TCP.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you validated these behaviors on macOS?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: "This is different from Windows". The docs for SO_REUSEADDR state: "Once the second socket has successfully bound, the behavior for all sockets bound to that port is indeterminate. For example, if all of the sockets on the same port provide TCP service, any incoming TCP connection requests over the port cannot be guaranteed to be handled by the correct socket"... so it seems like the Windows behavior is undefined and load-balancing would be an appropriate result.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you validated these behaviors on macOS?

Only by the CI tests.

Re: "This is different from Windows". The docs for SO_REUSEADDR state: "Once the second socket has successfully bound, the behavior for all sockets bound to that port is indeterminate. For example, if all of the sockets on the same port provide TCP service, any incoming TCP connection requests over the port cannot be guaranteed to be handled by the correct socket"... so it seems like the Windows behavior is undefined and load-balancing would be an appropriate result.

On Unix, ReuseAddress controls both TCP_WAIT binding AND socket address sharing.
On Windows, ReuseAddress only controls socket address sharing. TCP_WAIT binding is always allowed.
It's not possible to express on Unix, allow bind during TCP_WAIT without allowing socket address sharing.
On Windows (and BSD Unix) I believe the socket address sharing is used to hand over a socket address from one process to another. So you either get the old one, or the new one.
On Linux, by allowing the socket address sharing, a newly bound process jumps in immediately and starts handing part of the requests, so you get a mix.

So, to align the behavior regarding TCP_WAIT, this PR is setting SO_REUSEADDR on Bind.
To avoid unintentional socket address sharing, it no longer sets SO_REUSEPORT for ReuseAddress.
To have some backwards compatibility it uses SO_REUSEADDR as a 'fake' toggle for ReuseAddress (but it gets forced to 1 on Bind).

I think if we want to add SO_REUSEPORT behavior to Unix, we should do it only when ExclusiveAddressUse=false. This will give explicit control if sharing is desired.

int err = setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, optionValue, static_cast<socklen_t>(optionLen));
if (err == 0)
if (err == 0 && protocolType != PAL_PT_TCP)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if I do:

using System;
using System.Net.Sockets;

class Program
{
    static void Main()
    {
        var s = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        Console.WriteLine(s.GetSocketOption(SocketOptionLevel.Socket, SocketOptionName.ReuseAddress));
        s.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.ReuseAddress, 0);
        Console.WriteLine(s.GetSocketOption(SocketOptionLevel.Socket, SocketOptionName.ReuseAddress));
    }
}

I'll get back 1 1 on Linux (as opposed to 0 0 on Windows), since on Linux with this change we're just ignoring it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Linux we are using SO_REUSEADDR as a 'fake' toggle for TCP.
So this should return 0 0.
After a Bind, GetSocketOption(ReuseAddress) will return 1 always.

{
client.Connect(new IPEndPoint(IPAddress.Loopback, port));
using (Socket acceptedClient = a.Accept())
{ }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a.Accept().Dispose();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be explicitly setting a linger time rather than relying on whatever the OS' default is?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Yes, some OSes will block close during the linger time.

@@ -284,6 +284,52 @@ public void ReuseAddress_Windows(bool? exclusiveAddressUse, bool? firstSocketReu
ReuseAddress(exclusiveAddressUse, firstSocketReuseAddress, secondSocketReuseAddress, expectFailure);
}

[Fact]
public void BindDuringTcpWait()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BindDuringTcpWait_Succeeds?

}

[Fact]
public void ExclusiveAddressUseTcp()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this test doing differently from

public void ReuseAddress(bool? exclusiveAddressUse, bool? firstSocketReuseAddress, bool? secondSocketReuseAddress, bool expectFailure)
?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is using TCP.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Ok.

@tmds
Copy link
Member Author

tmds commented Nov 6, 2017

@stephentoub I'm considering to extend this PR as follows:

  • Set SO_REUSEADDR also for UDP in Bind, except for multicast addresses.

This matches the behavior of Windows for non-multicast addresses. Windows doesn't allow binding to a multicast address. On Unix, for multicast addresses, setting SO_REUSEADDR for UDP implies SO_REUSEPORT. I think such sharing should be explicit.

  • Control only SO_REUSEPORT from ReuseAddress. For both TCP and UDP.

If people had added the ReuseAddress = true to deal with TCP_WAIT on Unix, but do not want to share their address. They will need to remove that line again.

  • Make ExclusiveAddressUse control SO_REUSEPORT too. This means SocketOption.ReuseAddress == !ExclusiveAddressUse .

WDYT?

@tmds
Copy link
Member Author

tmds commented Nov 7, 2017

Set SO_REUSEADDR also for UDP in Bind, except for multicast addresses.

That didn't work. Setting SO_REUSEADDR makes it possible to bind to the same address for UDP.
(For TCP, the second Bind fails because the first socket is listening).

Control only SO_REUSEPORT from ReuseAddress. For both TCP and UDP.
Make ExclusiveAddressUse control SO_REUSEPORT too. This means SocketOption.ReuseAddress == !ExclusiveAddressUse .

I've added these changes to the PR.

@tmds
Copy link
Member Author

tmds commented Nov 8, 2017

// - not sharing (SO_REUSEPORT=0)
// - explicit sharing (SO_REUSEPORT=1)
// We make both PAL_SO_REUSEADDR and PAL_SO_EXCLUSIVEADDRUSE control SO_REUSEPORT.
if (socketOptionName == PAL_SO_EXCLUSIVEADDRUSE || socketOptionName == PAL_SO_REUSEADDR)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're now allowing ExclusiveAddressUse to be set, should we also update the multi-connect support to track it? e.g.

internal void ReplaceHandleIfNecessaryAfterFailedConnect()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes we should. I'll add that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this a bit more. Connects don't fail for UDP. And for TCP, the ExclusiveAddress/ReuseAddress is an option for listen sockets (which don't connect). So there is no compelling use-case to add this to the tracked socket options.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

{
using (Socket a = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp))
{
Assert.Equal(1, (int)a.GetSocketOption(SocketOptionLevel.Socket, SocketOptionName.ExclusiveAddressUse));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this now defaults to true on Unix and false on Windows? I guess it did before this change, too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. This hasn't changed. This test is verifying the defaults on Unix.

}
}

// Bind a socket to the same address we just used.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we run any risk here that another test running concurrently might end up getting assigned this port in the short window between the above and here? Seems like it could result in random CI failures now and then?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's possible. I don't think there's a way to deal with that. Perhaps the allocation strategy in the dynamic port range makes it very unlikely this will happen. We can see if it causes issues in CI?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do this for some of our Kestrel tests. We run this regularly on various versions of Windows, Linux and macOS without that being a source of flakiness.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We run this regularly on various versions of Windows, Linux and macOS without that being a source of flakiness.

Do you run lots of tests in parallel? I seem to remember this actually being an issue for us in the past, but I could be misremembering.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. All the tests that use the GetNextPort() method I linked to are in the same test class, so xunit does not parallelize them. We're only just now looking into running multiple test projects in parallel.

{
err = setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, optionValue, static_cast<socklen_t>(optionLen));
if ((value != 0) && (value != 1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does ReuseAddress behave when passing in a value other than 0 or 1? Does it fail, or does it treat anything non-zero as being true? I'm wondering if we want/need this if block at all.

Copy link
Member Author

@tmds tmds Nov 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While doing this PR I learned mac returns 512 for getsockopt SO_REUSEPORT when doing setsockopt to 1 (cdac9c6).
So behavior for getsockopt and setsockopt is platform dependent. Perhaps we should iron out some of those differences for other options too. I don't think we are limiting what the user can do by restricting to 0 and 1.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

@stephentoub
Copy link
Member

@geoffkizer, any thoughts on this one?

@geoffkizer
Copy link

I think that adding SO_REUSEADDR by default for TCP on Unix probably makes sense. My understanding is that libuv does this by default. Would be interesting to see what other frameworks/languages do here as well, but my expectation is that they probably use SO_REUSEADDR as well.

I'll look at the code and see if I have any comments.

@geoffkizer
Copy link

FYI, looks like Go sets SO_REUSEADDR for listen sockets: https://golang.org/src/net/sockopt_bsd.go

@geoffkizer
Copy link

I do wonder whether it makes more sense to do this as part of listen rather than bind. I believe the case we care about here is reusing a specific listen port, right? If so, seems like we should limit this behavior to listen sockets unless there's some other problem we're trying to solve.

@tmds
Copy link
Member Author

tmds commented Nov 13, 2017

I do wonder whether it makes more sense to do this as part of listen rather than bind. I believe the case we care about here is reusing a specific listen port, right? If so, seems like we should limit this behavior to listen sockets unless there's some other problem we're trying to solve.

SO_REUSEADDR needs to be set before Bind, which needs to be called before Listen. This is also the order in the Go implementation https://github.com/golang/go/blob/bf0f69220255941196c684f235727fd6dc747b5c/src/net/sock_posix.go#L169-L180.
The Go API has a single operation for Bind&Listen (net.Listen("tcp", ":8080")), these are two operations in .NET (Bind and Listen).

@geoffkizer
Copy link

Yeah, that makes sense. Thanks.

@stephentoub
Copy link
Member

@dotnet-bot test Outerloop Linux x64 Debug Build please
@dotnet-bot test Outerloop OSX x64 Debug Build please

@stephentoub
Copy link
Member

@tmds, the TcpClientTest.ExclusiveAddressUse_Set_False_NotSupported test is failing across distros:

Unhandled Exception of Type Xunit.Sdk.ThrowsException
Message :
Assert.Throws() Failure
Expected: typeof(System.Net.Sockets.SocketException)
Actual:   (No exception was thrown)
Stack Trace :
   at System.Net.Sockets.Tests.TcpClientTest.ExclusiveAddressUse_Set_False_NotSupported() in /mnt/j/workspace/dotnet_corefx/master/linux-TGroup_netcoreapp+CGroup_Debug+AGroup_x64+TestOuter_true_prtest/src/System.Net.Sockets/tests/FunctionalTests/TcpClientTest.cs:line 292

Looks like it just needs to be fixed now that ExclusiveAddressUse can be set to false.

@tmds
Copy link
Member Author

tmds commented Nov 14, 2017

@dotnet-bot test Outerloop Linux x64 Debug Build please
@dotnet-bot test Outerloop OSX x64 Debug Build please

@stephentoub
Copy link
Member

@dotnet-bot test Windows x86 Release Build please (https://github.com/dotnet/corefx/issues/24874)

@stephentoub
Copy link
Member

@tmds, is 953d7e6#diff-0f682f473db4a003b3babf4e061c3987R275 still Windows-specific?

@tmds
Copy link
Member Author

tmds commented Nov 14, 2017

@dotnet-bot test Outerloop Linux x64 Debug Build please
@dotnet-bot test Outerloop OSX x64 Debug Build please

int value = *reinterpret_cast<int32_t*>(optionValue);

// macOS returns non-zero values other than 1.
value = value == 0 ? 0 : 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I just noticed this. Should this be in an else for the if block below?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Can be addressed in a subsequent cleanup PR if you agree, along with my other nit.)

else
{
value = value == 0 ? 1 : 0;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe:
if (value == 0) { value = 1; }
else if (value == 1) { value = 0; }
else { return PAL_EINVAL; }
?

Copy link
Member

@stephentoub stephentoub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks for pushing this through.

@stephentoub stephentoub merged commit 7eba4bf into dotnet:master Nov 15, 2017
@tmds
Copy link
Member Author

tmds commented Nov 15, 2017

@stephentoub thank you for reviewing and follow-up! I'm going to add a test that checks how 'boolean' (0/1) socket options behave on different platforms and see if we can iron out some of the platform differences. I'll include your suggestions in that PR.

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* Improve SO_REUSE{ADDR,PORT} behavior for TCP

* Assert SocketErrorCode is AddressAlreadyInUse

* Make PAL_SO_EXCLUSIVEADDRUSE and PAL_SO_REUSEADDR control SO_REUSEPORT.

* PR feedback

* Fix macOS

* Remove ExclusiveAddressUse_Set_False_NotSupported test

* Test Roundtrip_ExclusiveAddressUse_GetEqualsSet_False on all platforms.


Commit migrated from dotnet/corefx@7eba4bf
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
9 participants