Restore Socket.ConnectAsync via hostname on *nix platforms. #17374
Comments
Another library affected by this behavior: shayhatsor/zookeeper#7 |
Thanks, @NickCraver. Just to clarify, you meant change to option 3, right? |
@stephentoub ...I guess getting the number right would be important there huh? Fixed the text, thanks! |
Option 3 and 4 seem equally bad to me, so I'd opt for option 4 as it will work in more cases than 3. The better solution is to provide static Connect methods as noted in the PR.
Methods accepting multiple addresses on Linux/OSX should throw as it isn't supported. Moving away from those methods is part of the cross platform porting. Option 3 and 4 will mask these problems. When the DNS/IP config changes, applications will suddenly fail. |
How about a slightly different option, let's call it option 5: (5) When presented with multiple addresses, treat the first address as the only address. If the connection fails, throw the usual exception for connection failure, rather than PlatformNotSupportedException. The advantage vs. option 4 would be that DNS changes would not suddenly cause strange exceptions to be thrown. The disadvantage is that adding a DNS entry would silently have no effect. Thoughts? |
I'd be fine with (5). It's at least explainable. It would be nice if the resulting exception message could be expanded a bit, though, to highlight that only the first address was tried, why, and what the dev's options are instead. |
Option 5 is the same thing as option 4 but throwing a different exception. As to the type of exception, my preference is PlatformNotSupportedException. No one in favor of adding static Connect methods as the xplat way of handling this? @NickCraver how would you feel if you'd need to change your code to a static Connect method to make it xplat? |
I'm in favor of it long term, but that's not going to happen for 1.0 or in general in the near future. Such methods do not exist in the full framework, which then has significant impact on portability. |
I forgot about that. Have you thought of how you want to tackle this problem in general? For the (long) time being. |
I'd still opt for option 3 here after this discussion. If we're not actually going to execute the input that was handed to the method, it should fail and indicate how to fix it on the platforms that don't support it. This results in immediate and actionable failure. Option 4 is bad to me, because it's silently not doing what I told it to. When people go and look up the methods on MSDN, the docs won't indicate this behavior and it'll result in a lot of developer frustration. We know what's happening inside the method though. We should simply detect and throw, and save a ton of developer hours and help migration along at the same time. This is happening on new platforms, where people are finding what they're hitting; it wouldn't happen anywhere existing. Hiding the error behind very unclear behavior draws that pain far out instead of just eating it near-term, while developers are actively porting. As far as the signatures differing in Core or not to make this work - I'm indifferent there as it's already an #if def. That being said, we're passing a single address anyway. TL;DR: I/O operations differ across platforms, this isn't really any kind of surprise. We can responsibly help developers quickly fix the issue while porting and not cause hair loss debugging it - so IMO, we should go that route: option 3. How does Mono handle this? I'll try and look later today. |
When I resolve "localhost" on my machine, it already gives me 2 addresses: 1 for IPv4 and 1 for IPv6. So option 3 would fail to connect to localhost and option 4 would succeed. |
@tmds Awesome - that's a good thing. It means we're failing early and changing calls early, not finding them later in deployment (which was one of the primary concerns). I'm not sure how common v6 will be on test VMs, though. |
I have to second Nick's thoughts with a preference to 3, but again with an improved message telling them what they should be doing instead, i.e. which "supported pattern" you want them to change their code to. This has the advantage of not biting anyone unnecessarily, while also never hiding actual runtime problems (i.e. by ignoring all-but-one address). Edit: and it retains API/signature compatibility with regular .NET, which is a big plus for compat. |
@NickCraver @mgravell if you like catching problems early, the current implementation already tells you that your code isn't xplat and that you should use the single IPAddress Connect method. Hostname to IPAddress resolution returns addresses in the order they should be tried, so trying the first will succeed in most cases. It is not uncommon to resolve names to multiple addresses (IPv6 being a major reason). Option 3 doesn't fail early: whether it throws depends on your network configuration, so it lets some cases pass and not others. If you know you'll only have 1 address, then call the Connect overload which accepts 1 address. It works fine on all platforms. |
It looks to me like Mono tries connecting to all addresses. If an attempt fails, they destroy the underlying native socket and create a new one. When an attempt succeeds, they keep the socket that was connected. A disadvantage of this would be that any configuration that had been done on the original socket would be lost; Mono doesn't appear to make any attempt to, for example, remember socket options and restore them on the new socket, and doing so would require extra bookkeeping in Socket, which would likely make every Socket object larger. I don't have a good feel for what the real-world impact of this approach might be. |
@ericeil I'd like to keep the options and so would StackExchange.Redis: https://github.com/StackExchange/StackExchange.Redis/blob/master/StackExchange.Redis/StackExchange/Redis/SocketManager.cs#L178-L179 As for the discussion on whether to connect when there is exactly 1 address (option 3) or at least 1 address (option 4): in my real-world home network when I resolve "localhost", my pc's hostname or another pc's hostname I get 2, 9 and 2 addresses respectively. |
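The difference between "exactly 1 address" (option 3) and "at least 1 address" (options 4 and 5) can be sketched as two selection policies. This is only an illustration of the behaviors debated in this thread, not corefx code; `AddressPolicy`, `SelectStrict`, and `SelectFirst` are hypothetical names, and the exception message is an assumed wording.

```csharp
using System;
using System.Net;

// Hypothetical helper, for illustration only; not part of corefx.
static class AddressPolicy
{
    // Option 3 as described in the thread: proceed only when the host
    // resolved to exactly one address, otherwise fail with guidance.
    public static IPAddress SelectStrict(IPAddress[] addresses)
    {
        if (addresses.Length != 1)
        {
            throw new PlatformNotSupportedException(
                $"Host resolved to {addresses.Length} addresses. " +
                "Resolve the host yourself and connect to a single IPAddress.");
        }
        return addresses[0];
    }

    // Options 4 and 5 as described in the thread: use the first address
    // and silently ignore the rest.
    public static IPAddress SelectFirst(IPAddress[] addresses)
    {
        if (addresses.Length == 0)
        {
            throw new ArgumentException("No addresses to connect to.", nameof(addresses));
        }
        return addresses[0];
    }
}
```

With a "localhost" that resolves to both an IPv4 and an IPv6 address, `SelectStrict` throws while `SelectFirst` returns the first entry, which is the trade-off being argued over here.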
Seems like it could end up being super expensive, especially when needing to support methods like
How about a compromise between options 3 and 6: Too complicated? |
Option 7 is the best solution given the constraints of the problem. In order of increasing probability of connect succeeding: We should aim for 4&5 as a minimum as occurrence of multiple addresses is high. Note if you want some more info on the order of addresses returned when resolving a name, see man getaddrinfo. |
With Option 7, wouldn't the StackExchange code still get PlatformNotSupportedException (since it configures the socket prior to calling Connect)? |
Yes, unless the properties it configures are ones we choose to special-case. |
It sounds like option 7 is not tenable for v1.0. Without adding some state to track at least some of the options set on a socket, it's not going to solve the problem. If we add state to track options, we are going to have to live with that extra overhead forever, even after we add static Connect methods to handle this properly. And, it's not clear to me that we could pick a suitable subset of all possible socket state that would make this "just work" for everyone, in the short timeframe available before 1.0 ships. Mimicking Mono's behavior (is that option "6?") similarly does not solve the problem for even the uses that lead to this discussion. Options 3-5 behave unpredictably, and it's not clear to me that we're going to be able to pick the right heuristic that will work often enough to not cause headaches for everyone. I certainly don't feel like we've settled on a consensus choice in this discussion so far. That leaves me thinking that the existing behavior (option 2) is the best choice, for 1.0, because it gives us predictable, early, failures for all usages, and leaves open all other options for possible future implementation. For code that needs this functionality on 1.0, there is a relatively simple workaround. So, I am moving this out of the RTM milestone, but will leave the issue open for further discussion. |
I just ran into the same problem while trying to port the .NET Driver for MongoDB to .NET Core. How do we solve this in RTM if the code suddenly starts to throw errors when you try to connect to localhost? Do you have to work around the problem yourself by using DNS to get a single connection endpoint? |
@vlesierse yes, that is the xplat way to handle this. Resolve to IPAddress[]. Try them one by one until one works or all have failed. |
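A minimal sketch of that workaround, assuming a fresh socket per attempt (since a BSD socket cannot be reused after a failed connect). `HostConnector` and the `configure` callback are hypothetical names introduced here; the callback is one way to reapply socket options on every new socket, which is the concern raised earlier about losing configuration.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
static class HostConnector
{
    public static async Task<Socket> ConnectAsync(
        string host, int port, Action<Socket> configure = null)
    {
        // Resolve first; addresses come back in the order they should be tried.
        IPAddress[] addresses = await Dns.GetHostAddressesAsync(host);
        SocketException lastError = null;

        foreach (IPAddress address in addresses)
        {
            // A new socket for each attempt, matching the address family.
            var socket = new Socket(address.AddressFamily, SocketType.Stream, ProtocolType.Tcp);
            try
            {
                configure?.Invoke(socket);   // reapply options on every attempt
                await socket.ConnectAsync(address, port);
                return socket;               // first successful connection wins
            }
            catch (SocketException ex)
            {
                lastError = ex;              // remember the failure, try the next address
                socket.Dispose();
            }
            catch
            {
                socket.Dispose();            // unexpected failure: clean up and rethrow
                throw;
            }
        }

        // Nothing resolved, or every attempt failed.
        throw lastError ?? new SocketException((int)SocketError.HostNotFound);
    }
}
```

A caller could write `var socket = await HostConnector.ConnectAsync("localhost", 6379, s => s.NoDelay = true);`, where the port and the option are just example values.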
As outlined in https://github.com/dotnet/corefx/issues/8768, on Unix we don't currently support using the instance Connect/ConnectAsync methods that take a string host or a DnsEndPoint, because they could map to multiple addresses, which means we might need to try reconnecting on the same socket after a failed attempt, and that's not supported with BSD sockets. There are potential workarounds we can explore as outlined in that issue, but they're non-trivial and/or have undesirable ramifications. However, one simple thing we can do is allow a string/DnsEndPoint version of an IPAddress, e.g. just as someone can provide an IPAddress, they can provide a string version of that IPAddress, such as "127.0.0.1". This is a common thing to do, and we can make it work just by attempting to parse the address.
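The "attempt to parse the address" idea might look roughly like the following. This is a sketch of the technique only, not the actual corefx change; `EndPointHelper` and `TryGetLiteralEndPoint` are hypothetical names.

```csharp
using System.Net;

// Hypothetical helper, for illustration only.
static class EndPointHelper
{
    // If the DnsEndPoint's host is really an IP literal such as "127.0.0.1",
    // convert it to an IPEndPoint so the single-address connect path can be used.
    public static bool TryGetLiteralEndPoint(DnsEndPoint dnsEndPoint, out IPEndPoint ipEndPoint)
    {
        if (IPAddress.TryParse(dnsEndPoint.Host, out IPAddress address))
        {
            ipEndPoint = new IPEndPoint(address, dnsEndPoint.Port);
            return true;
        }

        // A real host name: the caller still has to resolve it (or throw).
        ipEndPoint = null;
        return false;
    }
}
```

Because a literal maps to exactly one address, no second connect attempt on the same socket is ever needed in this case.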
…forms. Having the socket do the DNS lookup is apparently not supported by corefx. See issue: https://github.com/dotnet/corefx/issues/5829 and https://github.com/dotnet/corefx/issues/8768.
Due to https://github.com/dotnet/corefx/issues/8768, on non-Windows platforms Socket.Connect* methods throw PlatformNotSupportedException when connecting via host names. This change resolves the IP endpoints first when the endpoint is a DNS endpoint, tries to connect to each IP endpoint, then picks the first that works.
I couldn't find an existing issue covering the need for static Connect methods, so I created dotnet/corefx#11564. The existing instance methods are probably as good as they're going to get without either heroic effort or introducing hard-to-grok behavior. Either way, it doesn't seem worth the cost, unless we get a lot of feedback that a) the current situation is unacceptable, and b) adding more functional static methods won't be enough. |
I'm going to reopen this and explore the option 7 I suggested. I'm thinking of something like this:
Reasonable? If so, are there a particular set of socket options folks believe would address the 99% case? cc: @NickCraver, @tmds, @glennc, @geoffkizer, @danroth27, @richlander, @pgavlin, @halter73 |
Is it really that bad to just add static methods that do the right thing? As it stands, there's no "right" way for users to write their code on Unix platforms. Seems like this is a functional gap and we need to address it. I realize that's a compat pain between .NET Core and .NET Framework, but how do we value that vs actually implementing the right functionality here? Note we already have static ConnectAsync in netstandard. |
+1 to option 7. @geoffkizer Yes, it's bad. A major goal (the primary goal?) of I think we obviously disagree on the "right" functionality here. I see 7 as far more desirable by most libraries I can think of. Obviously the overhead is a weight we should evaluate, but I'd much rather have a slightly heavier socket that works over a user blowing up production. If the overhead is an issue for something really socket intensive, they'd be free to go the static route, as they are today. |
@stephentoub Personally I am fine with the PNSE informing me to adjust my code to Dns.Resolve or call a static method. Some observations from the references made to this issue:
|
Thanks, @NickCraver and @tmds. @tmds, isn't SIO_LOOPBACK_FAST_PATH a Windows-only option, anyway? |
Yep - Windows only, so that'll need to be behind a platform check either way (as it is already for anyone using it). Putting platform-specific optimizations behind platform-specific checks isn't a problem, IMO. They are optimizations, and completely optional. Basic functionality like "connect to this hostname" obviously my opinion sways sharply the other way on ;) |
Yes, this is exactly what makes it interesting. It won't be covered by the 99% of option 7.
When option 7 is implemented most 1% options could simply be placed after the connect call. That isn't the case here. SIO_LOOPBACK_FAST_PATH must be set before the Connect call. @stephentoub I see TcpClient already handles the 95% case. Offering yet another alternative to the developer to solve the PNSE. |
I'm not understanding. It doesn't apply to Unix, and on Windows this issue doesn't exist. So why is it interesting?
This is about a) making 15 years of existing code and already compiled binaries work better, and b) making the obvious/simple code a dev new to the platform would write just work. |
You're right. Not relevant.
It's great when things just work. |
For anyone following this, PR with huge improvements is active here: dotnet/corefx#16373 |
There's a bit of history to this, but due to https://github.com/dotnet/corefx/issues/5829 which was resolved by dotnet/corefx@30bd4b7, we are no longer able to use hostnames in socket connections for some platforms. Currently, it's a runtime PlatformNotSupportedException.
This breaks StackExchange.Redis on at least OS X and Linux, you can see the issue filed here: StackExchange/StackExchange.Redis#410
The best notes for this are in the dotnet/corefx@30bd4b7 commit, copied here for ease of consumption:
Currently option 2 is implemented, which breaks this codepath:
SocketTaskExtensions.ConnectAsync(this Socket socket, string host, int port)
Socket.BeginConnect(string host, int port, AsyncCallback requestCallback, object state)
This hits line 2334:
ThrowIfNotSupportsMultipleConnectAttempts();
The exact codepath doesn't matter, because several more feed into this one. In fact, even if you call BeginConnect(EndPoint remoteEP, AsyncCallback callback, object state), it's still calling BeginConnect(dnsEP.Host, dnsEP.Port, callback, state); in the host case anyway. So even using the explicit overload of a single endpoint, it still breaks.
I would like us to change to option 3 in the comments above, only throwing for the case that breaks. The current state needlessly breaks existing code for no real reason. The comments in option 3 about deployments being a surprise break are perfectly valid, but it's no worse than option 2 and we're seeing that break in actual runtime usage already out in the wild.
If we can't prevent the issue in general (it looks like all practical options have been exhausted there), I would argue for 2 changes: