Singleton HttpClient unable to execute GetAsync with .NET Core 2.1 in some environment #27501

Closed
schamo opened this issue Sep 28, 2018 · 27 comments

Comments

schamo commented Sep 28, 2018

We just migrated our (ASP).NET Core apps from 1.1 to 2.1. As we're using OpenIdConnect, we use an HttpClient for getting the configuration JSON from the identity provider, as well as the user information once he's logged in.

For efficiency reasons, we create an HttpClient on Startup and register it as a singleton. I'm conscious we should actually use HttpClientFactory by now - which I did, but I still see the same problem.

Now, on development and testing environments, everything works even after migration. However, on the customer's testing environment - where everything worked before with 1.1 as well - the HttpClient is no longer able to connect to the ID provider, yielding the following exception:

System.Net.Http.HttpRequestException: The requested name is valid, but no data of the requested type was found ---> System.Net.Sockets.SocketException: The requested name is valid, but no data of the requested type was found
   at System.Net.Http.ConnectHelper.ConnectAsync(String host, Int32 port, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.ConnectHelper.ConnectAsync(String host, Int32 port, CancellationToken cancellationToken)
   at System.Threading.Tasks.ValueTask`1.get_Result()
   at System.Net.Http.HttpConnectionPool.CreateConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Threading.Tasks.ValueTask`1.get_Result()
   at System.Net.Http.HttpConnectionPool.WaitForCreatedConnectionAsync(ValueTask`1 creationTask)
   at System.Threading.Tasks.ValueTask`1.get_Result()
   at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.FinishSendAsyncBuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
   at Bedag.Net.Core.Service.Auth.OpenIdConnect.OpenIdConnectDiscovery.ObtainConfiguration()

As I already wrote, everything works on our development and testing environments, but not on the customer's environment. I've been trying to figure out configurations that are completely different, but couldn't find any. So I started playing around with code and .NET Core options. And I even found some workarounds. Here's what did and didn't work:

Works:

  • Using DOTNET_SYSTEM_NET_HTTP_USESOCKETSHTTPHANDLER environment variable and setting it to 0
  • Initializing the HttpClient for every request (this works no matter if I set the environment variable or not)

Doesn't work:

  • Injecting IHttpClientFactory and then using CreateClient for any request
  • Using a static HttpClient
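
The first workaround can also be applied from code rather than through the environment. The following is a minimal sketch, assuming the `System.Net.Http.UseSocketsHttpHandler` AppContext switch that .NET Core 2.1 documents as the in-process counterpart of the environment variable. The class and method names are hypothetical and not taken from the application discussed here.

```csharp
using System;
using System.Net.Http;

public static class HandlerSwitch
{
    // Switch name documented for .NET Core 2.1 (in-process counterpart of
    // DOTNET_SYSTEM_NET_HTTP_USESOCKETSHTTPHANDLER=0).
    public const string SwitchName = "System.Net.Http.UseSocketsHttpHandler";

    // Requests the pre-2.1 handler for HttpClient instances created afterwards.
    public static void UseLegacyHandler() => AppContext.SetSwitch(SwitchName, false);

    // True only if the switch has been set explicitly to false.
    public static bool IsLegacyHandlerRequested() =>
        AppContext.TryGetSwitch(SwitchName, out bool enabled) && !enabled;

    public static void Main()
    {
        // Set the switch before constructing the client it should affect.
        UseLegacyHandler();

        using (var client = new HttpClient())
        {
            Console.WriteLine($"Legacy handler requested: {IsLegacyHandlerRequested()}");
        }
    }
}
```

Setting the switch at the very start of the process keeps the behaviour the same for every client, whether it is created at startup or per request.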

So this is confusing. If I initialize an HttpClient for every request, it works even when using SocketsHttpHandler (and of course it works with WinHttpHandler as well). However, if the HttpClient (or the HttpClientFactory) is initialized during Startup, the code fails with SocketsHttpHandler and only works with WinHttpHandler.
My guess is that the new SocketsHttpHandler requires some parameters that are not yet available on Startup, but that WinHttpHandler does not need (or obtains in another way). Because SocketsHttpHandler does work when used "later" in the code, I guess its initialization on Startup is incomplete.

Can you give me any hints on this? It's just confusing that the apps are working on some environments and not on others (all machines are running Windows Server 2012 R2 with IIS 8.5).

Member

wfurt commented Sep 29, 2018

Can you provide a simplified repro case, @schamo? Even if it does work for you, it may give us some clues.

Member

karelz commented Oct 1, 2018

@schamo any luck with a repro? (ideally minimal one - using HttpClient directly without the factory)
We would be very interested in digging into it.

Which version of 2.1 do you have installed?

Member

karelz commented Oct 1, 2018

cc @geoffkizer

Author

schamo commented Oct 1, 2018

I actually have to work on another project these days. I'll probably continue on this one on Wednesday/Thursday, so I should be able to create a repro this week.

@karelz We're using 2.1.4, as I read that this version should resolve some proxy issues, which I suspected to be the cause of the problem. I don't know if this was the right track... I first tried 2.1.3, which gave me the same result.

Member

karelz commented Oct 1, 2018

Proxies will be fully fixed in 2.1.5 (coming soon). If you can try a daily build of 2.1.x, that would be awesome; otherwise let's just wait for 2.1.5 to come out.

Author

schamo commented Oct 2, 2018

What does "coming soon" mean? Are we talking about days, or weeks?
I prefer waiting for 2.1.5, then, as I'm not sure whether the customer will agree to install a "provisional" version on his environment.

But I'll still try to get a working repro, just in case 2.1.5 would not resolve the issue.

Member

karelz commented Oct 2, 2018

I was intentionally vague, as I don't know the schedule for 2.1.5. @leecow can you please share a rough ETA for 2.1.5?

Member

leecow commented Oct 2, 2018

1 hour ago ;-)

Author

schamo commented Oct 3, 2018

Thanks, I will therefore try 2.1.5. Is it enough to simply install the new runtime on the server, or do I need to re-publish the app using the new SDK as well?

Member

karelz commented Oct 3, 2018

Yes, simply install it on server - the latest patch version is always picked up by apps. No need to re-publish.

Author

schamo commented Oct 4, 2018

OK, I tried this out - unfortunately without success. The workaround's still doing the job, but we'd like to get away from this...
I wasn't able to create a simple repro, either. Seems to be more complicated than I thought. Don't know when I'll find some more time for this, hopefully still this week.

Author

schamo commented Oct 15, 2018

Just a short update on this... I've already tried various approaches to reproduce this, but the only one that was "successful" was using our own stack, which I won't publish here. I therefore still need to figure out what exactly is causing this behavior. It might be the fact that we're using a hierarchy of Startup classes; that's what I'm going to try next. We're also using SimpleInjector instead of the standard DI implementation, but this apparently wasn't the cause. I'm continuing to work on this one...

Author

schamo commented Oct 16, 2018

Here we go... I finally managed to isolate the problem.
The issue appears when HttpClient is injected (and used) in two dependent services. You may find an example project here:
https://github.com/schamo/HttpClientIssueReproduction

I'm aware that the code doesn't make a lot of sense. Our use case is OpenID authentication, where we have one service reading the JSON config of the IDP, and another service reading the user information (both via HttpClient).
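
The constellation described above, one shared HttpClient used by two services where the second depends on the first, can be sketched roughly as follows. This is an illustration only, not the code from the linked repository: all type and member names are hypothetical, and the services are wired by hand instead of through a DI container.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical stand-in for the service that reads the IDP's JSON configuration.
public sealed class DiscoveryService
{
    public HttpClient Client { get; }

    public DiscoveryService(HttpClient client) => Client = client;

    public Task<string> GetConfigurationAsync(string url) => Client.GetStringAsync(url);
}

// Hypothetical stand-in for the service that reads the user information.
// It depends on both the shared HttpClient and the discovery service.
public sealed class UserInfoService
{
    private readonly DiscoveryService _discovery;

    public HttpClient Client { get; }

    public UserInfoService(HttpClient client, DiscoveryService discovery)
    {
        Client = client;
        _discovery = discovery;
    }

    public async Task<string> GetUserInfoAsync(string configUrl, string userInfoUrl)
    {
        // First call goes through the dependent service, second one through this service.
        await _discovery.GetConfigurationAsync(configUrl);
        return await Client.GetStringAsync(userInfoUrl);
    }
}

public static class Wiring
{
    // Mirrors a singleton registration: both services receive the same client instance.
    public static UserInfoService Build(HttpClient shared) =>
        new UserInfoService(shared, new DiscoveryService(shared));

    public static void Main()
    {
        var shared = new HttpClient();
        UserInfoService service = Build(shared);
        Console.WriteLine(ReferenceEquals(shared, service.Client));
    }
}
```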

As written before, the problem occurs only on one environment (which unfortunately we cannot fully control), but not on others. Setting DOTNET_SYSTEM_NET_HTTP_USESOCKETSHTTPHANDLER to 0 solves the issue; however, I guess that this workaround won't stay around forever.

Another thing that might be worth noting is that the same issue occurs when we're calling a WCF service from a .NET Core console application - where we have no HttpClient registration at all. If I find some more time, I might include another repro project in the solution mentioned above.

Member

stephentoub commented Oct 17, 2018

> You may find an example project here

Thanks, @schamo. Should I be able to repro this just by opening the .sln and pressing F5 or Ctrl-F5? When I do, it appears to work, with the HTML for a Google page displayed in the browser. But maybe that's what you were referring to when you said it only repros on one system and not on any others?

"The requested name is valid, but no data of the requested type was found" is the error message generated by Windows for SocketError.NoData, which generally means there's a DNS-related issue.
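
One way to check the DNS angle in isolation is to resolve the host name directly on the affected machine and inspect the socket error code. The following is a minimal sketch using `Dns.GetHostAddressesAsync` and `SocketException.SocketErrorCode`; the class name, the `localhost` default and the wording of the description are assumptions made for illustration.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

public static class DnsProbe
{
    // Turns a socket error code into a short human-readable description.
    public static string Describe(SocketError code) =>
        code == SocketError.NoData
            ? "NoData: the name is valid, but no record of the requested type was returned"
            : code.ToString();

    // Resolves the host and returns either the address list or the described error.
    public static async Task<string> ProbeAsync(string host)
    {
        try
        {
            IPAddress[] addresses = await Dns.GetHostAddressesAsync(host);
            return string.Join<IPAddress>(", ", addresses);
        }
        catch (SocketException ex)
        {
            return Describe(ex.SocketErrorCode);
        }
    }

    public static void Main(string[] args)
    {
        // Pass the identity provider's host name as the first argument.
        string host = args.Length > 0 ? args[0] : "localhost";
        Console.WriteLine(ProbeAsync(host).GetAwaiter().GetResult());
    }
}
```

Running this under the same account as the application, once at startup and once later, would show whether name resolution itself behaves differently in the two situations.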

Author

schamo commented Oct 17, 2018

Hi @stephentoub
Yes, starting with F5 should do the job. There's a DataController with a GET method, so /api/data is the way to go (and therefore the default launch URL).
As I wrote before, the code we have runs everywhere on our testing environment, but fails with the given message on the customer's environment. It worked with .NET Core 1.1.x, and it works with the DOTNET_SYSTEM_NET_HTTP_USESOCKETSHTTPHANDLER environment variable set to 0. It works as well if only one service uses the HttpClient, but it fails as soon as I have the situation you see in the example project.
As it does work with the "old" WinHttpHandler, and it works when creating a new HttpClient for every request, I doubt there's a DNS issue. It's really just this specific constellation with the SocketsHttpHandler that does not work, and (until now) only on one given environment.

Member

karelz commented Oct 17, 2018

@schamo given that it reproduces only in some environments, we either need steps to recreate such an environment locally, or access to the failing environment, or we will need to ask you to debug it there.
Which one is easiest?

Author

schamo commented Oct 17, 2018

What indications would you need to recreate such an environment? If I knew the "guilty" setting, I could probably try it on another environment as well. Requesting access to the environment for you is not really an option - not going into detail, but it's complicated...
Debugging will probably be the "easiest" way. What would that mean exactly?

Member

karelz commented Oct 17, 2018

Assuming you have 100% (or high-percentage) repro, step into SocketsHttpHandler code and see where the error comes from - DNS APIs are suspicious here (likely called from Socket APIs).

Member

wfurt commented Oct 17, 2018

This could be environmental, right? E.g. the errors could represent true network failures.
Would that be visible in ETW traces, @karelz?

Member

karelz commented Oct 17, 2018

@wfurt no idea. Note that we added most logging post-2.1 :(
It may be worth a shot - comparing the working repro trace vs. non-working.
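
Where collecting ETW traces on the server is impractical, the same EventSource events can also be consumed in-process with an `EventListener`. This is a rough sketch, not ETW itself. It assumes that the networking event sources share the name prefix `Microsoft-System-Net`, and given the limited logging in 2.1 mentioned above, the output may be sparse. The class name is hypothetical.

```csharp
using System;
using System.Diagnostics.Tracing;
using System.Net.Http;

public sealed class NetEventListener : EventListener
{
    // Assumption: the networking event sources share this name prefix.
    public static bool IsNetworkingSource(string name) =>
        name.StartsWith("Microsoft-System-Net", StringComparison.Ordinal);

    // Called for every event source, including those created before this listener.
    protected override void OnEventSourceCreated(EventSource source)
    {
        if (IsNetworkingSource(source.Name))
        {
            EnableEvents(source, EventLevel.Verbose, EventKeywords.All);
        }
    }

    // Writes each received event to the console as "source/event: payload".
    protected override void OnEventWritten(EventWrittenEventArgs e)
    {
        string payload = e.Payload == null ? "" : string.Join(" | ", e.Payload);
        Console.WriteLine($"{e.EventSource.Name}/{e.EventName}: {payload}");
    }

    public static void Main()
    {
        using (new NetEventListener())
        using (var client = new HttpClient())
        {
            // Issue the failing request with this client while the listener is attached.
            Console.WriteLine("Listener attached.");
        }
    }
}
```

Capturing the output once on a working machine and once on the failing one gives two logs that can be compared side by side.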

Author

schamo commented Oct 17, 2018

Sorry if this sounds dumb, but how can I step into SocketsHttpHandler code? Do I need to publish the app in a special way so I can see this code?
The repro "works" (i.e. the calls fail) 100% of the time, but as I said, only with SocketsHttpHandler. With WinHttpHandler, there is never a failure.

Author

schamo commented Oct 17, 2018

How do I enable ETW traces? I read about them somewhere, but didn't get how to use them. Is there a tutorial that I can follow (more or less) one to one?

Member

wfurt commented Oct 17, 2018

Uncheck "Debug just my code" in VS. You should be able to get symbols from the public symbol servers.
VS should be able to get it all working for you.

Author

schamo commented Oct 17, 2018

I'll try to do that. However, I won't be able to install VS on that server. I'll have to see if remote debugging works; I doubt it, as I have to access the server via a jump host. This seems a little hopeless right now...

Member

karelz commented Oct 17, 2018

ETW tracing might be the next best thing, although I am not sure if it will help in this case.
We might have more luck with .NET Core 3.0 daily bits if you can try them for the repro (there's more logging in there).
Or you might try a lightweight debugger like VS Code.

Author

schamo commented Oct 18, 2018

I tried 3.0, following this guide. This means I downloaded and installed the latest SDK (3.0.100-alpha1-009689) and added a NuGet.config with the specified content. Then I simply executed dotnet new web. This directly gave me a NU1102 error: The package Microsoft.AspNetCore.App with version (>= 3.0.0-alpha1-10062) could not be found. It also says that the latest version in the dotnet-core feed is 2.2.0-preview3-35497. What went wrong here?

If trying the 3.0 version does not bring us any further, we'll just run the repro app on the next higher environment. If it works there, we'll ask the customer's IT department to analyze the config differences between the environments. We won't go the debugging way right now, as no-one is willing to spend more money on this. Even though we still may be forced to do it later...

Member

karelz commented Nov 2, 2018

OK, closing it for now then, as it is not actionable. If you have more information in the future, ping us and we can reopen. Thanks!

@karelz karelz closed this as completed Nov 2, 2018
@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 3.0 milestone Jan 31, 2020
@dotnet dotnet locked as resolved and limited conversation to collaborators Dec 15, 2020