HttpClient throws unexpected TaskCanceledException when HttpClient is reused #24392
Comments
|
As an update, the issue appears to occur mostly on requests that have Though there have been a very few exceptions to this. |
|
@shravan2x Can you provide a small repro app for the issue, with the exact combination of options for which it repros? Going by your description, the proxy stuff is not required for the repro, so it would be great if it contained only the relevant code. |
|
@Priya91 I'm trying to get one up, but haven't had success yet. Making the same request over and over (with I've currently pushed a version of the app that replaces all common occurrences of I'll post back when I have more information. |
|
I was finally able to reproduce it! It has nothing to do with the
It only happens on Linux. The correct behavior would be for it to detect closed connections and recreate a new one. I remember @davidsh mentioning that .NET Core on linux uses a CURL wrapper, but Windows uses something else. That might lead to the differing behavior. This explains why it didn't occur when I used a new
Reproduction:
HttpClient hc = new HttpClient();
while (true)
{
    Console.Write(DateTime.Now + " ...");
    await hc.GetAsync("https://api.steampowered.com/", (new CancellationTokenSource(TimeSpan.FromSeconds(30))).Token);
    Console.WriteLine("done");
    Console.ReadKey();
}
Run this, wait 5-6 minutes, then enter a key to send another request.
Result on Windows:
Result on Ubuntu 16.04:
The exception occurs after 30s, as expected from the CancellationToken we passed in.
P.S. I made a separate issue for the extra newlines in the Ubuntu output at https://github.com/dotnet/corefx/issues/25809. |
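For anyone trying the repro, it may help to record how long each request ran before it completed or was canceled, so that a 30s cancellation is easy to tell apart from a fast failure. The helper below is only a sketch: `TimedRequest` and `RunAsync` are hypothetical names, not part of the original repro or of any library.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not from the original repro): runs a request under a
// timeout token and reports how it ended and how long it took.
public static class TimedRequest
{
    public static async Task<(string Outcome, TimeSpan Elapsed)> RunAsync(
        Func<CancellationToken, Task> request, TimeSpan timeout)
    {
        var stopwatch = Stopwatch.StartNew();
        using (var cts = new CancellationTokenSource(timeout))
        {
            try
            {
                await request(cts.Token);
                return ("done", stopwatch.Elapsed);
            }
            catch (OperationCanceledException)
            {
                // TaskCanceledException derives from OperationCanceledException,
                // so this also catches the exception discussed in this issue.
                return ("canceled", stopwatch.Elapsed);
            }
        }
    }
}
```

With the repro's `hc`, a call could look like `await TimedRequest.RunAsync(token => hc.GetAsync("https://api.steampowered.com/", token), TimeSpan.FromSeconds(30));`, printing the returned outcome and elapsed time after each key press.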
|
I am also having the same problem, but only after I deploy to Azure (Azure Container Service, Ubuntu 16.04).
In my case, waiting 4 minutes is enough to reproduce it. (TCP/IP TIME_WAIT of 240 seconds?) My current workaround (until the bug is fixed) is to remember the last invocation time and dispose and recreate the HttpClient instance once 4 minutes have passed, so the same instance is still reused within 4 minutes of the last invocation. |
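A minimal sketch of that workaround, assuming a single shared wrapper object: the class name `IdleRecyclingHttpClient`, the injectable clock, and the choice of idle window are all assumptions made for illustration, not code from this thread. The window itself is a guess and may not match every server's connection timeout.

```csharp
using System;
using System.Net.Http;

// Hypothetical wrapper: hands out one shared HttpClient and replaces it
// when it has been idle for at least maxIdle since the last call.
public sealed class IdleRecyclingHttpClient : IDisposable
{
    private readonly TimeSpan _maxIdle;
    private readonly Func<DateTime> _utcNow;
    private readonly object _gate = new object();
    private HttpClient _client = new HttpClient();
    private DateTime _lastUsedUtc;

    // utcNow is injectable only so the idle logic can be exercised without waiting.
    public IdleRecyclingHttpClient(TimeSpan maxIdle, Func<DateTime> utcNow = null)
    {
        _maxIdle = maxIdle;
        _utcNow = utcNow ?? (() => DateTime.UtcNow);
        _lastUsedUtc = _utcNow();
    }

    public HttpClient GetClient()
    {
        lock (_gate)
        {
            DateTime now = _utcNow();
            if (now - _lastUsedUtc >= _maxIdle)
            {
                // Caution: disposing aborts any requests still in flight on the
                // old instance, so this suits callers that do not overlap requests
                // across an idle gap.
                _client.Dispose();
                _client = new HttpClient();
            }
            _lastUsedUtc = now;
            return _client;
        }
    }

    public void Dispose()
    {
        lock (_gate)
        {
            _client.Dispose();
        }
    }
}
```

Callers would fetch the client through `GetClient()` immediately before each request, for example with `new IdleRecyclingHttpClient(TimeSpan.FromMinutes(4))`, instead of holding on to an `HttpClient` reference.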
|
@jo-ninja I'm not sure a fixed 4 minute window would always work, since (apart from .NET Core's own timeout) different servers may have different connection timeouts. |
|
@shravan2x Thanks for coming up with a more helpful repro. It sounds like a bug, we will investigate. |
|
This does seem like a pretty serious issue. |
|
@shravan2x I tried this repro with latest changes from master on ubuntu1704 with curl 7.52.1, and was not able to repro it. Can you try with the latest master bits? |
|
@Priya91 Does .NET Core use the version of CURL I can find with If not, how would I check this? |
|
@shravan2x I got the same issue. |
|
I'm on 7.47.0. @Priya91 If you can't reproduce it, it might be something that was fixed on master since at least |
|
I changed the code so that it does not always create a new HttpClient, and it works well now. |
|
@Priya91 can you check your test on 2.0? It would be good to confirm that we can reproduce it on 2.0, so that failing to reproduce it on master (2.1) means it is fixed. @tiayi I don't follow - the code snippet above does not create new |
|
@shravan2x I was not able to repro this for 2.0 as well, |
|
@shravan2x Can you share more information about your runtime environment? Which Linux distro version are you on, and is it a Docker container? Can you also try your code on an updated version of curl? |
|
Runtime Environment: .NET Core 2.0.2 (A regular installation, not preview or anything) I can't try an updated CURL until next Wednesday or so. |
|
I was not able to repro on a vanilla ubuntu16.04 vm with .netcore 2.0.2. @shravan2x Are you able to reproduce on other ubuntu16.04 vms? It looks like your machine could be in a corrupted state, or some other issue is interfering with your requests. |
|
@Priya91 I've tested on 3 VMs so far. All of them have the issue. It might have something to do with the fact that I upgraded most of these from the 1.0.0 preview versions. I'll try on a vanilla VM and post back. |
|
I was able to reproduce a similar issue with the AWS SDK using HttpClient underneath on Ubuntu 16.04. Details can be found in aws/aws-sdk-net#796 |
|
@danielmarbach why do you think it is the same problem? There is no concurrency in this case. |
|
I was not able to reproduce it on the same machine with the code provided here and not even when taking concurrency into account. When you look at the stack trace the exception is also raised from |
Almost all exceptions will be from there, that's the method that awaits the execution of the send task that sends the request. |
|
I don't think 2.0.4 will make any difference for your case. Closing for now. Feel free to reopen if there is actionable repro / more people hitting the problem. |
|
@karelz One thing is that I have a side-by-side installation of dotnet 1.1.4 on the VMs with the issue. I'll try removing it to see if that makes a difference. |
|
@shravan2x there may be even more machine-wide configurations. Removing things one by one is usually a good strategy for understanding how to build a new VM from scratch that reproduces it. If you succeed, just post more details and let us know, or reopen the issue. |
|
It is weird/suspicious that step [2] is needed to reproduce it. |
|
Any update or work-around on this? I'm experiencing this issue with dotnet core runtime v2.0.4 running on Raspberry Pi with Debian 9 (Stretch).
|
|
@1iveowl if you have a repro, or something we can reproduce here and investigate, we would be happy to look at it as I mentioned above. |
|
@karelz It's a tough nut. I tried to create dumbed down repo, but I can only provoke the error when I try and connect to a web service running on a particular server running in my local environment. The server is not exposed on the internet. |
|
Is it possible to try to replicate the server setup somewhere on the internet (e.g. Azure)? We could run small repro against existing server. |
|
Would it be so far off to try my AWS repo? It might still lead to some insights |
|
It should not matter where it is hosted as long as we can reproduce the problem. It is primarily about the client anyway. |
|
@danielmarbach AFAIK your repro used AWS SDK - that is something I would like to remove from the repro (removing the chance it is some weirdness introduced by the SDK). |
|
We encountered the same issue on Windows Server 2008 R2. |
|
I reproduced the issue on a ubuntu 16.04 x64 vm with runtime 2.0.6 and sdk 2.1.101. The issue came out with the default It seems to have something to do with the call of |
|
@lukazh do you have a simplified repro you can share? |
|
@karelz FYI. No big difference from what shravan2x provided. I will try it out with 2.0.3 later to check if the issue comes out again. |
|
@lukazh we were not able to reproduce the original problem in house or on vanilla VM (see https://github.com/dotnet/corefx/issues/25800#issuecomment-352556764 and https://github.com/dotnet/corefx/issues/25800#issuecomment-354729309). |
|
@karelz If this is too difficult to reproduce on a vanilla VM, perhaps we could try to reproduce in a container, freeze it and upload? |
|
@shravan2x that is a good idea. It would be still good to minimize the steps which are installed / executed on the container prior to the repro. |
|
I ran some more tests on the same VM with different URLs and the exception never came up. Even with the URL I used for the repro, I can't reliably run into the issue. |
|
@lukazh My timeout was 100s for an endpoint that usually completes <1s. I'm fairly confident waiting longer wouldn't change much. |
|
Any updates on this issue? |
|
Any updates here? Our API service (ASP.NET API) runs on Azure Classic App Service, and out of every 100000 requests, 200 fail with Internal Server Error and the same exception. Please assist. [EDIT] Format call stack by @karelz |
|
@dshevani your callstack seems to be from .NET Framework (System.Web.* usage), not from .NET Core. Moreover it does not seem to use HttpClient at all, so it is likely not related to this issue. For .NET Framework help, you can use community channels (like StackOverflow), or if you have a repro / specific bug report, then VS Developer Community. @sepehr1014 in general, we are still blocked on getting anything actionable here - repro, or deeper analysis/debugging from a repro. |
|
The team I work with has been observing this issue on a host of .NET Core services we have - all of them have had the 100s timeout on various calls with no correlation to call size and minimal correlation to load. We also experienced the frustration of low reproducibility and inability to get the issue to occur in our dev environments. I suspect part of that stemmed from developing on Windows boxes, while our services are running in containers on Linux. Luckily we have seen the issue resolve when we built against .NET Core 2.1 RC. I haven't had the time to completely isolate the issue, but since it was present when we built against 2.1 Preview 1, we believe it was related to the fact that since preview 2, the default is to use the SocketsHttpHandler instead of the native handler (which uses libcurl). After almost 3 days running so far we haven't seen any timeouts, so we're pretty confident it's resolved. |
|
@booradlus that's great news, thanks for confirmation that 2.1 RC helps! |
|
@karelz I had the exact same issue on Debian Stretch Docker images. Roughly 1-3 calls in a 1000 from a static The HTTP call would never leave the box, so the target service never even recorded an incoming request. It was happening fairly consistently, but impossible to reproduce on demand though, and appeared to happen with no apparent behavior pattern. The problem disappeared after upgrading to the .NET Core 2.1 RC runtime, so it might have been some issue with the curl handler. |
|
Do you have a standalone simple repro, @filipw? Something I can just run and get a repro? |
|
Thanks @filipw for letting us know it does not reproduce for you on .NET Core 2.1 RC anymore! The root-cause might have been in libcurl (specific version), or how we interact with it in CurlHandler. We introduced SocketsHttpHandler (default in 2.1) to help us address exactly this class of problems and make behaviors on all platforms uniform. It's good to see it is paying off. |
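For readers on .NET Core 2.1 or later who want the handler choice and its connection-pool behavior to be explicit rather than implied by defaults, a sketch along these lines may be useful. `HandlerSketch` and `CreateHandler` are hypothetical names, and whether tuning the idle timeout has any bearing on the behavior reported in this issue is an assumption; the thread only reports that moving to 2.1 made the timeouts disappear.

```csharp
using System;
using System.Net.Http;

// Hypothetical factory: builds a SocketsHttpHandler (available from .NET Core 2.1)
// that drops pooled connections after they have been idle for idleTimeout.
public static class HandlerSketch
{
    public static SocketsHttpHandler CreateHandler(TimeSpan idleTimeout)
    {
        return new SocketsHttpHandler
        {
            // Idle pooled connections older than this are closed instead of reused.
            PooledConnectionIdleTimeout = idleTimeout,
        };
    }
}
```

A shared client could then be created once with `new HttpClient(HandlerSketch.CreateHandler(TimeSpan.FromMinutes(2)))` and reused for all requests; the two-minute value is only an example.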
This is the raw stack trace from the exception that occurs:
The issue:
My problem originates with the HttpWebRequest API being deprecated and a significantly lower performance version being used in .NET Core 2.0. I use a library (QuarkHttp) that is a tiny wrapper around the HttpWebRequest API. When porting this wrapper library to .NET Core, I used a shared HttpClient instance to avoid the overhead of creating new connections each time (following @geoffkizer's comment /cc @davidsh). My code looks like this:
The exception occurs on this line
In the exception handler, I measure the time taken myself to ensure that I wasn't missing something obvious. This is the output of that logging:
One point to note is that this issue occurs across at least 5 different domains that I know of, but I have never seen it anywhere in the past. It works as expected on the .NET Framework 4.5.x, 4.6.x, 4.7.x and Mono 4.x.x/5.x.x. However, on .NET Core, the issue occurs very often (many times an hour in my application).
This is why I think this is a framework issue rather than my own error:
In the SendNewAsync method, the line HttpClient httpClientToUse = _defaultHttpClient; causes the reuse of the default HttpClient. When this line is changed to HttpClient httpClientToUse = CreateHttpClient((false, null)); (which is used to initialize the default HttpClient in the first place as seen in the constructor), a new HttpClient is used for every request. This makes things slower, but the issue disappears. It only occurs when the same HttpClient is reused.
I have no code that modifies the HttpClient in any way after it is initially created. In my specific application, the proxy and allowRedirects options are never changed, so the HttpClient that's used wasn't taken from _proxyHttpClients either.
I'm not sure how to debug this issue further. I could definitely test things if anyone has ideas.
My .NET Core version is 2.0.2, and it runs on Ubuntu 16.04.