Crash with libCurl 7.57.0 and CURL_LOCK_DATA_CONNECT #2132
Comments
I wrote up a pthreads-using alternative version, which runs 37 threads, each thread doing 11 loops, all downloading a 512MB file from … I didn't manage to reproduce the crash, but I've stared at the stack traces above and read the code, and I think I know what's going on. The problem is in …
No, I was wrong. The connection cache that is initialized and kept in the multi handle is not used when the shared connection cache is used, so that's not it...
The first stack trace points at the "host cache" that seems to cause the issue:
I just tested it at home on macOS. There is no crash there either. It seems to happen on Windows only.
This is the version of the test program I ran on macOS. I also tried this version on Windows 10 with Visual Studio 2015, where it crashes nearly every time.
Interesting. Some questions:
I have an idea. Will offer a test patch soon.
If the lock is released before the dealings with the bundle are over, the bundle may have been changed by another thread in the meantime. Fixes #2132
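The locking rule in that commit message can be sketched in plain C with pthreads. This is only an illustration under stated assumptions, not libcurl's code: `struct bundle`, `struct cache`, `take_connection` and `run_workers` are hypothetical names invented here. The point it shows is that the check ("is there a connection?") and the update ("take it") happen under one lock acquisition, so no other thread can change the bundle in between.

```c
#include <pthread.h>

#define MAX_WORKERS 64

/* Hypothetical stand-in for a per-host connection bundle (not libcurl's struct). */
struct bundle {
  int num_connections;
};

/* Hypothetical shared cache: one lock guarding one bundle. */
struct cache {
  pthread_mutex_t lock;
  struct bundle bundle;
};

struct worker_arg {
  struct cache *cache;
  int attempts;
  int taken;
};

/* Check and update the bundle under a single lock acquisition.
   Unlocking between the check and the decrement would let another
   thread modify the bundle in the meantime. */
static int take_connection(struct cache *c)
{
  int got = 0;
  pthread_mutex_lock(&c->lock);
  if(c->bundle.num_connections > 0) {
    c->bundle.num_connections--;
    got = 1;
  }
  pthread_mutex_unlock(&c->lock);
  return got;
}

static void *worker(void *p)
{
  struct worker_arg *a = p;
  for(int i = 0; i < a->attempts; i++)
    a->taken += take_connection(a->cache);
  return NULL;
}

/* Start up to MAX_WORKERS threads that each try `attempts` times to take
   a connection; return how many connections were handed out in total. */
static int run_workers(int nthreads, int attempts, int initial)
{
  struct cache c;
  pthread_t tids[MAX_WORKERS];
  struct worker_arg args[MAX_WORKERS];
  int started = 0;
  int total = 0;

  if(nthreads > MAX_WORKERS)
    nthreads = MAX_WORKERS;

  pthread_mutex_init(&c.lock, NULL);
  c.bundle.num_connections = initial;

  for(int i = 0; i < nthreads; i++) {
    args[i].cache = &c;
    args[i].attempts = attempts;
    args[i].taken = 0;
    if(pthread_create(&tids[i], NULL, worker, &args[i]) != 0)
      break; /* stop starting threads; join the ones already running */
    started++;
  }
  for(int i = 0; i < started; i++) {
    pthread_join(tids[i], NULL);
    total += args[i].taken;
  }

  pthread_mutex_destroy(&c.lock);
  return total;
}
```

With the lock held across both steps, the total handed out can never exceed the initial count, however the threads interleave.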
It'd be great if you could try out a build with the #2139 change applied!
Thanks for the fast patch. I have an interesting new finding: the patch you provided seems to make the example crash-free as long as I use a maximum of 5 threads. After the example has finished, I have 5 sockets in Windows that are in state WAITING according to netstat.
I found another suspicious detail that I pushed to the PR branch. When scrutinizing this functionality more closely, it also struck me that it will behave badly in cases where pipelining or h2 is attempted on a connection that is owned by another thread. We need to make sure that those features are limited to easy handles that are part of the same multi handle. But this crash isn't using either of those features...
Please try again with my additional teeny weeny fix added. It is a bit tricky for me since I can't reproduce this easily, which, given the flaws I found, I honestly can't really understand... :-/
Yes, I would, but I don't see any commit on that branch. Which commit (fix) do you mean?
Still getting crashes but less frequently. Callstack is:
And I still see a lot of WAITING sockets per run. If the connection pool cache size is set to its default value of 5, shouldn't there be a maximum of 5 waiting sockets after each run?
Okay, thanks. Then I think we can consider that a small step in the right direction at least...
With how many threads? When the pool fits 5 connections that are all in use when the 6th request ends, it will close that 6th connection. So if you do 100 parallel connections that all end at the same time you'd get about 95 closed connections at once and 5 kept alive in the pool.
I referenced my example https://gist.github.com/patrickdawson/da3ff09835dfc0c19caf2fd94c105f18, so with 100 threads. Then I misunderstood the option.
Yeah, I have more work coming up to test soon...
I updated the PR again...
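The arithmetic in that explanation can be checked with a small counting model in C. This is a simplified sketch, not libcurl's implementation: `struct pool_result` and `finish_all` are hypothetical names, and the model only counts how many connections are kept versus closed. It does not model which particular connection gets closed.

```c
/* Hypothetical result type: how many finished connections stayed in the
   pool and how many had to be closed. */
struct pool_result {
  int kept;
  int closed;
};

/* Return `finished` connections to a pool that holds at most `capacity`
   idle connections. Once the pool is full, every further connection
   that finishes is counted as closed. */
static struct pool_result finish_all(int finished, int capacity)
{
  struct pool_result r = {0, 0};
  for(int i = 0; i < finished; i++) {
    if(r.kept < capacity)
      r.kept++;
    else
      r.closed++;
  }
  return r;
}
```

For the numbers used in the comment, `finish_all(100, 5)` gives 5 kept and 95 closed, which matches the "about 95 closed connections at once and 5 kept alive" estimate.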
Hey. I ran my test program (the example I referenced, https://gist.github.com/patrickdawson/da3ff09835dfc0c19caf2fd94c105f18) 20 times and did not see a single crash. It seems like you fixed it.
Awesome, thanks for verifying this. I'll clean up, squash some commits and merge into master soonish.
I did this
I tried the new option CURLSHOPT_SHARE with the parameter CURL_LOCK_DATA_CONNECT, which is available in libcurl 7.57.0 according to https://curl.haxx.se/dev/release-notes.html. When I use this option, I get random crashes in my application with different call stacks.
Here is an example program to reproduce the problem:
As test server I used a small nodejs express server:
Here are the callstacks from the crashes I got:
Callstack 1 (this one happens most of the time) with the following error:
free(ca->ai_addr); // Read access violation, ca is 0x1
Callstack 2 (less frequently than callstack 1 in my tests) with the following error:
diff = Curl_timediff(node->time, now); // node is 0
I expected the following
I expected that the application does not crash and that the sockets are reused so that the operating system does not report 300 WAITING sockets after the application has finished.
curl/libcurl version
7.57.0
operating system
Windows 7 Enterprise Service Pack 1