Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sporadic crashes in curl_multi_remove_handle #627

Closed
Rider-Linden opened this issue Feb 1, 2016 · 6 comments
Closed

Sporadic crashes in curl_multi_remove_handle #627

Rider-Linden opened this issue Feb 1, 2016 · 6 comments
Assignees

Comments

@Rider-Linden
Copy link
Contributor

I'm encountering a sporadic crash in curl_multi_remove_handle, it appears to be an interaction with timeouts and pipelining very much like what is described in this thread: http://curl.haxx.se/mail/lib-2011-09/0305.html

Unfortunately, the conversation appears to trail off right after "A-ha! Pipelining!". Was any resolution ever reached?

For the record we are seeing this in 7.42.1 and my call stack looks like:

[0] Curl_getoff_all_pipelines [secondlife-bin url.c]
[1] curl_multi_remove_handle [secondlife-bin multi.c]
[2] LLCore::HttpLibcurl::completeRequest(void *,void *,CURLcode) [secondlife-bin _httplibcurl.cpp]
[3] LLCore::HttpLibcurl::processTransport() [secondlife-bin _httplibcurl.cpp]
[4] LLCore::HttpService::threadRun(LLCoreInt::HttpThread *) [secondlife-bin _httpservice.cpp]
[5] boost::detail::function::function_obj_invoker1,boost::_bi::list2,boost::arg<1> > >,LLVisualParam *,int>::invoke(boost::detail::function::function_buffer &,int) [secondlife-bin function_template.hpp]
[6] LLCoreInt::HttpThread::run() [secondlife-bin _thread.h]
[7] boost::detail::thread_data >::run() [secondlife-bin thread.hpp]

I have not been able to get a local reproduction yet... I'm only seeing this as reports from the wild.

@bagder bagder self-assigned this Feb 1, 2016
@bagder
Copy link
Member

bagder commented Feb 1, 2016

I can't recall any specific fix and I can't find any in the git log now either, so I would probably say that we didn't get anywhere on that.

So you are using pipelining too?

@Rider-Linden
Copy link
Contributor Author

Yes, We're using pipelining primarily for resource retrieval from our CDN (Textures, Meshes, etc.)

Reconstructing what I can from the user logs associated with this crash I'm seeing a number of failed or timedout pulls from the CDN, so I realize that I am making an assumption here.

@bagder
Copy link
Member

bagder commented Feb 6, 2016

I guess you can't reproduce it? Can you get more debugging info in the stack trace?

@bagder
Copy link
Member

bagder commented Feb 16, 2016

Pipelining is not a very widely used feature so it wouldn't surprise me if there are a few bugs lurking in those corners. HTTP/2 is a much better pipelining so you might want to try to move over to that.

Without any steps to reproduce or further details on this problem there's a not a lot we can do. I'm closing this, but if more or further detail will arise we can always reopen or create a new bug .

@bagder bagder closed this as completed Feb 16, 2016
@Rider-Linden
Copy link
Contributor Author

Sorry... I just saw your email...
I was finally able to reproduce and track down what was going on. It appears that this was a regression from an old bug referred to as #1420. On a slow connection with a large number of pipelined requests some of the requests would time out. After the timeout they would be marked for deletion. However, url.c IsPipeliningPossible and bundle checking loop in ConnectionExists would potentially reuse these doomed connections.

There were a number of patches from Carlo Wood from several years ago that addressed the issue, but it would seem that his changes to url.c were lost in the shuffle. I've reintroduced the checks in url.c and will submit a pull request.

Rider-Linden added a commit to Rider-Linden/curl that referenced this issue Feb 19, 2016
…tion for pipelining. Restores checks by Carlo Wood.
@bagder
Copy link
Member

bagder commented Feb 19, 2016

Lovely, looking forward to that. Yes, Carlo Wood did a lot of good work but we never got a completed patch set and then he vanished.

nopjmp pushed a commit to nopjmp/curl that referenced this issue Oct 14, 2016
…tion for pipelining. Restores checks by Carlo Wood.
jay pushed a commit that referenced this issue Oct 14, 2016
No longer attempt to use "doomed" to-be-closed connections when
pipelining. Prior to this change connections marked for deletion (e.g.
timeout) would be erroneously used, resulting in sporadic crashes.

As originally reported and fixed by Carlo Wood (origin unknown).

Bug: #627
Reported-by: Rider Linden

Closes #1075
Participation-by: nopjmp@users.noreply.github.com
@lock lock bot locked as resolved and limited conversation to collaborators May 7, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

No branches or pull requests

2 participants