Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

asyn-thread: Don't block waiting on resolver threads during multi_done #4882

Closed
wants to merge 2 commits into from

Conversation

jay
Copy link
Member

@jay jay commented Feb 5, 2020

This commit changes the behavior of libcurl to no longer block on
incomplete resolve threads from the parent thread (ie the user's thread
which is the multi perform thread) during multi_done.

Instead it now orphans those threads in a master list which is culled
periodically by soon-to-be exiting orphans to wait on and destroy those
that are in the process of or have since exited, which is fast. On
global cleanup we wait on and destroy any remaining threads, which may
be slow but at that point we cannot defer it any longer.

Ideally we would not wait for orphaned threads at all, but we have to
because after global cleanup the user may choose to unload the shared
library that is/contains libcurl.

Note: thread_wait_resolv (Curl_resolver_wait_resolv) which blocks
waiting for a resolver thread remains unchanged, since that is
essentially the purpose of the function. It is currently called by FTP
and SOCKS code, and continues to be a known issue that those two may
block waiting for async resolves to finish.

Fixes #4852
Closes #xxxx

@jay jay added the name lookup DNS and related tech label Feb 5, 2020
lib/curl_threads.h Outdated Show resolved Hide resolved
@bagder
Copy link
Member

bagder commented Feb 5, 2020

I'm for the idea of removing the wait from multi_done and doing it at a later point.

I'm against adding more global stuff that isn't thread-safe (it goes against my long term plan on making a thread-safe curl_global_init). I think the benefit of doing the list globally instead of per multi-handle isn't worth the price. I vote for the list to be done in the multi handle instead and thus the possible wait would be in curl_multi_cleanup() instead.

@jay
Copy link
Member Author

jay commented Feb 5, 2020

I'm against adding more global stuff that isn't thread-safe (it goes against my long term plan on making a thread-safe curl_global_init). I think the benefit of doing the list globally instead of per multi-handle isn't worth the price. I vote for the list to be done in the multi handle instead and thus the possible wait would be in curl_multi_cleanup() instead.

Making global init thread-safe is possible by doing a compare-and-swap (CAS), as long as dependencies have thread-safe initialization as well, as discussed in #4064. Making it thread safe without any type of synchronization in libcurl strikes me as difficult and will create inefficiency as everything would have to be done per handle. That seems like a waste. Also it would eliminate a lot of the gain of this PR. I disagree with this vote.

This commit changes the behavior of libcurl to no longer block on
incomplete resolve threads from the parent thread (ie the user's thread
which is the multi perform thread) during multi_done.

Instead it now orphans those threads in a master list which is culled
periodically by soon-to-be exiting orphans to wait on and destroy those
that are in the process of or have since exited, which is fast. On
global cleanup we wait on and destroy any remaining threads, which may
be slow but at that point we cannot defer it any longer.

Ideally we would not wait for orphaned threads at all, but we have to
because after global cleanup the user may choose to unload the shared
library that is/contains libcurl.

Note: thread_wait_resolv (Curl_resolver_wait_resolv) which blocks
waiting for a resolver thread remains unchanged, since that is
essentially the purpose of the function. It is currently called by FTP
and SOCKS code, and continues to be a known issue that those two may
block waiting for async resolves to finish.

Fixes xxxxs://github.com/curl/issues/4852
Closes #xxxx
@bagder
Copy link
Member

bagder commented Feb 5, 2020

That seems like a waste. Also it would eliminate a lot of the gain of this PR. I disagree with this vote.

I think it is a trade-off that's worth taking. You seem to value this extra optimization to the detriment of those who use libcurl in plugin based systems and struggle with curl_global_init in its current design.

By storing the list in the multi handle, most applications probably won't see the issue until the moment just before they would call curl_global_cleanup anyway, and applications mostly do only one multi handle per thread.

By making it global, you very deliberately dismiss the users with the use cases where they have a hard time to call the init in a thread-safe way.

How do you determine which user category is more important or more frequent?

@jay
Copy link
Member Author

jay commented Feb 5, 2020

Yes, it's to their detriment because they are not using the API as documented. In some cases they just can't. As you know I am sympathetic but it's true I do favor those who are following the doc.

I agree curl_global_init should be thread-safe. I just don't see how we easily do that without synchronization, which is what it sounds like you are suggesting. Can you describe your long term plan for the global functions? (if it's already somewhere I missed it or forgot)

@bagder
Copy link
Member

bagder commented Feb 6, 2020

Yes, it's to their detriment because they are not using the API as documented. In some cases they just can't.

I'm talking about that second class of users.

Can you describe your long term plan for the global functions? (if it's already somewhere I missed it or forgot)

My wish is to eventually make curl_global_init thread-safe and I'm taking steps in that direction, one by one. Ideally, we'll end up with a situation where the function can become more or less a no-op function under the right circumstances. Possibly with an extra build option, possibly something else. I don't have everything worked out and I don't have a schedule for it.

We can however not reach that point if we add more stuff of our own to that function that isn't thread-safe.

@jay jay added the stale label Jul 22, 2020
@jay
Copy link
Member Author

jay commented Jul 22, 2020

Stale, we couldn't reach an agreement.

@jay jay closed this Jul 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
name lookup DNS and related tech stale
Development

Successfully merging this pull request may close these issues.

Multi perform hang waiting for threaded resolver
3 participants