Documentation suggests calling functions at incorrect times #586

Closed
marc-groundctl opened this Issue Dec 30, 2015 · 9 comments

Projects

None yet

5 participants

@marc-groundctl

In the libcurl documentation (https://github.com/bagder/curl/blob/4b96240d3fa589babfcab573d0806cb3b4ebfdb4/docs/libcurl/libcurl.3#L189) one of the recommended ways to call curl_global_init is in the constructor of a global variable. However, on Windows this may not work properly because curl_global_init calls WSAStartup, and WSAStartup is not designed to be called while other dlls are being loaded (https://msdn.microsoft.com/en-us/library/windows/desktop/ms742213(v=vs.85).aspx). Please remove this recommendation from the documentation.

@bagder
Member
bagder commented Jan 4, 2016

The curl_global_init() function is already documented to not be thread-safe and it states "You must not call it when any other thread in the program (i.e. a thread sharing the same memory) is running." I think that is clear enough.

Your objection because of a subtlety on Windows doesn't seem like a strong enough reason to not provide this suggestion, to me.

@marc-groundctl

This isn't about thread safety though. This happens when dlls are loaded, before main, when no other threads are running.

@dfandrich
Collaborator
@jay
Member
jay commented Jan 4, 2016

I propose something like this fix where we add to libcurl.3 and curl_global_init.3:

If you are initializing libcurl from a Windows DLL you should not initialize
it from DllMain or a static initializer in the DLL because Windows holds the
loader lock during that time and it could cause a deadlock.

TL;DR: Everything below is my long take on Windows DLLs and threads.


This happens when dlls are loaded, before main, when no other threads are running.

Not necessarily. This is actually a complicated issue. Other threads may be created depending on the DLLs loaded. One DLL may depend on others and that will determine the ordering. You may have an anti-virus client or something injecting threads as well. Also a DLL may be loaded at distinctly different times: load-time (ie you start the program and it loads the DLL before main) or run-time (ie a program or one of its already loaded DLLs could have code to call LoadLibrary).

The loader lock is re-entrant. But, as I just explained, just because it's the same thread it may not be fine. There is code beyond your control that may spawn other threads. This includes Microsoft code, though I have to admit they usually do a good job of managing for this scenario. But not always, and for exactly that reason they warn what you can do in DllMain is extremely limited.

So how can you be sure that at any point in your program, including before main, that there will be no other threads? Well, unfortunately you can't (within reason). It doesn't matter whether or not you compiled your program to use the multithreaded CRT. You may observe when you are testing your program that there are no other threads at a certain point but, as I just explained, that doesn't mean as much as you think. You could suspend all running threads but that may cause other problems in your program depending on the code that's running. Again Microsoft code is mostly accounting for stuff like this but 3rd party code who knows.

Basically, though the documentation for curl_global_init states that you must not call it when any other thread is running, in Windows you cannot expect that at point n it's certain no other thread will be running. You can suspend threads and do magic but even then it's still not 100%. I'd guess though that programmers not aware of this interpret the documentation to mean call at the beginning of their program before they knowingly create any other threads.

To be specific to your point on Winsock, I assume you are referring to this:

The WSAStartup function typically leads to protocol-specific helper DLLs being loaded. As a result, the WSAStartup function should not be called from the DllMain function in a application DLL. This can potentially cause deadlocks. For more information, please see the DLL Main Function.

The proper way for DLL initialization to take place is to have an initialization function in your DLL. This would be in our case 'curl_global_init'. But what if you have program foo which load-time links to your DLL bar which load-time links to DLL libcurl? The proper way is the program calls bar_init which calls curl_global_init, at the very beginning of the program.

If you are injecting your DLL or your DLL does not have a called initialization routine and you cannot change the program then initializing is harder. You can wait until the loader lock is released by, ironically, creating a thread from DllMain that waits for the loader lock to be released or in the case of injection where a suspended thread will be resumed you can queue a user apc call.

You are correct initializing libcurl via static initializer may not work properly because in the case of a static initializer in a DLL the loader lock is held during that time. Therefore I think we should make a note of this.

@bagder
Member
bagder commented Jan 4, 2016

I'm a 👍 on that suggestion!

@bagder
Member
bagder commented Jan 4, 2016

And man, thanks @jay for that totally unbelievably detailed description and elaborate documentation on the procedure.

@jay jay added a commit that referenced this issue Jan 4, 2016
@jay jay curl_global_init.3: Add Windows-specific info for init via DLL
- Add to both curl_global_init.3 and libcurl.3 the caveat for Windows
that initializing libcurl via a DLL's DllMain or static initializer
could cause a deadlock.

Bug: #586
Reported-by: marc-groundctl@users.noreply.github.com
19ca401
@jay
Member
jay commented Jan 4, 2016

No problem! Landed in 19ca401.

@jay jay closed this Jan 4, 2016
@vasild
vasild commented Oct 2, 2016

Hello,

From the CURL documentation and from this thread it remains unclear how to call curl_global_init() on Windows from a shared library module (DLL) that is to be used by other modules.

So, how to handle that case?

@jay
Member
jay commented Oct 2, 2016 edited

From the CURL documentation and from this thread it remains unclear how to call curl_global_init() on Windows from a shared library module (DLL) that is to be used by other modules.

Your DLL should have an initialization routine and that's where you'd call curl_global_init, the same as you would any of your other initialization functions. Many functions should not be called from DllMain or a static initializer, that is noted in MS docs and for the issue reported in this thread I added the same warning to curl_global_init doc.

In your case these things are especially notable:

curl_global_init is documented that it should be called when no other threads are running. In practice on Windows that's really unlikely in your case. That warning is there because curl_global_init cannot be called concurrently and is allowed to call other library initialization routines that are not thread safe or cannot be called more than once. Some libraries init calls are thread safe like Windows general init (winsock), SSPI init, WinSSL init, OpenSSL 1.1.0 init (if built with thread support). Some are not (which? I don't know, you'd have to review the libraries your libcurl is linked to). Some you may get away with it like OpenSSL <= 1.0.2 as long as it's not initialized twice, regardless of whether it's done concurrently. I'm hesitant to note any of this in the docs ("Hey, you may get away with it. Let's see what happens!") because I don't want to create any false assurances, it is very easy to get wrong.

And another issue is if your module could be dynamically unloaded that is not supported at the moment.

In the future please ask questions on one of the mailing lists or if you believe you have an issue that should be corrected then file it as a separate issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment