fix deadlock when creating a thread while executing DllMain in another thread #86
Conversation
I think we should rather fix this by calling dll_thread_attach and |
These are different use cases. The deadlock appears in a DLL that does I don't think that you can move dll_attach/detach_* elsewhere, it is the |
The idea is to use DLLMain only to register the library and then The host should either explicitly initialize D's runtime or if that is not A small subset will remain where you can't avoid it and we should provide Making DLLMain the default place for initialization is a mistake as it rules IIUC a solution to the above deadlock would be to do the initialization from the |
Avoiding the loader lock might work with an explicit initialization Both described workarounds are not feasible:
I am not sure we have to solve an issue that no other language has yet So I think, the patch solves a problem of the current runtime, but not |
I'm really breaking my head about a good DLL thread model, and while Windows
I'd argue this must be done explicitly, e.g. by providing an init_thread DLL_THREAD_ATTACH won't get called for already running threads For initialisation outside of DllMain As a plugin writer you still control the doors to your library, |
m_hndl = cast(HANDLE) _beginthreadex( null, m_sz, &thread_entryPoint, cast(void*) this, 0, &m_addr );
if( cast(size_t) m_hndl == 0 )
throw new ThreadException( "Error creating thread" );
ResumeThread( m_hndl );
Please check the return value.
Why is it safe to call ResumeThread?
Oops. I will add the return code check.
When I debugged it, it was CreateThread that acquired the loader lock, but not ResumeThread. So the lock hierarchy is fine with slock never acquired before the loader lock (the order is given by DllMain now).
dll_attach_process iterates over all existing threads of the process and initializes TLS by "impersonating" the thread (switching a pointer in the thread environment block). I'll have to think about how the initialization could work in general, but I'm not familiar with POSIX. I guess it has some init function to be called when the shared library is loaded, but probably none to get notified when a thread is created. |
As the pull request does fix the lock order we should go for it. |
I've rebased, resolved the conflict and added the return value check for |
Huh? Two identical (as for the actual diff) show now up in the history for me. |
I don't understand. What is duplicated for you? Maybe something related |
No, rebasing is not really an issue with pull requests. But you managed to merge the |
After the reset, I cannot push anymore (error: failed to push some refs). When I pull as the error message tells me, there is nothing to push. Arrrrggh, that's why I hate git, whenever you try to do something it gets in the way and steals all your time. BTW: the diff in github looks fine to me. |
Even simpler: |
Thanks, that might have done it. |
Yes, looks great now! As for the actual commit, though I'm not a Windows expert, I see the problem and the fix seems reasonable, so +1 from me. |
I've been reading about TLS on linux in this document from 2005: http://www.akkadia.org/drepper/tls.pdf Is the __tls_get_addr mechanism still used? If yes, a TLS initialisation callback could be added by replacing __tls_get_addr with a statically linked replacement implementation. |
I've dismissed that idea, because initialization of TLS is done lazily but the When loading libraries through core.runtime we should do it right away, which requires to traverse |
@dawgfoto Is this good to go? There's rumor on the street it blocks making druntime a shared lib in the future. Please advise, thanks! |
Which street? We're doing way too much in DllMain and we'll keep getting deadlocks (static initialization with the Windows loader lock held). |
I think the patch is vital for the DLL that contains the thread creation code, i.e. druntime.dll. I agree that there can be too much going on inside DllMain, but it isn't a good idea to put heavyweight code into thread-local static constructors to start with, as they slow down every thread created in the process. It's a bit more critical for shared static constructors, but I'd suggest lazy initialization for the win! Maybe it was intentional, but I think there was a merge error: the patch moved "add(this)" before ResumeThread (reasoning in the comment before), but it is below in the commit. |
It was intentional. I fixed a bug that was introduced by moving the add. If an exception were thrown on error you'd end up with a stale Thread entry. Because this is |
You are right, the order should be irrelevant because of the |
A recent bug report for Visual D exhibited a problem with thread creation
inside a DLL: if a thread is created while another thread is executing
DllMain(THREAD_ATTACH), a deadlock is almost inevitable:
The main thread, which creates the new thread, executes:
While the other thread executes:
The problem is: the OS serializes calls to CreateThread and DllMain, so the
threads hang at the marked lines.
The solution is to create the thread suspended without the slock, and just add and resume it
with the lock acquired.
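
A minimal sketch of that ordering, written against the plain Win32 API rather than the actual druntime code. The names registryLock, registry, worker and createRegistered are hypothetical stand-ins (registryLock plays the role of slock, registry the role of the runtime's thread list); only CreateThread, CREATE_SUSPENDED, ResumeThread, WaitForSingleObject and CloseHandle are real calls.

```d
import std.stdio : writeln;

version (Windows)
{
    import core.sys.windows.windows;
    import core.sync.mutex : Mutex;

    // Hypothetical stand-ins for the runtime's slock and its thread list.
    __gshared Mutex registryLock;
    __gshared HANDLE[] registry;

    // Trivial thread body, just enough for the demo.
    extern (Windows) DWORD worker(LPVOID) nothrow @nogc
    {
        return 0;
    }

    HANDLE createRegistered()
    {
        DWORD id;

        // Step 1: create the thread suspended while NOT holding registryLock.
        // If thread creation has to wait for the loader lock, it does so
        // without blocking anyone who needs registryLock.
        HANDLE h = CreateThread(null, 0, &worker, null, CREATE_SUSPENDED, &id);
        if (h is null)
            throw new Exception("Error creating thread");

        // Step 2: only now take the lock, then resume and register the thread.
        synchronized (registryLock)
        {
            // ResumeThread returns (DWORD)-1 on failure.
            if (ResumeThread(h) == cast(DWORD) -1)
            {
                CloseHandle(h);
                throw new Exception("Error resuming thread");
            }
            registry ~= h;
        }
        return h;
    }

    void main()
    {
        registryLock = new Mutex;
        HANDLE h = createRegistered();
        WaitForSingleObject(h, INFINITE); // wait for the worker to finish
        CloseHandle(h);
        writeln("registered threads: ", registry.length);
    }
}
else
{
    // The sketch is Windows-specific; elsewhere it only compiles to a stub.
    void main()
    {
        writeln("Windows-only sketch");
    }
}
```

With this ordering the registry lock is never held across thread creation, so a thread that is inside DllMain (and therefore holds the loader lock) can still take the registry lock and make progress.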