Fix Issue 21784: Allow detached threads to be joined #3418
Conversation
Thanks for your pull request and interest in making D better, @omerfirmak! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment. Bugzilla references
Testing this PR locallyIf you don't have a local development environment setup, you can use Digger to test this PR: dub run digger -- build "stable + druntime#3418" |
Does this have a bugzilla issue number? Also a test would be nice. |
I don't think there is one. |
Please create one and title the commit message "Fix issue #####: Description of issue" |
This is for the thread constructor issue ? Could you write a test too ? https://github.com/dlang/druntime/tree/master/test/thread/src |
All done |
Now plz target stable and we're all good. |
ooh, stable good catch. |
This is useful when a thread terminates before the caller is able to join().
Ready |
@@ -299,7 +299,8 @@ class Thread : ThreadBase | |||
} | |||
else version (Posix) | |||
{ | |||
pthread_detach( m_addr ); | |||
if (m_addr != m_addr.init) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to apply the same fix for version (Windows)
above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a windows machine to test with but if CloseHandle
crashes with m_hndl.init
as it's argument, we probably should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If test is correct, then you should 'seen failures on the Windows CI pipelines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is fine then, AFAICS it should be triggering a CloseHandle
call with m_hndl.init
PR dlang#3418 allowed them on Posix; extend it to Windows too. The added test/thread/src/join_detach.d test now passes on Windows (tested by LDC).
PR #3418 allowed them on Posix; extend it to Windows too. The added test/thread/src/join_detach.d test now passes on Windows (tested by LDC).
th.start(); | ||
sem.wait(); | ||
while (th.isRunning()) {} | ||
destroy(th); // force detach |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This triggers a spurious CI failure, e.g.:
*** Error in `./generated/linux/release/32/join_detach': double free or corruption (fasttop): 0xf6b004b8 ***
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(+0x67377)[0xf7612377]
/lib/i386-linux-gnu/libc.so.6(+0x6d2f7)[0xf76182f7]
/lib/i386-linux-gnu/libc.so.6(+0x6dc31)[0xf7618c31]
./generated/linux/release/32/join_detach(_D4core8internal9container6common8xreallocFNbNiPvkZQe+0x1e)[0x807d17e]
./generated/linux/release/32/join_detach(_D2rt5tlsgc7destroyFNbNiPvZv+0x8c)[0x807a93c]
./generated/linux/release/32/join_detach(_D4core6thread8osthread6Thread6__dtorMFNbNiZv+0x19)[0x8077f21]
./generated/linux/release/32/join_detach(rt_finalize2+0x58)[0x807fe88]
./generated/linux/release/32/join_detach(rt_finalize+0x1e)[0x8079d4e]
./generated/linux/release/32/join_detach(_D6object__T7destroyVbi1TC4core6thread8osthread6ThreadZQBrFNbQBkZv+0x1b)[0x807797f]
./generated/linux/release/32/join_detach(_Dmain+0x74)[0x8077928]
./generated/linux/release/32/join_detach(_D2rt6dmain212_d_run_main2UAAakPUQgZiZ6runAllMFZ9__lambda2MFZv+0x21)[0x8078f9d]
./generated/linux/release/32/join_detach(_D2rt6dmain212_d_run_main2UAAakPUQgZiZ7tryExecMFMDFZvZv+0x15)[0x8078e39]
./generated/linux/release/32/join_detach(_D2rt6dmain212_d_run_main2UAAakPUQgZiZ6runAllMFZv+0x8a)[0x8078f1a]
./generated/linux/release/32/join_detach(_D2rt6dmain212_d_run_main2UAAakPUQgZiZ7tryExecMFMDFZvZv+0x15)[0x8078e39]
./generated/linux/release/32/join_detach(_d_run_main2+0x193)[0x8078dc3]
./generated/linux/release/32/join_detach(_d_run_main+0x95)[0x8078c15]
./generated/linux/release/32/join_detach(main+0x1f)[0x807795f]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf7)[0xf75c3637]
./generated/linux/release/32/join_detach[0x80777c1]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be a race condition between the main and spawned thread both destroying the TLS. The current implementation has some measures to prevent double destruction, but those are not used in this case (and also not thread safe AFAICT).
Hence the explicit note in the docs:
[...] As thread manipulation is a required facility for garbage collection, all user threads should derive from this class, and instances of this class should never be explicitly deleted. [...]
This is useful when a thread terminates before the
caller is able to join(). With a reset
m_addr
, caller would segfault.