Skip to content
This repository has been archived by the owner on Oct 12, 2022. It is now read-only.

sections_elf_shared: Do not access TLS of dead thread in finiTLSRanges() #1655

Merged
merged 1 commit into from
Sep 16, 2016

Conversation

dnadlinger
Copy link
Member

finiTLSRanges() is called from the destructor of core.Thread. At this point,
the OS thread has already ceased to exist, so what was formerly a pointer
to _loadedDSOs is no longer valid.

In other words, tdsos.reset() was a use-after-free bug. It is unclear why
the issue didn't surface on Linux/FreeBSD yet; for example, glibc might not
actually re-use the TLS address range after a thread exits. On OS X, however,
this did quite frequently trigger a crash when running the Phobos unit tests,
since tdsos would have already been overwritten with unrelated contents.

@MartinNowak

finiTLSRanges() is called from the destructor of core.Thread. At this point,
the OS thread has already ceased to exist, so what was formerly a pointer
to _loadedDSOs is no longer valid.

In other words, `tdsos.reset()` was a use-after-free bug. It is unclear why
the issue didn't surface on Linux/FreeBSD yet; for example, glibc might not
actually re-use the TLS address range after a thread exits. On OS X, however,
this did quite frequently trigger a crash when running the Phobos unit tests,
since `tdsos` would have already been overwritten with unrelated contents.
@WalterBright
Copy link
Member

Fixing random crashes is always good!

@WalterBright
Copy link
Member

Auto-merge toggled on

@WalterBright WalterBright merged commit d0d8c6f into dlang:master Sep 16, 2016
@dnadlinger dnadlinger deleted the thread-dsos-use-after-free branch September 17, 2016 19:13
@dnadlinger
Copy link
Member Author

@WalterBright: To be precise, this doesn't fix any random crashes yet, at least none that I could trigger on Linux. It might be that the TLS areas are only freed on pthread_detach there, in which case the code would actually be fine (although still redundant).

The fix is very relevant for shared libraries on OS X, though, which still need backend support work in DMD.

@MartinNowak
Copy link
Member

MartinNowak commented Sep 20, 2016

Your analysis is correct, finiTLSRanges gets called from the Thread finalizer (~this()) through rt.tlsgc.destroy, i.e. after the thread might have exited.
The fix is a bit wonky though. The underlying problem is that we use a TLS array and store a reference to that outside of the thread (so that it can get GCed). What we should do instead is storing an array outside of the thread (maybe in the Thread object?) and a reference to that in TLS, then the design in rt.tlsgc should work.

@pjotrp
Copy link

pjotrp commented Feb 12, 2017

With ldc 1.1.0 (and earlier) it turns out finiTLSRanges() does crash randomly in sambamba on Linux. Both in the original version and the later fixed version of druntime as is discussed in biod/sambamba#219.

When I removed the line in the static version of the library with this patch https://github.com/genenetwork/guix-bioinformatics/blob/master/ldc-druntime-finiTLSRanges.patch the segfault no longer happens. We are still running tests, but so far it looks good.

@dnadlinger
Copy link
Member Author

@pjotrp: It seems like your patch should avoid the crash, but introduces a small memory leak on every thread exit. We are hitting what seems to be at least a very similar issue – again on OS X – in ldc-developers/ldc#2006, so this should be properly fixed. The easiest fix is probably to just allocate the Array on the C heap, although there might be a nicer way around it with a small redesign.

dnadlinger added a commit to ldc-developers/druntime that referenced this pull request Mar 18, 2017
…n-version(Shared)

See dlang#1655 for a discussion of the analogous issue for version(Shared).
kinke pushed a commit to ldc-developers/druntime that referenced this pull request Mar 18, 2017
…n-version(Shared)

See dlang#1655 for a discussion of the analogous issue for version(Shared).
omerfirmak added a commit to omerfirmak/druntime that referenced this pull request Dec 9, 2020
Similar to the issue fixed here dlang#1655 (comment) ,
static version of the sections_elf_shared accesses TLS of a dead thread. The simplest fix is to allocate
_tlsRanges somewhere that would persist outside of the thread scope, for example the C heap.

This was encountered while porting ldc 1.24.0 to Alpine Linux which uses Musl.
https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/14364 But this issue should
be affecting other libc implementations as well.
omerfirmak added a commit to omerfirmak/druntime that referenced this pull request Dec 9, 2020
Similar to the issue fixed here dlang#1655 (comment) ,
static version of the sections_elf_shared accesses TLS of a dead thread. The simplest fix is to allocate
_tlsRanges somewhere that would persist outside of the thread scope, for example the C heap.

This was encountered while porting ldc 1.24.0 to Alpine Linux which uses Musl.
https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/14364 But this issue should
be affecting other libc implementations as well.
omerfirmak added a commit to omerfirmak/druntime that referenced this pull request Dec 12, 2020
Similar to the issue fixed here dlang#1655 (comment) ,
static version of the sections_elf_shared accesses TLS of a dead thread. The simplest fix is to allocate
_tlsRanges somewhere that would persist outside of the thread scope, for example the C heap.

This was encountered while porting ldc 1.24.0 to Alpine Linux which uses Musl.
https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/14364 But this issue should
be affecting other libc implementations as well.
omerfirmak added a commit to omerfirmak/druntime that referenced this pull request Dec 12, 2020
Similar to the issue fixed here dlang#1655 (comment) ,
static version of the sections_elf_shared accesses TLS of a dead thread. The simplest fix is to allocate
_tlsRanges somewhere that would persist outside of the thread scope, for example the C heap.

This was encountered while porting ldc 1.24.0 to Alpine Linux which uses Musl.
https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/14364 But this issue should
be affecting other libc implementations as well.
omerfirmak added a commit to omerfirmak/druntime that referenced this pull request Dec 12, 2020
Similar to the issue fixed here dlang#1655 (comment) ,
static version of the sections_elf_shared accesses TLS of a dead thread. The simplest fix is to allocate
_tlsRanges somewhere that would persist outside of the thread scope, for example the C heap.

This was encountered while porting ldc 1.24.0 to Alpine Linux which uses Musl.
https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/14364 But this issue should
be affecting other libc implementations as well.
dlang-bot pushed a commit that referenced this pull request Dec 13, 2020
Similar to the issue fixed here #1655 (comment) ,
static version of the sections_elf_shared accesses TLS of a dead thread. The simplest fix is to allocate
_tlsRanges somewhere that would persist outside of the thread scope, for example the C heap.

This was encountered while porting ldc 1.24.0 to Alpine Linux which uses Musl.
https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/14364 But this issue should
be affecting other libc implementations as well.
n8sh pushed a commit to n8sh/druntime that referenced this pull request Mar 3, 2021
Similar to the issue fixed here dlang#1655 (comment) ,
static version of the sections_elf_shared accesses TLS of a dead thread. The simplest fix is to allocate
_tlsRanges somewhere that would persist outside of the thread scope, for example the C heap.

This was encountered while porting ldc 1.24.0 to Alpine Linux which uses Musl.
https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/14364 But this issue should
be affecting other libc implementations as well.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants