Skip to content

Conversation

@rwmjones
Copy link
Contributor

Calling caml_c_thread_unregister() will (amongst many other things) free the caml_thread_t signal_stack entry of the current thread. The OCaml runtime initializes the main thread using special code in caml_thread_domain_initialize_hook, but this leaves the signal_stack entry uninitialized.

If you use glibc tunables to enable malloc checking, and especially use the malloc.perturb feature[1], then uninitialized areas of memory are filled with a repeating pattern. This causes the process to crash if you free an uninitialized pointer.

Thus any program which calls caml_c_thread_unregister on the main thread, and is using glibc malloc checking, will crash.

Fix this by initializing the signal_stack to NULL (since free(NULL) is permitted and does nothing).

This was found when debugging a problem in the nbdkit OCaml plugin[2].

[1] https://www.gnu.org/software/libc/manual/html_node/Memory-Allocation-Tunables.html
[2] https://discuss.ocaml.org/t/free-uninitalized-data-when-calling-caml-c-thread-unregister-on-the-main-thread/15198

Fixes: #13400

Copy link
Contributor

@gadmm gadmm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are not supposed to call caml_c_thread_unregister on the main thread. Nevertheless it's good to initialise the signal_stack field.

@gasche
Copy link
Member

gasche commented Aug 26, 2024

This also looks fine to me. Thanks!

The PR as-is targets 5.2, but I would prefer to merge in trunk directly. If you can change the branch, please do, otherwise I can close and push manually. I would then cherry-pick in the 5.3 branch anyway.

I think that this would deserve a Changes entry as it fixes a user-facing issue, and to credit your external contribution.

@rwmjones
Copy link
Contributor Author

Hang on, I'll try to retarget it ...

@rwmjones rwmjones changed the base branch from 5.2 to trunk August 26, 2024 12:43
@gadmm
Copy link
Contributor

gadmm commented Aug 26, 2024

Please add a Changes entry but

as it fixes a user-facing issue

while it is good to initialize memory, is it technically a bug to free a block whose memory was (partly) left uninitialized?

@gasche
Copy link
Member

gasche commented Aug 26, 2024

while it is good to initialize memory, is it technically a bug to free a block whose memory was (partly) left uninitialized?

My understanding is that the bug reported is not that we free a caml_thread_t that is partially uninitialized, but that we call free on the signal_stack field (through the call to caml_free_signal_stack in thread_detach_from_runtime) that is unitialized.

If I read the code correctly, the code will still fail now that signal_stack is a proper NULL, so arguably we are turning an obscure UB bug in a clean failure (rather than fixing a bug per se). Anyway.

@rwmjones
Copy link
Contributor Author

rwmjones commented Aug 26, 2024

My understanding is that the bug reported is not that we free a caml_thread_t that is partially uninitialized, but that we call free on the signal_stack field (through the call to caml_free_signal_stack in thread_detach_from_runtime) that is unitialized.

This is correct, you call free on an uninitialized pointer (when glibc tunables are used to pollute uninitialized memory). From the stack trace notice the pointer is 0xd5d5... where 0xd5 is the complement of malloc.perturb=42:

#7  free_check (mem=mem@entry=0xd5d5d5d5d5d5d5d5)
    at /usr/src/debug/glibc-2.40-3.fc41.x86_64/malloc/malloc-check.c:228
#8  0x00007ffff7fa0111 in free_check (mem=0xd5d5d5d5d5d5d5d5)
    at /usr/src/debug/glibc-2.40-3.fc41.x86_64/malloc/malloc-check.c:215
#9  __debug_free (mem=0xd5d5d5d5d5d5d5d5) at malloc-debug.c:208

Note free(NULL) is a no-op, according to POSIX.

review by Guillaume Munch-Maccagnoni and Gabriel Scherer

Will modify in a moment.

Calling caml_c_thread_unregister() will (amongst many other things)
free the caml_thread_t signal_stack entry of the current thread.  The
OCaml runtime initializes the main thread using special code in
caml_thread_domain_initialize_hook, but this leaves the signal_stack
entry uninitialized.

If you use glibc tunables to enable malloc checking, and especially
use the malloc.perturb feature[1], then uninitialized areas of memory
are filled with a repeating pattern.  This causes the process to crash
if you free an uninitialized pointer.

Thus any program which calls caml_c_thread_unregister on the main
thread, and is using glibc malloc checking, will crash.

Fix this by initializing the signal_stack to NULL (since free(NULL) is
permitted and does nothing).

This was found when debugging a problem in the nbdkit OCaml plugin[2].

[1] https://www.gnu.org/software/libc/manual/html_node/Memory-Allocation-Tunables.html
[2] https://discuss.ocaml.org/t/free-uninitalized-data-when-calling-caml-c-thread-unregister-on-the-main-thread/15198

Fixes: ocaml#13400
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
@gasche
Copy link
Member

gasche commented Aug 26, 2024

Note free(NULL) is a no-op, according to POSIX.

My impression is that the code of caml_free_signal_stack is not designed to support this and will do weird things in this case -- in particular there is an assertion that the signal stack is not NULL, so this use-case will abort on assert failure in debug mode. If you wanted to ensure that this use-case is supported, I guess you should add an early exit in caml_free_signal_stack on NULL inputs, or check this in the caller.

@rwmjones
Copy link
Contributor Author

rwmjones commented Aug 26, 2024

Note free(NULL) is a no-op, according to POSIX.

My impression is that the code of caml_free_signal_stack is not designed to support this and will do weird things in this case -- in particular there is an assertion that the signal stack is not NULL, so this use-case will abort on assert failure in debug mode.

The plan to fix this in nbdkit is to not call caml_c_thread_* at all on the main thread. We didn't know how to do this before as there's no "this is the main thread" or "this thread is already registered" API in OCaml, but @gadmm pointed out that it could be done by keeping our own thread-local flag.

If you wanted to ensure that this use-case is supported, I guess you should add an early exit in caml_free_signal_stack on NULL inputs, or check this in the caller.

Shouldn't caml_c_thread_* functions ignore the case when you're trying to register a thread that is already known [actually it does this already] or unregister one of these "main threads"? IDK, anyway the patch fixes the crash and the fix above should resolve it properly.

@gadmm
Copy link
Contributor

gadmm commented Aug 26, 2024

while it is good to initialize memory, is it technically a bug to free a block whose memory was (partly) left uninitialized?

My understanding is that the bug reported is not that we free a caml_thread_t that is partially uninitialized, but that we call free on the signal_stack field (through the call to caml_free_signal_stack in thread_detach_from_runtime) that is unitialized.

This does not sound like a user-facing issue in this sense either; per the documentation caml_c_thread_unregister should only be called after a succesful caml_c_thread_register, and it is out of reach to make the C FFI entirely memory-safe in the face of all possible incorrect use (unfortunately).

@gasche
Copy link
Member

gasche commented Aug 26, 2024

Anyway, I think the current change is good -- even if indeed the users should maybe rethink their implementation.

(I see that over at #13400 @gadmm has provided useful feedback on how to write the code differently. If you do happen to come up with additions to the current API to improve @rwmjones' use-case, I would be happy to consider implementing them or reviewing their implementation. For now it look like a purely user-side solution could be reasonable, so maybe there is no need.)

@gasche gasche merged commit 760ae23 into ocaml:trunk Aug 26, 2024
gasche added a commit that referenced this pull request Aug 26, 2024
otherlibs/systhreads/st_stubs.c: Avoid free of uninitialized pointer

(cherry picked from commit 760ae23)
shasheene pushed a commit to rescuezilla/nbdkit that referenced this pull request Aug 1, 2025
Thanks: Guillaume Munch-Maccagnoni and Gabriel Scherer
Link: ocaml/ocaml#13400
Link: ocaml/ocaml#13401
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Uninitialized free in caml_c_thread_unregister if you call it from the main thread

3 participants