Fix a nasty bug in the bytecode VM initialisation#13549
Fix a nasty bug in the bytecode VM initialisation#13549stedolan wants to merge 3 commits intoocaml:trunkfrom
Conversation
The callback thunk code is now initialised once, not per-thread, avoiding leaking of code fragments.
|
This diff appears to solve indeed the bytecode issues (#13402 and probably others) for me. However, are you sure it is safe to use a single copy for the whole program, as two elements of |
|
Urgh, well spotted, I'd forgotten it actually mutates it after initialisation. I'll update this with a better fix. |
Instead of mutating bytecode, this version uses the same mechanism in native code and bytecode: separate code exists for 1, 2 and 3 arguments, and more arguments are handled by iterating callback3.
|
Just pushed a version that (hopefully) fixes the bug without introducing a new one. It removes the optimised caml_callbackN for bytecode, and instead uses fixed code sequences for caml_callback1/2/3, and iterates caml_callback3 for more arguments. (This is the mechanism used by native code callbacks). |
ghost
left a comment
There was a problem hiding this comment.
This is smart and looks correct to me. And it still appears to work.
|
This looks correct to me. However, I'm not an expert on this part of the runtime. Perhaps a review from someone who understands this better would be useful. @Octachron, we should consider this for 5.3. |
|
My impression is that @xavierleroy would be the right person to look at this. |
|
Thanks for pinging me. From a quick look, I agree we have a problem here, but I'm a bit sad about mirroring the native-code behavior (using "slices" of at most 3 arguments for callbacks) in the bytecode behavior. My feeling is that it's time to change a bit the API of |
|
Also: just exposing |
- Use the new `caml_bytecode_interpreter` API to jump straight to the function's code. - Avoid modifying bytecode in place. - Avoid per-thread bytecode. - Register bytecode early and only once. Fixes: ocaml#13512 Closes: ocaml#13549
|
That was easier than I thought. See #13553 for my alternate proposal. |
The patch
The bytecode VM uses a fixed 7-instruction sequence of bytecode to set up a call from C into OCaml. This bytecode sequence requires initialisation, to register the code fragment (and thread the bytecode, if not a debug build).
Currently, this code sequence is defined per-thread and initialised on first use. This patch changes it to be defined globally and initialised at startup. (The main thread's copy was already initialised at startup through
caml_init_callbacks. The change here is to make that the only copy).Using a code sequence per thread instead of one global one is somewhat less efficient but not by itself a bug. The problem is that the code fragments leaked, as they were not unregistered at thread termination. This causes #13512.
The bug
With the old code, the following sequence of unfortunate events can occur, if you are very, very unlucky (or you are @jmid with multicoretests, manufacturing bad luck on an industrial scale):
Domain.spawn f, which creates a new domain and usescaml_callback_resto invokef.caml_callback_resis called, the callback initialisation logic runs, creating a 28-byte code fragment. This fragment covers the 7-instruction callback sequence stored in thread-local storage.In other words: dangling code fragments that are not unregistered are not just a resource leak but a soundness issue. They leave blindspots in the address space that the GC cannot perceive and cause weird bugs later on.