Original bug ID: 6554
Reporter: @diml
Assigned to: @mshinwell
Status: closed (set by @xavierleroy on 2016-12-07T10:34:42Z)
Resolution: fixed
Priority: normal
Severity: major
Version: 4.02.0
Target version: 4.02.1+dev
Fixed in version: 4.02.1+dev
Category: runtime system and C interface
Monitored by: @gasche @yakobowski
We were getting random segfaults in one of our systems. After some investigation, it turns out to be due to a race condition in caml_get_raw_backtrace:
caml_alloc might run a minor collection. The minor collection might run finalisers, which might raise and catch exceptions, modifying the current backtrace. If [caml_backtrace_pos] ends up smaller because of this, the end of [res] is never written and contains garbage.
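The race described above can be illustrated with a toy model in plain OCaml. This is only a sketch: every name in it (`backtrace_buffer`, `backtrace_pos`, `record`, `alloc_with_finaliser`, `get_racy`, `get_snapshot`) is hypothetical and merely mirrors the shape of the runtime state, and the value -1 stands in for uninitialised memory. It is not the runtime's code, nor necessarily the fix that was committed.

```ocaml
(* Toy model only: hypothetical stand-ins for the runtime's global
   backtrace buffer and its current length. *)
let backtrace_buffer = Array.make 8 0
let backtrace_pos = ref 0

(* Pretend an exception was raised: store [n] slots starting at [base]. *)
let record ~base n =
  for i = 0 to n - 1 do
    backtrace_buffer.(i) <- base + i
  done;
  backtrace_pos := n

(* Stand-in for an allocation that triggers a minor collection. The
   collection runs a finaliser that raises and catches an exception,
   which overwrites the global backtrace with a shorter one. The result
   is filled with -1 to represent uninitialised memory. *)
let alloc_with_finaliser size =
  record ~base:900 1;
  Array.make size (-1)

(* Racy shape: the size is read before allocating, but the copy uses the
   position as it is after the allocation. *)
let get_racy () =
  let res = alloc_with_finaliser !backtrace_pos in
  for i = 0 to !backtrace_pos - 1 do
    res.(i) <- backtrace_buffer.(i)
  done;
  res

(* One possible remedy: snapshot the global state into a local copy
   before allocating, then fill the result from the snapshot. *)
let get_snapshot () =
  let saved = Array.sub backtrace_buffer 0 !backtrace_pos in
  let res = alloc_with_finaliser (Array.length saved) in
  Array.blit saved 0 res 0 (Array.length saved);
  res
```

In this model, after `record ~base:100 3`, `get_racy ()` returns an array whose first slot comes from the finaliser's exception and whose last two slots were never written, whereas `get_snapshot ()` returns the three slots that were recorded originally.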
We'll push a fix today to at least avoid the segfault. This is still not completely satisfactory, as it shows again that when you ask for a backtrace, you might get a completely random one.
Here is a program that reproduces the bug, to be compiled with 'ocamlopt -g -inline 0':
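    let () = Printexc.record_backtrace true

    (* The finaliser raises and catches an exception, overwriting the backtrace. *)
    let finaliser _ = try raise Exit with _ -> ()

    let create () =
      let x = ref () in
      Gc.finalise finaliser x;
      x

    let f () = raise Exit

    let () =
      let minor_size = (Gc.get ()).Gc.minor_heap_size in
      while true do
        try
          ignore (create () : unit ref);
          f ()
        with _ ->
          (* Fill most of the minor heap so that the allocation made while
             fetching the backtrace triggers a minor collection. *)
          for i = 1 to minor_size / 2 - 1 do
            ignore (ref ())
          done;
          ignore (Printexc.get_backtrace () : string)
      done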
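On the remark that a backtrace can be overwritten before it is read: a common way to narrow that window is to capture the raw backtrace as the first action in the handler. The sketch below assumes a hypothetical helper named `protect_with_backtrace`. It uses `Printexc.get_raw_backtrace` and `Printexc.raw_backtrace_to_string` from the standard library. It only reduces the exposure and is not a guarantee, since fetching the backtrace itself allocates, which is exactly the situation this report is about.

```ocaml
let () = Printexc.record_backtrace true

(* Hypothetical helper: run [f] and, on failure, capture the raw backtrace
   as the very first action of the handler, before other code can raise. *)
let protect_with_backtrace f =
  try Ok (f ())
  with exn ->
    let bt = Printexc.get_raw_backtrace () in
    Error (exn, bt)

let () =
  match protect_with_backtrace (fun () -> raise Exit) with
  | Ok () -> ()
  | Error (exn, bt) ->
    (* Later code may raise and catch other exceptions; the captured
       value [bt] is not modified by them. *)
    (try raise Not_found with Not_found -> ());
    prerr_endline (Printexc.to_string exn);
    prerr_string (Printexc.raw_backtrace_to_string bt)
```

Holding on to the captured value rather than calling `Printexc.get_backtrace` later means that exceptions raised in between no longer affect what gets printed.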
Sorry, I read your comment and Jeremy's in the wrong order (i.e. I understood that your commit didn't fix the problem), and I looked only at the last sentence of your bug report, not at the code itself.
Just to be sure: since you are saving the data on the stack, I assume that this function is not called from the top of the stack when a Stack_overflow is raised, but only after the stack has already been unwound?
I believe we only end up in this function if the user asks for the backtrace; in particular, this isn't the function that stashes the backtrace. As such I think we should be ok if stack space is tight when the exception actually occurs.