FWIW I just ran into this (with various symptoms: application crashing in pthread_cancel unwinder on exit, segfault after fork when Lwt is built with libev but not when built without, or segfault after fork when using OpenSSL from Lwt even without libev): ocsigen/lwt#168
I created the testcase below before finding this bug (it is indeed derived from OCamlnet's Netsys_pollset_win32.ml), and I confirm that the patch fixes both the testcase and the segfaults in my application:
let x = ref false
let _ = Gc.create_alarm (fun () -> x := true)
let () =
  Gc.compact ();
  Gc.compact ()
(* ocamlc x.ml -runtime-variant d -o x && ./x
...
file finalise.c; line 163 ### Assertion failed: old == young *)
Original bug ID: 6919
Reporter: @ygrek
Status: closed (set by @damiendoligez on 2015-07-10T14:09:54Z)
Resolution: fixed
Priority: urgent
Severity: crash
Version: 4.02.2
Target version: 4.02.3+dev
Fixed in version: 4.02.3+dev
Category: runtime system and C interface
Tags: patch
Monitored by: @ygrek @dbuenzli @yakobowski
Bug description
We are seeing strange crashes in the GC after switching from 4.02.1 to 4.02.2. I don't have a small repro case yet, so I cannot rule out misbehaving C bindings, but the same code is stable with 4.02.1. Maybe you have a quick idea based on the symptoms.
My investigation led me to the following changeset:
444d6c2#diff-ff9cb580dcca5bf97a4e407aba803b81R260
As far as I understand, it changes the behaviour so that final_table offsets are no longer updated after every minor collection. I do not know whether that is an important invariant.
Here are the details of my issue, in case they are of any use:
It crashes when calling functions from final_table. In my case it is a Gc alarm registered by ocamlnet, but that alarm just sets one mutable variable, so it is not a suspect.
At the start of the program final_table looks alright, with one entry like this:
(gdb) ml_dump/r final_table 4
*0x1c3c300: Closure( camlGc__call_alarm_1056 , 0x3 )
*0x1c3c308: ( ( 1 ) , Closure( camlNetsys_win32__fun_2219 , 0x3 ) )
*0x1c3c310: NULL
*0x1c3c318: NULL
but at crash time it is obviously wrong:
(gdb) ml_dump final_table 4
*0x227e300: Closure( camlGc__call_alarm_1056 , 0x3 )
*0x227e308: u'Private_Dirty: 12 kB'
*0x227e310: NULL
*0x227e318: NULL
Instead of the "Private_Dirty" string it can be any OCaml value.
The stack trace looks like this:
(gdb) bt
#0 0x00000000005d8467 in camlGc__call_alarm_1056 () at gc.ml:87
#1 0x00000000006633ba in caml_start_program ()
#2 0x000000000065f3db in caml_gc_compaction ()
#3 0x00000000004acb71 in camlMemory__reclaim_s_1540 () at memory.ml:77
#4 0x00000000004acdc5 in camlMemory__reclaim_1555 () at memory.ml:92
When run with the debug runtime, it fails on the assertion at line 163 of byterun/finalise.c:
void caml_final_do_strong_roots (scanning_action f)
{
  uintnat i;
  struct to_do *todo;

  Assert (old == young);
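To make the failing invariant concrete, here is a toy model of a finalisation table with `old` and `young` indices. It is only a sketch and not the runtime's code. The struct, the `toy_*` names, the `minor_heap_used` field, and the idea that a minor collection might be skipped when the minor heap is empty are all assumptions for illustration, based only on the observation above that final_table offsets are no longer updated after every minor collection.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the OCaml runtime. */
#define TOY_TABLE_SIZE 8

struct toy_final_table {
  size_t old;             /* entries [0, old) are assumed already processed */
  size_t young;           /* entries [old, young) registered since last minor GC */
  size_t minor_heap_used; /* hypothetical count of words in the minor heap */
};

/* Register one finalisable value: only the young index moves. */
static int toy_register(struct toy_final_table *t)
{
  assert(t != NULL);
  if (t->young >= TOY_TABLE_SIZE) return -1; /* table full */
  t->young++;
  return 0;
}

/* Variant A: every minor collection reconciles the two indices. */
static void toy_minor_always(struct toy_final_table *t)
{
  t->minor_heap_used = 0;
  t->old = t->young;
}

/* Variant B (assumed reading of the changeset): all the work, including
   the reconciliation, is skipped when the minor heap is empty. */
static void toy_minor_skip_if_empty(struct toy_final_table *t)
{
  if (t->minor_heap_used != 0) {
    t->minor_heap_used = 0;
    t->old = t->young;
  }
}

/* The condition checked by the assertion quoted above. */
static int toy_invariant_holds(const struct toy_final_table *t)
{
  return t->old == t->young;
}
```

In this model, registering a value while `minor_heap_used` is zero and then running variant B leaves `old != young`, which is the state the debug runtime's assertion rejects. Whether this is what actually happens in 4.02.2 is a guess on my part.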
I would be very grateful for any pointers on how to debug this or on what further information to provide.
Steps to reproduce
None for now, but I can reproduce it locally in less than 5 minutes.