Add CFI_SIGNAL_FRAME to ARM64 and RiscV runtimes. #13241
|
There are some aspects of the code, only semi-related to your changes, that I find confusing.
I am not sure if before-the-label or after-the-label is the right choice and would welcome confirmation (this might also deserve a comment in the source), but I would expect all backends to use the same strategy for each function. |
The cfi annotation should be put after the label. |
CFI_SIGNAL_FRAME is required for GDB to correctly display backtraces through stack swaps.
|
@dustanddreams are you satisfied with the changes as they stand?
The backends should be consistent with each other, except on amd64, where an optimisation allows tail-calling into the caml_call_gc function (@dustanddreams explained this to me offline). |
I'm satisfied with these changes (including the misindentation fixes).
|
Approving on behalf of @dustanddreams and merging.
|
|
The reason some of the frames were originally tagged as signal handler frames was that GDB cut the backtrace when the stacks did not grow towards address 0, which happens when the stacks switch at external calls, callbacks and effect handlers. From the original PR message, it is not clear to me whether the stacks were indeed cut at those positions and this PR fixes them; the PR message only shows the effect of applying this patch. Also, this backtrace seems ill-formed: why does |
|
Thank you for the extra context, @kayceesrk. Using the meander example from Retrofitting effect handlers onto OCaml https://dl.acm.org/doi/10.1145/3453483.3454039:
$ cat meander.ml
external ocaml_to_c
: unit -> int = "ocaml_to_c"
exception E1
exception E2
let c_to_ocaml () = raise E1
let _ = Callback.register
"c_to_ocaml" c_to_ocaml
let omain () =
try (* h1 *)
try (* h2 *) ocaml_to_c ()
with E2 -> 0
with E1 -> 42
let _ = assert (omain () = 42)
$ cat meander_c.c
#include <caml/mlvalues.h>
#include <caml/callback.h>
value ocaml_to_c (value unit) {
caml_callback(*caml_named_value
("c_to_ocaml"), Val_unit);
return Val_int(0);
}
Compiled using
Before this change on Linux / ARM64 on Ubuntu 24.04:
(gdb) run
Starting program: /home/tsmc/ocaml/meander.exe
This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.ubuntu.com>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Downloading separate debug info for system-supplied DSO at 0xfffff7ffb000
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
Breakpoint 1, 0x0000aaaaaaabd168 in caml_program ()
(gdb) bt
#0 0x0000aaaaaaabd168 in caml_program ()
#1 <signal handler called>
#2 caml_startup (argv=<optimized out>) at runtime/startup_nat.c:141
#3 caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#4 0x0000aaaaaaae91b0 in caml_startup_exn (argv=<optimized out>) at runtime/startup_nat.c:135
#5 caml_startup (argv=<optimized out>) at runtime/startup_nat.c:140
#6 caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#7 0x0000000000000001 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) c
Continuing.
Breakpoint 2, ocaml_to_c (unit=1) at meander_c.c:5
5 caml_callback(*caml_named_value
(gdb) bt
#0 ocaml_to_c (unit=1) at meander_c.c:5
#1 0x0000aaaaaaae9590 in caml_c_call ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) c
Continuing.
Breakpoint 3, camlMeander.c_to_ocaml_273 () at meander.ml:5
5 let c_to_ocaml () = raise E1
(gdb) bt
#0 camlMeander.c_to_ocaml_273 () at meander.ml:5
#1 <signal handler called>
#2 0x0000aaaaaab20d60 in camlMeander.7 ()
#3 0x0000aaaaaab20d60 in camlMeander.data_begin ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) info br
Num Type Disp Enb Address What
1 breakpoint keep y 0x0000aaaaaaabd168 <caml_program>
breakpoint already hit 1 time
2 breakpoint keep y 0x0000aaaaaaac3a80 in ocaml_to_c at meander_c.c:5
breakpoint already hit 1 time
3 breakpoint keep y 0x0000aaaaaaabda70 in camlMeander.c_to_ocaml_273 at meander.ml:5
breakpoint already hit 1 time
(gdb)
There are three issues with this gdb session:
After this change:
(gdb) run
Starting program: /home/tsmc/ocaml/meander.exe
This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.ubuntu.com>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Downloading separate debug info for system-supplied DSO at 0xfffff7ffb000
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
Breakpoint 1, 0x0000aaaaaaabd168 in caml_program ()
(gdb) bt
#0 0x0000aaaaaaabd168 in caml_program ()
#1 <signal handler called>
#2 caml_startup (argv=<optimized out>) at runtime/startup_nat.c:141
#3 caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#4 0x0000aaaaaaae91b0 in caml_startup_exn (argv=<optimized out>) at runtime/startup_nat.c:135
#5 caml_startup (argv=<optimized out>) at runtime/startup_nat.c:140
#6 caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#7 0x0000000000000001 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) c
Continuing.
Breakpoint 2, ocaml_to_c (unit=1) at meander_c.c:5
5 caml_callback(*caml_named_value
(gdb) bt
#0 ocaml_to_c (unit=1) at meander_c.c:5
#1 <signal handler called>
#2 0x0000aaaaaaabdab8 in camlMeander.omain_278 () at meander.ml:10
#3 0x0000aaaaaaabdc9c in camlMeander.entry () at meander.ml:13
#4 0x0000aaaaaaabd210 in caml_program ()
#5 <signal handler called>
#6 caml_startup (argv=<optimized out>) at runtime/startup_nat.c:141
#7 caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#8 0x0000aaaaaaae91b0 in caml_startup_exn (argv=<optimized out>) at runtime/startup_nat.c:135
#9 caml_startup (argv=<optimized out>) at runtime/startup_nat.c:140
#10 caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#11 0x0000000000000001 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) c
Continuing.
Breakpoint 3, camlMeander.c_to_ocaml_273 () at meander.ml:5
5 let c_to_ocaml () = raise E1
(gdb) bt
#0 camlMeander.c_to_ocaml_273 () at meander.ml:5
#1 <signal handler called>
#2 0x0000aaaaaab20d60 in camlMeander.7 ()
#3 0x0000aaaaaab20d60 in camlMeander.data_begin ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
The frames are not labelled as
(gdb) run
Starting program: /home/tsmc/ocaml/meander.exe
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 3, 0x000055555556d980 in caml_program ()
(gdb) bt
#0 0x000055555556d980 in caml_program ()
#1 <signal handler called>
#2 0x0000555555599196 in caml_startup_common (pooling=<optimised out>, argv=0x7fffffffe518) at runtime/startup_nat.c:128
#3 caml_startup_common (argv=0x7fffffffe518, pooling=<optimised out>) at runtime/startup_nat.c:87
#4 0x000055555559920f in caml_startup_exn (argv=<optimised out>) at runtime/startup_nat.c:135
#5 caml_startup (argv=<optimised out>) at runtime/startup_nat.c:140
#6 caml_main (argv=<optimised out>) at runtime/startup_nat.c:147
#7 0x000055555556d882 in main (argc=<optimised out>, argv=<optimised out>) at runtime/main.c:37
(gdb) c
Continuing.
Breakpoint 1, 0x0000555555572e50 in ocaml_to_c ()
(gdb) bt
#0 0x0000555555572e50 in ocaml_to_c ()
#1 <signal handler called>
#2 0x000055555556e063 in camlMeander.omain_278 ()
#3 0x000055555556e1e2 in camlMeander.entry ()
#4 0x000055555556d9e3 in caml_program ()
#5 <signal handler called>
#6 0x0000555555599196 in caml_startup_common (pooling=<optimised out>, argv=0x7fffffffe518) at runtime/startup_nat.c:128
#7 caml_startup_common (argv=0x7fffffffe518, pooling=<optimised out>) at runtime/startup_nat.c:87
#8 0x000055555559920f in caml_startup_exn (argv=<optimised out>) at runtime/startup_nat.c:135
#9 caml_startup (argv=<optimised out>) at runtime/startup_nat.c:140
#10 caml_main (argv=<optimised out>) at runtime/startup_nat.c:147
#11 0x000055555556d882 in main (argc=<optimised out>, argv=<optimised out>) at runtime/main.c:37
(gdb) c
Continuing.
Breakpoint 2, 0x000055555556e010 in camlMeander.c_to_ocaml_273 ()
(gdb) bt
#0 0x000055555556e010 in camlMeander.c_to_ocaml_273 ()
#1 <signal handler called>
#2 0x0000555555574078 in caml_callback_exn (closure=<optimised out>, arg=<optimised out>) at runtime/callback.c:206
#3 0x000055555557467d in caml_callback (closure=<optimised out>, arg=<optimised out>) at runtime/callback.c:347
#4 0x0000555555572e71 in ocaml_to_c ()
#5 <signal handler called>
#6 0x000055555556e063 in camlMeander.omain_278 ()
#7 0x000055555556e1e2 in camlMeander.entry ()
#8 0x000055555556d9e3 in caml_program ()
#9 <signal handler called>
#10 0x0000555555599196 in caml_startup_common (pooling=<optimised out>, argv=0x7fffffffe518) at runtime/startup_nat.c:128
#11 caml_startup_common (argv=0x7fffffffe518, pooling=<optimised out>) at runtime/startup_nat.c:87
#12 0x000055555559920f in caml_startup_exn (argv=<optimised out>) at runtime/startup_nat.c:135
#13 caml_startup (argv=<optimised out>) at runtime/startup_nat.c:140
#14 caml_main (argv=<optimised out>) at runtime/startup_nat.c:147
#15 0x000055555556d882 in main (argc=<optimised out>, argv=<optimised out>) at runtime/main.c:37
(gdb) info br
Num Type Disp Enb Address What
1 breakpoint keep y 0x0000555555572e50 <ocaml_to_c>
breakpoint already hit 1 time
2 breakpoint keep y 0x000055555556e010 <camlMeander.c_to_ocaml_273>
breakpoint already hit 1 time
3 breakpoint keep y 0x000055555556d980 <caml_program>
breakpoint already hit 1 time
There are still issues with inaccurate stack frames, which require a small fix to the CFI directives on ARM64 (tmcgilchrist@761f333); after that you get a gdb session equivalent to the one on Linux / x86_64.
(gdb) run
Starting program: /home/tsmc/ocaml/meander.exe
This GDB supports auto-downloading debuginfo from the following URLs:
<https://debuginfod.ubuntu.com>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Downloading separate debug info for system-supplied DSO at 0xfffff7ffb000
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
Breakpoint 1, 0x0000aaaaaaabd168 in caml_program ()
(gdb) bt
#0 0x0000aaaaaaabd168 in caml_program ()
#1 <signal handler called>
#2 0x0000aaaaaaae9138 in caml_startup_common (pooling=0, argv=0xfffffffffc78) at runtime/startup_nat.c:128
#3 caml_startup_common (argv=0xfffffffffc78, pooling=0) at runtime/startup_nat.c:87
#4 0x0000aaaaaaae91b0 in caml_startup_exn (argv=<optimized out>) at runtime/startup_nat.c:135
#5 caml_startup (argv=<optimized out>) at runtime/startup_nat.c:140
#6 caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#7 0x0000aaaaaaabcfd0 in main (argc=<optimized out>, argv=<optimized out>) at runtime/main.c:37
(gdb) c
Continuing.
Breakpoint 2, ocaml_to_c (unit=1) at meander_c.c:5
5 caml_callback(*caml_named_value
(gdb) bt
#0 ocaml_to_c (unit=1) at meander_c.c:5
#1 <signal handler called>
#2 0x0000aaaaaaabdab8 in camlMeander.omain_278 () at meander.ml:10
#3 0x0000aaaaaaabdc9c in camlMeander.entry () at meander.ml:13
#4 0x0000aaaaaaabd210 in caml_program ()
#5 <signal handler called>
#6 0x0000aaaaaaae9138 in caml_startup_common (pooling=4, argv=0xaaaaaab24ac8) at runtime/startup_nat.c:128
#7 caml_startup_common (argv=0xaaaaaab24ac8, pooling=4) at runtime/startup_nat.c:87
#8 0x0000aaaaaaae91b0 in caml_startup_exn (argv=<optimized out>) at runtime/startup_nat.c:135
#9 caml_startup (argv=<optimized out>) at runtime/startup_nat.c:140
#10 caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#11 0x0000aaaaaaabcfd0 in main (argc=<optimized out>, argv=<optimized out>) at runtime/main.c:37
(gdb) c
Continuing.
Breakpoint 3, camlMeander.c_to_ocaml_273 () at meander.ml:5
5 let c_to_ocaml () = raise E1
(gdb) bt
#0 camlMeander.c_to_ocaml_273 () at meander.ml:5
#1 <signal handler called>
#2 0x0000aaaaaaac4d28 in caml_callback_exn (closure=<optimized out>, arg=<optimized out>, arg@entry=1) at runtime/callback.c:206
#3 0x0000aaaaaaac531c in caml_callback (closure=<optimized out>, arg=arg@entry=1) at runtime/callback.c:347
#4 0x0000aaaaaaac3aa0 in ocaml_to_c (unit=<optimized out>) at meander_c.c:5
#5 <signal handler called>
#6 0x0000aaaaaaabdab8 in camlMeander.omain_278 () at meander.ml:10
#7 0x0000aaaaaaabdc9c in camlMeander.entry () at meander.ml:13
#8 0x0000aaaaaaabd210 in caml_program ()
#9 <signal handler called>
#10 0x0000aaaaaaae9138 in caml_startup_common (pooling=4, argv=0xaaaaaab24ac8) at runtime/startup_nat.c:128
#11 caml_startup_common (argv=0xaaaaaab24ac8, pooling=4) at runtime/startup_nat.c:87
#12 0x0000aaaaaaae91b0 in caml_startup_exn (argv=<optimized out>) at runtime/startup_nat.c:135
#13 caml_startup (argv=<optimized out>) at runtime/startup_nat.c:140
#14 caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#15 0x0000aaaaaaabcfd0 in main (argc=<optimized out>, argv=<optimized out>) at runtime/main.c:37
(gdb) |
|
Thanks for the clarification.
Is this commit part of a PR? |
|
For Linux / RiscV the situation is worse. Before this change:
(gdb) run
Starting program: /home/tsmc/ocaml/meander.exe
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/riscv64-linux-gnu/libthread_db.so.1".
Breakpoint 1, 0x0000002aaaac5350 in caml_program ()
(gdb) bt
#0 0x0000002aaaac5350 in caml_program ()
#1 0x0000002aaaae70a4 in caml_start_program ()
#2 0x0000000000000001 in ?? ()
(gdb) c
Continuing.
Breakpoint 3, 0x0000002aaaacb3f0 in ocaml_to_c ()
(gdb) bt
#0 0x0000002aaaacb3f0 in ocaml_to_c ()
#1 0x0000002aaaae6f98 in caml_c_call ()
#2 0x0000002aaab31878 in ?? ()
(gdb) c
Continuing.
Breakpoint 2, 0x0000002aaaac5c88 in camlMeander.c_to_ocaml_273 ()
(gdb) bt
#0 0x0000002aaaac5c88 in camlMeander.c_to_ocaml_273 ()
#1 0x0000002aaaae70a4 in caml_start_program ()
#2 0x0000002aaaac5ce0 in camlMeander.omain_278 ()
#3 0x0000000000000002 in ?? ()
(gdb) info br
Num Type Disp Enb Address What
1 breakpoint keep y 0x0000002aaaac5350 <caml_program+8>
breakpoint already hit 1 time
2 breakpoint keep y 0x0000002aaaac5c88 <camlMeander.c_to_ocaml_273+28>
breakpoint already hit 1 time
3 breakpoint keep y 0x0000002aaaacb3f0 <ocaml_to_c+12>
breakpoint already hit 1 time
(gdb)
After this change: a small improvement, as sections of the backtrace get labelled correctly with |
This change brings the ARM64 and RiscV runtimes into sync with the amd64 and s390x runtimes, which tag certain runtime functions as signal handlers (caml_call_realloc_stack, caml_call_gc, caml_c_call, caml_c_call_stack_args, caml_start_program and caml_runstack) so that GDB displays backtraces correctly on Linux. See section 2.3, Stack Unwinding, of the Retrofitting Effect Handlers onto OCaml paper for further details.
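To illustrate why the tag matters, here is a minimal C sketch of the kind of sanity check a debugger can apply while unwinding: on a downward-growing stack a caller is expected to sit at a higher address than its callee, and the walk stops ("previous frame inner to this frame") when that does not hold, unless one of the two frames is marked as a signal frame. This is a simplified model, not GDB's actual code; the `frame` struct, the `printable_frames` function and all addresses used with it are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified frame record (not a GDB data structure). */
typedef struct {
    const char *name;   /* symbol the frame belongs to */
    uint64_t cfa;       /* canonical frame address of the frame */
    bool signal_frame;  /* true if the function is tagged as a signal frame */
} frame;

/* Count how many frames a model unwinder would print.
   frames[0] is the innermost frame. A frame is printed, and the walk then
   stops if that frame lies "inner" to (at a lower address than) its callee
   while both are ordinary frames. Signal frames are exempt from the check,
   which is what lets the walk continue across a stack switch. */
size_t printable_frames(const frame *frames, size_t n)
{
    assert(frames != NULL || n == 0);
    if (n == 0)
        return 0;

    size_t shown = 1; /* the innermost frame is always printed */
    for (size_t i = 1; i < n; i++) {
        const frame *callee = &frames[i - 1];
        const frame *caller = &frames[i];

        shown++; /* the caller frame itself is printed */

        bool both_normal = !callee->signal_frame && !caller->signal_frame;
        if (both_normal && caller->cfa < callee->cfa)
            break; /* refuse to unwind any further */
    }
    return shown;
}
```

With made-up addresses where ocaml_to_c runs on the system stack (high address) and caml_c_call, camlMeander.omain and caml_program sit on an OCaml stack (lower addresses), the model prints only two frames when caml_c_call is untagged, much like the ARM64 session that stops after caml_c_call, and all four once caml_c_call is tagged.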
With this change, backtraces are displayed with <signal handler called> shown whenever GDB encounters these functions, e.g. and
NOTE that power.S is not consistent either but I am working on a fix for it.