
Restore GDB backtrace on Linux #13261

Merged
gasche merged 1 commit into ocaml:trunk from tmcgilchrist:cfi_signal_frame on Jun 26, 2024

@tmcgilchrist

Contributor

Small correction to #13241 reported by @kayceesrk

Using the meander example from "Retrofitting effect handlers onto OCaml" (https://dl.acm.org/doi/10.1145/3453483.3454039), here is a GDB session on Ubuntu 24.04/arm64 without the fix:

$ cat meander.ml 
external ocaml_to_c
         : unit -> int = "ocaml_to_c"
exception E1
exception E2
let c_to_ocaml () = raise E1
let _ = Callback.register
          "c_to_ocaml" c_to_ocaml
let omain () =
  try (* h1 *)
    try (* h2 *) ocaml_to_c ()
    with E2 -> 0
  with E1 -> 42
let _ = assert (omain () = 42)
$ cat meander_c.c
#include <caml/mlvalues.h>
#include <caml/callback.h>

value ocaml_to_c (value unit) {
    caml_callback(*caml_named_value
                  ("c_to_ocaml"), Val_unit);
    return Val_int(0);
}

Compiled using ocamlopt meander_c.c meander.ml -o meander.exe
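
To make the sessions in this description easier to replay, the breakpoints can be kept in a GDB command file. The following is only a sketch reconstructed from the `info br` listing in the first session: the file name `meander.gdb` and the exact `break` arguments are assumptions, since the listing shows only the resolved locations (breakpoint 2 appears there as `caml_start_program+4`).

```shell
# Write a GDB command file with the four breakpoints from the `info br` listing.
# Assumption: a plain `break` on each symbol or file:line resolves to the same
# addresses; the original session may have set them differently.
cat > meander.gdb <<'EOF'
break caml_program
break caml_start_program
break meander_c.c:5
break meander.ml:5
run
bt
EOF

# Then, with the executable built as described above:
#   gdb -q -x meander.gdb ./meander.exe
```

After the first stop, repeating `c` and `bt` by hand walks through the remaining breakpoints in the same way as the sessions shown here.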

(gdb) run
Starting program: /home/tsmc/ocaml/meander.exe 

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.ubuntu.com>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Downloading separate debug info for system-supplied DSO at 0xfffff7ffb000
[Thread debugging using libthread_db enabled]                                                                                                             
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".

Breakpoint 2, <signal handler called>
(gdb) bt
#0  <signal handler called>
#1  0x0000aaaaaaae9138 in caml_startup_common (pooling=0, argv=0xfffffffffc78) at runtime/startup_nat.c:128
#2  caml_startup_common (argv=0xfffffffffc78, pooling=0) at runtime/startup_nat.c:87
#3  0x0000aaaaaaae91b0 in caml_startup_exn (argv=<optimized out>) at runtime/startup_nat.c:135
#4  caml_startup (argv=<optimized out>) at runtime/startup_nat.c:140
#5  caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#6  0x0000aaaaaaabcfd0 in main (argc=<optimized out>, argv=<optimized out>) at runtime/main.c:37
(gdb) c
Continuing.

Breakpoint 1, 0x0000aaaaaaabd168 in caml_program ()
(gdb) bt
#0  0x0000aaaaaaabd168 in caml_program ()
#1  <signal handler called>
#2  caml_startup (argv=<optimized out>) at runtime/startup_nat.c:141
#3  caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#4  0x0000aaaaaaae91b0 in caml_startup_exn (argv=<optimized out>) at runtime/startup_nat.c:135
#5  caml_startup (argv=<optimized out>) at runtime/startup_nat.c:140
#6  caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#7  0x0000000000000001 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) c
Continuing.

Breakpoint 3, ocaml_to_c (unit=1) at meander_c.c:5
5	    caml_callback(*caml_named_value
(gdb) bt
#0  ocaml_to_c (unit=1) at meander_c.c:5
#1  <signal handler called>
#2  0x0000aaaaaaabdab8 in camlMeander.omain_278 () at meander.ml:10
#3  0x0000aaaaaaabdc9c in camlMeander.entry () at meander.ml:13
#4  0x0000aaaaaaabd210 in caml_program ()
#5  <signal handler called>
#6  caml_startup (argv=<optimized out>) at runtime/startup_nat.c:141
#7  caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#8  0x0000aaaaaaae91b0 in caml_startup_exn (argv=<optimized out>) at runtime/startup_nat.c:135
#9  caml_startup (argv=<optimized out>) at runtime/startup_nat.c:140
#10 caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#11 0x0000000000000001 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) c
Continuing.

Breakpoint 4, camlMeander.c_to_ocaml_273 () at meander.ml:5
5	let c_to_ocaml () = raise E1
(gdb) bt
#0  camlMeander.c_to_ocaml_273 () at meander.ml:5
#1  <signal handler called>
#2  0x0000aaaaaab20d60 in camlMeander.7 ()
#3  0x0000aaaaaab20d60 in camlMeander.data_begin ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) info br
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x0000aaaaaaabd168 <caml_program>
	breakpoint already hit 1 time
2       breakpoint     keep y   0x0000aaaaaaae9640 <caml_start_program+4>
	breakpoint already hit 1 time
3       breakpoint     keep y   0x0000aaaaaaac3a80 in ocaml_to_c at meander_c.c:5
	breakpoint already hit 1 time
4       breakpoint     keep y   0x0000aaaaaaabda70 in camlMeander.c_to_ocaml_273 at meander.ml:5
	breakpoint already hit 1 time

The backtrace is corrupted when jumping from the C runtime into the initial OCaml frames, and again when executing the callback.

With this fix to the CFI, you get the following (correct) behaviour:

(gdb) run
Starting program: /home/tsmc/ocaml/meander.exe 

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.ubuntu.com>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".

Breakpoint 2, <signal handler called>
(gdb) bt
#0  <signal handler called>
#1  0x0000aaaaaaae9138 in caml_startup_common (pooling=0, argv=0xfffffffffc78) at runtime/startup_nat.c:128
#2  caml_startup_common (argv=0xfffffffffc78, pooling=0) at runtime/startup_nat.c:87
#3  0x0000aaaaaaae91b0 in caml_startup_exn (argv=<optimized out>) at runtime/startup_nat.c:135
#4  caml_startup (argv=<optimized out>) at runtime/startup_nat.c:140
#5  caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#6  0x0000aaaaaaabcfd0 in main (argc=<optimized out>, argv=<optimized out>) at runtime/main.c:37
(gdb) c
Continuing.

Breakpoint 1, 0x0000aaaaaaabd168 in caml_program ()
(gdb) bt
#0  0x0000aaaaaaabd168 in caml_program ()
#1  <signal handler called>
#2  0x0000aaaaaaae9138 in caml_startup_common (pooling=0, argv=0xfffffffffc78) at runtime/startup_nat.c:128
#3  caml_startup_common (argv=0xfffffffffc78, pooling=0) at runtime/startup_nat.c:87
#4  0x0000aaaaaaae91b0 in caml_startup_exn (argv=<optimized out>) at runtime/startup_nat.c:135
#5  caml_startup (argv=<optimized out>) at runtime/startup_nat.c:140
#6  caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#7  0x0000aaaaaaabcfd0 in main (argc=<optimized out>, argv=<optimized out>) at runtime/main.c:37
(gdb) c
Continuing.

Breakpoint 3, ocaml_to_c (unit=1) at meander_c.c:5
5	    caml_callback(*caml_named_value
(gdb) bt
#0  ocaml_to_c (unit=1) at meander_c.c:5
#1  <signal handler called>
#2  0x0000aaaaaaabdab8 in camlMeander.omain_278 () at meander.ml:10
#3  0x0000aaaaaaabdc9c in camlMeander.entry () at meander.ml:13
#4  0x0000aaaaaaabd210 in caml_program ()
#5  <signal handler called>
#6  0x0000aaaaaaae9138 in caml_startup_common (pooling=4, argv=0xaaaaaab24ac8) at runtime/startup_nat.c:128
#7  caml_startup_common (argv=0xaaaaaab24ac8, pooling=4) at runtime/startup_nat.c:87
#8  0x0000aaaaaaae91b0 in caml_startup_exn (argv=<optimized out>) at runtime/startup_nat.c:135
#9  caml_startup (argv=<optimized out>) at runtime/startup_nat.c:140
#10 caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#11 0x0000aaaaaaabcfd0 in main (argc=<optimized out>, argv=<optimized out>) at runtime/main.c:37
(gdb) c
Continuing.

Breakpoint 4, camlMeander.c_to_ocaml_273 () at meander.ml:5
5	let c_to_ocaml () = raise E1
(gdb) bt
#0  camlMeander.c_to_ocaml_273 () at meander.ml:5
#1  <signal handler called>
#2  0x0000aaaaaaac4d28 in caml_callback_exn (closure=<optimized out>, arg=<optimized out>, arg@entry=1) at runtime/callback.c:206
#3  0x0000aaaaaaac531c in caml_callback (closure=<optimized out>, arg=arg@entry=1) at runtime/callback.c:347
#4  0x0000aaaaaaac3aa0 in ocaml_to_c (unit=<optimized out>) at meander_c.c:5
#5  <signal handler called>
#6  0x0000aaaaaaabdab8 in camlMeander.omain_278 () at meander.ml:10
#7  0x0000aaaaaaabdc9c in camlMeander.entry () at meander.ml:13
#8  0x0000aaaaaaabd210 in caml_program ()
#9  <signal handler called>
#10 0x0000aaaaaaae9138 in caml_startup_common (pooling=4, argv=0xaaaaaab24ac8) at runtime/startup_nat.c:128
#11 caml_startup_common (argv=0xaaaaaab24ac8, pooling=4) at runtime/startup_nat.c:87
#12 0x0000aaaaaaae91b0 in caml_startup_exn (argv=<optimized out>) at runtime/startup_nat.c:135
#13 caml_startup (argv=<optimized out>) at runtime/startup_nat.c:140
#14 caml_main (argv=<optimized out>) at runtime/startup_nat.c:147
#15 0x0000aaaaaaabcfd0 in main (argc=<optimized out>, argv=<optimized out>) at runtime/main.c:37
(gdb) 

@kayceesrk

Contributor

The bug was introduced in #13079 (comment). Without the ability to run native debuggers in the testsuite, it is not possible to catch these bugs. Thankfully, this is being worked on: #13199 (comment).

@ghost ghost left a comment

I would have naively expected both versions of arm64.S to be equivalent (i.e. that it would be safe to do CFI_OFFSET after CFI_ADJUST, taking the adjustment offset into account). But then the new flavour is consistent with the use of CFI_ADJUST elsewhere in the file (e.g. in ENTER_FUNCTION).
