Avoid traversing C parts of frame pointer chain when reallocating stack #13635
To guarantee this when stack arguments are used, the actual pushing
of arguments is done by this separate function */
FUNCTION(caml_c_call_copy_stack_args)
CFI_STARTPROC
Not labelling this with CFI_SIGNAL_FRAME will mean gdb shows this as a stack frame rather than <signal handler called>. On ARM64 Linux running the c_call.ml test:
(gdb) bt
#0 0x0000aaaaaaae8078 in caml_c_call_copy_stack_args ()
#1 <signal handler called>
#2 0x0000aaaaaaabc618 in camlC_call.f_274 () at c_call.ml:16
#3 0x0000aaaaaaabc6e8 in camlC_call.entry () at c_call.ml:24
#4 0x0000aaaaaaabafdc in caml_program ()
#5 <signal handler called>
#6 0x0000aaaaaaae7b88 in caml_startup_common (pooling=-1430997904, argv=0xaaaaaab4b460) at runtime/startup_nat.c:127
macOS LLDB doesn't care and will show the full backtrace, using the same test case.
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
* frame #0: 0x0000000100008a00 c_call.opt`fp_backtrace_many_args(argv0=4302438272, a=3, b=5, c=7, d=9, e=11, f=13, g=15, h=17, i=19, j=21, k=23) at c_call_.c:24:3 [opt]
frame #1: 0x00000001000391d0 c_call.opt`caml_c_call_copy_stack_args + 36
frame #2: 0x0000000100039180 c_call.opt`caml_c_call_stack_args + 52
frame #3: 0x0000000100003a98 c_call.opt`camlC_call.f_274 + 128
frame #4: 0x0000000100003b68 c_call.opt`camlC_call.entry + 72
frame #5: 0x000000010000344c c_call.opt`caml_program + 116
frame #6: 0x0000000100039260 c_call.opt`caml_start_program + 132
I noticed the stack trace gets broken when you enter fp_backtrace_many_args on ARM64 GDB:
(gdb) bt
#0 fp_backtrace_many_args (argv0=281474571960208, a=<optimized out>, b=5, c=7, d=9, e=11, f=13, g=15, h=17, i=19,
j=21, k=23) at c_call_.c:25
#1 0x0000aaaaaaae8090 in caml_c_call_copy_stack_args ()
#2 0x0000aaaaaab4b450 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y 0x0000aaaaaaac1600 in fp_backtrace_many_args at c_call_.c:24
breakpoint already hit 1 time
....
There's probably something slightly wrong with the CFI information being emitted.
For me GDB is getting lost on line 647: stp TMP, TMP2, [sp, -16]!; CFI_ADJUST(16)
Thanks for spotting this!
That CFI_ADJUST is wrong. I removed it in amd64.S and forgot to do the same when mirroring the changes in arm64.S. (I don't know how lldb managed to successfully get a backtrace with the broken CFI entry, though. I tested the qsort.ml program, stopping in check_frames.)
Not labelling this with CFI_SIGNAL_FRAME will mean gdb shows this as a stack frame rather than <signal handler called>.
This is true but I'm happy with that. The CFI_SIGNAL_FRAME thing is a trick to convince gdb to continue unwinding past a break in stacks, which isn't needed for caml_c_call_copy_stack_args since it's called directly on the C stack. I'm happy for it to appear in backtraces, because that means in particular that it shows up in stack-based profilers: copying stack args is rare and somewhat expensive, and so I think we shouldn't hide that it's happening.
I agree it's useful to see properly labelled frames.
tmcgilchrist
left a comment
Looks good now.
I get this backtrace with Linux ARM64 as expected.
(gdb) bt
#0 fp_backtrace_many_args (argv0=281474571960208, a=3, b=5, c=7, d=9, e=11, f=13, g=15, h=17, i=19, j=21, k=23)
at c_call_.c:24
#1 0x0000aaaaaaae8090 in caml_c_call_copy_stack_args ()
#2 <signal handler called>
#3 0x0000aaaaaaabc618 in camlC_call.f_274 () at c_call.ml:16
#4 0x0000aaaaaaabc6e8 in camlC_call.entry () at c_call.ml:24
#5 0x0000aaaaaaabafdc in caml_program ()
#6 <signal handler called>
#7 0x0000aaaaaaae7b88 in caml_startup_common (pooling=-1430997936, argv=0xaaaaaab4b460) at runtime/startup_nat.c:127
#8 caml_startup_common (argv=0xaaaaaab4b460, pooling=-1430997936) at runtime/startup_nat.c:86
#9 0x0000aaaaaaae7c00 in caml_startup_exn (argv=<optimized out>) at runtime/startup_nat.c:134
#10 caml_startup (argv=<optimized out>) at runtime/startup_nat.c:139
#11 caml_main (argv=<optimized out>) at runtime/startup_nat.c:146
#12 0x0000aaaaaaabadd0 in main (argc=<optimized out>, argv=<optimized out>) at runtime/main.c:37
(gdb) q
The MSVC failures are just because amd64nt.asm needs updating (which I've done locally), buuuut... I'm getting a repeatable segfault with mingw-w64 (contradicting AppVeyor) for:
CFI_STARTPROC
/* Set up a frame pointer even without WITH_FRAME_POINTERS,
which we use to pop an unknown number of arguments later */
pushq %rbp; CFI_ADJUST(8)
Is there a reason why you don't use ENTER_FUNCTION and LEAVE_FUNCTION, like arm64, in this function?
Yeah, it's to achieve "Set up a frame pointer even without WITH_FRAME_POINTERS": the ENTER_FUNCTION and LEAVE_FUNCTION macros are no-ops without frame pointers, but this function actually uses %rbp to restore the stack and needs the push/pop logic even when WITH_FRAME_POINTERS is off.
(This doesn't apply to arm64, because on arm64 ENTER_FUNCTION and LEAVE_FUNCTION unconditionally set up a frame pointer, presumably because arm64 is less register-starved, so optimising this away is less important.)
When the OCaml stack grows we need to rewrite frame pointers (if enabled) to point to the new stack.
However, when using a C library that was not compiled with frame pointers enabled, we cannot assume that there is an unbroken chain of frame pointers through both the OCaml and C parts of the stack. Doing so leads to segfaults.
Instead, we note that the only frame pointers that can point to OCaml stacks (the ones that need updating) are those already on OCaml stacks, plus the first ones pushed after any OCaml->C calls. These can be found by traversing the struct c_stack_link chain, without needing to traverse any intervening C frames.
This imposes a new constraint on the runtime assembly stubs: after switching to C they must not push anything to the stack before calling a C function. This was already true for all but caml_c_call_stack_args. Enforcing this invariant for caml_c_call_stack_args is straightforward enough, and simplifies the DWARF backtrace logic.
For arm64, a side-effect of this change is that DWARF backtraces now work on stacks containing calls to caml_c_call_stack_args, which were broken before. (Tested with macOS lldb)
dra27
left a comment
This has gone through precheck#999 as well.
Adding a "green tick" to @tmcgilchrist and @fabbing's reviewing... @stedolan and I went through this synchronously yesterday (and thoroughly convinced ourselves that the Windows change is correct!)
My intuition is that this is not fixing a regression in 5.3.0 itself, and therefore should wait for 5.4.0, but could be cherry-picked to 5.3.1 to get it released a little sooner?
Note we don’t have a frame pointers build in GitHub CI. It would be useful
to configure one for Ubuntu 24.04 on amd64.
I think it would be useful to go out earlier. The majority of Linux distros
ship without frame pointers enabled and this fix nicely avoids the segfault
reported in
#13575
On Sat, 7 Dec 2024 at 1:11 am, David Allsopp wrote:
My intuition is that this is not fixing a regression in 5.3.0 itself, and
therefore should wait for 5.4.0, but could be cherry-picked to 5.3.1 to get
it released a little sooner?
I admit that I don't have a very clear view of the safety of the fix. Nevertheless, taking into account that this fix changes the C call assembly even outside of the frame pointer mode, and that we will not have the time for a wide-scale test before the upcoming release of OCaml 5.3.0, I am inclined not to include the fix in 5.3.0 and to wait for 5.3.1. However, if @fabbing and @tmcgilchrist are certain that the fix cannot have introduced any bug, I could be convinced to backport it to 5.3.0.
I'd like to see it in 5.3.1, but I don't think it's worth sneaking this patch into 5.3.0 at the last moment. 5.2.0 has the issue in #13575 and 5.3.0 is no worse in this regard, so the bug is not new and letting the fix wait until 5.3.1 seems fine to me.
I agree including it in 5.3.1 would be fine.
When the OCaml stack grows we need to rewrite frame pointers (if enabled) to point to the new stack.
However, when using a C library that was not compiled with frame pointers enabled, we cannot assume that there is an unbroken chain of frame pointers through both the OCaml and C parts of the stack. Doing so leads to segfaults (#13575).
Instead, note that the only frame pointers that can point to OCaml stacks (the ones that need updating) are those already on OCaml stacks, plus the first ones pushed after any OCaml->C calls. These can be found by traversing the struct c_stack_link chain, without needing to traverse any intervening C frames.
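As a rough illustration of the idea above, here is a toy model of the link traversal in C. It is only a sketch and not the runtime's code: struct toy_link, its field names, and toy_rewrite_links are hypothetical stand-ins (the real struct c_stack_link has a different layout), and it assumes each link records the slot holding the first frame pointer pushed after an OCaml->C call. Frame pointers already on the OCaml stack would be handled separately and are omitted here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the runtime's struct c_stack_link.
   The fields are illustrative assumptions, not the real layout. */
struct toy_link {
  uintptr_t *saved_fp_slot;  /* slot holding the first frame pointer pushed
                                after an OCaml->C switch */
  struct toy_link *prev;     /* next-older link, or NULL */
};

/* Walk the link chain only (never the C frames in between) and relocate
   every saved frame pointer that points into the old OCaml stack
   [old_lo, old_hi) so that it points at the same offset in the new stack
   starting at new_lo. Returns the number of slots rewritten. */
static size_t toy_rewrite_links(struct toy_link *link,
                                uintptr_t old_lo, uintptr_t old_hi,
                                uintptr_t new_lo)
{
  size_t rewritten = 0;
  for (; link != NULL; link = link->prev) {
    uintptr_t fp = *link->saved_fp_slot;
    if (fp >= old_lo && fp < old_hi) {
      *link->saved_fp_slot = new_lo + (fp - old_lo);
      rewritten++;
    }
  }
  return rewritten;
}
```

The point of the sketch is that the loop never dereferences a frame pointer found inside a C frame, so it does not matter whether the intervening C code maintains %rbp (or x29) as a frame pointer at all.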
This imposes a new constraint on the runtime assembly stubs: after switching to C they must not push anything to the stack before calling a C function. This was already true for all but caml_c_call_stack_args. Enforcing this invariant for caml_c_call_stack_args is straightforward enough, and simplifies the DWARF backtrace logic. (A side effect of this change is that DWARF backtraces now work through caml_c_call_stack_args on arm64; this was broken before.)
I've added a new hairy test for this logic, checking that the frame pointer chain continues to make sense after the stack is reallocated inside a call to a C function compiled without frame pointers (qsort from libc, tested on my non-FP-enabled Debian machine), inside a C callback using stack args, inside a C callback not using stack args, inside a finalizer.