
Restore frame-pointers support for amd64 #11144

Merged
merged 12 commits into ocaml:trunk on Jun 30, 2022

Conversation

fabbing
Contributor

@fabbing fabbing commented Mar 29, 2022

Introduction

Frame-pointer support was partially lost with the multicore merge; this PR aims to restore it for the amd64 architecture.
A quick introduction to frame pointers is available at #11031.

Changes

Most of the changes are located inside amd64.S and are just adaptations of what was already done before. Nevertheless, some of the changes are new because of the way the runtime currently works.

Fiber reallocation

The runtime may need to reallocate a fiber to accommodate the stack space required for it to perform its work.
When a stack is reallocated, addresses pointing into the old stack must be updated to refer to the new stack space. Just as exception handler addresses are rewritten, the linked list of frame pointers must be updated.
The rewrite_frame_pointers function does this by following the linked list of frame pointers and adjusting the address values when necessary; it also updates such addresses located on other stacks (other fibers, or the C stack).

Reserving rbp for frame-pointers at C calls

The rbp register was used to save information for DWARF unwinding at C calls; the rbx register is now used instead, allowing rbp to serve as the base pointer.

Added tests to the testsuite

Those should hopefully validate that frame-pointers work as expected.

  • c_call.ml: checks the correctness of the frame-pointers backtrace during the execution of different C calls.
  • effects.ml: checks the correctness of the frame-pointers backtrace in several key locations when using effects.
  • stack_reallocations.ml: checks rewrite_frame_pointers works as expected when the computation fiber needs to be reallocated.
  • stack_reallocations2.ml: checks rewrite_frame_pointers and caml_resume work as expected when the “effect handling” fiber needs to be reallocated.

The last two tests detect more problems when run under the debug runtime, since the reallocated fiber's stack space is memset to a canary value, quickly revealing dangling base pointers.

Acknowledgement

Thanks to @ctk21, @kayceesrk and @Engil for their advice.
And to @shindere for his help in improving the tests.

delta = (char*)Stack_high(new_stack) - (char*)Stack_high(old_stack);

/* Walk the frame-pointers linked list */
for (frame = __builtin_frame_address(0); frame; frame = next) {
Contributor Author

Is it okay to use this GCC builtin (also supported by clang), which returns the current rbp register value?
An alternative would be to implement our own caml_frame_address for amd64 (and other architectures that require it).

Member

At this stage, this should be fine, yes: --enable-frame-pointers-mode has always been for gcc/clang on Linux x86_64.

@kayceesrk
Contributor

Thanks for this work @fabbing. For me, the main use of frame pointers is running perf with the call graph in fp (frame pointer) mode rather than dwarf mode. Since DWARF unwinding is slow, perf saves a copy of the stack memory at every sample. This can be quite inefficient (and imprecise, since our stacks are not contiguous).

It would be useful to take a very small program that uses effect handlers and does some long-running computation. With fp mode you should be able to see the full call graph (all the way down to the main function) in the profile as opposed to a truncated one in dwarf mode. Hopefully, you should also see that the size of the profile is much smaller under fp mode.

Would you be able to confirm this @fabbing?

@gasche
Member

gasche commented Mar 31, 2022

Since DWARF unwinding is slow, perf saves a copy of the stack memory at every sample. This can be quite inefficient (and imprecise since our stacks are not contiguous).

Could we discuss this somewhere? (is it discussed somewhere?) I use perf record --call-graph=dwarf for my OCaml profiling (I don't really want to have a new switch just for this), and I naively assumed that someone had magically worked out how to describe the new weird stack layout to the debugging dwarves. Should we assume that --call-graph=dwarf will be broken with the Multicore runtime forever, or does someone have better news?

@kayceesrk
Contributor

Could we discuss this somewhere? (is it discussed somewhere?)

Not discussed. But mentioned in the PLDI 2021 paper on retrofitting effect handlers -- Section 5.5 in https://kcsrk.info/papers/retro-concurrency_pldi_21.pdf. Unfortunately, given that perf works by dumping (a segment of) the stack at every sample, there is no easy fix for this. The traces with OCaml 5 will be truncated.

perf record --call-graph=fp will be accurate and fast compared to --call-graph=dwarf. @lpw25 mentioned in the past that within Jane Street they use +fp switch for perf profiling since DWARF based call graph is too slow for them.

@sadiqj
Contributor

sadiqj commented Apr 3, 2022

Should we assume that --call-graph=dwarf will be broken with the Multicore runtime forever, or does someone have better news?

On call-graph dwarf, it's bad news yes. perf's method for dwarf call-graphs isn't going to work well when there are non-contiguous stacks. To deal with that I think there'd have to be dwarf unwinding in the kernel. The good news is that if you have a reasonably modern Intel CPU then Last Branch Record (lbr) call-graph sampling works well.

This is a slightly modified sched.ml that runs for many iterations and allocates an array instead of printing something to the console. Here is using perf call-graph DWARF:

[screenshot: perf report with --call-graph=dwarf]

The screenshot shows caml_make_vec and caml_c_call as the top entries, and you lose the tie-up between the OCaml and C stacks.

With call-graph LBR (on my i7-12700F):

[screenshot: perf report with --call-graph=lbr]

Here the screenshot shows the transition from OCaml to C captured correctly (specifically Sched.say to caml_make_vec).

It seems like AMD Zen 3 also has some kind of branch sampling that is going into perf: https://lwn.net/Articles/875869/

@kayceesrk
Contributor

Last Branch Record (lbr) call-graph sampling works well.

Thanks @sadiqj. Does this come for free on OCaml 5, i.e., no work required on the OCaml side?

@sadiqj
Contributor

sadiqj commented Apr 4, 2022

Thanks @sadiqj. Does this come for free on OCaml 5 i.e, no work required on the OCaml side?

Yes, that test was done on trunk.

@fabbing
Contributor Author

fabbing commented Apr 5, 2022

With a program very similar to the one used by @sadiqj, we get under perf, as expected, a complete backtrace with frame-pointers:

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 13K of event 'cycles'
# Event count (approx.): 15072055413
#
# Children      Self       Samples  Command  Shared Object     Symbol                                                  
# ........  ........  ............  .......  ................  ........................................................
#
    85.87%     0.00%             0  Domain0  libc.so.6         [.] __libc_start_call_main
    85.87%     0.00%             0  Domain0  sched_fp.exe      [.] main
    85.87%     0.00%             0  Domain0  sched_fp.exe      [.] caml_start_program
    85.87%     0.00%             0  Domain0  sched_fp.exe      [.] caml_startup__code_begin
    85.38%     0.08%            11  Domain0  sched_fp.exe      [.] Sched.entry
            |          
             --85.30%--Sched.entry
                       |          
                       |--31.49%--caml_runstack
                       |          |          
                       |          |--15.47%--Sched.test_479
                       |          |          |          
                       |          |           --15.15%--Sched.say_289
                       |          |                     |          
                       |          |                      --15.10%--caml_c_call
                       |          |                                |          
                       |          |                                 --14.94%--caml_make_vect
                       |          |                                           |          
                       |          |                                            --0.70%--caml_process_pending_actions
                       |          |                                                      |          
                       |          |                                                       --0.68%--caml_poll_gc_work
                       |          |                                                                 |          
                       |          |                                                                  --0.67%--caml_major_collection_slice
                       |          |                                                                            major_collection_slice
                       |          |          
                       |          |--11.23%--Sched.fun_610
                       |          |          |          
                       |          |           --10.29%--Sched.say_289
                       |          |                     |          
                       |          |                      --10.25%--caml_c_call
                       |          |                                |          
                       |          |                                 --10.19%--caml_make_vect
                       |          |                                           |          
                       |          |                                            --1.11%--caml_process_pending_actions
                       |          |                                                      caml_poll_gc_work
                       |          |                                                      |          
                       |          |                                                       --1.08%--caml_major_collection_slice
                       |          |                                                                 |          
                       |          |                                                                  --1.07%--major_collection_slice
                       |          |                                                                            |          
                       |          |                                                                             --1.01%--caml_sweep
                       |          |                                                                                       |          
                       |          |                                                                                        --0.56%--pool_sweep
                       |          |          
                       |           --4.32%--Sched.fun_622
                       |                     Sched.say_289
                       |                     |          
                       |                      --4.31%--caml_c_call
                       |                                |          
                       |                                 --4.29%--caml_make_vect
                       |          
                       |--15.41%--Sched.fun_589
                       |          |          
                       |           --15.12%--Sched.say_289
                       |                     |          
                       |                      --15.06%--caml_c_call
                       |                                |          
                       |                                 --15.00%--caml_make_vect
                       |                                           |          
                       |                                            --1.95%--caml_process_pending_actions
                       |                                                      |          
                       |                                                       --1.95%--caml_poll_gc_work
                       |                                                                 |          
                       |                                                                  --1.93%--caml_major_collection_slice
                       |                                                                            |          
                       |                                                                             --1.92%--major_collection_slice
                       |                                                                                       |          
                       |                                                                                        --1.38%--caml_sweep
                       |                                                                                                  |          
                       |                                                                                                   --0.66%--pool_sweep
                       |          
                       |--11.71%--Sched.fun_596
                       |          |          
                       |           --11.47%--Sched.say_289
                       |                     |          
                       |                      --11.44%--caml_c_call
                       |                                |          
                       |                                 --11.35%--caml_make_vect
                       |                                           |          
                       |                                            --2.12%--caml_process_pending_actions
                       |                                                      |          
                       |                                                       --2.08%--caml_poll_gc_work
                       |                                                                 caml_major_collection_slice
                       |                                                                 |          
                       |                                                                  --2.05%--major_collection_slice
                       |                                                                            |          
                       |                                                                             --1.56%--caml_sweep
                       |                                                                                       |          
                       |                                                                                        --0.62%--pool_sweep
                       |          
                       |--10.03%--Sched.fun_558
                       |          |          
                       |           --9.98%--Sched.say_289
                       |                     |          
                       |                      --9.96%--caml_c_call
                       |                                |          
                       |                                 --9.88%--caml_make_vect
                       |                                           |          
                       |                                            --0.99%--caml_process_pending_actions
                       |                                                      |          
                       |                                                       --0.98%--caml_poll_gc_work
                       |                                                                 |          
                       |                                                                  --0.95%--caml_major_collection_slice
                       |                                                                            |          
                       |                                                                             --0.93%--major_collection_slice
                       |          
                       |--5.84%--Sched.say_289
                       |          |          
                       |           --5.77%--caml_c_call
                       |                     |          
                       |                      --5.72%--caml_make_vect
                       |                                |          
                       |                                 --1.20%--caml_process_pending_actions
                       |                                           caml_poll_gc_work
                       |                                           |          
                       |                                            --1.17%--caml_major_collection_slice
                       |                                                      |          
                       |                                                       --1.15%--major_collection_slice
                       |                                                                 |          
                       |                                                                  --1.11%--caml_sweep
                       |                                                                            |          
                       |                                                                             --0.52%--pool_sweep
                       |          
                       |--4.50%--Sched.fun_563
                       |          |          
                       |           --4.49%--Sched.say_289
                       |                     |          
                       |                      --4.48%--caml_c_call
                       |                                |          
                       |                                 --4.44%--caml_make_vect
                       |          
                       |--4.35%--Sched.fun_605
                       |          |          
                       |           --4.34%--Sched.say_289
                       |                     |          
                       |                      --4.32%--caml_c_call
                       |                                |          
                       |                                 --4.27%--caml_make_vect
                       |          
                        --0.63%--Stdlib.Effect.match_with_363
                                  |          
                                   --0.51%--caml_c_call

    81.77%     0.45%            60  Domain0  sched_fp.exe      [.] caml_c_call
    81.24%     0.30%            40  Domain0  sched_fp.exe      [.] Sched.say_289
    81.13%    67.64%          8938  Domain0  sched_fp.exe      [.] caml_make_vect
    31.56%     0.45%            59  Domain0  sched_fp.exe      [.] caml_runstack
    15.53%     0.29%            39  Domain0  sched_fp.exe      [.] Sched.test_479
    15.48%     0.09%            12  Domain0  sched_fp.exe      [.] Sched.fun_589
    11.73%     0.04%             5  Domain0  sched_fp.exe      [.] Sched.fun_596
    11.26%     0.14%            19  Domain0  sched_fp.exe      [.] Sched.fun_610
    10.53%     0.00%             0  Domain0  [unknown]         [k] 0000000000000000
    10.05%     0.05%             7  Domain0  sched_fp.exe      [.] Sched.fun_558
     9.20%     5.72%           756  Domain0  libc.so.6         [.] _int_malloc
[...]

As explained by @kayceesrk, recording the execution with the frame-pointer unwind method (perf record -g --call-graph fp ./sched_fp.exe) also results in a much smaller profile file: 1.6 MB rather than the 106 MB produced by recording with DWARF (perf record -g --call-graph dwarf ./sched_fp.exe).
In this very specific example there was not much difference in perf execution time between fp and dwarf unwinding.

@sadiqj
Contributor

sadiqj commented Apr 5, 2022

In this very specific example there was not much difference in perf execution time between fp and dwarf unwiding.

perf record may have taken similar amounts of time but perf report should have taken much longer with dwarf since that's when the unwinding actually happens.

The other advantage frame pointers has is that you can walk the stack trace from within eBPF probes, something that you can't easily do with dwarf.

@gasche
Member

gasche commented Apr 5, 2022

But the problem with frame pointers remain that they are not enabled by default (for reasonable reasons), so they require changing to a different environment to profile, which is a pain, and I don't do it in practice. Of course I agree that it's nice to make it work for people that use it, not trying to criticize the work here, but for my actual usage I'm more interested in @sadiqj's LBR stuff. (I tried to get an AMD CPU on my last laptop for the fun of it, but failed, so I should be good; but with the rapid increase of Mac M1 machines maybe we should be nervous about getting comfy with Intel-only technology.)

I wish these battle-won profiling tricks were documented in a central place. (I think there is some profiling documentation on ocaml.org?) Especially now that the tricks are changing in 5.x, having them spread over many blog posts and internal documents is not very good. (Which reminded me of the distantly-related #11058.)

@gasche
Member

gasche commented Apr 5, 2022

(Yes, generating perf reports with call-graph=dwarf is slow, but I never was in a situation where it was slower and more cumbersome than maintaining two switches in parallel, or inconvenient enough that I would move to +fp as my main switch. I tend to be lucky to profile programs that run in at most a handful of seconds, and the report generation also only takes seconds.)

@xavierleroy
Contributor

@gasche: frame pointers are like 8-track tapes: obsolete 1970's technologies that some Americans really like and refuse to abandon. Not much we can do here. At least they are happy with an OPAM switch and don't demand frame pointer support by default. Count your blessings.

@kayceesrk
Contributor

@fabbing Thanks for the perf work. The PR is marked as draft. Is there still work left to do on this before we can start reviewing?

@fabbing
Contributor Author

fabbing commented Apr 6, 2022

Thanks KC!
I think the PR can now drop its draft status and start being reviewed, as I don't expect many changes to be necessary. But there are still a few points I would like to discuss:

  1. Stack alignment
  • Stack alignment should still be correct: the testsuite also passes with the debug runtime enabling the CHECK_STACK_ALIGNMENT macro which checks for correct stack alignment at every C_CALL.
    Moreover, the only place in amd64.S where the frame base pointer is actually pushed onto the C stack, and not onto an OCaml one, is for caml_raise_exception, and the stack pointer is discarded soon after.

  • But do function calls on the OCaml stacks (fibers) also have to be aligned?
    I don't think anything is currently being done to check that they are correctly aligned, and adding one word at each function prologue (in amd64.S) will have shifted everything!

  2. Frame pointers at caml_start_program
    The caml_start_program function does not currently push its base pointer during the prologue.
    It's something that wasn't done before, for example in 4.14

    FUNCTION(G(caml_start_program))

    Adding a frame pointer here would actually push rbp onto the C stack and would require another word to enforce alignment; it would also impact the caml_callback*_asm functions.
    What is the good reason this wasn't done before? And, as a corollary, should caml_start_program do this in 5.00?

  3. Saved rbp at main is not NULL
    The fp_backtrace function used for the frame-pointers tests is quite convoluted, because the saved rbp at main is not NULL as I naively expected (it's often a small value around 0x1~0xf).
    So a lot of precautions are taken to avoid crashing the test by dereferencing an invalid pointer.
    The rewrite_frame_pointers function also carefully avoids this, but more elegantly, by testing whether it's the top of the current fiber.

  4. Tests
    How many more tests checking the consistency of frame pointers do we want?
    I think at least one involving effects would be welcome.

@fabbing fabbing marked this pull request as ready for review April 6, 2022 10:28
fabbing added a commit to fabbing/ocaml that referenced this pull request Apr 22, 2022
@fabbing
Contributor Author

fabbing commented Apr 22, 2022

The added tests proved useful, since they revealed an oversight in the case of fiber reallocation with effects!
When a computation fiber is spun up, caml_runstack saves a base pointer pointing into the "effect handling" fiber. If the computation performs an effect, the work done by the "effect handling" fiber may require the reallocation of its stack. The rewrite_frame_pointers function kicks in but is unable to update this saved rbp, because it walks the stack using the rbp linked list, and at that point the computation fiber is no longer part of it.
The solution found, thanks to @ctk21, is to make caml_resume overwrite the saved rbp in caml_runstack when a computation fiber is continue-d, in case the "effect handling" fiber has been reallocated.

@xavierleroy
Contributor

Stack alignment [...]
But does function call on the OCaml stacks (fibers) also have to be aligned?
I don't think anything is currently being done to check they are correctly aligned and adding one word at each function prologue (in amd64.S) will have shifted everything!

Interesting observation, thanks! The only reason to keep the stack 16-aligned on x86-64 is to support the use of SSE instructions that demand 16-byte alignment. C compilers like to produce these instructions even for standard library functions like printf, so the C stack must be kept 16-aligned. In OCaml 4 this required OCaml stack frames to be 16-aligned. But in OCaml 5, the OCaml stacks should be fine with 8-alignment, and we can save stack space by aligning OCaml frames to 8 bytes, not 16!

@xavierleroy
Contributor

Apart from that: this PR needs to be thoroughly reviewed, and I'm not volunteering because I don't care about frame pointer support. I won't object to having them supported again in OCaml 5, but core developers who are actually interested in the feature should do the review.

@kayceesrk kayceesrk self-assigned this May 24, 2022
@let-def let-def self-assigned this May 30, 2022
@vsiles

vsiles commented Jun 10, 2022

For what it is worth, we have a couple of tools at Meta (perf analysis, flame graphs, ...) that still rely on the frame pointer. Having it around for a few releases would be of great help.

@let-def
Contributor

let-def commented Jun 27, 2022

I have reviewed most of the code and I am satisfied with it.

That being said, the representation of stack headers (defined in fiber.h) is still a bit mysterious to me. What I read made sense to me, though I cannot guarantee that it is correct. Here are some notes on the cases I found tricky:

  • with effects, we can have a linked-list of stack fragments/fibers
  • the first frame (the oldest) of each fragment has a pointer to the last frame of the parent fragment
  • a fragment can be reallocated, which causes its address to change: in this situation we need to update all frames inside this fragment (but never step outside, the parent pointer can be stale)
  • a fragment can be suspended and its parent resumed: in this situation, the frames in the parent can change, and the parent fragment itself can be reallocated or even deallocated

Because of this last case, the first frame pointer in a fragment can only be trusted if this fragment is "active". When suspended, we cannot make any assumption on the first frame pointer. That's why this pointer is updated every time we switch stack (https://github.com/ocaml/ocaml/pull/11144/files#diff-532fb04aef00f67a2b487d213655f8846c69ed0bcc92d5f0c8fa37e246aeee3fR281): this reattaches the fragment to the active stack. All tools that make use of frame pointers only scan the current stack; they should never look at a suspended fragment, so overwriting on stack switch is enough.

@fabbing
Contributor Author

fabbing commented Jun 28, 2022

Thanks a lot for having a look @let-def !

That being said, the representation of stack headers (defined in fiber.h) is still a bit mysterious to me. What I read made sense to me, though I cannot guarantee that it is correct. Here are some notes on the cases I found tricky:

  • with effects, we can have a linked-list of stack fragments/fibers
  • the first frame (the oldest) of each fragment has a pointer to the last frame of parent fragment
  • a fragment can be reallocated, which causes its address to change: in this situation we need to update all frames inside this fragment (but never step outside, the parent pointer can be stale)
  • a fragment can be suspended and its parent resumed: in this situation, the frames in the parent can change, and the parent fragment itself can be reallocated or even deallocated

Because of this last case, the first frame pointer in a fragment can only be trusted if this fragment is "active". When suspended, we cannot make any assumption on the first frame pointer. That's why this pointer is updated every time we switch stack (https://github.com/ocaml/ocaml/pull/11144/files#diff-532fb04aef00f67a2b487d213655f8846c69ed0bcc92d5f0c8fa37e246aeee3fR281): this reattaches the fragment to the active stack. All tools that make use of frame pointers only scan the current stack; they should never look at a suspended fragment, so overwriting on stack switch is enough.

That's right indeed; this was the last issue discovered, and hopefully it's now handled correctly, as validated by the tests stack_reallocations.ml and stack_reallocations2.ml.

@let-def
Contributor

let-def commented Jun 28, 2022

Yes. If that was not clear in my first message: I think this PR is correct and ready for merge, but I don't feel 100% confident on these tricky parts.

@gasche
Member

gasche commented Jun 29, 2022

@let-def are you planning to Approve this PR?

Do you trust that there is no regression risk for people who do not use the feature? My impression from skimming the assembly changes is "yes", but I haven't reviewed the PR myself.

If there is opposition to the change, or if some people would like to have the time to review before this might be merged, now is the time to say so.

@kayceesrk
Contributor

@gasche

I discovered a bug in frame pointer handling with effect handlers specifically in the place where saved rbp is overwritten in amd64.S. I've had a chat with @fabbing about this, and we have worked out a fix. Please hold off merging until the fix is in.

@fabbing
Contributor Author

fabbing commented Jun 30, 2022

@kayceesrk actually found two cases that weren't handled correctly:

  • At caml_resume, the oldest saved rbp (the highest one in the stack) was updated in the first fiber of the continuation, when it should instead be updated in the last fiber
  • caml_reperform must also update the oldest saved rbp, in case the fiber is reallocated while searching for the matching effect handler

We think we have solved both issues, and we also added to the testsuite another test, designed by @kayceesrk, that triggers both issues.

@gasche
Member

gasche commented Jun 30, 2022

Question: is the documentation produced by @let-def in #11144 (comment) integrated in the code as relevant, as a comment? If not, then that could be a useful addition to the PR, right?

runtime/amd64.S Outdated
@@ -849,6 +907,9 @@ CFI_STARTPROC
movq (Stack_handler-1)(%rdi), %r10
movq %rsi, Handler_parent(%r10) /* Append to last_fiber */
leaq 1(%rsi), %rdi /* %rdi (last_fiber) := Val_ptr(old stack) */
/* Need to update the olderst savde frame pointer here as the execution of
Contributor

Suggested change
/* Need to update the olderst savde frame pointer here as the execution of
/* Need to update the oldest saved frame pointer here as the execution of

@kayceesrk
Contributor

There are code comments in amd64.S that mention this [1]. Perhaps they can be expanded with information from #11144 (comment) @fabbing?

[1] https://github.com/ocaml/ocaml/pull/11144/files#diff-532fb04aef00f67a2b487d213655f8846c69ed0bcc92d5f0c8fa37e246aeee3fR910-R936

@gasche gasche merged commit 35c15e0 into ocaml:trunk Jun 30, 2022
@shindere
Contributor

shindere commented Oct 11, 2022 via email
