runtime: framepointer-based stack unwinding omits memmove's caller #58835
It looks like the immediate caller of
What version of Go are you using (
I don't think so.
Do you know if C memmove saves the frame pointer (I guess it depends on the implementation) and whether
It's not fixed by CL 466316 (details below).
For what it's worth, I came across this while asking
This would affect a question @aclements had elsewhere recently about whether, when using
I don't know what performance is saved by not saving the frame pointer, but I want to make the point that the costs of various forms of work that take place in those frameless leaf functions can add up.
I don't know my way around stack management, so maybe this is a nonsense suggestion: is there room for a middle-ground where the caller carves out an extra word of stack space (as it would for an additional
Maybe the heuristics for this could be extended to take the "complexity" of the leaf function into account, similar to how it is done for inlining? Not sure how feasible this is, but looking at memmove's implementation, I suspect it wouldn't cause a noticeable perf hit if it pushed a frame pointer. But I haven't measured that 😅.
Not sure if this could work. I think it would cause the positions of the return addr and the frame pointer to be reversed on the stack. Frame pointer unwinders rely on the relative position of these two values on the stack.
@rhysh quick question: Have you tried using the DWARF unwinder in perf for this?
Thanks for the suggestion, but I think we'd want to keep the assembler simple and avoid surprises (e.g. it doesn't save the frame pointer for a 50-instruction function, but suddenly does for a 51-instruction function?). If you think it is helpful, it wouldn't be hard to manually save the frame pointer, or declare a nonzero frame size (if this comes up often, we can think about adding a new text flag, e.g. HAVEFRAME, to force saving the frame pointer).
Agree with @cherrymui, frameless leaf functions should remain frameless, for performance and code size. AFAIK,
Indeed @cherrymui, changing the end of the TEXT directive from
What sort of data would motivate accepting this change to
I haven't used that since
Here's part of what I get from
And with @cherrymui's suggested workaround and
go run omits DWARF info, but I'm not sure about
Either way, I'm +1 on your proposed change to memmove if it doesn't cause noticeable performance regressions.
If benchmark results show minimal performance impact, and it helps debugging/profiling, I think it would be acceptable.
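One way to gather such numbers is a small copy benchmark, sketched here under some assumptions: `copy` on a `[]byte` of non-constant length is expected to reach the runtime's memmove, the helper `benchCopy` and the chosen sizes are hypothetical, and a real comparison would run it against toolchains built with and without the frame-pointer change.

```go
package main

import (
	"fmt"
	"testing"
)

// benchCopy measures copy() between two byte slices of the given size.
// The size is a runtime value so the copy is not specialized for a constant length.
func benchCopy(size int) testing.BenchmarkResult {
	src := make([]byte, size)
	dst := make([]byte, size)
	return testing.Benchmark(func(b *testing.B) {
		b.SetBytes(int64(size)) // lets the result report throughput in MB/s
		for i := 0; i < b.N; i++ {
			copy(dst, src)
		}
	})
}

func main() {
	// Small sizes are where a fixed per-call prologue cost would show up most.
	for _, size := range []int{16, 1024, 64 << 10} {
		fmt.Printf("copy %6d bytes: %s\n", size, benchCopy(size))
	}
}
```

Small sizes matter most here, because any extra prologue and epilogue instructions are a fixed per-call cost that large copies amortize. Repeating the runs and comparing them with a tool such as benchstat would help separate a real regression from noise.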