runtime: rewrite gentraceback as an iterator API #54466

Closed
aclements opened this issue Aug 15, 2022 · 37 comments
Labels: compiler/runtime (Issues related to the Go compiler and/or runtime), FeatureRequest, FrozenDueToAge

@aclements
Member

Currently, all stack walking logic is in one venerable, large, and very, very complicated function: runtime.gentraceback. This function has three distinct operating modes: printing, populating a PC buffer, or invoking a callback. And it has three different modes of unwinding: physical Go frames, inlined Go frames, and cgo frames. It also has several flags. All of this logic is very interwoven.

I would like to replace all of this with a caller-driven iterator-style interface. This is a tracking issue for that change.

An iterator API will consolidate the logic for unwinding and allow us to lift out printing and pcbuf populating into separate code, while replacing the callback mode with direct use of the new API. It will allow us to better layer the different modes of unwinding by creating separate iterator types for physical, inlined, and cgo frames, while keeping the interface ergonomic. This is also a good opportunity to generally clean up this code.

As a follow-on, I plan to dramatically simplify the defer implementation. Regabi enabled many simplifications to defer and we've implemented many of them already, but there are more aggressive simplifications we haven't tackled yet. Part of this is simplifying open-coded defers, and doing that efficiently requires being able to simultaneously walk the stack frames and the defer stack. An iterator API will make this much easier to do.
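To illustrate why walking the stack frames and the defer stack together favors caller-driven iterators, here is a minimal sketch. Everything in it is hypothetical: `frame`, `deferRecord`, `frameIter`, `deferIter`, and `pairDefers` are toy stand-ins rather than runtime types, and matching a defer to its frame by comparing `sp` values is an assumption made for the example.

```go
package main

import "fmt"

// frame and deferRecord are toy stand-ins, not the runtime's real types.
type frame struct {
	name string
	sp   uintptr // identifies the frame; grows as we walk toward callers
}

type deferRecord struct {
	fn string
	sp uintptr // sp of the frame that registered this defer (assumption)
}

// frameIter walks frames from innermost to outermost.
type frameIter struct {
	frames []frame
	i      int
}

func (it *frameIter) valid() bool  { return it.i < len(it.frames) }
func (it *frameIter) next()        { it.i++ }
func (it *frameIter) frame() frame { return it.frames[it.i] }

// deferIter walks defer records from most recent to oldest.
type deferIter struct {
	defers []deferRecord
	i      int
}

func (it *deferIter) valid() bool      { return it.i < len(it.defers) }
func (it *deferIter) next()            { it.i++ }
func (it *deferIter) rec() deferRecord { return it.defers[it.i] }

// pairDefers advances both iterators in a single pass, attaching each
// defer record to the frame whose sp it matches. The caller decides when
// each iterator moves, which a callback-driven walk cannot easily offer.
func pairDefers(frames []frame, defers []deferRecord) []string {
	var out []string
	fi := frameIter{frames: frames}
	di := deferIter{defers: defers}
	for ; fi.valid(); fi.next() {
		f := fi.frame()
		for di.valid() && di.rec().sp == f.sp {
			out = append(out, f.name+" defers "+di.rec().fn)
			di.next()
		}
	}
	return out
}

func main() {
	frames := []frame{{"leaf", 0x100}, {"mid", 0x200}, {"root", 0x300}}
	defers := []deferRecord{{"unlock", 0x100}, {"close", 0x300}, {"log", 0x300}}
	for _, s := range pairDefers(frames, defers) {
		fmt.Println(s)
	}
}
```

With a callback interface, one of the two walks would have to be buffered or restarted for every frame; here each iterator simply stops where the other one needs it to.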

An alternative approach would be to use a callback interface rather than an iterator. This would be an improvement over the status quo and also be a simpler change, but I think it has two drawbacks:

1. It makes layering physical/inlined frame unwinding more awkward because you need multiple levels of callbacks.
2. It's poorly suited to parallel iteration like we need for the open-coded defer implementation.

Long term, an iterator API probably makes it simpler to scan a goroutine stack while the goroutine is still running (reducing per-goroutine latency for goroutines with large stacks) because we can easily pause and resume unwinding, while a callback API doesn't easily afford this opportunity.
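The layering and pause/resume points can be sketched with two small iterator types. This is only an illustration under stated assumptions: `physFrame`, `physIter`, `inlineIter`, and `walk` are hypothetical names invented here, and the toy data model (a physical frame carrying a list of inlined function names) is not how the runtime stores inlining information.

```go
package main

import "fmt"

// physFrame is a toy physical frame: one real function plus the functions
// inlined into it at the current PC, innermost first.
type physFrame struct {
	fn      string
	inlined []string
}

// physIter is a hypothetical iterator over physical frames.
type physIter struct {
	stack []physFrame
	i     int
}

func (p *physIter) valid() bool       { return p.i < len(p.stack) }
func (p *physIter) next()             { p.i++ }
func (p *physIter) frame() *physFrame { return &p.stack[p.i] }

// inlineIter is a separate layer that expands one physical frame into its
// logical frames: inlined callees first, then the physical function itself.
type inlineIter struct {
	f *physFrame
	i int
}

func (n *inlineIter) valid() bool { return n.i <= len(n.f.inlined) }
func (n *inlineIter) next()       { n.i++ }
func (n *inlineIter) name() string {
	if n.i < len(n.f.inlined) {
		return n.f.inlined[n.i]
	}
	return n.f.fn
}

// walk expands up to maxPhys physical frames into logical frame names.
// All state lives in p, so a later call resumes where this one stopped.
func walk(p *physIter, maxPhys int) []string {
	var names []string
	for n := 0; p.valid() && n < maxPhys; p.next() {
		in := inlineIter{f: p.frame()}
		for ; in.valid(); in.next() {
			names = append(names, in.name())
		}
		n++
	}
	return names
}

func main() {
	p := &physIter{stack: []physFrame{
		{fn: "main.work", inlined: []string{"main.helper"}},
		{fn: "main.run"},
		{fn: "main.main"},
	}}
	fmt.Println(walk(p, 1))  // walk one physical frame, then pause
	fmt.Println(walk(p, 10)) // resume and finish the stack
}
```

The nested loops replace what would otherwise be a callback invoking another callback, and pausing is just returning from `walk` while keeping `p` around.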

@aclements aclements self-assigned this Aug 15, 2022
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Aug 15, 2022
@randall77
Contributor

Keep in mind #7181, where we want to print bottom and top of deep stacks. It probably requires walking the stack twice, or keeping a buffer of 50 frames, or something like that.
I think it would not be a problem with this plan, just FYI.

@gopherbot
Contributor

Change https://go.dev/cl/424254 mentions this issue: runtime: drop function context from traceback

@gopherbot
Contributor

Change https://go.dev/cl/424255 mentions this issue: runtime: drop redundant argument to getArgInfo

@gopherbot
Contributor

Change https://go.dev/cl/424257 mentions this issue: runtime: switch gp when jumping stacks during traceback

@aclements
Member Author

Keep in mind #7181, where we want to print bottom and top of deep stacks.

I think this will actually make that dramatically easier by inverting the flow of traceback printing. With this change, the printer will drive the stack walk instead of the other way around, so I think it will be much easier for the printer to keep the buffer it needs, and to do so without adding complexity around the stack walk itself.
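A rough sketch of a printer that drives the walk and keeps its own buffer might look like the following. The names `frameIter` and `formatTrace`, the string-based frames, and the wording of the elision line are all assumptions for illustration; none of them come from the runtime.

```go
package main

import "fmt"

// frameIter is a hypothetical caller-driven iterator over frame names,
// innermost frame first. It stands in for a real stack unwinder.
type frameIter struct {
	frames []string
	i      int
}

func (it *frameIter) valid() bool  { return it.i < len(it.frames) }
func (it *frameIter) next()        { it.i++ }
func (it *frameIter) name() string { return it.frames[it.i] }

// formatTrace emits the first keep frames directly, then runs the rest of
// the walk through a ring buffer so only the last keep frames survive.
// The buffer belongs to the printer; the iterator knows nothing about it.
func formatTrace(it *frameIter, keep int) []string {
	if keep < 1 {
		keep = 1 // avoid a zero-sized ring
	}
	var out []string
	for n := 0; it.valid() && n < keep; it.next() {
		out = append(out, it.name())
		n++
	}
	ring := make([]string, keep)
	tail := 0 // frames seen after the first keep
	for ; it.valid(); it.next() {
		ring[tail%keep] = it.name()
		tail++
	}
	start, count := 0, tail
	if tail > keep {
		out = append(out, fmt.Sprintf("...%d frames elided...", tail-keep))
		start, count = tail%keep, keep
	}
	// Emit the surviving ring entries, oldest first.
	for j := 0; j < count; j++ {
		out = append(out, ring[(start+j)%keep])
	}
	return out
}

func main() {
	it := &frameIter{frames: []string{"f0", "f1", "f2", "f3", "f4", "f5", "f6"}}
	for _, line := range formatTrace(it, 2) {
		fmt.Println(line)
	}
}
```

A single pass suffices because the printer decides what to retain as frames arrive, so the stack does not need to be walked twice.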

@gopherbot
Contributor

Change https://go.dev/cl/424514 mentions this issue: runtime: replace stkframe.arglen/argmap with methods

@gopherbot
Contributor

Change https://go.dev/cl/424516 mentions this issue: runtime: consolidate stkframe and its methods into stkframe.go

@gopherbot
Contributor

Change https://go.dev/cl/424515 mentions this issue: runtime: make getStackMap a method of stkframe

@mknyszek mknyszek added this to the Go1.20 milestone Aug 17, 2022
@gopherbot
Contributor

Change https://go.dev/cl/425936 mentions this issue: runtime: simplify stkframe.argMapInternal

gopherbot pushed a commit that referenced this issue Sep 2, 2022
Currently, gentraceback tracks the closure context of the outermost
frame. This used to be important for "unstarted" calls to reflect
function stubs, where "unstarted" calls are either deferred functions
or the entry-point of a goroutine that hasn't run. Because reflect
function stubs have a dynamic argument map, we have to reach into
their closure context to fetch the map, and how to do this differs
depending on whether the function has started. This was discovered in
issue #25897.

However, as part of the register ABI, "go" and "defer" were made much
simpler, and any "go" or "defer" of a function that takes arguments or
returns results gets wrapped in a closure that provides those
arguments (and/or discards the results). Hence, we'll see that closure
instead of a direct call to a reflect stub, and can get its static
argument map without any trouble.

The one case where we may still see an unstarted reflect stub is if
the function takes no arguments and has no results, in which case the
compiler can optimize away the wrapper closure. But in this case we
know the argument map is empty: the compiler can apply this
optimization precisely because the target function has no argument
frame.

As a result, we no longer need to track the closure context during
traceback, so this CL drops all of that mechanism.

We still have to be careful about the unstarted case because we can't
reach into the function's locals frame to pull out its context
(because it has no locals frame). We double-check that in this case
we're at the function entry.

I would prefer to do this with some in-code PCDATA annotations of
where to find the dynamic argument map, but that's a lot of mechanism
to introduce for just this. It might make sense to consider this along
with #53609.

Finally, we beef up the test for this so it more reliably forces the
runtime down this path. It's fundamentally probabilistic, but this
tweak makes it better. Scheduler testing hooks (#54475) would make it
possible to write a reliable test for this.

For #54466, but it's a nice clean-up all on its own.

Change-Id: I16e4f2364ba2ea4b1fec1e27f971b06756e7b09f
Reviewed-on: https://go-review.googlesource.com/c/go/+/424254
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Auto-Submit: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
gopherbot pushed a commit that referenced this issue Sep 2, 2022
The f funcInfo argument is always the same as frame.fn, so we don't
need to pass it. I suspect that was there to make the signatures of
getArgInfoFast and getArgInfo more similar, but it's not necessary.

For #54466.

Change-Id: Idc717f4df09e97cad49d52c5b7edf28090908cba
Reviewed-on: https://go-review.googlesource.com/c/go/+/424255
Run-TryBot: Austin Clements <austin@google.com>
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Sep 2, 2022
Currently, when traceback jumps from the system stack to a user stack
(e.g., during profiling tracebacks), it leaves gp pointing at the g0.
This is currently harmless since it's only used during profiling, so
the code paths in gentraceback that care about gp aren't used, but
it's really confusing and would certainly break if _TraceJumpStack
were ever used in a context other than profiling.

Fix this by updating gp to point to the user g when we switch stacks.

For #54466.

Change-Id: I1541e004667a52e37671803ce45c91d8c5308830
Reviewed-on: https://go-review.googlesource.com/c/go/+/424257
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
gopherbot pushed a commit that referenced this issue Sep 2, 2022
Currently, stkframe.arglen and stkframe.argmap are populated by
gentraceback under a particular set of circumstances. But because they
can be constructed from other fields in stkframe, they don't need to
be computed eagerly at all. They're also rather misleading, as they're
only part of computing the actual argument map and most callers should
be using getStackMap, which does the rest of the work.

This CL drops these fields from stkframe. It shifts the functions that
used to compute them, getArgInfoFast and getArgInfo, into
corresponding methods stkframe.argBytes and stkframe.argMapInternal.
argBytes is expected to be used by callers that need to know only the
argument frame size, while argMapInternal is used only by argBytes and
getStackMap.

We also move some of the logic from getStackMap into argMapInternal
because the previous split of responsibilities didn't make much sense.
This lets us return just a bitvector from argMapInternal, rather than
both a bitvector, which carries a size, and an "actually use this
size".

The getArgInfoFast function was inlined before (and inl_test checked
this). We drop that requirement from stkframe.argBytes because the
uses of this have shifted and now it's only called from heap dumping
(which never happens) and conservative stack frame scanning (which
very, very rarely happens).

There will be a few follow-up clean-up CLs.

For #54466. This is a nice clean-up on its own, but it also serves to
remove pointers from the traceback state that would eventually become
troublesome write barriers once we stack-rip gentraceback.

Change-Id: I107f98ed8e7b00185c081de425bbf24af02a4163
Reviewed-on: https://go-review.googlesource.com/c/go/+/424514
Run-TryBot: Austin Clements <austin@google.com>
Auto-Submit: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Sep 2, 2022
This places getStackMap alongside argBytes and argMapInternal as
another method of stkframe.

For #54466, albeit rather indirectly.

Change-Id: I411dda3605dd7f996983706afcbefddf29a68a85
Reviewed-on: https://go-review.googlesource.com/c/go/+/424515
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Run-TryBot: Austin Clements <austin@google.com>
Auto-Submit: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Sep 2, 2022
The stkframe struct and its methods are strewn across different source
files. Since they actually have a pretty coherent theme at this point,
migrate it all into a new file, stkframe.go. There are no code changes
in this CL.

For #54466, albeit rather indirectly.

Change-Id: Ibe53fc4b1106d131005e1c9d491be838a8f14211
Reviewed-on: https://go-review.googlesource.com/c/go/+/424516
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Auto-Submit: Austin Clements <austin@google.com>
gopherbot pushed a commit that referenced this issue Sep 2, 2022
Use an early return to reduce indentation and clarify flow.

For #54466.

Change-Id: I12ce810bea0f22b8707a175dc5ba66241c0a9a21
Reviewed-on: https://go-review.googlesource.com/c/go/+/425936
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Auto-Submit: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@aclements
Member Author

As a follow-on, I plan to dramatically simplify the defer implementation.

Note to self: simplifying defer using an iterator-style gentraceback would also put us in a much better position to fix panic/recover causing bad stacks and memory leaks in the race detector (#26813, #37233).

@aclements
Member Author

Note to self: an iterator-style traceback would make it much easier to eliminate the limit on CPU profile stack frames (#56029).

@felixge
Contributor

felixge commented Dec 6, 2022

I hacked on this a bit last week, see: felixge#4

It's still far away from being mergeable, and perhaps the WIP code is even worse than the initial code right now. I was trying to keep the current behavior exactly as is (bug-for-bug compatibility), but this led to some gnarly hacks around write barriers that can probably be removed again in the end.

Anyway, I still managed to make quite a bit of progress towards inverting the loop control by taking many small steps while passing the test suite for each commit. So if anybody wants to take a look and leave some high-level comments (e.g., on whether a state machine inside the iterator is overkill), I'd be grateful.

I don't know when I'll find time to continue this work (or if I should), but the next step would be to finish moving the pcbuf filling into gentraceback2 while also dealing with the remaining for loop that handles cgo traceback. After that it should be fairly easy to also extract the printing use case.

@aclements
Member Author

@felixge, I just mailed out the work-in-progress CL I got to a few months ago before this got bumped by higher priorities. It's a pretty different approach from the one you took, though I ran into the same issues with write barriers, which I was in the process of engineering around when I put the work on hold.

@gopherbot
Contributor

Change https://go.dev/cl/468298 mentions this issue: runtime: simplify traceback PC back-up logic

@gopherbot
Contributor

Change https://go.dev/cl/468300 mentions this issue: runtime: new API for filling PC traceback buffers

@gopherbot
Contributor

Change https://go.dev/cl/468299 mentions this issue: runtime: move cgo traceback into unwinder

@gopherbot
Contributor

Change https://go.dev/cl/468301 mentions this issue: runtime: delete gentraceback

@gopherbot
Contributor

Change https://go.dev/cl/468297 mentions this issue: runtime: replace all callback uses of gentraceback with unwinder

@gopherbot
Contributor

Change https://go.dev/cl/468296 mentions this issue: runtime: make unsafe.Slice usable from nowritebarrierrec

@gopherbot
Contributor

Change https://go.dev/cl/472956 mentions this issue: runtime: add a benchmark of Callers

@aclements
Member Author

I've been digging more into testing tracebacks. I came up with a different approach that seems quite promising for testing the really dark corner cases: set your own single-step flag just before doing something we want to test, and in the SIGTRAP handler keep taking tracebacks and single-stepping until we get what we're looking for or return from the test situation. I have a rough prototype of this on x86 that works on Linux and may trivially work on other Unixes, and I suspect would work without much trouble on Windows. I'm pretty sure I can do it on ARM64, too, but haven't written the code. I'm not sure about other architectures.

However, I also looked at the coverage of the traceback code from the runtime and runtime/pprof tests and it's actually pretty good. Between that and @prattmic's assessment that the code change actually isn't too bad, I'm no longer worried that significantly beefing up the tests is a prerequisite. Better testing is always nice, of course.

gopherbot pushed a commit that referenced this issue Mar 10, 2023
We're about to make major changes to tracebacks. We have benchmarks of
stack copying, but not of PC buffer filling, so add some that we can
track through these changes.

For #54466.

Change-Id: I3ed61d75144ba03b61517cd9834eeb71c99d74df
Reviewed-on: https://go-review.googlesource.com/c/go/+/472956
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
gopherbot pushed a commit that referenced this issue Mar 10, 2023
Currently, gentraceback keeps a copy of the stack bounds of the stack
it's walking in the "stack" variable. Now that "gp" always refers to
the G whose stack it's walking, we can simply use gp.stack instead of
keeping a separate copy.

For #54466.

Change-Id: I68256e5dff6212cfcf14eda615487e66a92d4914
Reviewed-on: https://go-review.googlesource.com/c/go/+/458215
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Mar 10, 2023
gentraceback also tracks the funcID of the callee, which is more
general. Fix this up to happen in all cases and eliminate waspanic in
favor of checking the funcID of the caller.

For #54466.

Change-Id: Idc98365a6f05022db18ddcd5b3ed8684a6872a88
Reviewed-on: https://go-review.googlesource.com/c/go/+/458216
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Mar 10, 2023
Currently, gentraceback resolves the funcInfo of the caller prior to
processing the current frame (calling the callback, printing it, etc).
As a result, if this lookup fails in a verbose context, it will print
the failure before printing the frame that it's already resolved.

To fix this, move the resolution of LR to a funcInfo to after current
frame processing.

This also has the advantage that we can reduce the scope of "flr" (the
caller's funcInfo) to only the post-frame part of the loop, which will
make it easier to stack-rip gentraceback into an iterator.

For #54466.

Change-Id: I8be44d4eac598a686c32936ab37018b8aa97c00b
Reviewed-on: https://go-review.googlesource.com/c/go/+/458217
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
gopherbot pushed a commit that referenced this issue Mar 10, 2023
For #54466.

Change-Id: I4d8e1953703b6c763e5bd53024da43efcc993489
Reviewed-on: https://go-review.googlesource.com/c/go/+/466095
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Austin Clements <austin@google.com>
gopherbot pushed a commit that referenced this issue Mar 10, 2023
We've replicated the code to expand inlined frames in many places in
the runtime at this point. This CL adds a simple iterator API that
abstracts this out.

We also use this to try out a new idea for structuring tests of
runtime internals: rather than exporting this whole internal data type
and API, we write the test in package runtime and import the few bits
of std we need. The idea is that, for tests of internals, it's easier
to inject public APIs from std than it is to export non-public APIs
from runtime. This is discussed more in #55108.

For #54466.

Change-Id: Iebccc04ff59a1509694a8ac0e0d3984e49121339
Reviewed-on: https://go-review.googlesource.com/c/go/+/466096
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Austin Clements <austin@google.com>
gopherbot pushed a commit that referenced this issue Mar 10, 2023
Since srcFunc can represent information for either a real text
function or an inlined function, this means we no longer have to
synthesize a fake _func just to call showframe on an inlined frame.

This is cleaner and also eliminates the one case where _func values
live in the heap. This will let us mark them NotInHeap, which will in
turn eliminate pesky write barriers in the traceback rewrite.

For #54466.

Change-Id: Ibf5e24d01ee4bf384c825e1a4e2922ef444a438e
Reviewed-on: https://go-review.googlesource.com/c/go/+/466097
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Mar 10, 2023
We're about to rewrite this code and it has almost no test coverage
right now.

This test is also more complete than the existing
TestTracebackInlineExcluded, so we delete that test.

For #54466.

Change-Id: I144154282dac5eb3798f7d332b806f44c4a0bdf6
Reviewed-on: https://go-review.googlesource.com/c/go/+/466098
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
gopherbot pushed a commit that referenced this issue Mar 10, 2023
This converts all places in the runtime that perform inline expansion
to use the new inlineUnwinder abstraction.

For #54466.

Change-Id: I48d996fb6263ed5225bd21d30914a27ae434528d
Reviewed-on: https://go-review.googlesource.com/c/go/+/466099
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
gopherbot pushed a commit that referenced this issue Mar 10, 2023
Currently, gentraceback consumes the gp.cgoCtxt slice by copying the
slice header and then sub-slicing it as it unwinds. The code for this
is nice and clear, but we're about to lift this state into a structure
and mutating it is going to introduce write barriers that are
disallowed in gentraceback.

This CL replaces the mutable slice header with an index into
gp.cgoCtxt.

For #54466.

Change-Id: I6b701bb67d657290a784baaca34ed02d8247ede2
Reviewed-on: https://go-review.googlesource.com/c/go/+/466863
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
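
The slice-header-to-index change in the commit above can be illustrated roughly as follows. This is a sketch under assumptions: `consumeBySlicing`, `cursor`, and `newCursor` are hypothetical names, and consuming contexts from the end of the slice is assumed for the example rather than taken from the commit message. The point is that a slice header contains a pointer, so storing an updated header into a struct is a pointer write, whereas updating an integer index is not.

```go
package main

import "fmt"

// consumeBySlicing walks ctxts from the end by repeatedly re-slicing a
// copy of the slice header. Each re-slice rewrites a header that holds
// a pointer.
func consumeBySlicing(ctxts []uintptr) []uintptr {
	var seen []uintptr
	rest := ctxts // copy of the slice header; mutated below
	for len(rest) > 0 {
		seen = append(seen, rest[len(rest)-1])
		rest = rest[:len(rest)-1]
	}
	return seen
}

// cursor keeps only an integer position, so advancing it never stores a
// pointer. The slice itself is passed in on every call.
type cursor struct {
	idx int // index of the next context to consume; negative when done
}

func newCursor(ctxts []uintptr) cursor { return cursor{idx: len(ctxts) - 1} }

// next returns the next context and true, or 0 and false when exhausted.
func (c *cursor) next(ctxts []uintptr) (uintptr, bool) {
	if c.idx < 0 {
		return 0, false
	}
	v := ctxts[c.idx]
	c.idx--
	return v, true
}

func main() {
	ctxts := []uintptr{0xa, 0xb, 0xc}
	fmt.Println(consumeBySlicing(ctxts))

	c := newCursor(ctxts)
	for v, ok := c.next(ctxts); ok; v, ok = c.next(ctxts) {
		fmt.Println(v)
	}
}
```

Both versions visit the same elements in the same order; only the state that has to live in the iterator structure differs.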
gopherbot pushed a commit that referenced this issue Mar 10, 2023
Many compiler-generated panics are dynamically changed to a "throw"
when they happen in the runtime. One effect of this is that they are
allowed in nowritebarrierrec contexts. Currently, the unsafe.Slice
panics don't have this treatment.

We're about to expose more code that uses unsafe.Slice to the write
barrier checker (it's actually already there and it just can't see
through an indirect call), so give these panics the dynamic check.

Very indirectly updates #54466.

Change-Id: I65cb96fa17eb751041e4fa25a1c1bd03246c82ba
Reviewed-on: https://go-review.googlesource.com/c/go/+/468296
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
gopherbot pushed a commit that referenced this issue Mar 10, 2023
This is a really nice simplification for all of these call sites.

It also achieves a nice performance improvement for stack copying:

goos: linux
goarch: amd64
pkg: runtime
cpu: Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
                       │   before    │                after                │
                       │   sec/op    │   sec/op     vs base                │
StackCopyPtr-48          89.25m ± 1%   79.78m ± 1%  -10.62% (p=0.000 n=20)
StackCopy-48             83.48m ± 2%   71.88m ± 1%  -13.90% (p=0.000 n=20)
StackCopyNoCache-48      2.504m ± 2%   2.195m ± 1%  -12.32% (p=0.000 n=20)
StackCopyWithStkobj-48   21.66m ± 1%   21.02m ± 2%   -2.95% (p=0.000 n=20)
geomean                  25.21m        22.68m       -10.04%

Updates #54466.

Change-Id: I31715b7b6efd65726940041d3052bb1c0a1186f3
Reviewed-on: https://go-review.googlesource.com/c/go/+/468297
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
gopherbot pushed a commit that referenced this issue Mar 10, 2023
Updates #54466.

Change-Id: If070cf3f484e3e02b8e586bff466e0018b1a1845
Reviewed-on: https://go-review.googlesource.com/c/go/+/468298
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Mar 10, 2023
Currently, gentraceback's loop ends with a call to tracebackCgoContext
to process cgo frames. This requires spreading various parts of the
printing and pcbuf logic across these two functions.

Clean this up by moving cgo unwinding into unwinder and then lifting
the printing and pcbuf logic from tracebackCgoContext into
gentraceback along with the other printing and pcbuf logic.

Updates #54466.

Change-Id: Ic71afaa5ae110c0ea5be9409e267e4284e36a8c9
Reviewed-on: https://go-review.googlesource.com/c/go/+/468299
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
gopherbot pushed a commit that referenced this issue Mar 10, 2023
Currently, filling PC traceback buffers is one of the jobs of
gentraceback. This moves it into a new function, tracebackPCs, with a
simple API built around unwinder, and changes all callers to use this
new API.

Updates #54466.

Change-Id: Id2038bded81bf533a5a4e71178a7c014904d938c
Reviewed-on: https://go-review.googlesource.com/c/go/+/468300
Reviewed-by: Michael Pratt <mpratt@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
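
As a loose illustration of a PC-buffer filler layered on top of an iterator, as the commit above describes, consider the sketch below. The names `pcIter` and `fillPCs` and the `skip` parameter are assumptions made for the example; the actual signature of tracebackPCs is not given in this thread and is not reproduced here.

```go
package main

import "fmt"

// pcIter is a hypothetical caller-driven iterator over return PCs,
// innermost frame first.
type pcIter struct {
	pcs []uintptr
	i   int
}

func (it *pcIter) valid() bool { return it.i < len(it.pcs) }
func (it *pcIter) next()       { it.i++ }
func (it *pcIter) pc() uintptr { return it.pcs[it.i] }

// fillPCs skips the first skip frames, then copies PCs into buf until
// either buf is full or the walk ends. It returns the number written.
// The iterator is left positioned at the first frame that did not fit.
func fillPCs(it *pcIter, skip int, buf []uintptr) int {
	n := 0
	for ; n < len(buf) && it.valid(); it.next() {
		if skip > 0 {
			skip--
			continue
		}
		buf[n] = it.pc()
		n++
	}
	return n
}

func main() {
	it := &pcIter{pcs: []uintptr{0x10, 0x20, 0x30, 0x40}}
	buf := make([]uintptr, 2)
	n := fillPCs(it, 1, buf)
	fmt.Println(n, buf[:n])
}
```

Keeping the buffer logic in its own function means the unwinding code no longer needs to know about skip counts or buffer capacity.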
gopherbot pushed a commit that referenced this issue Mar 10, 2023
Printing is the only remaining functionality of gentraceback. Move
this into the traceback printing code and eliminate gentraceback. This
lets us simplify the logic, which fixes at least one minor bug:
previously, if inline unwinding pushed the total printed count over
_TracebackMaxFrames, we would print extra frames and then fail to
print "additional frames elided".

The cumulative performance effect of the series of changes starting
with "add a benchmark of Callers" (CL 472956) is:

goos: linux
goarch: amd64
pkg: runtime
cpu: Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz
                       │  baseline   │              unwinder               │
                       │   sec/op    │   sec/op     vs base                │
Callers/cached-48        1.464µ ± 1%   1.684µ ± 1%  +15.03% (p=0.000 n=20)
Callers/inlined-48       1.391µ ± 1%   1.536µ ± 1%  +10.42% (p=0.000 n=20)
Callers/no-cache-48      10.50µ ± 1%   11.11µ ± 0%   +5.82% (p=0.000 n=20)
StackCopyPtr-48          88.74m ± 1%   81.22m ± 2%   -8.48% (p=0.000 n=20)
StackCopy-48             80.90m ± 1%   70.56m ± 1%  -12.78% (p=0.000 n=20)
StackCopyNoCache-48      2.458m ± 1%   2.209m ± 1%  -10.15% (p=0.000 n=20)
StackCopyWithStkobj-48   26.81m ± 1%   25.66m ± 1%   -4.28% (p=0.000 n=20)
geomean                  518.8µ        512.9µ        -1.14%

The performance impact of intermediate CLs in this sequence varies a
lot as we went through many refactorings. The slowdown in Callers
comes primarily from the introduction of unwinder because that doesn't
get inlined and results in somewhat worse code generation in code
that's extremely hot in those microbenchmarks. The performance gains
on stack copying come mostly from replacing callbacks with direct use
of the unwinder.

Updates #54466.
Fixes #32383.

Change-Id: I4970603b2861633eecec30545e852688bc7cc9a4
Reviewed-on: https://go-review.googlesource.com/c/go/+/468301
Reviewed-by: Michael Pratt <mpratt@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@gopherbot
Contributor

Change https://go.dev/cl/475960 mentions this issue: runtime: for deep stacks, print both the top 50 and bottom 50 frames

gopherbot pushed a commit that referenced this issue Mar 21, 2023
This is relatively easy using the new traceback iterator.

Ancestor tracebacks are now limited to 50 frames. We could keep that
at 100, but the fact that it used 100 before seemed arbitrary and
unnecessary.

Fixes #7181
Updates #54466

Change-Id: If693045881d84848f17e568df275a5105b6f1cb0
Reviewed-on: https://go-review.googlesource.com/c/go/+/475960
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Michael Pratt <mpratt@google.com>
@golang golang locked and limited conversation to collaborators Mar 12, 2024