Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: CPU profiles should attribute time spent in runtime.morestack to the function call that triggered it #25943

richardartoul opened this issue Jun 18, 2018 · 8 comments


Copy link

@richardartoul richardartoul commented Jun 18, 2018

What did you do?

Details can be found in #18138

What did you expect to see?

I expected CPU profiles to attribute time spent in runtime.morestack to the function that triggered stack growth.

What did you see instead?

Time spent in runtime.morestack is measured properly, but is not tied to any particular stack or series of function calls, making it difficult to determine which code path is experiencing excessive goroutine stack growth.

Does this issue reproduce with the latest release (go1.10.3)?


System details

go version go1.10.2 darwin/amd64
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/38/w8yq1thn7njcv28q20376y_40000gn/T/go-build965061643=/tmp/go-build -gno-record-gcc-switches -fno-common"
GOROOT/bin/go version: go version go1.10.2 darwin/amd64
GOROOT/bin/go tool compile -V: compile version go1.10.2
uname -v: Darwin Kernel Version 16.7.0: Thu Jan 11 22:59:40 PST 2018; root:xnu-3789.73.8~1/RELEASE_X86_64
ProductName:	Mac OS X
ProductVersion:	10.12.6
BuildVersion:	16G1212
lldb --version: lldb-370.0.42
Copy link

@hyangah hyangah commented Jun 18, 2018

Should we make the existing cpu profile to show the morestack call stack stitched with the goroutine stack that caused stack growth (that means I guess runtime.gentraceback behavior change)?

Or do we want to create a new built-in profile for stack size changes (per @josharian's comment in the issue 18138)?

cc @aclements

@hyangah hyangah changed the title CPU profiles should attribute time spent in runtime.morestack to the function call that triggered it runtime: CPU profiles should attribute time spent in runtime.morestack to the function call that triggered it Jun 18, 2018
Copy link

@richardartoul richardartoul commented Jun 18, 2018

Just want to add that if this isn’t a terribly complex task I’m happy to contribute engineering time if someone is willing to point me in the right direction

Copy link

@ianlancetaylor ianlancetaylor commented Jun 19, 2018

I have to think that this would be most useful in the standard CPU and memory usage profiles, but I don't know how to implement that.

Copy link

@heschi heschi commented Jul 27, 2018

Just back from a Go NYC meetup, where @richardartoul went through the backstory here, so I took a second look at this bug. I think I see at least the first problem, but I haven't actually tested it. Notes for whoever works on this next:

Profiling (on Linux) is driven by the sigprof handler. It calls gentraceback with _TraceJumpStack, which requests that it trace through transitions to the system stack. But morestack is unusual in that it does its own system stack transition without calling systemstack, and gentraceback only handles stack switches done specifically by systemstack.

I think I'd start by changing that test to do check against a list of functions that switch to the system stack, not just systemstack. (I don't think setsSP is exactly right because gogo isn't doing a system stack transition?) AFAICT that'd do it, but I'm probably missing something.

(All of this is for CPU profiling. @hyangah looked at memory profiling and IIRC concluded that there was a hardcoded test in there somewhere that just cut off the stack trace for runtime functions, but I don't have the link handy.)

Copy link

@gopherbot gopherbot commented Jul 30, 2018

Change mentions this issue: runtime: handle morestack system stack transition in gentraceback

gopherbot pushed a commit that referenced this issue Aug 23, 2018
gentraceback handles system stack transitions, but only when they're
done by systemstack(). Handle morestack() too.

I tried to do this generically but systemstack and morestack are
actually *very* different functions. Most notably, systemstack returns
"normally", just messes with $sp along the way. morestack never
returns at all -- it calls newstack, and newstack then jumps both
stacks and functions back to whoever called morestack. I couldn't
think of a way to handle both of them generically. So don't.

The current implementation does not include systemstack on the generated
traceback. That's partly because I don't know how to find its stack frame
reliably, and partly because the current structure of the code wants to
do the transition before the call, not after. If we're willing to
assume that morestack's stack frame is 0 size, this could probably be

For posterity's sake, alternatives tried:

- Have morestack put a dummy function into schedbuf, like systemstack
does. This actually worked (see patchset 1) but more by a series of
coincidences than by deliberate design. The biggest coincidence was that
because morestack_switch was a RET and its stack was 0 size, it actually
worked to jump back to it at the end of newstack -- it would just return
to the caller of morestack. Way too subtle for me, and also a little
slower than just jumping directly.

- Put morestack's PC and SP into schedbuf, so that gentraceback could
treat it like a normal function except for the changing SP. This was a
terrible idea and caused newstack to reenter morestack in a completely
unreasonable state.

To make testing possible I did a small redesign of testCPUProfile to
take a callback that defines how to check if the conditions pass to it
are satisfied. This seemed better than making the syntax of the "need"
strings even more complicated.

Updates #25943

Change-Id: I9271a30a976f80a093a3d4d1c7e9ec226faf74b4
Run-TryBot: Heschi Kreinick <>
TryBot-Result: Gobot Gobot <>
Reviewed-by: Austin Clements <>
Copy link

@heschi heschi commented Aug 23, 2018

This should now be fixed for CPU profiles. Note that due to implementation issues runtime.morestack function won't appear in the backtrace at all, but runtime.newstack will.

Memory profiling of runtime functions is a separate issue which I haven't looked at closely. The change above probably helped, but it doesn't matter due to this policy in the protobuf writer:

if hideRuntime {
if f := runtime.FuncForPC(addr); f != nil && strings.HasPrefix(f.Name(), "runtime.") {
// Found non-runtime. Show any runtime uses above it.
hideRuntime = false

which accounts all memory allocation by runtime functions to the user code that called them. This is probably to hide stuff like string-to-[]byte conversions, which don't show up explicitly in user code.

I'm going to close this issue since the original comment requested CPU only. We can reopen if necessary.

@heschi heschi closed this Aug 23, 2018
Copy link

@josharian josharian commented Aug 25, 2018

Note that due to implementation issues runtime.morestack function won't appear in the backtrace at all, but runtime.newstack will.

This makes me a bit nervous. Having runtime.morestack in the backtrace is useful as an optimization clue.

Copy link

@heschi heschi commented Aug 29, 2018

I agree, but since the morestack call is never returned to, putting it in the backtrace would require a level of specific hackery that I was uncomfortable with. I tried making newstack return normally in the common case but it proved more difficult than I think it's worth.

I don't think newstack is that much less clear than morestack. If it actually causes someone trouble we can revisit.

@golang golang locked and limited conversation to collaborators Aug 29, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
6 participants