Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: CPU profiles should attribute time spent in runtime.morestack to the function call that triggered it #25943

Closed
richardartoul opened this issue Jun 18, 2018 · 8 comments

Comments

Projects
None yet
6 participants
@richardartoul
Copy link

commented Jun 18, 2018

What did you do?

Details can be found in #18138

What did you expect to see?

I expected CPU profiles to attribute time spent in runtime.morestack to the function that triggered stack growth.

What did you see instead?

Time spent in runtime.morestack is measured properly, but is not tied to any particular stack or series of function calls, making it difficult to determine which code path is experiencing excessive goroutine stack growth.

Does this issue reproduce with the latest release (go1.10.3)?

Yes.

System details

go version go1.10.2 darwin/amd64
GOARCH="amd64"
GOBIN=""
GOCACHE="/Users/richardartoul/Library/Caches/go-build"
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/richardartoul/gocode"
GORACE=""
GOROOT="/usr/local/go"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/38/w8yq1thn7njcv28q20376y_40000gn/T/go-build965061643=/tmp/go-build -gno-record-gcc-switches -fno-common"
GOROOT/bin/go version: go version go1.10.2 darwin/amd64
GOROOT/bin/go tool compile -V: compile version go1.10.2
uname -v: Darwin Kernel Version 16.7.0: Thu Jan 11 22:59:40 PST 2018; root:xnu-3789.73.8~1/RELEASE_X86_64
ProductName:	Mac OS X
ProductVersion:	10.12.6
BuildVersion:	16G1212
lldb --version: lldb-370.0.42
  Swift-3.1
@hyangah

This comment has been minimized.

Copy link
Contributor

commented Jun 18, 2018

Should we make the existing cpu profile to show the morestack call stack stitched with the goroutine stack that caused stack growth (that means I guess runtime.gentraceback behavior change)?

Or do we want to create a new built-in profile for stack size changes (per @josharian's comment in the issue 18138)?

cc @aclements

@hyangah hyangah changed the title CPU profiles should attribute time spent in runtime.morestack to the function call that triggered it runtime: CPU profiles should attribute time spent in runtime.morestack to the function call that triggered it Jun 18, 2018

@richardartoul

This comment has been minimized.

Copy link
Author

commented Jun 18, 2018

Just want to add that if this isn’t a terribly complex task I’m happy to contribute engineering time if someone is willing to point me in the right direction

@ianlancetaylor ianlancetaylor added this to the Go1.12 milestone Jun 19, 2018

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2018

I have to think that this would be most useful in the standard CPU and memory usage profiles, but I don't know how to implement that.

@heschik

This comment has been minimized.

Copy link
Contributor

commented Jul 27, 2018

Just back from a Go NYC meetup, where @richardartoul went through the backstory here, so I took a second look at this bug. I think I see at least the first problem, but I haven't actually tested it. Notes for whoever works on this next:

Profiling (on Linux) is driven by the sigprof handler. It calls gentraceback with _TraceJumpStack, which requests that it trace through transitions to the system stack. But morestack is unusual in that it does its own system stack transition without calling systemstack, and gentraceback only handles stack switches done specifically by systemstack.

I think I'd start by changing that test to do check against a list of functions that switch to the system stack, not just systemstack. (I don't think setsSP is exactly right because gogo isn't doing a system stack transition?) AFAICT that'd do it, but I'm probably missing something.

(All of this is for CPU profiling. @hyangah looked at memory profiling and IIRC concluded that there was a hardcoded test in there somewhere that just cut off the stack trace for runtime functions, but I don't have the link handy.)

@gopherbot

This comment has been minimized.

Copy link

commented Jul 30, 2018

Change https://golang.org/cl/126795 mentions this issue: runtime: handle morestack system stack transition in gentraceback

gopherbot pushed a commit that referenced this issue Aug 23, 2018

runtime: handle morestack system stack transition in gentraceback
gentraceback handles system stack transitions, but only when they're
done by systemstack(). Handle morestack() too.

I tried to do this generically but systemstack and morestack are
actually *very* different functions. Most notably, systemstack returns
"normally", just messes with $sp along the way. morestack never
returns at all -- it calls newstack, and newstack then jumps both
stacks and functions back to whoever called morestack. I couldn't
think of a way to handle both of them generically. So don't.

The current implementation does not include systemstack on the generated
traceback. That's partly because I don't know how to find its stack frame
reliably, and partly because the current structure of the code wants to
do the transition before the call, not after. If we're willing to
assume that morestack's stack frame is 0 size, this could probably be
fixed.

For posterity's sake, alternatives tried:

- Have morestack put a dummy function into schedbuf, like systemstack
does. This actually worked (see patchset 1) but more by a series of
coincidences than by deliberate design. The biggest coincidence was that
because morestack_switch was a RET and its stack was 0 size, it actually
worked to jump back to it at the end of newstack -- it would just return
to the caller of morestack. Way too subtle for me, and also a little
slower than just jumping directly.

- Put morestack's PC and SP into schedbuf, so that gentraceback could
treat it like a normal function except for the changing SP. This was a
terrible idea and caused newstack to reenter morestack in a completely
unreasonable state.

To make testing possible I did a small redesign of testCPUProfile to
take a callback that defines how to check if the conditions pass to it
are satisfied. This seemed better than making the syntax of the "need"
strings even more complicated.

Updates #25943

Change-Id: I9271a30a976f80a093a3d4d1c7e9ec226faf74b4
Reviewed-on: https://go-review.googlesource.com/126795
Run-TryBot: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
@heschik

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2018

This should now be fixed for CPU profiles. Note that due to implementation issues runtime.morestack function won't appear in the backtrace at all, but runtime.newstack will.

Memory profiling of runtime functions is a separate issue which I haven't looked at closely. The change above probably helped, but it doesn't matter due to this policy in the protobuf writer:

if hideRuntime {
if f := runtime.FuncForPC(addr); f != nil && strings.HasPrefix(f.Name(), "runtime.") {
continue
}
// Found non-runtime. Show any runtime uses above it.
hideRuntime = false

which accounts all memory allocation by runtime functions to the user code that called them. This is probably to hide stuff like string-to-[]byte conversions, which don't show up explicitly in user code.

I'm going to close this issue since the original comment requested CPU only. We can reopen if necessary.

@heschik heschik closed this Aug 23, 2018

@josharian

This comment has been minimized.

Copy link
Contributor

commented Aug 25, 2018

Note that due to implementation issues runtime.morestack function won't appear in the backtrace at all, but runtime.newstack will.

This makes me a bit nervous. Having runtime.morestack in the backtrace is useful as an optimization clue.

@heschik

This comment has been minimized.

Copy link
Contributor

commented Aug 29, 2018

I agree, but since the morestack call is never returned to, putting it in the backtrace would require a level of specific hackery that I was uncomfortable with. I tried making newstack return normally in the common case but it proved more difficult than I think it's worth.

I don't think newstack is that much less clear than morestack. If it actually causes someone trouble we can revisit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.