Skip to content

runtime/pprof: collecting mutex or block profiles add ~100ms of wall time #63078

@rhysh

Description

@rhysh

What version of Go are you using (go version)?

go1.21 version
go version go1.21.1 darwin/arm64

Does this issue reproduce with the latest release?

Yes, Go 1.21 is the latest release series.

What operating system and processor architecture are you using (go env)?

go env Output
$ go1.21 env
GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/rhys/Library/Caches/go-build'
GOENV='/Users/rhys/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/rhys/go/pkg/mod'
GONOPROXY='*'
GONOSUMDB='*'
GOOS='darwin'
GOPATH='/Users/rhys/go'
GOPRIVATE='*'
GOPROXY='direct'
GOROOT='/Users/rhys/go/version/go1.21'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='local'
GOTOOLDIR='/Users/rhys/go/version/go1.21/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.21.1'
GCCGO='gccgo'
AR='ar'
CC='clang'
CXX='clang++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/lf/n4ry0qv5639_3z0nhyhy3z7c0000gs/T/go-build1797745903=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

$ cat slowprof.go
package main

import (
	"flag"
	"io"
	"log"
	"runtime"
	"runtime/pprof"
)

func main() {
	cpu := flag.Bool("cpu", false, "collect a CPU profile")
	mutex := flag.Int("mutex", -1, "collect a mutex profile with specified debug level")
	block := flag.Int("block", -1, "collect a block profile with specified debug level")
	flag.Parse()

	if *cpu {
		err := pprof.StartCPUProfile(io.Discard)
		if err != nil {
			log.Fatalf("pprof.StartCPUProfile: %v", err)
		}
		pprof.StopCPUProfile()
	}

	if *mutex >= 0 {
		runtime.SetMutexProfileFraction(2)
		err := pprof.Lookup("mutex").WriteTo(io.Discard, *mutex)
		if err != nil {
			log.Fatalf(`pprof.Lookup("mutex").WriteTo: %v`, err)
		}
	}

	if *block >= 0 {
		runtime.SetBlockProfileRate(2)
		err := pprof.Lookup("block").WriteTo(io.Discard, *block)
		if err != nil {
			log.Fatalf(`pprof.Lookup("block").WriteTo: %v`, err)
		}
	}
}

$ go1.21 build -o /tmp/slowprof ./slowprof.go

$ for i in {1..10}; do time /tmp/slowprof; done                          
/tmp/slowprof  0.00s user 0.01s system 58% cpu 0.022 total
/tmp/slowprof  0.00s user 0.00s system 36% cpu 0.009 total
/tmp/slowprof  0.00s user 0.00s system 55% cpu 0.007 total
/tmp/slowprof  0.00s user 0.00s system 41% cpu 0.010 total
/tmp/slowprof  0.00s user 0.00s system 39% cpu 0.009 total
/tmp/slowprof  0.00s user 0.00s system 36% cpu 0.006 total
/tmp/slowprof  0.00s user 0.00s system 37% cpu 0.005 total
/tmp/slowprof  0.00s user 0.00s system 33% cpu 0.008 total
/tmp/slowprof  0.00s user 0.00s system 29% cpu 0.007 total
/tmp/slowprof  0.00s user 0.00s system 29% cpu 0.007 total


$ for i in {1..10}; do time GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0; done
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.01s system 7% cpu 0.120 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 4% cpu 0.117 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.01s system 7% cpu 0.120 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 6% cpu 0.120 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 5% cpu 0.115 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 5% cpu 0.113 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 5% cpu 0.112 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 6% cpu 0.113 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 5% cpu 0.111 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 6% cpu 0.118 total

$ for i in {1..10}; do time GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0; done
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.01s system 6% cpu 0.118 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.00s system 5% cpu 0.116 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.00s system 6% cpu 0.118 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.00s system 6% cpu 0.119 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.00s system 6% cpu 0.118 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.00s system 6% cpu 0.119 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.01s system 7% cpu 0.119 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.01s system 6% cpu 0.118 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.01s system 7% cpu 0.116 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.01s system 7% cpu 0.119 total

What did you expect to see?

Somewhat small overhead for using block and mutex profiles in otherwise lightweight tools. Small impact to app startup (which may include the call to runtime.SetBlockProfileRate).

What did you see instead?

The first call to runtime.tickspersecond calls usleep(100 * 1000), which (if uninterrupted) takes 100ms. There's one such call in runtime.SetBlockProfileRate https://github.com/golang/go/blob/go1.21.1/src/runtime/mprof.go#L471 when configuring the block profiler, even if that profile is never read. There are others in the code to write out mutex and block profiles.

Metadata

Metadata

Assignees

No one assigned

    Labels

    FrozenDueToAgeNeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.compiler/runtimeIssues related to the Go compiler and/or runtime.help wanted

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions