
runtime: sub optimal gc scalability #21056

TocarIP opened this issue Jul 17, 2017 · 7 comments

Labels: NeedsInvestigation, Performance



TocarIP commented Jul 17, 2017

What version of Go are you using (go version)?

go version devel +4e9c86a Wed Jun 28 17:33:40 2017 +0000 linux/amd64

What operating system and processor architecture are you using (go env)?

GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build633808214=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Run test/bench/garbage/tree2 on a machine with 88 threads (2 sockets, 22 cores per socket, 2 threads per core), 2x Xeon E5-2699 v4, with the following options:
./tree2 -cpus=88 -heapsize=1000000000 -cpuprofile=tree2.pprof

What did you expect to see?

runtime.gcDrain taking an insignificant amount of time

What did you see instead?

runtime.gcDrain taking about half of all time:

Showing top 10 nodes out of 33
      flat  flat%   sum%        cum   cum%
    36.95s 45.03% 45.03%     75.98s 92.59%  runtime.gcDrain /localdisk/itocar/golang/src/runtime/mgcmark.go
    12.38s 15.09% 60.11%     12.38s 15.09%  runtime.(*lfstack).pop /localdisk/itocar/golang/src/runtime/lfstack.go
     7.51s  9.15% 69.27%      7.51s  9.15%  runtime.greyobject /localdisk/itocar/golang/src/runtime/mgcmark.go
     6.28s  7.65% 76.92%     19.49s 23.75%  runtime.scanobject /localdisk/itocar/golang/src/runtime/mgcmark.go
     4.54s  5.53% 82.45%      4.54s  5.53%  runtime.(*lfstack).push /localdisk/itocar/golang/src/runtime/lfstack.go

Looking into runtime.gcDrain, I see that almost all of the time is spent on
35.66s 35.66s 924: if work.full == 0 {

I couldn't reproduce this behavior on a machine with a small number of cores.
Looking at the cache-miss profile shows that this is due to all cores updating the head of work.full,
which causes every read needed for the check to miss the cache.


TocarIP commented Jul 17, 2017

Also reproducible on 1.8


ianlancetaylor commented Jul 17, 2017

CC @RLH @aclements


aclements commented Jul 18, 2017

Curiously, while I was able to reproduce this on a 48 thread (24 core) machine with ./tree2 -cpus=48 -heapsize=1000000000, I wasn't able to reproduce it on that machine with garbage -benchmem 1024.

@TocarIP, have you seen this in other/more realistic benchmarks? I wonder if there's something pessimal about tree2's heap.

@bradfitz added the NeedsInvestigation label Jul 18, 2017

TocarIP commented Jul 18, 2017

I got a report from a customer about excessive cache misses in gcDrain, but there was no reproducer. tree2 is just the random benchmark where I was able to reproduce it. For ./garbage -benchmem=1024 I see about 13% of time in gcDrain[N] on the machine above (88 threads), with almost all of that time spent in the same work.full == 0 check.
perf output below:

 12.19%  garbage  garbage           [.] runtime.scanobject
   6.68%  garbage  garbage           [.] runtime.gcDrainN
   6.49%  garbage  garbage           [.] runtime.gcDrain
   6.45%  garbage  garbage           [.] runtime.greyobject
   5.90%  garbage  garbage           [.] runtime.mallocgc
   4.87%  garbage  garbage           [.] runtime.heapBitsForObject
   4.59%  garbage  garbage           [.] runtime.lock
   2.77%  garbage  garbage           [.] runtime.lfstackpop


gopherbot commented Sep 11, 2017

Change mentions this issue: runtime: reduce contention in gcDrain


RLH commented Sep 12, 2017


TocarIP commented Sep 13, 2017

I agree that tree2 is not very representative, but I'm not sure what is representative.
E.g., the garbage benchmark got ~10% faster in the 88-thread case:

Garbage/benchmem-MB=1024-88                   944µs ± 1%                   854µs ± 2%  -9.50%

Is this more representative?
I'm more than willing to measure some other, more representative case; I'm just not sure what exactly to measure.
As for literature, the idea of having several read/write locations was proposed in e.g. . TBB uses a similar data structure for its priority queue, but with an extra bitmask for faster checking. I didn't bother with the bitmask because the current version reduced the time spent in gcDrain to a reasonable level by itself.

domodwyer added a commit to domodwyer/mpjbt that referenced this issue Sep 27, 2017
When running on machines with large numbers of cores (~40+) a Golang
garbage collection bug seems to dominate runtime - see
golang/go#21056 for more info.

This change limits the number of cores to what we found was most
performant on our hardware - others may want to experiment.
@rsc rsc modified the milestones: Go1.10, Go1.11 Nov 22, 2017
@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Go1.12 Jul 10, 2018
@aclements aclements modified the milestones: Go1.12, Go1.13 Jan 8, 2019
@aclements aclements modified the milestones: Go1.13, Go1.14 Jun 25, 2019
@rsc rsc modified the milestones: Go1.14, Backlog Oct 9, 2019

7 participants