Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: gc exhausts all cpu causing a short delay #33927

Open
Anteoy opened this issue Aug 29, 2019 · 16 comments

Comments

@Anteoy
Copy link

commented Aug 29, 2019

What version of Go are you using (go version)?

$ go version
go version go1.12.8 linux/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH="amd64"
GOBIN="/home/zhoudazhuang/gobin/"
GOCACHE="/home/zhoudazhuang/.cache/go-build"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/zhoudazhuang/db11/jm/pro"
GOPROXY=""
GORACE=""
GOROOT="/home/zhoudazhuang/usr/local/go1.12.8/go"
GOTMPDIR=""
GOTOOLDIR="/home/zhoudazhuang/usr/local/go1.12.8/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build045404538=/tmp/go-build -gno-record-gcc-switches"


each linux online server:
cpu: 48c
memory: 96g
Linux ll-025065048-FWWG.AppPZFW.prod.bj1 2.6.32-642.el6.x86_64 #1 SMP Tue May 10 17:27:01 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

What did you do?

My tcp long connection server , Deploy a go program per machine. each go program deal about 800 000 user.
Usually my cpu utilization is only about 30%,But gc will burst to over 95%. This causes my application to occasionally generate traffic delay jitter during gc. There will probably be some processing time of 2-3 seconds or more (including json.marshal,net.dial, metux lock, base64 decode and any others).

I observed the gc log and the runtime sched log. The conclusion is that the gc concurrent scan takes up a lot of cpu and knows that my business has no remaining cpu available. I observed that about a quarter of the procs are used for gc, no allocation proc for other tasks. I don't understand why gc cost lots of my cpu so that my business has no cpu resources available for a while, so my business has delayed jitter. Is there any solution or whether it can limit the numbers of cpu?

What did you expect to see?

Gc will not cause my business to delay

What did you see instead?

Gc caused my business to delay

@Anteoy

This comment has been minimized.

Copy link
Author

commented Aug 29, 2019

inuse_objects:

File: comet
Type: inuse_objects
Time: Aug 28, 2019 at 5:42pm (CST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 45387382, 80.24% of 56565978 total
Dropped 132 nodes (cum <= 282829)
Showing top 10 nodes out of 77
flat flat% sum% cum cum%
9438703 16.69% 16.69% 9438703 16.69% github.com/golang/protobuf/proto.unmarshalStringPtr
8614824 15.23% 31.92% 9286588 16.42% acs/comet/client.NewClient
6101261 10.79% 42.70% 9079363 16.05% runtime.systemstack
5431684 9.60% 52.30% 19129076 33.82% acs/comet.HandleTCPConn
5221285 9.23% 61.53% 5226746 9.24% github.com/Anteoy/ants.(*Pool).retrieveWorker
3360027 5.94% 67.47% 3360028 5.94% time.NewTimer
2977516 5.26% 72.74% 2977516 5.26% runtime.malg
1429390 2.53% 75.27% 1445774 2.56% acs/comet/client.ClientsMaps.Set
1425451 2.52% 77.79% 1425451 2.52% syscall.anyToSockaddr
1387241 2.45% 80.24% 1387241 2.45% net.sockaddrToTCP

@Anteoy

This comment has been minimized.

Copy link
Author

commented Aug 29, 2019

I have adjusted GOGC, GOMAXPROCS, but this doesn't work, the number of inuse objects is what I really need.

@Anteoy

This comment has been minimized.

Copy link
Author

commented Aug 29, 2019

usually:
SCHED 657070ms: gomaxprocs=48 idleprocs=41 threads=141 spinningthreads=1 idlethreads=80 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 657179ms: gomaxprocs=48 idleprocs=43 threads=141 spinningthreads=1 idlethreads=82 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 657282ms: gomaxprocs=48 idleprocs=45 threads=141 spinningthreads=0 idlethreads=85 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 657388ms: gomaxprocs=48 idleprocs=46 threads=141 spinningthreads=0 idlethreads=85 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 657488ms: gomaxprocs=48 idleprocs=43 threads=141 spinningthreads=2 idlethreads=82 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 657589ms: gomaxprocs=48 idleprocs=44 threads=141 spinningthreads=1 idlethreads=83 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 657690ms: gomaxprocs=48 idleprocs=38 threads=141 spinningthreads=1 idlethreads=78 runqueue=6 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 657791ms: gomaxprocs=48 idleprocs=40 threads=141 spinningthreads=2 idlethreads=79 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 657892ms: gomaxprocs=48 idleprocs=40 threads=141 spinningthreads=2 idlethreads=80 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 657993ms: gomaxprocs=48 idleprocs=43 threads=141 spinningthreads=2 idlethreads=82 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 658093ms: gomaxprocs=48 idleprocs=25 threads=141 spinningthreads=2 idlethreads=65 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
when gc:
SCHED 658195ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=75 runqueue=222 [0 0 0 0 3 0 0 0 0 2 0 0 2 4 0 90 0 40 20 1 2 1 0 0 0 0 0 0 107 0 3 0 2 101 4 1 1 2 1 34 24 2 1 0 1 2 3 1]
SCHED 658302ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=84 runqueue=282 [0 0 0 0 10 0 0 0 0 1 0 0 0 14 0 70 0 22 3 0 3 1 2 6 1 2 1 0 110 1 3 4 2 101 2 5 2 4 90 21 4 3 0 4 2 1 5 232]
SCHED 658404ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=74 runqueue=264 [0 0 0 0 3 0 0 0 0 1 0 0 0 1 0 64 0 4 5 3 9 0 13 5 4 4 0 2 118 2 7 12 2 102 156 1 4 4 80 3 6 2 0 3 0 5 2 221]
SCHED 658506ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=365 [0 0 0 0 0 0 0 0 0 4 0 0 2 114 0 2 0 4 5 183 3 4 0 4 0 5 0 165 131 4 0 14 4 109 148 6 2 3 62 0 2 0 1 3 2 8 2 222]
SCHED 658610ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=80 runqueue=307 [0 0 0 0 3 0 0 0 0 2 0 0 3 111 0 7 0 8 2 183 17 2 2 5 1 3 4 181 78 5 6 5 12 112 142 6 2 3 135 5 5 4 5 0 5 4 6 224]
SCHED 658713ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=80 runqueue=434 [0 0 0 0 38 0 0 0 0 1 0 0 0 116 0 0 0 0 6 190 7 6 0 4 5 2 6 178 8 6 6 4 0 21 150 6 4 6 129 4 2 3 14 4 10 3 44 234]
SCHED 658816ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=86 runqueue=551 [0 0 0 0 4 0 0 0 0 6 0 0 11 126 0 9 0 4 11 211 2 7 9 4 13 5 10 187 10 1 7 2 1 1 161 2 3 8 131 5 9 13 5 74 4 192 1 241]
SCHED 658918ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=87 runqueue=1162 [0 0 0 0 3 0 0 0 0 4 0 0 5 90 0 1 0 6 4 218 1 19 23 16 21 11 5 188 20 102 7 15 3 12 83 17 169 5 136 3 14 2 11 23 26 187 10 249]
SCHED 659020ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=1188 [0 0 0 0 18 0 0 0 0 2 0 0 20 6 0 25 0 4 21 219 1 21 20 3 21 6 18 188 1 50 2 3 23 14 4 0 158 21 141 71 2 2 22 22 18 189 20 132]
SCHED 659121ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=1694 [0 0 0 0 17 0 0 0 0 0 0 0 21 19 0 31 0 0 98 225 1 23 34 10 3 5 11 195 9 33 27 22 5 3 10 54 159 30 134 22 22 31 26 239 78 193 50 22]
SCHED 659224ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=2094 [0 0 0 0 76 0 0 0 0 3 0 0 36 44 0 37 0 38 42 154 33 49 31 34 12 36 38 190 47 45 9 46 76 20 43 139 160 61 25 40 41 111 98 233 47 197 40 7]
SCHED 659332ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=2247 [0 0 0 0 72 0 0 0 0 4 0 0 3 43 0 20 0 2 48 14 39 54 18 35 24 38 190 111 51 51 24 44 33 4 4 6 175 7 4 18 3 18 60 224 56 204 15 3]
SCHED 659435ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=2934 [0 0 0 0 52 0 0 0 0 9 0 0 17 62 0 50 0 47 59 3 243 29 55 61 42 6 20 11 22 57 57 56 55 4 4 136 35 71 54 50 14 61 58 231 22 214 54 5]
SCHED 659536ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=85 runqueue=3724 [0 0 0 0 16 0 0 0 0 7 0 0 56 5 0 38 0 23 26 6 238 4 48 67 61 149 38 6 219 20 34 36 39 2 4 34 8 65 27 25 52 208 41 248 55 160 14 68]
SCHED 659638ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=3304 [0 0 0 0 78 0 0 0 0 6 0 0 12 57 0 67 0 79 68 4 249 3 20 77 33 7 56 7 20 84 61 64 70 4 5 61 4 5 70 69 13 54 6 77 23 69 76 46]
SCHED 659740ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=4087 [0 0 0 0 89 0 0 0 0 10 0 0 90 18 0 50 0 97 52 4 94 4 65 82 71 0 44 7 65 94 36 39 82 18 63 175 82 52 83 85 59 0 82 79 27 6 88 80]
SCHED 659842ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=5215 [0 0 0 0 84 0 0 0 0 10 0 0 66 13 0 46 0 92 45 1 76 2 62 77 65 81 39 7 61 89 30 33 77 6 56 170 77 47 78 79 54 74 79 74 15 8 83 74]
SCHED 659949ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=6466 [0 0 0 0 79 0 0 0 0 8 0 0 61 8 0 43 0 87 36 5 66 1 60 73 61 75 30 7 58 84 24 27 72 128 52 165 72 43 73 75 51 69 74 70 9 2 78 69]
SCHED 660051ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=7818 [0 0 0 0 74 0 0 0 0 5 0 0 56 1 0 37 0 83 26 4 52 5 54 68 56 72 25 6 55 77 19 21 67 124 47 160 67 38 68 70 47 64 69 66 2 7 73 67]
SCHED 660153ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=8730 [0 0 0 0 65 0 0 0 0 5 0 0 51 123 0 30 0 77 21 2 41 5 49 63 52 65 19 6 51 64 14 15 62 116 41 155 60 32 63 65 45 189 64 61 123 5 68 60]
SCHED 660255ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=9444 [0 0 0 0 54 0 0 0 0 6 0 0 45 119 0 25 0 67 16 4 27 4 44 58 48 60 13 6 248 53 9 10 56 115 36 150 55 27 58 57 42 178 60 56 122 2 64 55]
SCHED 660356ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=10116 [0 0 0 0 44 0 0 0 0 5 0 0 29 115 0 19 0 50 11 4 15 5 39 53 45 55 6 4 244 39 2 5 52 112 30 145 50 23 53 46 38 170 55 53 118 2 59 51]
SCHED 660459ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=10511 [0 0 0 0 33 0 0 0 0 5 0 0 25 110 0 13 0 36 8 3 5 4 38 48 40 50 125 3 241 27 124 0 49 107 27 140 45 19 44 37 35 159 49 47 114 5 55 49]
SCHED 660561ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=11053 [0 0 0 0 20 0 0 0 0 5 0 0 19 107 0 8 0 25 1 1 5 2 36 43 35 47 122 4 237 16 118 121 34 103 22 135 40 13 29 28 30 150 45 40 110 2 41 44]
SCHED 660663ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=131 runqueue=7581 [251 74 54 187 5 84 160 58 57 9 2 31 225 11 112 99 49 96 62 3 3 0 2 82 81 127 10 5 73 120 91 42 107 82 17 45 117 40 47 96 64 57 5 45 10 74 63 48]
SCHED 660765ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=70 runqueue=1301 [9 17 25 1 86 25 223 1 15 6 15 8 22 160 165 6 12 1 80 84 2 9 20 96 5 140 72 94 10 32 28 64 17 1 173 18 3 36 14 1 5 5 52 5 0 23 111 4]
SCHED 660869ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=87 runqueue=442 [179 17 1 6 5 0 1 3 4 2 3 7 1 7 17 1 3 3 41 4 18 0 2 11 1 1 4 0 3 27 3 4 4 1 3 2 0 8 0 4 39 82 2 7 2 0 0 82]
SCHED 660976ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=41 runqueue=1 [4 0 1 0 1 0 1 3 2 2 0 0 0 0 0 4 2 0 0 1 0 0 0 0 0 4 0 0 0 3 0 2 1 0 0 3 0 1 1 3 12 0 0 0 0 2 0 0]
SCHED 661077ms: gomaxprocs=48 idleprocs=0 threads=141 spinningthreads=0 idlethreads=88 runqueue=587 [3 5 7 108 10 36 63 9 10 15 11 10 4 9 66 5 0 13 17 1 16 60 7 30 25 34 18 4 6 1 21 8 9 6 11 25 9 0 13 7 13 3 0 6 13 7 6 12]
SCHED 661185ms: gomaxprocs=48 idleprocs=41 threads=141 spinningthreads=3 idlethreads=79 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 661286ms: gomaxprocs=48 idleprocs=36 threads=141 spinningthreads=3 idlethreads=76 runqueue=0 [0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 661391ms: gomaxprocs=48 idleprocs=43 threads=141 spinningthreads=2 idlethreads=82 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 661492ms: gomaxprocs=48 idleprocs=44 threads=141 spinningthreads=1 idlethreads=84 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 661593ms: gomaxprocs=48 idleprocs=43 threads=141 spinningthreads=2 idlethreads=83 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0]
SCHED 661693ms: gomaxprocs=48 idleprocs=41 threads=141 spinningthreads=1 idlethreads=81 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 661794ms: gomaxprocs=48 idleprocs=41 threads=141 spinningthreads=3 idlethreads=81 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 661895ms: gomaxprocs=48 idleprocs=44 threads=141 spinningthreads=2 idlethreads=83 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 661995ms: gomaxprocs=48 idleprocs=47 threads=141 spinningthreads=0 idlethreads=86 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 662096ms: gomaxprocs=48 idleprocs=44 threads=141 spinningthreads=1 idlethreads=83 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 662197ms: gomaxprocs=48 idleprocs=46 threads=141 spinningthreads=1 idlethreads=85 runqueue=0 [0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
SCHED 662297ms: gomaxprocs=48 idleprocs=46 threads=141 spinningthreads=1 idlethreads=85 runqueue=0 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

Runqueue will rise in gc and then come down after gc is completed

@Anteoy

This comment has been minimized.

Copy link
Author

commented Aug 29, 2019

general htop:
image
when gc:
image

gc log:
gc 89 @610.293s 8%: 12+2470+0.52 ms clock, 621+80759/29748/98+25 ms cpu, 5949->6102->3329 MB, 6511 MB goal, 48 P
gc 90 @637.818s 8%: 4.3+2333+0.10 ms clock, 208+37327/27991/24463+5.2 ms cpu, 6121->6373->3364 MB, 6659 MB goal, 48 P
gc 91 @658.110s 8%: 0.52+2532+0.51{"level":"error","time":"2019-08-29T11:36:03+08:00","caller":"/home/zhoudazhuang/db11/jm/pro/src/acs/comet/app_config_server.go:487","message":"debug-连接读取err: EOF\n"}
gc 92 @685.606s 8%: %: 14+2571+{"level":"error","time":"2019-08-29T11:36:30+08:00","caller":"/home/zhoudazhuang/db11/jm/pro/src/acs/comet/app_config_server.go:487","message":"debug-连接读取err: EOF\n"}
gc {"level":"error","time":"2019-08-29T11:36:48+08:00","caller":"/home/zhoudazhuang/db11/jm/pro/src/acs/comet/app_config_server.go:487","message":"debug-连接读取err: EOF\n"}
gc 94 @739.848s 8%: 69+2569+0.61 ms clock, 3344+84528/30748/0.023+29 ms cpu, 6549->6720->3375 MB, 7058 MB goal, 48 P
scvg4: inuse: 5553, idle: 1328, sys: 6882, released: 0, consumed: 6882 (MB)
gc 95 @782.050s 8%: 102+2280+0.077 ms clock, 4938+23049/27330/47392+3.6 ms cpu, 6244->6335->3092 MB, 6751 MB goal, 48 P
gc 96 @803.855s 8%: 0.27+2570+0.58 ms clock, 13+86393/30778/1.1+27 ms cpu, 5892->6018->3334 MB, 6184 MB goal, 48 P
gc 97 @822.751s 8%: 65+2398+0.12 ms clock, 3145+44984/28747/15792+6.2 ms cpu, 6285->6636->3430 MB, 6668 MB goal, 48 P
gc 98 @843.003s 8%: 51+2279+0.61 ms clock, 2461+48089/27308/6949+29 ms cpu, 6285->6727->3410 MB, 6860 MB goal, 48 P
gc 99 @855.891s 8%: 0.18+2476+0.57 ms clock, 8.7+81679/29688/57+27 ms cpu, 6037->6203->3274 MB, 6820 MB goal, 48 P
gc 100 @867.733s 8%: 1.2+2544+0.50 ms clock, 62+84639/30500/8.6+24 ms cpu, 5918->6065->3369 MB, 6548 MB goal, 48 P

@Anteoy

This comment has been minimized.

Copy link
Author

commented Aug 29, 2019

SCHED picture:
image

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Aug 29, 2019

The first thing to do is to see if you can reduce the amount of memory you are allocating. The memory profile should be able to help you there.

The GC is expected to use up to 25% of your CPU resources, and it sounds like that is what you are seeing. Is there anything running on your machine other than your Go program? Why do you have so many runnable goroutines?

@Anteoy

This comment has been minimized.

Copy link
Author

commented Aug 29, 2019

The first thing to do is to see if you can reduce the amount of memory you are allocating. The memory profile should be able to help you there.

maybe not, the number of inuse objects is what I really need. This is already the object occupancy after my optimization.

The GC is expected to use up to 25% of your CPU resources, and it sounds like that is what you are seeing.

maybe this 25% is Represents 25% of procs, It seems that even is not the cpu time slice ? It seems that that moment took up more than 95% of cpu resources.

Is there anything running on your machine other than your Go program?

No, when the gc occur, the cpu will boost over 95%

Why do you have so many runnable goroutines?

It is my tcp server , There are about 800 000 established long tcp connections, and there will be non-stop users coming in through tcp connection, about 5,000 per second, and some users (connections) will be disconnected at the same time.

@Anteoy

This comment has been minimized.

Copy link
Author

commented Aug 29, 2019

@ianlancetaylor Is there a relevant source code here that allows me to modify the threshold by 25% in runtime source code?

@Anteoy

This comment has been minimized.

Copy link
Author

commented Aug 29, 2019

gc pacer:

pacer: assist ratio=+4.083553e+000 (scan 4177 MB in 6635->7146 MB) workers=12++0.000000e+000
pacer: H_m_prev=3746967200 h_t=+8.565789e-001 H_T=6956540126 h_a=+8.890131e-001 H_a=7078070176 h_g=+1.000000e+000 H_g=7493934400 u_a=+5.026386e-001 u_g=+3.000000e-001 W_a=2597914400 goalΔ=+1.434211e-001 actualΔ=+3.243424e-002 u_a/u_g=+1.675462e+000
gc 118 @917.797s 9%: 82{"level":"error","time":"2019-08-29T17:28:39+08:00","caller":"/home/zhoudazhuang/db11/jm/pro/src/acs/comet/app_config_server.go:488","message":"debug-连接读取err: EOF\n"}
pacer: sweep done at heap size 3300MB; allocated 46MB during sweep; swept 902606 pages at +2.935864e-004 pages/byte
pacer: assist ratio=+6.471221e+000 (scan 4149 MB in 6189->6509 MB) workers=12++0.000000e+000
pacer: H_m_prev=3412940008 h_t=+9.011183e-001 H_T=6488402582 h_a=+9.996110e-001 H_a=6824552376 h_g=+1.000000e+000 H_g=6825880016 u_a=+6.356443e-001 u_g=+3.000000e-001 W_a=2624284568 goalΔ=+9.888174e-002 actualΔ=+9.849273e-002 u_a/u_g=+2.118814e+000
gc 119 @943.925s 9%: 49+2397+0.54 ms clock, 2393+45310/28753/18408+26 ms cpu, 6189->6508->3319 MB, 6509 MB goal, 48 P
pacer: sweep done at heap size 3499MB; allocated 180MB during sweep; swept 903532 pages at +3.068945e-004 pages/byte
pacer: assist ratio=+3.975386e+000 (scan 4027 MB in 6131->6638 MB) workers=12++0.000000e+000
pacer: H_m_prev=3480392352 h_t=+8.462152e-001 H_T=6425553323 h_a=+8.901853e-001 H_a=6578586384 h_g=+1.000000e+000 H_g=6960784704 u_a=+9.458663e-001 u_g=+3.000000e-001 W_a=2559212424 goalΔ=+1.537848e-001 actualΔ=+4.397006e-002 u_a/u_g=+3.152888e+000
gc 120 @959.075{"level":"error","time":"2019-08-29T17:29:20+08:00","caller":"/home/zhoudazhuang/db11/jm/pro/src/acs/comet/app_config_server.go:488","message":"debug-连接读取err: EOF\n"}
pacer: sweep done at heap size 3924MB; allocated 718MB during sweep; swept 878406 pages at +3.061361e-004 pages/byte
pacer: assist ratio=+4.010302e+000 (scan 3735 MB in 5946->6412 MB) workers=12++0.000000e+000
pacer: H_m_prev=3361922320 h_t=+8.537913e-001 H_T=6232302287 h_a=+9.080987e-001 H_a=6414879704 h_g=+1.000000e+000 H_g=6723844640 u_a=+9.295912e-001 u_g=+3.000000e-001 W_a=2689606496 goalΔ=+1.462087e-001 actualΔ=+5.430745e-002 u_a/u_g=+3.098637e+000
gc 121 @973.729s {"level":"error","time":"2019-08-29T17:29:35+08:00","caller":"/home/zhoudazhuang/db11/jm/pro/src/acs/comet/app_config_server.go:488","message":"debug-连接读取err: EOF\n"}
pacer: sweep done at heap size 3743MB; allocated 369MB during sweep; swept 869402 pages at +2.916881e-004 pages/byte
pacer: assist ratio=+3.599922e+000 (scan 3785 MB in 6222->6748 MB) workers=12++0.000000e+000
pacer: H_m_prev=3537958864 h_t=+8.427561e-001 H_T=6519595278 h_a=+8.807810e-001 H_a=6654125688 h_g=+1.000000e+000 H_g=7075917728 u_a=+9.500049e-001 u_g=+3.000000e-001 W_a=2775714840 goalΔ=+1.572439e-001 actualΔ=+3.802487e-002 u_a/u_g=+3.166683e+000
gc 122 @990.926s 9%: 2.6+2655+2.4 ms clock, 126+89366/31805/0+115 ms cpu, 6222->6345->3453 MB, 6748 MB goal, 48 P
pacer: sweep done at heap size {"level":"error","time":"2019-08-29T17:29:54+08:00","caller":"/home/zhoudazhuang/db11/jm/pro/src/acs/comet/app_config_server.go:488","message":"debug-连接读取err: EOF\n"}
pacer: assist ratio=+4.157021e+000 (scan 3955 MB in 6431->6907 MB) workers=12++0.000000e+000
pacer: H_m_prev=3621465144 h_t=+8.611717e-001 H_T=6740168459 h_a=+9.102292e-001 H_a=6917828576 h_g=+1.000000e+000 H_g=7242930288 u_a=+9.354194e-001 u_g=+3.000000e-001 W_a=2754253200 goalΔ=+1.388283e-001 actualΔ=+4.905752e-002 u_a/u_g=+3.118065e+000
gc 123 @1016.400s 9%: 7.6+2686+2.2 ms clock, 369+88653/32163/85+110 ms cpu, 6431->6597->3458 MB, 6907 MB goal, 48 P
pacer: sweep done at heap size 3909MB; allocated 450MB during sweep; swept 891685 pages at +2.879660e-004 pages/byte

@Anteoy

This comment has been minimized.

Copy link
Author

commented Aug 29, 2019

I found const gcBackgroundUtilization and dedicatedMarkWorkersNeeded and are trying to adjust it.

@Anteoy

This comment has been minimized.

Copy link
Author

commented Aug 30, 2019

I adjusted the proportion of gc cpu and the number of work, but this does not work. What I don't understand is why the CPU usage is usually low, and gc will always make my CPU grow up, even more than 95%, and then my service will be delayed.

@Anteoy

This comment has been minimized.

Copy link
Author

commented Aug 30, 2019

my cpu profile:
image

@Anteoy

This comment has been minimized.

Copy link
Author

commented Aug 30, 2019

top : use gc set GOGC=100
bottom: close gc GOGC=-1
image

@RLH

This comment has been minimized.

Copy link
Contributor

commented Aug 30, 2019

@Anteoy

This comment has been minimized.

Copy link
Author

commented Aug 30, 2019

@RLH Thanks for you advise. But I have tried modifying GOGC=200,300,500 or 50, 80 before, and also modified GOMAXPROCS and combined them. Later I even modified source code for runtime const gcBackgroundUtilization and dedicatedMarkWorkersNeeded. as you said modifying the GOGC env, but this does not solve my problem.

The problem with me is that sporadic gc will cause the cpu occupancy rate to reach 95% or more, which will cause my other code, such as proto.marshal, net.dial, base64 decode or lock, to cause a delay of more than 2 seconds or more.

Is there any other thing I can offer?

@RLH

This comment has been minimized.

Copy link
Contributor

commented Aug 30, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.