Though it was marked as merely possible, it really happened.
Issue Description
A data race on a map occurred, causing a panic and process termination. Logs are attached below.
Issue Analysis
health_check.go:57 `checkOneContainer()` could be invoked concurrently from:
- `checkAllContainers()`
- `handleContainerStart()`

And the `labels` map is shared between these invocations.

Generally speaking, there are two ways to solve this:
1. Always return a fresh copy of the `labels` map.
2. Avoid the race condition itself.

In this case I personally prefer the second solution, because checking the same container concurrently doesn't make sense. A simple fix might be to introduce some kind of locking in `checkOneContainer()` so that duplicated checks are skipped (just a suggestion).
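A minimal sketch of the suggested lock-based dedup. All names here (`tryBeginCheck`, `endCheck`, the `checking` set) are hypothetical and only illustrate the idea of skipping an in-flight check for the same container; they are not eru-agent's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// checking tracks container IDs whose health check is in flight.
var (
	mu       sync.Mutex
	checking = map[string]struct{}{}
)

// tryBeginCheck reports whether the caller may proceed; it returns
// false if a check for this container is already running.
func tryBeginCheck(id string) bool {
	mu.Lock()
	defer mu.Unlock()
	if _, busy := checking[id]; busy {
		return false
	}
	checking[id] = struct{}{}
	return true
}

// endCheck marks the container's check as finished.
func endCheck(id string) {
	mu.Lock()
	defer mu.Unlock()
	delete(checking, id)
}

func checkOneContainer(id string) {
	if !tryBeginCheck(id) {
		return // another goroutine is already checking this container
	}
	defer endCheck(id)
	fmt.Println("checking", id)
	// ... actual health check and SetContainerStatus would go here ...
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			checkOneContainer("e1851e2")
		}()
	}
	wg.Wait()
}
```

With this, concurrent invocations from `checkAllContainers()` and `handleContainerStart()` never run the check body for the same container at the same time, so they never iterate and mutate the shared `labels` map concurrently.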
Related Logs
Feb 9 11:03:58 dev01 eru-agent[13787]: time="2021-02-09T11:03:58+08:00" level=info msg="[Watch] Monitor: cid e1851e2 action start"
Feb 9 11:03:58 dev01 eru-agent[13787]: time="2021-02-09T11:03:58+08:00" level=info msg="[attach] attach redisproxy container e1851e2 success"
Feb 9 11:03:58 dev01 eru-agent[13787]: fatal error: concurrent map iteration and map write
Feb 9 11:03:58 dev01 eru-agent[13787]: goroutine 26264 [running]:
Feb 9 11:03:58 dev01 eru-agent[13787]: runtime.throw(0x12942cf, 0x26)
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/usr/local/golang/src/runtime/panic.go:1116 +0x72 fp=0xc000141aa0 sp=0xc000141a70 pc=0x437a92
Feb 9 11:03:58 dev01 eru-agent[13787]: runtime.mapiternext(0xc000474060)
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/usr/local/golang/src/runtime/map.go:853 +0x554 fp=0xc000141b20 sp=0xc000141aa0 pc=0x410d34
Feb 9 11:03:58 dev01 eru-agent[13787]: reflect.mapiternext(0xc000474060)
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/usr/local/golang/src/runtime/map.go:1337 +0x2b fp=0xc000141b38 sp=0xc000141b20 pc=0x469c8b
Feb 9 11:03:58 dev01 eru-agent[13787]: reflect.Value.MapKeys(0x10ca180, 0xc0005642a0, 0x15, 0x0, 0xc000141c18, 0x40429a)
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/usr/local/golang/src/reflect/value.go:1227 +0x10c fp=0xc000141bc8 sp=0xc000141b38 pc=0x4a02ac
Feb 9 11:03:58 dev01 eru-agent[13787]: encoding/json.mapEncoder.encode(0x12c2d58, 0xc00028ed80, 0x10ca180, 0xc0005642a0, 0x15, 0x10c0100)
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/usr/local/golang/src/encoding/json/encode.go:785 +0xff fp=0xc000141d40 sp=0xc000141bc8 pc=0x55351f
Feb 9 11:03:58 dev01 eru-agent[13787]: encoding/json.mapEncoder.encode-fm(0xc00028ed80, 0x10ca180, 0xc0005642a0, 0x15, 0x1b00100)
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/usr/local/golang/src/encoding/json/encode.go:777 +0x65 fp=0xc000141d80 sp=0xc000141d40 pc=0x55fbe5
Feb 9 11:03:58 dev01 eru-agent[13787]: encoding/json.(*encodeState).reflectValue(0xc00028ed80, 0x10ca180, 0xc0005642a0, 0x15, 0xc000140100)
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/usr/local/golang/src/encoding/json/encode.go:358 +0x82 fp=0xc000141db8 sp=0xc000141d80 pc=0x5508c2
Feb 9 11:03:58 dev01 eru-agent[13787]: encoding/json.(*encodeState).marshal(0xc00028ed80, 0x10ca180, 0xc0005642a0, 0x880100, 0x0, 0x0)
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/usr/local/golang/src/encoding/json/encode.go:330 +0xf4 fp=0xc000141e18 sp=0xc000141db8 pc=0x5504b4
Feb 9 11:03:58 dev01 eru-agent[13787]: encoding/json.Marshal(0x10ca180, 0xc0005642a0, 0x40, 0x106d340, 0xc00068a0b0, 0x45d964b800, 0x4b)
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/usr/local/golang/src/encoding/json/encode.go:161 +0x52 fp=0xc000141e90 sp=0xc000141e18 pc=0x54faf2
Feb 9 11:03:58 dev01 eru-agent[13787]: github.com/projecteru2/agent/store/core.(*CoreStore).SetContainerStatus(0xc000509a40, 0x13f25c0, 0xc00003c658, 0xc000256370, 0xc000416000, 0x0, 0x0)
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/Users/jason.zhuyj/projects/eru2/agent/store/core/container.go:29 +0x6d fp=0xc000141f78 sp=0xc000141e90 pc=0xb917ed
Feb 9 11:03:58 dev01 eru-agent[13787]: github.com/projecteru2/agent/engine.(*Engine).checkOneContainer(0xc0003fb1a0, 0xc000256370)
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/Users/jason.zhuyj/projects/eru2/agent/engine/health_check.go:68 +0x77 fp=0xc000141fd0 sp=0xc000141f78 pc=0xbbd517
Feb 9 11:03:58 dev01 eru-agent[13787]: runtime.goexit()
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/usr/local/golang/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc000141fd8 sp=0xc000141fd0 pc=0x470121
Feb 9 11:03:58 dev01 eru-agent[13787]: created by github.com/projecteru2/agent/engine.(*Engine).handleContainerStart
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/Users/jason.zhuyj/projects/eru2/agent/engine/monitor.go:52 +0x2cf
Feb 9 11:03:58 dev01 eru-agent[13787]: goroutine 1 [select, 1115 minutes]:
Feb 9 11:03:58 dev01 eru-agent[13787]: runtime.gopark(0x12c65c8, 0x0, 0x1809, 0x1)
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/usr/local/golang/src/runtime/proc.go:306 +0xe5 fp=0xc00018fa68 sp=0xc00018fa48 pc=0x43a685
Feb 9 11:03:58 dev01 eru-agent[13787]: runtime.selectgo(0xc00018fc28, 0xc00018fbd0, 0x2, 0x1, 0x1)
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/usr/local/golang/src/runtime/select.go:338 +0xcef fp=0xc00018fb90 sp=0xc00018fa68 pc=0x44a7ef
Feb 9 11:03:58 dev01 eru-agent[13787]: github.com/projecteru2/agent/engine.(*Engine).Run(0xc0003fb1a0, 0x13f2580, 0xc00020cec0, 0x0, 0x0)
Feb 9 11:03:58 dev01 eru-agent[13787]: #011/Users/jason.zhuyj/projects/eru2/agent/engine/engine.go:107 +0x239 fp=0xc00018fc88 sp=0xc00018fb90 pc=0xbbcaf9
Feb 9 11:03:58 dev01 eru-agent[13787]: main.serve(0xc00020c3c0, 0x0, 0x0)