Fatal error: concurrent map read and map write #6235

Closed
syepes opened this Issue Apr 6, 2016 · 7 comments

syepes commented Apr 6, 2016

Bug report

Has anyone else been experiencing this issue?

System info:
InfluxDB Master (2f6340b, e22f098, c2ac8c8)
FreeBSD 10.3 / 16GB / ZFS

Steps to reproduce:
Just run InfluxDB (2f6340b, e22f098, c2ac8c8) for 1 or 2 days and it will crash.

Actual behavior:
Since version 2f6340b I have been getting this fatal error: concurrent map read and map write

Additional info: Crash logs from version (2f6340b, e22f098, c2ac8c8)
Carsh_1.txt
Carsh_1a.txt
Carsh_2.txt
Carsh_3.txt

Contributor

jonseymour commented Apr 6, 2016

Are you compiling with go1.6? This symptom may not occur in builds compiled with go1.4.3, even though the issue is still latent in such builds. I say this because I had another program with a latent map concurrency issue that I wasn't aware of until I started compiling with 1.6.

Of course, the underlying concurrency issue should be addressed, irrespective of which version you are using.
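
For context on the comment above: Go 1.6 added best-effort runtime detection of concurrent map misuse, which is why a race that went unnoticed under go1.4.3 can surface as this fatal error. A minimal sketch of the usual remedy, guarding the map with a sync.RWMutex, might look like the following. The names safeCounts and runWorkers are hypothetical and are not taken from InfluxDB.

```go
package main

import (
	"fmt"
	"sync"
)

// safeCounts is a hypothetical wrapper (not InfluxDB code). The RWMutex keeps
// writers and readers from touching the map at the same time.
type safeCounts struct {
	mu sync.RWMutex
	m  map[string]int
}

func newSafeCounts() *safeCounts {
	return &safeCounts{m: make(map[string]int)}
}

// Get takes the read lock, so many readers may proceed in parallel.
func (c *safeCounts) Get(key string) (int, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.m[key]
	return v, ok
}

// Add takes the write lock, excluding all readers and other writers.
func (c *safeCounts) Add(key string, delta int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[key] += delta
}

// runWorkers starts n writers and n readers against the same map and
// returns the final count once all of them have finished.
func runWorkers(n int) int {
	c := newSafeCounts()
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(2)
		go func() { defer wg.Done(); c.Add("cpu", 1) }()
		go func() { defer wg.Done(); c.Get("cpu") }()
	}
	wg.Wait()
	v, _ := c.Get("cpu")
	return v
}

func main() {
	fmt.Println(runWorkers(100)) // prints 100
}
```

Without the lock, the same readers and writers would race on the plain map, and a Go 1.6+ runtime may abort the process with the error reported in this issue.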

syepes commented Apr 6, 2016

@jonseymour Thanks for the info, I will rebuild InfluxDB with Go1.4.3 and see if it resolves this issue.

Contributor

jonseymour commented Apr 6, 2016

My take on the issue is that the map dereference at line 374 (https://github.com/influxdata/influxdb/blob/master/tsdb/shard.go#L374) used to be protected by the DatabaseIndex lock, but since 03ced4c this protection has been removed.

/cc @jwilder
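
To illustrate the kind of regression described above: when a map used to be safe only because every caller held an outer lock, removing that lock leaves the map unprotected. One way to restore safety is to give the owning type its own lock. The sketch below assumes a hypothetical series type with a shardIDs map; it is a stand-in for illustration, not InfluxDB's actual tsdb code or the actual fix.

```go
package main

import (
	"fmt"
	"sync"
)

// series is a hypothetical stand-in, not InfluxDB's tsdb.Series.
// It owns its lock, so it no longer relies on callers holding an outer one.
type series struct {
	mu       sync.RWMutex
	shardIDs map[uint64]bool
}

func newSeries() *series {
	return &series{shardIDs: make(map[uint64]bool)}
}

// AssignShard records that the series has data in the given shard.
func (s *series) AssignShard(id uint64) {
	s.mu.Lock()
	s.shardIDs[id] = true
	s.mu.Unlock()
}

// Assigned reports whether the series has been assigned to the shard.
func (s *series) Assigned(id uint64) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.shardIDs[id]
}

func main() {
	s := newSeries()
	var wg sync.WaitGroup
	for id := uint64(1); id <= 8; id++ {
		wg.Add(2)
		go func(id uint64) { defer wg.Done(); s.AssignShard(id) }(id)
		go func(id uint64) { defer wg.Done(); _ = s.Assigned(id) }(id)
	}
	wg.Wait()
	fmt.Println(s.Assigned(3), s.Assigned(9)) // prints: true false
}
```

Keeping the lock next to the data it protects means a later refactoring of the outer locking cannot silently expose the map again.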

e-dard added the panic label Apr 6, 2016

e-dard added this to the 0.13.0 milestone Apr 6, 2016

e-dard self-assigned this Apr 6, 2016

e-dard added a commit that referenced this issue Apr 6, 2016

e-dard referenced this issue Apr 6, 2016

Closed

Prevent concurrent access to a Series' shard IDs #6236

3 of 3 tasks complete

Member

e-dard commented Apr 6, 2016

@syepes Thanks for the detailed report. The issue should be fixed in the next release.

Member

e-dard commented Apr 6, 2016

Fixed via #6190.

duccyberfend commented May 16, 2016

This isn't fixed. I'm running the nightly (5/16/2016) and still see the crash.

fatal error: concurrent map read and map write

goroutine 200902351 [running]:
runtime.throw(0xd23aa0, 0x21)
        /usr/local/go/src/runtime/panic.go:547 +0x90 fp=0xc846174558 sp=0xc846174540
runtime.mapaccess1_faststr(0xa2cda0, 0xc8268ce540, 0xc820221320, 0xa, 0xc840576c00)
        /usr/local/go/src/runtime/hashmap_fast.go:202 +0x5b fp=0xc8461745b8 sp=0xc846174558
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).buildCursor(0xc82e307ad0, 0xc820221320, 0xa, 0xc823807ea0, 0x47, 0xc8434340a0, 0x3, 0x7fbef758a0e0, 0xc84bac8d00, 0x110dff0, ...)
        /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:1164 +0x7f fp=0xc846174740 sp=0xc8461745b8
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).createVarRefSeriesIterator(0xc82e307ad0, 0xc84bac8d00, 0xc8223d9e00, 0xc823807ea0, 0x47, 0xc82f0abdb0, 0x0, 0x0, 0x110dff0, 0x0, ...)
        /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:1140 +0x869 fp=0xc846174b80 sp=0xc846174740
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).createTagSetGroupIterators(0xc82e307ad0, 0xc84bac8d00, 0xc8223d9e00, 0xc8429afeb0, 0x1, 0x1, 0xc82f0abdb0, 0xc8429afec0, 0x1, 0x1, ...)
        /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:1067 +0x413 fp=0xc846174de0 sp=0xc846174b80
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).createTagSetIterators.func1(0xc8425b45e0, 0xc83d3c5f80, 0x1, 0x1, 0xc82e307ad0, 0xc84bac8d00, 0xc8223d9e00, 0xc82f0abdb0, 0xc8213edb20, 0x0)
        /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:1022 +0x11a fp=0xc846174f60 sp=0xc846174de0
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc846174f68 sp=0xc846174f60
created by github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).createTagSetIterators
        /root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:1023 +0x419

goroutine 1 [chan receive, 178 minutes]:
main.(*Main).Run(0xc8417c1f10, 0xc82000a0b0, 0x4, 0x4, 0x0, 0x0)
        /root/go/src/github.com/influxdata/influxdb/cmd/influxd/main.go:94 +0xa39
main.main()
        /root/go/src/github.com/influxdata/influxdb/cmd/influxd/main.go:45 +0x299

goroutine 17 [syscall, 178 minutes, locked to thread]:
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1998 +0x1

goroutine 5 [syscall, 178 minutes]:
os/signal.signal_recv(0x0)
        /usr/local/go/src/runtime/sigqueue.go:116 +0x132
os/signal.loop()
        /usr/local/go/src/os/signal/signal_unix.go:22 +0x18
created by os/signal.init.1
        /usr/local/go/src/os/signal/signal_unix.go:28 +0x37

goroutine 18 [IO wait, 178 minutes]:
net.runtime_pollWait(0x7fbf16e10e48, 0x72, 0x0)
        /usr/local/go/src/runtime/netpoll.go:160 +0x60
net.(*pollDesc).Wait(0xc8201e5330, 0x72, 0x0, 0x0)
        /usr/local/go/src/net/fd_poll_runtime.go:73 +0x3a
net.(*pollDesc).WaitRead(0xc8201e5330, 0x0, 0x0)
        /usr/local/go/src/net/fd_poll_runtime.go:78 +0x36
net.(*netFD).accept(0xc8201e52d0, 0x0, 0x7fbf16e4c4a8, 0xc8200bcae0)
        /usr/local/go/src/net/fd_unix.go:426 +0x27c
net.(*TCPListener).AcceptTCP(0xc820140470, 0x0, 0x0, 0x0)
        /usr/local/go/src/net/tcpsock_posix.go:254 +0x4d
net.(*TCPListener).Accept(0xc820140470, 0x0, 0x0, 0x0, 0x0)
        /usr/local/go/src/net/tcpsock_posix.go:264 +0x3d
github.com/influxdata/influxdb/tcp.(*Mux).Serve(0xc8201f1360, 0x7fbf16e0fec8, 0xc820140470, 0x0, 0x0)
        /root/go/src/github.com/influxdata/influxdb/tcp/mux.go:52 +0xb0
created by github.com/influxdata/influxdb/cmd/influxd/run.(*Server).Open
        /root/go/src/github.com/influxdata/influxdb/cmd/influxd/run/server.go:231 +0x3fb
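
In the trace above, the faulting read (runtime.mapaccess1_faststr, called from buildCursor) runs in a goroutine created by createTagSetIterators, so a map is being read by fan-out workers while something else writes to it. One general way to make such fan-out reads safe is to copy the map under a lock first and let the workers read only the copy. The sketch below shows that technique with hypothetical names (fieldIndex, snapshot, lookupAll); it is not a description of what #6647 changed.

```go
package main

import (
	"fmt"
	"sync"
)

// fieldIndex is a hypothetical structure (not InfluxDB code): a map that
// writers keep mutating while queries read it.
type fieldIndex struct {
	mu     sync.RWMutex
	fields map[string]int
}

// snapshot copies the map under the read lock, so the caller owns the result
// and later writes to f.fields cannot affect it.
func (f *fieldIndex) snapshot() map[string]int {
	f.mu.RLock()
	defer f.mu.RUnlock()
	cp := make(map[string]int, len(f.fields))
	for k, v := range f.fields {
		cp[k] = v
	}
	return cp
}

// set adds or updates a field under the write lock.
func (f *fieldIndex) set(name string, typ int) {
	f.mu.Lock()
	f.fields[name] = typ
	f.mu.Unlock()
}

// lookupAll fans out one goroutine per name. Each goroutine reads only the
// snapshot and writes only its own slot of the result slice.
func lookupAll(f *fieldIndex, names []string) []int {
	snap := f.snapshot()
	out := make([]int, len(names))
	var wg sync.WaitGroup
	for i, name := range names {
		wg.Add(1)
		go func(i int, name string) {
			defer wg.Done()
			out[i] = snap[name] // concurrent reads of an unmodified map are safe
		}(i, name)
	}
	wg.Wait()
	return out
}

func main() {
	f := &fieldIndex{fields: map[string]int{"value": 1, "load": 2}}
	go f.set("temp", 3) // a writer that may run alongside the readers below
	fmt.Println(lookupAll(f, []string{"value", "load"})) // prints [1 2]
}
```

The trade-off is an extra copy per query; taking the read lock inside each worker avoids the copy at the cost of holding the lock for longer.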

e-dard reopened this May 17, 2016

e-dard modified the milestones: 1.0.0, 0.13.0 May 17, 2016

e-dard referenced this issue May 17, 2016

Merged

Fix concurrent map access panic #6647

3 of 3 tasks complete

e-dard closed this in #6647 May 18, 2016

Member

e-dard commented May 18, 2016

@duccyberfend This will be fixed in 1.0.0.
