Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gRPC v1.7.3 transport "panic: send on closed channel" on *serverHandlerTransport #8904

Closed
zbindenren opened this issue Nov 22, 2017 · 26 comments · Fixed by grpc/grpc-go#1687 or #8972
Closed
Labels
Milestone

Comments

@zbindenren
Copy link
Contributor

Hi

Similar to #8595 we still get a panic with v3.2.10. Here the log:

Nov 22 09:44:38 p1-linux-mlsu007 etcd[124081]: compacted raft log at 6678643315
Nov 22 09:44:46 p1-linux-mlsu007 etcd[124081]: apply entries took too long [3.216781052s for 2 entries]
Nov 22 09:44:46 p1-linux-mlsu007 etcd[124081]: avoid queries with large range/delete range!
Nov 22 09:44:47 p1-linux-mlsu007 etcd[124081]: purged file /appl/etcd/data/p1-linux-mlsu007/member/snap/0000000000003b38-000000018e1365ae.snap successfully
Nov 22 09:44:47 p1-linux-mlsu007 etcd[124081]: purged file /appl/etcd/data/p1-linux-mlsu007/member/snap/0000000000003b38-000000018e137937.snap successfully
Nov 22 09:44:47 p1-linux-mlsu007 etcd[124081]: purged file /appl/etcd/data/p1-linux-mlsu007/member/snap/0000000000003b38-000000018e138cc2.snap successfully
Nov 22 09:44:47 p1-linux-mlsu007 etcd[124081]: purged file /appl/etcd/data/p1-linux-mlsu007/member/snap/0000000000003b38-000000018e13a04f.snap successfully
Nov 22 09:44:47 p1-linux-mlsu007 etcd[124081]: purged file /appl/etcd/data/p1-linux-mlsu007/member/snap/0000000000003b38-000000018e13b3da.snap successfully
Nov 22 09:44:48 p1-linux-mlsu007 etcd[124081]: failed to send out heartbeat on time (exceeded the 150ms timeout for 56.359018ms)
Nov 22 09:44:48 p1-linux-mlsu007 etcd[124081]: server is likely overloaded
Nov 22 09:44:48 p1-linux-mlsu007 etcd[124081]: failed to send out heartbeat on time (exceeded the 150ms timeout for 56.399292ms)
Nov 22 09:44:48 p1-linux-mlsu007 etcd[124081]: server is likely overloaded
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: start to snapshot (applied: 6678653654, lastsnap: 6678648315)
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: panic: send on closed channel
Nov 22 09:44:59 p1-linux-mlsu007 systemd[1]: etcd.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Nov 22 09:44:59 p1-linux-mlsu007 systemd[1]: Unit etcd.service entered failed state.
Nov 22 09:44:59 p1-linux-mlsu007 systemd[1]: etcd.service failed.
Nov 22 09:45:09 p1-linux-mlsu007 systemd[1]: etcd.service holdoff time over, scheduling restart.
@gyuho
Copy link
Contributor

gyuho commented Nov 22, 2017

Can you share the full stack trace?

@zbindenren
Copy link
Contributor Author

@gyuho unfortunately that is all I see in the logs. The server is started via systemd.

@gyuho
Copy link
Contributor

gyuho commented Nov 23, 2017

@zbindenren Are you sure it's running v3.2.10?

Even systemd should be able to display panic stack trace.
You had stack trace here #8595...

We can't help without stack trace.

@zbindenren
Copy link
Contributor Author

There seems to be a bug with journalctl: systemd/systemd#3277 I‘ll provide the stack trace asap.

@zbindenren
Copy link
Contributor Author

@gyuho here the stack:

Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: start to snapshot (applied: 6678653654, lastsnap: 6678648315)
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: panic: send on closed channel
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: goroutine 4676432 [running]:
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).do(0xc44bf29ea0, 0xc4633e69b0, 0xe62c60, 0xc447ed3d
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:170 +0x174
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).Write(0xc44bf29ea0, 0xc4715e6a00, 0xc47c2da96a, 0x5
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:263 +0xf8
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc.(*serverStream).SendMsg(0xc429991280, 0xed0840, 0xc490674400, 0x0, 0x0)
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/stream.go:611 +0x29d
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*serverStreamWithCtx).SendMsg(0xc462877230, 0xed0840, 0xc49067440
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: <autogenerated>:5 +0x5d
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus.(*monitoredServerStream).SendMsg(0xc44b5ce320, 0xed0840, 0xc49067
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus/server.go:61 +0x4b
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb.(*leaseLeaseKeepAliveServer).Send(0xc44030a430, 0xc490674400, 0
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb/rpc.pb.go:3432 +0x49
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*LeaseServer).leaseKeepAlive(0xc4291742c0, 0x14b76a0, 0xc44030a43
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/lease.go:118 +0x1b7
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*LeaseServer).LeaseKeepAlive.func1(0xc4424751a0, 0xc4291742c0, 0x
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/lease.go:74 +0x3f
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: created by github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*LeaseServer).LeaseKeepAlive
Nov 22 09:44:59 p1-linux-mlsu007 etcd[124081]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/lease.go:75 +0x96

@gyuho gyuho changed the title panic: send on closed channel gRPC v1.7.3 transport "panic: send on closed channel" on *serverHandlerTransport Nov 23, 2017
@gyuho gyuho added this to the v3.3.0 milestone Nov 24, 2017
@gyuho
Copy link
Contributor

gyuho commented Nov 29, 2017

@zbindenren We are still trying to figure out if this is a bug in gRPC-side or etcd. How often do you see this panics? And any easy way to reproduce this?

@zbindenren
Copy link
Contributor Author

@gyuho We have this panics daily on all three nodes. Here the events for the 27th:

[root@p1-linux-mlsu006 ~]# journalctl -l --since='2017-11-27' --until='2017-11-28' |grep panic
Nov 27 09:32:42 p1-linux-mlsu006 etcd[80596]: panic: send on closed channel
Nov 27 09:32:42 p1-linux-mlsu006 etcd[80596]: panic: send on closed channel
Nov 27 09:32:57 p1-linux-mlsu006 etcd[39807]: panic: send on closed channel
Nov 27 16:20:57 p1-linux-mlsu006 etcd[40044]: panic: send on closed channel
Nov 27 16:21:12 p1-linux-mlsu006 etcd[43467]: panic: send on closed channel
Nov 27 16:21:28 p1-linux-mlsu006 etcd[43494]: panic: send on closed channel
Nov 27 16:42:57 p1-linux-mlsu006 etcd[43537]: panic: send on closed channel
[root@p1-linux-mlsu007 ~]# journalctl -l --since='2017-11-27' --until='2017-11-28' |grep panic
Nov 27 09:32:41 p1-linux-mlsu007 etcd[81077]: panic: send on closed channel
Nov 27 11:52:14 p1-linux-mlsu007 etcd[109013]: panic: send on closed channel
Nov 27 13:18:49 p1-linux-mlsu007 etcd[6570]: panic: send on closed channel
Nov 27 15:16:38 p1-linux-mlsu007 etcd[24833]: panic: send on closed channel
Nov 27 16:20:58 p1-linux-mlsu007 etcd[50016]: panic: send on closed channel
Nov 27 16:21:12 p1-linux-mlsu007 etcd[68518]: panic: send on closed channel
Nov 27 16:21:31 p1-linux-mlsu007 etcd[68546]: panic: send on closed channel
Nov 27 16:42:50 p1-linux-mlsu007 etcd[68590]: panic: send on closed channel
Nov 27 16:43:13 p1-linux-mlsu007 etcd[74379]: panic: send on closed channel
[root@p1-linux-mlsu008 ~]# journalctl -l --since='2017-11-27' --until='2017-11-28' |grep panic
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: panic: send on closed channel
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: panic: send on closed channel
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: panic: send on closed channel
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: panic: send on closed channel
Nov 27 16:42:54 p1-linux-mlsu008 etcd[16576]: panic: send on closed channel

Unfortunately I have no easy way to reproduce this. We have this bug only in production. The production cluster is a three node cluster with around 2000 clients. There are around 2000 lease and watch streams. Usually there is not a lot of watch traffic. The panic usually occurs when there are increased watch events.

Does this help? Do you need more information?

@gyuho
Copy link
Contributor

gyuho commented Nov 29, 2017

@zbindenren Ok this definitely needs fix.

Do you have full stack trace? Longer one than #8904 (comment) would be super helpful!

@zbindenren
Copy link
Contributor Author

@gyuho Here some stack traces from one node:

Nov 27 09:32:29 p1-linux-mlsu008 etcd[9260]: saved snapshot at index 6864775080
Nov 27 09:32:29 p1-linux-mlsu008 etcd[9260]: compacted raft log at 6864770080
Nov 27 09:32:34 p1-linux-mlsu008 etcd[9260]: apply entries took too long [4.636721567s for 4 entries]
Nov 27 09:32:34 p1-linux-mlsu008 etcd[9260]: avoid queries with large range/delete range!
Nov 27 09:32:41 p1-linux-mlsu008 etcd[9260]: lost the TCP streaming connection with peer d3b04b5071b4a6c1 (stream MsgApp v2 reader)
Nov 27 09:32:41 p1-linux-mlsu008 etcd[9260]: failed to read d3b04b5071b4a6c1 on stream MsgApp v2 (unexpected EOF)
Nov 27 09:32:41 p1-linux-mlsu008 etcd[9260]: peer d3b04b5071b4a6c1 became inactive
Nov 27 09:32:41 p1-linux-mlsu008 etcd[9260]: lost the TCP streaming connection with peer d3b04b5071b4a6c1 (stream Message reader)
Nov 27 09:32:42 p1-linux-mlsu008 etcd[9260]: lost the TCP streaming connection with peer d3b04b5071b4a6c1 (stream Message writer)
Nov 27 09:32:42 p1-linux-mlsu008 etcd[9260]: lost the TCP streaming connection with peer bbde196d079d68d3 (stream Message reader)
Nov 27 09:32:42 p1-linux-mlsu008 etcd[9260]: failed to read bbde196d079d68d3 on stream Message (unexpected EOF)
Nov 27 09:32:42 p1-linux-mlsu008 etcd[9260]: peer bbde196d079d68d3 became inactive
Nov 27 09:32:42 p1-linux-mlsu008 etcd[9260]: lost the TCP streaming connection with peer bbde196d079d68d3 (stream MsgApp v2 reader)
Nov 27 09:32:43 p1-linux-mlsu008 etcd[9260]: e27543f3820bc6d9 is starting a new election at term 15176
Nov 27 09:32:43 p1-linux-mlsu008 etcd[9260]: e27543f3820bc6d9 became candidate at term 15177
Nov 27 09:32:43 p1-linux-mlsu008 etcd[9260]: e27543f3820bc6d9 received MsgVoteResp from e27543f3820bc6d9 at term 15177
Nov 27 09:32:43 p1-linux-mlsu008 etcd[9260]: e27543f3820bc6d9 [logterm: 15176, index: 6864781963] sent MsgVote request to bbde196d079d68d3 at term 15177
Nov 27 09:32:43 p1-linux-mlsu008 etcd[9260]: e27543f3820bc6d9 [logterm: 15176, index: 6864781963] sent MsgVote request to d3b04b5071b4a6c1 at term 15177
Nov 27 09:32:43 p1-linux-mlsu008 etcd[9260]: raft.node: e27543f3820bc6d9 lost leader d3b04b5071b4a6c1 at term 15177
Nov 27 09:32:44 p1-linux-mlsu008 mule[38517]: lvl=ERR msg="terminating: etcdserver: request timed out"
Nov 27 09:32:45 p1-linux-mlsu008 etcd[9260]: e27543f3820bc6d9 is starting a new election at term 15177
Nov 27 09:32:45 p1-linux-mlsu008 etcd[9260]: e27543f3820bc6d9 became candidate at term 15178
Nov 27 09:32:45 p1-linux-mlsu008 etcd[9260]: e27543f3820bc6d9 received MsgVoteResp from e27543f3820bc6d9 at term 15178
Nov 27 09:32:45 p1-linux-mlsu008 etcd[9260]: e27543f3820bc6d9 [logterm: 15176, index: 6864781963] sent MsgVote request to bbde196d079d68d3 at term 15178
Nov 27 09:32:45 p1-linux-mlsu008 etcd[9260]: e27543f3820bc6d9 [logterm: 15176, index: 6864781963] sent MsgVote request to d3b04b5071b4a6c1 at term 15178
Nov 27 09:32:45 p1-linux-mlsu008 etcd[9260]: lost the TCP streaming connection with peer bbde196d079d68d3 (stream Message writer)
Nov 27 09:32:47 p1-linux-mlsu008 etcd[9260]: e27543f3820bc6d9 is starting a new election at term 15178
Nov 27 09:32:47 p1-linux-mlsu008 etcd[9260]: e27543f3820bc6d9 became candidate at term 15179
Nov 27 09:32:47 p1-linux-mlsu008 etcd[9260]: e27543f3820bc6d9 received MsgVoteResp from e27543f3820bc6d9 at term 15179
Nov 27 09:32:47 p1-linux-mlsu008 etcd[9260]: e27543f3820bc6d9 [logterm: 15176, index: 6864781963] sent MsgVote request to bbde196d079d68d3 at term 15179
Nov 27 09:32:47 p1-linux-mlsu008 etcd[9260]: e27543f3820bc6d9 [logterm: 15176, index: 6864781963] sent MsgVote request to d3b04b5071b4a6c1 at term 15179
Nov 27 09:32:47 p1-linux-mlsu008 etcd[9260]: lost the TCP streaming connection with peer d3b04b5071b4a6c1 (stream MsgApp v2 writer)
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: start to snapshot (applied: 6864780235, lastsnap: 6864775080)
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: saved snapshot at index 6864780235
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: compacted raft log at 6864775235
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: panic: send on closed channel
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: goroutine 1180462759 [running]:
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).do(0xc4250afdc0, 0xc449fc3360, 0xe62c60, 0xc451161901)
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:170 +0x174
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).Write(0xc4250afdc0, 0xc42feda400, 0xc444fd9b40, 0x5, 0x5, 0xc439355650, 0x21, 0x21, 0xc444fd9b46, 0x5, ...)
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:263 +0xf8
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc.(*serverStream).SendMsg(0xc447b42a00, 0xf0e1c0, 0xc449fc3310, 0x0, 0x0)
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/stream.go:611 +0x29d
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus.(*monitoredServerStream).SendMsg(0xc429e68d40, 0xf0e1c0, 0xc449fc3310, 0xc2893e, 0x20)
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus/server.go:61 +0x4b
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb.(*watchWatchServer).Send(0xc45bc2dca0, 0xc449fc3310, 0x0, 0xc451161b00)
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb/rpc.pb.go:3253 +0x49
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*serverWatchStream).sendLoop(0xc450574aa0)
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:355 +0xb86
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*watchServer).Watch.func1(0xc450574aa0)
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:133 +0x2b
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: created by github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*watchServer).Watch
Nov 27 09:32:48 p1-linux-mlsu008 etcd[9260]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:135 +0x29b
Nov 27 09:32:48 p1-linux-mlsu008 systemd[1]: etcd.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Nov 27 09:32:48 p1-linux-mlsu008 systemd[1]: Unit etcd.service entered failed state.
Nov 27 09:32:48 p1-linux-mlsu008 systemd[1]: etcd.service failed.
Nov 27 09:32:58 p1-linux-mlsu008 systemd[1]: etcd.service holdoff time over, scheduling restart.
Nov 27 15:16:20 p1-linux-mlsu008 etcd[42423]: start to snapshot (applied: 6874095485, lastsnap: 6874090405)
Nov 27 15:16:20 p1-linux-mlsu008 etcd[42423]: saved snapshot at index 6874095485
Nov 27 15:16:20 p1-linux-mlsu008 etcd[42423]: compacted raft log at 6874090485
Nov 27 15:16:26 p1-linux-mlsu008 etcd[42423]: apply entries took too long [5.335969569s for 39 entries]
Nov 27 15:16:26 p1-linux-mlsu008 etcd[42423]: avoid queries with large range/delete range!
Nov 27 15:16:28 p1-linux-mlsu008 etcd[42423]: start to snapshot (applied: 6874100508, lastsnap: 6874095485)
Nov 27 15:16:28 p1-linux-mlsu008 etcd[42423]: saved snapshot at index 6874100508
Nov 27 15:16:28 p1-linux-mlsu008 etcd[42423]: compacted raft log at 6874095508
Nov 27 15:16:32 p1-linux-mlsu008 etcd[42423]: purged file /appl/etcd/data/p1-linux-mlsu008/member/snap/0000000000003b51-0000000199b9d44e.snap successfully
Nov 27 15:16:32 p1-linux-mlsu008 etcd[42423]: purged file /appl/etcd/data/p1-linux-mlsu008/member/snap/0000000000003b51-0000000199b9e7d8.snap successfully
Nov 27 15:16:32 p1-linux-mlsu008 etcd[42423]: purged file /appl/etcd/data/p1-linux-mlsu008/member/snap/0000000000003b51-0000000199b9fbd9.snap successfully
Nov 27 15:16:32 p1-linux-mlsu008 etcd[42423]: purged file /appl/etcd/data/p1-linux-mlsu008/member/snap/0000000000003b51-0000000199ba0f81.snap successfully
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: panic: send on closed channel
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: goroutine 233544409 [running]:
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).do(0xc4324ed0a0, 0xc42394def0, 0xe62c60, 0xc4557bfd01)
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:170 +0x174
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).Write(0xc4324ed0a0, 0xc432cb4000, 0xc44dccda30, 0x5, 0x5, 0xc43152b530, 0x2b, 0x2b, 0xc44dccda36, 0x5, ...)
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:263 +0xf8
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc.(*serverStream).SendMsg(0xc4285bdc80, 0xed0840, 0xc428781860, 0x0, 0x0)
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/stream.go:611 +0x29d
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*serverStreamWithCtx).SendMsg(0xc46f15e4e0, 0xed0840, 0xc428781860, 0x0, 0x1e)
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: <autogenerated>:5 +0x5d
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus.(*monitoredServerStream).SendMsg(0xc452c64020, 0xed0840, 0xc428781860, 0xc422460960, 0xc428747410)
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus/server.go:61 +0x4b
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb.(*leaseLeaseKeepAliveServer).Send(0xc4342001a0, 0xc428781860, 0xc442249440, 0x46d95ffc9bfe4d25)
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb/rpc.pb.go:3432 +0x49
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*LeaseServer).leaseKeepAlive(0xc426218400, 0x14b76a0, 0xc4342001a0, 0xc42e9227c0, 0x68fbf9)
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/lease.go:118 +0x1b7
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*LeaseServer).LeaseKeepAlive.func1(0xc4328dff80, 0xc426218400, 0x14b76a0, 0xc4342001a0)
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/lease.go:74 +0x3f
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: created by github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*LeaseServer).LeaseKeepAlive
Nov 27 15:16:36 p1-linux-mlsu008 etcd[42423]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/lease.go:75 +0x96
Nov 27 15:16:37 p1-linux-mlsu008 systemd[1]: etcd.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Nov 27 15:16:37 p1-linux-mlsu008 systemd[1]: Unit etcd.service entered failed state.
Nov 27 15:16:37 p1-linux-mlsu008 systemd[1]: etcd.service failed.
Nov 27 16:20:49 p1-linux-mlsu008 etcd[119236]: apply entries took too long [1.403121619s for 11 entries]
Nov 27 16:20:49 p1-linux-mlsu008 etcd[119236]: avoid queries with large range/delete range!
Nov 27 16:20:50 p1-linux-mlsu008 etcd[119236]: purged file /appl/etcd/data/p1-linux-mlsu008/member/snap/0000000000003b55-0000000199d60525.snap successfully
Nov 27 16:20:50 p1-linux-mlsu008 etcd[119236]: purged file /appl/etcd/data/p1-linux-mlsu008/member/snap/0000000000003b55-0000000199d618ae.snap successfully
Nov 27 16:20:50 p1-linux-mlsu008 etcd[119236]: purged file /appl/etcd/data/p1-linux-mlsu008/member/snap/0000000000003b55-0000000199d62c43.snap successfully
Nov 27 16:20:53 p1-linux-mlsu008 etcd[119236]: start to snapshot (applied: 6875947501, lastsnap: 6875942460)
Nov 27 16:20:53 p1-linux-mlsu008 etcd[119236]: saved snapshot at index 6875947501
Nov 27 16:20:53 p1-linux-mlsu008 etcd[119236]: compacted raft log at 6875942501
Nov 27 16:20:57 p1-linux-mlsu008 etcd[119236]: lost the TCP streaming connection with peer bbde196d079d68d3 (stream Message reader)
Nov 27 16:20:57 p1-linux-mlsu008 etcd[119236]: failed to read bbde196d079d68d3 on stream Message (unexpected EOF)
Nov 27 16:20:57 p1-linux-mlsu008 etcd[119236]: peer bbde196d079d68d3 became inactive
Nov 27 16:20:57 p1-linux-mlsu008 etcd[119236]: lost the TCP streaming connection with peer bbde196d079d68d3 (stream MsgApp v2 reader)
Nov 27 16:20:58 p1-linux-mlsu008 etcd[119236]: lost the TCP streaming connection with peer d3b04b5071b4a6c1 (stream MsgApp v2 reader)
Nov 27 16:20:58 p1-linux-mlsu008 etcd[119236]: failed to read d3b04b5071b4a6c1 on stream MsgApp v2 (unexpected EOF)
Nov 27 16:20:58 p1-linux-mlsu008 etcd[119236]: peer d3b04b5071b4a6c1 became inactive
Nov 27 16:20:58 p1-linux-mlsu008 etcd[119236]: lost the TCP streaming connection with peer d3b04b5071b4a6c1 (stream Message reader)
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: lost the TCP streaming connection with peer bbde196d079d68d3 (stream Message writer)
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: panic: send on closed channel
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: goroutine 40833272 [running]:
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).do(0xc44ab86af0, 0xc457618d20, 0xe62c60, 0xc444269901)
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:170 +0x174
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).Write(0xc44ab86af0, 0xc4432af000, 0xc42ab12328, 0x5, 0x5, 0xc42503e330, 0x23, 0x23, 0xc42ab1232e, 0x5, ...)
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:263 +0xf8
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc.(*serverStream).SendMsg(0xc43e998d80, 0xf0e1c0, 0xc457618aa0, 0x0, 0x0)
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/stream.go:611 +0x29d
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus.(*monitoredServerStream).SendMsg(0xc42d979380, 0xf0e1c0, 0xc457618aa0, 0xc2893e, 0x20)
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus/server.go:61 +0x4b
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb.(*watchWatchServer).Send(0xc43653b780, 0xc457618aa0, 0x0, 0xc444269b00)
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb/rpc.pb.go:3253 +0x49
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*serverWatchStream).sendLoop(0xc45d2732c0)
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:355 +0xb86
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*watchServer).Watch.func1(0xc45d2732c0)
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:133 +0x2b
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: created by github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*watchServer).Watch
Nov 27 16:20:59 p1-linux-mlsu008 etcd[119236]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:135 +0x29b
Nov 27 16:20:59 p1-linux-mlsu008 systemd[1]: etcd.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Nov 27 16:36:02 p1-linux-mlsu008 etcd[10353]: start to snapshot (applied: 6876374800, lastsnap: 6876369790)
Nov 27 16:36:02 p1-linux-mlsu008 etcd[10353]: saved snapshot at index 6876374800
Nov 27 16:36:02 p1-linux-mlsu008 etcd[10353]: compacted raft log at 6876369800
Nov 27 16:36:07 p1-linux-mlsu008 etcd[10353]: start to snapshot (applied: 6876379807, lastsnap: 6876374800)
Nov 27 16:36:07 p1-linux-mlsu008 etcd[10353]: saved snapshot at index 6876379807
Nov 27 16:36:07 p1-linux-mlsu008 etcd[10353]: compacted raft log at 6876374807
Nov 27 16:36:12 p1-linux-mlsu008 etcd[10353]: purged file /appl/etcd/data/p1-linux-mlsu008/member/snap/0000000000003b60-0000000199dcb13c.snap successfully
Nov 27 16:36:12 p1-linux-mlsu008 etcd[10353]: purged file /appl/etcd/data/p1-linux-mlsu008/member/snap/0000000000003b60-0000000199dcc4f9.snap successfully
Nov 27 16:36:12 p1-linux-mlsu008 etcd[10353]: purged file /appl/etcd/data/p1-linux-mlsu008/member/snap/0000000000003b60-0000000199dcd88c.snap successfully
Nov 27 16:36:13 p1-linux-mlsu008 etcd[10353]: apply entries took too long [5.113383979s for 1 entries]
Nov 27 16:36:13 p1-linux-mlsu008 etcd[10353]: avoid queries with large range/delete range!
Nov 27 16:36:14 p1-linux-mlsu008 etcd[10353]: apply entries took too long [378.396588ms for 1 entries]
Nov 27 16:36:14 p1-linux-mlsu008 etcd[10353]: avoid queries with large range/delete range!
Nov 27 16:36:14 p1-linux-mlsu008 etcd[10353]: apply entries took too long [429.394802ms for 1 entries]
Nov 27 16:36:14 p1-linux-mlsu008 etcd[10353]: avoid queries with large range/delete range!
Nov 27 16:36:20 p1-linux-mlsu008 etcd[10353]: start to snapshot (applied: 6876385057, lastsnap: 6876379807)
Nov 27 16:36:20 p1-linux-mlsu008 etcd[10353]: saved snapshot at index 6876385057
Nov 27 16:36:20 p1-linux-mlsu008 etcd[10353]: compacted raft log at 6876380057
Nov 27 16:36:25 p1-linux-mlsu008 etcd[10353]: start to snapshot (applied: 6876390091, lastsnap: 6876385057)
Nov 27 16:36:25 p1-linux-mlsu008 etcd[10353]: saved snapshot at index 6876390091
Nov 27 16:36:25 p1-linux-mlsu008 etcd[10353]: compacted raft log at 6876385091
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: panic: send on closed channel
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: goroutine 12157639 [running]:
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).do(0xc437f33ea0, 0xc442b799a0, 0xe62c60, 0xc43449d901)
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:170 +0x174
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).Write(0xc437f33ea0, 0xc436065300, 0xc4643a324a, 0x5, 0x5, 0xc425a69fb0, 0x21, 0x21, 0xc4643a32f0, 0x5, ...)
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:263 +0xf8
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc.(*serverStream).SendMsg(0xc46aa8c400, 0xf0e1c0, 0xc442b79590, 0x0, 0x0)
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/stream.go:611 +0x29d
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus.(*monitoredServerStream).SendMsg(0xc4926e4ce0, 0xf0e1c0, 0xc442b79590, 0xc2893e, 0x20)
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus/server.go:61 +0x4b
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb.(*watchWatchServer).Send(0xc441c07ae0, 0xc442b79590, 0x0, 0xc43449db00)
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb/rpc.pb.go:3253 +0x49
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*serverWatchStream).sendLoop(0xc45a1f8a00)
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:355 +0xb86
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*watchServer).Watch.func1(0xc45a1f8a00)
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:133 +0x2b
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: created by github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*watchServer).Watch
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:135 +0x29b
Nov 27 16:36:41 p1-linux-mlsu008 systemd[1]: etcd.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Nov 27 16:36:41 p1-linux-mlsu008 systemd[1]: Unit etcd.service entered failed state.
Nov 27 16:36:41 p1-linux-mlsu008 systemd[1]: etcd.service failed.
Nov 27 16:36:02 p1-linux-mlsu008 etcd[10353]: start to snapshot (applied: 6876374800, lastsnap: 6876369790)
Nov 27 16:36:02 p1-linux-mlsu008 etcd[10353]: saved snapshot at index 6876374800
Nov 27 16:36:02 p1-linux-mlsu008 etcd[10353]: compacted raft log at 6876369800
Nov 27 16:36:07 p1-linux-mlsu008 etcd[10353]: start to snapshot (applied: 6876379807, lastsnap: 6876374800)
Nov 27 16:36:07 p1-linux-mlsu008 etcd[10353]: saved snapshot at index 6876379807
Nov 27 16:36:07 p1-linux-mlsu008 etcd[10353]: compacted raft log at 6876374807
Nov 27 16:36:12 p1-linux-mlsu008 etcd[10353]: purged file /appl/etcd/data/p1-linux-mlsu008/member/snap/0000000000003b60-0000000199dcb13c.snap successfully
Nov 27 16:36:12 p1-linux-mlsu008 etcd[10353]: purged file /appl/etcd/data/p1-linux-mlsu008/member/snap/0000000000003b60-0000000199dcc4f9.snap successfully
Nov 27 16:36:12 p1-linux-mlsu008 etcd[10353]: purged file /appl/etcd/data/p1-linux-mlsu008/member/snap/0000000000003b60-0000000199dcd88c.snap successfully
Nov 27 16:36:13 p1-linux-mlsu008 etcd[10353]: apply entries took too long [5.113383979s for 1 entries]
Nov 27 16:36:13 p1-linux-mlsu008 etcd[10353]: avoid queries with large range/delete range!
Nov 27 16:36:14 p1-linux-mlsu008 etcd[10353]: apply entries took too long [378.396588ms for 1 entries]
Nov 27 16:36:14 p1-linux-mlsu008 etcd[10353]: avoid queries with large range/delete range!
Nov 27 16:36:14 p1-linux-mlsu008 etcd[10353]: apply entries took too long [429.394802ms for 1 entries]
Nov 27 16:36:14 p1-linux-mlsu008 etcd[10353]: avoid queries with large range/delete range!
Nov 27 16:36:20 p1-linux-mlsu008 etcd[10353]: start to snapshot (applied: 6876385057, lastsnap: 6876379807)
Nov 27 16:36:20 p1-linux-mlsu008 etcd[10353]: saved snapshot at index 6876385057
Nov 27 16:36:20 p1-linux-mlsu008 etcd[10353]: compacted raft log at 6876380057
Nov 27 16:36:25 p1-linux-mlsu008 etcd[10353]: start to snapshot (applied: 6876390091, lastsnap: 6876385057)
Nov 27 16:36:25 p1-linux-mlsu008 etcd[10353]: saved snapshot at index 6876390091
Nov 27 16:36:25 p1-linux-mlsu008 etcd[10353]: compacted raft log at 6876385091
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: panic: send on closed channel
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: goroutine 12157639 [running]:
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).do(0xc437f33ea0, 0xc442b799a0, 0xe62c60, 0xc43449d901)
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:170 +0x174
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport.(*serverHandlerTransport).Write(0xc437f33ea0, 0xc436065300, 0xc4643a324a, 0x5, 0x5, 0xc425a69fb0, 0x21, 0x21, 0xc4643a32f0, 0x5, ...)
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/transport/handler_server.go:263 +0xf8
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc.(*serverStream).SendMsg(0xc46aa8c400, 0xf0e1c0, 0xc442b79590, 0x0, 0x0)
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/google.golang.org/grpc/stream.go:611 +0x29d
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus.(*monitoredServerStream).SendMsg(0xc4926e4ce0, 0xf0e1c0, 0xc442b79590, 0xc2893e, 0x20)
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/grpc-ecosystem/go-grpc-prometheus/server.go:61 +0x4b
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb.(*watchWatchServer).Send(0xc441c07ae0, 0xc442b79590, 0x0, 0xc43449db00)
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/etcdserverpb/rpc.pb.go:3253 +0x49
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*serverWatchStream).sendLoop(0xc45a1f8a00)
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:355 +0xb86
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*watchServer).Watch.func1(0xc45a1f8a00)
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:133 +0x2b
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: created by github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc.(*watchServer).Watch
Nov 27 16:36:41 p1-linux-mlsu008 etcd[10353]: /home/gyuho/etcd/release/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/etcdserver/api/v3rpc/watch.go:135 +0x29b
Nov 27 16:36:41 p1-linux-mlsu008 systemd[1]: etcd.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Nov 27 16:36:41 p1-linux-mlsu008 systemd[1]: Unit etcd.service entered failed state.
Nov 27 16:36:41 p1-linux-mlsu008 systemd[1]: etcd.service failed.

@gyuho
Copy link
Contributor

gyuho commented Nov 29, 2017

Ok thanks. We will look into it.

@zbindenren
Copy link
Contributor Author

@gyuho Let me know if I can help more.

@gyuho
Copy link
Contributor

gyuho commented Nov 29, 2017

@zbindenren Sure, will do.

One question, is that cluster using TLS?

We have different gRPC handler when TLS is enabled.

@zbindenren
Copy link
Contributor Author

@gyuho You are right, the cluster is TLS with client certificate authentication. We had problems with user/password authentication, the server could not handle the load with username/password authentication.

@hexfusion
Copy link
Contributor

@gyuho I have not looked into this directly but I do recall seeing a similar issue with the grpc-gateway. What I found was that instead of caching the auth token where possible my Perl client was generating a new token during every request adding a lot of overhead which could result in panic. Could be unrelated but figure I would pass it along.

@gyuho
Copy link
Contributor

gyuho commented Dec 1, 2017

@zbindenren Upstream fix has been merged (grpc/grpc-go#1687).
We are still waiting for their release.

Hopefully, we can release the fix with v3.2.11 next week.

@zbindenren
Copy link
Contributor Author

@gyuho thanks, that would be great. I will test it as soon it is released.

@sokoow
Copy link

sokoow commented Dec 4, 2017

Are you guys planning to release 3.3 any soon ? :) Is this issue blocking it?

@gyuho
Copy link
Contributor

gyuho commented Dec 4, 2017

@sokoow We are waiting for gRPC team's release. Then we will release v3.2.11.
And this is the last issue blocking v3.3 as well.

@gyuho
Copy link
Contributor

gyuho commented Dec 5, 2017

v3.2.11 is released. Please try https://github.com/coreos/etcd/releases/tag/v3.2.11 and let us know if it fixes the issue.

@zbindenren
Copy link
Contributor Author

@gyuho thanks, I'll deploy it tomorrow on production and let you know.

@gyuho
Copy link
Contributor

gyuho commented Dec 7, 2017

@zbindenren Just checking in. Did the new release resolve the issue?

@zbindenren
Copy link
Contributor Author

@gyuho no leader election since update. The last leader election you see is from the update.

unbenannt

@gyuho
Copy link
Contributor

gyuho commented Dec 8, 2017

@zbindenren Thanks for confirming!

@sokoow
Copy link

sokoow commented Dec 9, 2017

Any sign of 3.3 release yet ?

@hexfusion
Copy link
Contributor

Any sign of 3.3 release yet ?

@sokoow v3.3.0-rc.0 is in the works. #8979 no official ETA yet, but I would say the PR is a good sign. 👍

@KlavsKlavsen
Copy link

I have this same issue in 3.2.17+dfsg-1 (ubuntu 18.04).. my entire cluster dies regularly :(
It seems the fix in 3.2.11 did not resolve the issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 participants