Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

: TestStoreRangeDownReplicate failed under stress #10489

Closed
cockroach-teamcity opened this issue Nov 6, 2016 · 3 comments
Closed

: TestStoreRangeDownReplicate failed under stress #10489

cockroach-teamcity opened this issue Nov 6, 2016 · 3 comments
Assignees
Labels
C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot.

Comments

@cockroach-teamcity
Copy link
Member

SHA: https://github.com/cockroachdb/cockroach/commits/fc842bc609b6a937998e5366703d7d318544169c

Stress build found a failed test:

I161106 08:38:49.442603 14740 storage/engine/rocksdb.go:340  opening in memory rocksdb instance
I161106 08:38:49.444408 14740 gossip/gossip.go:237  [n?] initial resolvers: []
W161106 08:38:49.444865 14740 gossip/gossip.go:1055  [n?] no resolvers found; use --join to specify a connected node
I161106 08:38:49.445228 14740 base/node_id.go:62  NodeID set to 1
I161106 08:38:49.473807 14740 storage/store.go:1174  [n1,s1]: failed initial metrics computation: [n1,s1]: system config not yet available
I161106 08:38:49.474138 14740 gossip/gossip.go:280  [n1] NodeDescriptor set to node_id:1 address:<network_field:"tcp" address_field:"127.0.0.1:41123" > attrs:<> locality:<> 
I161106 08:38:49.485622 14761 storage/replica_proposal.go:334  [s1,r1/1:/M{in-ax}] new range lease replica {1 1 1} 1970-01-01 00:00:00 +0000 UTC 900.000124ms following replica {0 0 0} 1970-01-01 00:00:00 +0000 UTC 0s [physicalTime=1970-01-01 00:00:00.000000123 +0000 UTC]
I161106 08:38:49.498541 14740 storage/engine/rocksdb.go:340  opening in memory rocksdb instance
I161106 08:38:49.505082 14740 gossip/gossip.go:237  [n?] initial resolvers: [127.0.0.1:41123]
W161106 08:38:49.505549 14740 gossip/gossip.go:1057  [n?] no incoming or outgoing connections
I161106 08:38:49.506871 14740 base/node_id.go:62  NodeID set to 2
I161106 08:38:49.534011 14740 storage/store.go:1174  [n2,s2]: failed initial metrics computation: [n2,s2]: system config not yet available
I161106 08:38:49.534384 14740 gossip/gossip.go:280  [n2] NodeDescriptor set to node_id:2 address:<network_field:"tcp" address_field:"127.0.0.1:41783" > attrs:<> locality:<> 
I161106 08:38:49.536563 14862 gossip/client.go:125  [n2] started gossip client to 127.0.0.1:41123
I161106 08:38:49.537281 14740 storage/engine/rocksdb.go:340  opening in memory rocksdb instance
I161106 08:38:49.538712 14740 gossip/gossip.go:237  [n?] initial resolvers: [127.0.0.1:41123]
W161106 08:38:49.539134 14740 gossip/gossip.go:1057  [n?] no incoming or outgoing connections
I161106 08:38:49.541903 14740 base/node_id.go:62  NodeID set to 3
I161106 08:38:49.598122 14740 storage/store.go:1174  [n3,s3]: failed initial metrics computation: [n3,s3]: system config not yet available
I161106 08:38:49.598469 14740 gossip/gossip.go:280  [n3] NodeDescriptor set to node_id:3 address:<network_field:"tcp" address_field:"127.0.0.1:51338" > attrs:<> locality:<> 
I161106 08:38:49.600196 14827 gossip/client.go:125  [n3] started gossip client to 127.0.0.1:41123
I161106 08:38:49.606546 14740 storage/engine/rocksdb.go:340  opening in memory rocksdb instance
I161106 08:38:49.625837 14740 gossip/gossip.go:237  [n?] initial resolvers: [127.0.0.1:41123]
W161106 08:38:49.626287 14740 gossip/gossip.go:1057  [n?] no incoming or outgoing connections
I161106 08:38:49.626617 14740 base/node_id.go:62  NodeID set to 4
I161106 08:38:49.654326 14979 gossip/client.go:125  [n4] started gossip client to 127.0.0.1:41123
I161106 08:38:49.659834 14740 gossip/gossip.go:280  [n4] NodeDescriptor set to node_id:4 address:<network_field:"tcp" address_field:"127.0.0.1:60836" > attrs:<> locality:<> 
I161106 08:38:49.660909 14740 storage/engine/rocksdb.go:340  opening in memory rocksdb instance
I161106 08:38:49.663458 14740 gossip/gossip.go:237  [n?] initial resolvers: [127.0.0.1:41123]
W161106 08:38:49.663991 14740 gossip/gossip.go:1057  [n?] no incoming or outgoing connections
I161106 08:38:49.682499 14740 base/node_id.go:62  NodeID set to 5
I161106 08:38:49.701087 14740 storage/store.go:1174  [n5,s5]: failed initial metrics computation: [n5,s5]: system config not yet available
I161106 08:38:49.701500 14740 gossip/gossip.go:280  [n5] NodeDescriptor set to node_id:5 address:<network_field:"tcp" address_field:"127.0.0.1:39429" > attrs:<> locality:<> 
I161106 08:38:49.726295 15078 gossip/client.go:125  [n5] started gossip client to 127.0.0.1:41123
I161106 08:38:49.746512 14363 gossip/server.go:263  [n1] refusing gossip from node 5 (max 3 conns); forwarding to 3 ({tcp 127.0.0.1:51338})
I161106 08:38:49.755838 15078 gossip/client.go:130  [n5] closing client to node 1 (127.0.0.1:41123): received forward from node 1 to 3 (127.0.0.1:51338)
I161106 08:38:49.756430 15094 gossip/client.go:125  [n5] started gossip client to 127.0.0.1:51338
I161106 08:38:49.885787 14740 storage/client_test.go:414  gossip network initialized
I161106 08:38:49.890131 14740 storage/replica_raftstorage.go:446  [s1,r1/1:/M{in-ax}] generated snapshot 513f1573 for range 1 at index 22 in 100.599µs.
I161106 08:38:49.907968 14740 storage/store.go:3103  streamed snapshot: kv pairs: 41, log entries: 12
I161106 08:38:49.915998 15150 storage/replica_raftstorage.go:592  [s2] [n2,s2,r1/?:{-}]: with replicaID [?], applying preemptive snapshot at index 22 (id=513f1573, encoded size=16, 1 rocksdb batches, 12 log entries)
I161106 08:38:49.920930 15150 storage/replica_raftstorage.go:595  [s2] [n2,s2,r1/?:/M{in-ax}]: with replicaID [?], applied preemptive snapshot in 0.005s
I161106 08:38:49.939076 14740 storage/replica_command.go:3242  change replicas: read existing descriptor range_id:1 start_key:"" end_key:"\377\377" replicas:<node_id:1 store_id:1 replica_id:1 > next_replica_id:2 
I161106 08:38:49.954203 15180 storage/replica.go:2010  [s1,r1/1:/M{in-ax}] proposing ADD_REPLICA {NodeID:2 StoreID:2 ReplicaID:2} for range 1: [{NodeID:1 StoreID:1 ReplicaID:1} {NodeID:2 StoreID:2 ReplicaID:2}]
I161106 08:38:49.984655 14740 storage/replica_raftstorage.go:446  [s1,r1/1:/M{in-ax}] generated snapshot 7056158a for range 1 at index 26 in 116.999µs.
W161106 08:38:49.990210 14776 raft/raft.go:794  [s1,r1/1:/M{in-ax}] 1 stepped down to follower since quorum is not active
I161106 08:38:50.002930 15181 storage/replica_raftstorage.go:592  [s3] [n3,s3,r1/?:{-}]: with replicaID [?], applying preemptive snapshot at index 26 (id=7056158a, encoded size=16, 1 rocksdb batches, 16 log entries)
I161106 08:38:50.006128 14740 storage/store.go:3103  streamed snapshot: kv pairs: 46, log entries: 16
I161106 08:38:50.007614 15181 storage/replica_raftstorage.go:595  [s3] [n3,s3,r1/?:/M{in-ax}]: with replicaID [?], applied preemptive snapshot in 0.004s
I161106 08:38:50.012605 14740 storage/replica_command.go:3242  change replicas: read existing descriptor range_id:1 start_key:"" end_key:"\377\377" replicas:<node_id:1 store_id:1 replica_id:1 > replicas:<node_id:2 store_id:2 replica_id:2 > next_replica_id:3 
I161106 08:38:50.482016 15260 storage/raft_transport.go:432  raft transport stream to node 1 established
I161106 08:38:50.615953 15196 storage/replica.go:2010  [s1,r1/1:/M{in-ax}] proposing ADD_REPLICA {NodeID:3 StoreID:3 ReplicaID:3} for range 1: [{NodeID:1 StoreID:1 ReplicaID:1} {NodeID:2 StoreID:2 ReplicaID:2} {NodeID:3 StoreID:3 ReplicaID:3}]
I161106 08:38:50.848904 15332 storage/raft_transport.go:432  raft transport stream to node 1 established
I161106 08:38:51.013855 14740 storage/replica_command.go:2360  initiating a split of this range at key "m" [r2]
I161106 08:38:51.491457 14740 storage/replica_raftstorage.go:446  [s1,r2/1:{"m"-/Max}] generated snapshot 5c0057e4 for range 2 at index 10 in 135.799µs.
I161106 08:38:51.507981 15447 storage/replica_raftstorage.go:592  [s4] [n4,s4,r2/?:{-}]: with replicaID [?], applying preemptive snapshot at index 10 (id=5c0057e4, encoded size=16, 1 rocksdb batches, 0 log entries)
I161106 08:38:51.520164 15447 storage/replica_raftstorage.go:595  [s4] [n4,s4,r2/?:{"m"-/Max}]: with replicaID [?], applied preemptive snapshot in 0.012s
I161106 08:38:51.521730 14740 storage/store.go:3103  streamed snapshot: kv pairs: 28, log entries: 0
I161106 08:38:51.572492 14740 storage/replica_command.go:3242  change replicas: read existing descriptor range_id:2 start_key:"m" end_key:"\377\377" replicas:<node_id:1 store_id:1 replica_id:1 > replicas:<node_id:2 store_id:2 replica_id:2 > replicas:<node_id:3 store_id:3 replica_id:3 > next_replica_id:4 
W161106 08:38:51.870045 15398 storage/stores.go:218  range not contained in one range: [/Meta2/Max,"m\x00"), but have [/Min,"m")
I161106 08:38:52.191093 15405 storage/replica.go:2010  [s1,r2/1:{"m"-/Max}] proposing ADD_REPLICA {NodeID:4 StoreID:4 ReplicaID:4} for range 2: [{NodeID:1 StoreID:1 ReplicaID:1} {NodeID:2 StoreID:2 ReplicaID:2} {NodeID:3 StoreID:3 ReplicaID:3} {NodeID:4 StoreID:4 ReplicaID:4}]
I161106 08:38:52.316806 14740 storage/replica_raftstorage.go:446  [s1,r2/1:{"m"-/Max}] generated snapshot 65176fec for range 2 at index 14 in 105.699µs.
I161106 08:38:52.341701 14740 storage/store.go:3103  streamed snapshot: kv pairs: 30, log entries: 4
I161106 08:38:52.343561 15375 storage/replica_raftstorage.go:592  [s5] [n5,s5,r2/?:{-}]: with replicaID [?], applying preemptive snapshot at index 14 (id=65176fec, encoded size=16, 1 rocksdb batches, 4 log entries)
I161106 08:38:52.345519 15375 storage/replica_raftstorage.go:595  [s5] [n5,s5,r2/?:{"m"-/Max}]: with replicaID [?], applied preemptive snapshot in 0.002s
I161106 08:38:52.381329 14740 storage/replica_command.go:3242  change replicas: read existing descriptor range_id:2 start_key:"m" end_key:"\377\377" replicas:<node_id:1 store_id:1 replica_id:1 > replicas:<node_id:2 store_id:2 replica_id:2 > replicas:<node_id:3 store_id:3 replica_id:3 > replicas:<node_id:4 store_id:4 replica_id:4 > next_replica_id:5 
I161106 08:38:52.508894 15556 storage/raft_transport.go:432  raft transport stream to node 1 established
I161106 08:38:53.227845 15622 storage/replica.go:2010  [s1,r2/1:{"m"-/Max}] proposing ADD_REPLICA {NodeID:5 StoreID:5 ReplicaID:5} for range 2: [{NodeID:1 StoreID:1 ReplicaID:1} {NodeID:2 StoreID:2 ReplicaID:2} {NodeID:3 StoreID:3 ReplicaID:3} {NodeID:4 StoreID:4 ReplicaID:4} {NodeID:5 StoreID:5 ReplicaID:5}]
I161106 08:38:53.600586 15650 storage/raft_transport.go:432  raft transport stream to node 1 established
I161106 08:38:55.428658 15664 storage/raft_transport.go:432  raft transport stream to node 2 established
I161106 08:38:55.747404 15723 storage/raft_transport.go:432  raft transport stream to node 3 established
I161106 08:38:56.474549 14851 storage/replica_proposal.go:334  [s2,r2/2:{"m"-/Max}] new range lease replica {2 2 2} 1970-01-01 00:00:15.30000014 +0000 UTC 1.800000002s following replica {1 1 1} 1970-01-01 00:00:00 +0000 UTC 15.30000014s [physicalTime=1970-01-01 00:00:19.800000145 +0000 UTC]
I161106 08:38:56.499749 14763 storage/replica_proposal.go:334  [s1,r2/1:{"m"-/Max}] new range lease replica {1 1 1} 1970-01-01 00:00:17.100000142 +0000 UTC 3.600000004s following replica {2 2 2} 1970-01-01 00:00:15.30000014 +0000 UTC 1.800000002s [physicalTime=1970-01-01 00:00:19.800000145 +0000 UTC]
I161106 08:38:57.147499 14912 storage/replica.go:2057  [s3,r2/3:{"m"-/Max}] not quiescing: 1 pending commands
I161106 08:38:57.160042 14850 storage/replica_proposal.go:334  [s2,r1/2:{/Min-"m"}] new range lease replica {2 2 2} 1970-01-01 00:00:15.30000014 +0000 UTC 5.400000006s following replica {3 3 3} 1970-01-01 00:00:09.900000134 +0000 UTC 5.400000006s [physicalTime=1970-01-01 00:00:23.400000149 +0000 UTC]
I161106 08:38:57.190580 14914 storage/replica_proposal.go:334  [s3,r1/3:{/Min-"m"}] new range lease replica {3 3 3} 1970-01-01 00:00:09.900000134 +0000 UTC 5.400000006s following replica {1 1 1} 1970-01-01 00:00:00 +0000 UTC 9.900000134s [physicalTime=1970-01-01 00:00:23.400000149 +0000 UTC]
I161106 08:38:57.813767 14851 storage/replica.go:2057  [s2,r1/2:{/Min-"m"}] not quiescing: 1 pending commands
I161106 08:38:57.968633 14772 storage/replica_proposal.go:334  [s1,r1/1:{/Min-"m"}] new range lease replica {1 1 1} 1970-01-01 00:00:42.30000017 +0000 UTC 3.600000004s following replica {2 2 2} 1970-01-01 00:00:15.30000014 +0000 UTC 27.00000003s [physicalTime=1970-01-01 00:00:45.000000173 +0000 UTC]
I161106 08:38:58.776619 15922 storage/raft_transport.go:432  raft transport stream to node 5 established
I161106 08:38:58.824619 15825 storage/raft_transport.go:432  raft transport stream to node 3 established
I161106 08:38:58.837735 15824 storage/raft_transport.go:432  raft transport stream to node 2 established
I161106 08:38:58.846099 15914 storage/raft_transport.go:432  raft transport stream to node 4 established
I161106 08:38:59.355569 16002 storage/raft_transport.go:432  raft transport stream to node 5 established
I161106 08:38:59.371042 15969 storage/raft_transport.go:432  raft transport stream to node 4 established
I161106 08:38:59.456097 14761 storage/replica_proposal.go:383  [s1,r2/1:{"m"-/Max}] range [n1,s1,r2/1:{"m"-/Max}]: transferring raft leadership to replica ID 4
I161106 08:38:59.463731 14761 storage/replica_proposal.go:383  [s1,r2/1:{"m"-/Max}] range [n1,s1,r2/1:{"m"-/Max}]: transferring raft leadership to replica ID 4
I161106 08:38:59.599187 14977 storage/replica_proposal.go:334  [s4,r2/4:{"m"-/Max}] new range lease replica {4 4 4} 1970-01-01 00:00:20.700000146 +0000 UTC 27.00000003s following replica {1 1 1} 1970-01-01 00:00:17.100000142 +0000 UTC 3.600000004s [physicalTime=1970-01-01 00:00:46.800000175 +0000 UTC]
I161106 08:39:00.012948 15995 storage/raft_transport.go:432  raft transport stream to node 4 established
I161106 08:39:00.517051 14740 storage/replica_command.go:3242  [replicate,s4,r2/4:{"m"-/Max}] change replicas: read existing descriptor range_id:2 start_key:"m" end_key:"\377\377" replicas:<node_id:1 store_id:1 replica_id:1 > replicas:<node_id:2 store_id:2 replica_id:2 > replicas:<node_id:3 store_id:3 replica_id:3 > replicas:<node_id:4 store_id:4 replica_id:4 > replicas:<node_id:5 store_id:5 replica_id:5 > next_replica_id:6 
I161106 08:39:00.685200 14850 storage/replica_proposal.go:334  [s2,r1/2:{/Min-"m"}] new range lease replica {2 2 2} 1970-01-01 00:00:47.700000176 +0000 UTC 1.800000002s following replica {1 1 1} 1970-01-01 00:00:42.30000017 +0000 UTC 5.400000006s [physicalTime=1970-01-01 00:00:48.600000177 +0000 UTC]
E161106 08:39:00.773844 14807 storage/node_liveness.go:141  [hb] failed liveness heartbeat: result is ambiguous
W161106 08:39:00.800945 15933 storage/replica.go:1749  [s1,r1/1:{/Min-"m"}] ignoring cancelled context for command [txn: 695dc09b], BeginTransaction [/System/NodeLiveness/4,/Min), ConditionalPut [/System/NodeLiveness/4,/Min), EndTransaction [/System/NodeLiveness/4,/Min)
E161106 08:39:00.803502 14687 storage/node_liveness.go:141  [hb] failed liveness heartbeat: result is ambiguous
W161106 08:39:00.864933 15901 storage/replica.go:1749  [s1,r1/1:{/Min-"m"}] ignoring cancelled context for command [txn: 45b85185], BeginTransaction [/System/NodeLiveness/1,/Min), ConditionalPut [/System/NodeLiveness/1,/Min), EndTransaction [/System/NodeLiveness/1,/Min)
W161106 08:39:00.865117 15981 storage/replica.go:1749  [s1,r1/1:{/Min-"m"}] ignoring cancelled context for command [txn: eac2e7d0], BeginTransaction [/System/NodeLiveness/5,/Min), ConditionalPut [/System/NodeLiveness/5,/Min), EndTransaction [/System/NodeLiveness/5,/Min)
E161106 08:39:00.865393 14400 storage/node_liveness.go:141  [hb] failed liveness heartbeat: result is ambiguous
E161106 08:39:00.902248 14866 storage/node_liveness.go:141  [hb] failed liveness heartbeat: result is ambiguous
W161106 08:39:00.902499 15931 storage/replica.go:1749  [s1,r1/1:{/Min-"m"}] ignoring cancelled context for command [txn: c03d9c99], BeginTransaction [/System/NodeLiveness/2,/Min), ConditionalPut [/System/NodeLiveness/2,/Min), EndTransaction [/System/NodeLiveness/2,/Min)
E161106 08:39:01.150012 14751 storage/node_liveness.go:141  [hb] failed liveness heartbeat: result is ambiguous
W161106 08:39:01.153245 15930 storage/replica.go:1749  [s1,r1/1:{/Min-"m"}] ignoring cancelled context for command [txn: 8c8ba4e6], BeginTransaction [/System/NodeLiveness/3,/Min), ConditionalPut [/System/NodeLiveness/3,/Min), EndTransaction [/System/NodeLiveness/3,/Min)
W161106 08:39:01.904634 16071 storage/replica.go:1265  [s1,r1/1:{/Min-"m"}] context canceled while in command queue: [txn: b859f28f], BeginTransaction [/System/NodeLiveness/1,/Min), ConditionalPut [/System/NodeLiveness/1,/Min), EndTransaction [/System/NodeLiveness/1,/Min)
I161106 08:39:02.637112 16121 storage/replica.go:2010  [s4,r2/4:{"m"-/Max}] proposing REMOVE_REPLICA {NodeID:1 StoreID:1 ReplicaID:1} for range 2: [{NodeID:5 StoreID:5 ReplicaID:5} {NodeID:2 StoreID:2 ReplicaID:2} {NodeID:3 StoreID:3 ReplicaID:3} {NodeID:4 StoreID:4 ReplicaID:4}]
I161106 08:39:02.935618 14740 storage/replica_command.go:3242  [replicate,s4,r2/4:{"m"-/Max}] change replicas: read existing descriptor range_id:2 start_key:"m" end_key:"\377\377" replicas:<node_id:1 store_id:1 replica_id:1 > replicas:<node_id:2 store_id:2 replica_id:2 > replicas:<node_id:3 store_id:3 replica_id:3 > replicas:<node_id:4 store_id:4 replica_id:4 > replicas:<node_id:5 store_id:5 replica_id:5 > next_replica_id:6 
W161106 08:39:02.970954 14763 storage/replica.go:537  [s1,r1/1:{/Min-"m"}] raftMu: mutex held by github.com/cockroachdb/cockroach/pkg/storage.(*Replica).handleRaftReady for 509.239477ms (>500ms):
goroutine 14763 [running]:
runtime/debug.Stack(0x1bf58d0, 0x1bf5910, 0x47)
	/usr/local/go/src/runtime/debug/stack.go:24 +0x79
github.com/cockroachdb/cockroach/pkg/util/syncutil.ThresholdLogger.func1(0x1e5a60b5)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/syncutil/timedmutex.go:65 +0xe1
github.com/cockroachdb/cockroach/pkg/util/syncutil.(*TimedMutex).Unlock(0xc42a31a078)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/syncutil/timedmutex.go:92 +0x80
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).handleRaftReady(0xc42a31a000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:2123 +0xee
github.com/cockroachdb/cockroach/pkg/storage.(*Store).processReady(0xc42a50e000, 0x1)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/store.go:3170 +0x13e
github.com/cockroachdb/cockroach/pkg/storage.(*raftScheduler).worker(0xc428cbed20, 0xc42a11e120)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/scheduler.go:230 +0x28c
github.com/cockroachdb/cockroach/pkg/storage.(*raftScheduler).Start.func2()
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/scheduler.go:181 +0x33
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunWorker.func1(0xc42a11e120, 0xc42a54c820)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:196 +0x7d
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunWorker
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:197 +0x66
W161106 08:39:03.457563 16023 storage/intent_resolver.go:314  [n2,s2,r1/2:{/Min-"m"}]: failed to push during intent resolution: failed to push "change-replica" id=846201cd key=/Local/Range/"m"/RangeDescriptor rw=true pri=0.02045049 iso=SERIALIZABLE stat=PENDING epo=0 ts=47.700000176,2 orig=46.800000175,268 max=46.800000175,268 wto=false rop=false
I161106 08:39:03.738057 16145 storage/replica.go:2010  [s4,r2/4:{"m"-/Max}] proposing REMOVE_REPLICA {NodeID:1 StoreID:1 ReplicaID:1} for range 2: [{NodeID:5 StoreID:5 ReplicaID:5} {NodeID:2 StoreID:2 ReplicaID:2} {NodeID:3 StoreID:3 ReplicaID:3} {NodeID:4 StoreID:4 ReplicaID:4}]
I161106 08:39:03.867301 14740 storage/replica_command.go:3242  [replicate,s4,r2/4:{"m"-/Max}] change replicas: read existing descriptor range_id:2 start_key:"m" end_key:"\377\377" replicas:<node_id:5 store_id:5 replica_id:5 > replicas:<node_id:2 store_id:2 replica_id:2 > replicas:<node_id:3 store_id:3 replica_id:3 > replicas:<node_id:4 store_id:4 replica_id:4 > next_replica_id:6 
I161106 08:39:03.883629 15541 storage/store.go:2960  [s1] [n1,s1,r2/1:{"m"-/Max}]: added to replica GC queue (peer suggestion)
I161106 08:39:03.893258 15541 storage/store.go:2960  [s1] [n1,s1,r2/1:{"m"-/Max}]: added to replica GC queue (peer suggestion)
I161106 08:39:04.834676 16309 storage/replica.go:2010  [s4,r2/4:{"m"-/Max}] proposing REMOVE_REPLICA {NodeID:3 StoreID:3 ReplicaID:3} for range 2: [{NodeID:5 StoreID:5 ReplicaID:5} {NodeID:2 StoreID:2 ReplicaID:2} {NodeID:4 StoreID:4 ReplicaID:4}]
I161106 08:39:04.963679 16286 util/stop/stopper.go:468  quiescing; tasks left:
1      storage/intent_resolver.go:383
I161106 08:39:04.973250 15932 storage/store.go:2960  [s3] [n3,s3,r2/3:{"m"-/Max}]: added to replica GC queue (peer suggestion)
E161106 08:39:04.979944 15932 storage/store.go:2958  [s3] [n3,s3,r2/3:{"m"-/Max}]: unable to add to replica GC queue: queue stopped
I161106 08:39:04.980079 16283 util/stop/stopper.go:468  quiescing; tasks left:
1      storage/queue.go:477
I161106 08:39:04.980221 16284 util/stop/stopper.go:468  quiescing; tasks left:
8      storage/client_test.go:501
W161106 08:39:04.980378 16279 storage/replica.go:1758  [s2,r1/2:{/Min-"m"}] shutdown cancellation of command [txn: a66bf92a], BeginTransaction [/System/NodeLiveness/2,/Min), ConditionalPut [/System/NodeLiveness/2,/Min), EndTransaction [/System/NodeLiveness/2,/Min)
I161106 08:39:04.980748 16284 util/stop/stopper.go:468  quiescing; tasks left:
7      storage/client_test.go:501
W161106 08:39:04.981126 16291 storage/replica.go:1758  [s2,r1/2:{/Min-"m"}] shutdown cancellation of command [txn: 1d850667], BeginTransaction [/System/NodeLiveness/4,/Min), ConditionalPut [/System/NodeLiveness/4,/Min), EndTransaction [/System/NodeLiveness/4,/Min)
I161106 08:39:04.981480 16284 util/stop/stopper.go:468  quiescing; tasks left:
6      storage/client_test.go:501
W161106 08:39:04.981843 16308 storage/replica.go:1758  [s2,r1/2:{/Min-"m"}] shutdown cancellation of command [txn: cfbc46f6], BeginTransaction [/System/NodeLiveness/3,/Min), ConditionalPut [/System/NodeLiveness/3,/Min), EndTransaction [/System/NodeLiveness/3,/Min)
I161106 08:39:04.982194 16284 util/stop/stopper.go:468  quiescing; tasks left:
5      storage/client_test.go:501
W161106 08:39:04.982543 16233 storage/replica.go:1758  [s2,r1/2:{/Min-"m"}] shutdown cancellation of command ResolveIntent [/Meta2/Max,/Min)
I161106 08:39:04.982955 16284 util/stop/stopper.go:468  quiescing; tasks left:
4      storage/client_test.go:501
W161106 08:39:04.985820 16152 storage/replica.go:1758  [s2,r1/2:{/Min-"m"}] shutdown cancellation of command ResolveIntent [/Meta2/Max,/Min)
I161106 08:39:04.986262 16284 util/stop/stopper.go:468  quiescing; tasks left:
3      storage/client_test.go:501
E161106 08:39:04.986792 14866 storage/node_liveness.go:141  [hb] failed liveness heartbeat: result is ambiguous
W161106 08:39:05.003176 16186 storage/intent_resolver.go:101  [s2] asynchronous resolveIntents failed: result is ambiguous
W161106 08:39:05.003701 16306 storage/replica.go:1758  [s2,r1/2:{/Min-"m"}] shutdown cancellation of command [txn: 3db7fb79], BeginTransaction [/System/NodeLiveness/1,/Min), ConditionalPut [/System/NodeLiveness/1,/Min), EndTransaction [/System/NodeLiveness/1,/Min)
I161106 08:39:05.004082 16284 util/stop/stopper.go:468  quiescing; tasks left:
2      storage/client_test.go:501
E161106 08:39:05.004655 14687 storage/node_liveness.go:141  [hb] failed liveness heartbeat: result is ambiguous
E161106 08:39:05.005510 14687 storage/node_liveness.go:141  [hb] failed liveness heartbeat: node unavailable; try another peer
E161106 08:39:05.005912 14400 storage/node_liveness.go:141  [hb] failed liveness heartbeat: result is ambiguous
E161106 08:39:05.006743 14400 storage/node_liveness.go:141  [hb] failed liveness heartbeat: node unavailable; try another peer
E161106 08:39:05.008825 14751 storage/node_liveness.go:141  [hb] failed liveness heartbeat: result is ambiguous
E161106 08:39:05.009705 14751 storage/node_liveness.go:141  [hb] failed liveness heartbeat: node unavailable; try another peer
I161106 08:39:05.016003 16284 util/stop/stopper.go:468  quiescing; tasks left:
1      storage/client_test.go:501
E161106 08:39:05.016131 14866 storage/node_liveness.go:141  [hb] failed liveness heartbeat: node unavailable; try another peer
W161106 08:39:05.016272 16267 storage/intent_resolver.go:337  [n4,s4,r2/4:{"m"-/Max}]: failed to resolve intents: result is ambiguous
W161106 08:39:05.017491 16209 storage/replica.go:1758  [s2,r1/2:{/Min-"m"}] shutdown cancellation of command [txn: b1d08d84], BeginTransaction [/System/NodeLiveness/5,/Min), ConditionalPut [/System/NodeLiveness/5,/Min), EndTransaction [/System/NodeLiveness/5,/Min)
E161106 08:39:05.018055 14655 storage/queue.go:568  [replicaGC] on [n1,s1,r2/1:{"m"-/Max}]: failed to send RPC: sending to all 3 replicas failed; last error: node unavailable; try another peer
I161106 08:39:05.019936 16282 util/stop/stopper.go:396  stop has been called, stopping or quiescing all running tasks
E161106 08:39:05.034498 14807 storage/node_liveness.go:141  [hb] failed liveness heartbeat: result is ambiguous
E161106 08:39:05.035393 14807 storage/node_liveness.go:141  [hb] failed liveness heartbeat: node unavailable; try another peer
E161106 08:39:05.035528 14866 storage/node_liveness.go:141  [hb] failed liveness heartbeat: node unavailable; try another peer
I161106 08:39:05.035651 16282 util/stop/stopper.go:396  stop has been called, stopping or quiescing all running tasks
W161106 08:39:05.101143 15262 storage/raft_transport.go:472  no handler found for store 2 in response range_id:1 from_replica:<node_id:1 store_id:1 replica_id:1 > to_replica:<node_id:2 store_id:2 replica_id:2 > union:<error:<message:"storage/raft_transport.go:258: unable to accept Raft message from {NodeID:2 StoreID:2 ReplicaID:2}: no handler registered for {NodeID:1 StoreID:1 ReplicaID:1}" transaction_restart:NONE origin_node:0 now:<wall_time:0 logical:0 > > > 
W161106 08:39:05.101391 15262 storage/raft_transport.go:472  no handler found for store 2 in response range_id:1 from_replica:<node_id:1 store_id:1 replica_id:1 > to_replica:<node_id:2 store_id:2 replica_id:2 > union:<error:<message:"storage/raft_transport.go:258: unable to accept Raft message from {NodeID:2 StoreID:2 ReplicaID:2}: no handler registered for {NodeID:1 StoreID:1 ReplicaID:1}" transaction_restart:NONE origin_node:0 now:<wall_time:0 logical:0 > > > 
E161106 08:39:05.109509 14751 storage/node_liveness.go:141  [hb] failed liveness heartbeat: node unavailable; try another peer
W161106 08:39:05.114686 15262 storage/raft_transport.go:472  no handler found for store 2 in response range_id:1 from_replica:<node_id:1 store_id:1 replica_id:1 > to_replica:<node_id:2 store_id:2 replica_id:2 > union:<error:<message:"storage/raft_transport.go:258: unable to accept Raft message from {NodeID:2 StoreID:2 ReplicaID:2}: no handler registered for {NodeID:1 StoreID:1 ReplicaID:1}" transaction_restart:NONE origin_node:0 now:<wall_time:0 logical:0 > > > 
W161106 08:39:05.114889 15262 storage/raft_transport.go:472  no handler found for store 2 in response range_id:1 from_replica:<node_id:1 store_id:1 replica_id:1 > to_replica:<node_id:2 store_id:2 replica_id:2 > union:<error:<message:"storage/raft_transport.go:258: unable to accept Raft message from {NodeID:2 StoreID:2 ReplicaID:2}: no handler registered for {NodeID:1 StoreID:1 ReplicaID:1}" transaction_restart:NONE origin_node:0 now:<wall_time:0 logical:0 > > > 
W161106 08:39:05.115083 15262 storage/raft_transport.go:472  no handler found for store 2 in response range_id:1 from_replica:<node_id:1 store_id:1 replica_id:1 > to_replica:<node_id:2 store_id:2 replica_id:2 > union:<error:<message:"storage/raft_transport.go:258: unable to accept Raft message from {NodeID:2 StoreID:2 ReplicaID:2}: no handler registered for {NodeID:1 StoreID:1 ReplicaID:1}" transaction_restart:NONE origin_node:0 now:<wall_time:0 logical:0 > > > 
W161106 08:39:05.115283 15262 storage/raft_transport.go:472  no handler found for store 2 in response range_id:1 from_replica:<node_id:1 store_id:1 replica_id:1 > to_replica:<node_id:2 store_id:2 replica_id:2 > union:<error:<message:"storage/raft_transport.go:258: unable to accept Raft message from {NodeID:2 StoreID:2 ReplicaID:2}: no handler registered for {NodeID:1 StoreID:1 ReplicaID:1}" transaction_restart:NONE origin_node:0 now:<wall_time:0 logical:0 > > > 
W161106 08:39:05.115472 15262 storage/raft_transport.go:472  no handler found for store 2 in response range_id:1 from_replica:<node_id:1 store_id:1 replica_id:1 > to_replica:<node_id:2 store_id:2 replica_id:2 > union:<error:<message:"storage/raft_transport.go:258: unable to accept Raft message from {NodeID:2 StoreID:2 ReplicaID:2}: no handler registered for {NodeID:1 StoreID:1 ReplicaID:1}" transaction_restart:NONE origin_node:0 now:<wall_time:0 logical:0 > > > 
W161106 08:39:05.115663 15262 storage/raft_transport.go:472  no handler found for store 2 in response range_id:1 from_replica:<node_id:1 store_id:1 replica_id:1 > to_replica:<node_id:2 store_id:2 replica_id:2 > union:<error:<message:"storage/raft_transport.go:258: unable to accept Raft message from {NodeID:2 StoreID:2 ReplicaID:2}: no handler registered for {NodeID:1 StoreID:1 ReplicaID:1}" transaction_restart:NONE origin_node:0 now:<wall_time:0 logical:0 > > > 
W161106 08:39:05.115854 15262 storage/raft_transport.go:472  no handler found for store 2 in response range_id:1 from_replica:<node_id:1 store_id:1 replica_id:1 > to_replica:<node_id:2 store_id:2 replica_id:2 > union:<error:<message:"storage/raft_transport.go:258: unable to accept Raft message from {NodeID:2 StoreID:2 ReplicaID:2}: no handler registered for {NodeID:1 StoreID:1 ReplicaID:1}" transaction_restart:NONE origin_node:0 now:<wall_time:0 logical:0 > > > 
W161106 08:39:05.116042 15262 storage/raft_transport.go:472  no handler found for store 2 in response range_id:1 from_replica:<node_id:1 store_id:1 replica_id:1 > to_replica:<node_id:2 store_id:2 replica_id:2 > union:<error:<message:"storage/raft_transport.go:258: unable to accept Raft message from {NodeID:2 StoreID:2 ReplicaID:2}: no handler registered for {NodeID:1 StoreID:1 ReplicaID:1}" transaction_restart:NONE origin_node:0 now:<wall_time:0 logical:0 > > > 
W161106 08:39:05.116244 15262 storage/raft_transport.go:472  no handler found for store 2 in response range_id:1 from_replica:<node_id:1 store_id:1 replica_id:1 > to_replica:<node_id:2 store_id:2 replica_id:2 > union:<error:<message:"storage/raft_transport.go:258: unable to accept Raft message from {NodeID:2 StoreID:2 ReplicaID:2}: no handler registered for {NodeID:1 StoreID:1 ReplicaID:1}" transaction_restart:NONE origin_node:0 now:<wall_time:0 logical:0 > > > 
W161106 08:39:05.116520 15262 storage/raft_transport.go:472  no handler found for store 2 in response range_id:1 from_replica:<node_id:1 store_id:1 replica_id:1 > to_replica:<node_id:2 store_id:2 replica_id:2 > union:<error:<message:"storage/raft_transport.go:258: unable to accept Raft message from {NodeID:2 StoreID:2 ReplicaID:2}: no handler registered for {NodeID:1 StoreID:1 ReplicaID:1}" transaction_restart:NONE origin_node:0 now:<wall_time:0 logical:0 > > > 
I161106 08:39:05.124438 16282 util/stop/stopper.go:396  stop has been called, stopping or quiescing all running tasks
W161106 08:39:05.124802 15768 storage/store.go:2964  [s3] got error from range 1, replica {2 2 2}: storage/raft_transport.go:258: unable to accept Raft message from {NodeID:3 StoreID:3 ReplicaID:3}: no handler registered for {NodeID:2 StoreID:2 ReplicaID:2}
W161106 08:39:05.126098 15768 storage/store.go:2964  [s3] got error from range 1, replica {2 2 2}: storage/raft_transport.go:258: unable to accept Raft message from {NodeID:3 StoreID:3 ReplicaID:3}: no handler registered for {NodeID:2 StoreID:2 ReplicaID:2}
W161106 08:39:05.126305 15768 storage/store.go:2964  [s3] got error from range 1, replica {2 2 2}: storage/raft_transport.go:258: unable to accept Raft message from {NodeID:3 StoreID:3 ReplicaID:3}: no handler registered for {NodeID:2 StoreID:2 ReplicaID:2}
W161106 08:39:05.126496 15768 storage/store.go:2964  [s3] got error from range 1, replica {2 2 2}: storage/raft_transport.go:258: unable to accept Raft message from {NodeID:3 StoreID:3 ReplicaID:3}: no handler registered for {NodeID:2 StoreID:2 ReplicaID:2}
W161106 08:39:05.126683 15768 storage/store.go:2964  [s3] got error from range 1, replica {2 2 2}: storage/raft_transport.go:258: unable to accept Raft message from {NodeID:3 StoreID:3 ReplicaID:3}: no handler registered for {NodeID:2 StoreID:2 ReplicaID:2}
W161106 08:39:05.127127 15916 storage/store.go:2964  [s4] got error from range 0, replica {2 2 0}: storage/raft_transport.go:258: unable to accept Raft message from {NodeID:4 StoreID:4 ReplicaID:0}: no handler registered for {NodeID:2 StoreID:2 ReplicaID:0}
I161106 08:39:05.129642 16282 util/stop/stopper.go:396  stop has been called, stopping or quiescing all running tasks
I161106 08:39:05.131642 16282 util/stop/stopper.go:396  stop has been called, stopping or quiescing all running tasks
I161106 08:39:05.133670 16282 util/stop/stopper.go:396  stop has been called, stopping or quiescing all running tasks
I161106 08:39:05.135089 14803 http2_server.go:276  transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:41783->127.0.0.1:56422: use of closed network connection
I161106 08:39:05.135614 14741 http2_server.go:276  transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:41123->127.0.0.1:35841: use of closed network connection
I161106 08:39:05.136281 14987 http2_server.go:276  transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:39429->127.0.0.1:46745: use of closed network connection
I161106 08:39:05.136757 16282 util/stop/stopper.go:396  stop has been called, stopping or quiescing all running tasks
I161106 08:39:05.137139 16282 util/stop/stopper.go:396  stop has been called, stopping or quiescing all running tasks
I161106 08:39:05.137488 14753 http2_server.go:276  transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:60836->127.0.0.1:57585: use of closed network connection
I161106 08:39:05.137757 16282 util/stop/stopper.go:396  stop has been called, stopping or quiescing all running tasks
I161106 08:39:05.138181 14824 http2_server.go:276  transport: http2Server.HandleStreams failed to read frame: read tcp 127.0.0.1:51338->127.0.0.1:51599: use of closed network connection
I161106 08:39:05.138428 16282 util/stop/stopper.go:396  stop has been called, stopping or quiescing all running tasks
I161106 08:39:05.138780 16282 util/stop/stopper.go:396  stop has been called, stopping or quiescing all running tasks
	client_raft_test.go:1138: Failed to achieve proper replication within 10 seconds
@cockroach-teamcity cockroach-teamcity added O-robot Originated from a bot. C-test-failure Broken test (automatically or manually discovered). labels Nov 6, 2016
@tamird
Copy link
Contributor

tamird commented Nov 6, 2016

Possibly #10409. @petermattis?

@petermattis
Copy link
Collaborator

The error certainly seems similar #10409. I can't reproduce the exact error after #10482, but I reproduced a similar problem. This new replication problem seems to be fixed by #10491.

@petermattis
Copy link
Collaborator

Duplicate of #10306.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot.
Projects
None yet
Development

No branches or pull requests

3 participants