storage: TestLogGrowthWhenRefreshingPendingCommands failed under stress #31581

Open
cockroach-teamcity opened this Issue Oct 18, 2018 · 1 comment


cockroach-teamcity commented Oct 18, 2018

SHA: https://github.com/cockroachdb/cockroach/commits/310a04983cda8ab8d67cd401814341b9b7f8ce79

Parameters:

TAGS=
GOFLAGS=-race

To repro, try:

# Don't forget to check out a clean, suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# use stressrace instead of stress, or pass the '-p' stress flag, which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
make stress TESTS=TestLogGrowthWhenRefreshingPendingCommands PKG=github.com/cockroachdb/cockroach/pkg/storage TESTTIMEOUT=5m STRESSFLAGS='-stderr=false -maxtime 20m -timeout 10m'

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=973601&tab=buildLog

I181018 08:38:14.368986 57083 storage/store_snapshot.go:664  [s1,r1/1:/M{in-ax}] streamed snapshot to (n4,s4):?: kv pairs: 57, log entries: 12, rate-limit: 2.0 MiB/sec, 4ms
I181018 08:38:14.371122 57777 storage/replica_raftstorage.go:804  [s4,r1/?:{-}] applying preemptive snapshot at index 22 (id=d1917bc7, encoded size=10667, 1 rocksdb batches, 12 log entries)
I181018 08:38:14.375923 57777 storage/replica_raftstorage.go:810  [s4,r1/?:/M{in-ax}] applied preemptive snapshot in 5ms [clear=0ms batch=0ms entries=3ms commit=0ms]
I181018 08:38:14.379734 57083 storage/replica_command.go:816  [s1,r1/1:/M{in-ax}] change replicas (ADD_REPLICA (n4,s4):4): read existing descriptor r1:/M{in-ax} [(n1,s1):1, (n2,s2):2, (n3,s3):3, next=4, gen=0]
I181018 08:38:14.389833 57083 storage/replica.go:3884  [s1,r1/1:/M{in-ax},txn=80a5dd1b] proposing ADD_REPLICA((n4,s4):4): updated=[(n1,s1):1 (n2,s2):2 (n3,s3):3 (n4,s4):4] next=5
I181018 08:38:14.403528 57083 storage/store_snapshot.go:621  [s1,r1/1:/M{in-ax}] sending preemptive snapshot df1e668f at applied index 24
I181018 08:38:14.404877 57083 storage/store_snapshot.go:664  [s1,r1/1:/M{in-ax}] streamed snapshot to (n5,s5):?: kv pairs: 60, log entries: 14, rate-limit: 2.0 MiB/sec, 3ms
I181018 08:38:14.406603 57939 storage/replica_raftstorage.go:804  [s5,r1/?:{-}] applying preemptive snapshot at index 24 (id=df1e668f, encoded size=11738, 1 rocksdb batches, 14 log entries)
I181018 08:38:14.410858 57939 storage/replica_raftstorage.go:810  [s5,r1/?:/M{in-ax}] applied preemptive snapshot in 4ms [clear=0ms batch=0ms entries=3ms commit=0ms]
I181018 08:38:14.414117 57083 storage/replica_command.go:816  [s1,r1/1:/M{in-ax}] change replicas (ADD_REPLICA (n5,s5):5): read existing descriptor r1:/M{in-ax} [(n1,s1):1, (n2,s2):2, (n3,s3):3, (n4,s4):4, next=5, gen=0]
I181018 08:38:14.429686 57083 storage/replica.go:3884  [s1,r1/1:/M{in-ax},txn=fe834d2e] proposing ADD_REPLICA((n5,s5):5): updated=[(n1,s1):1 (n2,s2):2 (n3,s3):3 (n4,s4):4 (n5,s5):5] next=6
I181018 08:38:14.431891 57348 storage/replica.go:3884  [s1,r1/1:/M{in-ax},txn=fe834d2e] proposing ADD_REPLICA((n5,s5):5): updated=[(n1,s1):1 (n2,s2):2 (n3,s3):3 (n4,s4):4 (n5,s5):5] next=6
I181018 08:38:14.437875 57348 storage/replica.go:3884  [s1,r1/1:/M{in-ax},txn=fe834d2e] proposing ADD_REPLICA((n5,s5):5): updated=[(n1,s1):1 (n2,s2):2 (n3,s3):3 (n4,s4):4 (n5,s5):5] next=6
--- FAIL: TestLogGrowthWhenRefreshingPendingCommands (1.41s)
=== RUN   TestLogGrowthWhenRefreshingPendingCommands/proposeOnFollower=true
W181018 08:38:15.140417 58091 storage/store.go:1490  [s4,r1/4:/M{in-ax}] could not gossip node liveness: periodic gossip is disabled
W181018 08:38:15.143237 58090 storage/store.go:1490  [s4,r1/4:/M{in-ax}] could not gossip system config: periodic gossip is disabled
W181018 08:38:15.188459 58179 storage/store.go:1490  [s5,r1/5:/M{in-ax}] could not gossip system config: periodic gossip is disabled
W181018 08:38:15.188548 58180 storage/store.go:1490  [s5,r1/5:/M{in-ax}] could not gossip node liveness: periodic gossip is disabled
W181018 08:38:15.288841 57883 storage/raft_transport.go:282  unable to accept Raft message from (n1,s1):1: no handler registered for (n5,s5):5
W181018 08:38:15.290198 57881 storage/store.go:3662  [s1,r1/1:/M{in-ax}] raft error: node 5 claims to not contain store 5 for replica (n5,s5):5: store 5 was not found
W181018 08:38:15.290472 57879 storage/raft_transport.go:584  while processing outgoing Raft queue to node 5: store 5 was not found:
W181018 08:38:15.349730 57626 storage/raft_transport.go:282  unable to accept Raft message from (n2,s2):2: no handler registered for (n1,s1):1
W181018 08:38:15.351891 57911 storage/store.go:3662  [s2,r1/2:/M{in-ax}] raft error: node 1 claims to not contain store 1 for replica (n1,s1):1: store 1 was not found
W181018 08:38:15.352408 57909 storage/raft_transport.go:584  while processing outgoing Raft queue to node 1: store 1 was not found:
I181018 08:38:15.362280 57738 gossip/gossip.go:1510  [n4] node has connected to cluster via gossip
    --- FAIL: TestLogGrowthWhenRefreshingPendingCommands/proposeOnFollower=true (0.22s)
    	client_raft_test.go:1263: raft log size grew to -23 KiB

cockroach-teamcity added this to the 2.2 milestone Oct 18, 2018

tschottdorf added a commit to tschottdorf/cockroach that referenced this issue Oct 18, 2018

storage: deflake TestLogGrowthWhenRefreshingPendingCommands
Harmless bug in the test in which an occasionally negative int was cast to a uint64.

Fixes cockroachdb#31581.

Release note: None
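
For context on the fix (a minimal sketch with made-up sizes, not the actual test code): in Go, converting a negative signed integer to uint64 wraps around to a very large unsigned value, so a "did the log grow?" check performed on the converted value can misfire even when the log actually shrank, which matches the negative -23 KiB reported by the failure above.

package main

import "fmt"

func main() {
	// Hypothetical log sizes: the log shrank between the two measurements,
	// so the signed delta is negative.
	before, after := int64(64<<10), int64(40<<10)
	delta := after - before // -24576 bytes, i.e. -24 KiB

	// Converting the negative delta to uint64 wraps around to a huge value,
	// so any "grew by more than X" comparison on the unsigned value trips
	// even though the log did not grow at all.
	fmt.Println(delta)         // -24576
	fmt.Println(uint64(delta)) // 18446744073709527040
}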

tschottdorf commented Oct 18, 2018

Harmless-looking test flake. Fix in #31583
