storage: TestReplicateRogueRemovedNode is flaky #2880

Closed
tbg opened this issue Oct 21, 2015 · 5 comments

tbg commented Oct 21, 2015

The following test appears to have failed:

CI build #8544:

E1021 19:20:38.010347 954 storage/client_test.go:497  error reading "a" from engine 1: value type is not INT: UNKNOWN
E1021 19:20:39.011054 954 storage/client_test.go:497  error reading "a" from engine 1: value type is not INT: UNKNOWN
I1021 19:20:40.014157 954 storage/engine/rocksdb.go:132  closing in-memory rocksdb instance
I1021 19:20:40.014367 954 storage/engine/rocksdb.go:132  closing in-memory rocksdb instance
I1021 19:20:40.014498 954 storage/engine/rocksdb.go:132  closing in-memory rocksdb instance
--- FAIL: TestReplicateRogueRemovedNode (3.39s)
    <autogenerated>:32: storage/client_raft_test.go:1361: condition failed to evaluate within 3s: storage/client_test.go:511: expected [16 0 0], got [16 0 5]
=== RUN   TestReplicateReAddAfterDown
I1021 19:20:40.020060 954 multiraft/multiraft.go:524  node 1 starting
I1021 19:20:40.020440 954 storage/replica.go:1244  gossiping cluster id  from store 1, range 1
I1021 19:20:40.022304 954 raft/raft.go:409  [group 1] 1 became follower at term 5
I1021 19:20:40.022546 954 raft/raft.go:218  [group 1] newRaft 1 [peers: [1], term: 5, commit: 10, applied: 10, lastindex: 10, lastterm: 5]
I1021 19:20:40.022794 954 multiraft/multiraft.go:918  node 1 campaigning because initial confstate is [1]
I1021 19:20:40.023125 954 raft/raft.go:488  [group 1] 1 is starting a new election at term 5
I1021 19:20:40.023253 954 raft/raft.go:422  [group 1] 1 became candidate at term 6
I1021 19:20:40.023341 954 raft/raft.go:471  [group 1] 1 received vote from 1 at term 6
--
1      storage/replica.go:1514
I1021 19:20:44.143610 954 storage/engine/rocksdb.go:132  closing in-memory rocksdb instance
--- PASS: TestStoreRangeSystemSplits (0.72s)
=== RUN   Example_rebalancing
--- PASS: Example_rebalancing (0.87s)
FAIL
FAIL    github.com/cockroachdb/cockroach/storage    24.062s
=== RUN   TestBatchBasics
I1021 19:20:16.256442 934 storage/engine/rocksdb.go:132  closing in-memory rocksdb instance
--- PASS: TestBatchBasics (0.01s)
=== RUN   TestBatchGet
I1021 19:20:16.260942 934 storage/engine/rocksdb.go:132  closing in-memory rocksdb instance
--- PASS: TestBatchGet (0.01s)
=== RUN   TestBatchMerge
I1021 19:20:16.272093 934 storage/engine/rocksdb.go:132  closing in-memory rocksdb instance
--- PASS: TestBatchMerge (0.01s)
=== RUN   TestBatchProto

Please assign, take a look and update the issue accordingly.

@tbg tbg added the C-test-failure Broken test (automatically or manually discovered). label Oct 21, 2015
@bdarnell bdarnell changed the title Test failure in CI build 8544 "replica recreated during deletion" in TestReplicateRogueRemovedNode Oct 21, 2015
@bdarnell

This could just be a transient timeout, but the logs show that the test hit the newly added "replica recreated during deletion" log line, which is suspicious.

@bdarnell bdarnell self-assigned this Oct 21, 2015

tbg commented Oct 21, 2015

Speaking of possible transient timeouts: https://circleci.com/gh/tschottdorf/cockroach/1517#tests

bdarnell added a commit to bdarnell/cockroach that referenced this issue Oct 22, 2015
Fixes a race between ProposeRaftCommand and the replicaGCQueue.

Closes cockroachdb#2880

tamird commented Oct 28, 2015

This is still an issue: https://circleci.com/gh/cockroachdb/cockroach/8723

--- FAIL: TestReplicateRogueRemovedNode (1.34s)
    <autogenerated>:32: storage/client_raft_test.go:1321: condition failed to evaluate within 1s: storage/client_test.go:514: expected [16 0 5], got [16 5 5]

EDIT: It's not clear that the original problem is still present, only that the test referenced in this issue still fails.

@bdarnell

Another failure in this test (with a different error): https://circleci.com/gh/cockroachdb/cockroach/8764

@tamird tamird changed the title "replica recreated during deletion" in TestReplicateRogueRemovedNode storage: TestReplicateRogueRemovedNode is flaky Nov 13, 2015

bg451 commented Nov 17, 2015

Here's another test failure, this time with a nil pointer dereference: https://circleci.com/gh/cockroachdb/cockroach/9395

bdarnell added a commit to bdarnell/cockroach that referenced this issue Nov 25, 2015
I cannot reproduce a failure in this test with `make stress` after 5000
iterations.

Fixes cockroachdb#2880