Commit
Fix race in membership and ring consistency
Without this change, the membership list and the ring could get out of sync.

Scenario:
A - current node.
B - alive (according to C).
B - suspect (according to D).
C - thinks B is faulty.
D - thinks B is alive.

At the same time:
1. C gossips to A that B is faulty. A removes B from the membership list.
2. D gossips to A that B is alive. A adds B to the membership list.

Changes in the ring:
3. The lock on m.members is released before the ring is updated, so the ring updates caused by (1) and (2) can execute in any order. In our example, the operations are reversed:
3.1. A adds B to the ring due to (2).
3.2. A removes B from the ring due to (1).

Now B is in the membership list, but not in the ring.

This is the second iteration of the change. The first iteration did not introduce a new lock, but a new invariant: a goroutine needs to lock both memberlist and disseminator, locking memberlist first, then disseminator. However, that invariant caused requestAdminStats to fail on Travis CI, and I couldn't find out why.
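
The idea of holding one lock across both updates can be sketched as follows. This is a minimal illustration, not the code in this commit: the node type, its fields, and the applyMu, apply and inSync names are all hypothetical stand-ins for the real membership and ring types.

```go
package main

import (
	"fmt"
	"sync"
)

// node is a hypothetical stand-in for the real membership and ring types.
type node struct {
	applyMu sync.Mutex // hypothetical lock held across both updates

	membersMu sync.Mutex
	members   map[string]bool

	ringMu sync.Mutex
	ring   map[string]bool
}

func newNode() *node {
	return &node{members: map[string]bool{}, ring: map[string]bool{}}
}

// apply records a gossiped status for addr in the membership list and then
// mirrors it in the ring. Because applyMu is held for both steps, two
// concurrent updates cannot interleave between them.
func (n *node) apply(addr string, alive bool) {
	n.applyMu.Lock()
	defer n.applyMu.Unlock()

	n.membersMu.Lock()
	if alive {
		n.members[addr] = true
	} else {
		delete(n.members, addr)
	}
	n.membersMu.Unlock()

	// Without applyMu, another goroutine could update the membership list
	// here and reach the ring first, which is the reordering described above.

	n.ringMu.Lock()
	if alive {
		n.ring[addr] = true
	} else {
		delete(n.ring, addr)
	}
	n.ringMu.Unlock()
}

// inSync reports whether the membership list and the ring agree about addr.
func (n *node) inSync(addr string) bool {
	n.applyMu.Lock()
	defer n.applyMu.Unlock()
	return n.members[addr] == n.ring[addr]
}

func main() {
	n := newNode()
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); n.apply("B", false) }() // gossip from C
	go func() { defer wg.Done(); n.apply("B", true) }()  // gossip from D
	wg.Wait()
	fmt.Println("in sync:", n.inSync("B"))
}
```

Whichever goroutine wins, the ring ends up reflecting the same final status as the membership list. The cost of this approach is that updates are serialized for their full duration rather than only while the membership list is being modified.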