forked from apache/cassandra-python-driver
Open
Description
Can be observed in dtest cluster_replacement_test.py::TestClusterReplacement::test_rack_loss_recovery, which does this:
1. down two nodes
2. execute ALTER KEYSPACE in the background
3. execute removenode on the downed nodes
If you comment out step 3, the ALTER times out, because schema agreement wait never completes.
There is a check in the driver, which is supposed to skip down nodes:
    if peer and peer.is_up is not False:
        versions[schema_ver].add(endpoint)
But with WhiteListRoundRobinPolicy, all nodes except the node of the connection are ignored by the policy's distance() function, so down and up events are ignored for those nodes. All peers except the contact node have is_up == None, and the schema agreement wait keeps waiting for them to catch up (which they won't).
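To illustrate, here is a minimal, self-contained sketch of the failure mode. It is not the driver's actual code: FakeHost, collect_versions, the addresses, and the version strings are all hypothetical, and only the `is_up is not False` condition is taken from the snippet above. It shows that a host whose state was never reported (is_up is None) passes the check just like a live one.

```python
from collections import defaultdict


class FakeHost:
    """Hypothetical stand-in for a driver host object; is_up is tri-state."""

    def __init__(self, endpoint, is_up=None):
        self.endpoint = endpoint
        self.is_up = is_up  # True, False, or None (state never reported)


def collect_versions(rows, hosts):
    """Group endpoints by schema version, skipping only hosts known to be down."""
    versions = defaultdict(set)
    for endpoint, schema_ver in rows:
        peer = hosts.get(endpoint)
        # Same condition as in the issue: None is treated like "up".
        if peer and peer.is_up is not False:
            versions[schema_ver].add(endpoint)
    return versions


# The contact node is known to be up; the other two are ignored by the
# policy, so no down event ever arrives and is_up stays None.
hosts = {
    "10.0.0.1": FakeHost("10.0.0.1", is_up=True),
    "10.0.0.2": FakeHost("10.0.0.2"),
    "10.0.0.3": FakeHost("10.0.0.3"),
}

# Assumed rows: the two (actually dead) peers still report the old version.
rows = [("10.0.0.1", "v2"), ("10.0.0.2", "v1"), ("10.0.0.3", "v1")]

stuck = collect_versions(rows, hosts)
agreement_with_unknown_peers = len(stuck) == 1

# If the down events had been processed, the dead peers would be skipped.
hosts["10.0.0.2"].is_up = False
hosts["10.0.0.3"].is_up = False
fixed = collect_versions(rows, hosts)
agreement_with_known_down_peers = len(fixed) == 1
```

In this model, agreement is only reached once the peers are explicitly marked down, which matches the observation that the wait completes only after removenode drops them from the peers table.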
The problem probably happens with other policies as well, e.g. those that ignore remote DCs.
I think the right fix for schema agreement wait is to do it on the server side.