Ensure all Raft nodes are terminated before CP member terminates #16022
A Raft node could leak and stay in ACTIVE state after a CP member terminates, because of a race between the Hazelcast member shutdown and the Raft node termination logic. This commit fixes the problem by deleting a Raft node from the internal state of RaftService only after the Raft node has completed its termination. This way, the Hazelcast member shutdown / termination blocks until all Raft nodes are terminated.

Even though we have been complaining about test failures in our Windows environment, it turned out that Windows helped us discover another edge case in the never-ending battle between Hazelcast member shutdowns and CP group destroys. Windows does not allow a file to be deleted while it is still open (this behaviour can probably be changed), so if a Raft node has leaked in ACTIVE status, deleting the CP group directories fails in subsequent iterations of the CP group destroy tests.

Thank you Windows for having principles for file deletion.

Fixes hazelcast/hazelcast-enterprise#3219
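The ordering described above can be illustrated with a minimal sketch. It is not the code from this PR: `RaftNodeRegistry`, `Node`, and all method names are hypothetical stand-ins, and the sketch assumes that node termination is asynchronous and reports completion through a future. The point it shows is that a node's entry is removed only once its termination has completed, so a shutdown that waits on the registry cannot return while a node is still active.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry, loosely modelled on the idea in this PR.
// It is NOT Hazelcast's RaftService API.
final class RaftNodeRegistry {

    // Hypothetical stand-in for a Raft node that terminates asynchronously.
    static final class Node {
        private volatile boolean active = true;

        boolean isActive() {
            return active;
        }

        // Starts termination on another thread; the returned future
        // completes once the node is no longer active.
        CompletableFuture<Void> terminate() {
            return CompletableFuture.runAsync(() -> {
                // A real node would close its open files here.
                active = false;
            });
        }
    }

    private final Map<String, Node> nodes = new ConcurrentHashMap<>();

    Node register(String groupId) {
        Node node = new Node();
        nodes.put(groupId, node);
        return node;
    }

    // Removes the entry only AFTER termination has completed.
    // Removing it before calling terminate() would reintroduce the leak:
    // shutdown would no longer see the node and could return too early.
    CompletableFuture<Void> terminate(String groupId) {
        Node node = nodes.get(groupId);
        if (node == null) {
            return CompletableFuture.completedFuture(null);
        }
        return node.terminate().thenRun(() -> nodes.remove(groupId, node));
    }

    // Blocks until every registered node has terminated and been removed.
    void shutdown() {
        CompletableFuture<?>[] pending = nodes.keySet().stream()
                .map(this::terminate)
                .toArray(CompletableFuture[]::new);
        CompletableFuture.allOf(pending).join();
    }

    int size() {
        return nodes.size();
    }
}
```

With this ordering, anything that runs after `shutdown()` returns, such as deleting a group's directory, sees no active node holding files open.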