Consider AbstractJoiner's permanent blacklist in split-brain handling [HZ-2065] #24830

JamesHazelcast · 2023-06-15T21:03:40Z

For context: Hazelcast uses a Joiner implementation that handles the joining of Nodes to Clusters; all current Joiners extend AbstractJoiner which features a map of blacklisted addresses, blacklistedAddresses. The key of the map is the address, and the value is a boolean, which shows whether the blacklist is "permanent" (true) or not. A blacklist is applied in 2 situations: after failing a connection while not joined to a cluster (non-permanent), or during a ClusterMismatchOp (permanent). The latter is only fired when a Node tries to join a cluster, but is not compatible with the cluster, and so is rejected.

The TcpIpJoiner joiner employs the blacklist quite heavily, however it currently does not use it in the searchForOtherClusters method. This method is only called by the SplitBrainHandler task, which runs on loop (default of 5m delay, 2m interval). Due to this, when running multiple clusters sharing the same member list, it is possible to encounter a situation such as this:

Member list is defined as "127.0.0.1" for all Nodes.

Node A1 starts and forms cluster "cluster_A" as master
Node B1 starts and forms cluster "cluster_B" as master
Node A2 starts and joins cluster "cluster_A"

Node B1, as master, runs `SplitBrainHandler` every 2 minutes.
This finds A2's address; B1 sends a merge validation op
A2 is not master, so offloads to ClusterJoinManager#answerWhoisMasterQuestion
This calls #ensureValidConfiguration, results in ClusterMismatchOp
ClusterMismatchOp blacklists A2's address on the first encounter

In this scenario, even though the first attempt from B1's split brain handler blacklists A2's address, because there are no blacklist checks within TcpIpJoiner#searchForOtherClusters, the attempts continue with every execution of the SplitBrainHandler - this results in numerous unnecessary warnings in the logs for members B1 and A2.

This commit resolves scenarios like this by removing permanently blacklisted addresses from the collection of possibleAddresses output by TcpIpJoiner#searchForOtherClusters. Now master nodes will not attempt to send SplitBrainJoinMessages to other cluster nodes after blacklisting its address on the first encounter.

However, introducing this blacklist check for SplitBrainHandler adds some further considerations, such as what would happen in this scenario:

Member list is defined as "127.0.0.1" for all Nodes.

Node A1 starts on port 5701 and forms cluster "cluster_A" as master
Node A2 starts on port 5702 and joins cluster "cluster_A"

Node B1 starts on port 5703 and forms cluster "cluster_B" as master
Node B2 starts on port 5704 and joins cluster "cluster_B"

*Some time elapses, both A1/B2 have run SplitBrainHandler tasks*

Node A2 is shutdown.
Node B3 starts on port 5702 and joins cluster "cluster_B"

B1 (master) has B3's address blacklisted from when A2 was on port 5702
B1 will now never send SplitBrainJoinMessages to B3 due to this

In this scenario, B3 is able to join "cluster_B" successfully as it seeks the master itself and so the blacklist is not applied. This commit resolves the situation by removing Node addresses from the permanent blacklist within the TcpIpJoiner#onMemberAdded method.

As stated above, addresses are only permanently blacklisted in a ClusterMismatchOp, so if a Node has successfully triggered onMemberAdded, there has clearly been a topology change meaning this address is no longer incompatible. Therefore it should be safe to add this permanent blacklist removal functionality.

Fixes https://hazelcast.atlassian.net/browse/HZ-2065

For context: Hazelcast uses a `Joiner` implementation that handles the joining of Nodes to Clusters; all current Joiners extend `AbstractJoiner` which features a map of blacklisted addresses, `blacklistedAddresses`. The key of the map is the address, and the value is a boolean, which shows whether the blacklist is "permanent" (true) or not. A blacklist is applied in 2 situations: after failing a connection while not joined to a cluster (non-permanent), or during a `ClusterMismatchOp` (permanent). The latter is only fired when a Node tries to join a cluster, but is not compatible with the cluster, and so is rejected. The `TcpIpJoiner` joiner employs the blacklist quite heavily, however it currently does not use it in the `searchForOtherClusters` method. This method is only called by the `SplitBrainHandler` task, which runs on loop (default of 5m delay, 2m interval). Due to this, when running multiple clusters sharing the same member list, it is possible to encounter a situation such as this: ``` Member list is defined as "127.0.0.1" for all Nodes. Node A1 starts and forms cluster "cluster_A" as master Node B1 starts and forms cluster "cluster_B" as master Node A2 starts and joins cluster "cluster_A" Node B1, as master, runs `SplitBrainHandler` every 2 minutes. This finds A2's address; B1 sends a merge validation op A2 is not master, so offloads to ClusterJoinManager#answerWhoisMasterQuestion This calls #ensureValidConfiguration, results in ClusterMismatchOp ClusterMismatchOp blacklists A2's address on the first encounter ``` In this scenario, even though the first attempt from B1's split brain handler blacklists A2's address, because there are no blacklist checks within `TcpIpJoiner#searchForOtherClusters`, the attempts continue with every execution of the `SplitBrainHandler` - this results in numerous unnecessary warnings in the logs for members B1 and A2. This commit resolves scenarios like this by removing permanently blacklisted addresses from the collection of `possibleAddresses` output by `TcpIpJoiner#searchForOtherClusters`. Now master nodes will not attempt to send `SplitBrainJoinMessage`s to other cluster nodes after blacklisting its address on the first encounter. However, introducing this blacklist check for `SplitBrainHandler` adds some further considerations, such as what would happen in this scenario: ``` Member list is defined as "127.0.0.1" for all Nodes. Node A1 starts on port 5701 and forms cluster "cluster_A" as master Node A2 starts on port 5702 and joins cluster "cluster_A" Node B1 starts on port 5703 and forms cluster "cluster_B" as master Node B2 starts on port 5704 and joins cluster "cluster_B" *Some time elapses, both A1/B2 have run SplitBrainHandler tasks* Node A2 is shutdown. Node B3 starts on port 5702 and joins cluster "cluster_B" B1 (master) has B3's address blacklisted from when A2 was on port 5702 B1 will now never send SplitBrainJoinMessages to B3 due to this ``` In this scenario, B3 is able to join "cluster_B" successfully as it seeks the master itself and so the blacklist is not applied. This commit resolves the situation by removing Node addresses from the permanent blacklist within the `TcpIpJoiner#onMemberAdded` method. As stated above, addresses are only permanently blacklisted in a `ClusterMismatchOp`, so if a Node has successfully triggered `onMemberAdded`, there has clearly been a topology change meaning this address is no longer incompatible. Therefore it should be safe to add this permanent blacklist removal functionality. Fixes https://hazelcast.atlassian.net/browse/HZ-2065

kwart

Could you add also a regression test?

To conduct this test thoroughly, it was necessary to extract part of the `TcpIpJoiner#searchForOtherClusters` function into a separate function `TcpIpJoiner#getFilteredPossibleAddresses` - this function reduces the results of `AbstractJoiner#getPossibleAddresses` (applying filters such as removing known and blacklisted addresses), and does not change any functionality, it just exposes this list more easily.

JamesHazelcast · 2023-06-21T13:57:10Z

Could you add also a regression test?

Good point Josef thank you, I forgot about that! Added: 9d9dc2a

JamesHazelcast · 2023-06-21T15:02:20Z

run-lab-run

vbekiaris

nice fix & cleanup in searchForOtherClusters. Added a suggestion to use a smaller config in the test so hazelcast instance creation is a bit cheaper (on the CI machine, thread creation adds up :)).

hazelcast/src/test/java/com/hazelcast/cluster/SplitBrainHandlerTest.java

…HZ-2065] [5.3.z] (#24830) (#24888) Backport of: #24830 For context: Hazelcast uses a `Joiner` implementation that handles the joining of Nodes to Clusters; all current Joiners extend `AbstractJoiner` which features a map of blacklisted addresses, `blacklistedAddresses`. The key of the map is the address, and the value is a boolean, which shows whether the blacklist is "permanent" (true) or not. A blacklist is applied in 2 situations: after failing a connection while not joined to a cluster (non-permanent), or during a `ClusterMismatchOp` (permanent). The latter is only fired when a Node tries to join a cluster, but is not compatible with the cluster, and so is rejected. The `TcpIpJoiner` joiner employs the blacklist quite heavily, however it currently does not use it in the `searchForOtherClusters` method. This method is only called by the `SplitBrainHandler` task, which runs on loop (default of 5m delay, 2m interval). Due to this, when running multiple clusters sharing the same member list, it is possible to encounter a situation such as this: ``` Member list is defined as "127.0.0.1" for all Nodes. Node A1 starts and forms cluster "cluster_A" as master Node B1 starts and forms cluster "cluster_B" as master Node A2 starts and joins cluster "cluster_A" Node B1, as master, runs `SplitBrainHandler` every 2 minutes. This finds A2's address; B1 sends a merge validation op A2 is not master, so offloads to ClusterJoinManager#answerWhoisMasterQuestion This calls #ensureValidConfiguration, results in ClusterMismatchOp ClusterMismatchOp blacklists A2's address on the first encounter ``` In this scenario, even though the first attempt from B1's split brain handler blacklists A2's address, because there are no blacklist checks within `TcpIpJoiner#searchForOtherClusters`, the attempts continue with every execution of the `SplitBrainHandler` - this results in numerous unnecessary warnings in the logs for members B1 and A2. This commit resolves scenarios like this by removing permanently blacklisted addresses from the collection of `possibleAddresses` output by `TcpIpJoiner#searchForOtherClusters`. Now master nodes will not attempt to send `SplitBrainJoinMessage`s to other cluster nodes after blacklisting its address on the first encounter. However, introducing this blacklist check for `SplitBrainHandler` adds some further considerations, such as what would happen in this scenario: ``` Member list is defined as "127.0.0.1" for all Nodes. Node A1 starts on port 5701 and forms cluster "cluster_A" as master Node A2 starts on port 5702 and joins cluster "cluster_A" Node B1 starts on port 5703 and forms cluster "cluster_B" as master Node B2 starts on port 5704 and joins cluster "cluster_B" *Some time elapses, both A1/B2 have run SplitBrainHandler tasks* Node A2 is shutdown. Node B3 starts on port 5702 and joins cluster "cluster_B" B1 (master) has B3's address blacklisted from when A2 was on port 5702 B1 will now never send SplitBrainJoinMessages to B3 due to this ``` In this scenario, B3 is able to join "cluster_B" successfully as it seeks the master itself and so the blacklist is not applied. This commit resolves the situation by removing Node addresses from the permanent blacklist within the `TcpIpJoiner#onMemberAdded` method. As stated above, addresses are only permanently blacklisted in a `ClusterMismatchOp`, so if a Node has successfully triggered `onMemberAdded`, there has clearly been a topology change meaning this address is no longer incompatible. Therefore it should be safe to add this permanent blacklist removal functionality. Fixes https://hazelcast.atlassian.net/browse/HZ-2065

JamesHazelcast added Type: Defect Team: Core Source: Internal PR or issue was opened by an employee Module: Cluster Add to Release Notes labels Jun 15, 2023

JamesHazelcast added this to the 5.4.0 milestone Jun 15, 2023

JamesHazelcast self-assigned this Jun 15, 2023

JamesHazelcast requested review from kwart and vbekiaris June 16, 2023 09:08

JamesHazelcast marked this pull request as ready for review June 16, 2023 09:08

kwart reviewed Jun 20, 2023

View reviewed changes

JamesHazelcast requested a review from kwart June 21, 2023 13:57

vbekiaris approved these changes Jun 21, 2023

View reviewed changes

hazelcast/src/test/java/com/hazelcast/cluster/SplitBrainHandlerTest.java Outdated Show resolved Hide resolved

Shrink configs used in SplitBrainHandlerTest

e861df9

kwart approved these changes Jun 22, 2023

View reviewed changes

JamesHazelcast merged commit 8fb2847 into hazelcast:master Jun 22, 2023
8 checks passed

JamesHazelcast mentioned this pull request Jun 22, 2023

Consider AbstractJoiner's permanent blacklist in split-brain handling [HZ-2065] [5.3.z] (#24830) #24888

Merged

JamesHazelcast deleted the fix/5.4/hz-2065 branch October 9, 2023 12:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider AbstractJoiner's permanent blacklist in split-brain handling [HZ-2065] #24830

Consider AbstractJoiner's permanent blacklist in split-brain handling [HZ-2065] #24830

JamesHazelcast commented Jun 15, 2023

kwart left a comment

JamesHazelcast commented Jun 21, 2023

JamesHazelcast commented Jun 21, 2023

vbekiaris left a comment

Consider AbstractJoiner's permanent blacklist in split-brain handling [HZ-2065] #24830

Consider AbstractJoiner's permanent blacklist in split-brain handling [HZ-2065] #24830

Conversation

JamesHazelcast commented Jun 15, 2023

kwart left a comment

Choose a reason for hiding this comment

JamesHazelcast commented Jun 21, 2023

JamesHazelcast commented Jun 21, 2023

vbekiaris left a comment

Choose a reason for hiding this comment