
[Platform] Incorrect masters selection leads to universe creation failures #9391

Open
SergeyPotachev opened this issue Jul 20, 2021 · 1 comment
Labels
area/platform Yugabyte Platform kind/bug This issue is a bug priority/high High Priority
SergeyPotachev commented Jul 20, 2021

The problem is in PlacementInfoUtil.selectMasters(). It selects a number of masters to make the universe balanced (not under-replicated). For example, we can request 3 masters for a universe with 9 nodes and RF=3 and get 4 or even 5 nodes marked as masters as a result. A JUnit test reproducing the problem:

  @Test
  public void testSelectMasters_9nodes3regionsMixed() {
    List<NodeDetails> nodes = new ArrayList<NodeDetails>();
    nodes.add(ApiUtils.getDummyNodeDetails(1, NodeDetails.NodeState.ToBeAdded, false, true,
        "onprem", "31df", "us-2a", null));
    nodes.add(ApiUtils.getDummyNodeDetails(2, NodeDetails.NodeState.ToBeAdded, true, true, "onprem",
        "a2c5", "ap-1a", null));
    nodes.add(ApiUtils.getDummyNodeDetails(3, NodeDetails.NodeState.ToBeAdded, true, true, "onprem",
        "55ce", "eu-1a", null));
    nodes.add(ApiUtils.getDummyNodeDetails(4, NodeDetails.NodeState.ToBeAdded, true, true, "onprem",
        "31df", "us-2a", null));
    nodes.add(ApiUtils.getDummyNodeDetails(5, NodeDetails.NodeState.ToBeAdded, false, true,
        "onprem", "a2c5", "ap-1a", null));
    nodes.add(ApiUtils.getDummyNodeDetails(6, NodeDetails.NodeState.ToBeAdded, false, true,
        "onprem", "31df", "us-2a", null));
    nodes.add(ApiUtils.getDummyNodeDetails(7, NodeDetails.NodeState.ToBeAdded, false, true,
        "onprem", "55ce", "eu-1a", null));
    nodes.add(ApiUtils.getDummyNodeDetails(8, NodeDetails.NodeState.ToBeAdded, false, true,
        "onprem", "55ce", "eu-1a", null));
    nodes.add(ApiUtils.getDummyNodeDetails(9, NodeDetails.NodeState.ToBeAdded, false, true,
        "onprem", "a2c5", "ap-1a", null));

    PlacementInfoUtil.selectMasters(nodes, 3);
    List<NodeDetails> masters = nodes.stream().filter(node -> node.isMaster)
        .collect(Collectors.toList());
    assertEquals(3, masters.size());
  }

We also need to restore the logging of the selected masters (removed from this function earlier).

cc @Arnav15

@SergeyPotachev SergeyPotachev added kind/bug This issue is a bug area/platform Yugabyte Platform labels Jul 20, 2021
@hsu880 hsu880 added this to Backlog in Platform Jul 20, 2021
@hsu880 hsu880 added this to the 2.7.x milestone Jul 20, 2021
@SergeyPotachev SergeyPotachev added the priority/high High Priority label Aug 11, 2021

SergeyPotachev commented Aug 11, 2021

Marking this issue as priority/high, as it can affect customers with universes that have a significant number of nodes and can leave such universes in an inoperable state after an Edit Universe operation.
cc @hsiaosu-yb

@SergeyPotachev SergeyPotachev self-assigned this Sep 27, 2021
SergeyPotachev added a commit that referenced this issue Oct 22, 2021
…erse creation failures

Summary:
1. PLAT-364: Previously, in some rare cases, selectMasters() could place more masters than required (> RF). This led to `EditUniverse` operation failures; noticed on one of a customer's universes.
2. PLAT-1825: "[Platform] Platform should balance master placement according to node count in each zone (#9620)". The issue is that the platform doesn't re-allocate masters when the number of nodes changes in some zones.

This diff introduces new logic in the `selectMasters` function and also adds more logic inside EditUniverse itself.
Some details about the new logic:

1. Each region should have at least one master;
2. Even if some other zones have more nodes, we still prefer to have at least one master in each zone (but not more than RF in total);
3. Once rules 1 and 2 are satisfied and some masters are still unallocated, the remainder are allocated in proportion to the total node count in each zone;
4. The master leader is always preserved by the function. The function doesn't handle the case where the master leader changes while the function is running, but this corner case causes further problems (in EditUniverse) only in very rare cases, so the probability of such problems is "Rare^2".

More details:
1. The function doesn't require the number of zones to be less than or equal to RF (the number of AZs can be larger than RF; in that case, only the smallest AZs stay without a master);
2. Only active nodes are processed (see NodeDetails::isActive() - states Live, ToBeAdded, etc., but not ToBeRemoved, for example);
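The zone-level allocation rules above can be sketched as follows. This is a hypothetical, simplified illustration (the class and method names are my own, the region rule is folded into the per-zone rule, and node-level details like leader preservation are omitted), not the actual `PlacementInfoUtil` implementation:

```java
import java.util.*;

// Hypothetical sketch of the per-zone master allocation described above;
// not the actual PlacementInfoUtil code.
public class MasterAllocationSketch {

  // zoneNodes: zone name -> active node count. Returns zone -> master count.
  // Assumes the total node count is at least rf.
  static Map<String, Integer> allocateMasters(Map<String, Integer> zoneNodes, int rf) {
    Map<String, Integer> masters = new LinkedHashMap<>();
    // Rule: one master per zone, largest zones first, never more than RF in total.
    List<String> zones = new ArrayList<>(zoneNodes.keySet());
    zones.sort((a, b) -> zoneNodes.get(b) - zoneNodes.get(a));
    int placed = 0;
    for (String z : zones) {
      masters.put(z, placed < rf ? 1 : 0);
      if (placed < rf) placed++;
    }
    // Rule: distribute the remaining masters in proportion to zone node counts,
    // never exceeding the number of nodes available in a zone.
    while (placed < rf) {
      String best = null;
      double bestRatio = -1;
      for (String z : zones) {
        int m = masters.get(z);
        if (m >= zoneNodes.get(z)) continue; // no free node left in this zone
        double ratio = (double) zoneNodes.get(z) / (m + 1);
        if (ratio > bestRatio) { bestRatio = ratio; best = z; }
      }
      if (best == null) break; // not enough nodes to place all masters
      masters.put(best, masters.get(best) + 1);
      placed++;
    }
    return masters;
  }

  public static void main(String[] args) {
    // The RF=5, 6-3-6 scenario from the test plan below.
    Map<String, Integer> zones = new LinkedHashMap<>();
    zones.put("az1", 6);
    zones.put("az2", 3);
    zones.put("az3", 6);
    System.out.println(allocateMasters(zones, 5)); // 2 masters in az1 and az3, 1 in az2
  }
}
```

For the 6-3-6 universe with RF=5, every zone first receives one master, and the two remaining masters go to az1 and az3, matching the proportional rule.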

Test Plan:
Test scenarios should cover different cases of universe expansion, for example:
- **Check that masters stay the same when the node count increases evenly in each zone:** RF=3, create a universe with 1 node in each of 3 zones (az1 = 1 node, az2 = 1 node, az3 = 1 node); edit the universe - add one node in each zone; check that the updated universe has the same masters as after creation;
- **Check that masters are correctly redistributed:** RF=5, create a universe 1-3-1; edit the universe - add 5 nodes to az1 and 5 nodes to az3, so the universe becomes 6-3-6; check that az1 and az3 have 2 masters each and az2 only one; check that one of the masters in each of az1 and az3 is the same as after creation.
- **Check that a master is stopped on an existing node and moved to a new node:** RF=3, universe 2-2-1 -> 2-2-4;
- **Check that a master is reassigned from one existing node to another:** RF=5, universe 2-2-4 -> 4-2-4.

Check some other scenarios, like RF=5, 1-2-3 -> 3-2-1; RF=5, 6-2-1 -> 2-1-6; RF=7, 2-2-3 -> 6-6-2; etc.
Each test scenario should have one additional step at the end - for each node, check which masters are written in the processes' configuration files:
   - Select the node in the platform UI (universe -> Nodes tab), Actions -> Connect (copy);
   - In a console/terminal, connect to the `yugaware` host/container and paste the command to connect to the node;
   - List the running processes: `ps aux | grep yugabyte`;
     You'll see something like:
```
  yugabyte 11375  0.3  0.5 677228 78052 ?        Sl   00:12   2:54 /home/yugabyte/master/bin/yb-master --flagfile **/home/yugabyte/master/conf/server.conf**
  yugabyte 11454  3.7  0.4 1643016 74844 ?       Sl   00:12  27:56 /home/yugabyte/tserver/bin/yb-tserver --flagfile **/home/yugabyte/tserver/conf/server.conf**

```
   - Check that nodes without a master have no master process running; for the other nodes, check that both processes are running;
   - List the content of the highlighted configuration files:
```
[yugabyte@yb-15-rahul-xcluster-producer-n1 ~]$ cat /home/yugabyte/master/conf/server.conf
--placement_cloud=gcp
--placement_region=us-west1
--placement_zone=us-west1-a
--max_log_size=256
--server_broadcast_addresses=
--fs_data_dirs=/mnt/d0
>>> --master_addresses=10.150.4.238:7100,10.150.4.239:7100,10.150.4.242:7100
--rpc_bind_addresses=10.150.4.239:7100
--webserver_port=7000
--webserver_interface=10.150.4.239
--placement_uuid=b9fa87b7-01aa-4889-82e1-f2b0a6ac9c88
--replication_factor=3
--cql_proxy_bind_address=10.150.4.239:9042
--callhome_collection_level=medium
--enable_ysql=true
--use_cassandra_authentication=false
--metric_node_name=yb-15-rahul-xcluster-producer-n1
--ysql_enable_auth=false
--cluster_uuid=4aa0687b-6996-467b-afc6-4ef201a4a9a6
--pgsql_proxy_bind_address=10.150.4.239:5433
--undefok=enable_ysql
--txn_table_wait_min_ts_count=3
--start_cql_proxy=true
```
Check that the correct masters are present in the marked (`>>>`) line.
Do the same for the other process:
```
[yugabyte@yb-15-rahul-xcluster-producer-n1 ~]$ cat /home/yugabyte/tserver/conf/server.conf
--placement_cloud=gcp
--placement_region=us-west1
--placement_zone=us-west1-a
--max_log_size=256
--server_broadcast_addresses=
--fs_data_dirs=/mnt/d0
--rpc_bind_addresses=10.150.4.239:9100
>>> --tserver_master_addrs=10.150.4.238:7100,10.150.4.239:7100,10.150.4.242:7100
--webserver_port=9000
--webserver_interface=10.150.4.239
--cql_proxy_bind_address=10.150.4.239:9042
--redis_proxy_bind_address=10.150.4.239:6379
--placement_uuid=b9fa87b7-01aa-4889-82e1-f2b0a6ac9c88
--replication_factor=3
--callhome_collection_level=medium
--enable_ysql=true
--use_cassandra_authentication=false
--metric_node_name=yb-15-rahul-xcluster-producer-n1
--ysql_enable_auth=false
--cluster_uuid=4aa0687b-6996-467b-afc6-4ef201a4a9a6
--pgsql_proxy_bind_address=10.150.4.239:5433
--undefok=enable_ysql
--txn_table_wait_min_ts_count=3
--start_cql_proxy=true
--pgsql_proxy_webserver_port=13000
--start_redis_proxy=false
--cql_proxy_webserver_port=12000

```

Reviewers: amalyshev, sanketh

Reviewed By: amalyshev, sanketh

Subscribers: jenkins-bot, yugaware

Differential Revision: https://phabricator.dev.yugabyte.com/D13236