
Fix: libpacemaker: Don't shuffle anonymous clone instances unnecessarily #2313

Closed
wants to merge 10 commits

Commits on Apr 21, 2021

  1. Refactor: libpe_status: Add pcmk__rsc_node_e enum

    This commit adds a new pcmk__rsc_node_e bitfield enum containing values
    for allocated, current, and pending. This indicates the criterion used
    to look up a resource's location (e.g., where is it now vs. where is it
    allocated?).
    
    After a compatibility break, native_location() could use these flags
    instead of an int. That would require making this enum public.
    
    Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
    nrwahl2 committed Apr 21, 2021
    SHA: 1782352
  2. Refactor: libpacemaker, libpe_status: pe_node_attribute_calculated enum

    Use the pcmk__rsc_node_e enum as the flags argument to the
    pe_node_attribute_calculated() function. Pass pcmk__rsc_node_current as
    the flags argument for existing calls, as this mimics the existing
    behavior.
    
    Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
    nrwahl2 committed Apr 21, 2021
    SHA: b9e56ab
  3. Fix: libpacemaker, libpe_status: Get container attr from allocated node

    lookup_promotion_score() should get a container's promotion score from
    the host to which it's allocated (if it's been allocated), rather than
    the host on which it's running.
    
    pe_node_attribute_calculated() now accepts a flags argument using the
    pcmk__rsc_node_e enum to specify whether the value should come from
    the allocated_to host or the first running_on host.
    
    Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
    nrwahl2 committed Apr 21, 2021
    SHA: b223509
  4. Refactor: libpacemaker, libpe_status: Add PCMK__SCORE_MAX_LEN

    The maximum allowed length of a stringified score is
    strlen("-INFINITY"). We can use this constant instead of hard-coded
    integer array sizes with score2char_stack().
    
    Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
    nrwahl2 committed Apr 21, 2021
    SHA: b658f27
  5. Refactor: libpacemaker: Functionalize sections of promotion_order()

    Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
    nrwahl2 committed Apr 21, 2021
    SHA: fb50c8a
  6. Test: scheduler: Apply bundle colocations in promotion order

    Currently, bundle colocations aren't considered when determining
    promotion order. Bundle colocations apply to the bundle and its
    containers; the clone wrapper is not aware of them.
    
    This commit adds four tests for positive bundle colocations:
      - bundle (promoted) with ip, where ip has an INFINITY location score
      - bundle (promoted) with ip, where ip has a positive non-INFINITY
        location score
      - ip with bundle (promoted), where ip has an INFINITY location score
      - ip with bundle (promoted), where ip has a positive non-INFINITY
        location score
    
    Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
    nrwahl2 committed Apr 21, 2021
    SHA: c7f8400
  7. Fix: libpacemaker: Apply bundle colocations in promotion order

    Currently, bundle colocations aren't considered when determining
    promotion order. Bundle colocations apply to the bundle and its
    containers; the clone wrapper is not aware of them.
    
    This commit grabs the bundle's colocations into a working table. Then it
    iterates over the clone's children, finds each child's bundle node, and
    transfers node weights from the bundle node's host to the clone
    wrapper's copy of the bundle node object.
    
    This came up incidentally during an investigation of how bundles are
    processed when scheduling promotions.
    
    Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
    nrwahl2 committed Apr 21, 2021
    SHA: 1a4f92a
  8. Test: scheduler: Clone placement when recovering promoted instance

    Currently, when a promoted instance of an anonymous clone is stopped and
    about to be recovered, the instances get shuffled. Instance 0, which is
    running in non-promoted state on another node, gets moved to the
    to-be-started-and-promoted node. Then instance 1 starts on the
    non-promoted node. This causes an unnecessary restart on the
    non-promoted node.
    
    This will be fixed in an upcoming commit.
    
    This commit adds six tests:
      - clone-anon-recover (correct)
      - clone-group-anon-recover (correct)
      - promotable-anon-recover-promoted (incorrect)
      - promotable-anon-recover-non-promoted (correct)
      - promotable-group-anon-recover-promoted (questionable)
        - correct in that it doesn't shuffle instances, but maybe incorrect
          in that it promotes a different node after one monitor failure on
          the promoted node
      - promotable-group-anon-recover-non-promoted (correct)
    
    Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
    nrwahl2 committed Apr 21, 2021
    SHA: 1f401d2
  9. Test: scheduler: Don't shuffle running anon clones due to constraints

    Currently, there are some circumstances under which running anonymous
    clones may be shuffled around the cluster. The exact requirements to
    reproduce the issue are unclear.
    
    In the case of this test CIB, the issue disappears if:
      - the colocation constraint between the Filesystem and clvm resources
        is removed, or
      - certain of the INFINITY location constraints for the Filesystem
        resource are removed.
    
    This will be fixed in an upcoming commit.
    
    Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
    nrwahl2 committed Apr 21, 2021
    SHA: 4e536f6
  10. Fix: libpacemaker: Don't shuffle anonymous clone instances

    Currently, anonymous clone instances may be shuffled under certain
    conditions, causing an unnecessary resource downtime when an instance is
    moved away from its current running node.
    
    For example, this can happen when a stopped promotable instance is
    scheduled to promote and the stickiness is lower than the promotion
    score (see the promotable-anon-recover-promoted test). Instance 0 gets
    allocated first and goes to the node that will be promoted, causing it
    to relocate if it's already running somewhere else.
    
    There are also some other corner cases that can trigger shuffling, like
    the one in the clone-anon-no-shuffle-constraints test.
    
    The fix is to allocate an instance to its current node during
    pre-allocation if that node is going to receive an instance at all.
    
    Previously, if instance:0 was running on node1 and got pre-allocated to
    node2 due to node2 having a higher weight, we backed out and immediately
    gave up on pre-allocating instance:0.
    
    Now, if instance:0 is running on node1 and gets pre-allocated to node2,
    we increment the "reserved" counter (to ensure we don't allocate the max
    number of instances without node2 getting one), and we make node2
    unavailable. If allocated + reserved < max, we try pre-allocating
    instance:0 again with node2 out of the picture.
    
    This commit also updates several tests that contain unnecessary instance
    moves, and it updates scores files that changed due to the fix.
    
    Resolves: RHBZ#1931023
    
    Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
    nrwahl2 committed Apr 21, 2021
    SHA: ca476b6