-
Notifications
You must be signed in to change notification settings - Fork 336
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: libpacemaker: Don't shuffle anonymous clone instances unnecessarily #2313
Commits on Apr 21, 2021
-
Refactor: libpe_status: Add pcmk__rsc_node_e enum
This commit adds a new pcmk__rsc_node_e bitfield enum containing values for allocated, current, and pending. This indicates the criterion used to look up a resource's location (e.g., where is it now vs. where is it allocated?). After a compatibility break, native_location() could use these flags instead of an int. That would require making this enum public. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Configuration menu - View commit details
-
Copy full SHA for 1782352 - Browse repository at this point
Copy the full SHA 1782352View commit details -
Refactor: libpacemaker, libpe_status: pe_node_attribute_calculated enum
Use the pcmk__rsc_node_e enum as a flags argument to the pe_node_attr_calculated() function. Pass pcmk__rsc_node_current as the flags argument for existing calls, as this mimics the existing behavior. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Configuration menu - View commit details
-
Copy full SHA for b9e56ab - Browse repository at this point
Copy the full SHA b9e56abView commit details -
Fix: libpacemaker, libpe_status: Get container attr from allocated node
lookup_promotion_score() should get a container's promotion score from the host to which it's allocated (if it's been allocated), rather than the host on which it's running. pe_node_attribute_calculated() now accepts a flags variable using these a pcmk__rsc_node_e enum to specify whether the value should come from the allocated_to host or the first running_on host. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Configuration menu - View commit details
-
Copy full SHA for b223509 - Browse repository at this point
Copy the full SHA b223509View commit details -
Refactor: libpacemaker, libpe_status: Add PCMK__SCORE_MAX_LEN
Maximum allowed length of a stringified score is strlen("-INFINITY"). We can use this instead of hard-coded integer array sizes with score2char_stack(). Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Configuration menu - View commit details
-
Copy full SHA for b658f27 - Browse repository at this point
Copy the full SHA b658f27View commit details -
Refactor: libpacemaker: Functionalize sections of promotion_order()
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Configuration menu - View commit details
-
Copy full SHA for fb50c8a - Browse repository at this point
Copy the full SHA fb50c8aView commit details -
Test: scheduler: Apply bundle colocations in promotion order
Currently, bundle colocations aren't considered when determining promotion order. Bundle colocations apply to the bundle and its containers; the clone wrapper is not aware of them. This commit adds four tests for positive bundle colocations: - bundle (promoted) with ip, where ip has an INFINITY location score - bundle (promoted) with ip, where ip has a positive non-INFINITY location score - ip with bundle (promoted), where ip has an INFINITY location score - ip with bundle (promoted), where ip has a positive non-INFINITY location score Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Configuration menu - View commit details
-
Copy full SHA for c7f8400 - Browse repository at this point
Copy the full SHA c7f8400View commit details -
Fix: libpacemaker: Apply bundle colocations in promotion order
Currently, bundle colocations aren't considered when determining promotion order. Bundle colocations apply to the bundle and its containers; the clone wrapper is not aware of them. This commit grabs the bundle's colocations into a working table. Then it iterates over the clone's children, finds each child's bundle node, and transfers node weights from the bundle node's host to the clone wrapper's copy of the bundle node object. This came up incidentally during an investigation of how bundles are processed when scheduling promotions. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Configuration menu - View commit details
-
Copy full SHA for 1a4f92a - Browse repository at this point
Copy the full SHA 1a4f92aView commit details -
Test: scheduler: Clone placement when recovering promoted instance
Currently, when a promoted instance of an anonymous clone is stopped and about to be recovered, the instances get shuffled. Instance 0, which is running in non-promoted state on another node, gets moved to the to-be-started-and-promoted node. Then instance 1 starts on the non-promoted node. This causes an unnecessary restart on the non-promoted node. This will be fixed in an upcoming commit. This commit adds six tests: - clone-anon-recover (correct) - clone-group-anon-recover (correct) - promotable-anon-recover-promoted (incorrect) - promotable-anon-recover-non-promoted (correct) - promotable-group-anon-recover-promoted (questionable) - correct in that it doesn't shuffle instances, but maybe incorrect in that it promotes a different node after one monitor failure on the promoted node - promotable-group-anon-recover-non-promoted (correct) Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Configuration menu - View commit details
-
Copy full SHA for 1f401d2 - Browse repository at this point
Copy the full SHA 1f401d2View commit details -
Test: scheduler: Don't shuffle running anon clones due to constraints
Currently, there are some circumstances under which running anonymous clones may be shuffled around the cluster. The exact requirements to reproduce the issue are unclear. In the case of this test CIB, the issue disappears if: - the colocation constraint between the Filesystem and clvm resources is removed, or - certain ones of the INFINITY location constraints for the Filesystem resource is removed. This will be fixed in an upcoming commit. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Configuration menu - View commit details
-
Copy full SHA for 4e536f6 - Browse repository at this point
Copy the full SHA 4e536f6View commit details -
Fix: libpacemaker: Don't shuffle anonymous clone instances
Currently, anonymous clone instances may be shuffled under certain conditions, causing an unnecessary resource downtime when an instance is moved away from its current running node. For example, this can happen when a stopped promotable instance is scheduled to promote and the stickiness is lower than the promotion score (see the promotable-anon-recover-promoted test). Instance 0 gets allocated first and goes to the node that will be promoted, causing it to relocate if it's already running somewhere else. There are also some other corner cases that can trigger shuffling, like the one in the clone-anon-no-shuffle-constraints test. The fix is to allocate an instance to its current node during pre-allocation if that node is going to receive an instance at all. Previously, if instance:0 was running on node1 and got pre-allocated to node2 due to node2 having a higher weight, we backed out and immediately gave up on pre-allocating instance:0. Now, if instance:0 is running on node1 and gets pre-allocated to node2, we increment the "reserved" counter (to ensure we don't allocate the max number of instances without node2 getting one), and we make node2 unavailable. If allocated + reserved < max, we try pre-allocating instance:0 again with node2 out of the picture. This commit also updates several tests that contain unnecessary instance moves, and it updates scores files that changed due to the fix. Resolves: RHBZ#1931023 Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Configuration menu - View commit details
-
Copy full SHA for ca476b6 - Browse repository at this point
Copy the full SHA ca476b6View commit details