e2e-pool: Start real pool provisions after inventory added (#2080)
Conversation
/lgtm Nice detective work

/test e2e
Goal: reduce e2e-pool wallclock time by ~35m.

Problem Statement: When ClusterPool inventory (ClusterDeploymentCustomization) testing was added to e2e-pool (4fddbe7 / openshift#1672), it triggered ClusterPool's staleness algorithm such that we were actually wasting a whole cluster while waiting for the real pool to become ready. Grab a cup of coffee...

To make the flow of the test a little easier, we were creating the real pool, then using its definition to generate the fake pool definition -- which does not have inventory -- and then adding inventory to the real pool.

But if you add or change a pool's inventory, we mark all its clusters stale. So because of the flow above, when we initially created the real pool without inventory, it started provisioning a cluster. Then when we updated it (mere seconds later, if that), that cluster immediately became stale.

Now, the way we decided to architect replacement of stale clusters, we prioritize _having claimable clusters_ over _all clusters being current_. Thus in this scenario we ended up waiting until the stale cluster was fully provisioned before deleting it and starting over with the (inventory-affected) cluster.

Solution: Create the real pool with an initial `size=0`. Scale it up to `size=1` _after_ adding the inventory.
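The fixed ordering can be sketched roughly like this (the pool name and the trimmed-down spec are illustrative, not copied from the e2e script; `spec.size` is the relevant ClusterPool field):

```yaml
# Hypothetical sketch: create the real pool with size=0 so no cluster
# starts provisioning before the inventory exists.
apiVersion: hive.openshift.io/v1
kind: ClusterPool
metadata:
  name: real-pool        # illustrative name
spec:
  size: 0                # key change: start empty
  # ...other spec fields (baseDomain, imageSetRef, platform, etc.) omitted...
```

With the pool parked at zero, the test can add the inventory -- which marks nothing stale, because no clusters exist yet -- and only then scale up, e.g. with something along the lines of `oc patch clusterpool real-pool --type=merge -p '{"spec":{"size":1}}'`.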
653387b to c891224
@dlom I did a silly: I can't have comments in the middle of a backslash-escaped multi-line shell command.
Good to know, I didn't know that was illegal. /lgtm
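The gotcha discussed above can be shown in a few lines of bash (the `echo` arguments are arbitrary):

```shell
# Plain backslash continuation works: this is one command.
echo a \
  b
# prints: a b

# Adding a comment mid-continuation breaks it: the backslash on the
# "echo c" line joins the comment line onto it, the "#" then comments out
# the rest of that joined line -- including its own trailing backslash --
# so the continuation ends and the final line runs as a SEPARATE command.
echo c \
  # a comment here \
  echo d
# prints: c
# then:   d   (from the unintended second command "echo d")
```

The safe options are to keep continued lines comment-free, or to build the command in an array/variable and comment around that instead.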
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 2uasimojo, dlom

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
@2uasimojo: all tests passed! Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.