elmiko commented Oct 27, 2025

this change refactors how cloud providers are created so that providers can inject a custom scale down processor. it also adds an upgrade processor for cluster-api to allow skipping machinedeployments that are undergoing upgrade.

refactor core.AutoscalerOptions in a new package

This change helps to prevent circular dependencies between the core and
builder packages as we start to pass the AutoscalerOptions to the cloud
provider builder functions.

refactor NewCloudProvider to accept AutoscalerOptions

this change updates the input of the cloud provider builder function so
that the full autoscaler options are passed. This is being proposed so
that cloud providers have a way to inject behavior into the core parts
of the autoscaler.
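
A minimal sketch of the builder shape described above, not the actual autoscaler code. The `AutoscalerOptions` fields, the `ScaleDownFilters` hook list, and the `clusterAPIProvider` type are all hypothetical stand-ins; the only point is that the builder receives a pointer to the full options and can therefore register behavior on it.

```go
package main

import "fmt"

// AutoscalerOptions is a hypothetical stand-in for the real options struct.
type AutoscalerOptions struct {
	CloudProviderName string
	// ScaleDownFilters is a hypothetical hook list a provider may append to.
	ScaleDownFilters []string
}

// CloudProvider is a minimal illustrative interface.
type CloudProvider interface {
	Name() string
}

type clusterAPIProvider struct{}

func (clusterAPIProvider) Name() string { return "clusterapi" }

// NewCloudProvider receives the full options rather than a provider-specific
// subset, so a provider can read any setting and also inject behavior.
// It returns nil for provider names this sketch does not know.
func NewCloudProvider(opts *AutoscalerOptions) CloudProvider {
	if opts.CloudProviderName == "clusterapi" {
		// Assumption: the provider registers its own scale down hook here.
		opts.ScaleDownFilters = append(opts.ScaleDownFilters, "clusterapi-upgrade")
		return clusterAPIProvider{}
	}
	return nil
}

func main() {
	opts := &AutoscalerOptions{CloudProviderName: "clusterapi"}
	provider := NewCloudProvider(opts)
	fmt.Println(provider.Name(), opts.ScaleDownFilters)
}
```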

add RegisterCombinedScaledDownCandidateProcessor

utility function that helps cloud providers add additional combined
scale down processors.
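
One way such a registration helper could look, as a rough sketch only. `CandidateProcessor`, `CombinedProcessor`, `Options`, and `RegisterCandidateProcessor` are hypothetical names chosen for illustration and do not reflect the real signatures in the autoscaler.

```go
package main

import "fmt"

// Node is a hypothetical, minimal stand-in for a cluster node.
type Node struct{ Name string }

// CandidateProcessor is a hypothetical interface: it receives scale down
// candidates and returns the ones that are still eligible.
type CandidateProcessor interface {
	Filter(candidates []Node) []Node
}

// CombinedProcessor chains processors, feeding each one's output to the next.
type CombinedProcessor struct {
	processors []CandidateProcessor
}

// Filter applies every registered processor in registration order.
func (c *CombinedProcessor) Filter(candidates []Node) []Node {
	for _, p := range c.processors {
		candidates = p.Filter(candidates)
	}
	return candidates
}

// Options is a hypothetical stand-in for the autoscaler options.
type Options struct {
	ScaleDownCandidates *CombinedProcessor
}

// RegisterCandidateProcessor creates the combined processor on first use and
// appends p to it, so providers do not need to know how it is constructed.
func RegisterCandidateProcessor(opts *Options, p CandidateProcessor) {
	if opts.ScaleDownCandidates == nil {
		opts.ScaleDownCandidates = &CombinedProcessor{}
	}
	opts.ScaleDownCandidates.processors = append(opts.ScaleDownCandidates.processors, p)
}

// rejectByName drops a single node by name; it exists only for the demo.
type rejectByName struct{ name string }

func (r rejectByName) Filter(candidates []Node) []Node {
	kept := make([]Node, 0, len(candidates))
	for _, n := range candidates {
		if n.Name != r.name {
			kept = append(kept, n)
		}
	}
	return kept
}

func main() {
	opts := &Options{}
	RegisterCandidateProcessor(opts, rejectByName{name: "node-a"})
	RegisterCandidateProcessor(opts, rejectByName{name: "node-c"})
	nodes := []Node{{Name: "node-a"}, {Name: "node-b"}, {Name: "node-c"}}
	fmt.Println(opts.ScaleDownCandidates.Filter(nodes))
}
```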

add clusterapi scale down upgrade processor

This change adds a custom scale down node processor for cluster api to
reject nodes that are undergoing upgrade.
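
The filtering idea can be sketched as follows. This is an illustration under assumptions, not the processor added by this change: the `Node` type, its `MachineDeployment` field, and the `Upgrading` set are hypothetical, and how the real processor detects an in-progress upgrade is not shown here.

```go
package main

import "fmt"

// Node is a hypothetical stand-in that records which MachineDeployment
// owns the node.
type Node struct {
	Name              string
	MachineDeployment string
}

// UpgradeProcessor rejects scale down candidates whose owning
// MachineDeployment is marked as upgrading. The Upgrading set is an
// assumption made for this sketch.
type UpgradeProcessor struct {
	Upgrading map[string]bool
}

// Filter returns only the candidates that are safe to consider for
// scale down, preserving their original order.
func (p UpgradeProcessor) Filter(candidates []Node) []Node {
	kept := make([]Node, 0, len(candidates))
	for _, n := range candidates {
		if p.Upgrading[n.MachineDeployment] {
			// Skip nodes that belong to a MachineDeployment under upgrade.
			continue
		}
		kept = append(kept, n)
	}
	return kept
}

func main() {
	processor := UpgradeProcessor{Upgrading: map[string]bool{"md-workers-a": true}}
	candidates := []Node{
		{Name: "node-1", MachineDeployment: "md-workers-a"},
		{Name: "node-2", MachineDeployment: "md-workers-b"},
	}
	for _, n := range processor.Filter(candidates) {
		fmt.Println("eligible for scale down:", n.Name)
	}
}
```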

ensure cloud provider is initialized last

this change moves the cloud provider initialization to the end of the
initializeDefaultOptions function to ensure that all other options are
prepared before the cloud provider. Due to the cloud provider now
receiving the full AutoscalerOptions struct, we need to ensure that all
the data is available.
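
A small sketch of why the ordering matters, assuming simplified, hypothetical fields (`Backoff`, `Processors`, `CloudProvider`) and a hypothetical `providerBuilder` type. The builder here refuses to run on half-initialized options, which is the situation the reordering avoids.

```go
package main

import (
	"errors"
	"fmt"
)

// AutoscalerOptions is a hypothetical, heavily reduced options struct.
type AutoscalerOptions struct {
	Backoff       string
	Processors    []string
	CloudProvider string
}

// providerBuilder is a hypothetical builder that inspects the full options.
type providerBuilder func(opts *AutoscalerOptions) (string, error)

// initializeDefaultOptions fills in every other default first and builds the
// cloud provider last, so the builder always sees fully prepared options.
func initializeDefaultOptions(opts *AutoscalerOptions, build providerBuilder) error {
	if opts.Backoff == "" {
		opts.Backoff = "exponential"
	}
	if opts.Processors == nil {
		opts.Processors = []string{"default"}
	}
	// The cloud provider comes last because it receives the whole struct.
	if opts.CloudProvider == "" {
		name, err := build(opts)
		if err != nil {
			return err
		}
		opts.CloudProvider = name
	}
	return nil
}

// strictBuilder fails if it is called before the other defaults are set.
func strictBuilder(opts *AutoscalerOptions) (string, error) {
	if opts.Backoff == "" || opts.Processors == nil {
		return "", errors.New("options are not fully initialized")
	}
	return "clusterapi", nil
}

func main() {
	opts := &AutoscalerOptions{}
	if err := initializeDefaultOptions(opts, strictBuilder); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(opts.CloudProvider, opts.Backoff, opts.Processors)
}
```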

refactor gpu_processor_test to remove cyclic dependency

this change removes the import from the gce module in favor of using the
string value directly.

@openshift-ci-robot added the following labels Oct 27, 2025:

  • jira/severity-critical: Referenced Jira bug's severity is critical for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • jira/invalid-bug: Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.

@openshift-ci-robot

@elmiko: This pull request references Jira Issue OCPBUGS-63605, which is invalid:

  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected dependent Jira Issue OCPBUGS-63603 to be in one of the following states: VERIFIED, RELEASE PENDING, CLOSED (ERRATA), CLOSED (CURRENT RELEASE), CLOSED (DONE), CLOSED (DONE-ERRATA), but it is ASSIGNED instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

this change refactors how cloud providers are created so that providers can inject a custom scale down processor. it also adds an upgrade processor for cluster-api to allow skipping machinedeployments that are undergoing upgrade.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@elmiko changed the title from "OCPBUGS-63605: refactor cloud provider options" to "[release-4.18] OCPBUGS-63605: refactor cloud provider options" Oct 27, 2025

elmiko commented Oct 27, 2025

/test e2e-aws-periodic-pre

openshift-ci bot requested review from enxebre and frobware October 27, 2025 20:08

openshift-ci bot commented Oct 27, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign joelspeed for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

elmiko commented Oct 28, 2025

/test e2e-aws-periodic-pre

elmiko commented Oct 28, 2025

/test e2e-aws-periodic-pre

elmiko commented Oct 29, 2025

i'm a little concerned about the failures in e2e-aws-periodic-pre. it looks like slow machine creation might be causing the failures.

/retest

elmiko commented Oct 30, 2025

/retest

elmiko commented Oct 30, 2025

still seeing scale out errors, but i think it's slow infra.

/retest

elmiko commented Oct 31, 2025

/retest

openshift-ci bot commented Oct 31, 2025

@elmiko: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
ci/prow/e2e-hypershift | cb3c965 | link | true | /test e2e-hypershift
ci/prow/e2e-aws-operator | cb3c965 | link | true | /test e2e-aws-operator
ci/prow/e2e-aws-periodic-pre | cb3c965 | link | false | /test e2e-aws-periodic-pre

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
