CNTRLPLANE-2262: Enable Azure scale-from-zero in CI #79770
Add scale-from-zero configuration to the Azure self-managed install path and include TestNodePoolAutoscalingScaleFromZero in the e2e-azure-self-managed test regex. Temporarily override hypershift-operator and hypershift-tests images with pre-built images from quay.io/jjaggars/ to test the full scale-from-zero flow via pj-rehearse before merging the companion hypershift PR #8337.
@jhjaggars: This pull request references CNTRLPLANE-2262 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "5.0.0" version, but no target version was set.
Walkthrough
This PR enables Azure scale-from-zero autoscaling for self-managed Hypershift deployments. CI image builds are reconfigured to use prebuilt images.
Changes
Azure Scale-from-Zero Support
🎯 2 (Simple) | ⏱️ ~12 minutes
Important
Pre-merge checks failed
Please resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error)
✅ Passed checks (14 passed)
Skipping CI for Draft Pull Request.
[REHEARSALNOTIFIER]
A total of 689 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs.
/pj-rehearse e2e-azure-self-managed
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@ci-operator/step-registry/hypershift/install/hypershift-install-commands.sh`:
- Around line 125-132: The current lookup for SUBSCRIPTION_ID swallows errors
via "|| true" so scale-from-zero can be silently disabled; change the logic in
the block that sets SUBSCRIPTION_ID/SCALE_FROM_ZERO_CREDS/EXTRA_ARGS to fail
fast: remove the suppression so the az account show call returns a non-zero
status on error, and add an explicit check that if
/etc/hypershift-ci-jobs-self-managed-azure/credentials.json exists (or when
scale-from-zero is expected) and SUBSCRIPTION_ID is empty, emit an error to
stderr and exit non-zero; otherwise continue to create SCALE_FROM_ZERO_CREDS
(using HYPERSHIFT_AZURE_LOCATION) and append the
--scale-from-zero-provider/--scale-from-zero-creds flags to EXTRA_ARGS as
before.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: c53907b8-3f56-4d65-a9ec-de99fc19a924
⛔ Files ignored due to path filters (3)
- `ci-operator/jobs/openshift/hypershift/openshift-hypershift-main-periodics.yaml` is excluded by `!ci-operator/jobs/**`
- `ci-operator/jobs/openshift/hypershift/openshift-hypershift-main-postsubmits.yaml` is excluded by `!ci-operator/jobs/**`
- `ci-operator/jobs/openshift/hypershift/openshift-hypershift-main-presubmits.yaml` is excluded by `!ci-operator/jobs/**`
📒 Files selected for processing (2)
- `ci-operator/config/openshift/hypershift/openshift-hypershift-main.yaml`
- `ci-operator/step-registry/hypershift/install/hypershift-install-commands.sh`
SUBSCRIPTION_ID=$(az account show --query id -o tsv 2>/dev/null || true)
if [ -n "${SUBSCRIPTION_ID}" ]; then
  SCALE_FROM_ZERO_CREDS=$(mktemp)
  jq --arg sub "${SUBSCRIPTION_ID}" --arg loc "${HYPERSHIFT_AZURE_LOCATION:-centralus}" \
    '. + {subscriptionId: $sub, location: $loc}' \
    /etc/hypershift-ci-jobs-self-managed-azure/credentials.json > "${SCALE_FROM_ZERO_CREDS}"
  EXTRA_ARGS="${EXTRA_ARGS} --scale-from-zero-provider azure --scale-from-zero-creds ${SCALE_FROM_ZERO_CREDS}"
fi
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="$(fd -a 'hypershift-install-commands.sh$' ci-operator/step-registry | head -n1)"
echo "Inspecting ${file}"
nl -ba "${file}" | sed -n '121,135p'
rg -n -C2 'az account show|\\|\\| true|scale-from-zero-provider azure' "${file}"
Repository: openshift/release
Length of output: 228
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="ci-operator/step-registry/hypershift/install/hypershift-install-commands.sh"
if [ ! -f "$file" ]; then
  echo "Missing $file" >&2
  exit 1
fi
echo "== Lines 121-135 =="
sed -n '121,135p' "$file" | cat -n
echo "== Relevant rg matches =="
rg -n -C2 'az account show|^\s*.*\|\|\s*true\s*$|scale-from-zero-provider azure' "$file"
Repository: openshift/release
Length of output: 1631
Fail fast (or explicitly error) when Azure subscription ID lookup fails for scale-from-zero
`az account show ... 2>/dev/null || true` suppresses subscription lookup errors, and the SFZ flags (`--scale-from-zero-provider azure --scale-from-zero-creds ...`) are only appended when `SUBSCRIPTION_ID` is non-empty, so a failure silently disables scale-from-zero even when `credentials.json` exists. Make this failure explicit (e.g., remove the `|| true`/stderr suppression and/or emit an error + exit when SFZ is expected but the subscription ID can't be resolved).
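One possible shape for the fail-fast behavior described above is sketched below. It is not the change that landed in the PR. The helper names `check_sfz_prereqs` and `configure_sfz` and the return-code convention are hypothetical, introduced only for illustration. The credentials path, the `az account show --query id -o tsv` call, the `jq` merge, and the two `--scale-from-zero-*` flags are taken from the snippet under review. The sketch assumes that the presence of `credentials.json` is what signals that scale-from-zero is expected.

```shell
#!/bin/bash
# Sketch only: check_sfz_prereqs and configure_sfz are hypothetical helpers,
# not functions from hypershift-install-commands.sh.

# Decide whether scale-from-zero can be configured.
# Returns 0 = configure it, 2 = skip (no credentials file), 1 = hard error.
check_sfz_prereqs() {
  local creds_file="$1" subscription_id="$2"
  if [ ! -f "${creds_file}" ]; then
    return 2
  fi
  if [ -z "${subscription_id}" ]; then
    echo "ERROR: ${creds_file} exists but the Azure subscription ID could not be resolved" >&2
    return 1
  fi
  return 0
}

# Build the scale-from-zero credentials file and extend EXTRA_ARGS.
# Defined here but not invoked; the install script would call it.
configure_sfz() {
  local creds_file="/etc/hypershift-ci-jobs-self-managed-azure/credentials.json"
  local subscription_id="" rc=0
  if [ -f "${creds_file}" ]; then
    # No "2>/dev/null || true": az errors stay visible in the job log,
    # and a failed lookup leaves the ID empty so the check below rejects it.
    subscription_id="$(az account show --query id -o tsv)" || subscription_id=""
  fi
  check_sfz_prereqs "${creds_file}" "${subscription_id}" || rc=$?
  case "${rc}" in
    0) ;;            # proceed with configuration
    2) return 0 ;;   # nothing to configure for this job
    *) return 1 ;;   # surface the failure to the caller
  esac
  SCALE_FROM_ZERO_CREDS="$(mktemp)"
  jq --arg sub "${subscription_id}" --arg loc "${HYPERSHIFT_AZURE_LOCATION:-centralus}" \
    '. + {subscriptionId: $sub, location: $loc}' \
    "${creds_file}" > "${SCALE_FROM_ZERO_CREDS}" || return 1
  EXTRA_ARGS="${EXTRA_ARGS:-} --scale-from-zero-provider azure --scale-from-zero-creds ${SCALE_FROM_ZERO_CREDS}"
}
```

In the install script this would be invoked as `configure_sfz || exit 1`, so a missing subscription ID stops the job instead of quietly running the install without the scale-from-zero flags.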
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: jhjaggars
@jhjaggars: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
@openshift-ci[bot]: your |
@jhjaggars: job(s): e2e-azure-self-managed either don't exist or were not found to be affected, and cannot be rehearsed |
Summary
- Add `--scale-from-zero-provider azure` to the hypershift install step for Azure self-managed clusters
- Add `TestNodePoolAutoscalingScaleFromZero` to the `e2e-azure-self-managed` CI test regex
- Temporarily override `hypershift-operator` and `hypershift-tests` images with pre-built images from `quay.io/jjaggars/` to validate the full scale-from-zero flow via pj-rehearse
Details
The install script constructs a scale-from-zero credentials file by merging the existing Azure SP credentials with `subscriptionId` (from `az account show`) and `location` (from the `HYPERSHIFT_AZURE_LOCATION` env var).
The image overrides are temporary for rehearsal testing only. Before merge, the `images` section will be reverted to use `Dockerfile` and `Dockerfile.e2e` from the hypershift repo.
Dependencies
Companion PR: openshift/hypershift#8337
Test plan
- `/pj-rehearse e2e-azure-self-managed`
- `TestNodePoolAutoscalingScaleFromZero` passes on Azure
🤖 Generated with Claude Code
Summary by CodeRabbit
This PR enables Azure scale-from-zero autoscaling support in the CI infrastructure for the hypershift project. It makes two key sets of changes:
CI Configuration Changes:
The PR modifies the hypershift CI pipeline for Azure self-managed clusters to:
- Use pre-built images (`quay.io/jjaggars/hypershift-operator:azure-sfz` and `quay.io/jjaggars/hypershift-tests:azure-sfz`) instead of building from source, allowing validation of scale-from-zero functionality through pj-rehearse before the upstream hypershift code is available
- Add the `TestNodePoolAutoscalingScaleFromZero` test case to the e2e-azure-self-managed test regex, ensuring this test runs during CI
Installation Script Enhancement:
The hypershift installation script now conditionally enables scale-from-zero for Azure self-managed clusters by:
- Adding `--scale-from-zero-provider azure` and corresponding credentials to the hypershift install command
These changes allow the CI infrastructure to validate the scale-from-zero autoscaling feature for Azure clusters. The temporary image overrides are intended to be reverted before merge, once the companion PR (openshift/hypershift#8337) is integrated.