CNTRLPLANE-2209: Add self-managed Azure HCP enhancement proposal #1904
@bryan-cox: This pull request references CNTRLPLANE-2209, which is a valid Jira issue. Warning: the referenced Jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.21.0" version, but no target version was set.
> deployments. While SNO could theoretically serve as a management cluster for
> HyperShift, this is not a target use case for self-managed Azure.
>
> ### Implementation Details/Notes/Constraints
Here I would expect detailed information on which components need to be updated for self-managed HCP on Azure, and how. For me personally, the most important information would be that everyone in a hosted control plane who needs to talk to the Azure cloud must use a token minter sidecar because of XYZ, and how CPO tells its operands to do so.
You (HyperShift team) need to do a lot of work to support the new `hypershift install`, `hypershift create cluster azure`, etc., but there are no details about it.
The "Component Changes" section now provides the details you're looking for. The key points:
- CPO tells operands to use the token minter sidecar because hosted control plane components run in the management cluster but need to authenticate to Azure using the guest cluster's service accounts. The token-minter sidecar mints tokens from the guest cluster's API server (via kubeconfig) and writes them to a shared volume that the main container reads.
- Why token minter is needed: Standard workload identity relies on projected service account tokens, but HCP components are in a different cluster than where their service accounts are defined. The token-minter bridges this gap.
- HyperShift CLI work: The `hypershift create cluster azure` command already supports user-provided infrastructure. The `hypershift create infra azure` and `hypershift create credentials azure` commands are planned for Tech Preview to simplify setup.

The "Hosted Control Plane Components Requiring Token Minter Sidecar" table and "CSI Driver Token Minter Configuration" sections detail the specific components and how CPO configures them.
AI-assisted response via Claude Code
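To make the handoff described in this reply concrete, here is a minimal Python sketch of a sidecar publishing a token to a shared volume and a main container reading it. It is not the HyperShift token-minter implementation: `mint_token` is a hypothetical callable standing in for the request to the guest cluster's API server, and the function names and token path are assumptions made for illustration.

```python
import os
import tempfile
from typing import Callable


def write_token_atomically(token: str, token_path: str) -> None:
    """Write the token to a temp file in the same directory, then rename it.

    os.replace is atomic when source and destination are on the same
    filesystem, so a reader never observes a half-written token.
    """
    directory = os.path.dirname(token_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".token-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(token)
        os.replace(tmp_path, token_path)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def mint_once(mint_token: Callable[[], str], token_path: str) -> None:
    """One iteration of the sidecar loop: mint a token and publish it.

    mint_token is a hypothetical stand-in for the call to the guest
    cluster's API server.
    """
    write_token_atomically(mint_token(), token_path)


def read_token(token_path: str) -> str:
    """What the main container does: read the current token from the shared volume."""
    with open(token_path) as f:
        return f.read().strip()
```

A real sidecar would call something like `mint_once` on a timer, refreshing the file before the previous token expires. The rename step matters because the main container may read the file at any moment.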
> ## Proposal
>
> Self-managed Azure HyperShift extends the existing HyperShift architecture to
> support Azure as a platform where users manage all infrastructure themselves.
> The implementation leverages existing HyperShift patterns while adding
> Azure-specific infrastructure provisioning guidance and workload identity
> integration.
>
> The deployment consists of three main phases:
>
> 1. **Azure Workload Identity Setup**: Create managed identities for OpenShift
>    components, configure the OIDC issuer, and establish federated credentials.
>
> 2. **Management Cluster Setup**: Install the HyperShift operator on an existing
>    OpenShift cluster in Azure, optionally configure External DNS, and prepare
>    the cluster to host control planes.
>
> 3. **Hosted Cluster Creation**: Provision Azure infrastructure for the hosted
>    cluster, deploy the control plane, create worker node VMs, and integrate
>    workload identities.
Copying from the template:
Enumerate all of the proposed changes at a high level, including all of the components that need to be modified and how they will be different. Include the reason for each choice in the design and implementation that is proposed here.
So, what components need to be modified?
I can see there is a list of components below now.
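Phase 1 of the quoted proposal mentions establishing federated credentials. As a rough illustration of what one such credential ties together, the Python sketch below builds the body of a federated identity credential that trusts a Kubernetes service account. The `issuer`, `subject`, and `audiences` fields and the `api://AzureADTokenExchange` audience follow Azure's federated identity credential model; the helper names are hypothetical, and the issuer URL, namespace, and service account name in any call are placeholders rather than values from the proposal.

```python
from typing import Any, Dict

# Audience that Azure expects on tokens presented for workload identity federation.
TOKEN_EXCHANGE_AUDIENCE = "api://AzureADTokenExchange"


def service_account_subject(namespace: str, name: str) -> str:
    """Subject claim Kubernetes places in a service account token."""
    return f"system:serviceaccount:{namespace}:{name}"


def federated_credential(
    name: str, issuer_url: str, namespace: str, service_account: str
) -> Dict[str, Any]:
    """Build the body of a federated identity credential (illustrative helper).

    issuer_url must match the OIDC issuer that signs the service account
    tokens; the values passed in here are supplied by the caller.
    """
    return {
        "name": name,
        "issuer": issuer_url,
        "subject": service_account_subject(namespace, service_account),
        "audiences": [TOKEN_EXCHANGE_AUDIENCE],
    }
```

Azure accepts a token for the identity only when its issuer, subject, and audience all match a credential like this one, which is why each component's service account needs its own entry.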
@jsafrane Thank you for the feedback on component details. I've updated the proposal to address your comments:
Re: Comment at line 134 (Component list)
Re: Comment at line 284 (Token minter sidecar details)
The enhancement now documents how CPO configures operands to use workload identity federation for Azure API access.
AI-assisted response via Claude Code
> - `hypershift create infra azure`: New infrastructure provisioning command
>   for creating Azure resources (VNets, subnets, NSGs, storage accounts).
> - `hypershift create credentials azure`: New command for generating workload
>   identity credentials and federated credential configurations.
I haven't seen these in the docs yet. Is this supposed to simplify the OIDC and identity setup? Is this aimed at Tech Preview?
The hypershift create infra azure and hypershift create credentials azure commands are planned for Tech Preview to simplify the OIDC and workload identity setup. For Dev Preview, users will follow the manual setup documented at https://hypershift.pages.dev/how-to/azure/self-managed-azure-index/. The goal for Tech Preview is to provide a more streamlined experience similar to what we offer for AWS.
AI-assisted response via Claude Code
> 1. Projects a Kubernetes service account token to a well-known file path
> 2. Configures Azure SDK environment variables for workload identity
>    authentication
> 3. Enables components to obtain Azure access tokens without long-lived
Just leaving a note here: Azure CSI drivers still don't use a "real" short-term credential setup (SMB), and we don't have an ETA yet: kubernetes-sigs/azurefile-csi-driver#1737 (comment)
But it should not block this enhancement, and I'd expect this to be a rather hidden fix without behavior change, though it is something we might want to focus on when testing.
Thanks for the note. I've added a callout in the CSI Driver section about this upstream limitation (kubernetes-sigs/azurefile-csi-driver#1737). This is good to keep in mind for testing but as you noted, shouldn't block the enhancement and will be transparent to users when upstream support is available.
AI-assisted response via Claude Code
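For context on the three steps quoted at the start of this thread: the projected token and the `AZURE_*` variables are consumed by presenting the token to Microsoft Entra ID as a client assertion in an OAuth 2.0 client-credentials request. The Python sketch below only builds that request and sends nothing. It assumes the public-cloud authority host `login.microsoftonline.com`; the function name and the scope passed in are illustrative, and in practice the Azure SDK performs this exchange itself.

```python
from typing import Dict, Mapping, Tuple

# Assertion type defined by RFC 7523 for JWT bearer client authentication.
JWT_BEARER_ASSERTION_TYPE = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def build_token_request(env: Mapping[str, str], scope: str) -> Tuple[str, Dict[str, str]]:
    """Build the token endpoint URL and form body for a federated token exchange.

    env is expected to carry AZURE_CLIENT_ID, AZURE_TENANT_ID and
    AZURE_FEDERATED_TOKEN_FILE, as a workload identity setup provides them.
    """
    # Read the service account token from the well-known file path.
    with open(env["AZURE_FEDERATED_TOKEN_FILE"]) as f:
        assertion = f.read().strip()

    # Assumption: public Azure cloud authority host.
    url = f"https://login.microsoftonline.com/{env['AZURE_TENANT_ID']}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": env["AZURE_CLIENT_ID"],
        "client_assertion_type": JWT_BEARER_ASSERTION_TYPE,
        "client_assertion": assertion,
        "scope": scope,
    }
    return url, form
```

Because the token file is re-read on every call, a component picks up refreshed tokens without a restart, and no long-lived secret appears anywhere in the request.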
> - A projected service account token volume
> - Environment variables: `AZURE_CLIENT_ID`, `AZURE_TENANT_ID`,
>   `AZURE_FEDERATED_TOKEN_FILE`
> - The appropriate managed identity client ID for its Azure RBAC permissions
Suggested change, from:
> - The appropriate managed identity client ID for its Azure RBAC permissions

to:
> - The appropriate workload identity client ID for its Azure RBAC permissions
Done. Changed "managed identity" to "workload identity" for consistency.
AI-assisted response via Claude Code
> | Environment Variable | Value |
> |---------------------|-------|
> | `AZURE_CLIENT_ID` | Managed identity client ID from `HostedCluster.Spec.Platform.Azure.WorkloadIdentities.<component>.ClientID` |
Suggested change, from:
> | `AZURE_CLIENT_ID` | Managed identity client ID from `HostedCluster.Spec.Platform.Azure.WorkloadIdentities.<component>.ClientID` |

to:
> | `AZURE_CLIENT_ID` | Workload identity client ID from `HostedCluster.Spec.Platform.Azure.WorkloadIdentities.<component>.ClientID` |
Done. Changed "Managed identity" to "Workload identity" for consistency.
AI-assisted response via Claude Code
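The mapping in the table row discussed in this thread can be sketched as a small lookup. This is an illustrative Python helper, not CPO code: the dictionary mirrors the field path `HostedCluster.Spec.Platform.Azure.WorkloadIdentities.<component>.ClientID` using assumed lowerCamelCase keys, and the tenant ID and token file path are passed in by the caller because their sources are not shown in the quoted excerpt.

```python
from typing import Any, Dict, Mapping


def workload_identity_env(
    hosted_cluster: Mapping[str, Any],
    component: str,
    tenant_id: str,
    token_file: str,
) -> Dict[str, str]:
    """Derive the Azure SDK environment variables for one component.

    The key names under hosted_cluster are an assumed serialization of
    HostedCluster.Spec.Platform.Azure.WorkloadIdentities.
    """
    identities = hosted_cluster["spec"]["platform"]["azure"]["workloadIdentities"]
    if component not in identities:
        raise KeyError(f"no workload identity configured for component {component!r}")
    return {
        "AZURE_CLIENT_ID": identities[component]["clientID"],
        "AZURE_TENANT_ID": tenant_id,
        "AZURE_FEDERATED_TOKEN_FILE": token_file,
    }
```

Keeping one client ID per component means each component authenticates as its own identity and only receives the Azure RBAC permissions granted to that identity.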
> - Creates storage accounts for OIDC and image registry
> - Creates managed identities for OpenShift components
>
> 2. The platform engineer installs the OpenShift management cluster on Azure
It would be nice to have a statement on what types of management clusters we expect to support. So far we've been testing with standalone Azure OpenShift without Workload Identity. Is that the only configuration we expect to support?
Done. Added a statement clarifying the supported management cluster configuration. For Dev Preview, the supported configuration is a standalone OpenShift cluster on Azure with Workload Identity (backed by federated managed identities).
AI-assisted response via Claude Code
Correction to my previous reply: The supported management cluster configuration is a standalone OpenShift cluster, which can run on Azure or AWS. Updated the document to reflect this.
AI-assisted response via Claude Code
> 3. **Workload Identity Integration**: Azure Workload Identity Federation is used
>    for secure authentication, requiring:
>    - OIDC issuer configuration
>    - Managed identities for each OpenShift component
Suggested change, from:
> - Managed identities for each OpenShift component

to:
> - Workload identities for each OpenShift component
Done. Changed "Managed identities" to "Workload identities" for consistency.
AI-assisted response via Claude Code
> - Azure subscription for CI/CD testing with appropriate quotas
> - Test Azure infrastructure (VNets, storage accounts, DNS zones) for e2e tests
> - Integration with existing HyperShift CI infrastructure
Azure Graph API access is required.
Done. Added Azure Graph API access requirement to the Infrastructure Needed section.
AI-assisted response via Claude Code
Define design for running HyperShift hosted control planes on self-managed Azure infrastructure. Covers deployment workflow, infrastructure prerequisites, and workload identity integration. Fixes: https://issues.redhat.com/browse/CNTRLPLANE-2209 Signed-off-by: Bryan Cox <brcox@redhat.com>