aks: support BYO VNet for Automatic Managed System Pool clusters #33259
… crash
Add --system-node-subnet-id, --node-subnet-id, --disable-hosted-system
to 'az aks create'. When the subnet trio (system-node, node, apiserver)
is supplied on --sku automatic, the cluster is created with an MC
hosted_system_profile carrying BYO subnets; the Enabled flag is left
unset so the server decides the default. --disable-hosted-system
deterministically opts an Automatic cluster out of HOBO.
Validate the BYO VNet trio up front:
- Partial trio -> RequiredArgumentMissingError listing missing flags.
- Trio without --sku automatic -> RequiredArgumentMissingError.
- --disable-hosted-system + any subnet flag -> MutuallyExclusiveArgumentError.
- --disable-hosted-system without --sku automatic -> RequiredArgumentMissingError.
Fix 'az aks upgrade' / 'az aks scale' crash on HOBO clusters where
agent_pool_profiles can be None server-side ('NoneType is not iterable').
Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>
Pull request overview
Adds CLI support for BYO VNet on AKS Automatic SKU (HOBO) clusters via new az aks create flags, and hardens az aks upgrade against RP responses where agentPoolProfiles is null.
Changes:
- Introduce `--system-node-subnet-id`/`--sys-node-subnet-id`, `--node-subnet-id`, and `--disable-hosted-system` for `az aks create`, plus validation and request shaping via `hosted_system_profile`.
- Update help/params/validators to expose and validate the new subnet arguments.
- Prevent `az aks upgrade` from crashing when `instance.agent_pool_profiles` is `None`.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `src/azure-cli/azure/cli/command_modules/acs/managed_cluster_decorator.py` | Adds new context getters, validation, and `hosted_system_profile` setup during create flow. |
| `src/azure-cli/azure/cli/command_modules/acs/custom.py` | Adds new `aks_create` parameters and guards `aks_upgrade` iterations over `agent_pool_profiles`. |
| `src/azure-cli/azure/cli/command_modules/acs/_validators.py` | Adds subnet ID validators for the new flags. |
| `src/azure-cli/azure/cli/command_modules/acs/_params.py` | Wires new flags into `aks create` argument registration. |
| `src/azure-cli/azure/cli/command_modules/acs/_help.py` | Documents the new create flags and intended usage. |
* Simplify --system-node-subnet-id registration (drop the --sys-node-subnet-id alias so the linter picks up help correctly).
* Relax _get_apiserver_subnet_id CREATE-time check: don't require --vnet-subnet-id when BYO HOBO subnets are set, since system-node/node subnets replace vnet-subnet-id on --sku automatic.
* Run _validate_byo_hobo_subnets up front in set_up_api_server_access_profile so the targeted "require --sku automatic" error beats the generic --apiserver-subnet-id messaging.
* Also fix aks_scale against HOBO clusters where agent_pool_profiles is None (same crash Qizhe hit with aks_upgrade): guard with `or []` and return a user-friendly error for empty pools.
* Add linter_exclusions entries for the three new parameters (missing_parameter_test_coverage) to keep azdev-linter green without recorded scenario tests at this stage.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>
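
For readers who have not hit the crash, the `or []` guard described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in `custom.py`: `pools_of`, `pick_pool_to_scale`, the error text, and the use of `ValueError` are all hypothetical stand-ins.

```python
from types import SimpleNamespace


def pools_of(instance):
    """Return the cluster's agent pools, treating a server-side null as 'no pools'."""
    # `or []` turns None into an empty list so later loops never iterate over None.
    return instance.agent_pool_profiles or []


def pick_pool_to_scale(instance, nodepool_name=""):
    """Pick the pool to scale, or raise a readable error when none exist."""
    pools = pools_of(instance)
    if not pools:
        # Hypothetical stand-in for the CLI's own error type and message.
        raise ValueError("This cluster has no scalable node pools.")
    for pool in pools:
        if not nodepool_name or pool.name == nodepool_name:
            return pool
    raise ValueError(f'Node pool "{nodepool_name}" was not found.')


# A cluster object shaped like the HOBO case: agent_pool_profiles is None.
hobo_like = SimpleNamespace(agent_pool_profiles=None)
print(pools_of(hobo_like))  # prints []
```

Without the guard, iterating over `instance.agent_pool_profiles` directly raises the `TypeError` behind the 'NoneType is not iterable' message.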
When customers pass --system-node-subnet-id / --node-subnet-id / --apiserver-subnet-id on --sku automatic to bring their own VNet for HOBO, the CLI was producing a payload the RP rejected:

1. apiServerAccessProfile.enableVnetIntegration was not set, so the RP treated the cluster as default-VNet while subnetId was populated and returned ApiserverSubnetConfigError. Auto-wire enable_vnet_integration whenever the BYO HOBO subnet trio is present.
2. hostedSystemProfile.enabled was left unset, so the RP could not distinguish BYO HOBO from default mode. Set enabled=True when the subnet trio is provided.
3. agentPoolProfiles contained the default system pool, which the RP rejected because HOBO manages node pools itself. Clear agent_pool_profiles in BYO HOBO mode, matching the preview path.
4. outbound_type defaulted to managedNATGateway for Automatic SKU, which the RP disallows on BYO VNet. Keep the user's explicit value (or let it default to loadBalancer) when the BYO trio is provided.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>
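
The four payload fixes above can be pictured with a small sketch of the resulting request body. The field names come from this PR's description; the function `shape_byo_create_payload` and the plain-dict representation are assumptions made for illustration, since the real CLI builds SDK model objects in the decorator.

```python
def shape_byo_create_payload(system_node_subnet_id, node_subnet_id,
                             apiserver_subnet_id, outbound_type=None):
    """Build an illustrative (partial) create payload for the BYO VNet case."""
    return {
        "hostedSystemProfile": {
            # Fix 2: set enabled explicitly so the RP can tell BYO from default mode.
            "enabled": True,
            "systemNodeSubnetID": system_node_subnet_id,
            "nodeSubnetID": node_subnet_id,
        },
        "apiServerAccessProfile": {
            # Fix 1: VNet integration must be on whenever subnetId is populated.
            "enableVnetIntegration": True,
            "subnetId": apiserver_subnet_id,
        },
        # Fix 3: no CLI-synthesized default pool; node pools are server-managed.
        "agentPoolProfiles": None,
        "networkProfile": {
            # Fix 4: keep the user's value, otherwise fall back to loadBalancer.
            "outboundType": outbound_type or "loadBalancer",
        },
    }
```

A caller would pass the three subnet resource IDs and, optionally, an explicit outbound type such as `userDefinedRouting`.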
HOBO (Automatic SKU Hosted Overlay System Pool) clusters have agent_pool_profiles=null on the RP side because node pools are server-managed. update_agentpool_profile was raising 'Encounter an unexpected error while getting agent pool profiles...' on any 'az aks update' against a HOBO cluster (including 'az aks update --sku base' for Automatic-to-Base downgrade). Skip that step when hostedSystemProfile.enabled is true.

Also refines the Automatic-SKU outbound-type override: keep the existing 'default to ManagedNATGateway when no user value and no vnet subnet' behavior unchanged; the BYO-HOBO exemption added in the prior commit is already enough.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>
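
The skip condition in this commit amounts to a small predicate on the cluster object. The sketch below assumes the attribute names `hosted_system_profile` and `enabled` (from this PR's text); the helper names are hypothetical and this is not the decorator's actual code.

```python
from types import SimpleNamespace


def is_managed_system_pool(mc):
    """True when the cluster reports hostedSystemProfile.enabled == true."""
    profile = getattr(mc, "hosted_system_profile", None)
    return bool(profile is not None and getattr(profile, "enabled", False))


def should_update_agentpool_profile(mc):
    """Skip the agent pool step for clusters whose pools are server-managed."""
    return not is_managed_system_pool(mc)


# Example: a cluster object shaped like the HOBO case described above.
hobo_like = SimpleNamespace(
    hosted_system_profile=SimpleNamespace(enabled=True),
    agent_pool_profiles=None,
)
print(should_update_agentpool_profile(hobo_like))  # prints False
```

Using `getattr` with defaults keeps the check safe for cluster objects that carry no hosted system profile at all.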
Address Copilot review feedback on PR Azure#33259:
- Clarify _validate_byo_hobo_subnets docstring: BYO VNet HOBO is triggered only by --system-node-subnet-id / --node-subnet-id. --apiserver-subnet-id keeps its existing general-purpose meaning for --enable-apiserver-vnet-integration flows on non-HOBO clusters, so it is deliberately not part of the trigger or the mutual-exclusion set.
- Remove the unused 'any_trio_set' placeholder variable.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>

Fix pylint W0212 (protected-access) reported by CI: the validator is called across classes (AKSManagedClusterCreateDecorator accessing AKSManagedClusterContext), so it should be a public method.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>

Docstring still described the earlier 'enabled left unset' behavior, but the code now sets enabled=True on BYO VNet HOBO trio (required so the RP treats the request as BYO rather than default-VNet mode) and clears agent_pool_profiles because HOBO manages node pools server-side.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>

Per review feedback:
- Remove --disable-hosted-system flag entirely (PM decision).
- Rename user-visible HOBO / Hosted Overlay System Pool terminology to Managed System Pool for Automatic cluster.
- Drop associated getter, validator branch, param, linter exclusion, and related test cases.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>

Rework short/long summaries for --system-node-subnet-id and --node-subnet-id so each flag clearly explains which pool it maps to (Managed System Pool vs user node pools) and states that the full three-subnet trio (including --apiserver-subnet-id) must belong to the same VNet and requires --sku automatic.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>

…eset
- Rewrite --system-node-subnet-id and --node-subnet-id short summaries to follow the 'The ID of a subnet in an existing VNet to be used by ...' style already used for --vnet-subnet-id.
- Rewrite the comment above the BYO-path 'agent_pool_profiles = None' assignment to explain the real reason: on an Automatic cluster with BYO VNet, the RP provisions the system pool from hosted_system_profile, so the CLI-synthesized default agent pool entry conflicts with the BYO trio and must be cleared.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>

Give power users an explicit way to request a Managed System Pool on Automatic SKU clusters, independent of the region-level default toggle.
- `--enable-hosted-system` sets `hosted_system_profile.enabled=True` and clears the CLI-synthesized default agent pool. This avoids the ghost-pool problem on non-BYO Automatic clusters in toggle-ON regions where the RP auto-enables HOBO but the CLI still ships a default pool.
- The BYO VNet subnet trio implies `--enable-hosted-system`, so existing BYO flows keep working unchanged.
- `--enable-hosted-system` is gated to `--sku automatic`.

Signed-off-by: wenhug <50309350+wenhug@users.noreply.github.com>
Summary
Adds core `az aks create` support for Automatic SKU Managed System Pool BYO VNet, plus fixes follow-up AKS commands against clusters whose server-side `agentPoolProfiles` is `null`.

New `az aks create` flags:
- `--system-node-subnet-id`
- `--node-subnet-id`
- `--enable-hosted-system`

Behavior:
- `--enable-hosted-system` is valid only with `--sku automatic`. It sets `hostedSystemProfile.enabled=true` and skips synthesizing the CLI default `agentPoolProfiles`, because the RP provisions the Managed System Pool server-side.
- […] `--system-node-subnet-id`, `--node-subnet-id`, and the existing `--apiserver-subnet-id` are supplied together on `--sku automatic`.
- […] `hostedSystemProfile.systemNodeSubnetID`, `hostedSystemProfile.nodeSubnetID`, and `apiServerAccessProfile.subnetId`; it also sets `apiServerAccessProfile.enableVnetIntegration=true`.
- […] `loadBalancer` outbound instead of the normal Automatic no-VNet `managedNATGateway` default. BYO subnets also satisfy the VNet requirement for `userAssignedNATGateway` and `userDefinedRouting`; `managedNATGateway` is rejected with BYO/custom VNet input.
- `az aks upgrade`, `az aks scale`, and `az aks update` now handle Managed System Pool clusters with `agentPoolProfiles=null`; update also preserves the existing outbound type instead of reapplying the Automatic create default.

Validation:
- Partial trio -> `RequiredArgumentMissingError` listing missing flags.
- Trio without `--sku automatic` -> `RequiredArgumentMissingError`.
- `--enable-hosted-system` without `--sku automatic` -> `RequiredArgumentMissingError`.

Test plan
- `python -m pytest src/azure-cli/azure/cli/command_modules/acs/tests/latest/test_managed_cluster_decorator.py` - 248 passed
- `git diff --check`
- […] `src/azure-cli/azure/cli/command_modules/acs/managed_cluster_decorator.py`
- […] `aks-preview` is not installed; `az aks create -h` shows the new core CLI flags from this PR
- […] `eastus2euap` succeeded with `hostedSystemProfile.enabled=true`, `agentPoolProfiles=null`, `nodeProvisioningProfile.mode=Auto`, and `networkProfile.outboundType=loadBalancer`
- `az aks nodepool list` returned `[]`; `az aks get-credentials` succeeded
- `az aks scale` returned a friendly "no scalable node pools" error instead of crashing
- `az aks upgrade --node-image-only --yes` completed without a `NoneType` crash
- `az aks update --tags ...` was accepted by the RP; the smoke-test tag is visible and `outboundType` remains `loadBalancer` while the RP operation continues
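
For reference, the validation rules listed in the summary can be sketched as a single function. This is an illustration under stated assumptions, not the validator in `managed_cluster_decorator.py`: `validate_byo_subnets` is a hypothetical name, the exception class is a local stand-in for the azure-cli error of the same name, the error messages are invented, and the order of checks is a guess. Following the commit that addressed the Copilot feedback, only `--system-node-subnet-id` / `--node-subnet-id` trigger the trio check.

```python
class RequiredArgumentMissingError(Exception):
    """Local stand-in for the azure-cli error type of the same name."""


def validate_byo_subnets(sku=None, system_node_subnet_id=None,
                         node_subnet_id=None, apiserver_subnet_id=None,
                         enable_hosted_system=False):
    """Raise if the BYO VNet subnet trio or --enable-hosted-system is misused."""
    is_automatic = (sku or "").lower() == "automatic"
    trio = {
        "--system-node-subnet-id": system_node_subnet_id,
        "--node-subnet-id": node_subnet_id,
        "--apiserver-subnet-id": apiserver_subnet_id,
    }
    # --apiserver-subnet-id alone keeps its general-purpose meaning and
    # does not trigger the BYO check.
    if system_node_subnet_id or node_subnet_id:
        missing = [flag for flag, value in trio.items() if not value]
        if missing:
            raise RequiredArgumentMissingError(
                "BYO VNet requires all three subnets; missing: " + ", ".join(missing))
        if not is_automatic:
            raise RequiredArgumentMissingError(
                "The BYO VNet subnet flags require --sku automatic.")
    if enable_hosted_system and not is_automatic:
        raise RequiredArgumentMissingError(
            "--enable-hosted-system requires --sku automatic.")
```

Calling it with only `system_node_subnet_id` set on `sku="automatic"` raises an error naming the two missing flags, while a complete trio on `sku="automatic"` returns without error.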