This repository has been archived by the owner on Oct 24, 2023. It is now read-only.
Upgrade - 1.15 to 1.16 upgrade fails on disconnected because the value of --pod-infra-container-image is not updated #3686
Labels
bug
Describe the bug
Seen on Azure Stack disconnected environments with no outbound connectivity: upgrading a 1.15 cluster (deployed with AKS Engine 0.51) to 1.16 (with AKS Engine 0.54) fails, with the master node going into the NotReady state. On further investigation of the affected node, I saw kubelet trying to use version 1.2.0 of the pause image, which is not included in VHD version 2020.07.24. Because there is no outbound connectivity, kubelet goes into a loop trying to download the required image.
The pause image included in VHD 2020.07.24 is 1.4.0. I would have expected the upgrade process to change --pod-infra-container-image from 1.2.0 to 1.4.0, since 1.2.0 is not present on the VHD.
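The mismatch described above can be checked mechanically. The following is a minimal sketch, not AKS Engine code: it assumes the kubelet arguments are available as a single string and that the list of images cached on the node has already been collected by some other means. The registry name `registry.example` and the helper names are hypothetical.

```python
import shlex

FLAG = "--pod-infra-container-image"


def pod_infra_image(kubelet_args):
    """Return the value of --pod-infra-container-image, or None if the flag is absent."""
    tokens = shlex.split(kubelet_args)
    for i, tok in enumerate(tokens):
        # Form 1: --flag=value
        if tok.startswith(FLAG + "="):
            return tok.split("=", 1)[1]
        # Form 2: --flag value
        if tok == FLAG and i + 1 < len(tokens):
            return tokens[i + 1]
    return None


def is_image_cached(image, cached_images):
    """True if the exact image reference is among the images present on the node."""
    return image in set(cached_images)


# Hypothetical inputs mirroring the report: kubelet asks for pause 1.2.0,
# but only pause 1.4.0 is present on the VHD.
kubelet_args = "--v=2 --pod-infra-container-image=registry.example/pause:1.2.0"
cached = ["registry.example/pause:1.4.0"]

image = pod_infra_image(kubelet_args)
missing = image is not None and not is_image_cached(image, cached)
```

On a disconnected node, `missing` being true means kubelet cannot obtain the image it was told to use, which matches the download loop seen here.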
Steps To Reproduce
Expected behavior
The cluster should upgrade successfully, either because the upgrade updates --pod-infra-container-image to the version included in the new VHD, or because the new VHD includes the image version used by the source cluster.
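The first option (pointing the flag at the image shipped on the new VHD) amounts to a rewrite of one flag value. This is only an illustration of that idea under the assumption that the kubelet arguments are held as a plain string. It does not reflect how AKS Engine stores or applies kubelet configuration, and `set_pod_infra_image` and `registry.example` are hypothetical names.

```python
import re

# Matches the flag in either "--flag=value" or "--flag value" form.
_FLAG_RE = re.compile(r"(--pod-infra-container-image[= ])(\S+)")


def set_pod_infra_image(kubelet_args, new_image):
    """Return kubelet_args with the pod infra image replaced by new_image.

    If the flag is not present, the arguments are returned unchanged.
    """
    # A function replacement avoids any escape-sequence handling in new_image.
    return _FLAG_RE.sub(lambda m: m.group(1) + new_image, kubelet_args)


# Hypothetical values mirroring the report (1.2.0 on the source cluster, 1.4.0 on the VHD).
old_args = "--v=2 --pod-infra-container-image=registry.example/pause:1.2.0"
new_args = set_pod_infra_image(old_args, "registry.example/pause:1.4.0")
```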
AKS Engine version
0.54
Kubernetes version
1.16.13
Additional context
Possibly related to the image problem: in this scenario we saw an earlier failure in which deletion of the addon manager pods got stuck during the upgrade. The repro above was done using an AKS Engine binary that includes the fix from the following PR: #3685