Error: failed to list objects for the "infrastructure.cluster.x-k8s.io/v1alpha4, Kind=AWSClusterControllerIdentity" during upgrade to CAPI 1.0 #2955

wmgroot · 2021-11-15T23:14:27Z

/kind bug

What steps did you take and what happened:
I am trying to upgrade a cluster from CAPI 0.3 to CAPI 1.0.

I've successfully updated our test cluster from 0.3 to 0.4 by following these steps:

Download clusterctl 0.4.4
Run clusterctl upgrade plan and clusterctl upgrade apply
Ensure all CRDs are now using alpha4 instead of alpha3
Verify by rolling all nodes in the cluster with an AWSMachineTemplate update (succeeded)

clusterctl-0.4.4 upgrade apply --contract v1alpha4
Checking cert-manager version...
Deleting cert-manager Version="v1.1.0"
Installing cert-manager Version="v1.5.3"
Waiting for cert-manager to be available...
Performing upgrade...
Scaling down Provider="cluster-api" Version="v0.3.22" Namespace="capi-system"
Scaling down Provider="bootstrap-kubeadm" Version="v0.3.22" Namespace="capi-kubeadm-bootstrap-system"
Scaling down Provider="control-plane-kubeadm" Version="v0.3.22" Namespace="capi-kubeadm-control-plane-system"
Scaling down Provider="infrastructure-aws" Version="v0.6.7" Namespace="capa-system"
Deleting Provider="cluster-api" Version="v0.3.22" Namespace="capi-system"
Installing Provider="cluster-api" Version="v0.4.4" TargetNamespace="capi-system"
Deleting Provider="bootstrap-kubeadm" Version="v0.3.22" Namespace="capi-kubeadm-bootstrap-system"
Installing Provider="bootstrap-kubeadm" Version="v0.4.4" TargetNamespace="capi-kubeadm-bootstrap-system"
Deleting Provider="control-plane-kubeadm" Version="v0.3.22" Namespace="capi-kubeadm-control-plane-system"
Installing Provider="control-plane-kubeadm" Version="v0.4.4" TargetNamespace="capi-kubeadm-control-plane-system"
Deleting Provider="infrastructure-aws" Version="v0.6.7" Namespace="capa-system"
Installing Provider="infrastructure-aws" Version="v0.7.1" TargetNamespace="capa-system"

From there I performed the same steps using clusterctl 1.0.1, expecting to see beta1 replace alpha4.
However, I hit the following error while upgrading the AWS infrastructure provider:

clusterctl-1.0.1 upgrade apply --contract v1beta1
Checking cert-manager version...
Cert-manager is already up to date
Performing upgrade...
Scaling down Provider="cluster-api" Version="v0.4.4" Namespace="capi-system"
Scaling down Provider="bootstrap-kubeadm" Version="v0.4.4" Namespace="capi-kubeadm-bootstrap-system"
Scaling down Provider="control-plane-kubeadm" Version="v0.4.4" Namespace="capi-kubeadm-control-plane-system"
Scaling down Provider="infrastructure-aws" Version="v0.7.1" Namespace="capa-system"
Deleting Provider="cluster-api" Version="v0.4.4" Namespace="capi-system"
Error: failed to list objects for the "infrastructure.cluster.x-k8s.io/v1alpha4, Kind=AWSClusterControllerIdentity" GroupVersionKind: conversion webhook for infrastructure.cluster.x-k8s.io/v1alpha3, Kind=AWSClusterControllerIdentity failed: Post "https://capa-webhook-service.capa-system.svc:443/convert?timeout=30s": dial tcp 44.145.89.35:443: connect: connection refused

What did you expect to happen:
I expected an upgrade from 0.4.4 to 1.0.1 to apply cleanly.

Anything else you would like to add:
I did not do anything to clean up the old alpha3 CRDs in this cluster before attempting the upgrade to 1.0.
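
One way to see whether v1alpha3 is still relevant to the API server is to inspect the CRD's `status.storedVersions` field. The following is only a minimal sketch: the helper name `has_stored_version` is hypothetical, and the CRD name in the usage comment is an assumption based on the usual plural form of the kind named in the error.

```shell
# has_stored_version VERSION
# Reads a space-separated list of API versions on stdin and succeeds
# (exit 0) only if VERSION appears as a whole entry in that list.
has_stored_version() {
  tr -s ' ' '\n' | grep -qxF -- "$1"
}

# Example against a live management cluster (assumed CRD name; needs kubectl access):
#   kubectl get crd awsclustercontrolleridentities.infrastructure.cluster.x-k8s.io \
#     -o jsonpath='{.status.storedVersions[*]}' | has_stored_version v1alpha3 \
#     && echo "v1alpha3 is still listed as a stored version"
```

If v1alpha3 is still listed, reads of those objects may go through the conversion webhook, which would match the failing call in the error above.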

Environment:

Cluster-api-provider-aws version: 0.7.1
Kubernetes version: (use kubectl version): 1.21.5
OS (e.g. from /etc/os-release): ubuntu capi AMI

k8s-ci-robot · 2021-11-15T23:14:34Z

@wmgroot: This issue is currently awaiting triage.

If CAPA/CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

sedefsavas · 2021-11-15T23:17:35Z

@randomvariable any ideas why this error shows up:
Error: failed to list objects for the "infrastructure.cluster.x-k8s.io/v1alpha4, Kind=AWSClusterControllerIdentity" GroupVersionKind: conversion webhook for infrastructure.cluster.x-k8s.io/v1alpha3, Kind=AWSClusterControllerIdentity failed: Post "https://capa-webhook-service.capa-system.svc:443/convert?timeout=30s": dial tcp 44.145.89.35:443: connect: connection refused

RBAC allows listing AWSClusterControllerIdentity in those versions.

sedefsavas · 2021-11-16T17:02:20Z

A fix for this on the clusterctl side will be in the next cluster-api release: kubernetes-sigs/cluster-api#5681

wmgroot · 2021-11-16T17:18:56Z

When was the capa-webhook-service introduced?

I can see that it does not exist on our clusters running capi 0.3, provider-aws 0.6.7.

kubectl get svc -n capa-system
NAME                                      TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
capa-controller-manager-metrics-service   ClusterIP   44.144.190.132   <none>        8443/TCP   54d

However, it does exist after attempting to run the upgrade from capi 0.4 to 1.0:

kubectl get svc -n capa-system
NAME                                      TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
capa-controller-manager-metrics-service   ClusterIP   44.145.17.252   <none>        8443/TCP   23h
capa-webhook-service                      ClusterIP   44.145.89.35    <none>        443/TCP    23h

I believe the error is a direct result of the Pods for this Service not being available: the connection is refused because no Pod exists to serve the request made to capa-webhook-service. It is not clear to me whether these Pods should be created as part of the upgrade process for 0.4 -> 1.0, or whether they should have existed after upgrading from 0.3 -> 0.4.

kubectl get pod -n capa-system
No resources found in capa-system namespace.
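
The diagnosis above can be checked more directly by asking whether the webhook Service has any backing endpoints. This is a minimal sketch, assuming kubectl access to the management cluster; the helper name `has_backends` is hypothetical, while the Service and namespace names are taken from the output above.

```shell
# has_backends
# Reads the endpoint IPs of a Service on stdin and succeeds (exit 0)
# only if at least one non-blank character is present.
has_backends() {
  grep -q '[^[:space:]]'
}

# Example against a live management cluster (needs kubectl access):
#   kubectl get endpoints capa-webhook-service -n capa-system \
#     -o jsonpath='{.subsets[*].addresses[*].ip}' | has_backends \
#     || echo "no ready Pods behind capa-webhook-service"
```

An empty endpoint list would be consistent with the `connection refused` in the error, since the upgrade log shows the infrastructure-aws provider being scaled down before the failing list call.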

wmgroot · 2021-11-16T17:19:27Z

Just saw your note, thanks for the update.

sedefsavas · 2021-12-06T12:20:22Z

Closing, as this issue should be fixed by kubernetes-sigs/cluster-api#5684.
We have a v1alpha3 --> v1beta1 e2e test passing.

k8s-ci-robot added the kind/bug and needs-priority labels Nov 15, 2021

k8s-ci-robot added the needs-triage label Nov 15, 2021

sedefsavas added the priority/critical-urgent label Nov 15, 2021

k8s-ci-robot removed the needs-priority label Nov 15, 2021

sedefsavas added the triage/accepted label and removed the needs-triage label Nov 15, 2021

sedefsavas added this to the v1.1.0 milestone Nov 15, 2021

sedefsavas modified the milestones: v1.1.0, v1.2.0 Nov 19, 2021

sedefsavas closed this as completed Dec 6, 2021
