
[BUG][CAPR] rancher-provisioning-capi-patch-sa job failing due to lack of exclusion from PSA enforcement #42719

Closed
thaneunsoo opened this issue Sep 8, 2023 · 7 comments

@thaneunsoo
Contributor

Rancher Server Setup

  • Rancher version: v2.7-head 787c056
  • Installation option (Docker install/Helm Chart): Helm
    • If Helm Chart, Kubernetes Cluster and version (RKE1, RKE2, k3s, EKS, etc):

Information about the Cluster

  • Kubernetes version: v1.26.7+rke2r1
  • Cluster Type (Local/Downstream):
    • If downstream, what type of cluster? (Custom/Imported or specify provider for Hosted/Infrastructure Provider):
      AWS node driver

User Information

  • What is the role of the user logged in? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom)
    • If custom, define the set of permissions: Admin

Describe the bug

The downstream cluster is unable to provision and is stuck at waiting for viable init node. I don't see the machines getting created in the AWS console, and the latest log message is Creating server [fleet-default/auto-aws-kkswl-pool0-01d54a4c-4kr6x] of kind (Amazonec2Machine) for machine auto-aws-kkswl-pool0-74f9b78f74xcl6q9-fjnch in infrastructure provider

To Reproduce

  1. Provision RKE2 AWS node driver cluster

Result
The cluster gets stuck while provisioning with the status waiting for viable init node.
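
A quick way to check whether the PodSecurity admission controller is rejecting the patch job's pods (a sketch only; the namespace is assumed to be the CAPI chart's release namespace discussed later in this thread):

```
# A PSA rejection shows up as a FailedCreate event on the job with a
# "violates PodSecurity" message.
kubectl -n cattle-provisioning-capi-system describe job rancher-provisioning-capi-patch-sa
kubectl -n cattle-provisioning-capi-system get events --sort-by=.lastTimestamp
```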

Expected Result

The cluster provisions successfully.

Screenshots

[screenshot]
@thaneunsoo added the kind/bug and status/release-blocker labels on Sep 8, 2023
@thaneunsoo added this to the 2024-Q1-v2.7x milestone on Sep 8, 2023
@slickwarren reopened this on Sep 9, 2023
@slickwarren
Contributor

This is not resolved for us.
We looked into our automation and nothing looks out of the ordinary. Here's what we know:

  • this only fails when the local cluster is RKE1 (RKE2 on the same 1.26.7 Kubernetes version deploys just fine)
  • if you manually set the rancher-provisioning-capi-patch-sa job's namespace and the cattle-fleet-local-system namespace to enforce PSA at the privileged level, both the rancher-provisioning-capi-patch-sa job and the fleet-agent job succeed (see the sketch after this list)
  • without doing this, the cluster never recovers and is basically unusable
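
A minimal sketch of that manual workaround with kubectl, assuming the patch job's namespace is cattle-provisioning-capi-system (the CAPI chart's release namespace per the comment below):

```
# Set the PSA enforce label to "privileged" on both namespaces so their
# jobs' pods are no longer rejected by the PodSecurity admission controller.
kubectl label namespace cattle-provisioning-capi-system \
  pod-security.kubernetes.io/enforce=privileged --overwrite
kubectl label namespace cattle-fleet-local-system \
  pod-security.kubernetes.io/enforce=privileged --overwrite
```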

@slickwarren added the [zube]: To Triage, team/hostbusters, and kind/bug-qa labels and removed the kind/bug label on Sep 9, 2023
@slickwarren
Contributor

slickwarren commented Sep 9, 2023

Tested versions (all using RKE v1.4.8):
Tim: v2.7.7-rc4
Caleb: v2.7-head and v2.8-head

@aiyengar2
Contributor

aiyengar2 commented Sep 11, 2023

I would suspect that the issue here is that the changes made to introduce the new CAPI chart were not coordinated with updates to the FeatureAppNS list, which should track all feature chart / system namespaces so that PSA enforcement is excluded from them.

cattle-fleet-local-system and cattle-provisioning-capi-system (the CAPI provisioning namespace that is set as the chart's release namespace) do not appear to be in this list, so the fix should simply be to add those namespaces to the list.
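
For illustration, the fix would look roughly like the snippet below, assuming FeatureAppNS is a plain []string in Rancher's Go source (the surrounding entries and file layout are made up; only the two appended namespaces come from this issue):

```go
package main

import "fmt"

// FeatureAppNS stands in for Rancher's list of feature chart / system
// namespaces that are excluded from PSA enforcement.
var FeatureAppNS = []string{
	"cattle-system", // illustrative existing entry
	// ...other existing feature chart / system namespaces...
	"cattle-fleet-local-system",       // missing: fleet-agent job namespace
	"cattle-provisioning-capi-system", // missing: CAPI chart release namespace
}

func main() {
	fmt.Println(FeatureAppNS)
}
```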

@Oats87 changed the title from [BUG][RKE2] Downstream cluster is stuck on waiting for viable init node to [BUG][CAPR] rancher-provisioning-capi-patch-sa job failing due to lack of exclusion from PSA enforcement on Sep 11, 2023
@Oats87 self-assigned this on Sep 11, 2023
@zube bot removed the [zube]: To Triage label on Sep 11, 2023
@slickwarren
Contributor

slickwarren commented Sep 11, 2023

Using these namespaces in the pod security configuration, I was able to resolve the issue for Fleet. It appears that cattle-provisioning-capi-system is missing from this list, though.
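
For reference, a sketch of what that pod security configuration looks like as a stock PodSecurity AdmissionConfiguration; the defaults and the other exempted namespaces are illustrative, the point is the exemptions.namespaces entries:

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "restricted"    # cluster-wide default level
      enforce-version: "latest"
    exemptions:
      namespaces:              # excluded from PSA enforcement
      - cattle-fleet-local-system
      - cattle-provisioning-capi-system
```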

@snasovich
Collaborator

/forwardport v2.8.0

@thaneunsoo
Contributor Author

Test Environment:

Rancher version: v2.7-head 1d25044
Rancher cluster type: HA
Docker version: 20.10

Downstream cluster type: RKE2 node driver cluster


Testing:

Tested this issue with the following steps:

  1. Provision RKE2 AWS node driver cluster

Result
The job no longer fails but the cluster still isn't able to come up.
[screenshot]

@Oats87 Should I close this ticket now that the job isn't failing and open a different issue? The cluster moved to deleting after about 5 minutes and is now just stuck in that state.
[screenshots]

@thaneunsoo
Contributor Author

My bad @Oats87, this was an issue with our Jenkins job, which was fixed in #42743.

Closing this issue as fixed.
