
[BUG] Etcd restore does not work on an RKE2 cluster #42895

Closed
vivek-shilimkar opened this issue Sep 21, 2023 · 17 comments
Assignees
Labels
  • area/capr/rke2: RKE2 Provisioning issues involving CAPR
  • dependency-rke2: Indicates that the rancher issue has a dependency to an RKE2 issue
  • kind/bug: Issues that are defects reported by users or that we know have reached a real release
  • kind/bug-qa: Issues that have not yet hit a real release. Bugs introduced by a new feature or enhancement
  • QA/XS
  • status/release-blocker
  • team/hostbusters: The team that is responsible for provisioning/managing downstream clusters + K8s version support
  • team/infracloud
  • team/rke2
Milestone

Comments

@vivek-shilimkar
Member

Rancher Server Setup

  • Rancher version: 2.8-head commit id: d101c27
  • Installation option (Docker install/Helm Chart): Docker Install

Information about the Cluster

  • Kubernetes version (RKE2): v1.26.8+rke2r1, upgraded to v1.27.5+rke2r1, then restored to the snapshot taken on v1.26.8+rke2r1
  • Cluster Type (Local/Downstream): AWS Node driver cluster

User Information

  • What is the role of the user logged in? (Admin/Cluster Owner/Cluster Member/Project Owner/Project Member/Custom) Standard

Describe the bug
[BUG] Etcd restore does not work on an RKE2 cluster

To Reproduce

  • Deploy a downstream RKE2 node driver cluster on 1.26 RKE2 version

  • Take an etcd snapshot

  • Upgrade to 1.27 RKE2 version

  • Restore to the previously taken snapshot using the "All" option (cluster config, Kubernetes version, and etcd); a declarative sketch of this step follows the log excerpt below

  • The cluster is stuck in the Updating state with the error: [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for probes: etcd

  • Rancher provisioning logs:

[INFO ] provisioning done
--
4:52:48 pm | [INFO ] configuring bootstrap node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-7rkvw: waiting for plan to be applied
4:52:54 pm | [INFO ] configuring bootstrap node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-7rkvw: waiting for probes: kubelet
4:53:30 pm | [INFO ] configuring bootstrap node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-7rkvw: waiting for probes: etcd
4:53:36 pm | [INFO ] configuring bootstrap node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-7rkvw: waiting for kubelet to update
4:53:46 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-q89sr,rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm
4:54:28 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: waiting for probes: kubelet
4:54:46 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: waiting for probes: etcd, kubelet
4:54:50 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: waiting for probes: etcd
4:54:56 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: waiting for kubelet to update
4:55:04 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-b4cb4,rke2-backup-restore-cp-8675c69865x58z9h-zstx8
4:56:10 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-zstx8: waiting for plan to be applied
4:56:16 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-b4cb4,rke2-backup-restore-cp-8675c69865x58z9h-zstx8
4:56:20 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-zstx8: waiting for probes: kubelet
4:56:44 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-zstx8: waiting for probes: kube-apiserver, kubelet
4:56:54 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-zstx8: waiting for probes: kube-apiserver, kube-controller-manager, kube-scheduler, kubelet
4:56:58 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-zstx8: waiting for probes: kube-apiserver, kube-controller-manager, kube-scheduler
4:57:12 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-zstx8: waiting for probes: kube-apiserver, kube-controller-manager
4:57:18 pm | [INFO ] configuring worker node(s) rke2-backup-restore-wk-56df7d58b5xb6ffp-4dhrj,rke2-backup-restore-wk-56df7d58b5xb6ffp-5kcbn,rke2-backup-restore-wk-56df7d58b5xb6ffp-ltbld
4:57:34 pm | [INFO ] configuring worker node(s) rke2-backup-restore-wk-56df7d58b5xb6ffp-5kcbn,rke2-backup-restore-wk-56df7d58b5xb6ffp-ltbld
4:57:54 pm | [INFO ] configuring worker node(s) rke2-backup-restore-wk-56df7d58b5xb6ffp-ltbld: waiting for plan to be applied
4:57:58 pm | [INFO ] configuring worker node(s) rke2-backup-restore-wk-56df7d58b5xb6ffp-ltbld: waiting for probes: kubelet
4:58:08 pm | [INFO ] configuring worker node(s) rke2-backup-restore-wk-56df7d58b5xb6ffp-ltbld: waiting for kubelet to update
4:58:46 pm | [INFO ] rke2-backup-restore-wk-56df7d58b5xb6ffp-4dhrj,rke2-backup-restore-wk-56df7d58b5xb6ffp-5kcbn,rke2-backup-restore-wk-56df7d58b5xb6ffp-ltbld
4:58:48 pm | [INFO ] provisioning done
5:01:26 pm | [INFO ] refreshing etcd restore state
5:01:28 pm | [INFO ] waiting to stop rke2 services on node [rke2-backup-restore-cp-bfd6beba-nz8l2]
5:01:30 pm | [INFO ] waiting to stop rke2 services on node [rke2-backup-restore-wk-e548aa18-h5k2c]
5:01:32 pm | [INFO ] waiting for etcd restore
5:02:16 pm | [INFO ] waiting for etcd restore probes
5:02:54 pm | [INFO ] waiting for etcd restore
5:05:02 pm | [INFO ] waiting for etcd restore probes
5:05:16 pm | [INFO ] refreshing etcd restore state
5:05:18 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-7rkvw,rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm
5:05:34 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for plan to be applied
5:05:36 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for probes: etcd, kubelet
5:05:46 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-b4cb4,rke2-backup-restore-cp-8675c69865x58z9h-zstx8
5:05:56 pm | [INFO ] configuring control plane node(s) rke2-backup-restore-cp-8675c69865x58z9h-zstx8: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for plan to be applied
5:05:58 pm | [INFO ] configuring etcd node(s) rke2-backup-restore-etcd-5fb5f775c6x9hzcw-rwkwm: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for probes: etcd
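
For reference, the restore above was triggered from the Rancher UI. A roughly equivalent declarative sketch, run against the Rancher management (local) cluster, is below; the cluster name, namespace, and exact field names (etcdSnapshotRestore, restoreRKEConfig) are assumptions to be checked against the installed rke.cattle.io / provisioning.cattle.io CRDs, not something taken from this issue.

# List the recorded snapshots for the downstream cluster (assumed CRD name).
kubectl -n fleet-default get etcdsnapshots.rke.cattle.io

# Patch the provisioning cluster object to request a restore of config, K8s
# version and etcd ("all"), assumed to be what the UI's "All" option maps to.
kubectl -n fleet-default patch clusters.provisioning.cattle.io rke2-backup-restore \
  --type merge -p '{
    "spec": {
      "rkeConfig": {
        "etcdSnapshotRestore": {
          "name": "<snapshot-name-from-the-list-above>",
          "generation": 1,
          "restoreRKEConfig": "all"
        }
      }
    }
  }'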


Note:

  • On an RKE1 cluster, this use case works. No issues seen.
  • On an RKE2 cluster, the upgrade from 1.26.8+rke2r1 to 1.27.5+rke2r1 works, but the restore to the snapshot taken on 1.26 fails.
@vivek-shilimkar vivek-shilimkar added kind/bug Issues that are defects reported by users or that we know have reached a real release kind/bug-qa Issues that have not yet hit a real release. Bugs introduced by a new feature or enhancement QA/XS area/capr/rke2 RKE2 Provisioning issues involving CAPR status/release-blocker team/hostbusters The team that is responsible for provisioning/managing downstream clusters + K8s version support labels Sep 21, 2023
@vivek-shilimkar vivek-shilimkar added this to the v2.8.0 milestone Sep 21, 2023
@vivek-shilimkar vivek-shilimkar self-assigned this Sep 21, 2023
@vivek-shilimkar
Member Author

A similar issue was observed earlier: #40005

@Oats87
Contributor

Oats87 commented Sep 21, 2023

Need more logs.

Do you have logs from the rancher-system-agent and rke2-server from the restoring etcd node?
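
A quick way to grab those on the restoring etcd node (standard systemd unit names and the default RKE2 kubelet log path):

# Run on the etcd node that is being restored.
journalctl -u rancher-system-agent --no-pager > rancher-system-agent.log
journalctl -u rke2-server --no-pager > rke2-server.log
# The kubelet log is also useful, since the node conditions are reported as Unknown:
tail -n 200 /var/lib/rancher/rke2/agent/logs/kubelet.log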

@vivek-shilimkar
Member Author

I did not have logs from rancher-system-agent and rke2-server. I tried reproducing it on v2.8-head (675a0e4) but couldn't. However, this time too the cluster restoration was not successful; the cluster remained in the Updating state. Let's wait for the Alpha-1 or RC release of v2.8 and I'll retest the backup-restore scenario.

@Sahota1225
Contributor

@vivek-shilimkar Since 2.8 Alpha-1 is available, can we retest the backup-restore scenario?

@felipe-colussi
Contributor

felipe-colussi commented Oct 5, 2023

Testing on the latest version d101c27fe5eb6bfd1992165b2f08c9fc02a2a55f I was able to do the upgrade and then the restore with no problems; tested on AWS and DO.

Also on 2.8 Alpha-2 I wasn't able to reproduce it.

Additional note: on Alpha I noticed that the fleet-agent of the downstream cluster was in a crash loop (CrashLoopBackOff) from hitting a nil map; that wasn't happening on head.

Reason: the error doesn't happen on a single-node cluster.

@felipe-colussi
Contributor

felipe-colussi commented Oct 6, 2023

Using a cluster with:
3 etcd nodes (50 GB storage, t3.large)
3 worker nodes (16 GB storage, t3.large)
2 control-plane nodes (16 GB storage, t3.large)

I was able to get the error.

At the beginning 2 etcd nodes + 1 CP were restored; one etcd node was stuck failing to connect to the server.

rke2 journalctl:

Oct 06 13:43:30 felipe-test2-etcd-da41e26e-d6784 rke2[8987]: time="2023-10-06T13:43:30Z" level=warning msg="Failed to get apiserver address from etcd: context deadline exceeded"
Oct 06 13:43:32 felipe-test2-etcd-da41e26e-d6784 rke2[8987]: time="2023-10-06T13:43:32Z" level=info msg="Waiting for apiserver addresses"

After ~45 min the etcd node was able to reconcile itself.
The other CP was stuck with Calico problems.
The workers weren't able to start RKE2 (or even create the Kubernetes binaries).

Looking into the nodes I noticed that /dev/root was almost full on all nodes created with 16 GB, which may be causing the problem. I'll retry with larger storage.

Observation: I do believe the 16 GB disk could have been the cause of the RKE2 binary not being created, but even after moving to larger nodes the problem persists.
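
The disk-full suspicion can be checked directly; a short sketch with standard commands (nothing here is specific to this issue):

# On each node: is the root volume the bottleneck?
df -h /
du -sh /var/lib/rancher /var/lib/kubelet

# From a working kubeconfig: are node conditions being reported at all?
# ("Unknown" in the provisioning log means the kubelet is not posting status.)
kubectl describe node <node-name> | grep -A 8 'Conditions:'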

@felipe-colussi
Contributor

felipe-colussi commented Oct 6, 2023

The upgrade behavior is also strange.

I tried to create a new cluster using the 3-3-2 layout, all nodes with more storage. After taking the backup and doing the upgrade I noticed that all etcd nodes were missing the kubeconfig file /etc/rancher/rke2/rke2.yaml.

Observation: after talking with Jake, this behavior is expected.

After trying to restore etcd, those nodes didn't start rke2-server.

The error is not consistently reproducible. I tried a 1-1-1 cluster, and there the CP node got stuck during the upgrade: rke2-server wasn't starting because of a problem with the CA not being authorized for the IP it was trying to use.
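
For anyone retracing the kubeconfig check, this is all it amounts to on a server node (standard RKE2 paths; as noted above, dedicated etcd-only nodes not having the file is expected):

# On a control-plane node:
ls -l /etc/rancher/rke2/rke2.yaml
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
/var/lib/rancher/rke2/bin/kubectl get nodes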

@felipe-colussi
Contributor

Did some extra tries; when doing the restore before the Kubernetes upgrade the same problem happens, but the nodes that fail are the worker ones.

@felipe-colussi
Contributor

felipe-colussi commented Oct 9, 2023

Some extra information:

This problem is related to restoring etcd on K8s 1.26.9 and 1.26.8 on a multi-node cluster.

Doing just an etcd backup and restore (etcd only) is enough to reproduce it.

The problem doesn't happen while restoring K8s 1.27.

Next tests that I'll do today:

  • Test it on K8s 1.25: needs some extra testing, I got the problem on 1.25.13 and not on 1.25.12
  • Check if the error happens only on RKE2 or also on K3s: RKE2 only
  • Do an etcd restore on an RKE2 cluster without Rancher.
  • Test it on a stable Rancher version, probably 2.7.6: reproducible.

@igomez06
Contributor

@felipe-colussi I tested this on v2.7.8 on a v1.25.13+rke2r1 RKE2 cluster and got the same error. I also ran the same test on K3s v1.25.13+k3s1 and it worked fine.

@Josh-Diamond
Contributor

Josh-Diamond commented Oct 10, 2023

@felipe-colussi I tested the following scenario on v2.7.7 and no issues were seen:

  1. Fresh install of Rancher v2.7.7
  2. Provision a downstream RKE2 w/ k8s v1.25.13+rke2r1
  3. Once active, take a snapshot
  4. Once active, upgrade cluster to v1.26.8+rke2r1
  5. Once active, restore to snapshot taken in step 3
  6. Verified - cluster successfully restores to etcd snapshot

@Sahota1225 Sahota1225 added the dependency-rke2 Indicates that the rancher issue has a dependency to an RKE2 issue label Oct 11, 2023
@felipe-colussi
Contributor

felipe-colussi commented Oct 17, 2023

Merged #43158

The PR fixes the problem where the restores got stuck forever with "Waiting for probe: calico".

Even after this PR we still have the following known problems (while using RKE2 on 1.26.8, 1.26.9, 1.25.13)¹:

1. While doing etcd restores (etcd only or all 3): an etcd node gets stuck with:

Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for probes: etcd
In this case Rancher reconciles itself; it takes up to ~30 min to do so.

2. While upgrading to 1.27.6 there is a chance that a worker node gets stuck with:

Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for kubelet to update
In this case it reconciles itself; during tests it took up to ~45 min.

3. While restoring etcd (etcd only or all 3) to 1.26.8 and 1.26.9² there is a chance of it getting stuck forever with:
Waiting for probes: kube-controller-manager
(A diagnostic sketch for this case follows below.)

¹ Probably also happens on 1.25.14 but wasn't intensively tested.
² During my tests this was an exception and was only seen on the 1.26.x versions, which were tested the most, so there is a chance it can also happen on 1.25.x.
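
When the kube-controller-manager probe is the one stuck (case 3), the static pod state on the affected etcd+cp node can be inspected directly. A sketch using the crictl binary and config that RKE2 ships by default:

# On the stuck node: kube-controller-manager runs as a static pod, so check
# whether the kubelet actually (re)created its container after the restore.
export CRI_CONFIG_FILE=/var/lib/rancher/rke2/agent/etc/crictl.yaml
/var/lib/rancher/rke2/bin/crictl ps -a | grep kube-controller-manager

# The static pod manifests the kubelet is acting on:
ls -l /var/lib/rancher/rke2/agent/pod-manifests/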

@Josh-Diamond
Contributor

I ran etcd snap/restore checks yesterday on a few Rancher server versions, in an effort to better help determine the scope of the recent rke2 snap/restore failures seen, and the results are below:

On v2.8-c4847070c39209a65029aa3a43347e4d9bac1d12-head - (this was on a commit before Felipe's fix was put in):

  • k8s 1.27.6 snap/restore successful - ✅
  • k8s 1.26.9 snap/restore unsuccessful - Waiting for probes: calico on worker node - ❌
  • k8s 1.25.14 snap/restore unsuccessful - Waiting for probes: kube-controller-manager on etcd+cp node - ❌

On v2.7.9-rc2:

  • k8s 1.26.8 snap/restore unsuccessful - Waiting for probes: calico on worker node - ❌
  • k8s 1.25.13 snap/restore unsuccessful - Waiting for probes: calico on worker node - ❌

On v2.7-918fb36b3edaa8f305ee5b9d0c6c51ec52813bda-head:

  • k8s 1.26.8 snap/restore unsuccessful - Waiting for probes: kube-controller-manager on etcd+cp node - ❌
  • k8s 1.25.13 snap/restore unsuccessful - Waiting for probes: calico on worker node - ❌

On Rancher v2.8-7b319e9aa9d877ce1eb12f2afa19588afb7440bd-head - (with Felipe's fix):

  • k8s v1.27.6+rke2r1 snap/restore successful - ✅
  • k8s v1.26.9+rke2r1 snap/restore successful - ✅
  • k8s v1.25.14+rke2r1 snap/restore unsuccessful - Waiting for probes: kube-controller-manager on etcd+cp node - ❌

Note: Calico issue on worker nodes no longer encountered w/ Felipe's fix


In an effort to determine the frequency of kube-controller-manager issue still seen, the following results were gathered from testing on Rancher v2.8-7b319e9aa9d877ce1eb12f2afa19588afb7440bd-head - (with Felipe's fix):

v1.26.9+rke2r1 -

  • ❌ - Waiting for probes: kube-controller-manager on one etcd+cp machine
  • ❌ - Waiting for probes: kube-controller-manager on one etcd+cp machine
  • ❌ - Waiting for probes: kube-controller-manager on one etcd+cp machine
  • ❌ - Waiting for probes: kube-controller-manager on one etcd+cp machine
  • ❌ - Waiting for probes: kube-controller-manager on one etcd+cp machine

v1.25.14+rke2r1 -

  • ✅ - snap/restore successful
  • ✅ - snap/restore successful
  • ✅ - snap/restore successful
  • ❌ - Waiting for probes: kube-controller-manager on one etcd+cp machine
  • ❌ - Waiting for probes: kube-controller-manager on one etcd+cp machine

@galal-hussein
Contributor

As a quick workaround for this issue, we found that restarting rke2-server after the restore resolves the problem by restarting the kubelet and containerd. It seems the kubelet is stuck restarting the kube-controller-manager pod after it exits and mistakenly reports that it is in a ready state, so as a workaround the plan can trigger a restart of rke2-server (systemctl restart rke2-server.service) after a certain timeout period.
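
In command form, the workaround amounts to the following on the affected server node (the restart command is the one named above; watching the unit afterwards is just a suggestion):

# Bounce rke2-server; this restarts the kubelet and containerd with it.
systemctl restart rke2-server.service
journalctl -u rke2-server -f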

@brandond
Contributor

There is a report of the same behavior of static pods not starting, also on a rancher managed cluster, at rancher/rke2#4864. In this case I believe the issue was triggered by an upgrade to a newer patch release of RKE2, not a cluster restore.

@mdrahman-suse

mdrahman-suse commented Oct 25, 2023

Validated the issue with Rancher 2.8.0-rc1

  • Single node running Rancher with Docker install
  • 3 etcd only, 2 cp-only, 2 worker config

Issue Replication

  • v1.27.6+rke2r1 ❌
  • v1.26.9+rke2r1 ❌
  • v1.25.14+rke2r1 ✅
[INFO ] provisioning done
[INFO ] waiting for etcd snapshot on node mdrke2125rep-pool1-75bb6a9e-gcml7, waiting for etcd snapshot on node mdrke2125rep-pool1-75bb6a9e-w766b, waiting for etcd snapshot on node mdrke2125rep-pool1-75bb6a9e-pfb8m
[INFO ] waiting for etcd snapshot on node mdrke2125rep-pool1-75bb6a9e-w766b, waiting for etcd snapshot on node mdrke2125rep-pool1-75bb6a9e-pfb8m
[INFO ] configuring bootstrap node(s) mdrke2125rep-pool1-7ff6bf9cf6xpkxlb-2zq74: waiting for plan to be applied
[INFO ] waiting for etcd snapshot creation management plane restart
[INFO ] refreshing etcd create state
[INFO ] provisioning done
[INFO ] shutting down cluster
[INFO ] cluster shutdown complete, running etcd restore
[INFO ] waiting for etcd restore
[INFO ] waiting for etcd restore probes
[INFO ] waiting for etcd restore
[INFO ] waiting for etcd restore probes
[INFO ] configuring etcd node(s) mdrke2125rep-pool1-7ff6bf9cf6xpkxlb-2zq74,mdrke2125rep-pool1-7ff6bf9cf6xpkxlb-7wvn4
[INFO ] configuring etcd node(s) mdrke2125rep-pool1-7ff6bf9cf6xpkxlb-7wvn4: waiting for plan to be applied
[INFO ] configuring etcd node(s) mdrke2125rep-pool1-7ff6bf9cf6xpkxlb-7wvn4: waiting for probes: etcd, kubelet
[INFO ] configuring etcd node(s) mdrke2125rep-pool1-7ff6bf9cf6xpkxlb-7wvn4: waiting for probes: etcd
[INFO ] configuring control plane node(s) mdrke2125rep-pool2-58b7c6595cx6hnpx-4clpq,mdrke2125rep-pool2-58b7c6595cx6hnpx-9z752
[INFO ] configuring worker node(s) mdrke2125rep-pool3-7c44f9c9dcxzlnjd-6fzh9,mdrke2125rep-pool3-7c44f9c9dcxzlnjd-zvspc

Issue validation

  • v1.27.7-rc2+rke2r1 ✅
  • v1.26.10-rc2+rke2r1 ✅
  • v1.25.15-rc2+rke2r1 ✅

Testing

  • Create cluster with the specific versions
  • Take snapshots (multiple times)
  • Restore snapshot (multiple times)
  • Restore to older snapshot after upgrade
    • v1.25.15-rc2+rke2r1, Upgrade to v1.26.10-rc2+rke2r1 Then Restore to v1.25.15-rc2+rke2r1
    • v1.26.10-rc2+rke2r1, Upgrade to v1.27.7-rc2+rke2r1 Then Restore to v1.26.10-rc2+rke2r1
[INFO ] provisioning done
[INFO ] waiting for etcd snapshot on node mdrke2125fix-pool1-110053a3-k8mvb, waiting for etcd snapshot on node mdrke2125fix-pool1-110053a3-8jflr, waiting for etcd snapshot on node mdrke2125fix-pool1-110053a3-qv52n
[INFO ] configuring bootstrap node(s) mdrke2125fix-pool1-649965db6cx8d4kp-trp9b: waiting for plan to be applied
[INFO ] waiting for etcd snapshot creation management plane restart
[INFO ] provisioning done
[INFO ] waiting to stop rke2 services on node [mdrke2125fix-pool2-ac2bc6f3-c6mcc]
[INFO ] waiting for etcd restore
[INFO ] waiting for etcd restore probes
[INFO ] waiting for etcd restore
[INFO ] waiting for etcd restore probes
[INFO ] configuring etcd node(s) mdrke2125fix-pool1-649965db6cx8d4kp-b9tbf,mdrke2125fix-pool1-649965db6cx8d4kp-trp9b
[INFO ] configuring etcd node(s) mdrke2125fix-pool1-649965db6cx8d4kp-trp9b: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for plan to be applied
[INFO ] configuring etcd node(s) mdrke2125fix-pool1-649965db6cx8d4kp-trp9b: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for probes: etcd, kubelet
[INFO ] configuring etcd node(s) mdrke2125fix-pool1-649965db6cx8d4kp-trp9b: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for probes: etcd
[INFO ] configuring control plane node(s) mdrke2125fix-pool2-86c77f7678xncdkc-8bpq6,mdrke2125fix-pool2-86c77f7678xncdkc-krk29
[INFO ] configuring control plane node(s) mdrke2125fix-pool2-86c77f7678xncdkc-krk29: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for plan to be applied
[INFO ] configuring control plane node(s) mdrke2125fix-pool2-86c77f7678xncdkc-krk29: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for probes: calico, kube-apiserver, kube-controller-manager, kube-scheduler, kubelet
[INFO ] configuring control plane node(s) mdrke2125fix-pool2-86c77f7678xncdkc-krk29: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for probes: calico, kube-apiserver, kube-controller-manager, kube-scheduler
[INFO ] configuring control plane node(s) mdrke2125fix-pool2-86c77f7678xncdkc-krk29: waiting for probes: calico, kube-apiserver, kube-controller-manager, kube-scheduler
[INFO ] configuring control plane node(s) mdrke2125fix-pool2-86c77f7678xncdkc-krk29: waiting for probes: calico, kube-controller-manager, kube-scheduler
[INFO ] configuring control plane node(s) mdrke2125fix-pool2-86c77f7678xncdkc-krk29: waiting for probes: calico
[INFO ] configuring worker node(s) mdrke2125fix-pool3-84dbd5bbdfxlc8p6-8f68s,mdrke2125fix-pool3-84dbd5bbdfxlc8p6-vzqxq
[INFO ] configuring worker node(s) mdrke2125fix-pool3-84dbd5bbdfxlc8p6-vzqxq: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for plan to be applied
[INFO ] configuring worker node(s) mdrke2125fix-pool3-84dbd5bbdfxlc8p6-vzqxq: Node condition MemoryPressure is Unknown. Node condition DiskPressure is Unknown. Node condition PIDPressure is Unknown. Node condition Ready is Unknown., waiting for probes: calico, kubelet
[INFO ] configuring worker node(s) mdrke2125fix-pool3-84dbd5bbdfxlc8p6-vzqxq: Node condition Ready is False., waiting for probes: calico
[INFO ] configuring worker node(s) mdrke2125fix-pool3-84dbd5bbdfxlc8p6-vzqxq: waiting for probes: calico
[INFO ] refreshing etcd restore state
[INFO ] waiting for etcd restore
[INFO ] configuring bootstrap node(s) mdrke2125fix-pool1-649965db6cx8d4kp-vv4rt: waiting for plan to be applied
[INFO ] provisioning done
