RKE2 currently only generates manifests for static pods on startup. If the configuration captured in the static pod manifest is not valid and the pods crashloop, RKE2 does not detect or handle this.
In K3s, many critical failures (such as failure of the etcd server to join the cluster as in #349) are detected via the embedded etcd server failing, which causes the K3s process to exit, be restarted by systemd, and retry the join operation from the beginning. In RKE2 these failures are masked by the indirection of the static pod executor model.
We should monitor critical static pods (such as the etcd server and apiserver) and exit if they exit, as we do in K3s. This might be easiest if we configure the pods not to be restarted and set up a goroutine that periodically checks whether they are running.