Cannot join etcd-only node to an all-role initial server #4784
Comments
When we fix this, let's add an ADR to capture some requirements for supported role combinations (--disable-x flags) and ordering. I believe the only order we've tested so far is
I believe this issue is also affecting cluster creations in rancher when doing -only nodes. The following configs I saw fail in rancher:
I believe it is all related to this issue, so this will become critical for GA of k3s in Rancher. Note these same issues do NOT show up with the same configurations for rke2.
@ShylajaDevadiga, @mdrahman-suse, and I just bisected a similar issue (same repeating
@rancher-max can you see what you get with the most recent RC?
It looks like it's still failing for me through Rancher. Using config:
What's interesting is that in this case, sometimes I see the 3 all-role servers come up and running while the others do not, and sometimes I don't even see the 3 all-role servers come up (so nothing is up and running).
Yep, this is why Rancher CI won't pass right now when we bumped K3s/RKE2 versions. Do we know what version of K3s this broke in?
I'm particularly suspicious of #4246. Rancher provisioning tests stop passing after crossing the threshold of the inclusion of that version, i.e.
This is consistent with the behavior I see in Rancher 2.6.3 with k3s provisioning, so I believe @Oats87 is likely correct about the commit that broke this.
I can take this on for the next cycle, since it seems related to the agent ready channel stuff that I added.
Moving this to the next milestone since the team didn't have a chance to work on it for v1.22.5+k3s1.
This did not get worked on for the February release; let's reschedule to March.
Validated using v1.23.5-rc1+k3s1
Performed the following 5 scenarios:
Note that on scenario 5, there is an expected failure in starting k3s. It receives:
The current state of etcd-only nodes requires a specific order:
If node3 is brought up first (with the --cluster-init flag) and node1 then tries to join (without --cluster-init but with --server), node1 fails to join and does not start correctly. Instead, it loops with:
We should enhance this functionality to allow any configuration to work. Some valid scenarios would be:
Edit to add more valid scenarios:
6. node1=etcd-only, node2=etcd-only, node3=etcd-only, node4=cp-only, node5=worker
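To make the role names in scenario 6 concrete, here is a rough Python sketch that maps each role to the k3s flags it implies. The `--disable-*`, `--cluster-init`, and `--server` flags are the ones discussed in this issue; the helper `node_command`, the role labels used as dictionary keys, and the placeholder URL are assumptions made for illustration. The join token is omitted for brevity, and the sketch says nothing about which join orders actually succeed.

```python
from typing import List, Optional

# k3s server flags that switch off individual components. The role names
# used as keys are labels chosen for this sketch, not k3s terminology.
ROLE_DISABLE_FLAGS = {
    "all": [],
    "etcd-only": [
        "--disable-apiserver",
        "--disable-controller-manager",
        "--disable-scheduler",
    ],
    "cp-only": ["--disable-etcd"],
}


def node_command(role: str, first: bool = False,
                 server_url: Optional[str] = None) -> List[str]:
    """Return the k3s argv implied by a role (hypothetical helper)."""
    if role == "worker":
        if first:
            raise ValueError("a worker cannot initialize the cluster")
        cmd = ["k3s", "agent"]
    else:
        # List concatenation copies, so the table above is never mutated.
        cmd = ["k3s", "server"] + ROLE_DISABLE_FLAGS[role]
        if first:
            if role == "cp-only":
                raise ValueError("the initial server needs etcd")
            cmd.append("--cluster-init")
    if not first:
        if server_url is None:
            raise ValueError("joining nodes need a --server URL")
        cmd += ["--server", server_url]
    return cmd


# Scenario 6 from the list above, with node1 as the initial server.
SCENARIO_6 = ["etcd-only", "etcd-only", "etcd-only", "cp-only", "worker"]
URL = "https://node1.example:6443"  # placeholder address, not a real host

for i, role in enumerate(SCENARIO_6):
    argv = node_command(role, first=(i == 0), server_url=URL)
    print(f"node{i + 1}:", " ".join(argv))
```

A table like `ROLE_DISABLE_FLAGS` could also serve as a starting point for the ADR mentioned in the comments, since it puts the role combinations under discussion in one place.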