Update handling of custom cluster-cidr #902
Conversation
Changes to the `cluster_ipv4_cidr` variable in the kube.tf file were not completely reflected in the cluster config, and the default pod IP range `10.42.0.0/16` conflicted with the creation of more than 41 agent nodepools.
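For illustration, a minimal kube.tf sketch of the intended usage. The variable names `cluster_ipv4_cidr` and `service_ipv4_cidr` come from this PR's diff; the module source line and the concrete CIDR values are assumptions, chosen only to avoid the `10.42.0.0/16` and `10.43.0.0/16` defaults:

```hcl
# Sketch of a kube.tf override, assuming the kube-hetzner module layout.
# The variable names match this PR's diff; the CIDR values are illustrative.
module "kube-hetzner" {
  source = "kube-hetzner/kube-hetzner/hcloud"
  # ... other required inputs (hcloud token, nodepools, etc.) omitted

  cluster_ipv4_cidr = "10.100.0.0/16" # pod network, passed to k3s as --cluster-cidr
  service_ipv4_cidr = "10.101.0.0/16" # service network, passed to k3s as --service-cidr
}
```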
Concerning #894 (comment): I think my assumptions from the referenced comment are right, namely that K3s should be in charge of IP Address Management (IPAM) and not the CNI, if we want a fully working integration with Hetzner and their CCM: https://github.com/hetznercloud/hcloud-cloud-controller-manager/blob/main/README.md?plain=1#L122
Btw, they also use …
It should be the same story for Calico: https://docs.tigera.io/calico/latest/reference/configure-cni-plugins#using-host-local-ipam
This is definitely the reason why the RouteController makes a total mess of the Hetzner network routes: the CNI is decoupled from the K3s IPAM. I am still not sure about the impact with the default values. I would suggest fixing it either in this PR or in a new one that implicitly covers this PR.
Thanks for that fix @thomasprade, it's important!
@M4t7e afaik the …

@mysticaltech glad to help 👍
@thomasprade Yes, it is, and that seems to be the problem already. That explicit configuration decouples the CNI IPAM from the K3s IPAM. They then run independently of each other, and only the K3s IPAM will set the needed Hetzner network routes to the distinct pod networks of the nodes (you can see them if you open the Hetzner console, select your network, and switch to the "Routes" tab). I have tested your change with the current settings in Cilium and it does not hurt.

This PR is already the first step towards solving that decoupling problem, but it will not fix it. It just "looks" fixed because of the similar IP assignment rules in K3s and Cilium (and maybe also Calico?). They can still diverge, and if that happens, non-masqueraded setups (which are usually preferred for performance reasons) can break even internally (this depends on many other settings). At least the current defaults lead to masquerading as far as I can see and test, so hopefully it's just a theoretical problem in most cases.

I'm currently crafting something for Cilium to solve that decoupling issue, and it already looks very promising. I'll treat this change as a precondition, as I can only fix it after this is done. So I'm looking forward to the merge 🙂
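For context on the direction hinted at here, a minimal sketch of what delegating IPAM back to K3s could look like with Cilium: its `kubernetes` IPAM mode makes the agent allocate pod IPs from each node's `spec.podCIDR` instead of running Cilium's own cluster-pool IPAM. The `cilium_values` and `cni_plugin` inputs are assumptions for illustration, not something this PR adds:

```hcl
# Sketch only: Cilium Helm values that delegate IPAM to Kubernetes/K3s, so pod
# IPs come from the node podCIDRs that K3s assigns (and that the Hetzner CCM
# RouteController routes). Input names on the module block are assumed here.
module "kube-hetzner" {
  source = "kube-hetzner/kube-hetzner/hcloud"
  # ...

  cni_plugin = "cilium"
  cilium_values = <<-EOT
    ipam:
      mode: kubernetes # allocate from the node's spec.podCIDR instead of cluster-pool
  EOT
}
```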
```diff
@@ -66,7 +66,7 @@ locals {
   apply_k3s_selinux = ["/sbin/semodule -v -i /usr/share/selinux/packages/k3s.pp"]

   install_k3s_server = concat(local.common_pre_install_k3s_commands, [
-    "curl -sfL https://get.k3s.io | INSTALL_K3S_SKIP_START=true INSTALL_K3S_SKIP_SELINUX_RPM=true INSTALL_K3S_CHANNEL=${var.initial_k3s_channel} INSTALL_K3S_EXEC='server ${var.k3s_exec_server_args}' sh -"
+    "curl -sfL https://get.k3s.io | INSTALL_K3S_SKIP_START=true INSTALL_K3S_SKIP_SELINUX_RPM=true INSTALL_K3S_CHANNEL=${var.initial_k3s_channel} INSTALL_K3S_EXEC='server ${var.k3s_exec_server_args} --cluster-cidr=${var.cluster_ipv4_cidr} --service-cidr=${var.service_ipv4_cidr}' sh -"
```
Wouldn't it be better to put that in the k3s server config files (specified in `init.tf` and `control_planes.tf`), together with the other config params?
I tried setting the two arguments as an additional parameter in the file provisioner in `init.tf:11`, but this didn't work.

From what I understand of the server documentation and the configuration documentation, these options must be passed as arguments to the `k3s server` command.
Starting with K3s v1.19.1+k3s1, all CLI parameters can also be set in a configuration file:

> You can now configure K3s using a configuration file rather than environment variables or command line arguments.

I think it is not enough to set it only in `init.tf:11`, because that covers only the first control plane node (the `first_control_plane` resource). It also needs to be set in `control_planes.tf:95` for the other control plane nodes. See also this note from https://docs.k3s.io/installation/configuration#configuration-with-binary:

> It is important to match critical flags on your server nodes. For example, if you use the flag `--disable servicelb` or `--cluster-cidr=10.200.0.0/16` on your master node, but don't set it on other server nodes, the nodes will fail to join.

Maybe @mysticaltech or somebody else could verify that, as I am pretty new to the project, but this is what I understood from reverse engineering it.
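To make the config-file alternative concrete, a sketch under the assumption that Terraform renders the k3s config file; k3s maps CLI flags one-to-one to YAML keys in `/etc/rancher/k3s/config.yaml` per the linked docs, and the local name below is hypothetical:

```hcl
# Hypothetical sketch of the config-file approach discussed above. k3s reads
# /etc/rancher/k3s/config.yaml, where every CLI flag becomes a YAML key
# (https://docs.k3s.io/installation/configuration). The rendered content would
# have to be written on every control plane node (init.tf and
# control_planes.tf), since critical flags must match across server nodes.
locals {
  k3s_server_config_yaml = yamlencode({
    "cluster-cidr" = var.cluster_ipv4_cidr # pod network
    "service-cidr" = var.service_ipv4_cidr # service network
  })
}
```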
@thomasprade What you did here is good, but just for consistency, it would be best to implement it indeed like @M4t7e mentioned, in both `init.tf` and `control_planes.tf`, then we merge! 🙏
I'll do it myself, it's super quick!
Released in v2.4.0 🚀
Concerning #894

Due to k3s' defaults for `cluster-cidr` (`10.42.0.0/16`) and `service-cidr` (`10.43.0.0/16`), the creation of more than 42 nodepools and their respective subnets conflicted with these IP ranges.

Changes: