Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v1.4] Forward port changes around hostname-override #3140

Merged
merged 2 commits into from Jan 13, 2023

Conversation

kinarashah
Copy link
Member

@kinarashah kinarashah commented Jan 12, 2023

This PR is a forward port PR. Original PR: #3136

This PR addresses 2 issues.

Issue 1 - kubelet:
RKE removes hostname-override from kubelet args with RKE >=v1.3.5 for AWS cloud provider enabled clusters https://github.com/rancher/rke/blob/release/v1.4/cluster/plan.go#L497.
kubelet populates kubernetes.io/hostname label using hostname-override and RKE finds the node by relying on this behavior https://github.com/rancher/rke/blob/release/v1.4/k8s/node.go#L55. So this becomes an issue when --node-name/hostnameOverride is not the same as the kubernetes.io/hostname on node. rancher/rancher#37634

For example,

  1. Rancher >=v2.6.4 ec2 node provider clusters:
  • cluster provisions successfully, this is because rancher defaults to the private ip address of the node which is what shows up on the kubernetes.io/hostname label.
  1. Rancher >=v2.6.4 custom cluster or RKE standalone:
  • set --node-name in registration command to a value that doesn't match with the default hostname assigned for eg "hostname -f" or "kinara-ec2"
  • cluster provisioning fails with node xxxx not found error

Fix 1 - kubelet:
Revert the removal of hostname-override in kubelet args. RKE will continue finding the node by matching hostnameOverride (--node-name) to kubernetes.io/hostname label on the node. Node name will still be set by the cloud provider, if it's not consistent with the hostnameOverride then kubelet will show logs like below but this is expected behavior.

"Replacing cloudprovider-reported hostname with overridden hostname" cloudProviderHostname="[ip-xxx-xx-xx-xx.us](http://ip-172-31-6-245.us/)-west-2.compute.internal" overriddenHostname="kinara-ec2"

Issue 2 - kube-proxy:
RKE >= v1.3.5 sets hostname-override for kube-proxy to private DNS of the node by querying the ec2 metadata service. https://github.com/rancher/rke-tools/blob/master/entrypoint.sh#L17
This was introduced because hostname-override set for kube-proxy might not be the same as the name of node object in Kubernetes when cloud provider is enabled. For example, Rancher sets requestedHostname to nodepool-* for node driver provisioned clusters, but name of the node object is the private DNS of the node. This fixed issues like rancher/rancher#30363 and rancher/rancher#22416.

But the node name is not always guaranteed to be the private DNS of node returned by ec2 metadata (maybe when using custom dns or when hostname is set differently), causing issues for other setups. EKS recently fixed this to set it to spec.nodeName kubernetes/kubernetes#61486 (comment), but this is not feasible for RKE.

Fix 2 - kube-proxy:
Make the behavior optional. RKE will now set the hostname-override for kube-proxy to the instance metadata hostname only if useInstanceMetadataHostname is true. It will default to true for node driver clusters and false for custom clusters/RKE standalone. Users will have to set hostname-override correctly when false.

Issue: rancher/rancher#40147

@kinarashah kinarashah requested a review from a team January 12, 2023 17:36
snasovich
snasovich previously approved these changes Jan 12, 2023
@kinarashah kinarashah merged commit b4cf14e into rancher:release/v1.4 Jan 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants