[v1.4] Forward port changes around hostname-override #3140
Merged
This PR is a forward port PR. Original PR: #3136
This PR addresses 2 issues.
Issue 1 - kubelet:

RKE removes `hostname-override` from `kubelet` args with RKE >= v1.3.5 for AWS cloud provider enabled clusters (https://github.com/rancher/rke/blob/release/v1.4/cluster/plan.go#L497). `kubelet` populates the `kubernetes.io/hostname` label using `hostname-override`, and RKE finds the node by relying on this behavior (https://github.com/rancher/rke/blob/release/v1.4/k8s/node.go#L55). So this becomes an issue when `--node-name`/`hostnameOverride` is not the same as the `kubernetes.io/hostname` label on the node. rancher/rancher#37634

For example:
- the node's `kubernetes.io/hostname` label holds the default assigned hostname (e.g. the output of `hostname -f`)
- `--node-name` in the registration command is set to a value that doesn't match it, e.g. `"kinara-ec2"`
- RKE fails with a `node xxxx not found` error
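The lookup described above can be sketched as follows. This is a simplified, self-contained Go illustration; the `node` struct and `findNodeByHostnameOverride` are stand-ins for RKE's client-go based code in k8s/node.go, not its actual types.

```go
package main

import "fmt"

// node is a pared-down stand-in for a Kubernetes node object; the real RKE
// code walks corev1.Node objects via client-go.
type node struct {
	name   string
	labels map[string]string
}

// findNodeByHostnameOverride mimics RKE's lookup: find the node whose
// kubernetes.io/hostname label equals the configured hostnameOverride
// (--node-name). If kubelet no longer populates that label from
// hostname-override, the lookup fails with "node not found".
func findNodeByHostnameOverride(nodes []node, hostnameOverride string) (node, error) {
	for _, n := range nodes {
		if n.labels["kubernetes.io/hostname"] == hostnameOverride {
			return n, nil
		}
	}
	return node{}, fmt.Errorf("node %s not found", hostnameOverride)
}

func main() {
	nodes := []node{
		// the cloud provider named the node object by its private DNS, but
		// the label came from kubelet's hostname-override
		{name: "ip-10-0-0-1.ec2.internal", labels: map[string]string{"kubernetes.io/hostname": "kinara-ec2"}},
	}
	if n, err := findNodeByHostnameOverride(nodes, "kinara-ec2"); err == nil {
		fmt.Println("found:", n.name)
	}
	// with hostname-override stripped from kubelet args, the label would hold
	// the cloud-provider-assigned hostname instead and this lookup would fail:
	if _, err := findNodeByHostnameOverride(nodes, "some-other-name"); err != nil {
		fmt.Println(err)
	}
}
```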
Fix 1 - kubelet:

Revert the removal of `hostname-override` in `kubelet` args. RKE will continue finding the node by matching `hostnameOverride` (`--node-name`) to the `kubernetes.io/hostname` label on the node. The node name will still be set by the cloud provider; if it's not consistent with `hostnameOverride`, kubelet will log a mismatch warning, but this is expected behavior.

Issue 2 - kube-proxy:
RKE >= v1.3.5 sets `hostname-override` for kube-proxy to the private DNS of the node by querying the ec2 metadata service (https://github.com/rancher/rke-tools/blob/master/entrypoint.sh#L17). This was introduced because the `hostname-override` set for kube-proxy might not be the same as the name of the node object in Kubernetes when a cloud provider is enabled. For example, Rancher sets `requestedHostname` to `nodepool-*` for node driver provisioned clusters, but the name of the node object is the private DNS of the node. This fixed issues like rancher/rancher#30363 and rancher/rancher#22416.

But the node name is not always guaranteed to be the private DNS returned by ec2 metadata (e.g. when using custom DNS or when the hostname is set differently), which caused issues for other setups. EKS recently fixed this by setting it to `spec.nodeName`
(kubernetes/kubernetes#61486 (comment)), but this is not feasible for RKE.

Fix 2 - kube-proxy:
Make the behavior optional. RKE will now set the `hostname-override` for `kube-proxy` to the instance metadata hostname only if `useInstanceMetadataHostname` is true. It defaults to true for node driver clusters and false for custom clusters/RKE standalone. When it is false, users will have to set `hostname-override` correctly themselves.

Issue: rancher/rancher#40147
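For reference, a cluster.yml sketch of how the new option and the manual override might be configured. The field placement under `cloud_provider` and the example values are assumptions based on the option name; check the RKE documentation for the exact schema.

```yaml
cloud_provider:
  name: aws
  # only query ec2 instance metadata for kube-proxy's hostname-override when
  # enabled; defaults to true for node driver clusters, false for
  # custom clusters/RKE standalone
  useInstanceMetadataHostname: false

services:
  kubeproxy:
    extra_args:
      # with useInstanceMetadataHostname false, set this to match the
      # node object's name (illustrative value)
      hostname-override: ip-10-0-0-1.ec2.internal
```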