Populate Node.Status.Addresses with Hostname #25532
Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test". This message may repeat a few times in short succession due to jenkinsci/ghprb-plugin#292. Sorry. Otherwise, if this message is too spammy, please complain to ixdy. |
@@ -2867,6 +2867,7 @@ func (kl *Kubelet) syncNetworkStatus() {

// Set addresses for the node.
func (kl *Kubelet) setNodeAddress(node *api.Node) error {
	hostnameAddress := api.NodeAddress{Type: api.NodeHostName, Address: kl.GetHostname()}
Is this hostname always resolvable from the master?
No, not necessarily: // GetHostname Returns the hostname as the kubelet sees it.
Also, this can be overridden arbitrarily using --hostname-override on the kubelet. Maybe there are cases where this behaviour is undesired, e.g. the cloud provider sets a resolvable Nodename, but the Node's hostname is actually not resolvable by the master, yet it is preferred because the Addresses array is populated now. Contrived?
That's not contrived at all, imo
I think it would be helpful to map out all the combinations (with and without cloud provider, with and without hostname override, with and without the hostname being resolvable by the master), and show when the hostname override is honored and who is responsible for setting it. I also wonder if some cloud providers need to be updated to set this info if they are known to determine hostnames that won't resolve
@liggitt: Ok, trying to outline it:
$ git grep -A3 "func .* CurrentNodeName"
pkg/cloudprovider/providers/aws/aws.go:func (c *AWSCloud) CurrentNodeName(hostname string) (string, error) {
pkg/cloudprovider/providers/aws/aws.go- return c.selfAWSInstance.nodeName, nil
pkg/cloudprovider/providers/aws/aws.go-}
pkg/cloudprovider/providers/aws/aws.go-
--
pkg/cloudprovider/providers/fake/fake.go:func (f *FakeCloud) CurrentNodeName(hostname string) (string, error) {
pkg/cloudprovider/providers/fake/fake.go- return hostname, nil
pkg/cloudprovider/providers/fake/fake.go-}
pkg/cloudprovider/providers/fake/fake.go-
--
pkg/cloudprovider/providers/gce/gce.go:func (gce *GCECloud) CurrentNodeName(hostname string) (string, error) {
pkg/cloudprovider/providers/gce/gce.go- return hostname, nil
pkg/cloudprovider/providers/gce/gce.go-}
pkg/cloudprovider/providers/gce/gce.go-
--
pkg/cloudprovider/providers/mesos/mesos.go:func (c *MesosCloud) CurrentNodeName(hostname string) (string, error) {
pkg/cloudprovider/providers/mesos/mesos.go- return hostname, nil
pkg/cloudprovider/providers/mesos/mesos.go-}
pkg/cloudprovider/providers/mesos/mesos.go-
--
pkg/cloudprovider/providers/openstack/openstack.go:func (i *Instances) CurrentNodeName(hostname string) (string, error) {
pkg/cloudprovider/providers/openstack/openstack.go- return hostname, nil
pkg/cloudprovider/providers/openstack/openstack.go-}
pkg/cloudprovider/providers/openstack/openstack.go-
--
pkg/cloudprovider/providers/ovirt/ovirt.go:func (v *OVirtCloud) CurrentNodeName(hostname string) (string, error) {
pkg/cloudprovider/providers/ovirt/ovirt.go- return hostname, nil
pkg/cloudprovider/providers/ovirt/ovirt.go-}
pkg/cloudprovider/providers/ovirt/ovirt.go-
--
pkg/cloudprovider/providers/rackspace/rackspace.go:func (i *Instances) CurrentNodeName(hostname string) (string, error) {
pkg/cloudprovider/providers/rackspace/rackspace.go- // Beware when changing this, nodename == hostname assumption is crucial to
pkg/cloudprovider/providers/rackspace/rackspace.go- // apiserver => kubelet communication.
pkg/cloudprovider/providers/rackspace/rackspace.go- return hostname, nil
--
pkg/cloudprovider/providers/vsphere/vsphere.go:func (i *Instances) CurrentNodeName(hostname string) (string, error) {
pkg/cloudprovider/providers/vsphere/vsphere.go- return i.localInstanceID, nil
pkg/cloudprovider/providers/vsphere/vsphere.go-}
pkg/cloudprovider/providers/vsphere/vsphere.go-
Most map directly to hostname; VSphere and AWS don't.
- VSphere maps to VM name, not necessarily resolvable. Right now the VM name has to be configured to be a master-resolvable DNS name.
- AWS gets its hostname from the API, using instance.PrivateDnsName, which is equal to an EC2 instance hostname: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/aws/aws.go#L982
The current kubelet code will get a Node's hostname by executing uname -n, unless --hostname-override is set, in which case it will use that value.
In general I don't think this is wrong (actually it's the intended behavior IMO): It's better to require the Node's hostname to be master-resolvable than have a convention that the Nodename has to be resolvable.
However, this could introduce a breaking change for clusters set up under the assumption that Nodename matters and hostname doesn't. To make sure this doesn't happen, we can add the Type:Hostname address to the address array in the affected cloud providers' code already, and add it in the kubelet code only when it's not set by the cloud provider (actually this sanity check should be there anyway).
Edit: The latter is not feasible without extending the Kubelet object to pass down information on whether the hostname was overridden or not. I'd suggest leaving the address w/ Type:Hostname exclusively in the kubelet's domain, failing when the cloud provider sets it. The only breakage I can think of right now is VSphere clusters where the Hostname might not be resolvable, while the Nodename is.
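To make the combinations discussed above easier to follow, here is a minimal Go sketch of how the hostname and the node name can diverge. It is not the kubelet's code: `effectiveHostname`, `describeNode`, `hostnameAsNodeName`, and `fixedNodeName` are hypothetical names, `os.Hostname` stands in for `uname -n`, and the `currentNodeName` function type only imitates the shape of the providers' `CurrentNodeName` methods listed in the grep output.

```go
package main

import (
	"fmt"
	"os"
)

// currentNodeName imitates the signature of a cloud provider's
// CurrentNodeName method (hypothetical stand-in for this sketch).
type currentNodeName func(hostname string) (string, error)

// hostnameAsNodeName models providers that return the hostname unchanged
// (GCE, Mesos, OpenStack, oVirt, Rackspace in the listing above).
func hostnameAsNodeName(hostname string) (string, error) { return hostname, nil }

// fixedNodeName models providers that ignore the hostname and return their
// own identifier (AWS and VSphere in the listing above).
func fixedNodeName(name string) currentNodeName {
	return func(string) (string, error) { return name, nil }
}

// effectiveHostname returns the override if one is set, otherwise the
// hostname reported by the OS. os.Hostname is an assumption standing in
// for `uname -n`.
func effectiveHostname(override string) (string, error) {
	if override != "" {
		return override, nil
	}
	return os.Hostname()
}

// describeNode returns the hostname the kubelet would report and the node
// name the provider would assign for it.
func describeNode(override string, provider currentNodeName) (hostname, nodeName string, err error) {
	hostname, err = effectiveHostname(override)
	if err != nil {
		return "", "", err
	}
	nodeName, err = provider(hostname)
	if err != nil {
		return "", "", err
	}
	return hostname, nodeName, nil
}

func main() {
	// Provider that maps node name to hostname: both values agree.
	fmt.Println(describeNode("node-1.example.internal", hostnameAsNodeName))
	// Provider with its own naming: node name and hostname differ, so only
	// one of them may be resolvable from the master.
	fmt.Println(describeNode("node-1.example.internal", fixedNodeName("vm-42")))
}
```

The second call is the case the thread worries about: whichever of the two values the apiserver uses must be the one that actually resolves.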
ok to test
Getting e2e on this asap will be helpful, I think. I like the refactoring of the connection info into a class. I didn't know about NodeHostName either - that's much cleaner than looking at the IPs!
6874997 to ea05212
The author of this PR is not in the whitelist for merge, can one of the admins add the 'ok-to-merge' label? |
965c638 to 5e1c104
This is a fairly significant change (operationally, not in LOC). I propose we try to land it in 1.4. @mkulke you'll probably have to ping me once 1.4 opens to remind me.
I agree with that - the risk is very high.
5e1c104 to 50c0a1b
// If pod has not been assigned a host, return an empty location
return nil, nil, nil
}
nodeScheme, nodePort, nodeTransport, err := connInfo.GetConnectionInfo(ctx, nodeHost)
connectionInfo, err := connInfo.GetConnectionInfo(ctx, nodeName)
Aha! So the thing is that here connInfo is a REST object, not a "raw" HTTPKubeletClient object, because the one will convert the nodeName and the other will not. I'm going to verify that, but I propose changing REST to implement a different interface, given the behaviours are different (convert node name vs not).
I don't think it needs a different interface... the inputs are a nodeName, the outputs are the connection information. In the absence of this new address info on a node, the fallback will be to continue using the node name as the host.
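The fallback described here can be sketched in a few lines of Go. This is an illustration under assumptions, not the PR's implementation: `NodeAddress` is a simplified local type, `kubeletHost` is a hypothetical helper, and the preference order is a parameter because the thread has not settled whether Hostname or InternalIP should win.

```go
package main

import "fmt"

// NodeAddress is a simplified local stand-in for the API's node address type.
type NodeAddress struct {
	Type    string
	Address string
}

// kubeletHost returns the first non-empty address whose type appears in
// preferred (earlier entries win). If the node reports no matching address,
// it falls back to the node name, preserving the legacy behaviour.
func kubeletHost(nodeName string, addrs []NodeAddress, preferred []string) string {
	for _, wanted := range preferred {
		for _, a := range addrs {
			if a.Type == wanted && a.Address != "" {
				return a.Address
			}
		}
	}
	return nodeName
}

func main() {
	addrs := []NodeAddress{
		{Type: "InternalIP", Address: "10.0.0.5"},
		{Type: "Hostname", Address: "node-a.internal"},
	}
	// Node without any reported addresses: the node name is used.
	fmt.Println(kubeletHost("node-a", nil, []string{"Hostname"}))
	// Node reporting a Hostname address: that address is used.
	fmt.Println(kubeletHost("node-a", addrs, []string{"Hostname"}))
	// Alternative ordering that prefers the internal IP.
	fmt.Println(kubeletHost("node-a", addrs, []string{"InternalIP", "Hostname"}))
}
```

Because the inputs and outputs stay the same (node name in, host out), existing nodes that never report the new address keep working unchanged.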
I think this is great, and exactly the sort of thing that has made turn-up hard. (cc @mikedanese) It is high risk, but I think we are at the stage where we can implement it. But I think we should just use the same logic as we have in 2d85e4a in the pod provider. Arguably then this is actually a bug fix, in that it is inconsistent that the pods package resolves the kubelet differently from how the nodes package resolves it. That is arguable, but the concrete consequence would be that we should not have to change the kubelet/node registration at all, and that we should prefer the InternalIP. That is what InternalIP is for, I believe. If there is a provider for whom this is unworkable, we have plenty of time to discover it, and even add yet-another-address-type if that is needed.
}

func (c *REST) GetConnectionInfo(ctx api.Context, nodeName string) (*client.ConnectionInfo, error) {
	hostname, err := c.getKubeletHost(nodeName)
this is forcing two lookups of the node object from etcd... (once for the host, once for the port)... just fetch once
@liggitt: makes sense, I'm retrieving both in a single query now.
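A rough Go sketch of the single-lookup approach follows. All types here (`Node`, `ConnectionInfo`, `nodeGetter`, `countingStore`) are simplified, hypothetical stand-ins rather than the real registry types, and the in-memory store replaces etcd so the number of lookups can be counted.

```go
package main

import (
	"fmt"
	"strconv"
)

// NodeAddress and Node are simplified stand-ins for the API objects.
type NodeAddress struct{ Type, Address string }

type Node struct {
	Name        string
	Addresses   []NodeAddress
	KubeletPort int
}

// ConnectionInfo carries what the caller needs to reach the kubelet.
type ConnectionInfo struct {
	Hostname string
	Port     string
}

// nodeGetter abstracts the storage lookup (etcd in the real server).
type nodeGetter interface {
	GetNode(name string) (*Node, error)
}

// countingStore is an in-memory store that records how often it is queried.
type countingStore struct {
	nodes   map[string]*Node
	lookups int
}

func (s *countingStore) GetNode(name string) (*Node, error) {
	s.lookups++
	n, ok := s.nodes[name]
	if !ok {
		return nil, fmt.Errorf("node %q not found", name)
	}
	return n, nil
}

// getConnectionInfo fetches the node once and derives both host and port
// from that single object.
func getConnectionInfo(store nodeGetter, nodeName string, defaultPort int) (*ConnectionInfo, error) {
	node, err := store.GetNode(nodeName) // the only storage lookup
	if err != nil {
		return nil, err
	}
	host := node.Name // legacy fallback: node name as host
	for _, a := range node.Addresses {
		if a.Type == "Hostname" && a.Address != "" {
			host = a.Address
			break
		}
	}
	port := node.KubeletPort
	if port <= 0 {
		port = defaultPort
	}
	return &ConnectionInfo{Hostname: host, Port: strconv.Itoa(port)}, nil
}

func main() {
	store := &countingStore{nodes: map[string]*Node{
		"node-a": {Name: "node-a", Addresses: []NodeAddress{{Type: "Hostname", Address: "node-a.internal"}}},
	}}
	info, err := getConnectionInfo(store, "node-a", 10250)
	fmt.Println(info, err, store.lookups)
}
```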
This PR records the kubelet hostname (autodetected or manually specified with Tagging @kubernetes/api-review-team / @kubernetes/kube-api for approval. Previous discussion at #33718 (comment). This PR and #35497 are needed by 1.5 to preserve the ability to use a DNS name for master->node communications (typically for SSL cert purposes) |
Include the hostname address in the nodeIP-filtered list, and needs sign-off from @kubernetes/api-review-team / @kubernetes/kube-api , LGTM otherwise
@mkulke, can you make the last couple of changes? I would like to get this merged early this week, ahead of code freeze.
This looks reasonable to me - breaking backwards compatibility with node name lookups in existing clusters is deeply dangerous, so this appears to correctly preserve legacy behavior. |
0cfba15 to b7880e7
@liggitt: ok, i addressed the review comments, squashed and rebased. |
Jenkins GCE Node e2e failed for commit b7880e7. Full PR test history. The magic incantation to run this job again is |
Jenkins unit/integration failed for commit b7880e7. Full PR test history. The magic incantation to run this job again is |
LGTM. Needs a nod from @kubernetes/api-review-team for the use of the |
ref #9267 |
@liggitt I'm fine with this, and happy if it gets us closer to moving away from resolving node resource names. Please document the required level of resolvability of the hostname (e.g., must be resolvable by the apiserver). However, I haven't thought about what more we could do to determine whether it might break anyone, or all the different ways it might interact with Kubelet configuration. At minimum, this needs a much more detailed release note than just the title of this PR. Something to think about for the future: At some point we discussed adding info about which form of the address to use to verify node identity. |
Updated release-note with description of source and expectations for the address. @mkulke can you open a doc PR as well for this page: http://kubernetes.io/docs/admin/node/#node-addresses |
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge] |
Automatic merge from submit-queue |
This PR is supposed to address #22063
Currently NodeName has to be a resolvable DNS address on the master to allow apiserver -> kubelet communication (exec, log, port-forward operations on a pod). In some situations this is unfortunate (see the discussions on the issue).
The PR aims to do the following:
Type: Hostname in the Node.Status.Addresses array; the type is already defined, but was not used so far.