
Cloud Controller Manager doesn't query cloud provider for node name, causing the node to be removed #70897

Open
yifan-gu opened this issue Nov 10, 2018 · 40 comments
Labels
area/cloudprovider area/provider/aws Issues or PRs related to aws provider area/provider/openstack Issues or PRs related to openstack provider kind/bug Categorizes issue or PR as related to a bug. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider.

Comments

@yifan-gu
Contributor

yifan-gu commented Nov 10, 2018

What happened:

  • Launch a node with Container Linux CoreOS-stable-1911.3.0 on AWS, with customized ignition configs.
  • The hostname turns out to be ip-10-3-18-1 instead of the full private DNS name ip-10-3-18-1.us-west-1.compute.internal, because /etc/hostname is not set.
  • kubelet starts with --cloud-provider=external and skips this code path, so it is not able to set the private DNS name as the node name.
  • When the CCM starts, it tries to read the node name from the node spec.
  • The CCM calls GetInstanceProviderID() and fails.
  • The CCM calls getNodeAddressesByProviderIDOrName(), which also fails.
  • The CCM then removes the node.

What you expected to happen:

Now that the kubelet runs with --cloud-provider=external, nothing executes that code path to query the cloud provider for the node name anymore.
However, something still needs to execute that code path so the node gets its correct node name from the cloud provider.
I think the CCM might need to query the cloud provider for the full node hostname in case the hostname reported by the kubelet is not the full hostname (as in the AWS case).

How to reproduce it (as minimally and precisely as possible):
Launch a Container Linux instance with a non-empty ignition config in the user data; the hostname will then not be the full private DNS name.
Then launch kubelet with --cloud-provider=external, and the CCM will reproduce the issue described above.
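
A quick way to confirm the mismatch on the affected node (a sketch, assuming the EC2 instance metadata service is reachable at the usual address):

```
# Short hostname, because /etc/hostname was not populated
hostname
# => ip-10-3-18-1

# Full private DNS name, which is what the in-tree AWS provider would have used as the node name
curl -s http://169.254.169.254/latest/meta-data/hostname
# => ip-10-3-18-1.us-west-1.compute.internal
```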

Anything else we need to know?:

Environment:

  • Kubernetes version:
    Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:08:19Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration:
    AWS
  • OS:
    NAME="Container Linux by CoreOS"
    ID=coreos
    VERSION=1911.3.0
    VERSION_ID=1911.3.0
    BUILD_ID=2018-11-05-1815
    PRETTY_NAME="Container Linux by CoreOS 1911.3.0 (Rhyolite)"
    ANSI_COLOR="38;5;75"
    HOME_URL="https://coreos.com/"
    BUG_REPORT_URL="https://issues.coreos.com"
    COREOS_BOARD="amd64-usr"
  • Kernel:
    Linux ip-10-3-20-13 4.14.78-coreos #1 SMP Mon Nov 5 17:42:07 UTC 2018 x86_64 Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz GenuineIntel GNU/Linux
  • Install tools:
    Internal k8s installer tool based on terraform
  • Others:

This issue can be mitigated by having the ignition config set /etc/hostname to the private DNS name (obtained via curl http://169.254.169.254/latest/meta-data/hostname), or by using the coreos-metadata service.
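
For illustration, a minimal sketch of that workaround (run before kubelet starts; the exact ignition/systemd wiring is up to the installer):

```
# Sketch only: set the hostname to the EC2 private DNS name so the node name matches
# what the in-tree AWS provider would have reported.
hostnamectl set-hostname "$(curl -s http://169.254.169.254/latest/meta-data/hostname)"
```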

/cc @Quentin-M

@kubernetes/sig-aws-misc
@andrewsykim

/kind bug

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Nov 10, 2018
@k8s-ci-robot
Contributor

@yifan-gu: There are no sig labels on this issue. Please add a sig label by either:

  1. mentioning a sig: @kubernetes/sig-<group-name>-<group-suffix>
    e.g., @kubernetes/sig-contributor-experience-<group-suffix> to notify the contributor experience sig, OR

  2. specifying the label manually: /sig <group-name>
    e.g., /sig scalability to apply the sig/scalability label

Note: Method 1 will trigger an email to the group. See the group list.
The <group-suffix> in method 1 has to be replaced with one of these: bugs, feature-requests, pr-reviews, test-failures, proposals.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Nov 10, 2018
@yifan-gu yifan-gu added sig/aws and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 10, 2018
@liggitt liggitt added sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. area/cloudprovider labels Nov 11, 2018
@soggiest

+1 I've also observed this happening if the hostname isn't set properly.

@yifan-gu
Contributor Author

yifan-gu commented Dec 4, 2018

Anyone looking into this?

@andrewsykim
Member

Hi @yifan-gu. It's expected that the CCM is able to find a node either by its name or by its provider ID. Unfortunately, it'll be hard to break this assumption. If the hostname does not match the node name, then we have to expect the user to override the hostname with --hostname-override or use --provider-id.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 4, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 3, 2019
@frittentheke

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Apr 4, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 3, 2019
@frittentheke

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 5, 2019
@andrewsykim
Member

Looks like you will need to set --provider-id on the kubelet for cases like this?

cc @cheftako @mcrute

@k8s-ci-robot k8s-ci-robot added area/provider/aws Issues or PRs related to aws provider and removed sig/aws labels Aug 6, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 4, 2019
@cheftako
Member

cheftako commented Nov 4, 2019

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 4, 2019
@cheftako
Member

cheftako commented Nov 4, 2019

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Nov 4, 2019
@ehashman
Member

This remains an issue on the latest k8s. The updated code path in kubelet:

```go
if cloud == nil {
	return types.NodeName(hostname), nil
}
```

I think this also poses a problem for the out-of-tree migration. It seems to me that e.g. the OpenStack legacy cloud provider doesn't guarantee that the node name and hostname match:

```go
// CurrentNodeName implements Instances.CurrentNodeName
// Note this is *not* necessarily the same as hostname.
func (i *Instances) CurrentNodeName(ctx context.Context, hostname string) (types.NodeName, error) {
	md, err := getMetadata(i.opts.SearchOrder)
	if err != nil {
		return "", err
	}
	return types.NodeName(md.Name), nil
}
```

So, if one tries to upgrade a node from --cloud-provider=openstack to --cloud-provider=external, the old node name (not necessarily the hostname) may not match the new node name (the hostname), which will cause the kubelet to be unable to find its node.

From openshift/machine-config-operator#2401 (comment) it sounds like this also affects VMware and AWS depending on configuration. (perhaps @nckturner can confirm impact on AWS)

Wanted to check if this is on the external cloud provider migration radar and if there is a blessed migration path?

@mdbooth
Contributor

mdbooth commented Jul 16, 2021

This issue affects both the AWS and OpenStack in-tree cloud providers, both of which return instance.Name in CurrentNodeName(), which is not necessarily the same as kubelet's default of the FQDN hostname.

We can work around this by setting the hostname to whatever the in-tree cloud provider previously returned. However, this feels like a kludge because:

  • We're changing the hostname of an existing host, which may have other consequences, some of which may be beyond our ability to predict if they involve, e.g., third-party drivers.
  • It makes AWS and OpenStack configuration snowflakes.
  • It perpetuates the problem beyond the use of the in-tree provider which originally caused it.
  • We're not solving the problem, just deferring it indefinitely.

An idea I've seen is to request it from the CCM. However, that may have a bootstrapping problem: kubelet communicates with the CCM via the API, so it would have to identify itself somehow in order to retrieve the correct information.

I wonder if a simpler idea might be to persist kubelet's node name as local state, e.g. in /etc/kubernetes/node. We could initialise that value appropriately if it doesn't exist, then pass its contents to kubelet via --hostname-override subsequently. We could ensure for upgrade that it is initialised with the current node name for existing nodes, but for all future nodes it is initialised with hostname. This would allow us to eventually remove the upgrade logic, and would also insulate us from cloud configuration changes which affect hostname (although nobody should be doing that anyway).

Or, even simpler, launch kubelet with --hostname-override iff /etc/kubernetes/node exists. Then we can create this file only on AWS/OpenStack nodes which previously ran the in-tree cloud provider.
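
A rough sketch of that wrapper logic, assuming /etc/kubernetes/node is the hypothetical local state file holding the node name the kubelet previously registered with:

```
# Sketch only: pass --hostname-override iff the (hypothetical) state file exists.
HOSTNAME_OVERRIDE=""
if [ -f /etc/kubernetes/node ]; then
  HOSTNAME_OVERRIDE="--hostname-override=$(cat /etc/kubernetes/node)"
fi
exec kubelet --cloud-provider=external ${HOSTNAME_OVERRIDE} "$@"
```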

@ehashman
Member

/area provider/openstack

Please add additional ones as affected :)

@k8s-ci-robot k8s-ci-robot added the area/provider/openstack Issues or PRs related to openstack provider label Jul 16, 2021
@mdbooth
Contributor

mdbooth commented Jul 16, 2021

/area provider/aws

@andrewsykim
Member

IMO the right approach is for all kubelets to set the --provider-id flag with the well-known provider ID format of the cloud provider. This way we never have to defer to the name and can rather check the unique ID of instances for existence. However, we can't always assume that users will set that flag and I would agree it's ideally something we can build into the system to do right.

@mdbooth
Contributor

mdbooth commented Jul 16, 2021

IMO the right approach is for all kubelets to set the --provider-id flag with the well-known provider ID format of the cloud provider. This way we never have to defer to the name and can rather check the unique ID of instances for existence. However, we can't always assume that users will set that flag and I would agree it's ideally something we can build into the system to do right.

Would you use the provider id as the node name? If not, how would you use it to resolve the node name?

@andrewsykim
Member

Would you use the provider id as the node name? If not, how would you use it to resolve the node name?

The provider ID maps to the spec.providerID field on the node object. You can construct it using the well-known format based on the provider, typically something like aws://<instance-id> or similar.
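
For illustration only, a sketch of how a node bootstrap script might derive and pass this flag on AWS (assuming the legacy aws:///<availability-zone>/<instance-id> format and the usual instance metadata endpoints):

```
# Sketch only: derive the provider ID from instance metadata and pass it to kubelet.
AZ="$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)"
INSTANCE_ID="$(curl -s http://169.254.169.254/latest/meta-data/instance-id)"
exec kubelet --cloud-provider=external --provider-id="aws:///${AZ}/${INSTANCE_ID}" "$@"
```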

@nckturner
Contributor

@andrewsykim does this mean that setting --provider-id is required before upgrade? And if a node's hostname doesn't match its node name, will kubelets that don't set --provider-id fail to start after their control plane has been upgraded to the external CCM? To me that seems like breaking backwards-compatibility guarantees, so if we can avoid it we should try to.

@nckturner
Contributor

However, we can't always assume that users will set that flag and I would agree it's ideally something we can build into the system to do right.

Nvm, as you mentioned above, users may or may not have this set. I am interested in this suggestion by @mdbooth which sounds like it could get us through the upgrade without breaking existing kubelets:

I wonder if a simpler idea might be to persist kubelet's node name as local state, e.g. in /etc/kubernetes/node.

@andrewsykim
Member

andrewsykim commented Jul 16, 2021

Just to be clear, this was just for addressing the delete case; existing kubelets should not need --provider-id, and even new kubelets do not need it. However, having --provider-id set allows us to handle the case where the node name is different from the instance name. Technically speaking, if we're just talking about existing kubelets that were using the in-tree provider, they should have providerID already set, so you wouldn't need to set it for them.
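
One quick way to verify that on an existing node (example only; the node name and output value are placeholders):

```
# Check whether the provider ID is already populated on an existing node object
kubectl get node <node-name> -o jsonpath='{.spec.providerID}'
# e.g. aws:///us-west-1a/i-0123456789abcdef0
```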

@nckturner
Contributor

nckturner commented Jul 16, 2021

So the above issue makes it sound like existing nodes will be deleted upon upgrade:

CCM then removes the node.

Am I misunderstanding when this might occur? I read it as: on upgrade, if the hostname does not match the node name.

Edit: ok, the existing kubelet case makes sense.

@ehashman
Member

Nvm, as you mentioned above, users may or may not have this set. I am interested in this suggestion by @mdbooth which sounds like it could get us through the upgrade without breaking existing kubelets:

I wonder if a simpler idea might be to persist kubelet's node name as local state, e.g. in /etc/kubernetes/node.

This seems error-prone to me and I'd prefer we not add hacks in the kubelet to work around a cloud-provider issue.

@andrewsykim
Member

Since it's common in AWS for instances to be named after the private DNS name, I feel like the AWS CCM should just be updated to query nodes both by instance name and by private DNS name. Would that solve this issue?

@ehashman
Member

Since it's common in AWS for instances to be named after the private DNS name, I feel like the AWS CCM should just be updated to query nodes both by instance name and by private DNS name. Would that solve this issue?

That might solve it for AWS but this doesn't just affect AWS; see the examples above involving OpenStack. I am not sure if other cloud providers are also affected.

@mdbooth
Contributor

mdbooth commented Jul 19, 2021

Since it's common in AWS for instances to be named after the private DNS name, I feel like the AWS CCM should just be updated to query nodes both by instance name and by private DNS name. Would that solve this issue?

I think we're mixing up the delete case you mentioned above. It sounds like this new issue has hijacked a similar but different issue. Sorry!

In the in-tree -> external CCM case, the external CCM is not involved in the bug at all, so unfortunately it's not useful for fixing anything. The issue is in the difference in behaviour between:

  • The in-tree cloud provider
  • The default behaviour of kubelet when there is no in-tree cloud provider

When kubelet does not have an in-tree cloud provider, its default behaviour is to assume that its Node object's name is the FQDN hostname of the host. This is not true if kubelet was previously using the AWS or OpenStack in-tree provider, which provided unqualified names. The result is that kubelet cannot find its Node object and does not start cleanly. For example, because kubelet stops pinging the Node object it becomes stale and the Node is marked NotReady. Static pods are not started (specifically the local coredns; that was a deep and fruitless rabbit hole). Probably other things I didn't get to. It's generally quite sad in ways that the CCM can't fix.

@cheftako
Member

/cc @leilajal

@andrewsykim
Member

When kubelet does not have an in-tree cloud provider, its default behaviour is to assume that its Node object's name is the FQDN hostname of the host. This is not true if kubelet was previously using the AWS or OpenStack in-tree provider, which provided unqualified names. The result is that kubelet cannot find its Node object and does not start cleanly. For example, because kubelet stops pinging the Node object it becomes stale and the Node is marked NotReady. Static pods are not started (specifically the local coredns; that was a deep and fruitless rabbit hole). Probably other things I didn't get to. It's generally quite sad in ways that the CCM can't fix.

Yes, this is an unfortunate result of the default naming convention when a kubelet is providerless and when it's using the AWS or OpenStack cloud provider. And unfortunately, you can't override hostname on AWS, see #54482.

I think this use-case is not common enough and the solution would be sufficiently complex that we wouldn't try to fix this, but open to hearing other suggestions / alternatives.

@mdbooth
Contributor

mdbooth commented Jul 20, 2021

When kubelet does not have an in-tree cloud provider, its default behaviour is to assume that its Node object's name is the FQDN hostname of the host. This is not true if kubelet was previously using the AWS or OpenStack in-tree provider, which provided unqualified names. The result is that kubelet cannot find its Node object and does not start cleanly. For example, because kubelet stops pinging the Node object it becomes stale and the Node is marked NotReady. Static pods are not started (specifically the local coredns; that was a deep and fruitless rabbit hole). Probably other things I didn't get to. It's generally quite sad in ways that the CCM can't fix.

Yes, this is an unfortunate result of the default naming convention when a kubelet is providerless and when it's using the AWS or OpenStack cloud provider. And unfortunately, you can't override hostname on AWS, see #54482.

I think this use-case is not common enough and the solution would be sufficiently complex that we wouldn't try to fix this, but open to hearing other suggestions / alternatives.

Herein lies the problem. When upgrading from in-tree to CCM, which we plan to do some time soon, it will affect 100% of AWS/OpenStack users.

@andrewsykim
Member

andrewsykim commented Jul 20, 2021

Herein lies the problem. When upgrading from in-tree to CCM, which we plan to do some time soon, it will affect 100% of AWS/OpenStack users.

Worth clarifying that I was specifically referring to when kubelet goes from --cloud-provider=aws|openstack to no cloud provider at all. This issue does not apply when you go from --cloud-provider=aws|openstack to --cloud-provider=external because the CCM should know how to discover nodes using varying names (like hostname or private DNS)

@mdbooth
Contributor

mdbooth commented Jul 20, 2021

Herein lies the problem. When upgrading from in-tree to CCM, which we plan to do some time soon, it will affect 100% of AWS/OpenStack users.

Worth clarifying that I was specifically referring to when kubelet goes from --cloud-provider=aws|openstack to no cloud provider at all. This issue does not apply when you go from --cloud-provider=aws|openstack to --cloud-provider=external because the CCM should know how to discover nodes using varying names (like hostname or private DNS)

To the best of my understanding, --cloud-provider=external and no cloud provider at all are the same to kubelet. It's the same code path. The CCM is not involved at all here. This is purely a kubelet problem. Kubelet doesn't call out to a CCM in the same way that it does, for example, to CNI or CSI. The CCM isn't running on the local node, and all communication is indirect via API objects. The CCM can't help kubelet find its node object.

@andrewsykim
Member

The CCM can't help kubelet find its node object.

The CCM can't help kubelet find the node object, but it can give kubelet the flexibility to use different naming formats as supported by the cloud provider. So in the AWS case, if you switch from --cloud-provider=aws to --cloud-provider=external and you want to preserve the existing node object, you can set --hostname-override=<private-dns> with --cloud-provider=external.
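
A hedged sketch of what that could look like on an AWS node, assuming the private DNS name is read from the usual instance metadata endpoint:

```
# Sketch only: preserve the old (private-DNS-based) node name when moving to the external CCM.
PRIVATE_DNS="$(curl -s http://169.254.169.254/latest/meta-data/local-hostname)"
exec kubelet --cloud-provider=external --hostname-override="${PRIVATE_DNS}" "$@"
```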

@mdbooth
Contributor

mdbooth commented Jul 20, 2021

The CCM can't help kubelet find its node object.

The CCM can't help kubelet find the node object, but it can give kubelet the flexibility to use different naming formats as supported by the cloud provider. So in the AWS case, if you switch from --cloud-provider=aws to --cloud-provider=external and you want to preserve the existing node object, you can set --hostname-override=<private-dns> with --cloud-provider=external.

Excellent! I was thinking we could update the script which launches kubelet to look for a local file, e.g. /etc/kubernetes/node, and pass the contents of that file in --hostname-override if it exists. My hope is that we can have, e.g. the CCM operator add a hook... somewhere tbd... to create this file if required with cloud-specific contents. Do you think that would fly?

@andrewsykim
Member

andrewsykim commented Jul 20, 2021

Excellent! I was thinking we could update the script which launches kubelet to look for a local file, e.g. /etc/kubernetes/node, and pass the contents of that file in --hostname-override if it exists.

I think this is up to the cluster admin / operator to figure out, since this pertains to how kubelet eventually ends up running on a given node. The kubelet currently has the --hostname-override flag to account for various scenarios per cloud provider; it's just a matter of setting this flag to the correct value.

@mdbooth
Contributor

mdbooth commented Jul 27, 2021

@andrewsykim Sorry for confusing these 2 issues earlier, btw. Looks like we're going to go with a solution based on --hostname-override.

@k8s-triage-robot

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

@k8s-ci-robot k8s-ci-robot removed the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Feb 8, 2023
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Feb 8, 2023