[CCM] Cloud node controller will not remove nodes that no longer exists for cloud providers that require ProviderID #50985

Closed
jhorwit2 opened this issue Aug 20, 2017 · 8 comments · Fixed by #51087
Assignees
Labels
area/cloudprovider kind/bug Categorizes issue or PR as related to a bug. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.

Comments

@jhorwit2
Contributor

jhorwit2 commented Aug 20, 2017

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:

@wlan0 mentioned here that:

Secondly, there are some clouds whose providerID cannot be inferred from a remote location even within the cloud (e.g. openstack). This is why we wanted to have the ability to set a unique id while starting the kubelet.

Currently, MonitorNode() in the node controller checks with the CCM whether a node still exists only by calling ExternalID(nodeName). ExternalID is supposed to return the provider ID of a node given its node name, which is not supported on every cloud. This means that any cloud that cannot infer the provider ID from the node name from a remote location will never remove nodes that no longer exist.
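
To make the failure mode concrete, here is a minimal, self-contained sketch. It is not the controller's actual code: `nameLookup`, `opaqueCloud`, `emptyCloud`, `shouldDeleteNode`, and `errInstanceNotFound` are all hypothetical names, and the sketch assumes the controller deletes a node only when the name-based lookup reports a "not found" error.

```go
package main

import (
	"errors"
	"fmt"
)

// errInstanceNotFound is a hypothetical stand-in for a provider's
// "instance not found" sentinel error.
var errInstanceNotFound = errors.New("instance not found")

// nameLookup is a hypothetical, trimmed-down view of a name-based lookup.
type nameLookup interface {
	ExternalID(nodeName string) (string, error)
}

// opaqueCloud models a provider that cannot map a node name to an
// instance from a remote location, so every lookup fails generically.
type opaqueCloud struct{}

func (opaqueCloud) ExternalID(nodeName string) (string, error) {
	return "", errors.New("lookup by node name is not supported")
}

// emptyCloud models a provider that supports name lookups and reports
// that the instance is gone.
type emptyCloud struct{}

func (emptyCloud) ExternalID(nodeName string) (string, error) {
	return "", errInstanceNotFound
}

// shouldDeleteNode assumes the node is removed only when the lookup
// reports "not found"; any other error leaves the node in place.
func shouldDeleteNode(c nameLookup, nodeName string) bool {
	_, err := c.ExternalID(nodeName)
	return errors.Is(err, errInstanceNotFound)
}

func main() {
	fmt.Println(shouldDeleteNode(emptyCloud{}, "node-a"))  // true
	fmt.Println(shouldDeleteNode(opaqueCloud{}, "node-a")) // false
}
```

With the second provider the check can never return true, so a node whose instance is gone would stay registered indefinitely.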

What you expected to happen:

I'd expect the cloud node controller to ask the CCM if the instance still exists by node name and provider id just like the other methods in the cloud provider interface.
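
A rough sketch of what that could look like on the controller side, assuming the node object already carries a provider ID when the kubelet was started with one. Everything here (`node`, `existenceChecker`, `mapCloud`, `instanceExists`) is hypothetical and only illustrates the "prefer provider ID, fall back to node name" idea.

```go
package main

import "fmt"

// node carries only the two fields this sketch needs (hypothetical type).
type node struct {
	Name       string
	ProviderID string
}

// existenceChecker is a hypothetical pair of lookups, one per identifier.
type existenceChecker interface {
	ExistsByName(name string) (bool, error)
	ExistsByProviderID(providerID string) (bool, error)
}

// mapCloud is an in-memory fake keyed by node name and by provider ID.
type mapCloud struct {
	names       map[string]bool
	providerIDs map[string]bool
}

func (m mapCloud) ExistsByName(name string) (bool, error) {
	return m.names[name], nil
}

func (m mapCloud) ExistsByProviderID(providerID string) (bool, error) {
	return m.providerIDs[providerID], nil
}

// instanceExists prefers the provider ID when the node has one and
// falls back to the node name otherwise.
func instanceExists(c existenceChecker, n node) (bool, error) {
	if n.ProviderID != "" {
		return c.ExistsByProviderID(n.ProviderID)
	}
	return c.ExistsByName(n.Name)
}

func main() {
	cloud := mapCloud{
		names:       map[string]bool{"node-a": true},
		providerIDs: map[string]bool{"fake://instance-1": true},
	}
	alive, _ := instanceExists(cloud, node{Name: "node-b", ProviderID: "fake://instance-1"})
	gone, _ := instanceExists(cloud, node{Name: "node-c", ProviderID: "fake://instance-2"})
	byName, _ := instanceExists(cloud, node{Name: "node-a"})
	fmt.Println(alive, gone, byName) // true false true
}
```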

Other

It seems weird to me that we check if a node exists by calling ExternalID(nodeName). I feel we should add two more methods to the Instances interface for this purpose like:

  • Exists(nodeName types.NodeName) (bool, error)
  • ExistsByProviderID(providerID string) (bool, error)
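
As a sketch of the shape I have in mind (not a final signature): `NodeName` below is a local stand-in for types.NodeName so the example compiles on its own, and `InstanceChecker` and `fakeCloud` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// NodeName is a local stand-in for types.NodeName so the sketch is
// self-contained.
type NodeName string

// InstanceChecker holds just the two proposed methods (hypothetical name).
type InstanceChecker interface {
	Exists(nodeName NodeName) (bool, error)
	ExistsByProviderID(providerID string) (bool, error)
}

// fakeCloud is an in-memory provider used only to exercise the interface.
type fakeCloud struct {
	byName       map[NodeName]bool
	byProviderID map[string]bool
}

func (f *fakeCloud) Exists(nodeName NodeName) (bool, error) {
	return f.byName[nodeName], nil
}

func (f *fakeCloud) ExistsByProviderID(providerID string) (bool, error) {
	return f.byProviderID[providerID], nil
}

// Compile-time check that fakeCloud satisfies the interface.
var _ InstanceChecker = (*fakeCloud)(nil)

func main() {
	var c InstanceChecker = &fakeCloud{
		byName:       map[NodeName]bool{"node-a": true},
		byProviderID: map[string]bool{"fake://instance-1": true},
	}
	a, _ := c.Exists("node-a")
	b, _ := c.ExistsByProviderID("fake://missing")
	fmt.Println(a, b) // true false
}
```

A plain bool return lets a provider say "gone" without the caller having to match on a specific error value.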

I see in some places that ExternalID has been deprecated; however, I can't find anywhere that says what it was deprecated in favor of. Sorry if this is a duplicate of an existing issue/proposal.

cc @wlan0 @andrewsykim @luxas

related: #48690

@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Aug 20, 2017
@k8s-github-robot k8s-github-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Aug 20, 2017
@k8s-github-robot

@jhorwit2
There are no sig labels on this issue. Please add a sig label by:

  1. mentioning a sig: @kubernetes/sig-<group-name>-<group-suffix>
    e.g., @kubernetes/sig-contributor-experience-<group-suffix> to notify the contributor experience sig, OR

  2. specifying the label manually: /sig <label>
    e.g., /sig scalability to apply the sig/scalability label

Note: Method 1 will trigger an email to the group. You can find the group list here and label list here.
The <group-suffix> in the method 1 has to be replaced with one of these: bugs, feature-requests, pr-reviews, test-failures, proposals

@jhorwit2
Contributor Author

/sig cluster-lifecycle

@k8s-ci-robot k8s-ci-robot added the sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. label Aug 20, 2017
@k8s-github-robot k8s-github-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Aug 20, 2017
@jhorwit2 jhorwit2 changed the title Cloud node controller will not remove nodes that no longer exists for cloud providers that require ProviderID [CCM] Cloud node controller will not remove nodes that no longer exists for cloud providers that require ProviderID Aug 20, 2017
@jhorwit2
Contributor Author

/area cloudprovider

@FengyunPan

add two more methods to the Instances interface for this purpose like:
Exists(nodeName types.NodeName) (bool, error)
ExistsByProviderID(providerID string) (bool, error)

@jhorwit2 I agree. Currently vSphere calls Exists() to check whether a node exists, but other cloud providers call ExternalID(nodeName), which is deprecated.

@wlan0
Member

wlan0 commented Aug 21, 2017

@jhorwit2 Thanks for finding this bug and filing this issue :) I'll be happy to review it if you make a PR

@wlan0
Member

wlan0 commented Aug 21, 2017

/assign wlan0

@jhorwit2
Contributor Author

@wlan0 I'll start work on the PR. 👍

@luxas
Member

luxas commented Aug 22, 2017

cc @thockin ^

k8s-github-robot pushed a commit that referenced this issue Aug 26, 2017
…e-exists

Automatic merge from submit-queue (batch tested with PRs 51174, 51363, 51087, 51382, 51388)

Add InstanceExistsByProviderID to cloud provider interface for CCM

**What this PR does / why we need it**:

Currently, [`MonitorNode()`](https://github.com/kubernetes/kubernetes/blob/02b520f0a40be2056d91fc0661c2b4fdb2694c30/pkg/controller/cloud/nodecontroller.go#L240) in the node controller checks with the CCM whether a node still exists by calling `ExternalID(nodeName)`. `ExternalID` is supposed to return the provider ID of a node, which is not supported on every cloud. This means that any cloud that cannot infer the provider ID from the node name from a remote location will never remove nodes that no longer exist.


**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes #50985

**Special notes for your reviewer**:

We'll want to create a subsequent issue to track the implementation of these two new methods in the cloud providers.

**Release note**:

```release-note
Adds `InstanceExists` and `InstanceExistsByProviderID` to cloud provider interface for the cloud controller manager
```

/cc @wlan0 @thockin @andrewsykim @luxas @jhorwit2

/area cloudprovider
/sig cluster-lifecycle
dims pushed a commit to dims/kubernetes that referenced this issue Feb 8, 2018
…cm-instance-exists


6 participants