Ensure a provider ID is set on a node if expected #87043

Merged

@zjs
Contributor

zjs commented Jan 9, 2020

What type of PR is this?
/kind bug

What this PR does / why we need it:

A transient issue can cause InstanceID() to return an error. When this error is ignored, the external cloud provider taint is removed and neither AddCloudNode() nor UpdateCloudNode() will try to set a provider ID in the future.

By returning the error we can ensure that the external cloud provider taint is not removed prematurely, allowing the operation to be retried (until the provider ID can be set).

Preserve support for external cloud providers that do not use IDs by continuing if a NotImplemented error is returned, making a distinction between lack of support for provider IDs and an actual error.

Introduce a pair of unit tests that show a provider ID will eventually be set if an error is returned, unless that error is NotImplemented, in which case the external cloud provider taint will be removed.

Which issue(s) this PR fixes:

Special notes for your reviewer:
I considered de-duplicating common logic between the unit tests I added and other tests in the file, but opted not to, as changes to existing tests may make back-porting this change more difficult.

Does this PR introduce a user-facing change?:

Fixed a bug which could prevent a provider ID from ever being set for a node if an error occurred determining the provider ID when the node was added.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot

Contributor

k8s-ci-robot commented Jan 9, 2020

Welcome @zjs!

It looks like this is your first PR to kubernetes/kubernetes 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kubernetes has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot

Contributor

k8s-ci-robot commented Jan 9, 2020

Hi @zjs. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@zjs

Contributor Author

zjs commented Jan 9, 2020

/sig cloudprovider
/assign andrewsykim

@k8s-ci-robot

Contributor

k8s-ci-robot commented Jan 9, 2020

@zjs: The label(s) sig/cloudprovider cannot be applied, because the repository doesn't have them

In response to this:

/sig cloudprovider
/assign andrewsykim


@andrewsykim

Member

andrewsykim commented Jan 10, 2020

/ok-to-test

zjs force-pushed the zjs:topic/propagate-providerid-errors branch from da09f99 to d25d7ab on Jan 10, 2020
Member

andrewsykim left a comment

Minor comments, LGTM otherwise

/approve

pkg/controller/cloud/node_controller.go (outdated, resolved)
pkg/controller/cloud/node_controller.go (outdated, resolved)
@k8s-ci-robot

Contributor

k8s-ci-robot commented Jan 10, 2020

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andrewsykim, zjs

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

zjs force-pushed the zjs:topic/propagate-providerid-errors branch from d25d7ab to 2b55407 on Jan 10, 2020
@andrewsykim

Member

andrewsykim commented Jan 10, 2020

/lgtm

Thanks @zjs!

@k8s-ci-robot k8s-ci-robot added the lgtm label Jan 10, 2020
@k8s-ci-robot k8s-ci-robot merged commit 240782c into kubernetes:master Jan 10, 2020
16 checks passed
cla/linuxfoundation zjs authorized
pull-kubernetes-bazel-build Job succeeded.
pull-kubernetes-bazel-test Job succeeded.
pull-kubernetes-dependencies Job succeeded.
pull-kubernetes-e2e-gce Job succeeded.
pull-kubernetes-e2e-gce-100-performance Job succeeded.
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
pull-kubernetes-e2e-kind Job succeeded.
pull-kubernetes-e2e-kind-ipv6 Job succeeded.
pull-kubernetes-integration Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
pull-kubernetes-node-e2e Job succeeded.
pull-kubernetes-node-e2e-containerd Job succeeded.
pull-kubernetes-typecheck Job succeeded.
pull-kubernetes-verify Job succeeded.
tide In merge pool.
@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone Jan 10, 2020
k8s-ci-robot added a commit that referenced this pull request Jan 31, 2020
…87043-origin-release-1.17

Automated cherry pick of #87043: Ensure a provider ID is set on a node if expected
k8s-ci-robot added a commit that referenced this pull request Jan 31, 2020
…87043-origin-release-1.16

Automated cherry pick of #87043: Ensure a provider ID is set on a node if expected
k8s-ci-robot added a commit that referenced this pull request Jan 31, 2020
…87043-origin-release-1.15

Automated cherry pick of #87043: Ensure a provider ID is set on a node if expected