don't start the cloud node controller if cloudprovider.Instances is not supported #82329

Merged 1 commit on Sep 11, 2019

Conversation

m3ngyang
Contributor

@m3ngyang m3ngyang commented Sep 4, 2019

What type of PR is this?
/kind bug
/kind cleanup

What this PR does / why we need it:
There are a few small bugs in the cloud node controller:

  1. node_controller does not check cloud and kubeClient before using them.
  2. node_lifecycle_controller checks kubeClient only after using it.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

NONE

@k8s-ci-robot added labels: release-note-none, kind/bug, size/M, kind/cleanup, cncf-cla: yes, needs-sig, needs-priority, needs-ok-to-test (Sep 4, 2019)
@k8s-ci-robot
Contributor

Hi @m3ngyang. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added labels: sig/api-machinery, sig/apps, sig/cloud-provider; removed: needs-sig (Sep 4, 2019)
@m3ngyang
Contributor Author

m3ngyang commented Sep 5, 2019

/assign @andrewsykim
PTAL~

	)
	if err != nil {
		klog.Warningf("failed to start cloud node controller: %s", err)
		return nil, false, err
Member

To be consistent with the other controllers here, I think this should be

return nil, false, nil

Otherwise the entire process exits on err.

Member

Please make it consistent with other controllers as this will be important for future work making the controller initialization generic.

Contributor Author

fixed~

Member

@andrewsykim andrewsykim left a comment

minor comment, lgtm otherwise :)

@fedebongio
Contributor

/assign @cheftako

	}

	if cloud == nil {
		return nil, errors.New("no cloud provider provided")
Member

Something has gone very wrong to get here. Failure to init the cloud provider in the CCM is a fatal error...

Contributor Author

Yes, cloud is checked before it is passed to the constructor, and kubeClient is created by ctx.ClientBuilder.ClientOrDie("node-controller"), which exits the process if the client cannot be built:

func (b SimpleControllerClientBuilder) ClientOrDie(name string) clientset.Interface {
	client, err := b.Client(name)
	if err != nil {
		klog.Fatal(err)
	}
	return client
}

So we don't need to check cloud and kubeClient in the controller initializers.

Member

Agreed, we probably don't need those checks, but it probably doesn't hurt to keep them.

Contributor Author

Hmm, I think removing them could be viewed as a code cleanup. Many controllers are built from cloud and kubeClient, so we don't need to check them every time.

Member

Fine with me :)

@andrewsykim
Member

/ok-to-test

@k8s-ci-robot added label: ok-to-test; removed: needs-ok-to-test (Sep 6, 2019)

eventBroadcaster := record.NewBroadcaster()
recorder := eventBroadcaster.NewRecorder(scheme.Scheme, v1.EventSource{Component: "cloud-node-controller"})
eventBroadcaster.StartLogging(klog.Infof)
if kubeClient != nil {
Member

Let's add a cloud.Instances() check similar to NewCloudNodeLifecycleController?

Contributor Author

It sounds good.

Contributor Author

The cloud.Instances() check has been added.
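
The kind of check discussed in this thread could look roughly like the sketch below. It is not the merged code: `Cloud`, `Instances`, and `fakeCloud` are simplified stand-ins for the real `cloudprovider` types, and only the `Instances() (Instances, bool)` shape is taken from the real interface. The error message is an assumption.

```go
package main

import "errors"

// Instances is a stand-in for cloudprovider.Instances; its methods are omitted.
type Instances interface{}

// Cloud is a stand-in for the one method of cloudprovider.Interface used here.
// The boolean reports whether the provider supports the Instances API.
type Cloud interface {
	Instances() (Instances, bool)
}

// CloudNodeController is a simplified placeholder for the real controller.
type CloudNodeController struct {
	cloud Cloud
}

// NewCloudNodeController refuses to build a controller when the provider does
// not support Instances, so the caller can decide not to start it.
func NewCloudNodeController(cloud Cloud) (*CloudNodeController, error) {
	if _, ok := cloud.Instances(); !ok {
		return nil, errors.New("cloud provider does not support instances")
	}
	return &CloudNodeController{cloud: cloud}, nil
}

// fakeCloud is a hypothetical test double whose support flag is configurable.
type fakeCloud struct {
	supported bool
}

// Instances returns a dummy value and the configured support flag.
func (f fakeCloud) Instances() (Instances, bool) {
	if !f.supported {
		return nil, false
	}
	return struct{}{}, true
}
```

Checking in the constructor keeps the decision in one place: every caller gets an error instead of a controller that would fail later when it first calls the Instances API.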

@andrewsykim
Member

/retitle don't start the cloud node controller if cloudprovider.Instances is not supported

@k8s-ci-robot changed the title from "param check for cloud node controller" to "don't start the cloud node controller if cloudprovider.Instances is not supported" (Sep 6, 2019)
@andrewsykim
Member

/approve
/lgtm

@k8s-ci-robot added label: lgtm (Sep 6, 2019)
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andrewsykim, m3ngyang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added label: approved (Sep 6, 2019)
@andrewsykim
Member

/priority important-longterm

@k8s-ci-robot added label: priority/important-longterm; removed: needs-priority (Sep 6, 2019)
@m3ngyang
Contributor Author

m3ngyang commented Sep 6, 2019

/test pull-kubernetes-integration

@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot added label: size/S; removed: lgtm, size/M (Sep 7, 2019)
@m3ngyang
Contributor Author

m3ngyang commented Sep 7, 2019

I made a mistake about the cloud check in NewCloudNodeLifecycleController. This constructor is called by both cloud-controller-manager and kube-controller-manager. The former checks cloud before passing it to the New function, but the latter does not, so we had better not change pkg/controller/cloud/node_lifecycle_controller.go for now. @andrewsykim
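
A minimal sketch of why the nil check stays in a constructor that has more than one caller. All names here (`Cloud`, `LifecycleController`, `NewLifecycleController`, and the two `startFrom...` helpers) are hypothetical simplifications, not the real Kubernetes code. The sketch only assumes the situation described above: one caller validates cloud first and the other passes it through unchecked.

```go
package main

import "errors"

// Cloud is a stand-in for the cloud provider interface; one method is enough
// for this illustration.
type Cloud interface {
	ProviderName() string
}

// LifecycleController is a simplified placeholder for the real controller.
type LifecycleController struct {
	cloud Cloud
}

// NewLifecycleController keeps its own nil check because not every caller
// validates cloud before calling it.
func NewLifecycleController(cloud Cloud) (*LifecycleController, error) {
	if cloud == nil {
		return nil, errors.New("no cloud provider provided")
	}
	return &LifecycleController{cloud: cloud}, nil
}

// startFromValidatingCaller models a caller that checks cloud up front, so the
// constructor's check is redundant on this path.
func startFromValidatingCaller(cloud Cloud) (*LifecycleController, error) {
	if cloud == nil {
		return nil, errors.New("cloud provider is nil")
	}
	return NewLifecycleController(cloud)
}

// startFromPassThroughCaller models a caller that forwards whatever it has, so
// the constructor's check is the only guard on this path.
func startFromPassThroughCaller(cloud Cloud) (*LifecycleController, error) {
	return NewLifecycleController(cloud)
}

// fakeCloud is a hypothetical test double.
type fakeCloud struct{}

// ProviderName returns a fixed name for the test double.
func (fakeCloud) ProviderName() string { return "fake" }
```

On the pass-through path, removing the constructor's check would let a nil cloud reach the controller and fail later at first use rather than at construction.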

@andrewsykim
Member

/lgtm

@k8s-ci-robot added label: lgtm (Sep 9, 2019)
@m3ngyang
Contributor Author

m3ngyang commented Sep 9, 2019

Will this PR be included in 1.16? @andrewsykim

@andrewsykim
Member

I think this is more of a sanity check than a bug fix, so I don't think we need to push it into v1.16. What do you think?

@m3ngyang
Contributor Author

m3ngyang commented Sep 9, 2019

That makes sense. This PR can be merged after 1.16 is released.

@cheftako
Member

cheftako commented Sep 9, 2019

/lgtm

@k8s-ci-robot merged commit 61b30b0 into kubernetes:master on Sep 11, 2019
@k8s-ci-robot added this to the v1.17 milestone on Sep 11, 2019
@m3ngyang deleted the node-ctrl-check branch on September 12, 2019 00:04
Labels
approved, cncf-cla: yes, kind/bug, kind/cleanup, lgtm, ok-to-test, priority/important-longterm, release-note-none, sig/api-machinery, sig/apps, sig/cloud-provider, size/S
6 participants