Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding flexibility to CCM #93764

Merged
merged 1 commit into from Nov 9, 2020
Merged

Adding flexibility to CCM #93764

merged 1 commit into from Nov 9, 2020

Conversation

cici37
Copy link
Contributor

@cici37 cici37 commented Aug 6, 2020

What type of PR is this?
/kind feature

What this PR does / why we need it:

The Cloud Provider would be responsible for creating the outer CCM process which is https://github.com/kubernetes/kubernetes/blob/master/cmd/cloud-controller-manager/controller-manager.go and have the flexibility to specify the controllers and cloud provider config.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. kind/feature Categorizes issue or PR as related to a new feature. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Aug 6, 2020
@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 6, 2020
@cici37 cici37 force-pushed the ccmwrap branch 2 times, most recently from 1fd6c06 to 046878c Compare August 6, 2020 22:56
@cici37
Copy link
Contributor Author

cici37 commented Aug 7, 2020

/retest

@cici37 cici37 changed the title [WIP]Demonstrate how cloud provider consume CCM Adding flexibility to CCM Aug 10, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 10, 2020
@cici37
Copy link
Contributor Author

cici37 commented Aug 10, 2020

/cc @aoxn from Alibaba cloud. Please feel free to see if the changes make sense.
/assign @cheftako

func main() {
rand.Seed(time.Now().UnixNano())

command := app.NewCloudControllerManagerCommand()
cloud, err := cloudprovider.InitCloudProvider(cloudProviderName, cloudConfigFile)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we'd still want to call this after flag parsing since a provider will likely depend on flags for the cloud config file path.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, it would be good to update this reference to parse the cloud config file from the flag option.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be nice to show a split example. Have 1 parameter (provider name) hard coded to demonstrate they can get rid of these flag options and the other parameter (cloud config file) which as Andrew suggests demonstrates how to consume flag options.

cmd/cloud-controller-manager/app/controllermanager.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 13, 2020
@cici37 cici37 changed the title Adding flexibility to CCM [WIP]Adding flexibility to CCM Aug 31, 2020
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 31, 2020
@cici37 cici37 force-pushed the ccmwrap branch 2 times, most recently from 00a954c to 135ea31 Compare September 2, 2020 19:58
@cici37
Copy link
Contributor Author

cici37 commented Nov 3, 2020

/retest

1 similar comment
@cici37
Copy link
Contributor Author

cici37 commented Nov 3, 2020

/retest

@cici37 cici37 force-pushed the ccmwrap branch 2 times, most recently from 1ae6f29 to 375f956 Compare November 4, 2020 00:10
@cheftako
Copy link
Member

cheftako commented Nov 4, 2020

/test pull-kubernetes-bazel-test

@cheftako
Copy link
Member

cheftako commented Nov 4, 2020

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 4, 2020
@cheftako
Copy link
Member

cheftako commented Nov 4, 2020

/assign @liggitt

- k8s.io/kubernetes/pkg/api/legacyscheme
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was adding these back intentional or a bad rebase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

intentional. This is indirectly imported by k8s.io/kubernetes/pkg/controller/testutil -> node_ipam_controller_test in `k8s.io/kubernetes/pkg/controller/nodeipam'.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could have a follow up PR to put the samples(like nodeIpamController) in its own directory and also split out the base/vanilla version from the sample showing usage of the extensions as mentioned above. In this way the .import-restrictions file might look better and make less confusion for cloud providers to follow.

Copy link
Member

@liggitt liggitt Nov 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes under staging look fine to me.

I'm having trouble understanding the current status/consumers and goals for each of the following:

  1. current cloud-controller-manager command / binary (cmd/cloud-controller-manager)
    • is this included in current releases or image manifests built/pushed? are there consumers of this today?
  2. ideal minimal example main()
    • I expected this to live in staging k8s.io/cloud-provider with no k8s.io/kubernetes dependencies
  3. an example that stitches in node_ipam to the minimal example (either by moving out node_ipam or by living in k8s.io/kubernetes)

Modifying the existing main() to add node_ipam without first extracting the minimal k8s.io/kubernetes-free one is making the path to the minimal on harder to see. Is that still the plan?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1.a) As a fast follow it might be nice to split the binary. Have a vanilla binary with none of the extensions in use which a cloud provider could copy and modify to get started. A sample which shows how to use each of the extension mechanisms. I think the vanilla should be here and the sample under a sample sub directory?

1.b) The CCM in K/K is not included in any releases but it is built to ensure it remains buildable. It is forked and modified into various other repos where the modified version is released. These changes should allow the cloud providers to just depend on libraries, no more forking needed.

2.a) +1, see the fast follow suggestion above.

2.b) +Alot, if we break the sample into its own directory, that should allow the base version to have no K8s.io/kubernetes dependencies. More importantly we can then enforce that the base version have no K8s.io/kubernetes dependencies.

  1. +1

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 5, 2020
@cici37
Copy link
Contributor Author

cici37 commented Nov 5, 2020

@liggitt Would you please have a look when you have time? Thank you :)

The changes added after last review:
Commented out the example of deleting a controller to prevent it from accidentally deletion. Add more comments.

@@ -114,10 +115,17 @@ func StartTestServer(t Logger, customFlags []string) (result TestServer, err err
if err != nil {
return result, fmt.Errorf("failed to create config from options: %v", err)
}
cloud, err := cloudprovider.InitCloudProvider(config.Complete().ComponentConfig.KubeCloudShared.CloudProvider.Name, config.Complete().ComponentConfig.KubeCloudShared.CloudProvider.CloudConfigFile)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have the context to know whether we should use a fake or not, but calling config.Complete() twice was surprising to me

@@ -84,13 +85,13 @@ func StartTestServer(t Logger, customFlags []string) (result TestServer, err err
if err != nil {
return TestServer{}, err
}
all, disabled := app.KnownControllers(), app.ControllersDisabledByDefault.List()
all := []string{"cloud-node", "cloud-node-lifecycle", "service", "route"}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will anything notice if this drifts? does it matter if it does?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't matter even if we pass an empty string array. Used for validate: https://github.com/kubernetes/kubernetes/blob/bd95fb101dd83cfca9d95e55b2f0d29720016e51/staging/src/k8s.io/controller-manager/options/generic.go#L90

@@ -17,32 +17,131 @@ limitations under the License.
// The external controller manager is responsible for running controller loops that
// are cloud provider dependent. It uses the API to listen to new events on resources.

// This file should be written by each cloud provider.
// The current file demonstrate how other cloud provider should leverage CCM and it uses fake parameters. Please modify for your own use.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be changing the current cloud-controller-manager binary to use fake / example params... am I reading that correctly? Is this in use by anyone who would be affected by these changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current plan is to keep this file as a sample main.go in k/k. Other cloud providers will have their own main.go file in their repo(e.g.:https://github.com/kubernetes/cloud-provider-aws/blob/master/cmd/aws-cloud-controller-manager/main.go).
The current cloud-controller-manager binary (cmd/cloud-controller-manager) will only contain sample code and will not have consumers.

- k8s.io/kubernetes/pkg/api/legacyscheme
Copy link
Member

@liggitt liggitt Nov 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes under staging look fine to me.

I'm having trouble understanding the current status/consumers and goals for each of the following:

  1. current cloud-controller-manager command / binary (cmd/cloud-controller-manager)
    • is this included in current releases or image manifests built/pushed? are there consumers of this today?
  2. ideal minimal example main()
    • I expected this to live in staging k8s.io/cloud-provider with no k8s.io/kubernetes dependencies
  3. an example that stitches in node_ipam to the minimal example (either by moving out node_ipam or by living in k8s.io/kubernetes)

Modifying the existing main() to add node_ipam without first extracting the minimal k8s.io/kubernetes-free one is making the path to the minimal on harder to see. Is that still the plan?

@cici37 cici37 force-pushed the ccmwrap branch 2 times, most recently from 3b7330e to 5fd7cef Compare November 6, 2020 09:56
@cici37
Copy link
Contributor Author

cici37 commented Nov 6, 2020

@liggitt Thank you so much for the review.

  1. The current plan to to have cmd/cloud-controller-manager only contains sample code and each cloud provider will vendor in k8s.io/cloud-provider instead.

  2. cmd/cloud-controller-manager/controller-manager.go would be living in k/k as a detailed example to show new coming/existing cloud providers the way to leverage CCM. I added another minimal example main() under k8s.io/cloud-provider/sample/main.go.

  3. Since cmd/cloud-controller-manager/controller-manager.go still live in k/k, I guess it would be ok to have node_ipam example living inside k/k as well. Plan to have a following PR to move node_ipam sample out to a separate dir so that the .import-restrictions file might look better.

@liggitt
Copy link
Member

liggitt commented Nov 9, 2020

talked offline... cmd/cloud-controller-manager is currently built, but not included in release artifacts

this PR is an interim state, and will be followed by the split described in #93764 (comment)

/approve
for dep changes

will leave lgtm to @cheftako

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cici37, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 9, 2020
@cheftako
Copy link
Member

cheftako commented Nov 9, 2020

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 9, 2020
@cici37
Copy link
Contributor Author

cici37 commented Nov 9, 2020

Have the following PR to add more samples: #96385

@k8s-ci-robot
Copy link
Contributor

@cici37: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
pull-kubernetes-e2e-gce 3b0c2394ccf0e98da5757b861b8fb375e48e2c2a link /test pull-kubernetes-e2e-gce
pull-kubernetes-e2e-gce-ubuntu-containerd 895a0a8 link /test pull-kubernetes-e2e-gce-ubuntu-containerd

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@cici37
Copy link
Contributor Author

cici37 commented Nov 9, 2020

/retest

@k8s-ci-robot k8s-ci-robot merged commit 995e531 into kubernetes:master Nov 9, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.20 milestone Nov 9, 2020
@cici37 cici37 deleted the ccmwrap branch November 9, 2020 22:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/cloudprovider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants