Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add lease endpoint reconciler #51698

Merged

Conversation

rphillips
Copy link
Member

@rphillips rphillips commented Aug 31, 2017

What this PR does / why we need it: Adds OpenShift's LeaseEndpointReconciler to register kube-apiserver endpoints within the storage registry.

Adds a command-line argument alpha-endpoint-reconciler-type to the kube-apiserver.

Defaults to the old MasterCount reconciler.

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes kubernetes/community#939 fixes #22609

Release note:

Adds a command-line argument to kube-apiserver called
--alpha-endpoint-reconciler-type=(master-count, lease, none) (default
"master-count"). The original reconciler is 'master-count'. The 'lease'
reconciler uses the storageapi and a TTL to keep alive an endpoint within the
`kube-apiserver-endpoint` storage namespace. The 'none' reconciler is a noop
reconciler that does not do anything. This is useful for self-hosted
environments.

/cc @lavalamp @smarterclayton @ncdc

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 31, 2017
@k8s-ci-robot
Copy link
Contributor

Hi @rphillips. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 31, 2017
@k8s-github-robot k8s-github-robot added the do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. label Aug 31, 2017
@lpabon
Copy link
Contributor

lpabon commented Aug 31, 2017

@rphillips Could you describe which files are equal to OpenShift's and which lines/files you added/edited to make it easier to review?

@ericchiang
Copy link
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 31, 2017
ttl := c.MasterEndpointReconcileTTL
config, err := c.StorageFactory.NewConfig(kapi.Resource("apiServerIPInfo"))
if err != nil {
glog.Fatalf("Error determining service IP ranges: %v", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an acceptable pattern to exit the application at this point instead of returning an error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think so. It is during config setup

@rphillips
Copy link
Member Author

@lpabon The OpenShift files are marked with their original location. Everything else is written or generated for this PR

@calebamiles
Copy link
Contributor

/release-note-none

@k8s-ci-robot k8s-ci-robot added the release-note-none Denotes a PR that doesn't merit a release note. label Aug 31, 2017
@k8s-github-robot k8s-github-robot removed the do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. label Aug 31, 2017
@calebamiles
Copy link
Contributor

/test pull-kubernetes-unit
/test pull-kubernetes-bazel-test
/test pull-kubernetes-node-e2e

@calebamiles
Copy link
Contributor

/release-note

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Aug 31, 2017
@rphillips rphillips force-pushed the feat/lease_endpoint_reconciler branch from a462c9c to 1c9ec47 Compare August 31, 2017 18:30
func (c *Config) createEndpointReconciler() EndpointReconciler {
glog.Infof("Using reconciler: %v", c.EndpointReconcilerType)
switch c.EndpointReconcilerType {
case "":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A option to disable the reconciler, would be useful for people who use self-hosted k8s.

Then they can just create a kubernetes service which use selector.

See: kubernetes/community#939 (comment) and kubernetes/community#939 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i added a none reconciler to not do anything on the reconcile loop.

@calebamiles
Copy link
Contributor

/test pull-kubernetes-e2e-gce-bazel

Copy link
Member

@lavalamp lavalamp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't yet reviewed the code that you're upstreaming from openshift.

@@ -164,6 +166,9 @@ func (s *ServerRunOptions) AddFlags(fs *pflag.FlagSet) {
fs.IntVar(&s.MasterCount, "apiserver-count", s.MasterCount,
"The number of apiservers running in the cluster, must be a positive number.")

fs.StringVar(&s.EndpointReconcilerType, "alpha-endpoint-reconciler-type", "master-count",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default should be s.EndpointReconcilerType, and that should be initialized properly (look for where other defaults are set).

Description must clearly specify what the options are.

AllowPrivileged: false,
ServiceNodePortRange: DefaultServiceNodePortRange,
MasterCount: 5,
EndpointReconcilerType: "master-count",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make a type and constants for this rather than write the string everywhere.

limitations under the License.
*/

// Package election provides objects for managing the list of active masters via leases.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make this a sub-directory under pkg/master/

Also, add a disclaimer here that this is not the intended way for any apiserver other than kube-apiserver to accomplish this task.


const (
// DefaultMasterCountReconciler will select the original reconciler
MasterCountReconciler = "master-count"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, you already have these constants! give them a type and use them everywhere. :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(you may need to put them in a more visible place)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should make them CamelCase to be consistent with other constants for type

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem was the circular dependency it caused, but I'll look into it.

@@ -155,6 +180,58 @@ type completedConfig struct {
*Config
}

func (c *Config) createMasterCountReconciler() EndpointReconciler {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider adding a "reconcilers.go" file to collect these?

switch c.EndpointReconcilerType {
case "":
fallthrough
case MasterCountReconciler:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

case "", MasterCountReconciler: is more concise.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But consider rejecting "" as a flag value on start up.

case NoneEndpointReconciler:
return c.createNoneReconciler()
default:
glog.Fatalf("Reconciler not implemented: %v", c.EndpointReconcilerType)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%q

StorageConfig: endpointConfig,
Decorator: generic.UndecoratedStorage,
DeleteCollectionWorkers: 0,
ResourcePrefix: c.StorageFactory.ResourcePrefix(kapi.Resource("endpoints")),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use something that clearly won't collide with the endpoints api. "kube-apiserver-endpoint"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ncdc I don't remember whether there was a reason we didn't do that before?

@smarterclayton
Copy link
Contributor

@ncdc since you're the original author if you want to review

@bgrant0607
Copy link
Member

@rphillips Was there a specific aspect you thought I should look at?

This just changes how Endpoints is populated for the apiserver?

This needs a better release note. See kubernetes/community#484 for guidance.

@k8s-ci-robot k8s-ci-robot removed the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Sep 1, 2017
@rphillips
Copy link
Member Author

rebased

@calebamiles
Copy link
Contributor

/test pull-kubernetes-federation-e2e-gce

@smarterclayton
Copy link
Contributor

/lgtm
/retest

But we need to @kubernetes/sig-release-feature-requests to approve this if the intent is still to deliver for 1.8.

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/release Categorizes an issue or PR as relevant to SIG Release. kind/feature Categorizes issue or PR as related to a new feature. labels Sep 12, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rphillips, smarterclayton

Associated issue: 939

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

1 similar comment
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 52240, 48145, 52220, 51698, 51777). If you want to cherry-pick this change to another branch, please follow the instructions here..

@alexbrand
Copy link
Contributor

Great to see this. Thanks!

I think user-facing documentation about the problem and the reconcilers would be useful? Maybe in https://kubernetes.io/docs/admin/high-availability/?

Happy to open a PR if it makes sense.

@luxas
Copy link
Member

luxas commented Dec 25, 2017

@alexbrand An user-facing docs PR for this would make sense to me 👍

@rphillips Please graduate this to beta in v1.10 and add e2e tests #57617

@tengqm
Copy link
Contributor

tengqm commented Dec 27, 2017

@luxas FYI, this has been documented kubernetes/website#6695.

@luxas
Copy link
Member

luxas commented Dec 27, 2017

Great! Thanks @tengqm!

@rphillips
Copy link
Member Author

@tengqm I've been on vacation. thanks!

bergmannf pushed a commit to bergmannf/salt that referenced this pull request Feb 27, 2019
When a cluster is bootstrapped with multiple kube-apiservers, the `kubernetes`
service contains a list of all of these endpoints.

By default, this list of endpoints will *not* be updated if one of the
apiservers goes down. This can lead to the api becoming unresponsive and
breaking it. To have the endpoints automatically keep track of the apiservers
that are available the `--endpoint-reconciler-type` option `lease` needs to be
added.

(The default option for 1.10 `master-count` only changes the endpoint when the
count changes: https://github.com/apprenda/kismatic/issues/987)

See:

kubernetes/kubernetes#22609
kubernetes/kubernetes#56584
kubernetes/kubernetes#51698
bergmannf pushed a commit to bergmannf/salt that referenced this pull request Feb 27, 2019
When a cluster is bootstrapped with multiple kube-apiservers, the `kubernetes`
service contains a list of all of these endpoints.

By default, this list of endpoints will *not* be updated if one of the
apiservers goes down. This can lead to the api becoming unresponsive and
breaking it. To have the endpoints automatically keep track of the apiservers
that are available the `--endpoint-reconciler-type` option `lease` needs to be
added.

(The default option for 1.10 `master-count` only changes the endpoint when the
count changes: https://github.com/apprenda/kismatic/issues/987)

See:

kubernetes/kubernetes#22609
kubernetes/kubernetes#56584
kubernetes/kubernetes#51698
bergmannf pushed a commit to bergmannf/salt that referenced this pull request Feb 27, 2019
When a cluster is bootstrapped with multiple kube-apiservers, the `kubernetes`
service contains a list of all of these endpoints.

By default, this list of endpoints will *not* be updated if one of the
apiservers goes down. This can lead to the api becoming unresponsive and
breaking it. To have the endpoints automatically keep track of the apiservers
that are available the `--endpoint-reconciler-type` option `lease` needs to be
added.

(The default option for 1.10 `master-count` only changes the endpoint when the
count changes: https://github.com/apprenda/kismatic/issues/987)

See:

kubernetes/kubernetes#22609
kubernetes/kubernetes#56584
kubernetes/kubernetes#51698
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/release Categorizes an issue or PR as relevant to SIG Release. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kube-apiserver endpoint cleanup when --apiserver-count>1