add lease endpoint reconciler #51698

Merged

@rphillips
Member

commented Aug 31, 2017

What this PR does / why we need it: Adds OpenShift's LeaseEndpointReconciler to register kube-apiserver endpoints within the storage registry.

Adds a command-line argument alpha-endpoint-reconciler-type to the kube-apiserver.

Defaults to the old MasterCount reconciler.

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes kubernetes/community#939 fixes #22609

Release note:

Adds a command-line argument to kube-apiserver called
--alpha-endpoint-reconciler-type=(master-count, lease, none) (default
"master-count"). The original reconciler is 'master-count'. The 'lease'
reconciler uses the storageapi and a TTL to keep alive an endpoint within the
`kube-apiserver-endpoint` storage namespace. The 'none' reconciler is a noop
reconciler that does not do anything. This is useful for self-hosted
environments.
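
For readers new to the lease approach, the release note above can be illustrated with a minimal Go sketch: each apiserver renews a record with a TTL, and only unexpired records count as live endpoints. The `leaseStore` type, its methods, the 15-second TTL, and the in-memory map are all assumptions made for illustration; the reconciler in this PR keeps its leases in the apiserver storage layer, which is not reproduced here.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// leaseStore is a hypothetical in-memory stand-in for storage-backed leases.
type leaseStore struct {
	ttl     time.Duration
	expires map[string]time.Time // apiserver IP -> lease expiry
}

func newLeaseStore(ttl time.Duration) *leaseStore {
	return &leaseStore{ttl: ttl, expires: make(map[string]time.Time)}
}

// renew records or refreshes the lease for ip as of now.
func (s *leaseStore) renew(ip string, now time.Time) {
	s.expires[ip] = now.Add(s.ttl)
}

// active returns the sorted IPs whose leases have not expired at now.
func (s *leaseStore) active(now time.Time) []string {
	var ips []string
	for ip, exp := range s.expires {
		if now.Before(exp) {
			ips = append(ips, ip)
		}
	}
	sort.Strings(ips)
	return ips
}

// demo simulates two apiservers, one of which stops renewing its lease.
func demo() (before, after []string) {
	start := time.Unix(0, 0)
	s := newLeaseStore(15 * time.Second) // assumed TTL, for illustration only
	s.renew("10.0.0.1", start)
	s.renew("10.0.0.2", start)
	before = s.active(start.Add(5 * time.Second))

	// Only 10.0.0.1 keeps renewing; 10.0.0.2 has gone away.
	s.renew("10.0.0.1", start.Add(10*time.Second))
	after = s.active(start.Add(20 * time.Second))
	return before, after
}

func main() {
	before, after := demo()
	fmt.Println(before, after)
}
```

In this sketch a stopped apiserver drops out of the endpoint list once its lease expires, without any other server needing to know how many apiservers are expected.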

/cc @lavalamp @smarterclayton @ncdc

@k8s-ci-robot

Contributor

commented Aug 31, 2017

Hi @rphillips. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@lpabon

Contributor

commented Aug 31, 2017

@rphillips Could you describe which files are equal to OpenShift's and which lines/files you added/edited to make it easier to review?

@ericchiang

Member

commented Aug 31, 2017

/ok-to-test

ttl := c.MasterEndpointReconcileTTL
config, err := c.StorageFactory.NewConfig(kapi.Resource("apiServerIPInfo"))
if err != nil {
	glog.Fatalf("Error determining service IP ranges: %v", err)

@lpabon

lpabon Aug 31, 2017

Contributor

Is this an acceptable pattern to exit the application at this point instead of returning an error?

@rphillips

rphillips Aug 31, 2017

Author Member

Yes, I think so. It is during config setup.
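
For comparison, the alternative raised in the question can be sketched in Go as follows: the helper returns a wrapped error and the caller decides whether to abort. The `storageFactory` type, `newConfig`, and `newLeaseConfig` are hypothetical stand-ins written for this illustration, not the code under review.

```go
package main

import (
	"errors"
	"fmt"
)

// storageFactory is a hypothetical stand-in for c.StorageFactory.
type storageFactory struct{ fail bool }

// newConfig pretends to build a storage config for the named resource.
func (f storageFactory) newConfig(resource string) (string, error) {
	if f.fail {
		return "", errors.New("no storage backend for " + resource)
	}
	return "config-for-" + resource, nil
}

// newLeaseConfig returns the error to its caller instead of exiting,
// so the decision to abort start-up is made in one place.
func newLeaseConfig(f storageFactory) (string, error) {
	cfg, err := f.newConfig("apiServerIPInfo")
	if err != nil {
		return "", fmt.Errorf("creating storage config for lease reconciler: %w", err)
	}
	return cfg, nil
}

func main() {
	for _, f := range []storageFactory{{fail: false}, {fail: true}} {
		cfg, err := newLeaseConfig(f)
		fmt.Printf("config=%q err=%v\n", cfg, err)
	}
}
```

Either pattern stops a misconfigured server from starting; returning the error mainly makes the helper easier to exercise in unit tests.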

@rphillips

Member Author

commented Aug 31, 2017

@lpabon The OpenShift files are marked with their original location. Everything else is written or generated for this PR.

@calebamiles

Member

commented Aug 31, 2017

/release-note-none

@calebamiles

Member

commented Aug 31, 2017

/test pull-kubernetes-unit
/test pull-kubernetes-bazel-test
/test pull-kubernetes-node-e2e

@calebamiles

Member

commented Aug 31, 2017

/release-note

@rphillips rphillips force-pushed the rphillips:feat/lease_endpoint_reconciler branch from a462c9c to 1c9ec47 Aug 31, 2017

func (c *Config) createEndpointReconciler() EndpointReconciler {
	glog.Infof("Using reconciler: %v", c.EndpointReconcilerType)
	switch c.EndpointReconcilerType {
	case "":

@klausenbusk

klausenbusk Aug 31, 2017

Contributor

An option to disable the reconciler would be useful for people who use self-hosted k8s.

Then they can just create a kubernetes service that uses a selector.

See: kubernetes/community#939 (comment) and kubernetes/community#939 (comment)

@rphillips

rphillips Aug 31, 2017

Author Member

I added a none reconciler that does nothing in the reconcile loop.
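
A no-op reconciler of this kind can be sketched in a few lines of Go. The `endpointReconciler` interface below is a simplified, hypothetical version with a two-argument method; the interface in the PR takes more parameters, and `noneReconciler` is only an illustrative name.

```go
package main

import "fmt"

// endpointReconciler is a simplified, hypothetical interface for this sketch.
type endpointReconciler interface {
	ReconcileEndpoints(serviceName string, ip string) error
}

// noneReconciler satisfies the interface but never touches the endpoints,
// leaving them to be managed some other way (for example by a selector).
type noneReconciler struct{}

// ReconcileEndpoints does nothing and reports success.
func (noneReconciler) ReconcileEndpoints(serviceName string, ip string) error {
	return nil
}

func main() {
	var r endpointReconciler = noneReconciler{}
	fmt.Println(r.ReconcileEndpoints("kubernetes", "10.0.0.1"))
}
```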

@calebamiles

Member

commented Aug 31, 2017

/test pull-kubernetes-e2e-gce-bazel

@lavalamp
Member

left a comment

I haven't yet reviewed the code that you're upstreaming from openshift.

@@ -164,6 +166,9 @@ func (s *ServerRunOptions) AddFlags(fs *pflag.FlagSet) {
	fs.IntVar(&s.MasterCount, "apiserver-count", s.MasterCount,
		"The number of apiservers running in the cluster, must be a positive number.")

	fs.StringVar(&s.EndpointReconcilerType, "alpha-endpoint-reconciler-type", "master-count",

@lavalamp

lavalamp Aug 31, 2017

Member

Default should be s.EndpointReconcilerType, and that should be initialized properly (look for where other defaults are set).

Description must clearly specify what the options are.
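
A rough Go sketch of this suggestion: the default lives in the options constructor, the flag registration reuses the field as its own default, and the help text names every accepted value. To stay self-contained the sketch uses the standard library `flag` package in place of `pflag`; the type and function names and the help string are assumptions for illustration.

```go
package main

import (
	"flag"
	"fmt"
)

// serverRunOptions is a hypothetical, trimmed-down options struct.
type serverRunOptions struct {
	EndpointReconcilerType string
}

// newServerRunOptions is the single place where the default is set.
func newServerRunOptions() *serverRunOptions {
	return &serverRunOptions{EndpointReconcilerType: "master-count"}
}

// addFlags registers the flag, using the field's current value as the default
// and listing the accepted values in the description.
func (s *serverRunOptions) addFlags(fs *flag.FlagSet) {
	fs.StringVar(&s.EndpointReconcilerType, "alpha-endpoint-reconciler-type", s.EndpointReconcilerType,
		"Use an endpoint reconciler (master-count, lease, none)")
}

// parse builds the options, parses args, and returns the selected type.
func parse(args []string) (string, error) {
	s := newServerRunOptions()
	fs := flag.NewFlagSet("kube-apiserver", flag.ContinueOnError)
	s.addFlags(fs)
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return s.EndpointReconcilerType, nil
}

func main() {
	def, _ := parse(nil)
	set, _ := parse([]string{"--alpha-endpoint-reconciler-type=lease"})
	fmt.Println(def, set)
}
```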

		AllowPrivileged:        false,
		ServiceNodePortRange:   DefaultServiceNodePortRange,
		MasterCount:            5,
		EndpointReconcilerType: "master-count",

@lavalamp

lavalamp Aug 31, 2017

Member

Make a type and constants for this rather than write the string everywhere.

limitations under the License.
*/

// Package election provides objects for managing the list of active masters via leases.

@lavalamp

lavalamp Aug 31, 2017

Member

Please make this a sub-directory under pkg/master/

Also, add a disclaimer here that this is not the intended way for any apiserver other than kube-apiserver to accomplish this task.


const (
	// MasterCountReconciler will select the original reconciler
	MasterCountReconciler = "master-count"

@lavalamp

lavalamp Aug 31, 2017

Member

Oh, you already have these constants! Give them a type and use them everywhere. :)

@lavalamp

lavalamp Aug 31, 2017

Member

(you may need to put them in a more visible place)

@smarterclayton

smarterclayton Sep 1, 2017

Contributor

Probably should make them CamelCase to be consistent with other constants for type

@rphillips

rphillips Sep 1, 2017

Author Member

The problem was the circular dependency it caused, but I'll look into it.
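
The typed-constant idea from this thread might look roughly like the Go sketch below. The type name, the constant names, and the helper functions are hypothetical; placing them in a small package of their own, imported by both the options code and the master code, is one common way to sidestep an import cycle, though that is an assumption rather than something this thread settles.

```go
package main

import "fmt"

// reconcilerType is a hypothetical named string type for the flag value.
type reconcilerType string

const (
	masterCountReconcilerType   reconcilerType = "master-count"
	leaseEndpointReconcilerType reconcilerType = "lease"
	noneEndpointReconcilerType  reconcilerType = "none"
)

// allReconcilerTypes lists every supported value, for help text and validation.
func allReconcilerTypes() []reconcilerType {
	return []reconcilerType{
		masterCountReconcilerType,
		leaseEndpointReconcilerType,
		noneEndpointReconcilerType,
	}
}

// isValid reports whether t is one of the supported values.
func isValid(t reconcilerType) bool {
	for _, known := range allReconcilerTypes() {
		if t == known {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(allReconcilerTypes(), isValid("lease"), isValid("Lease"))
}
```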

@@ -155,6 +180,58 @@ type completedConfig struct {
	*Config
}

func (c *Config) createMasterCountReconciler() EndpointReconciler {

@lavalamp

lavalamp Aug 31, 2017

Member

consider adding a "reconcilers.go" file to collect these?

	switch c.EndpointReconcilerType {
	case "":
		fallthrough
	case MasterCountReconciler:

@lavalamp

lavalamp Aug 31, 2017

Member

case "", MasterCountReconciler: is more concise.

@lavalamp

lavalamp Aug 31, 2017

Member

But consider rejecting "" as a flag value on start up.
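
Both suggestions can be shown side by side in a short Go sketch: a single `case` listing two values instead of `fallthrough`, and a start-up check that refuses an empty or unknown flag value. The function names and error messages are hypothetical; only the three flag values come from the PR's release note.

```go
package main

import (
	"errors"
	"fmt"
)

// Flag values taken from the PR's release note.
const (
	masterCount = "master-count"
	lease       = "lease"
	none        = "none"
)

// validateReconcilerType is a hypothetical start-up check that rejects
// an empty or unknown value before any reconciler is created.
func validateReconcilerType(v string) error {
	switch v {
	case masterCount, lease, none:
		return nil
	case "":
		return errors.New("endpoint reconciler type must not be empty")
	default:
		return fmt.Errorf("unknown endpoint reconciler type %q", v)
	}
}

// reconcilerName mirrors the reviewed switch, listing both values
// in one case instead of using fallthrough.
func reconcilerName(v string) string {
	switch v {
	case "", masterCount:
		return "master-count reconciler"
	case lease:
		return "lease reconciler"
	case none:
		return "none reconciler"
	default:
		return "unsupported"
	}
}

func main() {
	for _, v := range []string{"", "master-count", "lease", "none", "bogus"} {
		fmt.Printf("%q -> %s (validation error: %v)\n", v, reconcilerName(v), validateReconcilerType(v))
	}
}
```

If the empty value is rejected at start-up, the `""` entry in the first case becomes unreachable and can be dropped.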

	case NoneEndpointReconciler:
		return c.createNoneReconciler()
	default:
		glog.Fatalf("Reconciler not implemented: %v", c.EndpointReconcilerType)

		StorageConfig:           endpointConfig,
		Decorator:               generic.UndecoratedStorage,
		DeleteCollectionWorkers: 0,
		ResourcePrefix:          c.StorageFactory.ResourcePrefix(kapi.Resource("endpoints")),

@lavalamp

lavalamp Aug 31, 2017

Member

I would use something that clearly won't collide with the endpoints api. "kube-apiserver-endpoint"?

@smarterclayton

smarterclayton Sep 1, 2017

Contributor

@ncdc I don't remember whether there was a reason we didn't do that before?

@smarterclayton

Contributor

commented Sep 1, 2017

@ncdc since you're the original author if you want to review

@bgrant0607

Member

commented Sep 1, 2017

@rphillips Was there a specific aspect you thought I should look at?

This just changes how Endpoints is populated for the apiserver?

This needs a better release note. See kubernetes/community#484 for guidance.

@k8s-ci-robot k8s-ci-robot removed the size/XL label Sep 1, 2017

@smarterclayton

Contributor

commented Sep 12, 2017

/lgtm
/retest

But we need @kubernetes/sig-release-feature-requests to approve this if the intent is still to deliver for 1.8.

@k8s-github-robot

Contributor

commented Sep 12, 2017

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rphillips, smarterclayton

Associated issue: 939

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@fejta-bot


commented Sep 12, 2017

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

@fejta-bot


commented Sep 16, 2017

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

@k8s-github-robot

Contributor

commented Sep 23, 2017

Automatic merge from submit-queue (batch tested with PRs 52240, 48145, 52220, 51698, 51777). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit fd3c1f4 into kubernetes:master Sep 23, 2017

13 of 14 checks passed

pull-kubernetes-e2e-kubeadm-gce Parent Job Status Changed: Job triggered.
Submit Queue Queued to run github e2e tests a second time.
cla/linuxfoundation rphillips authorized
pull-kubernetes-bazel-build Job succeeded.
pull-kubernetes-bazel-test Job succeeded.
pull-kubernetes-e2e-gce-bazel Job succeeded.
pull-kubernetes-e2e-gce-etcd3 Jenkins job succeeded.
pull-kubernetes-e2e-gce-gpu Jenkins job succeeded.
pull-kubernetes-e2e-kops-aws Jenkins job succeeded.
pull-kubernetes-federation-e2e-gce Jenkins job succeeded.
pull-kubernetes-kubemark-e2e-gce Jenkins job succeeded.
pull-kubernetes-node-e2e Jenkins job succeeded.
pull-kubernetes-unit Jenkins job succeeded.
pull-kubernetes-verify Jenkins job succeeded.

@mumoshu mumoshu referenced this pull request Dec 6, 2017

Closed

Reconsider apiserver-count #499

@alexbrand

Member

commented Dec 18, 2017

Great to see this. Thanks!

I think user-facing documentation about the problem and the reconcilers would be useful? Maybe in https://kubernetes.io/docs/admin/high-availability/?

Happy to open a PR if it makes sense.

@luxas

Member

commented Dec 25, 2017

@alexbrand A user-facing docs PR for this would make sense to me 👍

@rphillips Please graduate this to beta in v1.10 and add e2e tests #57617

@tengqm

Contributor

commented Dec 27, 2017

@luxas FYI, this has been documented in kubernetes/website#6695.

@luxas

Member

commented Dec 27, 2017

Great! Thanks @tengqm!

@rphillips

Member Author

commented Jan 8, 2018

@tengqm I've been on vacation. Thanks!

bergmannf added a commit to bergmannf/salt that referenced this pull request Feb 27, 2019

Automatically update the kubernetes-service endpoint.
When a cluster is bootstrapped with multiple kube-apiservers, the `kubernetes`
service contains a list of all of these endpoints.

By default, this list of endpoints will *not* be updated if one of the
apiservers goes down. This can lead to the API becoming unresponsive. To have
the endpoints automatically track the apiservers that are available, the
`--endpoint-reconciler-type` option `lease` needs to be added.

(The default option in 1.10, `master-count`, only changes the endpoint when the
count changes: apprenda/kismatic#987)

See:

kubernetes/kubernetes#22609
kubernetes/kubernetes#56584
kubernetes/kubernetes#51698
