
add apiserver-count fix proposal #939

Merged

Conversation

@rphillips (Member) commented Aug 17, 2017

This is a proposal to fix the apiserver-count issue at kubernetes/kubernetes#22609. I would appreciate a review on the proposal.

  • Add ConfigMap for configurable options
  • Find out dependencies on the Endpoints API and add them to the proposal

@hongchaodeng (Member) commented Aug 17, 2017

Shouldn't the commit msg and title be "add apiserver-count fix proposal"? Mind the "-".

@ericchiang (Member) commented Aug 17, 2017

@rphillips changed the title from "add apiserver-count-fix proposal" to "add apiserver-count fix proposal" on Aug 18, 2017

@luxas (Member) commented Aug 18, 2017

cc @kubernetes/sig-cluster-lifecycle-feature-requests

@lpabon commented Aug 18, 2017

Hi @rphillips, thanks for putting this together. Could you add the following to the documentation:

  • Configurable values for timeouts or other magic numbers
  • Explanation in the document on how clients consume this information

## Proposal

### Create New Reconciler

@mikedanese (Member), Aug 18, 2017

This design says "create new reconciler" and has a struct definition but doesn't say what the reconciler does or how the struct is used.

@rphillips (Member, Author), Aug 18, 2017

Thank you @mikedanese. Added a stanza in commit 81b6c58.


@k8s-github-robot k8s-github-robot added size/L and removed size/M labels Aug 18, 2017

@lavalamp lavalamp assigned lavalamp and unassigned idvoretskyi and sarahnovotny Aug 21, 2017


| Kubernetes Release | Quality | Description |
| ------------- | ------------- | ----------- |
| 1.9 | alpha | <ul><li>Add a new reconciler</li><li>Add a command-line switch --new-reconciler</li><li>Add a command-line switch --old-reconciler</li></ul> |

@smarterclayton (Contributor), Aug 21, 2017

We generally don't add flags like this; it would usually be a type flag with an experimental prefix for alpha.

@luxas (Member), Aug 22, 2017

Agreed; I think one flag to opt in, e.g. --reconcile-kubernetes-service-dynamically or something like that, is the only necessary thing, prefixed with --alpha or --experimental for v1.9.
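
To make the shape of that concrete, here is a minimal sketch of how such an opt-in flag could be wired with spf13/pflag (the flag library kube-apiserver uses); the flag name is taken from the suggestion above, and everything else is illustrative, not the PR's actual code:

```go
package main

import (
	"fmt"

	"github.com/spf13/pflag"
)

func main() {
	fs := pflag.NewFlagSet("kube-apiserver", pflag.ExitOnError)
	// Flag name from the suggestion above; hypothetical, not the merged flag.
	dynamic := fs.Bool(
		"alpha-reconcile-kubernetes-service-dynamically",
		false,
		"Alpha: opt in to the dynamic kubernetes.default endpoint reconciler.",
	)
	fs.Parse([]string{"--alpha-reconcile-kubernetes-service-dynamically=true"})
	fmt.Println(*dynamic) // true
}
```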


Add a standard `kube-apiserver-endpoints` ConfigMap in the `default` namespace. The ConfigMap would be formed such that:

@smarterclayton (Contributor), Aug 21, 2017

The last proposed fix discussed adding an extension resource via CRD instead of using a ConfigMap. Add a section here discussing the tradeoffs of that approach.

@luxas (Member), Aug 22, 2017

Wasn't that for the full-blown APIServer status registration, or do I recall incorrectly?
Anyway, I think that a ConfigMap is good here.
We've discussed layer violations of this kind before, and it seems like this would be one.

@smarterclayton (Contributor), Aug 22, 2017

I don't think it would be a layer violation, because this is the core api server (it's allowed to depend on apis). We've already discussed it several times at an arch level.

@lavalamp (Member) left a review

Seems pretty close.


Create a new `MasterEndpointReconciler` within master/controller.go.

Add a `kube-apiserver-endpoints-config` ConfigMap in the `default` namespace. The duration found within the map would be configurable by admins without a recompile. The ConfigMap would include the following:

@lavalamp (Member), Aug 21, 2017

The namespace and configmap name should probably be flags :/

I would at least default to the kube-system namespace.

@lavalamp (Member), Aug 21, 2017

Actually the name of this should probably be kube-apiserver-config; it should grow to include everything that is currently a flag. Endpoints don't need a special config thingy.

@lavalamp (Member), Aug 21, 2017

I think someone may have been working on moving flags to a config map; you may want to coordinate with them. @mikedanese, do you recall who was working on that?

@luxas (Member), Aug 22, 2017

@lavalamp That is what we asked for here: https://docs.google.com/document/d/17C2D1Ghgv40pWByevo0ih8T-18Q1GlxbR4gu1_ZKlPE/edit?disco=AAAAA-EAeB0

(to implement componentconfig for the API server)

Could we prioritize that as part of this flow in v1.9?


```go
ConfigMap{
	"ip-2001-4860-4860--8888-443": "serialized JSON ControllerEndpointData",
}
```

@lavalamp (Member), Aug 21, 2017

Does anything other than a time need to be in the value? I'm not sure what benefit we get from cramming all the other stuff in there?

@luxas (Member), Aug 22, 2017

Kind of agree; but at the same time we don't know what the future holds for us :)
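
For concreteness, a minimal sketch of what one entry of the proposed map might look like, assuming the value carries just the timestamp discussed above; `ControllerEndpointData` is named in the proposal, but its fields and JSON shape here are assumptions:

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// ControllerEndpointData stands in for the "serialized JSON" value from the
// proposal; only the timestamp discussed in this thread is modeled here.
type ControllerEndpointData struct {
	UpdateTimestamp time.Time `json:"updateTimestamp"`
}

func main() {
	value, _ := json.Marshal(ControllerEndpointData{UpdateTimestamp: time.Now()})
	// The key is the escaped <ip>-<port> form shown in the quoted context.
	endpointMap := map[string]string{
		"ip-2001-4860-4860--8888-443": string(value),
	}
	fmt.Println(endpointMap)
}
```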

Add a `kube-apiserver-endpoints-config` ConfigMap in the `default` namespace. The duration found within the map would be configurable by admins without a recompile. The ConfigMap would include the following:

```go
ConfigMap{
	// …
}
```

@lavalamp (Member), Aug 21, 2017

The name and namespace of the endpoint coordination config map should be stored in this configmap rather than be hard-coded.

@luxas (Member), Aug 22, 2017

Should this just be a flag on the API server; and eventually stored in componentconfig (in a ConfigMap) which will be easy to roll out?

The reconcile loop will expire endpoints that do not meet the duration. On
each reconcile loop (the loop runs every 10 seconds currently):

1. Retrieve `kube-apiserver-endpoints-config` ConfigMap (as configMap)

@lavalamp (Member), Aug 21, 2017

The config probably doesn't need to be fetched all that frequently, although it doesn't hurt anything, either.

@luxas (Member), Aug 22, 2017

Is this a GET on every reconcile loop, or a WATCH? "Retrieve" could mean both, I guess.


1. Retrieve `kube-apiserver-endpoints-config` ConfigMap (as configMap)
1. Retrieve `kube-apiserver-endpoints` ConfigMap (as endpointMap)
1. Update the `UpdateTimestamp` for the currently running API server

@lavalamp (Member), Aug 21, 2017

Can this be done via a PATCH?

1. Retrieve `kube-apiserver-endpoints-config` ConfigMap (as configMap)
1. Retrieve `kube-apiserver-endpoints` ConfigMap (as endpointMap)
1. Update the `UpdateTimestamp` for the currently running API server
1. Remove all endpoints where the UpdateTimestamp is greater than `expire-duration` from the configMap.

@lavalamp (Member), Aug 21, 2017

I would do this in two requests rather than a single one, to avoid conflicts when multiple apiservers delete a record. Actually, I'm not 100% sure it's a conflict if two requests delete the same thing; would a PATCH like that apply cleanly, or give an error?


The reconcile loop will expire endpoints that do not meet the duration. On
each reconcile loop (the loop runs every 10 seconds currently):

@lavalamp (Member), Aug 21, 2017

Instead of a fixed 10s, this should probably be something like 50% to 80% of the expiration time, whatever that is.
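
Putting the numbered steps and this interval suggestion together, here is a minimal sketch of one reconcile pass over an in-memory stand-in for the endpoints map; all names and the 2/3 factor are illustrative assumptions, not the proposal's code:

```go
package main

import (
	"fmt"
	"time"
)

// reconcileOnce performs steps 3 and 4 from the list above: refresh our own
// UpdateTimestamp, then drop every entry older than expireDuration.
func reconcileOnce(endpointMap map[string]time.Time, self string, expireDuration time.Duration) {
	endpointMap[self] = time.Now()
	for ip, updated := range endpointMap {
		if time.Since(updated) > expireDuration {
			delete(endpointMap, ip)
		}
	}
}

func main() {
	expire := 15 * time.Second
	// Per the comment above: derive the interval from the expiry time
	// (roughly 50-80% of it) instead of hardcoding 10s.
	interval := expire * 2 / 3
	endpointMap := map[string]time.Time{
		"ip-10-0-0-1-443": time.Now().Add(-time.Minute), // stale entry
	}
	reconcileOnce(endpointMap, "ip-10-0-0-2-443", expire)
	fmt.Println(interval, endpointMap) // stale entry removed, our own refreshed
}
```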

| Kubernetes Release | Quality | Description |
| ------------- | ------------- | ----------- |
| 1.9 | stable | <ul><li>Change the logic in the current reconciler</li></ul> |

We could potentially reuse the old reconciler, but ignore the masterCount and change the logic to use the proposal from the previous section.

@lavalamp (Member), Aug 21, 2017

A first step might just be to count the number of entries in the config map and set the masterCount variable to that every time through the above loop (we'd probably need to add a lock). That would probably be just a few-line change.
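
A sketch of that few-line change, with assumed type and field names:

```go
package main

import (
	"fmt"
	"sync"
)

type masterCountReconciler struct {
	mu          sync.Mutex
	masterCount int
}

// syncMasterCount sets masterCount to the number of entries currently in
// the endpoints ConfigMap data, under a lock as suggested above.
func (r *masterCountReconciler) syncMasterCount(endpointMap map[string]string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.masterCount = len(endpointMap)
}

func main() {
	r := &masterCountReconciler{}
	r.syncMasterCount(map[string]string{
		"ip-10-0-0-1-443": "{}",
		"ip-10-0-0-2-443": "{}",
	})
	fmt.Println(r.masterCount) // 2
}
```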

@rphillips (Member, Author) commented Aug 21, 2017

Thank you all for the feedback! I'll add everything in.

@fisherxu (Member) commented Aug 22, 2017

A small doubt: why can't we add a label to the kube-apiserver's pod and sync the endpoints through the service's selector?

@lavalamp (Member) commented Aug 22, 2017

> A small doubt: why can't we add a label to the kube-apiserver's pod and sync the endpoints through the service's selector?

Yeah, this would definitely be the way to go if all installations of Kubernetes were self-hosted. Unfortunately that is not the case.

@luxas (Member) left a review

Thanks @rphillips!
Left a couple of comments.
Pros/cons between the different methods (new reconciler and old reconciler) would be nice to see as well

Proposal to fix Issue [#22609](https://github.com/kubernetes/kubernetes/issues/22609)

`kube-apiserver` currently has a command-line argument `--apiserver-count`
specifying the number of api masters. This masterCount is used in the

@luxas (Member), Aug 22, 2017

nit: API Servers

specifying the number of api masters. This masterCount is used in the
MasterCountEndpointReconciler on a 10 second interval to potentially cleanup
stale API Endpoints. The issue is when the number of kube-apiserver instances
gets below masterCount. If this case happens, the stale instances within the

@luxas (Member), Aug 22, 2017

and above, right? Then they flip nearly every time, right?




### Refactor Old Reconciler

@luxas (Member), Aug 22, 2017

nit: Can you make it more obvious that these are the two currently viable ways to do this? (and add one for the CRD thingy?)
Like: Possible solution 1: foo, Possible solution 2: bar


## Prior Art

[Security Labeller](https://github.com/coreos-inc/security-labeller/issues/18#issuecomment-320791878)

@luxas (Member), Aug 22, 2017

404 for me


1. [Overview](#overview)
2. [Known Issues](#known-issues)
3. [Proposal](#proposal)

@luxas (Member), Aug 22, 2017

Add a sub-section for the different strategies?

Ryan Phillips added some commits Aug 22, 2017

@k8s-ci-robot k8s-ci-robot added size/M and removed size/L labels Aug 28, 2017


@rphillips rphillips force-pushed the rphillips:fixes/apiserver-count-fix branch from 9374400 to 59a7e48 Aug 28, 2017


configmap.yml:
Custom Resource Definitions and ConfigMaps were proposed, but since they are

@xiang90, Aug 28, 2017

I can understand that a ConfigMap has the watching issue, but a CRD should not be watched globally. The problem with CRD, as far as I can tell, is the not-super-clean layering, and potentially we might want something different that can handle both liveness and locking as a Kubernetes-native resource. And we are not sure yet.


@rphillips rphillips force-pushed the rphillips:fixes/apiserver-count-fix branch from 493c3cb to 3cf30ed Aug 28, 2017

@rphillips (Member, Author) commented Aug 28, 2017

@smarterclayton @lavalamp updated

I have the LeaseEndpointReconciler partially ported (clean build), but more work to do on it.

@rphillips (Member, Author) commented Aug 30, 2017

Is fixing kube-proxy in scope of this issue and the eventual PR, or should it be a separate effort?

* The IP for endpoint ‘B’ is not removed from the Endpoints list

There is logic within the

@bgrant0607 (Member), Aug 30, 2017

I assume aggregated apiservers won't do something similar?

@liggitt (Member), Aug 30, 2017

Aggregated servers can either delegate to service DNS names (and therefore kube-proxy) or route to an IP selected from the endpoints for the service (and therefore take advantage of whatever mechanism removes unhealthy IP addresses from those endpoints).

@lavalamp (Member) commented Aug 30, 2017

rphillips pushed a commit to rphillips/kubernetes that referenced this pull request Aug 31, 2017

Ryan Phillips: add lease endpoint reconciler
fixes kubernetes/community#939
fixes kubernetes#22609

diff --git a/pkg/election/doc.go b/pkg/election/doc.go
new file mode 100644
index 0000000000..d61d49d7bb
--- /dev/null
+++ b/pkg/election/doc.go
@@ -0,0 +1,2 @@
+// Package election provides objects for managing the list of active masters via leases.
+package election
diff --git a/pkg/election/lease_endpoint_reconciler.go b/pkg/election/lease_endpoint_reconciler.go
new file mode 100644
index 0000000000..397a174010
--- /dev/null
+++ b/pkg/election/lease_endpoint_reconciler.go
@@ -0,0 +1,228 @@
+package election
+
+import (
+	"fmt"
+	"net"
+
+	"github.com/golang/glog"
+	"k8s.io/apimachinery/pkg/api/errors"
+	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+	kruntime "k8s.io/apimachinery/pkg/runtime"
+	apirequest "k8s.io/apiserver/pkg/endpoints/request"
+	"k8s.io/apiserver/pkg/storage"
+	"k8s.io/kubernetes/pkg/api"
+	"k8s.io/kubernetes/pkg/api/endpoints"
+	"k8s.io/kubernetes/pkg/registry/core/endpoint"
+)
+
+// Leases is an interface which assists in managing the set of active masters
+type Leases interface {
+	// ListLeases retrieves a list of the current master IPs
+	ListLeases() ([]string, error)
+
+	// UpdateLease adds or refreshes a master's lease
+	UpdateLease(ip string) error
+}
+
+type storageLeases struct {
+	storage   storage.Interface
+	baseKey   string
+	leaseTime uint64
+}
+
+var _ Leases = &storageLeases{}
+
+// ListLeases retrieves a list of the current master IPs from storage
+func (s *storageLeases) ListLeases() ([]string, error) {
+	ipInfoList := &api.EndpointsList{}
+	if err := s.storage.List(apirequest.NewDefaultContext(), s.baseKey, "0", storage.Everything, ipInfoList); err != nil {
+		return nil, err
+	}
+
+	ipList := make([]string, len(ipInfoList.Items))
+	for i, ip := range ipInfoList.Items {
+		ipList[i] = ip.Subsets[0].Addresses[0].IP
+	}
+
+	glog.V(6).Infof("Current master IPs listed in storage are %v", ipList)
+
+	return ipList, nil
+}
+
+// UpdateLease resets the TTL on a master IP in storage
+func (s *storageLeases) UpdateLease(ip string) error {
+	return s.storage.GuaranteedUpdate(apirequest.NewDefaultContext(), s.baseKey+"/"+ip, &api.Endpoints{}, true, nil, func(input kruntime.Object, respMeta storage.ResponseMeta) (kruntime.Object, *uint64, error) {
+		// just make sure we've got the right IP set, and then refresh the TTL
+		existing := input.(*api.Endpoints)
+		existing.Subsets = []api.EndpointSubset{
+			{
+				Addresses: []api.EndpointAddress{{IP: ip}},
+			},
+		}
+
+		leaseTime := s.leaseTime
+
+		// NB: GuaranteedUpdate does not perform the store operation unless
+		// something changed between load and store (not including resource
+		// version), meaning we can't refresh the TTL without actually
+		// changing a field.
+		existing.Generation += 1
+
+		glog.V(6).Infof("Resetting TTL on master IP %q listed in storage to %v", ip, leaseTime)
+
+		return existing, &leaseTime, nil
+	})
+}
+
+// NewLeases creates a new etcd-based Leases implementation.
+func NewLeases(storage storage.Interface, baseKey string, leaseTime uint64) Leases {
+	return &storageLeases{
+		storage:   storage,
+		baseKey:   baseKey,
+		leaseTime: leaseTime,
+	}
+}
+
+type leaseEndpointReconciler struct {
+	endpointRegistry endpoint.Registry
+	masterLeases     Leases
+}
+
+func NewLeaseEndpointReconciler(endpointRegistry endpoint.Registry, masterLeases Leases) *leaseEndpointReconciler {
+	return &leaseEndpointReconciler{
+		endpointRegistry: endpointRegistry,
+		masterLeases:     masterLeases,
+	}
+}
+
+// ReconcileEndpoints lists keys in a special etcd directory.
+// Each key is expected to have a TTL of R+n, where R is the refresh interval
+// at which this function is called, and n is some small value.  If an
+// apiserver goes down, it will fail to refresh its key's TTL and the key will
+// expire. ReconcileEndpoints will notice that the endpoints object is
+// different from the directory listing, and update the endpoints object
+// accordingly.
+func (r *leaseEndpointReconciler) ReconcileEndpoints(serviceName string, ip net.IP, endpointPorts []api.EndpointPort, reconcilePorts bool) error {
+	ctx := apirequest.NewDefaultContext()
+
+	// Refresh the TTL on our key, independently of whether any error or
+	// update conflict happens below. This makes sure that at least some of
+	// the masters will add our endpoint.
+	if err := r.masterLeases.UpdateLease(ip.String()); err != nil {
+		return err
+	}
+
+	// Retrieve the current list of endpoints...
+	e, err := r.endpointRegistry.GetEndpoints(ctx, serviceName, &metav1.GetOptions{})
+	if err != nil {
+		if !errors.IsNotFound(err) {
+			return err
+		}
+
+		e = &api.Endpoints{
+			ObjectMeta: metav1.ObjectMeta{
+				Name:      serviceName,
+				Namespace: api.NamespaceDefault,
+			},
+		}
+	}
+
+	// ... and the list of master IP keys from etcd
+	masterIPs, err := r.masterLeases.ListLeases()
+	if err != nil {
+		return err
+	}
+
+	// Since we just refreshed our own key, assume that zero endpoints
+	// returned from storage indicates an issue or invalid state, and thus do
+	// not update the endpoints list based on the result.
+	if len(masterIPs) == 0 {
+		return fmt.Errorf("no master IPs were listed in storage, refusing to erase all endpoints for the kubernetes service")
+	}
+
+	// Next, we compare the current list of endpoints with the list of master IP keys
+	formatCorrect, ipCorrect, portsCorrect := checkEndpointSubsetFormatWithLease(e, masterIPs, endpointPorts, reconcilePorts)
+	if formatCorrect && ipCorrect && portsCorrect {
+		return nil
+	}
+
+	if !formatCorrect {
+		// Something is egregiously wrong, just re-make the endpoints record.
+		e.Subsets = []api.EndpointSubset{{
+			Addresses: []api.EndpointAddress{},
+			Ports:     endpointPorts,
+		}}
+	}
+
+	if !formatCorrect || !ipCorrect {
+		// repopulate the addresses according to the expected IPs from etcd
+		e.Subsets[0].Addresses = make([]api.EndpointAddress, len(masterIPs))
+		for ind, ip := range masterIPs {
+			e.Subsets[0].Addresses[ind] = api.EndpointAddress{IP: ip}
+		}
+
+		// Lexicographic order is retained by this step.
+		e.Subsets = endpoints.RepackSubsets(e.Subsets)
+	}
+
+	if !portsCorrect {
+		// Reset ports.
+		e.Subsets[0].Ports = endpointPorts
+	}
+
+	glog.Warningf("Resetting endpoints for master service %q to %v", serviceName, masterIPs)
+	return r.endpointRegistry.UpdateEndpoints(ctx, e)
+}
+
+// checkEndpointSubsetFormatWithLease determines if the endpoint is in the
+// format ReconcileEndpoints expects when the controller is using leases.
+//
+// Return values:
+// * formatCorrect is true if exactly one subset is found.
+// * ipsCorrect when the addresses in the endpoints match the expected addresses list
+// * portsCorrect is true when endpoint ports exactly match provided ports.
+//     portsCorrect is only evaluated when reconcilePorts is set to true.
+func checkEndpointSubsetFormatWithLease(e *api.Endpoints, expectedIPs []string, ports []api.EndpointPort, reconcilePorts bool) (formatCorrect bool, ipsCorrect bool, portsCorrect bool) {
+	if len(e.Subsets) != 1 {
+		return false, false, false
+	}
+	sub := &e.Subsets[0]
+	portsCorrect = true
+	if reconcilePorts {
+		if len(sub.Ports) != len(ports) {
+			portsCorrect = false
+		} else {
+			for i, port := range ports {
+				if port != sub.Ports[i] {
+					portsCorrect = false
+					break
+				}
+			}
+		}
+	}
+
+	ipsCorrect = true
+	if len(sub.Addresses) != len(expectedIPs) {
+		ipsCorrect = false
+	} else {
+		// check the actual content of the addresses
+		// present addrs is used as a set (the keys) and to indicate if a
+		// value was already found (the values)
+		presentAddrs := make(map[string]bool, len(expectedIPs))
+		for _, ip := range expectedIPs {
+			presentAddrs[ip] = false
+		}
+
+		// uniqueness is assumed amongst all Addresses.
+		for _, addr := range sub.Addresses {
+			if alreadySeen, ok := presentAddrs[addr.IP]; alreadySeen || !ok {
+				ipsCorrect = false
+				break
+			}
+
+			presentAddrs[addr.IP] = true
+		}
+	}
+
+	return true, ipsCorrect, portsCorrect
+}
diff --git a/pkg/election/lease_endpoint_reconciler_test.go b/pkg/election/lease_endpoint_reconciler_test.go
new file mode 100644
index 0000000000..f5cdb2a675
--- /dev/null
+++ b/pkg/election/lease_endpoint_reconciler_test.go
@@ -0,0 +1,510 @@
+package election
+
+import (
+	"net"
+	"reflect"
+	"testing"
+
+	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
+	"k8s.io/kubernetes/pkg/api"
+	"k8s.io/kubernetes/pkg/registry/registrytest"
+)
+
+type fakeLeases struct {
+	keys map[string]bool
+}
+
+var _ Leases = &fakeLeases{}
+
+func newFakeLeases() *fakeLeases {
+	return &fakeLeases{make(map[string]bool)}
+}
+
+func (f *fakeLeases) ListLeases() ([]string, error) {
+	res := make([]string, 0, len(f.keys))
+	for ip := range f.keys {
+		res = append(res, ip)
+	}
+	return res, nil
+}
+
+func (f *fakeLeases) UpdateLease(ip string) error {
+	f.keys[ip] = true
+	return nil
+}
+
+func (f *fakeLeases) SetKeys(keys []string) {
+	for _, ip := range keys {
+		f.keys[ip] = false
+	}
+}
+
+func (f *fakeLeases) GetUpdatedKeys() []string {
+	res := []string{}
+	for ip, updated := range f.keys {
+		if updated {
+			res = append(res, ip)
+		}
+	}
+	return res
+}
+
+func TestLeaseEndpointReconciler(t *testing.T) {
+	ns := api.NamespaceDefault
+	om := func(name string) metav1.ObjectMeta {
+		return metav1.ObjectMeta{Namespace: ns, Name: name}
+	}
+	reconcile_tests := []struct {
+		testName      string
+		serviceName   string
+		ip            string
+		endpointPorts []api.EndpointPort
+		endpointKeys  []string
+		endpoints     *api.EndpointsList
+		expectUpdate  *api.Endpoints // nil means none expected
+	}{
+		{
+			testName:      "no existing endpoints",
+			serviceName:   "foo",
+			ip:            "1.2.3.4",
+			endpointPorts: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+			endpoints:     nil,
+			expectUpdate: &api.Endpoints{
+				ObjectMeta: om("foo"),
+				Subsets: []api.EndpointSubset{{
+					Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+					Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+				}},
+			},
+		},
+		{
+			testName:      "existing endpoints satisfy",
+			serviceName:   "foo",
+			ip:            "1.2.3.4",
+			endpointPorts: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+			endpoints: &api.EndpointsList{
+				Items: []api.Endpoints{{
+					ObjectMeta: om("foo"),
+					Subsets: []api.EndpointSubset{{
+						Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+						Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+					}},
+				}},
+			},
+		},
+		{
+			testName:      "existing endpoints satisfy + refresh existing key",
+			serviceName:   "foo",
+			ip:            "1.2.3.4",
+			endpointPorts: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+			endpointKeys:  []string{"1.2.3.4"},
+			endpoints: &api.EndpointsList{
+				Items: []api.Endpoints{{
+					ObjectMeta: om("foo"),
+					Subsets: []api.EndpointSubset{{
+						Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+						Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+					}},
+				}},
+			},
+		},
+		{
+			testName:      "existing endpoints satisfy but too many",
+			serviceName:   "foo",
+			ip:            "1.2.3.4",
+			endpointPorts: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+			endpoints: &api.EndpointsList{
+				Items: []api.Endpoints{{
+					ObjectMeta: om("foo"),
+					Subsets: []api.EndpointSubset{{
+						Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}, {IP: "4.3.2.1"}},
+						Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+					}},
+				}},
+			},
+			expectUpdate: &api.Endpoints{
+				ObjectMeta: om("foo"),
+				Subsets: []api.EndpointSubset{{
+					Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+					Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+				}},
+			},
+		},
+		{
+			testName:      "existing endpoints satisfy but too many + extra masters",
+			serviceName:   "foo",
+			ip:            "1.2.3.4",
+			endpointPorts: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+			endpointKeys:  []string{"1.2.3.4", "4.3.2.2", "4.3.2.3", "4.3.2.4"},
+			endpoints: &api.EndpointsList{
+				Items: []api.Endpoints{{
+					ObjectMeta: om("foo"),
+					Subsets: []api.EndpointSubset{{
+						Addresses: []api.EndpointAddress{
+							{IP: "1.2.3.4"},
+							{IP: "4.3.2.1"},
+							{IP: "4.3.2.2"},
+							{IP: "4.3.2.3"},
+							{IP: "4.3.2.4"},
+						},
+						Ports: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+					}},
+				}},
+			},
+			expectUpdate: &api.Endpoints{
+				ObjectMeta: om("foo"),
+				Subsets: []api.EndpointSubset{{
+					Addresses: []api.EndpointAddress{
+						{IP: "1.2.3.4"},
+						{IP: "4.3.2.2"},
+						{IP: "4.3.2.3"},
+						{IP: "4.3.2.4"},
+					},
+					Ports: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+				}},
+			},
+		},
+		{
+			testName:      "existing endpoints satisfy but too many + extra masters + delete first",
+			serviceName:   "foo",
+			ip:            "4.3.2.4",
+			endpointPorts: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+			endpointKeys:  []string{"4.3.2.1", "4.3.2.2", "4.3.2.3", "4.3.2.4"},
+			endpoints: &api.EndpointsList{
+				Items: []api.Endpoints{{
+					ObjectMeta: om("foo"),
+					Subsets: []api.EndpointSubset{{
+						Addresses: []api.EndpointAddress{
+							{IP: "1.2.3.4"},
+							{IP: "4.3.2.1"},
+							{IP: "4.3.2.2"},
+							{IP: "4.3.2.3"},
+							{IP: "4.3.2.4"},
+						},
+						Ports: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+					}},
+				}},
+			},
+			expectUpdate: &api.Endpoints{
+				ObjectMeta: om("foo"),
+				Subsets: []api.EndpointSubset{{
+					Addresses: []api.EndpointAddress{
+						{IP: "4.3.2.1"},
+						{IP: "4.3.2.2"},
+						{IP: "4.3.2.3"},
+						{IP: "4.3.2.4"},
+					},
+					Ports: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+				}},
+			},
+		},
+		{
+			testName:      "existing endpoints current IP missing",
+			serviceName:   "foo",
+			ip:            "4.3.2.2",
+			endpointPorts: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+			endpointKeys:  []string{"4.3.2.1"},
+			endpoints: &api.EndpointsList{
+				Items: []api.Endpoints{{
+					ObjectMeta: om("foo"),
+					Subsets: []api.EndpointSubset{{
+						Addresses: []api.EndpointAddress{
+							{IP: "4.3.2.1"},
+						},
+						Ports: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+					}},
+				}},
+			},
+			expectUpdate: &api.Endpoints{
+				ObjectMeta: om("foo"),
+				Subsets: []api.EndpointSubset{{
+					Addresses: []api.EndpointAddress{
+						{IP: "4.3.2.1"},
+						{IP: "4.3.2.2"},
+					},
+					Ports: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+				}},
+			},
+		},
+		{
+			testName:      "existing endpoints wrong name",
+			serviceName:   "foo",
+			ip:            "1.2.3.4",
+			endpointPorts: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+			endpoints: &api.EndpointsList{
+				Items: []api.Endpoints{{
+					ObjectMeta: om("bar"),
+					Subsets: []api.EndpointSubset{{
+						Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+						Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+					}},
+				}},
+			},
+			expectUpdate: &api.Endpoints{
+				ObjectMeta: om("foo"),
+				Subsets: []api.EndpointSubset{{
+					Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+					Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+				}},
+			},
+		},
+		{
+			testName:      "existing endpoints wrong IP",
+			serviceName:   "foo",
+			ip:            "1.2.3.4",
+			endpointPorts: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+			endpoints: &api.EndpointsList{
+				Items: []api.Endpoints{{
+					ObjectMeta: om("foo"),
+					Subsets: []api.EndpointSubset{{
+						Addresses: []api.EndpointAddress{{IP: "4.3.2.1"}},
+						Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+					}},
+				}},
+			},
+			expectUpdate: &api.Endpoints{
+				ObjectMeta: om("foo"),
+				Subsets: []api.EndpointSubset{{
+					Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+					Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+				}},
+			},
+		},
+		{
+			testName:      "existing endpoints wrong port",
+			serviceName:   "foo",
+			ip:            "1.2.3.4",
+			endpointPorts: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+			endpoints: &api.EndpointsList{
+				Items: []api.Endpoints{{
+					ObjectMeta: om("foo"),
+					Subsets: []api.EndpointSubset{{
+						Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+						Ports:     []api.EndpointPort{{Name: "foo", Port: 9090, Protocol: "TCP"}},
+					}},
+				}},
+			},
+			expectUpdate: &api.Endpoints{
+				ObjectMeta: om("foo"),
+				Subsets: []api.EndpointSubset{{
+					Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+					Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+				}},
+			},
+		},
+		{
+			testName:      "existing endpoints wrong protocol",
+			serviceName:   "foo",
+			ip:            "1.2.3.4",
+			endpointPorts: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+			endpoints: &api.EndpointsList{
+				Items: []api.Endpoints{{
+					ObjectMeta: om("foo"),
+					Subsets: []api.EndpointSubset{{
+						Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+						Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "UDP"}},
+					}},
+				}},
+			},
+			expectUpdate: &api.Endpoints{
+				ObjectMeta: om("foo"),
+				Subsets: []api.EndpointSubset{{
+					Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+					Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+				}},
+			},
+		},
+		{
+			testName:      "existing endpoints wrong port name",
+			serviceName:   "foo",
+			ip:            "1.2.3.4",
+			endpointPorts: []api.EndpointPort{{Name: "baz", Port: 8080, Protocol: "TCP"}},
+			endpoints: &api.EndpointsList{
+				Items: []api.Endpoints{{
+					ObjectMeta: om("foo"),
+					Subsets: []api.EndpointSubset{{
+						Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+						Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+					}},
+				}},
+			},
+			expectUpdate: &api.Endpoints{
+				ObjectMeta: om("foo"),
+				Subsets: []api.EndpointSubset{{
+					Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+					Ports:     []api.EndpointPort{{Name: "baz", Port: 8080, Protocol: "TCP"}},
+				}},
+			},
+		},
+		{
+			testName:    "existing endpoints extra service ports satisfy",
+			serviceName: "foo",
+			ip:          "1.2.3.4",
+			endpointPorts: []api.EndpointPort{
+				{Name: "foo", Port: 8080, Protocol: "TCP"},
+				{Name: "bar", Port: 1000, Protocol: "TCP"},
+				{Name: "baz", Port: 1010, Protocol: "TCP"},
+			},
+			endpoints: &api.EndpointsList{
+				Items: []api.Endpoints{{
+					ObjectMeta: om("foo"),
+					Subsets: []api.EndpointSubset{{
+						Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+						Ports: []api.EndpointPort{
+							{Name: "foo", Port: 8080, Protocol: "TCP"},
+							{Name: "bar", Port: 1000, Protocol: "TCP"},
+							{Name: "baz", Port: 1010, Protocol: "TCP"},
+						},
+					}},
+				}},
+			},
+		},
+		{
+			testName:    "existing endpoints extra service ports missing port",
+			serviceName: "foo",
+			ip:          "1.2.3.4",
+			endpointPorts: []api.EndpointPort{
+				{Name: "foo", Port: 8080, Protocol: "TCP"},
+				{Name: "bar", Port: 1000, Protocol: "TCP"},
+			},
+			endpoints: &api.EndpointsList{
+				Items: []api.Endpoints{{
+					ObjectMeta: om("foo"),
+					Subsets: []api.EndpointSubset{{
+						Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+						Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+					}},
+				}},
+			},
+			expectUpdate: &api.Endpoints{
+				ObjectMeta: om("foo"),
+				Subsets: []api.EndpointSubset{{
+					Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+					Ports: []api.EndpointPort{
+						{Name: "foo", Port: 8080, Protocol: "TCP"},
+						{Name: "bar", Port: 1000, Protocol: "TCP"},
+					},
+				}},
+			},
+		},
+	}
+	for _, test := range reconcile_tests {
+		fakeLeases := newFakeLeases()
+		fakeLeases.SetKeys(test.endpointKeys)
+		registry := &registrytest.EndpointRegistry{
+			Endpoints: test.endpoints,
+		}
+		r := NewLeaseEndpointReconciler(registry, fakeLeases)
+		err := r.ReconcileEndpoints(test.serviceName, net.ParseIP(test.ip), test.endpointPorts, true)
+		if err != nil {
+			t.Errorf("case %q: unexpected error: %v", test.testName, err)
+		}
+		if test.expectUpdate != nil {
+			if len(registry.Updates) != 1 {
+				t.Errorf("case %q: unexpected updates: %v", test.testName, registry.Updates)
+			} else if e, a := test.expectUpdate, &registry.Updates[0]; !reflect.DeepEqual(e, a) {
+				t.Errorf("case %q: expected update:\n%#v\ngot:\n%#v\n", test.testName, e, a)
+			}
+		}
+		if test.expectUpdate == nil && len(registry.Updates) > 0 {
+			t.Errorf("case %q: no update expected, yet saw: %v", test.testName, registry.Updates)
+		}
+		if updatedKeys := fakeLeases.GetUpdatedKeys(); len(updatedKeys) != 1 || updatedKeys[0] != test.ip {
+			t.Errorf("case %q: expected the master's IP to be refreshed, but the following IPs were refreshed instead: %v", test.testName, updatedKeys)
+		}
+	}
+
+	non_reconcile_tests := []struct {
+		testName      string
+		serviceName   string
+		ip            string
+		endpointPorts []api.EndpointPort
+		endpointKeys  []string
+		endpoints     *api.EndpointsList
+		expectUpdate  *api.Endpoints // nil means none expected
+	}{
+		{
+			testName:    "existing endpoints extra service ports missing port no update",
+			serviceName: "foo",
+			ip:          "1.2.3.4",
+			endpointPorts: []api.EndpointPort{
+				{Name: "foo", Port: 8080, Protocol: "TCP"},
+				{Name: "bar", Port: 1000, Protocol: "TCP"},
+			},
+			endpoints: &api.EndpointsList{
+				Items: []api.Endpoints{{
+					ObjectMeta: om("foo"),
+					Subsets: []api.EndpointSubset{{
+						Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+						Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+					}},
+				}},
+			},
+			expectUpdate: nil,
+		},
+		{
+			testName:    "existing endpoints extra service ports, wrong ports, wrong IP",
+			serviceName: "foo",
+			ip:          "1.2.3.4",
+			endpointPorts: []api.EndpointPort{
+				{Name: "foo", Port: 8080, Protocol: "TCP"},
+				{Name: "bar", Port: 1000, Protocol: "TCP"},
+			},
+			endpoints: &api.EndpointsList{
+				Items: []api.Endpoints{{
+					ObjectMeta: om("foo"),
+					Subsets: []api.EndpointSubset{{
+						Addresses: []api.EndpointAddress{{IP: "4.3.2.1"}},
+						Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+					}},
+				}},
+			},
+			expectUpdate: &api.Endpoints{
+				ObjectMeta: om("foo"),
+				Subsets: []api.EndpointSubset{{
+					Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+					Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+				}},
+			},
+		},
+		{
+			testName:      "no existing endpoints",
+			serviceName:   "foo",
+			ip:            "1.2.3.4",
+			endpointPorts: []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+			endpoints:     nil,
+			expectUpdate: &api.Endpoints{
+				ObjectMeta: om("foo"),
+				Subsets: []api.EndpointSubset{{
+					Addresses: []api.EndpointAddress{{IP: "1.2.3.4"}},
+					Ports:     []api.EndpointPort{{Name: "foo", Port: 8080, Protocol: "TCP"}},
+				}},
+			},
+		},
+	}
+	for _, test := range non_reconcile_tests {
+		fakeLeases := newFakeLeases()
+		fakeLeases.SetKeys(test.endpointKeys)
+		registry := &registrytest.EndpointRegistry{
+			Endpoints: test.endpoints,
+		}
+		r := NewLeaseEndpointReconciler(registry, fakeLeases)
+		err := r.ReconcileEndpoints(test.serviceName, net.ParseIP(test.ip), test.endpointPorts, false)
+		if err != nil {
+			t.Errorf("case %q: unexpected error: %v", test.testName, err)
+		}
+		if test.expectUpdate != nil {
+			if len(registry.Updates) != 1 {
+				t.Errorf("case %q: unexpected updates: %v", test.testName, registry.Updates)
+			} else if e, a := test.expectUpdate, &registry.Updates[0]; !reflect.DeepEqual(e, a) {
+				t.Errorf("case %q: expected update:\n%#v\ngot:\n%#v\n", test.testName, e, a)
+			}
+		}
+		if test.expectUpdate == nil && len(registry.Updates) > 0 {
+			t.Errorf("case %q: no update expected, yet saw: %v", test.testName, registry.Updates)
+		}
+		if updatedKeys := fakeLeases.GetUpdatedKeys(); len(updatedKeys) != 1 || updatedKeys[0] != test.ip {
+			t.Errorf("case %q: expected the master's IP to be refreshed, but the following IPs were refreshed instead: %v", test.testName, updatedKeys)
+		}
+	}
+}
diff --git a/pkg/master/master.go b/pkg/master/master.go
index 97c5c5357b..d140b179a0 100644
--- a/pkg/master/master.go
+++ b/pkg/master/master.go
@@ -51,12 +51,17 @@ import (
 	genericapiserver "k8s.io/apiserver/pkg/server"
 	"k8s.io/apiserver/pkg/server/healthz"
 	serverstorage "k8s.io/apiserver/pkg/server/storage"
+	storagefactory "k8s.io/apiserver/pkg/storage/storagebackend/factory"
 	corev1client "k8s.io/client-go/kubernetes/typed/core/v1"
 	"k8s.io/kubernetes/cmd/kube-apiserver/app/options"
 	"k8s.io/kubernetes/pkg/api"
+	kapi "k8s.io/kubernetes/pkg/api"
 	coreclient "k8s.io/kubernetes/pkg/client/clientset_generated/internalclientset/typed/core/internalversion"
+	election "k8s.io/kubernetes/pkg/election"
 	kubeletclient "k8s.io/kubernetes/pkg/kubelet/client"
 	"k8s.io/kubernetes/pkg/master/tunneler"
+	"k8s.io/kubernetes/pkg/registry/core/endpoint"
+	endpointsstorage "k8s.io/kubernetes/pkg/registry/core/endpoint/storage"
 	"k8s.io/kubernetes/pkg/routes"
 	nodeutil "k8s.io/kubernetes/pkg/util/node"

@@ -87,6 +92,16 @@ const (
 	DefaultEndpointReconcilerInterval = 10 * time.Second
 )

+// EndpointReconcilerEnum selects which reconciler to use
+type EndpointReconcilerEnum int
+
+const (
+	// DefaultMasterCountReconciler will select the original reconciler
+	DefaultMasterCountReconciler = 0
+	// LeaseEndpointReconciler will select a storage based reconciler
+	LeaseEndpointReconciler = iota
+)
+
 type Config struct {
 	GenericConfig *genericapiserver.Config

@@ -135,6 +150,12 @@ type Config struct {
 	// Number of masters running; all masters must be started with the
 	// same value for this field. (Numbers > 1 currently untested.)
 	MasterCount int
+
+	// out of the kubernetes service record. It is not recommended to set this value below 15s.
+	MasterEndpointReconcileTTL int
+
+	// Selects which reconciler to use
+	EndpointReconcilerEnum int
 }

 // EndpointReconcilerConfig holds the endpoint reconciler and endpoint reconciliation interval to be
@@ -155,6 +176,49 @@ type completedConfig struct {
 	*Config
 }

+func (c *Config) createMasterCountReconciler() EndpointReconciler {
+	// use a default endpoint reconciler if nothing is set
+	endpointClient := coreclient.NewForConfigOrDie(c.GenericConfig.LoopbackClientConfig)
+	return NewMasterCountEndpointReconciler(c.MasterCount, endpointClient)
+}
+
+func (c *Config) createLeaseReconciler() EndpointReconciler {
+	ttl := c.MasterEndpointReconcileTTL
+	config, err := c.StorageFactory.NewConfig(kapi.Resource("apiServerIPInfo"))
+	if err != nil {
+		glog.Fatalf("Error determining service IP ranges: %v", err)
+	}
+	leaseStorage, _, err := storagefactory.Create(*config)
+	if err != nil {
+		glog.Fatalf("Error creating storage factory: %v", err)
+	}
+	endpointConfig, err := c.StorageFactory.NewConfig(kapi.Resource("endpoints"))
+	if err != nil {
+		glog.Fatalf("Error getting storage config: %v", err)
+	}
+	endpointsStorage := endpointsstorage.NewREST(generic.RESTOptions{
+		StorageConfig:           endpointConfig,
+		Decorator:               generic.UndecoratedStorage,
+		DeleteCollectionWorkers: 0,
+		ResourcePrefix:          c.StorageFactory.ResourcePrefix(kapi.Resource("endpoints")),
+	})
+	endpointRegistry := endpoint.NewRegistry(endpointsStorage)
+	masterLeases := election.NewLeases(leaseStorage, "/masterleases/", uint64(ttl))
+	return election.NewLeaseEndpointReconciler(endpointRegistry, masterLeases)
+}
+
+func (c *Config) createEndpointReconciler() EndpointReconciler {
+	switch c.EndpointReconcilerEnum {
+	case DefaultMasterCountReconciler:
+		return c.createMasterCountReconciler()
+	case LeaseEndpointReconciler:
+		return c.createLeaseReconciler()
+	default:
+		glog.Fatalf("Reconciler not implemented: %v", c.EndpointReconcilerEnum)
+	}
+	return nil
+}
+
 // Complete fills in any fields not set that are required to have valid data. It's mutating the receiver.
 func (c *Config) Complete() completedConfig {
 	c.GenericConfig.Complete()
@@ -169,6 +233,9 @@ func (c *Config) Complete() completedConfig {
 	if c.APIServerServiceIP == nil {
 		c.APIServerServiceIP = apiServerServiceIP
 	}
+	if c.MasterEndpointReconcileTTL == 0 {
+		c.MasterEndpointReconcileTTL = 15
+	}

 	discoveryAddresses := discovery.DefaultAddresses{DefaultAddress: c.GenericConfig.ExternalAddress}
 	discoveryAddresses.CIDRRules = append(discoveryAddresses.CIDRRules,
@@ -192,9 +259,7 @@ func (c *Config) Complete() completedConfig {
 	}

 	if c.EndpointReconcilerConfig.Reconciler == nil {
-		// use a default endpoint reconciler if nothing is set
-		endpointClient := coreclient.NewForConfigOrDie(c.GenericConfig.LoopbackClientConfig)
-		c.EndpointReconcilerConfig.Reconciler = NewMasterCountEndpointReconciler(c.MasterCount, endpointClient)
+		c.EndpointReconcilerConfig.Reconciler = c.createEndpointReconciler()
 	}

 	// this has always been hardcoded true in the past

Additional commits:

  • pass down the endpoint reconciler type
  • fixup bazel
  • lint
  • add headers

rphillips pushed a commit to rphillips/kubernetes that referenced this pull request Aug 31, 2017

@rphillips (Member, Author) commented Aug 31, 2017

I have submitted a PR for this issue.

@klausenbusk commented Aug 31, 2017

> Yeah, this would definitely be the way to go if all installations of Kubernetes were self-hosted. Unfortunately that is not the case.

Just a question: is there any way to disable the default reconciler, so I can try with a selector already now?

@lavalamp (Member) commented Aug 31, 2017

rphillips pushed a commit to rphillips/kubernetes that referenced this pull request Aug 31, 2017

@rphillips (Member, Author) commented Aug 31, 2017

@lavalamp @klausenbusk I added a 'none' reconciler to the PR to effectively be a noop.
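
For readers following along: a 'none' reconciler amounts to satisfying the `ReconcileEndpoints` signature from the diff above with a no-op. A sketch in the context of that tree, with an assumed type name (not necessarily what the PR uses):

```go
package election

import (
	"net"

	"k8s.io/kubernetes/pkg/api"
)

// noneEndpointReconciler leaves the kubernetes service endpoints entirely
// alone, so an external controller (e.g. a label selector on self-hosted
// apiserver pods) can manage them instead.
type noneEndpointReconciler struct{}

func (noneEndpointReconciler) ReconcileEndpoints(serviceName string, ip net.IP, endpointPorts []api.EndpointPort, reconcilePorts bool) error {
	return nil
}
```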

@lavalamp (Member) commented Aug 31, 2017

/approve
/lgtm

At the SIG meeting yesterday we collectively decided that using the storage interface directly was the least bad way of implementing this.

Note that other apiservers won't have this problem (they can use services & endpoints), it is unique to kube-apiserver.

@k8s-ci-robot k8s-ci-robot added the lgtm label Aug 31, 2017

@k8s-github-robot (Contributor) commented Aug 31, 2017

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 9bb6fef into kubernetes:master Aug 31, 2017

2 checks passed:

  • Submit Queue: Queued to run github e2e tests a second time.
  • cla/linuxfoundation: rphillips authorized
@luxas (Member) commented Aug 31, 2017

@lavalamp, what "storage interface" are you referring to in this context?

This will still use Endpoints, but in another way, right?

@lavalamp (Member) commented Aug 31, 2017

rphillips pushed a commit to rphillips/kubernetes that referenced this pull request Sep 11, 2017

hh pushed a commit to ii/kubernetes that referenced this pull request Sep 23, 2017

Kubernetes Submit Queue
Merge pull request kubernetes#51698 from rphillips/feat/lease_endpoint_reconciler

Automatic merge from submit-queue (batch tested with PRs 52240, 48145, 52220, 51698, 51777). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

add lease endpoint reconciler

**What this PR does / why we need it**: Adds OpenShift's LeaseEndpointReconciler to register kube-apiserver endpoints within the storage registry.

Adds a command-line argument `alpha-endpoint-reconciler-type` to the kube-apiserver.

Defaults to the old MasterCount reconciler.

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: fixes kubernetes/community#939 fixes kubernetes#22609

**Release note**:
```release-note
Adds a command-line argument to kube-apiserver called
--alpha-endpoint-reconciler-type=(master-count, lease, none) (default
"master-count"). The original reconciler is 'master-count'. The 'lease'
reconciler uses the storageapi and a TTL to keep alive an endpoint within the
`kube-apiserver-endpoint` storage namespace. The 'none' reconciler is a noop
reconciler that does not do anything. This is useful for self-hosted
environments.
```

/cc @lavalamp @smarterclayton @ncdc
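
(For anyone wanting to try this: per the release note above, opting in looks like `kube-apiserver --alpha-endpoint-reconciler-type=lease`, and `--alpha-endpoint-reconciler-type=none` disables built-in reconciliation for self-hosted setups, as @klausenbusk asked about earlier.)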