
Modify nodes to register directly with the master. #6949

Merged (2 commits) on May 19, 2015
15 changes: 14 additions & 1 deletion cluster/saltbase/salt/kubelet/default
@@ -22,6 +22,19 @@
{% set api_servers_with_port = api_servers + ":6443" -%}
{% endif -%}

# Disable registration for the kubelet running on the master on GCE.
Member:

Eventually we do want to register the master kubelet. Registration just needs to be retried if the apiserver doesn't respond. And, for now, we want scheduling to be disabled for the master node. #6428
cc @dchen1107 @vishh
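
For readers following along, here is a minimal sketch (not part of this PR) of what disabling scheduling on a registered master could look like with the client API of this era; the node name "kubernetes-master", the helper name, and the error handling are assumptions:

package sketch // illustrative only; not code from this PR

import (
	"github.com/GoogleCloudPlatform/kubernetes/pkg/client"
	"github.com/golang/glog"
)

// markMasterUnschedulable (hypothetical) fetches the registered master node and
// sets Spec.Unschedulable so that regular workloads stay off the master.
func markMasterUnschedulable(cl *client.Client) {
	node, err := cl.Nodes().Get("kubernetes-master") // assumed node name
	if err != nil {
		glog.Errorf("Failed to fetch the master node: %v", err)
		return
	}
	node.Spec.Unschedulable = true
	if _, err := cl.Nodes().Update(node); err != nil {
		glog.Errorf("Failed to mark the master node unschedulable: %v", err)
	}
}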

Contributor:

Why disable registration of the master kubelet? The API server reports the master pods, but the usual APIs used to access those pods, such as '/containerLogs' and '/exec', don't currently work.

Contributor (author):

It keeps things consistent with the current deployment. I'd be happy to change this in a future PR to be configurable, but didn't want too many moving pieces here. For one, we'd need to change cluster validation to expect N+1 nodes (rather than N) since the master is now counted. And I didn't want to worry about other ramifications at the same time as changing the node creation pattern.

wkharold (by email):

Robert, FYI I tested your kubelet mod in my GCE environment this weekend. Worked perfectly! I deploy nodes with cloud configs derived from Paulo's, and the nodes show up with proper names that I can then use to create services with an external load balancer.

Very nice, Thanks ... WkH


Contributor:

This is so awesome Ward! Can't wait to see this merged.

Member:

Fine with you doing the master registration in a future PR, as long as it gets done at some point (with appropriate adjustments for GKE).

# TODO(roberthbailey): Make this configurable via an env var in config-default.sh
{% if grains.cloud == 'gce' -%}
{% if grains['roles'][0] == 'kubernetes-master' -%}
{% set api_servers_with_port = "" -%}
{% endif -%}
{% endif -%}

{% set cloud_provider = "" -%}
{% if grains.cloud is defined -%}
{% set cloud_provider = "--cloud_provider=" + grains.cloud -%}
{% endif -%}

{% set config = "--config=/etc/kubernetes/manifests" -%}
{% set hostname_override = "" -%}
{% if grains.hostname_override is defined -%}
@@ -45,4 +58,4 @@
{% set configure_cbr0 = "--configure-cbr0=" + pillar['allocate_node_cidrs'] -%}
{% endif -%}

DAEMON_ARGS="{{daemon_args}} {{api_servers_with_port}} {{hostname_override}} {{config}} --allow_privileged={{pillar['allow_privileged']}} {{pillar['log_level']}} {{cluster_dns}} {{cluster_domain}} {{docker_root}} {{configure_cbr0}}"
DAEMON_ARGS="{{daemon_args}} {{api_servers_with_port}} {{hostname_override}} {{cloud_provider}} {{config}} --allow_privileged={{pillar['allow_privileged']}} {{pillar['log_level']}} {{cluster_dns}} {{cluster_domain}} {{docker_root}} {{configure_cbr0}}"
7 changes: 3 additions & 4 deletions cmd/integration/integration.go
@@ -101,7 +101,6 @@ func startComponents(firstManifestURL, secondManifestURL, apiVersion string) (st
// Setup
servers := []string{}
glog.Infof("Creating etcd client pointing to %v", servers)
machineList := []string{"localhost", "127.0.0.1"}

handler := delegateHandler{}
apiServer := httptest.NewServer(&handler)
@@ -196,7 +195,7 @@ func startComponents(firstManifestURL, secondManifestURL, apiVersion string) (st
api.ResourceName(api.ResourceMemory): resource.MustParse("10G"),
}}

- nodeController := nodecontroller.NewNodeController(nil, "", machineList, nodeResources, cl, 10, 5*time.Minute, util.NewFakeRateLimiter(),
+ nodeController := nodecontroller.NewNodeController(nil, "", nodeResources, cl, 10, 5*time.Minute, util.NewFakeRateLimiter(),
40*time.Second, 60*time.Second, 5*time.Second, nil, false)
nodeController.Run(5*time.Second, true)
cadvisorInterface := new(cadvisor.Fake)
@@ -206,15 +205,15 @@ func startComponents(firstManifestURL, secondManifestURL, apiVersion string) (st
configFilePath := makeTempDirOrDie("config", testRootDir)
glog.Infof("Using %s as root dir for kubelet #1", testRootDir)
fakeDocker1.VersionInfo = docker.Env{"ApiVersion=1.15"}
kcfg := kubeletapp.SimpleKubelet(cl, &fakeDocker1, machineList[0], testRootDir, firstManifestURL, "127.0.0.1", 10250, api.NamespaceDefault, empty_dir.ProbeVolumePlugins(), nil, cadvisorInterface, configFilePath, nil, kubecontainer.FakeOS{})
kcfg := kubeletapp.SimpleKubelet(cl, &fakeDocker1, "localhost", testRootDir, firstManifestURL, "127.0.0.1", 10250, api.NamespaceDefault, empty_dir.ProbeVolumePlugins(), nil, cadvisorInterface, configFilePath, nil, kubecontainer.FakeOS{})
kubeletapp.RunKubelet(kcfg, nil)
// Kubelet (machine)
// Create a second kubelet so that the guestbook example's two redis slaves both
// have a place they can schedule.
testRootDir = makeTempDirOrDie("kubelet_integ_2.", "")
glog.Infof("Using %s as root dir for kubelet #2", testRootDir)
fakeDocker2.VersionInfo = docker.Env{"ApiVersion=1.15"}
kcfg = kubeletapp.SimpleKubelet(cl, &fakeDocker2, machineList[1], testRootDir, secondManifestURL, "127.0.0.1", 10251, api.NamespaceDefault, empty_dir.ProbeVolumePlugins(), nil, cadvisorInterface, "", nil, kubecontainer.FakeOS{})
kcfg = kubeletapp.SimpleKubelet(cl, &fakeDocker2, "127.0.0.1", testRootDir, secondManifestURL, "127.0.0.1", 10251, api.NamespaceDefault, empty_dir.ProbeVolumePlugins(), nil, cadvisorInterface, "", nil, kubecontainer.FakeOS{})
kubeletapp.RunKubelet(kcfg, nil)
return apiServer.URL, configFilePath
}
2 changes: 1 addition & 1 deletion cmd/kube-controller-manager/app/controllermanager.go
@@ -228,7 +228,7 @@ func (s *CMServer) Run(_ []string) error {
glog.Warning("DEPRECATION NOTICE: sync-node-status flag is being deprecated. It has no effect now and it will be removed in a future version.")
}

- nodeController := nodecontroller.NewNodeController(cloud, s.MinionRegexp, s.MachineList, nodeResources,
+ nodeController := nodecontroller.NewNodeController(cloud, s.MinionRegexp, nodeResources,
kubeClient, s.RegisterRetryCount, s.PodEvictionTimeout, util.NewTokenBucketRateLimiter(s.DeletingPodsQps, s.DeletingPodsBurst),
s.NodeMonitorGracePeriod, s.NodeStartupGracePeriod, s.NodeMonitorPeriod, (*net.IPNet)(&s.ClusterCIDR), s.AllocateNodeCIDRs)
nodeController.Run(s.NodeSyncPeriod, s.SyncNodeList)
7 changes: 7 additions & 0 deletions cmd/kubelet/app/server.go
@@ -89,6 +89,7 @@ type KubeletServer struct {
HealthzBindAddress util.IP
OOMScoreAdj int
APIServerList util.StringList
RegisterNode bool
ClusterDomain string
MasterServiceNamespace string
ClusterDNS util.IP
@@ -155,6 +156,7 @@ func NewKubeletServer() *KubeletServer {
CadvisorPort: 4194,
HealthzPort: 10248,
HealthzBindAddress: util.IP(net.ParseIP("127.0.0.1")),
RegisterNode: true, // will be ignored if no apiserver is configured
OOMScoreAdj: -900,
MasterServiceNamespace: api.NamespaceDefault,
ImageGCHighThresholdPercent: 90,
@@ -211,6 +213,7 @@ func (s *KubeletServer) AddFlags(fs *pflag.FlagSet) {
fs.Var(&s.HealthzBindAddress, "healthz-bind-address", "The IP address for the healthz server to serve on, defaulting to 127.0.0.1 (set to 0.0.0.0 for all interfaces)")
fs.IntVar(&s.OOMScoreAdj, "oom-score-adj", s.OOMScoreAdj, "The oom_score_adj value for kubelet process. Values must be within the range [-1000, 1000]")
fs.Var(&s.APIServerList, "api-servers", "List of Kubernetes API servers for publishing events, and reading pods and services. (ip:port), comma separated.")
fs.BoolVar(&s.RegisterNode, "register-node", s.RegisterNode, "Register the node with the apiserver (defaults to true if --api-server is set)")
fs.StringVar(&s.ClusterDomain, "cluster-domain", s.ClusterDomain, "Domain for this cluster. If set, kubelet will configure all containers to search this domain in addition to the host's search domains")
fs.StringVar(&s.MasterServiceNamespace, "master-service-namespace", s.MasterServiceNamespace, "The namespace from which the kubernetes master services should be injected into pods")
fs.Var(&s.ClusterDNS, "cluster-dns", "IP address for a cluster DNS server. If set, kubelet will configure all containers to use this for DNS resolution in addition to the host's DNS servers")
@@ -318,6 +321,7 @@ func (s *KubeletServer) Run(_ []string) error {
MinimumGCAge: s.MinimumGCAge,
MaxPerPodContainerCount: s.MaxPerPodContainerCount,
MaxContainerCount: s.MaxContainerCount,
RegisterNode: s.RegisterNode,
ClusterDomain: s.ClusterDomain,
ClusterDNS: s.ClusterDNS,
Runonce: s.RunOnce,
@@ -493,6 +497,7 @@ func SimpleKubelet(client *client.Client,
MinimumGCAge: 10 * time.Second,
MaxPerPodContainerCount: 5,
MaxContainerCount: 100,
RegisterNode: true,
MasterServiceNamespace: masterServiceNamespace,
VolumePlugins: volumePlugins,
TLSOptions: tlsOptions,
@@ -618,6 +623,7 @@ type KubeletConfig struct {
MinimumGCAge time.Duration
MaxPerPodContainerCount int
MaxContainerCount int
RegisterNode bool
ClusterDomain string
ClusterDNS util.IP
EnableServer bool
@@ -675,6 +681,7 @@ func createAndInitKubelet(kc *KubeletConfig) (k KubeletBootstrap, pc *config.Pod
kc.RegistryBurst,
gcPolicy,
pc.SeenAllSources,
kc.RegisterNode,
kc.ClusterDomain,
net.IP(kc.ClusterDNS),
kc.MasterServiceNamespace,
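To make the new RegisterNode plumbing concrete, here is a rough sketch (not code from this PR) of the kind of retry loop the review comments describe for kubelet self-registration; the function name, node shape, and retry interval are assumptions:

package sketch // illustrative only; not code from this PR

import (
	"time"

	"github.com/GoogleCloudPlatform/kubernetes/pkg/api"
	apierrors "github.com/GoogleCloudPlatform/kubernetes/pkg/api/errors"
	"github.com/GoogleCloudPlatform/kubernetes/pkg/client"
	"github.com/golang/glog"
)

// registerNode (hypothetical) keeps trying to create the Node object until the
// apiserver responds, treating "already exists" as success.
func registerNode(c *client.Client, nodeName string) {
	node := &api.Node{ObjectMeta: api.ObjectMeta{Name: nodeName}}
	for {
		_, err := c.Nodes().Create(node)
		if err == nil || apierrors.IsAlreadyExists(err) {
			glog.Infof("Node %q registered with the apiserver", nodeName)
			return
		}
		glog.Errorf("Unable to register node %q with the apiserver (%v); will retry", nodeName, err)
		time.Sleep(5 * time.Second)
	}
}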
10 changes: 4 additions & 6 deletions cmd/kubernetes/kubernetes.go
@@ -123,7 +123,7 @@ func runScheduler(cl *client.Client) {
}

// RunControllerManager starts a controller
- func runControllerManager(machineList []string, cl *client.Client, nodeMilliCPU, nodeMemory int64) {
+ func runControllerManager(cl *client.Client, nodeMilliCPU, nodeMemory int64) {
nodeResources := &api.NodeResources{
Capacity: api.ResourceList{
api.ResourceCPU: *resource.NewMilliQuantity(nodeMilliCPU, resource.DecimalSI),
@@ -133,7 +133,7 @@ func runControllerManager(machineList []string, cl *client.Client, nodeMilliCPU,

const nodeSyncPeriod = 10 * time.Second
nodeController := nodecontroller.NewNodeController(
nil, "", machineList, nodeResources, cl, 10, 5*time.Minute, util.NewTokenBucketRateLimiter(*deletingPodsQps, *deletingPodsBurst),
nil, "", nodeResources, cl, 10, 5*time.Minute, util.NewTokenBucketRateLimiter(*deletingPodsQps, *deletingPodsBurst),
40*time.Second, 60*time.Second, 5*time.Second, nil, false)
nodeController.Run(nodeSyncPeriod, true)

@@ -150,18 +150,16 @@ func runControllerManager(machineList []string, cl *client.Client, nodeMilliCPU,
}

func startComponents(etcdClient tools.EtcdClient, cl *client.Client, addr net.IP, port int) {
machineList := []string{"localhost"}

runApiServer(etcdClient, addr, port, *masterServiceNamespace)
runScheduler(cl)
- runControllerManager(machineList, cl, *nodeMilliCPU, *nodeMemory)
+ runControllerManager(cl, *nodeMilliCPU, *nodeMemory)

dockerClient := dockertools.ConnectToDockerOrDie(*dockerEndpoint)
cadvisorInterface, err := cadvisor.New(0)
if err != nil {
glog.Fatalf("Failed to create cAdvisor: %v", err)
}
- kcfg := kubeletapp.SimpleKubelet(cl, dockerClient, machineList[0], "/tmp/kubernetes", "", "127.0.0.1", 10250, *masterServiceNamespace, kubeletapp.ProbeVolumePlugins(), nil, cadvisorInterface, "", nil, kubecontainer.RealOS{})
+ kcfg := kubeletapp.SimpleKubelet(cl, dockerClient, "localhost", "/tmp/kubernetes", "", "127.0.0.1", 10250, *masterServiceNamespace, kubeletapp.ProbeVolumePlugins(), nil, cadvisorInterface, "", nil, kubecontainer.RealOS{})
kubeletapp.RunKubelet(kcfg, nil)

}
2 changes: 2 additions & 0 deletions pkg/api/validation/validation.go
@@ -1195,6 +1195,8 @@ func ValidateNodeUpdate(oldNode *api.Node, node *api.Node) errs.ValidationErrorL
oldNode.ObjectMeta = node.ObjectMeta
// Allow users to update capacity
oldNode.Status.Capacity = node.Status.Capacity
// Allow the controller manager to assign a CIDR to a node.
oldNode.Spec.PodCIDR = node.Spec.PodCIDR
// Allow users to unschedule node
oldNode.Spec.Unschedulable = node.Spec.Unschedulable
// Clear status
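A quick sketch (not from this PR) of the update this validation change permits: when only Spec.PodCIDR differs between the old and new node, ValidateNodeUpdate should return no errors. The CIDR value and the surrounding helper are assumptions:

package sketch // illustrative only; not code from this PR

import (
	"github.com/GoogleCloudPlatform/kubernetes/pkg/api"
	"github.com/GoogleCloudPlatform/kubernetes/pkg/api/validation"
	"github.com/golang/glog"
)

// checkPodCIDRUpdate (hypothetical) builds two nodes that differ only in
// Spec.PodCIDR and expects the update to validate cleanly.
func checkPodCIDRUpdate() {
	oldNode := &api.Node{ObjectMeta: api.ObjectMeta{Name: "node-1", ResourceVersion: "1"}}
	newNode := *oldNode
	newNode.Spec.PodCIDR = "10.244.1.0/24" // hypothetical CIDR from the allocator
	if errs := validation.ValidateNodeUpdate(oldNode, &newNode); len(errs) != 0 {
		glog.Errorf("Unexpected validation errors: %v", errs)
	}
}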
3 changes: 3 additions & 0 deletions pkg/cloudprovider/cloud.go
@@ -17,6 +17,7 @@ limitations under the License.
package cloudprovider

import (
"errors"
"net"
"strings"

@@ -86,6 +87,8 @@ type Instances interface {
Release(name string) error
}

var InstanceNotFound = errors.New("instance not found")

// Zone represents the location of a particular machine.
type Zone struct {
FailureDomain string
5 changes: 4 additions & 1 deletion pkg/cloudprovider/gce/gce.go
@@ -444,7 +444,10 @@ func (gce *GCECloud) getInstanceByName(name string) (*compute.Instance, error) {
name = canonicalizeInstanceName(name)
res, err := gce.service.Instances.Get(gce.projectID, gce.zone, name).Do()
if err != nil {
glog.Errorf("Failed to retrieve TargetInstance resource for instance:%s", name)
glog.Errorf("Failed to retrieve TargetInstance resource for instance: %s", name)
if apiErr, ok := err.(*googleapi.Error); ok && apiErr.Code == http.StatusNotFound {
return nil, cloudprovider.InstanceNotFound
}
return nil, err
}
return res, nil
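
Finally, a minimal sketch (an assumed consumer, not code from this PR) of how a caller such as the node controller might use the new cloudprovider.InstanceNotFound sentinel to tell "the VM is gone" apart from a transient API error; lookupInstance is a stand-in for whatever provider call surfaces the sentinel (for GCE, getInstanceByName above):

package sketch // illustrative only; not code from this PR

import "github.com/GoogleCloudPlatform/kubernetes/pkg/cloudprovider"

// nodeStillExists (hypothetical) interprets the error from a cloud provider
// lookup: a missing instance means the Node object can safely be deleted,
// while any other error should be retried rather than treated as a deletion.
func nodeStillExists(lookupInstance func(name string) error, name string) (bool, error) {
	err := lookupInstance(name)
	if err == cloudprovider.InstanceNotFound {
		return false, nil // the VM no longer exists
	}
	if err != nil {
		return false, err // transient failure; keep the node and retry later
	}
	return true, nil
}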