
Register the kubelet on the master node with an apiserver. #12349

Merged: 1 commit merged into kubernetes:master on Aug 6, 2015

Conversation

roberthbailey
Contributor

This option is separate from the apiserver running locally on the master node, so that it can be enabled or disabled independently as needed.

Also, fix the healthchecking configuration for the master components, which
was previously only working by coincidence:

If a kubelet doesn't register with a master, it never bothers to figure out what its local address is, in which case it ends up constructing a URL like http://:8080/healthz for the http probe. This happens to work on the master because all of the pods are using host networking and explicitly binding to 127.0.0.1. Once the kubelet is registered with the master and it determines the local node address, it tries to healthcheck on an address where the pod isn't listening, and the kubelet then periodically restarts each master component when the liveness probe fails.
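For illustration, the fix presumably amounts to pinning the probe to the loopback address the master pods actually bind to. A minimal sketch of such a manifest fragment (the port, delays, and surrounding field values are assumptions for illustration, not lifted from this PR):

"livenessProbe": {
  "httpGet": {
    "host": "127.0.0.1",
    "port": 10251,
    "path": "/healthz"
  },
  "initialDelaySeconds": 15,
  "timeoutSeconds": 15
}

With an explicit host, the probe no longer depends on whether the kubelet has figured out its node address.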

@roberthbailey
Contributor Author

/cc @dchen1107

@k8s-bot

k8s-bot commented Aug 6, 2015

GCE e2e build/test passed for commit 21628c6f4fc82746497279b5bc26837ac9e75bdd.

@roberthbailey
Contributor Author

The shippable error is

!!! [0806 18:09:17] Timed out waiting for kubelet(masterless) to answer at http://127.0.0.1:10248/healthz; tried 25 waiting 0.5 between each
!!! Error in ./hack/test-cmd.sh:48
'return 1' exited with status 1
Call stack:
1: ./hack/test-cmd.sh:48 main(...)
Exiting with status 1

which seems unrelated to this change. I've restarted shippable.

@roberthbailey
Contributor Author

Running a more complete e2e suite locally:

$ go run hack/e2e.go -v -test --test_args="--ginkgo.skip=Skipped|Restart|Etcd.*|Reboot.*packages|Reboot.*unclean|Reboot.*|Reboot.*network|Reboot.*kernel\spanic|Nodes\sNetwork.*|Shell.*services.sh|Addon\supdate"
...
Ran 97 of 136 Specs in 5137.354 seconds
SUCCESS! -- 97 Passed | 0 Failed | 2 Pending | 37 Skipped PASS

Ginkgo ran 1 suite in 1h25m43.05540199s
Test Suite Passed

found=$(cat "${MINIONS_FILE}" | sed '1d' | grep -c .) || true
ready=$(cat "${MINIONS_FILE}" | sed '1d' | awk '{print $NF}' | grep -c '^Ready') || true

-if (( ${found} == "${NUM_MINIONS}" )) && (( ${ready} == "${NUM_MINIONS}")); then
+if (( ${found} == "${EXPECTED_NUM_NODES}" )) && (( ${ready} == "${EXPECTED_NUM_NODES}")); then
Member


nit: "${found}" and "${ready}"? (pre-existing).
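i.e., presumably something along the lines of (a sketch of the suggested quoting, not the actual pushed change):

# sketch of the quoted form the nit suggests
if (( "${found}" == "${EXPECTED_NUM_NODES}" )) && (( "${ready}" == "${EXPECTED_NUM_NODES}" )); then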

Contributor Author


Fixed and pushed squashed changes.

@zmerlynn
Member

zmerlynn commented Aug 6, 2015

LGTM modulo nit

@roberthbailey
Contributor Author

Next shippable flake:

ok k8s.io/kubernetes/pkg/client/cache   1.702s  coverage: 75.9% of statements
I0806 19:06:37.638803 12059 portforward.go:260] Forwarding from localhost:12345 -> 12345
I0806 19:06:37.639627 12059 portforward.go:260] Forwarding from 127.0.0.1:12345 -> 12345
I0806 19:06:37.640164 12059 portforward.go:260] Forwarding from [::1]:12345 -> 12345
E0806 19:06:37.640593 12059 portforward.go:246] Unable to create listener: Error listen tcp4 [::1]:12345: non-IPv4 address
E0806 19:06:37.640964 12059 portforward.go:246] Unable to create listener: Error listen tcp6 127.0.0.1:12345: bind: invalid argument
E0806 19:06:37.641248 12059 portforward.go:246] Unable to create listener: Error listen tcp6: too many colons in address ::1:12345
I0806 19:06:37.649375 12059 portforward.go:260] Forwarding from 127.0.0.1:5000 -> 5000
I0806 19:06:37.649812 12059 portforward.go:260] Forwarding from [::1]:5000 -> 5000
I0806 19:06:37.656432 12059 portforward.go:260] Forwarding from 127.0.0.1:5000 -> 5000
I0806 19:06:37.657112 12059 portforward.go:260] Forwarding from [::1]:5000 -> 5000
I0806 19:06:37.657473 12059 portforward.go:260] Forwarding from 127.0.0.1:6000 -> 6000
I0806 19:06:37.672490 12059 portforward.go:260] Forwarding from [::1]:6000 -> 6000
I0806 19:06:37.674596 12059 portforward.go:293] Handling connection for 5000
--- FAIL: TestForwardPorts (0.04s)
portforward_test.go:373: 2: expected to read '1234', got ''
FAIL
coverage: 82.4% of statements
FAIL    k8s.io/kubernetes/pkg/client/portforward    1.328s

@roberthbailey
Contributor Author

Shippable is running again on the slightly modified changes. Third time is the charm?

@k8s-bot

k8s-bot commented Aug 6, 2015

GCE e2e build/test passed for commit 8df33bc.

@zmerlynn added the lgtm and ok-to-merge labels on Aug 6, 2015
"metadata": {"name":"kube-scheduler"},
"metadata": {
"name":"kube-scheduler",
"namespace": "kube-system"
Member


Are we sure we want all of the master components in the same namespace as our addons?

Contributor Author


Not sure, no. But it seemed weird to have them in the default namespace, and kube-system is where we've been putting the system components. The master components are also system components, so that seemed like the logical place to move them to.
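For context, once the manifests move, the master component pods would presumably be listed alongside the addons, e.g.:

# master component pods now appear next to the addons in kube-system
kubectl get pods --namespace=kube-system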

Member


I guess I can live with it in kube-system for now.

@dchen1107
Member

LGTM.

dchen1107 added a commit that referenced this pull request on Aug 6, 2015:
Register the kubelet on the master node with an apiserver.
@dchen1107 merged commit 2fa3004 into kubernetes:master on Aug 6, 2015
@dchen1107
Member

cc/ @vishh Now we can make sure heapster has master components' stats. :-)

@vishh
Contributor

vishh commented Aug 6, 2015

Yay! Thanks for this PR!


zmerlynn added a commit that referenced this pull request Aug 6, 2015
…-#11483-#12349-upstream-release-1.0

Automated cherry pick of #11483 #12349
@brendandburns mentioned this pull request on Aug 7, 2015
@derekwaynecarr
Member

Ok, so something in this PR broke the vagrant setup.

I am guessing it's the removal of kubernetes_auth. @justinsb - is AWS still working fine for you?

I am not sure why that had to be removed as part of this PR, but I need more time to dig into what is actually happening here and why.

@@ -31,19 +31,6 @@
- mode: 400
- makedirs: true

#
Member


Was there a problem keeping /var/lib/kubelet/kubernetes_auth in addition to /var/lib/kubelet/kubeconfig?

Contributor Author


I was trying to clean it up since I thought everyone had moved off of it ages ago.


@steelbrain

Is it possible to run pods on the kube master on AWS (not GCE)? Or is it at least planned?

@roberthbailey
Contributor Author

It's possible on any kubernetes deployment. It's just a matter of which flags you pass to the kubelet running on the master.
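For illustration, the master-side wiring amounts to giving the kubelet an apiserver address so it registers itself as a node; a rough sketch (flag names are from the kubelet of that era, and the exact values here are assumed rather than taken from this PR):

# on the master: point the kubelet at the local apiserver so it registers as a node
kubelet --api-servers=https://127.0.0.1 --config=/etc/kubernetes/manifests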

How are you creating your cluster on AWS? Are you using ./cluster/kube-up.sh or kops?

@justinsb

@steelbrain

I'm using ./cluster/kube-up.sh with some env vars that determine the size of the AWS EC2 instances spawned for both the master and the minions. Currently it has one master and two minions, but I only see the minions when I go to kube's nodes dashboard.

@roberthbailey
Contributor Author

I don't think this flag has been plumbed through the AWS startup scripts (this PR only did it for the GCE ones, since that's all I can test).

@steelbrain

Is there any help someone can provide (like manually testing your PRs) to make this feature available for AWS users? :)

@dwiyerr

dwiyerr commented May 7, 2019

What are the downsides of scheduling pods on the master?

Labels
lgtm "Looks good to me", indicates that a PR is ready to be merged.

9 participants