kubeadm: use etcd's /health endpoint for its liveness probe #81385

Merged
merged 1 commit into from Aug 16, 2019

@neolit123
Member

commented Aug 13, 2019

What this PR does / why we need it:

Etcd v3.3.0 added the --listen-metrics-urls flag, which allows specifying
additional URLs on which to serve the already present /health and /metrics
endpoints.

While /health and /metrics are also served on the URLs defined with
--listen-client-urls (v3+ ?), they require HTTPS there.

Replace the present etcdctl based liveness probe with a standard HTTP
GET v1.Probe that connects to http://127.0.0.1:2381/health.

These endpoints are not reachable from the outside and only available
for localhost connections.

Which issue(s) this PR fixes:

Fixes kubernetes/kubeadm#1708

Special notes for your reviewer:

  • Once this PR merges, we can have a better health check for control-plane components during upgrades; see kubernetes/kubeadm#1717 and #81319.
  • If this merges, does it mean that we can deprecate etcd-healthcheck.key/crt?

Does this PR introduce a user-facing change?:

kubeadm: use etcd's /health endpoint for an HTTP liveness probe on localhost instead of having a custom health check using etcdctl

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@kubernetes/sig-cluster-lifecycle-pr-reviews
/assign @fabriziopandini
/kind cleanup
/priority backlog

@neolit123

Member Author

commented Aug 13, 2019

/hold

@neolit123

Member Author

commented Aug 13, 2019

/test pull-kubernetes-e2e-kind

@neolit123

Member Author

commented Aug 13, 2019

/test pull-kubernetes-e2e-gce-device-plugin-gpu

@neolit123

Member Author

commented Aug 13, 2019

/retest

@neolit123

Member Author

commented Aug 13, 2019

/assign @mauilion
for a security stamp.

@rosti
Member

left a comment

Thanks @neolit123 !
Looks good. However, I am not sure about the security implications of exposing the /metrics endpoint at http://localhost:<etcd-metrics-port>/metrics on local etcd instances.
Possibly, we should move to HTTPS for all health checks.

cmd/kubeadm/app/util/staticpod/utils.go (4 outdated review threads, resolved)
@randomvariable

Member

commented Aug 14, 2019

Metrics and healthcheck exposed on http on localhost only is fine. I would expect people to use kube-rbac-proxy to ship metrics to Prometheus or similar. An lgtm from me.
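
To make the "localhost only" condition concrete, here is a small Go sketch that checks whether a listen URL is bound to a loopback address. It is only an illustration and not part of this PR; the function name `isLoopbackURL` is hypothetical, and the sample URLs other than http://127.0.0.1:2381 are made up for the example.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// isLoopbackURL (hypothetical helper) reports whether raw parses as a URL
// whose host is a literal loopback IP address (127.0.0.0/8 or ::1).
// Hostnames such as "localhost" are rejected because how they resolve
// depends on the node's configuration.
func isLoopbackURL(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil {
		return false
	}
	ip := net.ParseIP(u.Hostname())
	return ip != nil && ip.IsLoopback()
}

func main() {
	candidates := []string{
		"http://127.0.0.1:2381", // the address used in this PR
		"http://0.0.0.0:2381",   // all interfaces, so not loopback only
		"http://[::1]:2381",     // IPv6 loopback
	}
	for _, raw := range candidates {
		fmt.Printf("%s loopback=%t\n", raw, isLoopbackURL(raw))
	}
}
```

A check along these lines could guard against a plain HTTP metrics listener being bound to an externally reachable address by mistake.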

@fabriziopandini
Member

left a comment

Thanks @neolit123!
Only one minor comment from my side, not blocking.
/approve
/lgtm

cmd/kubeadm/app/phases/etcd/local.go (outdated review thread, resolved)

@k8s-ci-robot k8s-ci-robot added the lgtm label Aug 15, 2019

@k8s-ci-robot

Contributor

commented Aug 15, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fabriziopandini, neolit123

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@neolit123 neolit123 force-pushed the neolit123:etcd-probe branch from 65ca539 to 20c3ba2 Aug 15, 2019

@k8s-ci-robot k8s-ci-robot removed the lgtm label Aug 15, 2019

kubeadm: use etcd's /health endpoint for its liveness probe
Etcd v3.3.0 added the --listen-metrics-urls flag, which allows specifying
additional URLs on which to serve the already present /health and /metrics
endpoints.

While /health and /metrics are also served on the URLs defined with
--listen-client-urls (v3+ ?), they require HTTPS there.

Replace the present etcdctl based liveness probe with a standard HTTP
GET v1.Probe that connects to http://127.0.0.1:2381/health.

These endpoints are not reachable from the outside and only available
for localhost connections.

@neolit123 neolit123 force-pushed the neolit123:etcd-probe branch from 20c3ba2 to 99b64f1 Aug 15, 2019

@neolit123

Member Author

commented Aug 15, 2019

@rosti I have updated the PR with the changes requested by you and @fabriziopandini.

@neolit123

Member Author

commented Aug 15, 2019

/retest

@SataQiu
Member

left a comment

/lgtm

@neolit123

Member Author

commented Aug 16, 2019

/retest

@rosti

rosti approved these changes Aug 16, 2019

Member

left a comment

Thanks @neolit123 !
Let's consign the usage of etcdctl to history.
/lgtm
/hold cancel

@neolit123

Member Author

commented Aug 16, 2019

/retest

@neolit123

Member Author

commented Aug 16, 2019

Prow's build seems to be failing; it looks unrelated to this PR.
/retest

@fejta-bot

commented Aug 16, 2019

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit 9e60bed into kubernetes:master Aug 16, 2019

23 checks passed

cla/linuxfoundation neolit123 authorized
pull-kubernetes-bazel-build Job succeeded.
pull-kubernetes-bazel-test Job succeeded.
pull-kubernetes-conformance-image-test Skipped.
pull-kubernetes-cross Skipped.
pull-kubernetes-dependencies Job succeeded.
pull-kubernetes-e2e-gce Job succeeded.
pull-kubernetes-e2e-gce-100-performance Job succeeded.
pull-kubernetes-e2e-gce-csi-serial Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
pull-kubernetes-e2e-gce-iscsi Skipped.
pull-kubernetes-e2e-gce-iscsi-serial Skipped.
pull-kubernetes-e2e-gce-storage-slow Skipped.
pull-kubernetes-godeps Skipped.
pull-kubernetes-integration Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
pull-kubernetes-node-e2e-containerd Job succeeded.
pull-kubernetes-typecheck Job succeeded.
pull-kubernetes-verify Job succeeded.
pull-publishing-bot-validate Skipped.
tide In merge pool.

@k8s-ci-robot k8s-ci-robot added this to the v1.16 milestone Aug 16, 2019
