apiserver http healthcheck #84797

Open
george-angel opened this issue Nov 5, 2019 · 8 comments
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.

Comments

@george-angel
Contributor

george-angel commented Nov 5, 2019

What would you like to be added:

Apiserver endpoint serving /healthz over HTTP, no TLS.

Why is this needed:

Coming out of this issue: #43784

Because we run our own self-signed PKI, we can only use TCP healthchecks for the GCP load balancer. It would be great to be able to point the LB healthcheck at an HTTP port exposing the healthcheck.

/sig api-machinery

@george-angel george-angel added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 5, 2019
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Nov 5, 2019
@johscheuer
Contributor

johscheuer commented Nov 6, 2019

I'm not a GCP expert, but the docs at least state that the health checks don't perform any certificate validation: https://cloud.google.com/load-balancing/docs/health-check-concepts#criteria-certificates so it shouldn't matter that you are using a self-signed PKI.
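
To illustrate what a probe that skips certificate validation looks like, here is a minimal Python sketch. It is not GCP's implementation; the function names and the default timeout are assumptions made for this example. It only shows that an HTTPS `/healthz` check can succeed against a self-signed certificate when the client does not verify it.

```python
import http.client
import ssl
import urllib.request


def insecure_context() -> ssl.SSLContext:
    """Build a TLS context that does not validate the server certificate."""
    ctx = ssl.create_default_context()
    # check_hostname must be disabled before verify_mode can be relaxed.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def probe(url: str, timeout: float = 5.0) -> bool:
    """Return True only if the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(
            url, context=insecure_context(), timeout=timeout
        ) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, TLS failure, or a non-2xx status
        # (urllib raises HTTPError, a subclass of OSError).
        return False
```

A call such as `probe("https://10.0.0.1:6443/healthz")` (a hypothetical address) would report the apiserver as healthy regardless of which CA signed its certificate, provided the request itself is authorized.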

@george-angel
Contributor Author

@johscheuer it looks like a lot of that documentation has changed since I last looked at it, and I don't quite have the time to dig back into it all right now.

But we are using a "forwarding rule" - https://github.com/utilitywarehouse/tf_kube_gcp/blob/master/masters.tf#L96 since we want the load balancer to be EXTERNAL.

A forwarding rule must use "targets" - https://www.terraform.io/docs/providers/google/r/compute_forwarding_rule.html instead of "backend_service" (https://www.terraform.io/docs/providers/google/r/compute_region_backend_service.html), and only the latter would let us use a healthcheck like the one you linked.

Mind you, I don't see how we would utilize an HTTP /healthz endpoint either, so unless anyone is willing to jump in and argue my point, I will close this issue and raise a new one when I can formulate the problem properly.

@jennybuckley

/assign @cheftako

@jennybuckley

/assign @logicalhan

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 5, 2020
@micahhausler
Member

/reopen
/lifecycle frozen

I'd love to see a non-auth healthcheck port handling /healthz in kube-apiserver. I get that it's possible to have some proxy that is authenticated/authorized, but as an operator who tries to be unopinionated about what RBAC roles are in the cluster, it causes a lot of extra work to ensure the RBAC role for my healthz proxy still exists if a customer deletes it. From an operator perspective, a dedicated port separates the concerns and vastly simplifies the deployment of the kube-apiserver. In our specific case, this healthz port wouldn't be accessible to customers or the internet; it would only be available to the load balancer.

@k8s-ci-robot
Contributor

@micahhausler: Reopened this issue.


@k8s-ci-robot k8s-ci-robot reopened this Jun 2, 2020
@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 2, 2020
@lavalamp
Member

We discussed this at the SIG meeting, and we recommend that people in this situation use some or all of these techniques:
a) use the authn/authz webhooks to ensure that their health check agent is never denied
b) run a lightweight separate process to proxy healthcheck info
c) rename or explain the purpose of the public info viewer role - consumers of a cluster do need to be aware of what their responsibilities are for not breaking the cluster.

We also think it might be a good idea to rename the default "public info viewer" role to something less likely to make users think "wow I don't want that, delete".
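
As a rough illustration of option (b), the sketch below is a small standalone process that answers plain-HTTP `/healthz` requests by querying the apiserver over HTTPS. It is a sketch under stated assumptions, not something proposed in the thread: the upstream address `127.0.0.1:6443`, the listen port `8080`, and all function and class names are hypothetical. It skips certificate verification for brevity, whereas a real deployment would more likely trust the cluster CA and present credentials if anonymous access is disabled.

```python
import http.client
import ssl
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumptions for this sketch (not from the thread).
UPSTREAM = "https://127.0.0.1:6443/healthz"
LISTEN_PORT = 8080


def upstream_healthy(url: str = UPSTREAM, timeout: float = 3.0) -> bool:
    """Return True only if the upstream health endpoint answers HTTP 200."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # skipped for brevity; prefer trusting the cluster CA
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, context=ctx, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Unreachable, timed out, TLS error, or non-2xx response.
        return False


class HealthHandler(BaseHTTPRequestHandler):
    """Serve /healthz over plain HTTP, mirroring the upstream result."""

    def do_GET(self):
        if self.path != "/healthz":
            self.send_error(404)
            return
        healthy = upstream_healthy()
        body = b"ok" if healthy else b"unhealthy"
        self.send_response(200 if healthy else 503)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = LISTEN_PORT) -> None:
    """Run the proxy until interrupted."""
    HTTPServer(("", port), HealthHandler).serve_forever()
```

Calling `serve()` starts the proxy, and a load balancer HTTP health check could then target `/healthz` on that port. Because the process sits next to the apiserver, it still depends on whatever identity it uses being allowed to read `/healthz`, which is where options (a) and (c) come in.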


9 participants