Control plane bootstrapping order AKA we need a run-level concept #54522

Open
lavalamp opened this issue Oct 24, 2017 · 8 comments

Comments

@lavalamp
Member

@lavalamp lavalamp commented Oct 24, 2017

Background:

We are adding extension mechanisms to the Kubernetes control plane: initializers and admission webhooks. If, for example, the webhooks are configured but not actually running in the cluster, then the cluster is broken until an administrator can fix it. To make this situation avoidable, we're going to let a webhook be gated on a selector that matches the labels on the namespace containing the object under consideration. This should make it possible to construct a set of namespace labels that lets the namespaces hosting the critical webhooks stay operational when the webhooks aren't running. (I will add a link to the design when it is published.)
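
To make the gating idea concrete, here is a minimal sketch of how a namespace-label selector could decide whether a webhook applies to an object. It assumes selector semantics like the Kubernetes label selector operators (`In`, `NotIn`, `Exists`, `DoesNotExist`); the `runlevel` label key, the namespace names, and the function name are hypothetical and do not come from the design.

```python
from typing import Dict, List

# Hypothetical label key; the issue does not name one.
RUNLEVEL_LABEL = "runlevel"


def selector_matches(labels: Dict[str, str], expressions: List[dict]) -> bool:
    """Return True only if every expression matches the namespace labels."""
    for expr in expressions:
        key, op = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if op == "In":
            ok = key in labels and labels[key] in values
        elif op == "NotIn":
            # A namespace without the key also satisfies NotIn.
            ok = key not in labels or labels[key] not in values
        elif op == "Exists":
            ok = key in labels
        elif op == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unknown operator: {op}")
        if not ok:
            return False
    return True


# The webhook skips namespaces labelled with run level 0 or 1 (assumed values).
webhook_selector = [
    {"key": RUNLEVEL_LABEL, "operator": "NotIn", "values": ["0", "1"]}
]

# Illustrative namespaces; "webhook-host" stands for the namespace that runs the webhook.
namespaces = {
    "kube-system": {RUNLEVEL_LABEL: "0"},
    "webhook-host": {RUNLEVEL_LABEL: "1"},
    "default": {},
}

# True means objects in that namespace are sent to the webhook.
gated = {
    name: selector_matches(labels, webhook_selector)
    for name, labels in namespaces.items()
}
```

With these assumed labels, only `default` is gated on the webhook, so pods in `kube-system` and `webhook-host` can still be created while the webhook is down.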

What we need:

We're looking for documented best practices around this. We imagined building a "run level" system out of labels on namespaces. A complete solution should:

  • Cover how many run levels there are
  • Cover what components go in which run level
  • Analyze the functionality of the current controller-manager; it may need to be split into binaries or modes that are in different run levels
  • Draw some inspiration from Brian's layers doc.
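
One way to reason about the points in the list above is to write down a candidate assignment and check it mechanically. The sketch below is purely illustrative: the three run levels, the component-to-level mapping, and the dependency edges are all assumptions made up for the example, not a proposal from this issue. It flags any component that depends on something in a higher run level, which is the kind of inversion that would block startup.

```python
from typing import Dict, List, Tuple

# Hypothetical run-level assignment (lower levels must come up first).
RUN_LEVELS: Dict[str, int] = {
    "etcd": 0,
    "kube-apiserver": 0,
    "core-controllers": 0,
    "policy-webhook": 1,
    "user-workloads": 2,
}

# Hypothetical dependency edges: component -> things it needs to be running.
DEPENDS_ON: Dict[str, List[str]] = {
    "kube-apiserver": ["etcd"],
    "core-controllers": ["kube-apiserver"],
    "policy-webhook": ["kube-apiserver", "core-controllers"],
    "user-workloads": ["policy-webhook"],
}


def level_violations(
    levels: Dict[str, int], deps: Dict[str, List[str]]
) -> List[Tuple[str, str]]:
    """List (component, dependency) pairs where the dependency is in a higher run level."""
    return sorted(
        (comp, dep)
        for comp, needed in deps.items()
        for dep in needed
        if levels[dep] > levels[comp]
    )


# The assignment above has no inversions.
ok = level_violations(RUN_LEVELS, DEPENDS_ON)

# If a level-0 controller bundle ends up needing the level-1 webhook,
# the check reports the inversion.
bundled = dict(DEPENDS_ON)
bundled["core-controllers"] = ["kube-apiserver", "policy-webhook"]
broken = level_violations(RUN_LEVELS, bundled)
```

A check like this does not answer how many run levels there should be; it only makes a given proposal testable.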

We think cluster lifecycle SIG is probably the best place for this to be worked out.

(This is from a meeting between myself, @cheftako, @deads2k, @smarterclayton, @liggitt, @caesarxuchao, and @jagosan.)

@smarterclayton

Contributor

@smarterclayton smarterclayton commented Oct 25, 2017

@tengqm

Contributor

@tengqm tengqm commented Oct 25, 2017

/cc

@lavalamp

Member Author

@lavalamp lavalamp commented Nov 7, 2017

Because there is no run-level concept right now and controllers are bundled together in kube-controller-manager, a run-level 1+ controller can prevent the cluster from starting: #55022 (comment)

@bgrant0607

Member

@bgrant0607 bgrant0607 commented Nov 7, 2017

See also #16337

@bgrant0607

Member

@bgrant0607 bgrant0607 commented Nov 7, 2017

@fejta-bot


@fejta-bot fejta-bot commented Feb 7, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@sttts

Contributor

@sttts sttts commented Feb 8, 2018

/remove-lifecycle stale

@tallclair

Member

@tallclair tallclair commented Apr 6, 2018

Another scenario affected by this: configure a ValidatingAdmissionWebhook that checks policy on pod creation and fails closed. The webhook's node goes down, but the cluster cannot bring the webhook back up, since its replacement pod fails the admission check.

I feel like a warning listing the various ways to break the cluster (and how to avoid them) should, at a minimum, be laid out on https://kubernetes.io/docs/admin/extensible-admission-controllers/
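
The fail-closed scenario described above can be modelled in a few lines. This is a toy sketch, not apiserver behaviour: the `admit_pod` function, the `runlevel` label key, and the exemption set are hypothetical, and the webhook is assumed to approve every pod whenever it is reachable.

```python
from typing import Dict, FrozenSet


def admit_pod(
    ns_labels: Dict[str, str],
    webhook_up: bool,
    fail_closed: bool,
    exempt_levels: FrozenSet[str] = frozenset(),
) -> bool:
    """Decide whether a pod creation request is admitted in this toy model."""
    # Namespaces whose (hypothetical) runlevel label is exempt never call the webhook.
    if ns_labels.get("runlevel") in exempt_levels:
        return True
    # Assumption: a reachable webhook approves the pod.
    if webhook_up:
        return True
    # Webhook unreachable: fail-closed rejects, fail-open admits.
    return not fail_closed


# Labels of the namespace that hosts the webhook itself (assumed).
webhook_ns = {"runlevel": "1"}

# Webhook is down and nothing is exempt: its own replacement pod is rejected.
deadlocked = not admit_pod(webhook_ns, webhook_up=False, fail_closed=True)

# Same outage, but the webhook's namespace is exempt: the pod is admitted.
recovers = admit_pod(
    webhook_ns,
    webhook_up=False,
    fail_closed=True,
    exempt_levels=frozenset({"1"}),
)
```

In this model the deadlock disappears either by exempting the webhook's own namespace or by failing open, at the cost of unenforced policy during the outage.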

@liggitt liggitt added this to Triage in Admission Webhooks Jul 8, 2019
@liggitt liggitt moved this from Triage to Not required for GA in Admission Webhooks Aug 28, 2019
@liggitt liggitt moved this from Bugs to Proposed enhancements in Admission Webhooks Aug 30, 2019