
KEP: Node Fencing Specification #2763

Closed
beekhof wants to merge 2 commits

Conversation

@beekhof commented Oct 8, 2018

Submitting early to solicit feedback

k8s-ci-robot added the `size/L` label Oct 8, 2018

@k8s-ci-robot (Contributor)

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

k8s-ci-robot added the `cncf-cla: no` label Oct 8, 2018

@k8s-ci-robot (Contributor)

Hi @beekhof. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.


k8s-ci-robot added the `needs-ok-to-test` label Oct 8, 2018

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: jdumars

If they are not already assigned, you can assign the PR to them by writing /assign @jdumars in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the `kind/kep`, `sig/architecture`, and `sig/node` labels Oct 8, 2018

@beekhof (Author) commented Oct 8, 2018

CLA signed

k8s-ci-robot added the `cncf-cla: yes` label and removed the `cncf-cla: no` label Oct 8, 2018

@NickrenREN (Contributor)

/cc

@NickrenREN (Contributor)

/cc @orainxiong

@k8s-ci-robot (Contributor)

@NickrenREN: GitHub didn't allow me to request PR reviews from the following users: orainxiong.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @orainxiong


@beekhof (Author) commented Oct 9, 2018

As suggested by the bot...

/assign @jdumars

thockin changed the title from "Node Fencing Specification" to "KEP: Node Fencing Specification" Oct 18, 2018
@quinton-hoole (Contributor)

I agree, fencing is a fundamental and useful building block. Do we cover pod fencing here, or only node fencing? Both seem useful. Superficially this proposal looks sane (although I've not reviewed it in detail): specifically, the ability to request that a node/pod be fenced, and confirmation that the node/pod has been successfully fenced.

A useful addition would be some intermediate state where the node/pod has not actually been fenced yet, but is guaranteed not to come alive again (for example, the machine is currently down and might be rebooted, but promises not to do anything until it has checked whether it was asked to fence while down). Admittedly that's hard to do in practice, but it might be possible. My 2c.


Three additional NodeConditions and taints:

* a `node.alpha.kubernetes.io/fencingtriaged` taint corresponding to NodeCondition `FencingTriaged` being
Member:

In general we do not want to put the alpha/beta in the name any more. It causes too much user pain for too little value.

Author:

ack

Admins may also set NodeCondition `FencingRequired` to `True` to manually
trigger fencing.

Once fencing has been initiated, the NodeCondition `FencingComplete` should
Member:

"should" is a bad word.

Who will set this?

Author:

I would expect this to be handled by the fencing implementation.
I will replace s/should/must/
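The progression discussed above can be sketched in plain Python. This is a hypothetical illustration, not real Kubernetes API code: the condition names `FencingRequired` and `FencingComplete` come from the KEP text, while the dict structure and function names are invented for the example.

```python
# Hypothetical sketch of the proposed fencing condition progression.
# Nodes are modelled as plain dicts rather than real Node API objects.

def request_fencing(node):
    """An admin or health monitor marks the node as needing fencing."""
    node["conditions"]["FencingRequired"] = "True"

def complete_fencing(node):
    """The fencing implementation records success. Per the discussion
    above, this must only happen after the node is actually safe."""
    if node["conditions"].get("FencingRequired") != "True":
        raise ValueError("fencing was never requested for this node")
    node["conditions"]["FencingComplete"] = "True"

node = {"name": "worker-1", "conditions": {}}
request_fencing(node)
complete_fencing(node)
print(node["conditions"])
```

The ordering check mirrors the review point that `FencingComplete` is owned by the fencing implementation and only meaningful once fencing was requested.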


## Drawbacks

* In deleting the Node object, we are also removing the history of events that
Member:

Most of this doc does not talk about delete ... ?

Author:

Right, this paragraph is missing some much needed context.

The rest of the document covers how implementations should identify nodes in need of fencing and indicate progress towards the node being made safe.

However once the node is safe, something must use that information to convey to the rest of the system that affected workloads are now safe to run elsewhere.

At one extreme, the fencing logic could delete the Node objects.
At the other, every controller could monitor for the new NodeConditions and react appropriately.

I do not yet have a good sense of which part of the spectrum would be preferred.
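The second end of that spectrum could be sketched as follows. This is an illustration only, with no real Kubernetes client calls: it assumes a controller that treats a node's workloads as safe to reschedule only once the proposed `FencingComplete` condition is `True`; the data shapes and function name are invented.

```python
# Illustrative sketch of a controller reacting to the proposed
# FencingComplete NodeCondition (not real Kubernetes API usage).

def pods_safe_to_reschedule(nodes, pods):
    """Return pods whose node has been confirmed fenced.

    `nodes` maps node name -> condition dict;
    `pods` maps pod name -> the node it runs on.
    """
    fenced = {name for name, conds in nodes.items()
              if conds.get("FencingComplete") == "True"}
    return sorted(p for p, node in pods.items() if node in fenced)

nodes = {
    "worker-1": {"FencingRequired": "True", "FencingComplete": "True"},
    "worker-2": {"FencingRequired": "True"},  # fencing still in progress
}
pods = {"db-0": "worker-1", "web-0": "worker-2", "web-1": "worker-1"}
print(pods_safe_to_reschedule(nodes, pods))  # ['db-0', 'web-1']
```

Note that `web-0` stays put: its node has only requested fencing, so the workload is not yet known to be safe to run elsewhere.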


Three additional NodeConditions and taints:

* a `node.alpha.kubernetes.io/fencingtriaged` taint corresponding to NodeCondition `FencingTriaged` being
Member:

What does this indicate? What does "triaged" mean in this context? If a node has been triaged for fencing, this taint will prevent new work from landing here, even if fencing is not required -- why?

Author:

The intention is that this would be an intermediate state indicating that the node is unhealthy, but that policy dictates it should not be fenced (at least not yet).

Whether this state should inhibit new work is a good question. I can see arguments both ways.


## Proposal

Three additional NodeConditions and taints:
Member:

You need to talk about the flavor of taint - what is the effect?

It might be good to introduce a real-world example and use that to motivate the design points.

Author:

Will do
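The "flavor of taint" question above can be made concrete with a toy model. `NoSchedule` and `NoExecute` are the real Kubernetes taint effects, and the taint key is the one proposed in this KEP; everything else here (the helper functions, the simplified filtering that ignores tolerations) is hypothetical.

```python
# Toy model of Kubernetes taint effects applied to the proposed fencing
# taint. Simplified: real scheduling also evaluates pod tolerations.

FENCING_TAINT = "node.alpha.kubernetes.io/fencingtriaged"

def schedulable(node_taints):
    """A node accepts new pods only if no taint repels scheduling."""
    return not any(t["effect"] in ("NoSchedule", "NoExecute")
                   for t in node_taints)

def must_evict(node_taints):
    """Only NoExecute evicts pods already running on the node."""
    return any(t["effect"] == "NoExecute" for t in node_taints)

# A triaged node with effect NoSchedule: no new work lands there,
# but existing pods are left alone while policy is decided.
triaged = [{"key": FENCING_TAINT, "effect": "NoSchedule"}]
print(schedulable(triaged), must_evict(triaged))  # False False
```

Choosing `NoSchedule` for the triaged state would match the reply above that it is debatable whether triage should inhibit new work, whereas `NoExecute` would also evict running pods before fencing has confirmed the node is safe.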

@k8s-ci-robot (Contributor)

@beekhof: PR needs rebase.


k8s-ci-robot added the `needs-rebase` label Oct 20, 2018

@beekhof (Author) commented Oct 31, 2018

Do we cover pod fencing here, or only node fencing? Both seem useful.

We do not consider pod fencing in the absence of a node level failure.
With a functional kubelet to talk to, would pod fencing not essentially map to a forced pod deletion?

Fencing at both levels is useful, but the focus here is on node fencing to recover from node-level failures. Node fencing does, though, imply pod fencing for anything unfortunate enough to be located there.

A useful addition would be some intermediate state where the node/pod has not actually been fenced yet, but is guaranteed not to come alive again (for example, the machine is currently down and might be rebooted, but promises not to do anything until it has checked whether it was asked to fence while down).

Fencing is the act of putting the machine into a safe state. Without the node having been fenced, you can't really know that it was actually down, which this kind of please-be-good-after-reboot logic depends on. It could be disconnected from the network, or the kubelet could be hung; both would look identical to "currently down" from the master's perspective.

What we see in traditional HA solutions is:
a. flags/delays that allow triage to happen prior to fencing
b. core services like the kubelet not being enabled at boot (which allows fencing to happen and the machine to come back, but also gives someone time to investigate before returning the node to the cluster)

Our implementation already plans for (a), but perhaps the switch for that mechanism deserves to be part of this specification rather than being an implementation detail.

@justaugustus (Member)

REMINDER: KEPs are moving to k/enhancements on November 30. Please attempt to merge this KEP before then to signal consensus.
For more details on this change, review this thread.

Any questions regarding this move should be directed to that thread and not asked on GitHub.

@beekhof (Author) commented Nov 22, 2018

@derekwaynecarr Should we try to merge this KEP prior to the Nov 30 deadline?

If so, what would you like to see added/changed?

@justaugustus (Member)

KEPs have moved to k/enhancements.
This PR will be closed and any additional changes to this KEP should be submitted to k/enhancements.
For more details on this change, review this thread.

Any questions regarding this move should be directed to that thread and not asked on GitHub.
/close

@k8s-ci-robot (Contributor)

@justaugustus: Closed this PR.

In response to this:

KEPs have moved to k/enhancements.
This PR will be closed and any additional changes to this KEP should be submitted to k/enhancements.
For more details on this change, review this thread.

Any questions regarding this move should be directed to that thread and not asked on GitHub.
/close

