support pod bootstrap "checkpointing" in the kubelet #378
Comments
@roberthbailey this seems to have missed the v1.8 target. Should we move it to next-milestone?
Yes.
Proposal is here -> https://github.com/kubernetes/community/pull/1241/files
@timothysc still alpha for 1.9?
Yes.
Automatic merge from submit-queue (batch tested with PRs 55812, 55752, 55447, 55848, 50984). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Initial basic bootstrap-checkpoint support

**What this PR does / why we need it**: Adds initial support for Pod checkpointing to allow for controlled recovery of the control plane during self-host failure conditions.

fixes #49236
xref kubernetes/enhancements#378

**Special notes for your reviewer**: Proposal is here: https://docs.google.com/document/d/1hhrCa_nv0Sg4O_zJYOnelE8a5ClieyewEsQM6c7-5-o/edit?ts=5988fba8#

1. Controlled tests work, but I have not tested the self-hosted api-server recovery, which requires validation and logs. /cc @luxas
2. In adding hooks for the checkpoint manager, I found that many of the tests around basicpodmanager appear to be stubbed. This has become an anti-pattern in the code and should be avoided.
3. I need a node-e2e to ensure consistency of behavior.

**Release note**:
```
Add basic bootstrap checkpointing support to the kubelet for control plane recovery
```

/cc @kubernetes/sig-cluster-lifecycle-misc @kubernetes/sig-node-pr-reviews
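For readers new to the idea, the general shape of pod checkpointing can be sketched as follows. This is only an illustration of the concept under stated assumptions, not the kubelet's implementation: the annotation key, the file layout, and every function name here are hypothetical and chosen for the example.

```python
import json
import os
import tempfile
from pathlib import Path

# Hypothetical annotation key, used only in this sketch to opt a pod in.
CHECKPOINT_ANNOTATION = "example.com/bootstrap-checkpoint"


def should_checkpoint(pod):
    """Return True if the pod has opted in to checkpointing."""
    annotations = pod.get("metadata", {}).get("annotations", {})
    return annotations.get(CHECKPOINT_ANNOTATION) == "true"


def checkpoint_path(directory, pod):
    """Build a per-pod file name from namespace and name (assumed layout)."""
    meta = pod["metadata"]
    return Path(directory) / f"{meta['namespace']}_{meta['name']}.json"


def write_checkpoint(directory, pod):
    """Persist the pod atomically so a crash never leaves a partial file."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=str(directory), suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(pod, handle)
    # os.replace is an atomic rename when source and target share a filesystem.
    os.replace(tmp_name, str(checkpoint_path(directory, pod)))


def delete_checkpoint(directory, pod):
    """Remove the checkpoint once the pod is deleted; ignore a missing file."""
    try:
        checkpoint_path(directory, pod).unlink()
    except FileNotFoundError:
        pass


def load_checkpoints(directory):
    """Read every saved pod back, e.g. on restart while the API server is down."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    pods = []
    for path in sorted(directory.glob("*.json")):
        with path.open() as handle:
            pods.append(json.load(handle))
    return pods
```

The point of the sketch is the recovery path: if the node restarts while the self-hosted API server is unavailable, the saved pod definitions are enough to start the control plane pods again from local disk.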
@calebamiles 👋 Please indicate in the 1.9 feature tracking board
@timothysc will open a PR with documentation for this soon.
I think until we are consuming this in the test suite as part of the self-hosting feature, it is premature to document its usage. We need to get testing cycles under its belt and enable the broader feature that the code was written to enable.
Ok, let's discuss whether we need docs or not tomorrow in the SIG call.
Per SIG discussion this morning, we are not planning to document the feature until the primary use case, self-hosting, has been enabled by default. We did not have enough test cycles to enable it for 1.9 but plan to enable it in 1.10. I'm not certain who is managing feature documentation for 1.10, but this is the official plan from @kubernetes/sig-cluster-lifecycle-feature-requests.
@timothysc Thanks for the update!
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@timothysc If so, can you please ensure the feature is up-to-date with the appropriate:
cc @idvoretskyi
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@timothysc there are no references to "checkpoint" in https://kubernetes.io/docs/imported/release/notes/#v1-10-0; any pointers, please? :)
Greetings. This issue was closed after being stale for some time; I was curious whether it is still planned for a future release.
Feature Description