Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introducing Sleep Action for PreStop Hook #3960

Open
8 of 12 tasks
AxeZhan opened this issue Apr 22, 2023 · 64 comments
Open
8 of 12 tasks

Introducing Sleep Action for PreStop Hook #3960

AxeZhan opened this issue Apr 22, 2023 · 64 comments
Assignees
Labels
sig/node Categorizes an issue or PR as relevant to SIG Node. stage/stable Denotes an issue tracking an enhancement targeted for Stable/GA status
Milestone

Comments

@AxeZhan
Copy link
Member

AxeZhan commented Apr 22, 2023

Enhancement Description

Please keep this description up to date. This will help the Enhancement Team to track the evolution of the enhancement efficiently.

@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Apr 22, 2023
@AxeZhan
Copy link
Member Author

AxeZhan commented Apr 22, 2023

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 22, 2023
@AxeZhan
Copy link
Member Author

AxeZhan commented Apr 22, 2023

/cc @thockin

@SergeyKanzhelev
Copy link
Member

/milestone v1.28

adding to the milestone for tracking

@Atharva-Shinde
Copy link
Contributor

Hello @AxeZhan 👋, Enhancements team here.

Just checking in as we approach Enhancements freeze on Thursday, 16th June 2023.

This enhancement is targeting for stage alpha for 1.28 (correct me, if otherwise)

Here's where this enhancement currently stands:

  • KEP readme using the latest template has been merged into the k/enhancements repo.
  • KEP status is marked as implementable for latest-milestone: 1.28
  • KEP readme has a updated detailed test plan section filled out
  • KEP readme has up to date graduation criteria
  • KEP has a production readiness review that has been completed and merged into k/enhancements.

For this KEP, we would need to update the following:

  • Add approver/s to prod-readiness yaml file
  • Change the status of the KEP to implementable
  • Get approval from the PRR authors

The status of this enhancement is marked as at risk. Please keep the issue description up-to-date with appropriate stages as well. Thank you!

@Atharva-Shinde
Copy link
Contributor

/stage alpha

@k8s-ci-robot k8s-ci-robot added the stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status label Jun 5, 2023
@Atharva-Shinde
Copy link
Contributor

Hey @AxeZhan,
We would just need to update the status inside kep.yaml to implementable instead of provisional as mentioned in my comment above. Please make the change within 24 hours else you would have to file an exception request.
Thanks :)

@SergeyKanzhelev
Copy link
Member

this should be updated now. Since there were no material change, can we get it tracked for 1.28?

@Atharva-Shinde
Copy link
Contributor

With all the KEP requirements in place and merged into k/enhancements, the status of this enhancement is now marked as tracked. Please keep the issue description up-to-date with appropriate stages as well. Thank you :)

@AdminTurnedDevOps
Copy link

Hey @AxeZhan

1.28 Docs Shadow here.

Does this enhancement work planned for 1.28 require any new docs or modification to existing docs?

If so, please follows the steps here to open a PR against dev-1.28 branch in the k/website repo. This PR can be just a placeholder at this time and must be created before Thursday 20th July 2023.

Also, take a look at Documenting for a release to get yourself familiarize with the docs requirement for the release.

Thank you!

@charles-chenzz
Copy link
Member

hi @AdminTurnedDevOps , we currently working on the implement in k/k repos, when we finish it, we will start to add docs. Thanks.

@AxeZhan
Copy link
Member Author

AxeZhan commented Jun 22, 2023

Hey @AdminTurnedDevOps
Thank you for your remind!

Does this enhancement work planned for 1.28 require any new docs or modification to existing docs?

This definitely needs some new docs :)
I've created a draft pr for place holder, and have updated the link in the Description above, we'll get back to finish the detail after the code changes are done.
We are not very familiar with updating the documentation, so we may need some assistance from your team when the time comes. Thank you in advance!

@AxeZhan
Copy link
Member Author

AxeZhan commented Jul 6, 2023

Hi all, I've created kubernetes/kubernetes#119026 for some discussion(because I'm not certain of some implementations/e2e tests), I'd be super glad if some of you can have a look and give me a help😊。

@Atharva-Shinde
Copy link
Contributor

Hey again @AxeZhan 👋

Just checking in as we approach Code freeze at 01:00 UTC Friday, 19th July 2023 .

Here’s the enhancement’s state for the upcoming code freeze:

  • All the PRs that are related to your enhancement are linked in the above issue description (for tracking purposes). This includes code, tests, and documentation related PR/s.
  • All code related PR/s are merged or are in merge-ready state ( i.e they have approved and lgtm labels applied) by the code freeze deadline. This includes any tests related PR/s too.

For this enhancement, it looks like the following code related PR/s are open and they need to be merged or should be in merge-ready state before the code freeze commences :

This is the code related PR/s that I found on this KEP issue

Please keep the issue description up-to-date with all the PR/s that are associated with this KEP and let me know if there are other PR/s in k/k we should be tracking for this KEP.

As always, we are here to help if any questions come up. Thanks!

@Atharva-Shinde
Copy link
Contributor

Hello @AxeZhan 👋, 1.28 Enhancements Lead here.

Unfortunately, the implementation (code related) PR associated with this enhancement is not in the merge-ready state by code-freeze and hence this enhancement is now removed from the v1.28 milestone.

If you still wish to progress this enhancement in v1.28, please file an exception request. Thanks!

/milestone clear

@k8s-ci-robot k8s-ci-robot removed this from the v1.28 milestone Jul 19, 2023
@pacoxu
Copy link
Member

pacoxu commented May 24, 2024

/remove-stage alpha
/stage ga
The stage label is for target status.

@k8s-ci-robot k8s-ci-robot removed the stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status label May 24, 2024
@pacoxu
Copy link
Member

pacoxu commented May 24, 2024

/stage stable

@k8s-ci-robot k8s-ci-robot added the stage/stable Denotes an issue tracking an enhancement targeted for Stable/GA status label May 24, 2024
@AlexanderYastrebov
Copy link

AlexanderYastrebov commented Jul 4, 2024

Hello.

In our usecase we add preStop sleep hook via admission webhook by default if preStop hook is not specified by the cluster user.
To opt-out from this we ask users to add a no-op preStop hook:

lifecycle:
  preStop:
    exec:
      command:
        - "true"

To remove dependency on true binary we tried:

lifecycle:
  preStop:
    sleep: 0

and discovered it does not pass validation:

Invalid value: 0: must be greater than 0 and less than terminationGracePeriodSeconds (30)

Is it possible to allow zero seconds value to enable no-op preStop hook?
Is there any other method maybe to define no-op preStop hook that does not require executing the binary?

@aojea
Copy link
Member

aojea commented Jul 4, 2024

can't you use an annotation?

@AlexanderYastrebov
Copy link

can't you use an annotation?

This is possible but my inquiry is more about allowing zero value.

E.g. go time.Sleep permits zero sleep duration:

A negative or zero duration causes Sleep to return immediately.

@thockin
Copy link
Member

thockin commented Jul 4, 2024 via email

@aojea
Copy link
Member

aojea commented Jul 4, 2024

We could, I guess. It's not semantically wrong, I guess.

Prestop is already a pointer

https://github.com/kubernetes/kubernetes/blob/c87c06d7ffb932395f963bc78f03f9d2f0e57698/pkg/apis/core/types.go#L2598-L2614

agree is semantically correct, but the motivation is the one that I'm a bit confused

In our usecase we add preStop sleep hook via admission webhook by default if preStop hook is not specified by the cluster user.
To opt-out from this we ask users to add a no-op preStop hook:

the logic implemented seems to be "by default I will add a hook, if you want to opt-out , just add one that is a noop"
It seems simpler/easier to indicate "use annotation i.do.not.want.you.to.default.my.prestop/hook if you want to opt-out of the default hook"

@thockin
Copy link
Member

thockin commented Jul 4, 2024

If the logic was:

if lifecycle.preStop is not set {
    set preStop to sleep(10)
}

Then any user can set preStop, whether to sleep(0) or sleep(3) or exec or whatever. I agree it's a little odd to use this as a trigger, but it seems consistent to support it. It' s sort of morally equivalent to a lifecycle.preStop.doNothing, but less ugly :)

@aojea
Copy link
Member

aojea commented Jul 5, 2024

who wants to create the KEP 😄 ???

This change will need to be feature gated, it is a small KEP, but it will require some work

@AlexanderYastrebov
Copy link

AlexanderYastrebov commented Jul 5, 2024

Does it have to be a KEP and a feature gate at all?

It seems that zero value was already a part of the original KEP, e.g. here I read:

  • Test that the validation returns an error when given an invalid duration value (e.g., a negative value).
  • Test that the runSleepHandler function returns immediately when given a duration of zero.

@thockin
Copy link
Member

thockin commented Jul 5, 2024

It seems that zero value was already a part of the original KEP

oops! It looks like the implementation demands >0 when it should have been >=0.

Does it have to be a KEP and a feature gate at all?

Unfortunately yes. In particular, we need to handle this case:

  • Suppose this change goes into 1.31
  • Admin rolls out 1.31
  • User uses this feature and sets it to 0
  • Admin rolls back to 1.30
  • The pods/deployments/etc which have this value as 0 will fail to validate

Demanding 1 release with an alpha gate means you CAN roll back and it will be tolerated.

@sreeram-venkitesh
Copy link
Member

@aojea @thockin I'd be interested in driving this if @AlexanderYastrebov isn't picking it up already!

@AlexanderYastrebov
Copy link

AlexanderYastrebov commented Jul 8, 2024

We went ahead with a sleep 1 as the next best thing for now:

lifecycle:
  preStop:
    sleep:
      seconds: 1 # not possible to set to 0: https://github.com/kubernetes/enhancements/issues/3960#issuecomment-2208556397

@aojea @thockin I'd be interested in driving this if @AlexanderYastrebov isn't picking it up already!

I did not pick it up but curious to learn the amount of effort required to fix such a minor bug in KEP implementation.

@szuecs
Copy link
Member

szuecs commented Jul 8, 2024

@sreeram-venkitesh can you just add @AlexanderYastrebov to whatever you propose as changes to fix it (KEP and PR)?

As far as I understand @AlexanderYastrebov could learn this way and @AlexanderYastrebov can review the change proposals. Please correct me if I am wrong. :)

@sreeram-venkitesh
Copy link
Member

@szuecs Looks like we'd need to roll this out as a different KEP as mentioned by @thockin in #3960 (comment) so that we can roll back the change if needed.

@szuecs
Copy link
Member

szuecs commented Jul 8, 2024

@sreeram-venkitesh yes!
I just refer to #3960 (comment) and I think my colleague @AlexanderYastrebov (hey welcome my colleague!) was trying to say that you can go for it to create KEP and PR.
Just say if you need our help.

@aojea
Copy link
Member

aojea commented Jul 8, 2024

Example of a KEP to relax validation https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/4427-relaxed-dns-search-validation

@sreeram-venkitesh
Copy link
Member

sreeram-venkitesh commented Jul 8, 2024

@szuecs Ah sorry! I thought you were talking about adding the change in the same KEP! Thanks for clarifying! I will ping you both once I have something ready.

@aojea Since we're past the enhancements freeze, this new offshoot KEP can target for alpha in v1.32 right? I can put something together soon and keep the KEP ready before the next release cycle starts. What do you think?

@szuecs
Copy link
Member

szuecs commented Jul 8, 2024

@sreeram-venkitesh we don't need to rush this, v1.32 looks fine.

@thockin
Copy link
Member

thockin commented Jul 8, 2024 via email

@sreeram-venkitesh
Copy link
Member

@thockin Since this KEP would be graduating from beta to GA in 1.32, would adding another gate in the same KEP require another round of beta then? Else the new feature gate would be going straight to GA right?

@thockin
Copy link
Member

thockin commented Jul 8, 2024

In theory, the one gate could go GA while the other did not. It's not COMMON, and this one is small, so I guess a new KEP is easy enough.

@kannon92
Copy link
Contributor

kannon92 commented Aug 29, 2024

For the new KEP, let's get the issue created so we can discuss separate from this one.

Looking at this KEP, it sounds like someone wants to promote this to stable in 1.32.

I will move this to considered for release for sig-node.

@sreeram-venkitesh
Copy link
Member

@kannon92 I'll tag in the issue for the new KEP once I open it. I've been drafting the KEP doc and was planning to open the issue soon.

@thockin
Copy link
Member

thockin commented Aug 31, 2024

Why not GA this? I mean, it's useful as is, the "allow 0" KEP can follow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/node Categorizes an issue or PR as relevant to SIG Node. stage/stable Denotes an issue tracking an enhancement targeted for Stable/GA status
Projects
Status: Removed from Milestone
Status: Tracked for Code Freeze
Status: Tracked for Doc Freeze
Status: Proposed for consideration
Development

No branches or pull requests