Retriable and non-retriable Pod failures for Jobs #3329
Comments
/sig apps |
/assign |
/assign |
/sig scheduling |
Hello @alculquicondor 👋, 1.25 Enhancements team here. Just checking in as we approach enhancements freeze at 18:00 PT on Thursday, June 23, 2022, which is just over 2 days from now. For note, this enhancement is targeting for stage Here's where this enhancement currently stands:
The open PR #3374 is addressing all the listed criteria above. We would just require getting it merged by the Enhancements Freeze. For note, the status of this enhancement is marked as |
With KEP PR #3374 merged, the enhancement is ready for the 1.25 Enhancements Freeze. For note, the status is now marked as |
Hello @alculquicondor 👋, 1.25 Release Docs Lead here. Please follow the steps detailed in the documentation to open a PR against |
@Atharva-Shinde @alculquicondor there is one more PR that should be included before the code freeze: kubernetes/kubernetes#111475 |
Hello, @alculquicondor and @mimowo! Just a friendly reminder: if you need to open a Feature Blog placeholder PR against the website repository, please go ahead and do so. If you have any questions or need any help, feel free to reach out! For more information about writing a blog, check out the blog contribution guidelines. Remember, the deadline for this is coming up soon: Wednesday, July 03, 2024, at 18:00 PDT. Thank you!! |
Hey again @mimowo 👋, Enhancements team here. Just checking in as we approach code freeze at 02:00 UTC Wednesday 24th July 2024 / 19:00 PDT Tuesday 23rd July 2024. Here's where this enhancement currently stands:
For this enhancement, it looks like the following PRs are open and need to be merged before code freeze:
If you anticipate missing code freeze, you can file an exception request in advance. Also, please let me know if there are other PRs in k/k we should be tracking for this KEP. |
With all PRs to k/k merged, this KEP is now |
@mimowo Looks like kubernetes/kubernetes#126169 has tests related to this KEP. Please make sure to get it merged before the test freeze deadline (01:00 UTC Wednesday 31st July 2024 / 19:00 PDT Tuesday 30th July 2024). |
@sreeram-venkitesh thanks for reaching out. I have updated the PR and will try to merge it, but on the other hand I don't think it is required for this release cycle, because we will not promote these tests in this cycle anyway (per kubernetes/kubernetes#125482 (comment)). We have already promoted 2 tests which didn't have the flakiness issue: kubernetes/kubernetes#125482 |
@edithturn I was busy preparing the code updates for the release cycle and missed this message, but I have already opened the placeholder PR for the blog-post, see #3329 (comment), and I'm working on the content. Can we still include it? |
@mimowo @alculquicondor Can we close this issue since the feature is stable? |
There are some post-GA tasks to complete: https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3329-retriable-and-non-retriable-failures#deprecation |
@alculquicondor in preparation for the next release, could you give me an ETA for when you expect the post GA tasks to complete? Do we need to track this KEP in the next cycle? |
The only pending work is to remove the feature gate in 1.33. I think we can close the issue: other issues for GA features that are still pending feature-gate removal are already closed, for example AdmissionWebhookMatchConditions, AggregatedDiscoveryEndpoint, and APIListChunking. |
It might be confusing that the Deprecation section still lists the "Modify the code to ignore the PodDisruptionConditions and JobPodFailurePolicy feature gates" task, even though it was already done. I have created a KEP update to reflect the actual state: #4835. @alculquicondor please add these two PRs to the implementation list in the issue: kubernetes/kubernetes#125994 and kubernetes/kubernetes#126102 |
Added |
Closing the issue seems ok, given the precedent of other KEPs that already graduated. |
/close |
@mimowo: Closing this issue. In response to this:
Enhancement Description
One-line enhancement description (can be used as a release note): An API to influence retries based on exit codes and/or pod deletion reasons.
Kubernetes Enhancement Proposal: https://git.k8s.io/enhancements/keps/sig-apps/3329-retriable-and-non-retriable-failures
Discussion Link: RFE: ability to define special exit code to terminate existing job kubernetes#17244
Primary contact (assignee): @alculquicondor
Responsible SIGs: apps, api-machinery, scheduling
Enhancement target (which target equals to which milestone):
Alpha
- KEP (k/enhancements) update PR(s):
- Code (k/k) update PR(s):
- Docs (k/website) update PR(s): Add docs for KEP-3329 Retriable and non-retriable Pod failures for Jobs website#35219
Beta
- KEP (k/enhancements) update PR(s):
- Code (k/k) update PR(s):
- Docs (k/website) update(s):
Stable
- KEP (k/enhancements) update PR(s): Graduate Job Pod Failure Policy to stable #4661
- Code (k/k) update PR(s):
- Docs (k/website) update(s):
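To illustrate the one-line description ("an API to influence retries based on exit codes and/or pod deletion reasons"), here is a minimal Python sketch of how an ordered list of rules could map a pod failure to an action. It is a simplified model for discussion, not the Kubernetes implementation: the `Rule`, `PodFailure`, and `decide` names are hypothetical, exit code 42 is an arbitrary example value, and the assumption that the first matching rule wins (with unmatched failures counted as ordinary retries) should be checked against the KEP itself.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Rule:
    """Hypothetical, simplified stand-in for one pod failure policy rule."""
    action: str                            # "FailJob", "Ignore" or "Count"
    exit_codes: Optional[Set[int]] = None  # matches when the exit code is in this set
    pod_condition: Optional[str] = None    # matches when the pod carries this condition


@dataclass
class PodFailure:
    """Hypothetical summary of a failed pod."""
    exit_code: Optional[int] = None
    conditions: Set[str] = field(default_factory=set)


def decide(rules, failure):
    """Return the action of the first rule that matches the failure.

    Assumption: rules are evaluated in order, and a failure that matches
    no rule is counted as a regular (retriable) failure.
    """
    for rule in rules:
        if rule.exit_codes is not None and failure.exit_code in rule.exit_codes:
            return rule.action
        if rule.pod_condition is not None and rule.pod_condition in failure.conditions:
            return rule.action
    return "Count"


# Example policy: exit code 42 is treated as non-retriable, while pods
# removed by a disruption are ignored rather than counted as failures.
rules = [
    Rule(action="FailJob", exit_codes={42}),
    Rule(action="Ignore", pod_condition="DisruptionTarget"),
]
```

With this policy, `decide(rules, PodFailure(exit_code=42))` yields `"FailJob"`, a pod carrying the `DisruptionTarget` condition yields `"Ignore"`, and any other failure falls through to `"Count"`.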