Retriable and non-retriable Pod failures for Jobs #3329

Closed
12 tasks done
alculquicondor opened this issue Jun 1, 2022 · 111 comments

Labels:
  • lead-opted-in: Denotes that an issue has been opted in to a release
  • sig/api-machinery: Categorizes an issue or PR as relevant to SIG API Machinery.
  • sig/apps: Categorizes an issue or PR as relevant to SIG Apps.
  • sig/node: Categorizes an issue or PR as relevant to SIG Node.
  • sig/scheduling: Categorizes an issue or PR as relevant to SIG Scheduling.
  • stage/stable: Denotes an issue tracking an enhancement targeted for Stable/GA status
  • tracked/yes: Denotes an enhancement issue is actively being tracked by the Release Team
  • wg/batch: Categorizes an issue or PR as relevant to WG Batch.

Comments

@alculquicondor
Member

alculquicondor commented Jun 1, 2022

Enhancement Description

@k8s-ci-robot added the needs-sig label on Jun 1, 2022
@alculquicondor
Member Author

/sig apps
/wg batch

@k8s-ci-robot added the sig/apps and wg/batch labels and removed the needs-sig label on Jun 1, 2022
@alculquicondor
Member Author

/assign

@mimowo
Contributor

mimowo commented Jun 2, 2022

/assign

@alculquicondor changed the title from "Retriable and non-retriable Job pod failures" to "Retriable and non-retriable Pod failures for Jobs" on Jun 6, 2022
@alculquicondor
Member Author

/sig scheduling
/sig api-machinery

@k8s-ci-robot added the sig/scheduling and sig/api-machinery labels on Jun 9, 2022
@Priyankasaggu11929
Member

Hello @alculquicondor 👋, 1.25 Enhancements team here.

Just checking in as we approach enhancements freeze at 18:00 PT on Thursday, June 23, 2022, which is just over 2 days from now.

Note: this enhancement is targeting stage alpha for 1.25 (correct me if otherwise).

Here's where this enhancement currently stands:

  • KEP file using the latest template has been merged into the k/enhancements repo.
  • KEP status is marked as implementable
  • KEP has an updated detailed test plan section filled out
  • KEP has up to date graduation criteria
  • KEP has a production readiness review that has been completed and merged into k/enhancements.

The open PR #3374 addresses all the criteria listed above. It just needs to be merged by the Enhancements Freeze.

Note: the status of this enhancement is marked as at risk. Please keep the issue description up-to-date with appropriate stages as well. Thank you!

@Priyankasaggu11929 added the tracked/yes label on Jun 22, 2022
@Priyankasaggu11929 added this to the v1.25 milestone on Jun 22, 2022
@Priyankasaggu11929
Member

With KEP PR #3374 merged, the enhancement is ready for the 1.25 Enhancements Freeze.

Note: the status is now marked as tracked. Thank you so much!

@kcmartin

Hello @alculquicondor 👋, 1.25 Release Docs Lead here.
This enhancement is marked as ‘Needs Docs’ for the 1.25 release.

Please follow the steps detailed in the documentation to open a PR against the dev-1.25 branch in the k/website repo. This PR can be just a placeholder at this time, and must be created by August 4.
Also, take a look at Documenting for a release to familiarize yourself with the docs requirements for the release.

Thank you!

@Atharva-Shinde
Contributor

Atharva-Shinde commented Jul 25, 2022

Hi @alculquicondor @mimowo, Enhancements team here again 👋

Checking in as we approach Code Freeze at 01:00 UTC on Wednesday, 3rd August 2022.

Please ensure that the following items are completed before the code-freeze:

Let me know if there are any additional k/k PRs besides the ones listed above.

Currently, the status of the enhancement is marked as at-risk.

Thanks :)

@mimowo
Contributor

mimowo commented Jul 29, 2022

@Atharva-Shinde @alculquicondor there is one more PR that should be included before the code freeze: kubernetes/kubernetes#111475

@edithturn

edithturn commented Jun 30, 2024

Hello, @alculquicondor and @mimowo!

👋 from the v1.31 Communications Team!

We'd love for you to opt in to write a feature blog about your enhancement! Some reasons why you might want to write a blog for this feature include (but are not limited to) if this introduces breaking changes, is important to our users, or has been in progress for a long time and is graduating.

To opt in, let us know and open a Feature Blog placeholder PR against the website repository by 3rd July, 2024. For more information about writing a blog see the blog contribution guidelines.

Note: In your placeholder PR, use XX characters for the blog date in the front matter and file name. We will work with you on updating the PR with the publication date once we have a final number of feature blogs for this release.

Hello, @alculquicondor and @mimowo!

Just a friendly reminder: if you need to open a Feature Blog placeholder PR against the website repository, please go ahead and do so. If you have any questions or need any help, feel free to reach out! For more information about writing a blog, check out the blog contribution guidelines.

Remember, the deadline for this is coming up soon: Wednesday, July 03, 2024, at 18:00 PDT.

Thank you!!
Edith

@tjons
Contributor

tjons commented Jul 8, 2024

Hey again @mimowo - 👋 Enhancements team here,

Just checking in as we approach code freeze at 02:00 UTC Wednesday 24th July 2024 / 19:00 PDT Tuesday 23rd July 2024.

Here's where this enhancement currently stands:

  • All PRs to the Kubernetes repo that are related to your enhancement are linked in the above issue description (for tracking purposes).
  • All PR/s are ready to be merged (they have approved and lgtm labels applied) by the code freeze deadline. This includes tests.

For this enhancement, it looks like the following PRs are open and need to be merged before code freeze:

If you anticipate missing code freeze, you can file an exception request in advance.

Also, please let me know if there are other PRs in k/k we should be tracking for this KEP.
As always, we are here to help if any questions come up. Thanks!

@tjons
Contributor

tjons commented Jul 18, 2024

With all PRs to k/k merged, this KEP is now tracked for code freeze!

@tjons moved this from At Risk for Code Freeze to Tracked for Code Freeze in 1.31 Enhancements Tracking on Jul 18, 2024
@sreeram-venkitesh
Member

@mimowo Looks like kubernetes/kubernetes#126169 has tests related to this KEP. Please make sure to get it merged before the test freeze deadline (01:00 UTC Wednesday 31st July 2024 / 19:00 PDT Tuesday 30th July 2024).

@mimowo
Contributor

mimowo commented Jul 25, 2024

> @mimowo Looks like kubernetes/kubernetes#126169 has tests related to this KEP. Please make sure to get it merged before the test freeze deadline (01:00 UTC Wednesday 31st July 2024 / 19:00 PDT Tuesday 30th July 2024).

@sreeram-venkitesh thanks for reaching out. I have updated the PR and will try to merge it, but OTOH I don't think it is required for this release cycle, because we will not promote these tests in this cycle anyway (per kubernetes/kubernetes#125482 (comment)). We have already promoted 2 tests that didn't have the flakiness issue: kubernetes/kubernetes#125482

@mimowo
Contributor

mimowo commented Jul 29, 2024

> Just a friendly reminder: if you need to open a Feature Blog placeholder PR against the website repository, please go ahead and do so. If you have any questions or need any help, feel free to reach out! For more information about writing a blog, check out the blog contribution guidelines.

@edithturn I was busy preparing the code updates for the release cycle and missed this message, but I have already opened the placeholder PR for the blog-post, see #3329 (comment), and I'm working on the content. Can we still include it?

@Princesso moved this from Tracked for Code Freeze to At Risk for Doc Freeze in 1.31 Enhancements Tracking on Jul 29, 2024
@Princesso moved this from At Risk for Doc Freeze to Tracked for Doc Freeze in 1.31 Enhancements Tracking on Jul 29, 2024
@kannon92
Contributor

@mimowo @alculquicondor Can we close this issue since the feature is stable?

@alculquicondor
Member Author

@kannon92 moved this from Implemented to Done in SIG Node 1.32 KEPs planning on Sep 3, 2024
@tjons
Contributor

tjons commented Sep 7, 2024

@alculquicondor in preparation for the next release, could you give me an ETA for when you expect the post-GA tasks to be complete? Do we need to track this KEP in the next cycle?

@mimowo
Contributor

mimowo commented Sep 9, 2024

The only pending work is to remove the feature-gate in 1.33.

I think we can close the issue. I see that other issues for GA features that are pending feature-gate removal are already closed, for example: AdmissionWebhookMatchConditions, AggregatedDiscoveryEndpoint, APIListChunking.

@mimowo
Contributor

mimowo commented Sep 9, 2024

It might be confusing that we have the "Modify the code to ignore the PodDisruptionConditions and JobPodFailurePolicy feature gates" task under Deprecation, but it was already done. I have created a KEP update to reflect the actual state: #4835. @alculquicondor please add these two PRs to the implementation list in the issue: kubernetes/kubernetes#125994 and kubernetes/kubernetes#126102

@alculquicondor
Member Author

Added

@alculquicondor
Member Author

Closing the issue seems ok, given the precedent of other KEPs that already graduated.

@mimowo
Contributor

mimowo commented Sep 9, 2024

/close

@k8s-ci-robot
Contributor

@mimowo: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
