
Retriable and non-retriable Pod failures for Jobs #3329

Open
6 of 8 tasks
alculquicondor opened this issue Jun 1, 2022 · 30 comments
Assignees
Labels
  • lead-opted-in: Denotes that an issue has been opted in to a release
  • sig/api-machinery: Categorizes an issue or PR as relevant to SIG API Machinery.
  • sig/apps: Categorizes an issue or PR as relevant to SIG Apps.
  • sig/node: Categorizes an issue or PR as relevant to SIG Node.
  • sig/scheduling: Categorizes an issue or PR as relevant to SIG Scheduling.
  • stage/beta: Denotes an issue tracking an enhancement targeted for Beta status
  • tracked/yes: Denotes an enhancement issue is actively being tracked by the Release Team
  • wg/batch: Categorizes an issue or PR as relevant to WG Batch.
Milestone

Comments

@alculquicondor
Member

alculquicondor commented Jun 1, 2022

Enhancement Description

@k8s-ci-robot added the needs-sig label (Indicates an issue or PR lacks a `sig/foo` label and requires one.) on Jun 1, 2022
@alculquicondor
Member Author

alculquicondor commented Jun 1, 2022

/sig apps
/wg batch

@k8s-ci-robot added the sig/apps (Categorizes an issue or PR as relevant to SIG Apps.) and wg/batch (Categorizes an issue or PR as relevant to WG Batch.) labels and removed the needs-sig label (Indicates an issue or PR lacks a `sig/foo` label and requires one.) on Jun 1, 2022
@alculquicondor
Member Author

alculquicondor commented Jun 1, 2022

/assign

@mimowo
Contributor

mimowo commented Jun 2, 2022

/assign

@alculquicondor changed the title from "Retriable and non-retriable Job pod failures" to "Retriable and non-retriable Pod failures for Jobs" on Jun 6, 2022
@alculquicondor
Member Author

alculquicondor commented Jun 9, 2022

/sig scheduling
/sig api-machinery

@k8s-ci-robot added the sig/scheduling (Categorizes an issue or PR as relevant to SIG Scheduling.) and sig/api-machinery (Categorizes an issue or PR as relevant to SIG API Machinery.) labels on Jun 9, 2022
@Priyankasaggu11929
Member

Priyankasaggu11929 commented Jun 22, 2022

Hello @alculquicondor 👋, 1.25 Enhancements team here.

Just checking in as we approach enhancements freeze at 18:00 PT on Thursday June 23, 2022, which is just over 2 days from now.

Note that this enhancement is targeting stage alpha for 1.25 (correct me if otherwise).

Here's where this enhancement currently stands:

  • KEP file using the latest template has been merged into the k/enhancements repo.
  • KEP status is marked as implementable
  • KEP has an updated detailed test plan section filled out
  • KEP has up to date graduation criteria
  • KEP has a production readiness review that has been completed and merged into k/enhancements.

The open PR #3374 addresses all the criteria listed above. It just needs to be merged by the Enhancements Freeze.

Note that the status of this enhancement is marked as at risk. Please keep the issue description up-to-date with appropriate stages as well. Thank you!

@Priyankasaggu11929 added the tracked/yes label (Denotes an enhancement issue is actively being tracked by the Release Team) on Jun 22, 2022
@Priyankasaggu11929 added this to the v1.25 milestone on Jun 22, 2022
@Priyankasaggu11929
Member

Priyankasaggu11929 commented Jun 23, 2022

With KEP PR #3374 merged, the enhancement is ready for the 1.25 Enhancements Freeze.

Note that the status is now marked as tracked. Thank you so much!

@kcmartin

kcmartin commented Jul 12, 2022

Hello @alculquicondor 👋, 1.25 Release Docs Lead here.
This enhancement is marked as ‘Needs Docs’ for 1.25 release.

Please follow the steps detailed in the documentation to open a PR against the dev-1.25 branch in the k/website repo. This PR can be just a placeholder at this time, and must be created by August 4. Also, take a look at Documenting for a release to familiarize yourself with the docs requirements for the release.

Thank you!

@Atharva-Shinde

Atharva-Shinde commented Jul 25, 2022

Hi @alculquicondor @mimowo, Enhancements team here again 👋

Checking in as we approach Code Freeze at 01:00 UTC on Wednesday, 3rd August 2022.

Please ensure that the following items are completed before the code-freeze:

Let me know if there are any additional k/k PRs besides the ones listed above

Currently, the status of the enhancement is marked as at-risk

Thanks :)

@mimowo
Contributor

mimowo commented Jul 29, 2022

@Atharva-Shinde @alculquicondor there is one more PR that should be included before the code freeze: kubernetes/kubernetes#111475

@Atharva-Shinde

Atharva-Shinde commented Jul 29, 2022

thank you @mimowo,
I have updated my comment with the PR and have also tagged you for future reference :)

@k8s-ci-robot removed the tracked/no label (Denotes an enhancement issue is NOT actively being tracked by the Release Team) on Sep 22, 2022
@Atharva-Shinde

Atharva-Shinde commented Sep 24, 2022

Hey @alculquicondor @mimowo 👋, 1.26 Enhancements team here!

Just checking in as we approach Enhancements Freeze at 18:00 PDT on Thursday 6th October 2022.

This enhancement is targeting stage beta for 1.26.

Here's where this enhancement currently stands:

  • KEP file using the latest template has been merged into the k/enhancements repo.
  • KEP status is marked as implementable
  • KEP has an updated detailed test plan section filled out
  • KEP has up to date graduation criteria
  • KEP has a production readiness review that has been completed and merged into k/enhancements.

For this KEP, we would need to:

  • Update the kep.yaml to reflect the current milestone information
  • Update the production readiness review with latest stage information
  • Include the new updated PR of this KEP in the Issue Description and get it merged before Enhancements Freeze to make this enhancement eligible for 1.26 release.

The status of this enhancement is marked as at risk. Please keep the issue description up-to-date with appropriate stages as well.
Thank you :)

@mimowo
Contributor

mimowo commented Sep 26, 2022

@Atharva-Shinde the enhancement is targeting Beta for 1.26. This is the KEP update which is currently under review: #3463.

@Atharva-Shinde

Atharva-Shinde commented Sep 26, 2022

Thanks @mimowo I've updated my comment :)

@derekwaynecarr
Member

derekwaynecarr commented Oct 3, 2022

/milestone v1.26
/label lead-opted-in

(For sig-node, we see this is not attempting to derive any intelligence from kubelet/runtime initiated conditions)

@Atharva-Shinde

Atharva-Shinde commented Oct 5, 2022

Hello @alculquicondor @mimowo 👋, just a quick check-in again, as we approach the 1.26 Enhancements freeze.

Please plan to get the action items mentioned in my comment above done before Enhancements freeze at 18:00 PDT on Thursday 6th October 2022, i.e. tomorrow.

Note that the current status of the enhancement is marked at-risk :)

@soltysh
Contributor

soltysh commented Oct 5, 2022

@Atharva-Shinde the PRR has been approved at the correct beta level in https://github.com/kubernetes/enhancements/pull/3463/files, so I'm not quite sure what else you expect?

@Atharva-Shinde

Atharva-Shinde commented Oct 5, 2022

Thanks @soltysh for bringing this to my notice (not sure how I missed this, sorry for the error). Everything is up-to-date! I've updated the KEP status to tracked for the 1.26 release cycle :)

@parul5sahoo
Member

parul5sahoo commented Nov 1, 2022

Hi @alculquicondor and @mimowo 👋,

Checking in once more as we approach 1.26 code freeze at 17:00 PDT on Tuesday 8th November 2022.

Please ensure the following items are completed:

  • All PRs to the Kubernetes repo that are related to your enhancement are linked in the above issue description (for tracking purposes).
  • All PRs are fully merged by the code freeze deadline.

For this enhancement, please plan to get PRs out for all k/k code so it can be merged by code freeze. If you do have k/k PRs open, please link them to this issue. Let me know if there aren't any further PRs that need to be created or merged for this enhancement, so that I can mark it as tracked for code freeze.

As always, we are here to help should questions come up. Thanks!

@cathchu

cathchu commented Nov 2, 2022

Hello @alculquicondor and @mimowo 👋 1.26 Release Docs shadow here!

This enhancement is marked as ‘Needs Docs’ for 1.26 release.
Please follow the steps detailed in the documentation to open a PR against the dev-1.26 branch in the k/website repo. This PR can be just a placeholder at this time, and must be created by November 9. Also, take a look at Documenting for a release to familiarize yourself with the docs requirements for the release.

Thank you!

@mimowo
Contributor

mimowo commented Nov 2, 2022

The placeholder PR is prepared: kubernetes/website#37242. @alculquicondor please reference it in the Issue description.

@parul5sahoo
Member

parul5sahoo commented Nov 7, 2022

Hey @mimowo and @alculquicondor ,

As the code freeze is just a day away, I just wanted to confirm that there are no open PRs in the k/k repo, or any repo in general, for this enhancement other than the ones outlined in the issue description. Please get the open PRs merged before the code freeze, so that the enhancement can be marked tracked.

@mimowo
Contributor

mimowo commented Nov 7, 2022

There is one more k/e PR whose purpose is to align the KEP with the decisions taken during the implementation phase. Not sure if it should be blocking for the Code Freeze. Anyway, @alculquicondor, could you please add the KEP update to the list of PRs and review / approve?

@rhockenbury

rhockenbury commented Nov 9, 2022

We have this marked as tracked for code freeze.

Projects
Status: Graduating