Handle jobs with deleted pods #6273

Closed
wants to merge 1 commit into from


dnephin (Contributor) commented Dec 3, 2023

Fixes #6272

Check for the case where a job has completed but no longer has any pods, and set such jobs to state.Terminated to indicate they are ready.
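
A minimal sketch of the check described above, for readers who have not opened the diff. It is not the code from this PR: `JobStatus`, `State`, and `stateForJob` are hypothetical stand-ins, and how the real code detects "completed" and "no pods" is an assumption.

```go
package main

// State is a hypothetical, simplified stand-in for the resource state
// discussed in this PR. It is not the project's actual type.
type State string

const (
	StateWaiting    State = "waiting"
	StateActive     State = "active"
	StateTerminated State = "terminated"
)

// JobStatus is a hypothetical summary of what is known about a job.
type JobStatus struct {
	Complete bool // the job itself reports completion
	PodCount int  // number of pods still present for the job
}

// stateForJob returns StateTerminated when the job has completed and its
// pods are gone. In every other case it keeps the state computed elsewhere.
func stateForJob(js JobStatus, current State) State {
	if js.Complete && js.PodCount == 0 {
		return StateTerminated
	}
	return current
}
```

With this shape, a completed job whose pods were deleted maps to the terminated state, while a job that still has pods keeps whatever state was derived from those pods.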

Tested manually using the steps described in #6272, and added a unit test that fails without this change.

Signed-off-by: Daniel Nephin <dnephin@gmail.com>

nicks (Member) commented Dec 4, 2023

thanks for investigating this!

let me poke around a bit before we merge this approach. historically, we try to keep "status interpretation" as low level as possible, or you get weird bugs where multiple layers are interpreting the status differently.

dnephin (Contributor, Author) commented Dec 4, 2023

Another option I had considered was to remove the || res.Type == v1alpha1.TargetTypeJob clause from

} else if res.State.Active != nil && (!res.State.Active.Ready || res.Type == v1alpha1.TargetTypeJob) {

but that seemed weird because then a job could be ready in two different states (either terminated, or active and ready).
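
To make the difference concrete, here is a small sketch comparing the quoted condition with and without the job clause. The types are simplified, hypothetical stand-ins for the v1alpha1 ones, and the two function names are invented for illustration. Only the boolean expression is taken from the line quoted above.

```go
package main

// TargetType and the structs below are hypothetical, simplified stand-ins
// for the v1alpha1 types referenced in the quoted condition.
type TargetType string

const (
	TargetTypeJob    TargetType = "job"
	TargetTypeServer TargetType = "server"
)

type ActiveState struct {
	Ready bool
}

type TargetState struct {
	Active *ActiveState
}

type Resource struct {
	Type  TargetType
	State TargetState
}

// matchesWithJobClause mirrors the quoted condition: any active job
// matches, whether or not it is ready.
func matchesWithJobClause(res Resource) bool {
	return res.State.Active != nil &&
		(!res.State.Active.Ready || res.Type == TargetTypeJob)
}

// matchesWithoutJobClause drops the job clause, so only active resources
// that are not ready match.
func matchesWithoutJobClause(res Resource) bool {
	return res.State.Active != nil && !res.State.Active.Ready
}
```

For a job that is active and ready, the first function returns true and the second returns false, so without the clause such a job no longer falls into this branch. That is the "ready in two different states" outcome mentioned above.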

dnephin mentioned this pull request Dec 4, 2023

nicks (Member) commented Dec 5, 2023

i re-rolled this into #6276, which has some additional integration tests. ya, i think removing the || res.Type == v1alpha1.TargetTypeJob is the right change.

nicks closed this Dec 5, 2023
dnephin deleted the fix-jobs-with-deleted-pods branch December 6, 2023 00:20
Development

Successfully merging this pull request may close these issues.

Job never becomes ready when re-attaching to exiting env when pods for the job are deleted