Delete non-terminal jobs and subworkflow invocations when cancelling invocation #16252

Merged

mvdbeek
Member

@mvdbeek mvdbeek commented Jun 15, 2023

This came up as a wish during the last UI/UX call with @nomadscientist. I contemplated making the job deletion part optional, but couldn't think of a reason not to make this the default. It's no problem to make it configurable, though, if there is a reason to keep the old behavior.

I'm not sure this is completely free of race conditions, but worst case you have to delete again, which seems fine.

I think it'd also be nice to have various flavors of bulk deletion (delete all datasets produced by an invocation, delete all non-output datasets, maybe other modalities?).

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@mvdbeek
Member Author

mvdbeek commented Jun 16, 2023

| 5 - Concatenate datasets (with sleep) on data 2 (HID - NAME) 
INFO:     127.0.0.1:54788 - "GET /api/histories/b492c5865e953d13/contents/f4569757dd099abb HTTP/1.1" 200 OK
| Dataset State:
|  running
| Dataset Blurb:
|  deleted
| Dataset Info:
|  *Dataset info is empty.*
| Peek:
|  <table cellspacing="0" cellpadding="3"><tr><td>Job deleted</td></tr></table>
INFO:     127.0.0.1:54798 - "GET /api/histories/b492c5865e953d13/contents/f4569757dd099abb/provenance HTTP/1.1" 200 OK
| Dataset Job Standard Output:
|  *Standard output was empty.*
| Dataset Job Standard Error:
|  *Standard error was empty.*

Hmm, I guess that's the race condition I anticipated.

@mvdbeek mvdbeek marked this pull request as draft June 20, 2023 14:46
@mvdbeek mvdbeek removed this from the 23.1 milestone Jun 21, 2023
@mvdbeek mvdbeek force-pushed the delete_jobs_when_cancelling_invocation branch 7 times, most recently from 0c67ecd to ea97fe7 Compare October 11, 2023 22:06
@mvdbeek mvdbeek marked this pull request as ready for review October 12, 2023 09:55
@mvdbeek
Member Author

mvdbeek commented Oct 12, 2023

This is ready for review now; the conda unit test failure is unrelated.

@github-actions github-actions bot added this to the 23.2 milestone Oct 12, 2023
@mvdbeek mvdbeek requested a review from a team October 12, 2023 09:56
@mvdbeek mvdbeek force-pushed the delete_jobs_when_cancelling_invocation branch from 001b7be to bd3eb21 Compare October 12, 2023 10:01
API / users set invocation to cancelling, scheduler then deletes
outputs. This avoids race conditions where the cancelled invocation
still generates new jobs.
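
The two-step flow described in this commit message could be sketched roughly as follows. This is a minimal, self-contained illustration and not Galaxy's actual code: the `Invocation` and `Job` classes, the job state strings, and the `request_cancel` / `scheduler_iteration` helpers are all hypothetical names. Only the invocation state names (new, ready, scheduled, cancelling, cancelled, failed) come from this conversation.

```python
from dataclasses import dataclass, field
from enum import Enum


class InvocationState(str, Enum):
    NEW = "new"
    READY = "ready"
    SCHEDULED = "scheduled"
    CANCELLING = "cancelling"
    CANCELLED = "cancelled"
    FAILED = "failed"


@dataclass
class Job:
    id: int
    state: str = "running"  # hypothetical job states: "running", "ok", "deleted"


@dataclass
class Invocation:
    state: InvocationState = InvocationState.READY
    jobs: list = field(default_factory=list)


def request_cancel(invocation: Invocation) -> bool:
    """API side (hypothetical): only flag the invocation, never touch jobs."""
    if invocation.state in (
        InvocationState.CANCELLING,
        InvocationState.CANCELLED,
        InvocationState.FAILED,
    ):
        return False
    invocation.state = InvocationState.CANCELLING
    return True


def scheduler_iteration(invocation: Invocation, next_job_id: int) -> None:
    """Scheduler side (hypothetical): the only place that creates or deletes jobs."""
    if invocation.state == InvocationState.CANCELLING:
        # Delete every job that has not reached a terminal state yet.
        for job in invocation.jobs:
            if job.state not in ("ok", "deleted"):
                job.state = "deleted"
        invocation.state = InvocationState.CANCELLED
    elif invocation.state in (InvocationState.NEW, InvocationState.READY):
        # Normal scheduling: a new job is created for the next step.
        invocation.jobs.append(Job(id=next_job_id))
    # Any other state: nothing to schedule, so no new jobs appear.
```

Because job creation and job deletion both happen in the scheduler, a cancel request can no longer interleave with a scheduling pass that is still creating jobs for the same invocation.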
@mvdbeek mvdbeek force-pushed the delete_jobs_when_cancelling_invocation branch from bd3eb21 to b684ee9 Compare November 13, 2023 15:00
@nsoranzo
Member

Failed API tests seem relevant.

@mvdbeek
Member Author

mvdbeek commented Nov 14, 2023

yeah, and the crazy thing is reverting 3e4380f fixes this

@mvdbeek
Member Author

mvdbeek commented Nov 14, 2023

Ouch: workflow_invocation.state = workflow_invocation.mark_cancelled() ... well, that sets the state to None 😆
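
For readers unfamiliar with this class of bug, here is a small sketch of how assigning the result of a mutating method can wipe out the attribute. The `WorkflowInvocation` class below is a stand-in written for illustration, assuming (as the comment above implies) that `mark_cancelled()` sets the state itself and returns nothing; it is not Galaxy's model class.

```python
class WorkflowInvocation:
    """Stand-in model, assumed for illustration only."""

    def __init__(self):
        self.state = "cancelling"

    def mark_cancelled(self):
        # Mutates in place and implicitly returns None.
        self.state = "cancelled"


def buggy_cancel():
    workflow_invocation = WorkflowInvocation()
    # The method sets state to "cancelled", then the assignment
    # immediately overwrites it with the method's return value, None.
    workflow_invocation.state = workflow_invocation.mark_cancelled()
    return workflow_invocation.state


def fixed_cancel():
    workflow_invocation = WorkflowInvocation()
    # Just call the method; it already updates the state.
    workflow_invocation.mark_cancelled()
    return workflow_invocation.state
```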

@mvdbeek mvdbeek force-pushed the delete_jobs_when_cancelling_invocation branch from a4df188 to db9d5b5 Compare November 14, 2023 08:48
@mvdbeek mvdbeek force-pushed the delete_jobs_when_cancelling_invocation branch from a0055ca to 194f5ad Compare November 14, 2023 08:59
@mvdbeek
Member Author

mvdbeek commented Nov 14, 2023

This probably also fixes #1450 ... if it is still an issue.
A similar approach, where we prevent the state from changing from cancelling, cancelled, or failed back to new, ready, or scheduled, would probably also help with #1450.
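
One way to picture the guard suggested above is a small helper that refuses to move an invocation from cancelling, cancelled, or failed back to new, ready, or scheduled. The function name `guard_state` and the choice to silently keep the current state are assumptions made for this sketch; the real implementation may look quite different.

```python
# States that must not be reverted to an active state.
LOCKED_STATES = {"cancelling", "cancelled", "failed"}
# States that would resume scheduling.
ACTIVE_STATES = {"new", "ready", "scheduled"}


def guard_state(current: str, requested: str) -> str:
    """Return the state the invocation should end up in (hypothetical helper)."""
    if current in LOCKED_STATES and requested in ACTIVE_STATES:
        # Ignore the request: a cancelled or failed invocation stays that way.
        return current
    return requested
```

With this guard, a scheduler pass that loaded the invocation before the cancel request cannot flip it back to an active state when it writes its own result; for example, `guard_state("cancelled", "ready")` keeps `"cancelled"`, while `guard_state("cancelling", "cancelled")` still goes through.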

@mvdbeek mvdbeek force-pushed the delete_jobs_when_cancelling_invocation branch from 5e55552 to 2a11420 Compare November 14, 2023 14:12
@mvdbeek
Member Author

mvdbeek commented Nov 14, 2023

Failing tests look unrelated, but I triggered a rerun anyway.

@mvdbeek mvdbeek merged commit 706e977 into galaxyproject:dev Nov 14, 2023
49 of 51 checks passed
@mvdbeek mvdbeek added the highlight/power-user Included at bottom of user-facing release notes (please use either this or highlight, but not both) label Jan 10, 2024
Labels
area/workflows highlight/power-user Included at bottom of user-facing release notes (please use either this or highlight, but not both) kind/enhancement