reclaim ownership for failed jobs #6996

Merged
jmchilton merged 2 commits into galaxyproject:dev from bernt-matthias:topic/reclaim_failed on Dec 14, 2018

bernt-matthias commented Nov 9, 2018

Fixes the first bug described in #4308. Note that the error described in the issue stems from _handle_runner_state(). There are two ways the code can end up in this function:

  • _handle_runner_state() <- finish_or_resubmit() <- finish_job()
  • _handle_runner_state() <- fail_job()

The former already calls reclaim_ownership() in finish_job(), but the call seems to be missing for fail_job().

Note that I have been running the drmaa runner with a reclaim_ownership() call in _handle_runner_state() for a long time now (#4857), but as @jmchilton suggested, this might not be the best place for the call (#4857 (comment)). With the call placed there, reclaim_ownership() would also run twice whenever finish_job() is called.

I am still a bit unsure whether this is the optimal place for reclaiming ownership, but ownership certainly needs to be changed for all jobs.
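
To make the two call paths and the placement trade-off concrete, here is a toy model. It is only a sketch, not the actual Galaxy code: the class SketchRunner, its two flags, and the helper count_reclaims() are hypothetical, and reclaim_ownership() merely counts its invocations instead of changing any file ownership. Only the method names and the call order are taken from the description above.

```python
class SketchRunner:
    """Hypothetical stand-in that mirrors only the call order described above."""

    def __init__(self, reclaim_in_fail, reclaim_in_handle_state):
        self.reclaim_in_fail = reclaim_in_fail
        self.reclaim_in_handle_state = reclaim_in_handle_state
        self.reclaim_calls = 0

    def reclaim_ownership(self):
        # Stand-in for the real ownership change; it only counts invocations.
        self.reclaim_calls += 1

    def _handle_runner_state(self):
        # Alternative placement of the call, as discussed in #4857.
        if self.reclaim_in_handle_state:
            self.reclaim_ownership()

    def finish_or_resubmit(self):
        self._handle_runner_state()

    def finish_job(self):
        # This path reclaims ownership before handling the runner state.
        self.reclaim_ownership()
        self.finish_or_resubmit()

    def fail_job(self):
        # The call this pull request adds, modelled as a switch.
        if self.reclaim_in_fail:
            self.reclaim_ownership()
        self._handle_runner_state()


def count_reclaims(entry_point, reclaim_in_fail, reclaim_in_handle_state):
    """Run one entry point and return how often ownership was reclaimed."""
    runner = SketchRunner(reclaim_in_fail, reclaim_in_handle_state)
    getattr(runner, entry_point)()
    return runner.reclaim_calls


variants = {
    "before this change": (False, False),
    "call added to fail_job()": (True, False),
    "call in _handle_runner_state()": (False, True),
}
for label, flags in variants.items():
    counts = {name: count_reclaims(name, *flags) for name in ("finish_job", "fail_job")}
    print(label, counts)
```

In this model, the variant without the added call never reclaims ownership on the fail_job() path, the variant with the call in fail_job() reclaims exactly once on both paths, and the variant with the call in _handle_runner_state() reclaims twice on the finish_job() path.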

galaxybot added the triage label on Nov 9, 2018
galaxybot added this to the 19.01 milestone on Nov 9, 2018
bernt-matthias force-pushed the bernt-matthias:topic/reclaim_failed branch 3 times, most recently from 5269acb to 21a4f07, on Nov 9, 2018
bernt-matthias force-pushed the bernt-matthias:topic/reclaim_failed branch from 21a4f07 to 876b460 on Nov 14, 2018

jmchilton commented Dec 12, 2018

👍 from me, I'll merge after these tests run.

jmchilton merged commit 1caf7f4 into galaxyproject:dev on Dec 14, 2018
6 checks passed

  • api test: Build finished. 444 tests run, 1 skipped, 0 failed.
  • continuous-integration/travis-ci/pr: The Travis CI build passed
  • framework test: Build finished. 192 tests run, 0 skipped, 0 failed.
  • integration test: Build finished. 274 tests run, 10 skipped, 0 failed.
  • selenium test: Build finished. 151 tests run, 3 skipped, 0 failed.
  • toolshed test: Build finished. 577 tests run, 0 skipped, 0 failed.

bernt-matthias deleted the bernt-matthias:topic/reclaim_failed branch on Jan 2, 2019