Added datacheck for timely release of eHive semaphores #557
Description of the problem
Early semaphore release issues affected several pipelines in e111. Such issues have been reported before: in the e109 Vertebrates FTP dumps, e105 Mammals EPOwithExt, e103 default ncRNAtrees, and e96 Plants.
However, the impact in e111 (one pipeline completely rerun, and a second pipeline left with out-of-sync homology data) calls for measures to catch such issues as early as possible.
Scope of the pull request
The aim of this PR is to create a datacheck (TimelySemaphoreRelease) which queries the semaphore and job tables to check that every dependent semaphore job was started only after all of its fan jobs had completed. The datacheck should fail if this is not the case for any semaphored job.
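The check described above can be sketched roughly as follows. This is an illustration only, not the datacheck's actual code: it uses an in-memory SQLite database with a simplified, hypothetical schema loosely modeled on the job and semaphore tables. The column names (`when_started`, `when_completed`, `controlled_semaphore_id`, `dependent_job_id`), the helper names and the demo rows are all assumptions made for the example.

```python
import sqlite3

# Hypothetical, simplified schema: column names are illustrative assumptions,
# not necessarily those of the real eHive tables.
SCHEMA = """
CREATE TABLE job (
    job_id                  INTEGER PRIMARY KEY,
    controlled_semaphore_id INTEGER,  -- semaphore this fan job counts towards
    when_started            TEXT,
    when_completed          TEXT
);
CREATE TABLE semaphore (
    semaphore_id     INTEGER PRIMARY KEY,
    dependent_job_id INTEGER          -- job blocked by this semaphore
);
"""

# A semaphore was released early if its dependent job has started while
# some fan job is still incomplete, or finished after that start time.
EARLY_RELEASE_QUERY = """
SELECT s.semaphore_id, dep.job_id
FROM semaphore s
JOIN job dep ON dep.job_id = s.dependent_job_id
JOIN job fan ON fan.controlled_semaphore_id = s.semaphore_id
WHERE dep.when_started IS NOT NULL
GROUP BY s.semaphore_id, dep.job_id, dep.when_started
HAVING SUM(fan.when_completed IS NULL) > 0
    OR MAX(fan.when_completed) > dep.when_started
ORDER BY s.semaphore_id
"""


def find_early_releases(conn):
    """Return (semaphore_id, dependent_job_id) pairs released too early."""
    return conn.execute(EARLY_RELEASE_QUERY).fetchall()


def build_demo_db():
    """Build a toy database: semaphore 1 is timely, semaphore 2 is early."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO semaphore VALUES (?, ?)",
        [(1, 10), (2, 20)],
    )
    conn.executemany(
        "INSERT INTO job VALUES (?, ?, ?, ?)",
        [
            # Dependent job 10 starts after both of its fan jobs complete.
            (10, None, "2023-01-01 10:10:00", "2023-01-01 10:20:00"),
            (11, 1, "2023-01-01 09:50:00", "2023-01-01 10:00:00"),
            (12, 1, "2023-01-01 09:55:00", "2023-01-01 10:05:00"),
            # Dependent job 20 starts before fan job 22 completes.
            (20, None, "2023-01-01 11:15:00", "2023-01-01 11:40:00"),
            (21, 2, "2023-01-01 10:50:00", "2023-01-01 11:00:00"),
            (22, 2, "2023-01-01 10:55:00", "2023-01-01 11:30:00"),
        ],
    )
    return conn


if __name__ == "__main__":
    offenders = find_early_releases(build_demo_db())
    # A datacheck built on this idea would fail if any rows come back.
    print(offenders)
```

In this toy data only semaphore 2 is flagged, because its dependent job started while one fan job was still running. A real implementation would need to follow the actual eHive schema, including how job start times are recorded.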
Relevant compara ticket: ENSCOMPARASW-6845
WARNING: the position of the CompareVariationRows index entry was changed by this PR.
Testing
The datacheck was run manually on three databases: two where it is expected to fail and one where it is expected to pass.