Cancel runs if no progress is made in the manifest #12

Merged
ayazhafiz merged 13 commits into main from cancel-inactive-runs on Apr 5, 2023

ayazhafiz
Member

Currently, an active run can reach a state where all of its workers have died and no progress is being made on the test suite, yet the test suite stays in the queue's memory. Since such runs may never be returned to, we'd like to reduce the pressure they place on a running queue.

This series of patches addresses the problem by running a job every hour that checks whether a test run has made any progress through its manifest. If either

  • there is no manifest associated with the run after an hour, or
  • no more items have been popped off the manifest since the last time progress was checked

then the run will be cancelled. If progress has been made, or the run was already done, the run is left untouched. If progress has been made and the run is not yet done, a job to check the progress again later is re-enqueued.
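
The decision rule above can be sketched roughly as follows. This is only an illustration in Python, not the code from these patches: `RunSnapshot`, `Action`, and `check_progress` are hypothetical names, and treating a missing baseline on the first check as "nothing to compare against yet" is an assumption, since the description does not say how the first check is baselined.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Action(Enum):
    CANCEL = auto()   # no manifest, or no progress since the last check
    LEAVE = auto()    # run already done; nothing to do
    RECHECK = auto()  # progress was made; enqueue another check for later


@dataclass
class RunSnapshot:
    """Hypothetical view of a run at the time the progress job fires."""
    has_manifest: bool
    items_popped: int  # manifest items handed out so far
    is_done: bool


def check_progress(run: RunSnapshot, popped_at_last_check: Optional[int]) -> Action:
    # A run that has already finished is left untouched.
    if run.is_done:
        return Action.LEAVE
    # No manifest was ever associated with the run: cancel it.
    if not run.has_manifest:
        return Action.CANCEL
    # Nothing new has been popped since the previous check: cancel it.
    # (Assumption: with no previous check recorded, there is no baseline,
    # so the run is given until the next check.)
    if popped_at_last_check is not None and run.items_popped <= popped_at_last_check:
        return Action.CANCEL
    # Progress was made and the run is not done: check again later.
    return Action.RECHECK
```

For example, a run with 5 items popped at both the previous and the current check would be cancelled, while one that went from 5 to 8 would have its check re-enqueued with 8 as the new baseline.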

In the future, we'll likely want to change this behavior so that a job is not cancelled outright, but can instead be re-launched from its last failure state.

@ayazhafiz ayazhafiz requested a review from doxavore April 5, 2023 17:36
github-actions bot commented Apr 5, 2023

Bigtest for 9537e32 (run)

Benchmarks:

  • RSpec: 11.24% overhead
    • RSpec time: 17.79 seconds
    • ABQ time: 19.79 seconds
  • RSpec parallel, 10 runs: max 15.29% overhead
    • min 6.69% overhead
    • standard deviation: 2.87%
  • Jest: 5.58% overhead
    • Jest time: 21.16 seconds
    • ABQ time: 22.341 seconds

Fuzz result sizes:

  • PASSED

@ayazhafiz ayazhafiz enabled auto-merge (squash) April 5, 2023 20:16
github-actions bot commented Apr 5, 2023

Bigtest for 66dac76 (run)

Benchmarks:

  • RSpec: 15.57% overhead
    • RSpec time: 17.73 seconds
    • ABQ time: 20.49 seconds
  • RSpec parallel, 10 runs: max 12.63% overhead
    • min 6.66% overhead
    • standard deviation: 1.51%
  • Jest: 6.11% overhead
    • Jest time: 21.362 seconds
    • ABQ time: 22.667 seconds

Fuzz result sizes:

  • PASSED

@ayazhafiz ayazhafiz merged commit 3547f18 into main Apr 5, 2023
17 checks passed
@ayazhafiz ayazhafiz deleted the cancel-inactive-runs branch April 5, 2023 21:26