Fix Failed Subjob Result Sending #454

Merged
merged 12 commits into box:master on Dec 22, 2021

Conversation

rsennewald
Contributor

  • Retry sending results on subjob completion up to 5 times, with a linear sleep between attempts. This resolves the issue where a subjob would sometimes fail to send its results on the first attempt, so the run never received all results and showed as In Progress indefinitely.
  • Check if executor count is >= 0 instead of == 0 when marking a slave as idle. A slave could reach a negative executor count, likely through a race condition, which caused an Exception here.
  • Rename the Dockerfile stages to be more explicit, and use the builder stage to build a test image and run tests in Docker for easier local development.
  • Add docker-lint and docker-test targets to the Makefile so local testing can run in the same Docker image in which we build our rpm.
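
For readers unfamiliar with the pattern, the retry behavior in the first bullet can be sketched roughly as below. This is an illustration only, not the code in this PR: `send_with_retries`, its parameters, and the 1-second base delay are all hypothetical names and values chosen for the example. Only the limit of 5 attempts and the linear sleep come from the description above.

```python
import time
from typing import Callable


def send_with_retries(
    send: Callable[[], None],
    max_attempts: int = 5,
    base_delay_seconds: float = 1.0,  # assumed value; the PR does not state the delay
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it succeeds and return the number of attempts used.

    Re-raises the last error if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send()
            return attempt
        except Exception:
            if attempt == max_attempts:
                # Out of attempts: surface the failure to the caller.
                raise
            # Linear backoff: wait 1x, 2x, 3x ... the base delay.
            sleep(attempt * base_delay_seconds)
    raise ValueError('max_attempts must be at least 1')
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. A caller would wrap the actual result-posting call in a zero-argument function and pass it as `send`.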

@CLAassistant

CLAassistant commented Dec 17, 2021

CLA assistant check
All committers have signed the CLA.

@rsennewald rsennewald changed the title Fix failed subjob result sending Fix Failed Subjob Result Sending Dec 17, 2021
@rsennewald rsennewald merged commit aed64a6 into box:master Dec 22, 2021
@rsennewald rsennewald deleted the fixFailedSubjobResultSending branch December 22, 2021 01:48

4 participants