set max dispatch workers to same as max forks #11800
Merged
Right now, without this, we end up with a different number for max_workers than max_forks. For example, on a control node with 16 GiB of RAM:

max_mem_capacity w/ 100 MB/fork = (16*1024)/100 --> 164
max_workers = 5 * 16 --> 80

This means we would allow that control node to control up to 164 jobs, but every job after the 80th would be stuck in waiting, waiting for a dispatch worker to free up to run it.
SUMMARY
Have max_workers == max_forks, based on memory capacity, to prevent situations where we start jobs because we believe there is enough capacity to control them, but there are not enough dispatch workers available to actually do so. In cases where a user decides to use the "capacity adjustment" or otherwise limit how many jobs a control node can control, we
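The mismatch described above can be sketched with simplified formulas. This is an illustrative approximation of the numbers in the description, not the exact AWX code; the function names and constants here are hypothetical.

```python
# Illustrative sketch of the max_workers vs. max_forks mismatch.
# Constants and formulas mirror the example numbers in this PR
# description; they are assumptions, not the exact AWX implementation.

MB_PER_FORK = 100  # assumed memory cost per fork, in MB


def mem_capacity(mem_gib: int, mb_per_fork: int = MB_PER_FORK) -> int:
    """Forks a node's RAM can support, per the formula above (~164 for 16 GiB)."""
    return (mem_gib * 1024) // mb_per_fork


def old_max_workers(mem_gib: int) -> int:
    """Old behavior: worker count derived separately (5 * 16 -> 80 above)."""
    return 5 * mem_gib


capacity = mem_capacity(16)      # control capacity in jobs
workers = old_max_workers(16)    # only 80 dispatch workers available
assert workers < capacity        # jobs beyond the 80th sit in "waiting"

# The change in this PR, conceptually: tie max_workers to the
# memory-based capacity so every controllable job can get a worker.
new_max_workers = capacity
assert new_max_workers >= capacity
```

With max_workers tied to the memory-based fork capacity, a job is only started when a dispatch worker will actually be available to run it.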
ISSUE TYPE
COMPONENT NAME
AWX VERSION
ADDITIONAL INFORMATION
@jainnikhil30 noticed jobs sitting in waiting for a long time when he was running many concurrent jobs, and @AlanCoding helped identify how the maximum number of dispatch workers factors into that.