Skip to content

Conversation

@moio
Copy link
Contributor

@moio moio commented Sep 21, 2017

What does this PR do?

This adds a minion configuration parameter to limit the number of processes or threads a minion will start in response to published messages, preventing resource exhaustion in case a high number of concurrent jobs is scheduled in a short time.

What issues does this PR fix or reference?

#43665

Previous Behavior

salt-minion could consume significant resources if a large number of functions calls is scheduled in parallel, as it would spawn one process for each.

New Behavior

By default, nothing changes. If process_count_max is specified, salt-minion will not spawn more than a certain amount of processes/threads.

Tests written?

One new unit test was added and existing tests were updated.

@ghost
Copy link

ghost commented Sep 21, 2017

@moio, thanks for your PR! By analyzing the history of the files in this pull request, we identified @s0undt3ch, @jacksontj and @cachedout to be potential reviewers.

@moio moio force-pushed the develop-limit-minion-processes branch 2 times, most recently from 1f73d69 to 84daaf2 Compare September 21, 2017 15:12
@cachedout cachedout requested a review from thatch45 September 21, 2017 16:41
salt/minion.py Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some real concerns here. This is perhaps the most performance-sensitive area of the minion and I really don't like introducing more disk hits and de-serialization overhead.

I would feel much more comfortable if this configuration option were disabled by default but I'll also let @thatch45 and others comment here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reasoning went along the lines that hitting the disk (I believe you are referring to salt.utils.minion.running here) is certainly less expensive than spawning a process. At least that is what I empirically observed in my setup - going above ~100 concurrent processes becomes intractable quickly.

That said I was also wondering whether it would be wiser to disable by default (retaining the current behavior) or implement the safety net by default, which is ultimately what I chose. In any case I am absolutely open to change course if you think it's more appropriate.

Thanks for looking into this @cachedout!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is just that we incur that overhead even in cases where we'll never be concerned about hitting an upper limit on processes. We go to great pains to avoid that. The only way I think I'll feel comfortable here is if this is disabled by default, I think.

@moio moio force-pushed the develop-limit-minion-processes branch from 84daaf2 to b81069b Compare September 22, 2017 13:47
@moio
Copy link
Contributor Author

moio commented Sep 22, 2017

@cachedout: I added a completely optional, self contained commit on top that disables the mechanism by default.

I did not find significant performance impacts for my typical use cases but I am totally fine if you judge it might not be OK in general - in that case simply include this commit, otherwise leave it out from the merge.

Thanks for your time

doc/man/salt.7 Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove all the changes to the man page. They are compiled separately and shouldn't be updated directly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be updated to -1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be updated to -1 also.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. :] I appreciate you being willing to change this.

This allows users to limit the number of processes or threads a minion
will start in response to published messages, prevents resource
exhaustion in case a high number of concurrent jobs is scheduled in a
short time.
@moio moio force-pushed the develop-limit-minion-processes branch from b81069b to fd4194a Compare September 22, 2017 19:58
@moio
Copy link
Contributor Author

moio commented Sep 22, 2017

All fixed!

Jenkins failures do not seem related to me.

@thatch45 thatch45 merged commit d8f371b into saltstack:develop Sep 26, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants