
add niceness control for subprocesses of salt master #50905

Closed · wants to merge 1 commit

Conversation

@mattp- (Contributor) commented Dec 18, 2018

What does this PR do?

Under high load, these niceness settings can be useful to tweak to ensure that critical parts of Salt are not resource-starved.
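For context, a minimal sketch of the mechanism (not the actual patch; the option name shown is a hypothetical example): a forked master subprocess reads a niceness value from the master config and applies it to itself at startup.

```python
# Minimal sketch, not the actual patch.  A forked salt-master
# subprocess applies a configurable niceness to itself at startup;
# the option name "pub_server_niceness" is a hypothetical example.
import os


def apply_niceness(opts, key):
    """Apply the configured niceness increment for this process, if set."""
    niceness = opts.get(key)
    if niceness:
        # os.nice() is relative: it adds the increment to the current
        # niceness and returns the new value.  A negative increment
        # (raising priority) requires root.
        os.nice(niceness)


# e.g., called early in a subprocess's run() method:
# apply_niceness(self.opts, "pub_server_niceness")
```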

Tests written?

No

Commits signed with GPG?

No

Please review Salt's Contributing Guide for best practices.

See GitHub's page on GPG signing for more information about signing commits with GPG.

@mattp- requested a review from a team as a code owner December 18, 2018 23:15
@mattp- force-pushed the niceness branch 5 times, most recently from 0bcebb6 to 86eca7f December 18, 2018 23:48
@cachedout (Contributor)

Way cool!

We'll definitely need docs written before we can get this in, though.

@mattp- force-pushed the niceness branch 2 times, most recently from 96e767c to a3aaeea December 19, 2018 03:04
@mattp- (Contributor, Author) commented Dec 19, 2018

@cachedout working on it, I'll ping ... not you :) (it's likely not going to be by EOD tomorrow) when tests and docs have this ready to go. Thanks!

@mattp- changed the title from "add niceness control for subrocesses of salt master" to "add niceness control for subprocesses of salt master" Dec 19, 2018
@mattp- force-pushed the niceness branch 2 times, most recently from e090b10 to 11fa1fc December 19, 2018 03:12
@cachedout (Contributor)

@mattp- I turn into a pumpkin tomorrow at midnight. ;) I'm sure @thatch45 has your back though!

@mattp- force-pushed the niceness branch 2 times, most recently from fa44239 to c3b6fc1 December 19, 2018 03:53
@thatch45 (Contributor)

Yes, I really like this addition, a great idea! Once we have docs in for this it should be good to merge!

@mattp- force-pushed the niceness branch 2 times, most recently from aa066ff to 1ab37cc December 19, 2018 04:31
@austinpapp (Contributor)

For completeness, do we see any value in adding niceness tuning for the fileserver update, maintenance, etc. processes?

(Review thread on salt/master.py; outdated, resolved)
@dwoz (Contributor) commented Dec 19, 2018

Personally, I'd like to understand the driver for this better. Specifically, which processes are causing the high load, and why? Generally, I think of niceness as something that should be set for a program as a whole by something outside of the program (not by the program itself).

@cachedout (Contributor)

@dwoz Since we're forking our own procs and those processes have different roles, I think there is a good argument for the application to take control over setting nice values appropriately per-process.
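A rough illustration of that pattern (a sketch only; the role names and niceness values here are invented, not Salt's actual process names or defaults): the parent forks role-specific children, and each child adjusts only its own priority immediately after the fork.

```python
# Sketch of an application setting per-role niceness for its own
# forked children.  Roles and values are illustrative, not Salt's.
import multiprocessing
import os

NICENESS_BY_ROLE = {"worker": 5, "publisher": 0}


def run_role(role):
    increment = NICENESS_BY_ROLE[role]
    if increment:
        os.nice(increment)  # the child demotes only itself
    # ... the role's main loop would run here ...


if __name__ == "__main__":
    procs = [multiprocessing.Process(target=run_role, args=(role,))
             for role in NICENESS_BY_ROLE]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
```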

@sagetherage (Contributor)

@mattp-, @dwoz raises a valid point to further the discussion, and we would like to put this through our RFC process. Can you please look at this template and submit an RFC PR? https://github.com/saltstack/salt/blob/develop/rfcs/0000-template.md

@mattp- (Contributor, Author) commented Dec 19, 2018

@sagetherage sure, I can try to get that together. But basically, @dwoz, to answer your question: any orchestration-heavy workload is going to cause CPU contention on the master, and once you introduce orchestrations that call orchestrations, it adds up quickly. What we've seen is that the IPCMessageServer and publisher were falling over, or significantly falling behind, under loads on the order of 30-40x the server's core count. Using nice to bump their priority was a large boon to performance under this load.
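For anyone trying to reproduce this, the priority each master child actually runs at can be checked with the standard library (a sketch assuming a Unix host; the PIDs would come from however you enumerate the salt-master children, e.g. ps):

```python
# Standard-library sketch (Unix only): report the niceness that a
# set of salt-master child PIDs are actually running with.
import os


def child_niceness(pids):
    return {pid: os.getpriority(os.PRIO_PROCESS, pid) for pid in pids}


# e.g. child_niceness([1234, 1235]) might return {1234: 0, 1235: 5}
```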

@dwoz (Contributor) commented Dec 20, 2018

@mattp- Thanks for the explanation. I think this is an easy and immediate fix for the issue you have described.

Longer term, I think we should try to determine what is causing the load and then look into ways of mitigating the problem. Perhaps we could batch the orchestrations or otherwise optimize the process so it will play nice on its own.

My concern with having Salt set its own nice settings via the config file is that it seems like a non-standard way for system administrators to set niceness for a process. It'd be nice if we could separate and name processes accordingly, so that this could be done outside of Salt in the standard way for the specific OS. That would be /etc/security/limits.conf on Debian Linux.

Overall, I'm not opposed to this getting merged. However, I think we could discuss and possibly implement some of the solutions above, which would allow us to remove these settings in the future. We can discuss all of this more in the RFC if you make one.

@thatch45 (Contributor)

While I agree that limits.conf and doing this at the system level makes sense, I also think that having this in here is a good idea. Since Salt is spinning up so many procs, I think it will be wise to allow users to set this up in this way.

@austinpapp (Contributor)

Any movement on this? FWIW, this code has been deployed in a production setting and is working ... nicely.

@cachedout (Contributor)

Did the ability to nice the maintenance proc get in here? IMHO, it would be great to be able to dial back the priority of that proc in certain cases.

@mattp- (Contributor, Author) commented Jan 22, 2019

@cachedout good catch; I did not add this to the maintenance proc. Will take care of that too.

@dwoz (Contributor) commented May 5, 2019

@mattp- What is the status of this?

@mattp- (Contributor, Author) commented May 6, 2019

@dwoz I think the last outstanding issue was putting together an RFC, which I've not had time to do. The original goal of this was to let the mworkers run under a lower priority so they don't starve the other processes of the salt-master; I don't think you can use limits.conf to do this externally (if you can, I'd happily close this and move to that). If we are OK with foregoing the RFC, I can add the maintenance proc as a switch. What do you think?

@dwoz (Contributor) commented May 14, 2019

@mattp- Based on the conversation above, I am fine with this provided it's an optional setting. From the looks of the changes it is indeed optional. I think all that is needed is to add the option for the maintenance process.

@KChandrashekhar added the ZRELEASED - Neon label May 15, 2019
@KChandrashekhar requested a review from dwoz May 15, 2019 16:13
@Ch3LL changed the base branch from develop to neon June 6, 2019 21:49
@Ch3LL (Contributor) commented Jun 6, 2019

FYI, I migrated this PR from develop to neon to ensure it is included in the upcoming Neon release. Let me know if this caused any issues. Thanks!

@mattp- (Contributor, Author) commented Jun 12, 2019

@dwoz added some docs, the opt for maintenance, and also the mworker_queue option, which I had missed as well. Thanks.

(Review thread on salt/master.py; resolved)
Commit: under high load these can be useful to tweak for ensuring critical parts of salt aren't being resource starved.
@mattp- (Contributor, Author) commented Aug 13, 2019

Rebased on current neon.

@dwoz (Contributor) commented Dec 9, 2019

@mattp- Can you rebase this against the master branch please?

@dwoz (Contributor) commented Apr 13, 2020

@twangboy The master-port label is for PRs that are porting something to master, not PRs that need to be ported.

@mattp- This needs rebasing against master.

@twangboy (Contributor)

I rebased this against the master branch: #57365

@twangboy added the has master-port label May 19, 2020
@sagetherage (Contributor) commented May 20, 2020

@dwoz it looks like we can close this PR in favor of Shane's rebase in PR #57365, yes?

@dwoz closed this May 22, 2020