[2018.3] Fixes to schedule maxrunning on master #50130
What does this PR do?
Adding some master-specific functions to utils/master.py to determine whether a Salt process is running. Updating utils/schedule.py to use the appropriate running-check function from either utils/master.py or utils/minion.py, depending on where the scheduled job is running. Adding tests for maxrunning in scheduled jobs on both the minion and the master.
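At its core, "determine if a Salt process is running" comes down to checking PID liveness. The helper below is an illustrative sketch only (the name `is_pid_running` is hypothetical, not the PR's actual function, which also inspects process metadata rather than trusting a possibly recycled PID):

```python
import errno
import os


def is_pid_running(pid):
    '''
    Return True if a process with this PID exists (POSIX).
    Illustrative only -- a real check should also compare process
    metadata, since PIDs can be recycled by the OS.
    '''
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # Signal 0: existence check, sends nothing
    except OSError as exc:
        if exc.errno == errno.ESRCH:   # No such process
            return False
        if exc.errno == errno.EPERM:   # Exists, but owned by another user
            return True
        raise
    return True
```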
What issues does this PR fix or reference?
Commits signed with GPG?
Please review Salt's Contributing Guide for best practices.
See GitHub's page on GPG signing for more information about signing commits with GPG.
@garethgreenaway Lint issues in the tests: https://jenkinsci.saltstack.com/job/pr-lint/job/PR-50130/1/warnings52Result/new/
DISCLAIMER: The code below does not have to be exactly this way. It is just a way to express the solution idea.
What I would propose here is a different approach to this problem, since our task is essentially to track which jobs are currently running so that maxrunning can be enforced reliably.
Conceptually, in pseudo-code, I would express the solution idea and its interface in the following way. I personally would prefer it as a class, so one can instantiate it and start working with the jobs, or just refer to an instance at module level (YMMV):
```python
class JobTrackerUtility(object):
    '''
    Class that tracks jobs in cache. This should also be extended to
    https://github.com/saltstack/salt/pull/50078
    '''

    def register(self, job):
        '''
        This registers a job and works with the cache/serialiser.
        '''
        ...
        return True or False  # Registered or not

    def start_job(self, payload):
        '''
        We just start one job and register it.
        '''
        command, other_things = payload.parse_in_some_way()
        job = Job()  # Wrapper to a job in cache.
                     # Can be just a dict as well,
                     # but objects are just more convenient.
        job.command = command
        job.is_parallel = other_things.get_is_countable()  # This is a conceptual idea
        self.register(job)  # This job is not yet started!

    def get_all_running_jobs(self):
        '''
        This just reads what is in the cache alongside the whole metadata.
        '''
        # At this point we no longer care if this is
        # Windows or Linux or Solaris.
        ...
        return available_jobs

    def running_jobs(self):
        '''
        Get only jobs that we respect for maxrunning.
        '''
        jobs_we_need = []
        for job in self.get_all_running_jobs():
            if job.we_are_interested_in():  # This is where we use beforehand-prepared info
                jobs_we_need.append(job)
        return jobs_we_need

    def cleanup_jobs(self):
        '''
        Cleanup jobs.
        '''
        # This internally would get all running jobs,
        # see if a job is actually running, and clean up PIDs or
        # leave as is.
        for job in self.get_all_running_jobs():
            if not job.is_running():
                job.cleanup()  # This Job() instance would remove the PID,
                               # check what is needed,
                               # clean the cache, etc.
```
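To make the pseudo-code above concrete, here is a minimal runnable sketch of the same idea. All names are hypothetical, and an in-memory dict stands in for the real Salt job cache/serialiser; the key point is that the PID is recorded at registration time rather than discovered afterwards:

```python
import os
import time
import uuid


def _pid_alive(pid):
    '''Best-effort POSIX liveness check (stand-in for job.is_running()).'''
    try:
        os.kill(pid, 0)
    except OSError:
        return False
    return True


class JobTracker(object):
    '''
    Minimal sketch of the tracker idea. The dict below stands in
    for the on-disk job cache shared between processes.
    '''

    def __init__(self):
        self._cache = {}  # jid -> job metadata

    def register(self, job):
        jid = job.setdefault('jid', uuid.uuid4().hex)
        if jid in self._cache:
            return False
        self._cache[jid] = job
        return True

    def start_job(self, command, parallel=False):
        job = {'command': command,
               'pid': os.getpid(),  # Recorded up front, not discovered later
               'parallel': parallel,
               'started': time.time()}
        self.register(job)
        return job

    def get_all_running_jobs(self):
        # OS-agnostic: we only read back what was registered.
        return list(self._cache.values())

    def running_jobs(self, command=None):
        '''Only the jobs counted against maxrunning (optionally per command).'''
        return [job for job in self.get_all_running_jobs()
                if command is None or job['command'] == command]

    def cleanup_jobs(self):
        '''Drop cache entries whose process is no longer alive.'''
        for jid, job in list(self._cache.items()):
            if not _pid_alive(job['pid']):
                del self._cache[jid]
```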
There is already an RFC, #50078, which contains the Job Retry idea. I wrote there an alternative approach to this, and it seems better to work with caches on master/minion. We can also reuse the approach above there, and then in SaltSSH for multiple requests at the same time. The job tracker above fits there to control, view and manage jobs for similar purposes: retrying jobs from the cache, polling them, adding them from another process or a local client, etc.
The only downside here is that it would require more development.
Oct 22, 2018
@cachedout It still seems to me that this figures out running processes after the fact, and thus still looks prone to errors, instead of pre-registering them and getting it reliable the way an OS normally would. Nothing in that area has been changed from the design perspective. But since you are asking to approve, I am approving...
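The distinction the comment draws can be illustrated with a small sketch: with pre-registration, a slot against maxrunning is reserved *before* the job process exists, so concurrent schedulers can never overshoot the limit, whereas scanning for processes after the fact leaves a race window. The class and its names below are hypothetical, not Salt's implementation:

```python
import threading


class MaxRunningGate(object):
    '''
    Pre-registration sketch: reserve a slot before starting a job.
    A post-hoc scan ("count matching processes, then spawn") can
    race; acquiring under a lock up front cannot.
    '''

    def __init__(self, maxrunning):
        self._lock = threading.Lock()
        self._running = {}  # job name -> current count
        self.maxrunning = maxrunning

    def try_acquire(self, name):
        '''Reserve a slot; refuse before the job process even exists.'''
        with self._lock:
            count = self._running.get(name, 0)
            if count >= self.maxrunning:
                return False
            self._running[name] = count + 1
            return True

    def release(self, name):
        '''Give the slot back once the job finishes.'''
        with self._lock:
            self._running[name] -= 1
```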