Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only add reevaluate_occupancy callback once #1953

Merged
merged 5 commits into from May 3, 2018
Merged

Conversation

alorenzo175
Copy link
Contributor

Scheduler.reevaluate_occupancy is currently added as a callback every time Scheduler.start() is called. Since Scheduler.start() is called every time Scheduler.restart()is called, long running schedulers that have restarted many times eventually consume most of a CPU trying to reevaluate occupancy. To add to the problem, the cpu percent check in reevaluate_occupancy always returns 0.0, at least on linux with psutil 5.4.3, since a new psutil.Processobject is made for each check (see https://psutil.readthedocs.io/en/latest/#psutil.cpu_percent).

I've moved where the reevaluate_occupancycallback is added to the IOLoop so that it is not re-added on a restart. I've also switched to using psutil.cpu_percent() instead of psutil.Process().cpu_percent().

I'm not sure if this needs testing or how to go about adding this to the test suite.

@mrocklin
Copy link
Member

mrocklin commented May 2, 2018

This is excellent. Thank you for tracking this down (I can't imagine it was easy) and then providing such a concise fix.

Regarding psutil.cpu_percent() I'm slightly concerned that this might be confused by other processes that are running on the same machine. Perhaps we should make a global process object in the file and use that? Though I can see arguments for either approach.

Agreed about testing. Not sure what to do here. I suggest skipping it for now.

@alorenzo175
Copy link
Contributor Author

I think you're right about making a dedicated process object; I assumed proc.cpu_percent() only applied to the current PID. I added a Scheduler.procattribute. Not sure if the PID check is really necessary.

if proc.cpu_percent() < 50:
# scheduler was somehow moved to another PID
if self.proc.pid != os.getpid():
self.proc = psutil.Process()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You observed this happening? This seems surprising ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not. Just a precaution that is likely unnecessary. I was thinking of running the scheduler on an HPC where it can be interrupted and resumed. Looks like that might break other parts of the code though.

@mrocklin
Copy link
Member

mrocklin commented May 3, 2018

I'm investigating the failures. They don't seem related, but they are persisting, which is odd.

@mrocklin
Copy link
Member

mrocklin commented May 3, 2018

Hrm, nevermind. Likely a false alarm. Merging.

Thanks @alorenzo175 !

@mrocklin mrocklin merged commit c088b3f into dask:master May 3, 2018
@alorenzo175 alorenzo175 deleted the reeval branch May 4, 2018 16:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants