Only add reevaluate_occupancy callback once #1953
This is excellent. Thank you for tracking this down (I can't imagine it was easy) and then providing such a concise fix. Regarding Agreed about testing. Not sure what to do here. I suggest skipping it for now.
I think you're right about making a dedicated process object; I assumed
distributed/scheduler.py
if proc.cpu_percent() < 50:
# scheduler was somehow moved to another PID
if self.proc.pid != os.getpid():
self.proc = psutil.Process()
You observed this happening? This seems surprising ...
I have not. Just a precaution that is likely unnecessary. I was thinking of running the scheduler on an HPC where it can be interrupted and resumed. Looks like that might break other parts of the code though.
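For anyone reading this thread later, here is a rough sketch of why a dedicated, long-lived process object matters for the CPU check. It does not use psutil itself: `ToyCpuMeter`, `burn_cpu`, `fresh_object_reading`, and `reused_object_reading` are hypothetical names, and the class only mimics the documented behavior of `psutil.Process.cpu_percent()` with no interval, where the first call on a new object has no earlier sample to compare against and returns 0.0.

```python
import time


class ToyCpuMeter:
    """Hypothetical stand-in for a psutil.Process-like object (not psutil itself)."""

    def __init__(self):
        # (wall time, CPU time) recorded at the previous call; None until first call.
        self._last = None

    def cpu_percent(self):
        now = (time.monotonic(), time.process_time())
        last, self._last = self._last, now
        if last is None:
            # No earlier sample to compare against: the first reading is always 0.0.
            return 0.0
        wall = now[0] - last[0]
        cpu = now[1] - last[1]
        return 100.0 * cpu / wall if wall > 0 else 0.0


def burn_cpu(seconds):
    """Spin until this process has used roughly `seconds` of CPU time."""
    end = time.process_time() + seconds
    count = 0
    while time.process_time() < end:
        count += 1
    return count


def fresh_object_reading():
    """Pattern from before the fix: a new meter object for every check."""
    burn_cpu(0.05)
    return ToyCpuMeter().cpu_percent()


def reused_object_reading(meter):
    """Pattern after the fix: one meter object kept across checks."""
    burn_cpu(0.05)
    return meter.cpu_percent()
```

With a fresh object per check the reading is 0.0 no matter how busy the process is, so a test like `cpu_percent() < 50` always passes. Keeping one object around (as with `self.proc` in the diff above) gives each call a previous sample to measure against.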
I'm investigating the failures. They don't seem related, but they are persisting, which is odd.
Hrm, never mind. Likely a false alarm. Merging. Thanks @alorenzo175!
`Scheduler.reevaluate_occupancy` is currently added as a callback every time `Scheduler.start()` is called. Since `Scheduler.start()` is called every time `Scheduler.restart()` is called, long-running schedulers that have restarted many times eventually consume most of a CPU trying to reevaluate occupancy. To add to the problem, the CPU percent check in `reevaluate_occupancy` always returns 0.0, at least on Linux with psutil 5.4.3, since a new `psutil.Process` object is made for each check (see https://psutil.readthedocs.io/en/latest/#psutil.cpu_percent).
I've moved where the `reevaluate_occupancy` callback is added to the IOLoop so that it is not re-added on a restart. I've also switched to using `psutil.cpu_percent()` instead of `psutil.Process().cpu_percent()`.
I'm not sure if this needs testing or how to go about adding this to the test suite.