Tune GC parameters #1653

Closed
pitrou opened this issue Dec 21, 2017 · 4 comments · Fixed by #2624
Labels
enhancement Improve existing functionality or make things work better

Comments

@pitrou
Member

pitrou commented Dec 21, 2017

According to Linux perf, 18% of the scheduler's runtime (and perhaps the workers' as well, depending on the workload) can be spent in the Python GC. I suggest that dask-worker and dask-scheduler (and perhaps Nanny as well) tune the GC parameters at process startup to reduce the number of garbage collection attempts.

Something vaguely like:

import gc

# Double each generation's threshold so collections run less often.
g0, g1, g2 = gc.get_threshold()
gc.set_threshold(g0 * 2, g1 * 2, g2 * 2)
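
The same idea can be wrapped in a small helper that is called once at process startup. This is only a sketch: the name `scale_gc_thresholds` and its `factor` parameter are hypothetical and are not part of distributed; only `gc.get_threshold` and `gc.set_threshold` come from the standard library.

```python
import gc


def scale_gc_thresholds(factor):
    """Multiply all three GC thresholds by `factor` and return the old values.

    Hypothetical helper for illustration; not an existing distributed API.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    old = gc.get_threshold()
    gc.set_threshold(*(int(t * factor) for t in old))
    return old


# Call once, early in process startup, before the event loop starts.
previous = scale_gc_thresholds(3)
print("GC thresholds:", previous, "->", gc.get_threshold())
```

Returning the previous thresholds makes it easy to restore them later with `gc.set_threshold(*previous)`, for example in tests.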
@pitrou pitrou added the enhancement Improve existing functionality or make things work better label Dec 21, 2017
@pitrou
Member Author

pitrou commented Dec 21, 2017

The following:

g0, g1, g2 = gc.get_threshold()
gc.set_threshold(g0 * 3, g1 * 3, g2 * 3)

seems to save around 8% on the scheduler benchmark.
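
To see why larger thresholds help, one can count how many collections the interpreter runs for the same allocation pattern under both settings. The sketch below is a synthetic stand-in, not the scheduler benchmark mentioned above; `count_collections` and the object count are assumptions made for illustration.

```python
import gc


def count_collections(n_objects, thresholds):
    """Allocate `n_objects` small lists under `thresholds`.

    Returns the number of GC runs (summed over all generations) that
    happened during the allocation. Hypothetical helper for illustration.
    """
    saved = gc.get_threshold()
    gc.collect()  # start from a clean state
    gc.set_threshold(*thresholds)
    before = sum(s["collections"] for s in gc.get_stats())
    try:
        # Lists are container objects, so each allocation is GC-tracked.
        keep = [[] for _ in range(n_objects)]
    finally:
        gc.set_threshold(*saved)
    after = sum(s["collections"] for s in gc.get_stats())
    del keep
    return after - before


g0, g1, g2 = gc.get_threshold()
default_runs = count_collections(200_000, (g0, g1, g2))
tripled_runs = count_collections(200_000, (g0 * 3, g1 * 3, g2 * 3))
print(f"collections: default={default_runs}, tripled={tripled_runs}")
```

Fewer collections does not translate one-to-one into saved runtime, so a real benchmark is still needed to confirm a figure like the 8% above.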

@jonmorton

8% seems like a lot! I am seeing similar gains.
I was getting a lot of "WARNING - full garbage collections took 13% CPU time recently (threshold: 10%)" type messages until I increased the gc thresholds.
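
A rough way to check the share of time spent in the collector, independent of distributed's own warning, is to hook `gc.callbacks`, which the interpreter calls with `"start"` and `"stop"` around every collection. This is a minimal sketch with a made-up workload; the `GCTimer` class is hypothetical and is not how distributed computes the percentage in its warning.

```python
import gc
import time


class GCTimer:
    """Accumulate wall-clock time spent inside collections, per generation."""

    def __init__(self):
        self.seconds = {}
        self._started = None

    def __call__(self, phase, info):
        if phase == "start":
            self._started = time.perf_counter()
        elif phase == "stop" and self._started is not None:
            gen = info["generation"]
            spent = time.perf_counter() - self._started
            self.seconds[gen] = self.seconds.get(gen, 0.0) + spent
            self._started = None


timer = GCTimer()
gc.callbacks.append(timer)
t0 = time.perf_counter()
try:
    # Stand-in workload: build reference cycles so the collector has work.
    for _ in range(50_000):
        a, b = [], []
        a.append(b)
        b.append(a)
    gc.collect()  # force one full collection
finally:
    gc.callbacks.remove(timer)
elapsed = time.perf_counter() - t0

gc_total = sum(timer.seconds.values())
print(f"GC took {gc_total / elapsed:.1%} of {elapsed:.3f}s")
```

Running this before and after raising the thresholds gives a quick indication of whether the change reduces GC time for a given workload.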

Any interest in incorporating this by default?

@mrocklin
Member

I would be fine with this. We would probably want to do it in distributed/cli/dask_scheduler.py. Want to give it a try and submit a small PR, @jonmorton?

mrocklin added a commit to mrocklin/distributed that referenced this issue Apr 18, 2019
@mrocklin
Member

mrocklin commented Apr 18, 2019 via email
