
Scheduler memory just keep increasing in idle mode #5509

Closed

ridhachahed opened this issue Nov 8, 2021 · 5 comments

@ridhachahed

Hello Dask community !

I am running Dask on a Linux server and I am very limited by my memory. My main problem is that the Dask scheduler process just keeps eating more and more memory even though I haven't submitted any work to the workers yet! I would like to understand what's going on and whether there is any mitigation for this problem.

To reproduce it:
from dask.distributed import Client
client = Client(memory_limit='100MB', processes=False, n_workers=4, threads_per_worker=1)

Then check memory with top | grep python

PS: I succeeded in limiting the worker process memory by passing memory_limit to the Client call, but this doesn't seem to be possible for the scheduler process.
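
To put numbers on this, here is a minimal sketch for sampling the scheduler's memory from the same Python session (it assumes psutil is installed; watch_scheduler_memory is an illustrative helper, not a Dask API; with processes=False the scheduler lives in the current process):

import time
import psutil

def watch_scheduler_memory(samples=12, interval=5):
    # With processes=False the scheduler runs in this process,
    # so the process RSS includes the scheduler's memory.
    proc = psutil.Process()
    for _ in range(samples):
        rss_mb = proc.memory_info().rss / 1e6
        print(f"scheduler process RSS: {rss_mb:.1f} MB")
        time.sleep(interval)

watch_scheduler_memory()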

@ian-r-rose
Collaborator

Hi @ridhachahed, are you perhaps using the dask-labextension? If so, you might be running into dask/dask-labextension#185.

@ridhachahed
Author

ridhachahed commented Nov 8, 2021

Hello @ian-r-rose, unfortunately I am not using dask-labextension; everything is run from the command line without the dashboard. You will find attached the numbers I have, where we can see the scheduler memory increasing while the worker memory reaches a plateau at one point (when we don't put a limit on their memory).

[screenshots: scheduler memory increasing steadily over time while worker memory plateaus]

jrbourbeau transferred this issue from dask/dask Nov 9, 2021
@jrbourbeau
Member

The scheduler has an additional event logging system where certain messages are stored. The size of these logs is limited (the default is 100k entries) to prevent unbounded memory usage:

self.events = defaultdict(
    partial(
        deque, maxlen=dask.config.get("distributed.scheduler.events-log-length")
    )
)

It could be that you're observing these administrative logs building up on the scheduler. What happens if you set the distributed.scheduler.events-log-length configuration value (which controls the size of these event logs) to a much smaller value, e.g. 10?
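
For example (a sketch; the key can also be set through Dask's YAML configuration or an environment variable, and what matters is setting it before the Client, and therefore the scheduler, is created):

import dask
from dask.distributed import Client

# Cap the scheduler's event log before the scheduler is created
dask.config.set({"distributed.scheduler.events-log-length": 10})

client = Client(memory_limit="100MB", processes=False, n_workers=4, threads_per_worker=1)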

@ridhachahed
Author

@jrbourbeau Thanks for your help! It definitely helped mitigate the problem, as you can see in the graphs below, but we are still observing a slower increase. If I understand correctly, self.events is only used for debugging purposes, right? So setting the queue size to 10 will not affect performance. What about self.transition_log, self.log, and self.computation? I am having a hard time tracing the scheduler attributes to see how they are modified during communication; do you have any suggestions?

[screenshots: after capping events-log-length, scheduler memory grows much more slowly but still increases]
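
One rough way to see which of these structures is actually growing is to poll their lengths from the client side (a sketch; it assumes the in-process scheduler is reachable as client.cluster.scheduler, and the attribute names are simply the ones mentioned above):

# Sketch: print the size of the scheduler-side logs.
# Assumes processes=False so the Scheduler object is reachable locally;
# attribute names are taken from this discussion, and len() is only a
# rough indicator (events is a dict of per-topic deques).
s = client.cluster.scheduler
for name in ("events", "transition_log", "log", "computation"):
    obj = getattr(s, name, None)
    if obj is not None:
        print(name, len(obj))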

@gjoseph92
Collaborator

@ridhachahed I see you closed this—what resolution did you come to? Did reducing self.transition_log, self.log and self.computation help?
