multiprocessing.Pool._worker_handler(): use SIGCHLD to be notified on worker exit #79674
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
assignee = None
closed_at = <Date 2019-03-16.22:36:14.592>
created_at = <Date 2018-12-14.11:27:54.050>
labels = ['3.8', 'library']
title = 'multiprocessing.Pool._worker_handler(): use SIGCHLD to be notified on worker exit'
updated_at = <Date 2020-03-14.15:01:39.718>
user = 'https://github.com/vstinner'

activity = <Date 2020-03-14.15:01:39.718>
actor = 'arekm'
assignee = 'none'
closed = True
closed_date = <Date 2019-03-16.22:36:14.592>
closer = 'pablogsal'
components = ['Library (Lib)']
creation = <Date 2018-12-14.11:27:54.050>
creator = 'vstinner'
dependencies =
files =
hgrepos =
issue_num = 35493
keywords = ['patch', 'patch', 'patch']
message_count = 12.0
messages = ['331797', '331799', '331801', '331803', '331804', '331807', '331810', '333348', '333350', '338106', '361465', '364179']
nosy_count = 6.0
nosy_names = ['arekm', 'pitrou', 'vstinner', 'stefanor', 'davin', 'pablogsal']
pr_nums = ['11488', '11488', '11488']
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue35493'
versions = ['Python 3.8']
Currently, multiprocessing.Pool._worker_handler() checks every 100 ms whether a worker has exited, using time.sleep(0.1). This adds latency when workers exit frequently and the pool has to execute a large number of tasks.
    import multiprocessing
    import time

    CONCURRENCY = 1
    NTASK = 100

    def noop():
        pass

    with multiprocessing.Pool(CONCURRENCY, maxtasksperchild=1) as pool:
        start_time = time.monotonic()
        results = [pool.apply_async(noop, ()) for _ in range(NTASK)]
        for result in results:
            result.get()
        dt = time.monotonic() - start_time
        pool.terminate()
        pool.join()
    print("Total: %.1f sec" % dt)
The worst case is a pool of 1 process where each worker executes only a single task and the task does nothing (minimizing task execution time): the latency is 100 ms per task, which means 10 seconds for 100 tasks.
Using the SIGCHLD signal to be notified when a worker exits would avoid polling: it would reduce both latency and CPU usage (the thread would no longer have to be woken up every 100 ms).
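To make the idea concrete, here is a minimal UNIX-only sketch. It does not touch Pool internals: it forks a throwaway child and blocks until SIGCHLD arrives instead of sleeping in 100 ms steps. Routing the signal through signal.set_wakeup_fd() and a socket pair is a choice made for this sketch, not something the proposal specifies, and the name wait_for_child_exit is hypothetical. It has to run in the main thread, since that is where Python installs signal handlers.

```python
import os
import select
import signal
import socket
import time


def wait_for_child_exit(timeout=5.0):
    """Fork a child that exits at once; block until SIGCHLD is delivered.

    Hypothetical helper for illustration only. Returns (notified, waited).
    """
    rsock, wsock = socket.socketpair()
    rsock.setblocking(False)
    wsock.setblocking(False)  # set_wakeup_fd() requires a non-blocking fd

    # A Python-level handler must be installed, otherwise SIGCHLD is
    # ignored and nothing is written to the wakeup fd.
    old_handler = signal.signal(signal.SIGCHLD, lambda signum, frame: None)
    old_fd = signal.set_wakeup_fd(wsock.fileno())
    try:
        start = time.monotonic()
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit immediately, like a no-op task
        # Parent: sleep until the signal byte shows up (no periodic wakeups).
        ready, _, _ = select.select([rsock], [], [], timeout)
        waited = time.monotonic() - start
        os.waitpid(pid, 0)  # reap the child
        return bool(ready), waited
    finally:
        signal.set_wakeup_fd(old_fd)
        if old_handler is None:
            old_handler = signal.SIG_DFL
        signal.signal(signal.SIGCHLD, old_handler)
        rsock.close()
        wsock.close()


if __name__ == "__main__":
    notified, waited = wait_for_child_exit()
    print("notified:", notified, "after %.3f sec" % waited)
```

The wakeup fd is used here because doing real work inside a Python signal handler (for example setting a threading.Event) can deadlock if the signal lands while the main thread holds the event's internal lock.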
How do you use SIGCHLD on Windows?
There is actually a portable (and robust) solution: use Process.sentinel
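A minimal sketch of what waiting on Process.sentinel can look like, using multiprocessing.connection.wait(), which the documentation lists as accepting sentinels. The helper name wait_for_any_exit and the short_task function are made up for this example; this is not the Pool code itself.

```python
import multiprocessing
from multiprocessing.connection import wait


def short_task():
    """Stand-in for a worker that exits right after a single task."""
    pass


def wait_for_any_exit(processes, timeout=None):
    """Block until at least one process has exited (or the timeout expires).

    Hypothetical helper: returns the list of processes whose sentinel
    became ready, which is empty if the timeout expired first.
    """
    by_sentinel = {p.sentinel: p for p in processes}
    ready = wait(list(by_sentinel), timeout)
    return [by_sentinel[s] for s in ready]


if __name__ == "__main__":
    workers = [multiprocessing.Process(target=short_task) for _ in range(3)]
    for p in workers:
        p.start()

    pending = list(workers)
    while pending:
        # No time.sleep(0.1): the call returns as soon as a worker exits.
        for p in wait_for_any_exit(pending):
            p.join()
            pending.remove(p)
    print("all workers exited")
```

The same call works on both Windows and UNIX, which is what makes this approach portable where SIGCHLD is not.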
There is another issue: Pool is currently subclassed by ThreadPool. You'll probably have to make the two implementations diverge a bit.
I'm only proposing to use a signal when it's available, on UNIX. So there would be multiple implementations of the function, depending on the ability to get notified on completion without polling.
On Windows, maybe we could use a dedicated thread to set an event once WaitForSingleObject/WaitForMultipleObjects completes?
The design of my bpo-35479 change is to replace polling with one or multiple events. Maybe we can use an event to wake up _worker_handler() when something happens, but have different ways to signal this event.
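A rough sketch of that event-based shape, under the assumption that a dedicated watcher thread signals the event. Here the watcher blocks on process sentinels; on UNIX the signaling side could be something else. The names watch_exits and reap_all are hypothetical and none of this is the actual bpo-35479 code.

```python
import multiprocessing
import threading
from multiprocessing.connection import wait


def short_task():
    """Stand-in for a worker that exits right after a single task."""
    pass


def watch_exits(processes, wakeup):
    """Watcher thread body: set `wakeup` each time a process exits."""
    pending = {p.sentinel for p in processes}
    while pending:
        pending.difference_update(wait(list(pending)))
        wakeup.set()


def reap_all(processes, timeout=10.0):
    """Handler loop: sleep on an event instead of polling every 100 ms.

    Returns the number of times the handler was woken up.
    """
    wakeup = threading.Event()
    threading.Thread(
        target=watch_exits, args=(processes, wakeup), daemon=True
    ).start()

    alive = {p.sentinel: p for p in processes}
    wakeups = 0
    while alive:
        if not wakeup.wait(timeout):
            raise TimeoutError("no worker exited within the timeout")
        wakeup.clear()
        wakeups += 1
        # Re-check after clearing, so an exit that raced with clear()
        # is either seen now or triggers another wakeup.
        for sentinel in wait(list(alive), timeout=0):
            alive.pop(sentinel).join()
    return wakeups


if __name__ == "__main__":
    workers = [multiprocessing.Process(target=short_task) for _ in range(3)]
    for p in workers:
        p.start()
    print("handler wakeups:", reap_all(workers))
```

The handler only knows about the event, so the way the event gets signaled can differ per platform without changing the handler loop.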
I have to investigate how Process.sentinel can be used here.
It might be interesting to use asyncio internally, but I'm not sure if it's possible ;-)
Look how concurrent.futures uses it:
This also means: