multiprocessing with maxtasksperchild can hang if unpickling causes import #93580

Closed
bmerry opened this issue Jun 7, 2022 · 4 comments
Labels: topic-multiprocessing, type-bug

Comments

@bmerry
Contributor

bmerry commented Jun 7, 2022

Bug report

This seems like another specific instance of the general issue identified in #50970.

If multiprocessing.Pool.map_async is used with maxtasksperchild and a value returned by a task is of a class not currently imported by the calling process, it can lead to a hang. Here is an example that reliably hangs for me, but which exits cleanly if ElementTree is imported at the top level.

#!/usr/bin/env python

import os
import multiprocessing

def worker(num: int):
    from xml.etree.ElementTree import ElementTree
    print(f"Worker {num} with pid {os.getpid()}")
    return ElementTree()

def main(cores: int = 4, num: int = 6):
    pool = multiprocessing.Pool(processes=cores, maxtasksperchild=1)
    barList = list(pool.map_async(worker, list(range(num))).get())
    print(barList)

if __name__ == "__main__":
    main()

Running py-spy dump on one of the workers shows this backtrace:

Process 47102: python ./demo_core.py
Python v3.10.4 (/usr/bin/python3.10)

Thread 47102 (idle): "Thread-1 (_handle_workers)"
    acquire (<frozen importlib._bootstrap>:120)
    __enter__ (<frozen importlib._bootstrap>:171)
    _find_and_load (<frozen importlib._bootstrap>:1024)
    worker (demo_core.py:8)
    mapstar (multiprocessing/pool.py:48)
    worker (multiprocessing/pool.py:125)
    run (multiprocessing/process.py:108)
    _bootstrap (multiprocessing/process.py:315)
    _launch (multiprocessing/popen_fork.py:71)
    __init__ (multiprocessing/popen_fork.py:19)
    _Popen (multiprocessing/context.py:277)
    start (multiprocessing/process.py:121)
    _repopulate_pool_static (multiprocessing/pool.py:326)
    _maintain_pool (multiprocessing/pool.py:337)
    _handle_workers (multiprocessing/pool.py:513)
    run (threading.py:946)
    _bootstrap_inner (threading.py:1009)
    _bootstrap (threading.py:966)

My guess (without any further proof) is that the main process receives a pickled ElementTree and starts importing the module. Concurrently, another thread realises it needs to start a new worker, so does a fork(). The child process has a half-imported, locked ElementTree module, and tries to import it again, leading to a deadlock.
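The suspected mechanism can be illustrated outside the import system. The sketch below is an analogy under stated assumptions, not the actual importlib code path: a plain `threading.Lock` stands in for the per-module import lock, and the helper name `child_can_acquire_inherited_lock` is hypothetical. A background thread holds the lock while the main thread calls `os.fork()`. The child inherits the lock in its locked state but not the thread that would release it, so a blocking acquire would never return. A timeout is used so that the demonstration terminates. It is POSIX-only, and newer Python versions may emit a DeprecationWarning about forking a multi-threaded process.

```python
import os
import threading


def child_can_acquire_inherited_lock(timeout: float = 0.5) -> bool:
    """Fork while another thread holds a lock; report whether the child got it."""
    lock = threading.Lock()      # stand-in for a module import lock (assumption)
    held = threading.Event()     # set once the background thread owns the lock
    release = threading.Event()  # tells the background thread to let go

    def holder() -> None:
        with lock:
            held.set()
            release.wait()

    thread = threading.Thread(target=holder)
    thread.start()
    held.wait()  # from here on the lock is definitely held by the other thread

    pid = os.fork()
    if pid == 0:
        # Child process: only the forking thread exists here. The lock was
        # copied in its locked state and nothing is left to release it.
        acquired = lock.acquire(timeout=timeout)
        os._exit(0 if acquired else 1)

    # Parent process: collect the child's verdict, then clean up the thread.
    _, status = os.waitpid(pid, 0)
    release.set()
    thread.join()
    return os.WEXITSTATUS(status) == 0
```

With a blocking `acquire()` instead of a timed one, the child would wait forever, which matches the shape of the backtrace above.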

Note that this is nothing to do with ElementTree - I get the same behaviour with numpy. I chose ElementTree as a reasonably complex module (to maximise the window for the race condition) with a picklable class.
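The top-level import workaround mentioned earlier can be generalised to any module whose classes may appear in task results. This is a minimal sketch: the helper name `preimport` is hypothetical, and it only narrows the window described above, since it assumes you know in advance which modules unpickling will need.

```python
import importlib


def preimport(module_names):
    """Import each named module in the calling (parent) process."""
    return [importlib.import_module(name) for name in module_names]


# Hypothetical usage: call this before creating the Pool, so that unpickling
# a result never triggers a first-time import while a worker is being forked.
modules = preimport(["xml.etree.ElementTree"])
```

Any module that is only imported lazily during unpickling and is not listed would still be exposed to the race.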

Personally I consider the fork model of multiprocessing dangerous: it requires care to ensure all worker processes are created before doing anything that could conceivably create threads, and it is definitely a bad combination with maxtasksperchild. So I won't shed any tears if the resolution is "won't fix, don't do that". But #50970 (comment) seems to suggest that @vstinner has some appetite for addressing such issues and hence I'm filing this.

Your environment

  • CPython versions tested on: 3.8.10, 3.10.4
  • Operating system and architecture: Ubuntu 20.04, x86_64
bmerry added the type-bug label on Jun 7, 2022
@vstinner
Member

vstinner commented Jun 7, 2022

> But #50970 (comment) seems to suggest that @vstinner has some appetite for addressing such issues and hence I'm filing this.

I'm not interested in fixing such an issue.

@bmerry
Contributor Author

bmerry commented Jun 7, 2022

> > But #50970 (comment) seems to suggest that @vstinner has some appetite for addressing such issues and hence I'm filing this.

> I'm not interested in fixing such an issue.

Ok. Should I close it, or leave it in case someone else has an idea for fixing it?

@kumaraditya303
Contributor

I don't think this is worth fixing as fork is dangerous with threads.

kumaraditya303 closed this as not planned on Jun 15, 2022
@bmerry
Contributor Author

bmerry commented Jun 15, 2022

> I don't think this is worth fixing as fork is dangerous with threads.

Perhaps maxtasksperchild with a fork context should be deprecated, since it's pretty much impossible to use safely?
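For anyone who needs maxtasksperchild today, one option is to request a non-fork start method explicitly. The sketch below is an assumption rather than a fix confirmed in this thread: the helper name `non_fork_context` is hypothetical, and the reasoning is that with "forkserver" or "spawn" the replacement workers are not forked from the multi-threaded parent, so they should not inherit a half-finished import.

```python
import multiprocessing


def non_fork_context(preferred=("forkserver", "spawn")):
    """Return the first available multiprocessing context that is not 'fork'."""
    available = multiprocessing.get_all_start_methods()
    for method in preferred:
        if method in available:
            return multiprocessing.get_context(method)
    raise RuntimeError("no non-fork start method available")


# Hypothetical usage, with `worker` defined at module level so that the
# child processes can import it:
#
# if __name__ == "__main__":
#     ctx = non_fork_context()
#     with ctx.Pool(processes=4, maxtasksperchild=1) as pool:
#         results = pool.map(worker, range(6))
```

The trade-off is slower worker start-up, and the main module must be importable by the children, hence the `if __name__ == "__main__":` guard.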
