multiprocessing.Pool.imap iterators do not keep the multiprocessing.Pool objects alive #79559
After applying the PRs in bpo-34172, multiprocessing.Pool.imap hangs on macOS and Linux. This is a simple reproducer:

```python
import multiprocessing

def the_test():
    print("Begin")
    for x in multiprocessing.Pool().imap(int, ["4", "3"]):
        print(x)
    print("End")

the_test()
```

This happens because the IMapIterator does not keep the multiprocessing.Pool object alive while the iterator itself is still alive.
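For contrast, here is a minimal sketch (not part of the original report) of the usage pattern that does not depend on the iterator keeping the pool alive: holding an explicit reference to the pool, here via a with block, for as long as the iterator is being consumed.

```python
import multiprocessing

def the_test():
    print("Begin")
    # The with block keeps a strong reference to the pool until iteration
    # finishes, so the pool cannot be collected mid-iteration.
    with multiprocessing.Pool() as pool:
        for x in pool.imap(int, ["4", "3"]):
            print(x)
    print("End")

if __name__ == "__main__":
    the_test()
```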
It's important to note that before those PRs, that code would leak the Pool instance until the process ended (once per call). This is my proposed fix (until I get it into a PR).
I was already working on a PR. Would you prefer that we wait for yours instead?
I don't mind. I think my code is ready for review, but I'm not well versed in this, so if you think you have something better, feel free to open a PR, or tell me if I should submit mine and you can comment on it:
I think the code is almost OK (I also prefer to use the cache as an excuse to keep the pool alive), but the test needs to be written a bit more carefully to avoid hanging the test suite in case of failure and to avoid leaking threads or processes.
Let's step back a bit here. This kind of code has never been supported. As Victor says, we should be careful not to add any potential sources of reference cycles. The reason the code originally "worked" is that it actually leaked the Pool object (and presumably its worker threads and processes) until the end of the application.
Although I think keeping the pool alive from the iterator is not a bad solution if done correctly, I am more and more convinced that it is not the best one. @antoine, would you be OK with passing a weak reference to the iterator and raising if the pool is dead? I still think we should try to avoid hanging.
In general, I'm fine with adding a *weak* reference, especially if it helps to detect bugs.
I think a weakref is fine. You don't have to *pass* a weakref: just pass a normal ref and let the Result object take a weakref (and a reference to the cache perhaps).
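A minimal sketch of that idea, assuming a hypothetical result class (the names `_SketchResult` and `_check_pool` are illustrative, not CPython's actual code):

```python
import weakref

class _SketchResult:
    """Illustrative stand-in for ApplyResult/IMapIterator."""

    def __init__(self, pool, cache):
        # The caller passes the pool normally; the result object only keeps
        # a weak reference to it, plus a strong reference to the cache so
        # pending results are not dropped.
        self._pool = weakref.ref(pool)
        self._cache = cache

    def _check_pool(self):
        # Raise instead of hanging if the pool has already been collected.
        if self._pool() is None:
            raise RuntimeError("the pool that produced this result is gone")
```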
See https://bugs.python.org/issue34172#msg331216 for a more general discussion about how the multiprocessing API is supposed to be used and about the lifetime of multiprocessing objects.
OK, I will modify my PR to pass a weak reference and raise.
I agree about making the multiprocessing API safer to use, but please let's have this discussion on a dedicated bug entry.
I am playing with weak references inside the iterator objects, but this may not be enough. For example, take the code of ApplyResult.get:

```python
def get(self, timeout=None):
    if self._pool() is None:
        raise RuntimeError("The pool is dead!")  # <--- new code
    self.wait(timeout)
```

It can be that the pool is alive when we check for it (self._pool() is None) but, while the code is waiting with no timeout, the pool dies, effectively leaving the program deadlocked with no error.
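As a hedged sketch only (this is not the fix that was eventually adopted), a race-aware version of that check would have to wait in short slices and re-test the weak reference each time, which shows how much extra machinery the weakref approach drags in; the 0.1 second slice and the use of multiprocessing.TimeoutError here are illustrative choices.

```python
import time
import multiprocessing

def get(self, timeout=None):
    # Sketch of a polling variant of ApplyResult.get(): re-check the weak
    # reference on every pass so a pool that dies mid-wait raises instead
    # of hanging forever.
    deadline = None if timeout is None else time.monotonic() + timeout
    while not self.ready():
        if self._pool() is None:
            raise RuntimeError("The pool is dead!")
        if deadline is not None and time.monotonic() >= deadline:
            raise multiprocessing.TimeoutError
        self.wait(0.1)  # wake up periodically to re-check the pool
    # ... the real method would now return the value or re-raise the error ...
```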
Also, writing tests that do not leak threads for this case is very difficult: if we go with weakrefs, the test *has* to leak a pool (the pool is dead but join/close is never called) in order to check that using the iterator raises the exception. Also, getting such a test race-free is going to be challenging.
See also bpo-35478: multiprocessing: ApplyResult.get() hangs if the pool is terminated.
bpo-35629 has been marked as a duplicate of this issue.
I have been playing with possible solutions for a while and the weak-reference solution does not seem robust enough: there are too many potential race conditions between the destruction of the weakly referenced pool and the handling code. I would advocate again for using a strong reference. The reasons are:
+1
According to all previous discussions, I now agree with having a strong reference. |
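A minimal sketch of the strong-reference approach being agreed on here (the class and attribute names are illustrative, not the actual patch):

```python
class _SketchIMapIterator:
    """Illustrative stand-in for IMapIterator holding a strong reference."""

    def __init__(self, pool):
        # A plain (strong) reference: as long as this iterator is reachable,
        # the pool and its workers stay alive, so iteration cannot deadlock
        # because the pool was garbage-collected mid-iteration.
        self._pool = pool

    def __next__(self):
        # ... fetch the next result; self._pool is guaranteed to still exist ...
        raise NotImplementedError("sketch only")
```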
Thanks Pablo for the fix ;-) |