9 changes: 9 additions & 0 deletions Doc/whatsnew/3.15.rst
@@ -321,6 +321,15 @@ collections.abc
   previously emitted if it was merely imported or accessed from the
   :mod:`!collections.abc` module.
 
+concurrent.futures
+------------------
+
+* Improved error reporting when a child process in a
+  :class:`concurrent.futures.ProcessPoolExecutor` terminates abruptly.
+  The resulting traceback will now tell you the PID and exit code of the
+  terminated process.
+  (Contributed by Jonathan Berg in :gh:`139486`.)
+
 dbm
 ---
 
15 changes: 13 additions & 2 deletions Lib/concurrent/futures/process.py
@@ -474,9 +474,20 @@ def _terminate_broken(self, cause):
         bpe = BrokenProcessPool("A process in the process pool was "
                                 "terminated abruptly while the future was "
                                 "running or pending.")
+        cause_str = None
         if cause is not None:
-            bpe.__cause__ = _RemoteTraceback(
-                f"\n'''\n{''.join(cause)}'''")
+            cause_str = ''.join(cause)
+        else:
+            # No cause known, synthesize from child process exitcodes
+            errors = []
Contributor

@YvesDup YvesDup Oct 7, 2025

Are there really cases where multiple processes fail together? If not, the errors list does not seem necessary. Otherwise, a test would be welcome.

Author

It's possible! Doing

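    import os
    import traceback
    from concurrent.futures import ProcessPoolExecutor, as_completed
    from concurrent.futures.process import BrokenProcessPool

    if __name__ == "__main__":  # guard needed under the spawn start method
        with ProcessPoolExecutor(max_workers=2) as executor:
            futures = [
                executor.submit(os._exit, 99),
                executor.submit(os._exit, 100),
            ]
            for future in as_completed(futures):
                try:
                    future.result()
                except BrokenProcessPool as e:
                    traceback.print_exception(e)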

sometimes gives me

Lib.concurrent.futures.process._RemoteTraceback: 
'''
Process 84477 terminated abruptly with exit code 100
Process 84478 terminated abruptly with exit code 99'''

But this is basically a race between the subprocesses terminating and when we build the traceback. Testing this in a non-flaky way would be tricky at best.

Author

My original concern when deciding to report every known termination was that we could potentially end up with a "real" failure and a "red herring" failure, and since we can't say for sure which happened first or which was more important, it would be safer to just dump all known failures into the traceback. And the total size of the traceback would be bounded by the number of processes that could terminate at the same time, i.e.

On Windows, max_workers must be less than or equal to 61. If it is not then ValueError will be raised. If max_workers is None, then the default chosen will be at most 61

Contributor

I can confirm that the number of failed processes caught is flaky. Sometimes there is only one.
I wonder whether we should add a brief comment about this. It's up to you.

+            for p in self.processes.values():
+                if p.exitcode is not None and p.exitcode != 0:
+                    errors.append(f"Process {p.pid} terminated abruptly "
+                                  f"with exit code {p.exitcode}")
+            if errors:
+                cause_str = "\n".join(errors)
+        if cause_str:
+            bpe.__cause__ = _RemoteTraceback(f"\n'''\n{cause_str}'''")
 
         # Mark pending tasks as failed.
         for work_id, work_item in self.pending_work_items.items():
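
The fallback branch in this hunk can be read in isolation as a small pure function. The sketch below is an illustration only, not the code in `process.py`: `synthesize_cause` is a hypothetical name, and `SimpleNamespace` objects stand in for the `multiprocessing.Process` instances held in `self.processes`.

```python
from types import SimpleNamespace


def synthesize_cause(processes):
    """Return one line per abnormally exited process, or None if there are none.

    Hypothetical helper mirroring the fallback branch of the diff; it is not
    part of concurrent.futures.
    """
    errors = []
    for p in processes:
        # exitcode is None while a process is still running and 0 on a clean exit.
        if p.exitcode is not None and p.exitcode != 0:
            errors.append(f"Process {p.pid} terminated abruptly "
                          f"with exit code {p.exitcode}")
    return "\n".join(errors) if errors else None


# Stand-ins for workers: one still running, one clean exit, two failures.
# multiprocessing reports a negative exit code when a signal killed the process.
workers = [
    SimpleNamespace(pid=101, exitcode=None),
    SimpleNamespace(pid=102, exitcode=0),
    SimpleNamespace(pid=103, exitcode=99),
    SimpleNamespace(pid=104, exitcode=-9),
]
print(synthesize_cause(workers))
```

Only the two workers with a non-zero exit code contribute a line, which is why the size of the message is bounded by the number of worker processes.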
15 changes: 15 additions & 0 deletions Lib/test/test_concurrent_futures/test_process_pool.py
@@ -106,6 +106,21 @@ def test_traceback(self):
         self.assertIn('raise RuntimeError(123) # some comment',
                       f1.getvalue())
 
+    def test_traceback_when_child_process_terminates_abruptly(self):
+        # gh-139462 enhancement - BrokenProcessPool exceptions
+        # should describe which process terminated.
+        exit_code = 99
+        with self.executor_type(max_workers=1) as executor:
+            future = executor.submit(os._exit, exit_code)
+            with self.assertRaises(BrokenProcessPool) as bpe:
+                future.result()
+
+        cause = bpe.exception.__cause__
+        self.assertIsInstance(cause, futures.process._RemoteTraceback)
+        self.assertIn(
+            f"terminated abruptly with exit code {exit_code}", cause.tb
+        )
+
     @warnings_helper.ignore_fork_in_thread_deprecation_warnings()
     @hashlib_helper.requires_hashdigest('md5')
     def test_ressources_gced_in_workers(self):
@@ -0,0 +1,3 @@
+When a child process in a :class:`concurrent.futures.ProcessPoolExecutor`
+terminates abruptly, the resulting traceback will now tell you the PID
+and exit code of the terminated process. Contributed by Jonathan Berg.
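
As a rough illustration of how calling code might surface this information, the sketch below formats a `BrokenProcessPool` together with its `__cause__` when one is attached. `describe_broken_pool` is a hypothetical helper, and the cause is built by hand here (a plain `RuntimeError` standing in for the real one) so the snippet runs without killing a worker. When no cause is known, `__cause__` is simply `None`.

```python
from concurrent.futures.process import BrokenProcessPool


def describe_broken_pool(exc):
    """Format a BrokenProcessPool plus any attached cause (hypothetical helper)."""
    cause = exc.__cause__
    if cause is None:
        return f"{exc} (no further details available)"
    # str() of the cause carries the per-process lines when they were recorded.
    return f"{exc}\nCaused by: {str(cause).strip()}"


# Hand-built stand-in for what future.result() would raise.
exc = BrokenProcessPool("A process in the process pool was terminated abruptly")
exc.__cause__ = RuntimeError("Process 1234 terminated abruptly with exit code 99")
print(describe_broken_pool(exc))
```

In real code, `exc` would come from calling `future.result()` inside an `except BrokenProcessPool` block.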