Fork server doesn't flush stdout/stderr after preloading modules, potentially leaving buffered data to be inherited by child processes #135335

Closed
@duaneg

Bug report

Bug description:

Files

a/__init__.py:

import os
import time
print(f"init {os.getpid()} at {time.clock_gettime_ns(time.CLOCK_MONOTONIC)}")

repro.py:

import multiprocessing
import os
import time

if __name__ == '__main__':
    print(f"run main {os.getpid()}")
    multiprocessing.set_forkserver_preload(['a'])
    for _ in range(2):
        p = multiprocessing.Process()
        p.start()
        p.join()
else:
    print(f"re-import main {os.getpid()} at {time.clock_gettime_ns(time.CLOCK_MONOTONIC)}")

Reproduction

  1. Create a new package named a whose __init__.py has the contents shown above
  2. Run the repro.py script above, ensuring the directory containing the a package is on PYTHONPATH

Result

> ./python repro.py
run main 1056488
init 1056490 at 151009034069836
re-import main 1056491 at 151009045273212
re-import main 1056492 at 151009051787587
> ./python repro.py | tee /dev/null
run main 1056607
init 1056610 at 151113770440639
re-import main 1056611 at 151113781130002
init 1056610 at 151113770440639
re-import main 1056612 at 151113787814593
init 1056610 at 151113770440639

Expected

The output should be the same whether or not stdout is redirected: the init line should appear exactly once.

Analysis

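This is caused by the fork server preloading a module that writes to stdout and then not flushing stdout afterwards. When stdout is redirected to a pipe or file it is block-buffered rather than line-buffered, so the module's output is still sitting in the fork server's buffer when a child process is forked. Each child inherits a copy of that buffered data and spuriously outputs it when it flushes its own stdout. Note that #126631 prevents __main__ from being preloaded, so at present this cannot be triggered by printing from __main__.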
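
The mechanism can be illustrated without multiprocessing at all. The following is a minimal sketch, not the fork server's actual code: it uses a plain os.fork() as a stand-in for the fork server spawning a child, and the names CHILD_SCRIPT and count_lines are hypothetical helpers introduced only for this illustration. It assumes a POSIX system, since os.fork() is not available on Windows.

```python
import os
import subprocess
import sys
import textwrap

# Script run in a subprocess whose stdout is a pipe, and therefore block-buffered.
# The os.fork() call stands in for the fork server creating a child process.
CHILD_SCRIPT = textwrap.dedent("""
    import os
    import sys

    print("buffered before fork")  # stays in the buffer because stdout is a pipe
    if {flush}:
        sys.stdout.flush()         # empty the buffer before forking
    pid = os.fork()
    if pid == 0:
        # Child: a normal exit flushes its inherited copy of the buffer.
        sys.exit(0)
    os.waitpid(pid, 0)             # parent flushes its own copy when it exits
""")


def count_lines(flush_before_fork: bool) -> int:
    """Run the script with piped stdout and count how often the line appears."""
    script = CHILD_SCRIPT.format(flush=flush_before_fork)
    # Drop PYTHONUNBUFFERED so the subprocess really uses a buffered stdout.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        check=True,
        env=env,
    )
    return result.stdout.count("buffered before fork")


if __name__ == "__main__":
    print("without flush:", count_lines(False))
    print("with flush:   ", count_lines(True))
```

Without the flush the line should be emitted twice, once by the forked child and once by the parent. With the flush it should be emitted once, which is the behaviour the issue title asks for after preloading.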

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux
