Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_multiprocessing_spawn leaks QueueManager dangling processes #91231

Closed
vstinner opened this issue Mar 21, 2022 · 5 comments
Closed

test_multiprocessing_spawn leaks QueueManager dangling processes #91231

vstinner opened this issue Mar 21, 2022 · 5 comments
Labels
3.11 only security fixes tests Tests in the Lib/test dir

Comments

@vstinner
Copy link
Member

BPO 47075
Nosy @vstinner, @pablogsal, @erlend-aasland
PRs
  • bpo-47075: Add shutdown_timeout to multiprocessing BaseManager #32112
  • Files
  • manager_timeout.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2022-03-21.03:01:30.369>
    labels = ['tests', '3.11']
    title = 'test_multiprocessing_spawn leaks QueueManager dangling processes'
    updated_at = <Date 2022-03-25.11:40:41.516>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2022-03-25.11:40:41.516>
    actor = 'vstinner'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Tests']
    creation = <Date 2022-03-21.03:01:30.369>
    creator = 'vstinner'
    dependencies = []
    files = ['50694']
    hgrepos = []
    issue_num = 47075
    keywords = ['patch']
    message_count = 4.0
    messages = ['415650', '415651', '415652', '415996']
    nosy_count = 3.0
    nosy_names = ['vstinner', 'pablogsal', 'erlendaasland']
    pr_nums = ['32112']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue47075'
    versions = ['Python 3.11']

    @vstinner
    Copy link
    Member Author

    Apply attached manager_timeout.patch and run the command:

    ./python -m test --fail-env-changed test_multiprocessing_spawn -v -m test_rapid_restart -F -j4
    

    If you fail to reproduce the issue, replace -j4 with -j10 or use a larger number of concurrent jobs.

    Output:
    ---
    (...)
    0:00:00 load avg: 2.85 Run tests in parallel using 4 child processes
    0:00:06 load avg: 3.02 [ 1/1] test_multiprocessing_spawn failed (env changed)
    test_rapid_restart (test.test_multiprocessing_spawn.WithManagerTestManagerRestart) ... ok

    test_rapid_restart (test.test_multiprocessing_spawn.WithProcessesTestManagerRestart) ... ok
    Warning -- Dangling processes: {<SpawnProcess name='QueueManager-5' pid=438355 parent=438255 stopped exitcode=-SIGTERM>, <SpawnProcess name='QueueManager-7' pid=438395 parent=438255 stopped exitcode=-SIGTERM>}

    test_rapid_restart (test.test_multiprocessing_spawn.WithThreadsTestManagerRestart) ... ok
    Warning -- Dangling processes: {<SpawnProcess name='QueueManager-8' pid=438423 parent=438255 stopped exitcode=-SIGTERM>, <SpawnProcess name='QueueManager-9' pid=438456 parent=438255 stopped exitcode=-SIGTERM>}
    Warning -- Dangling processes: {<SpawnProcess name='QueueManager-8' pid=438423 parent=438255 stopped exitcode=-SIGTERM>, <SpawnProcess name='QueueManager-9' pid=438456 parent=438255 stopped exitcode=-SIGTERM>}

    ----------------------------------------------------------------------
    Ran 3 tests in 5.404s

    OK
    Kill <TestWorkerProcess #2 running test=test_multiprocessing_spawn pid=438256 time=6.9 sec> process group
    Kill <TestWorkerProcess #3 running test=test_multiprocessing_spawn pid=438259 time=6.9 sec> process group
    Kill <TestWorkerProcess #4 running test=test_multiprocessing_spawn pid=438260 time=6.9 sec> process group

    == Tests result: ENV CHANGED ==

    1 test altered the execution environment:
    test_multiprocessing_spawn

    (...)
    ---

    Examples of similar old issues:

    See also the old fixed bpo-31234 "Make support.threading_cleanup() stricter".

    @vstinner vstinner added 3.11 only security fixes tests Tests in the Lib/test dir labels Mar 21, 2022
    @vstinner
    Copy link
    Member Author

    BaseManager._finalize_manager() timeout should be configurable, the test should use the same timeout than support.wait_process() (SHORT_TIMEOUT), and maybe the default timeout should be increased?

    I propose to use the same timeout for the two process.join(timeout=xxx) calls (before/after terminate).

    @vstinner
    Copy link
    Member Author

    Recent failures.

    AMD64 FreeBSD Non-Debug 3.x:
    https://buildbot.python.org/all/#/builders/172/builds/1654
    ---
    Warning -- Dangling processes: {<SpawnProcess name='QueueManager-335' pid=15699 parent=14154 stopped exitcode=-SIGTERM>}
    Warning -- Dangling processes: {<SpawnProcess name='QueueManager-335' pid=15699 parent=14154 stopped exitcode=-SIGTERM>}

    1 test altered the execution environment:
    test_multiprocessing_spawn
    ---

    @vstinner
    Copy link
    Member Author

    vstinner commented Mar 25, 2022

    BaseManager._finalize_manager() timeout should be configurable

    I proposed #32112 for that.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    vstinner added a commit that referenced this issue Apr 19, 2022
    Add an optional keyword 'shutdown_timeout' parameter to the
    multiprocessing.BaseManager constructor. Kill the process if
    terminate() takes longer than the timeout.
    
    Multiprocessing tests pass test.support.SHORT_TIMEOUT
    to BaseManager.shutdown_timeout.
    vstinner added a commit that referenced this issue Apr 19, 2022
    Shutting down a multiprocessing BaseManager now waits for 1 second until
    the process completes, rather than 0.1 second, after the process is
    terminated.
    miss-islington pushed a commit to miss-islington/cpython that referenced this issue Apr 19, 2022
    …H-91701)
    
    Shutting down a multiprocessing BaseManager now waits for 1 second until
    the process completes, rather than 0.1 second, after the process is
    terminated.
    (cherry picked from commit a885f10)
    
    Co-authored-by: Victor Stinner <vstinner@python.org>
    miss-islington added a commit that referenced this issue Apr 19, 2022
    Shutting down a multiprocessing BaseManager now waits for 1 second until
    the process completes, rather than 0.1 second, after the process is
    terminated.
    (cherry picked from commit a885f10)
    
    Co-authored-by: Victor Stinner <vstinner@python.org>
    @vstinner
    Copy link
    Member Author

    vstinner commented Apr 21, 2022

    I proposed #32112 for that.

    Merged. On Python 3.9 and 3.10, I increased the timeout from 0.1 to 1.0 second (10x longer!):

    Let's say that it's enough for now.

    hello-adam pushed a commit to hello-adam/cpython that referenced this issue Jun 2, 2022
    …H-91701)
    
    Shutting down a multiprocessing BaseManager now waits for 1 second until
    the process completes, rather than 0.1 second, after the process is
    terminated.
    (cherry picked from commit a885f10)
    
    Co-authored-by: Victor Stinner <vstinner@python.org>
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes tests Tests in the Lib/test dir
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant