Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

socketserver.ForkingMixIn.server_close() leaks zombie processes #75334

Closed
vstinner opened this issue Aug 9, 2017 · 13 comments
Closed

socketserver.ForkingMixIn.server_close() leaks zombie processes #75334

vstinner opened this issue Aug 9, 2017 · 13 comments
Labels
3.7 performance tests

Comments

@vstinner
Copy link
Member

@vstinner vstinner commented Aug 9, 2017

BPO 31151
Nosy @vstinner, @ned-deily
PRs
  • #3057
  • #5417
  • Files
  • forkingmixin_sleep.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2018-05-29.22:01:13.402>
    created_at = <Date 2017-08-09.00:40:39.921>
    labels = ['3.7', 'tests', 'performance']
    title = 'socketserver.ForkingMixIn.server_close() leaks zombie processes'
    updated_at = <Date 2018-05-29.22:01:13.401>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2018-05-29.22:01:13.401>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-05-29.22:01:13.402>
    closer = 'vstinner'
    components = ['Tests']
    creation = <Date 2017-08-09.00:40:39.921>
    creator = 'vstinner'
    dependencies = []
    files = ['47072']
    hgrepos = []
    issue_num = 31151
    keywords = ['patch']
    message_count = 13.0
    messages = ['299959', '299960', '300074', '300084', '300106', '302013', '302015', '303041', '311015', '311022', '311093', '316834', '318116']
    nosy_count = 2.0
    nosy_names = ['vstinner', 'ned.deily']
    pr_nums = ['3057', '5417']
    priority = None
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue31151'
    versions = ['Python 3.6', 'Python 3.7']

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Aug 9, 2017

    It seems like test_socketserver leaks child processes on the "x86 Gentoo Refleaks 3.6" buildbot, but I'm unable to reproduce the bug.

    http://buildbot.python.org/all/builders/x86%20Gentoo%20Refleaks%203.6/builds/49/steps/test/logs/stdio

    0:48:11 load avg: 3.91 [154/405/2] test_socketserver failed -- running: test_decimal (630 sec)
    Warning -- reap_children() reaped child process 1044
    beginning 6 repetitions
    123456
    Warning -- reap_children() reaped child process 1104
    Warning -- reap_children() reaped child process 1115
    .Warning -- reap_children() reaped child process 1170
    Warning -- reap_children() reaped child process 1175
    Warning -- reap_children() reaped child process 1184
    .Warning -- reap_children() reaped child process 1249
    .Warning -- reap_children() reaped child process 1311
    Warning -- reap_children() reaped child process 1316
    ...
    test_socketserver leaked [1, 1, 1] memory blocks, sum=3

    (...)

    test_ForkingUnixStreamServer (test.test_socketserver.SocketServerTest) ... creating server
    ADDR = /tmp/unix_socket.hqh5x95a
    CLASS = <class 'test.test_socketserver.ForkingUnixStreamServer'>
    server running
    test client 0
    test client 1
    test client 2
    waiting for server
    done
    Warning -- reap_children() reaped child process 17938
    ok

    test_ForkingUnixDatagramServer (test.test_socketserver.SocketServerTest) ... creating server
    ADDR = /tmp/unix_socket.gry6ulhp
    CLASS = <class 'test.test_socketserver.ForkingUnixDatagramServer'>
    server running
    test client 0
    test client 1
    test client 2
    waiting for server
    done
    Warning -- reap_children() reaped child process 18212
    ok

    test_ForkingUDPServer (test.test_socketserver.SocketServerTest) ... creating server
    ADDR = ('127.0.0.1', 43415)
    CLASS = <class 'socketserver.ForkingUDPServer'>
    server running
    test client 0
    test client 1
    test client 2
    waiting for server
    done
    Warning -- reap_children() reaped child process 18281
    ok
    test_ForkingUnixDatagramServer (test.test_socketserver.SocketServerTest)

    @vstinner vstinner added tests performance labels Aug 9, 2017
    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Aug 9, 2017

    Ah, the master branch is also affected. Example:

    http://buildbot.python.org/all/builders/x86%20Gentoo%20Refleaks%203.x/builds/47/steps/test/logs/stdio

    1:35:05 load avg: 4.64 [212/406] test_socketserver passed -- running: test_subprocess (370 sec)
    beginning 6 repetitions
    123456
    Warning -- reap_children() reaped child process 4641
    .Warning -- reap_children() reaped child process 4708
    ..Warning -- reap_children() reaped child process 4832
    .Warning -- reap_children() reaped child process 4916
    Warning -- reap_children() reaped child process 4921
    ..

    @vstinner vstinner added the 3.7 label Aug 9, 2017
    @vstinner vstinner changed the title [3.6] test_socketserver: Warning -- reap_children() reaped child process test_socketserver: Warning -- reap_children() reaped child process Aug 9, 2017
    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Aug 10, 2017

    The problem is that socketserver.ForkinMixin doesn't wait until all children completes. It's only calls os.waitpid() in non-blocking module (using os.WNOHANG) after each loop iteration. If a child process completes after the last call to ForkingMixIn.collect_children(), the server leaks zombie processes.

    The server must wait until all children completes. Attached PR implements that.

    The bug was be reproduced with the attached forkingmixin_sleep.patch.

    haypo@selma$ ./python -m test -v -u all test_socketserver --fail-env-changed -m '*Fork*'
    (...)
    Warning -- reap_children() reaped child process 17093
    Warning -- reap_children() reaped child process 17094
    (...)

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Aug 10, 2017

    New changeset aa8ec34 by Victor Stinner in branch 'master':
    bpo-31151: Add socketserver.ForkingMixIn.server_close() (bpo-3057)
    aa8ec34

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Aug 10, 2017

    bpo-31010 has been marked as a duplicate of this issue.

    @vstinner vstinner changed the title test_socketserver: Warning -- reap_children() reaped child process socketserver.ForkingMixIn.server_close() leaks zombie processes Aug 18, 2017
    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Sep 13, 2017

    I created a thread on the python-dev mailing list to discuss this issue:
    https://mail.python.org/pipermail/python-dev/2017-August/148826.html

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Sep 13, 2017

    I tag this issue as release blocker as a remainder that we have to agree how to handle threads/processes before Python 3.7 feature freeze.

    See also bpo-31233: socketserver.ThreadingMixIn leaks running threads after server_close().

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Sep 26, 2017

    I proposed the bpo-31593 to fix the issue differently in Python 3.6 (and maybe also Python 2.7).

    @ned-deily
    Copy link
    Member

    @ned-deily ned-deily commented Jan 28, 2018

    What's the status of this? Feature freeze is here!

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Jan 28, 2018

    What's the status of this? Feature freeze is here!

    This issue is very to https://bugs.python.org/issue31233#msg311021 as it's status:

    • The commit aa8ec34 is backward incompatible and it's not documented
    • There is no option to opt-in for the previous behaviour (exit without waiting for child processes)

    For beta1, IMHO it's ok to keep the current status. But I would like to see this issue fixed before 3.7 final. Either revert my change, or do something else.

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented Jan 29, 2018

    New changeset db8189b by Victor Stinner in branch 'master':
    bpo-31233, bpo-31151: Document socketserver changes (bpo-5417)
    db8189b

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented May 16, 2018

    I created bpo-33540: "socketserver: Add an opt-in option to get Python 3.6 behaviour on server_close()". I set the release blocker priority on this new issue.

    @vstinner
    Copy link
    Member Author

    @vstinner vstinner commented May 29, 2018

    The bug has been fixed in Python 3.7.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 performance tests
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants