multiprocessing: the Resource Tracker process is never reaped #88887

Open
viktorivanov mannequin opened this issue Jul 23, 2021 · 2 comments
Labels
3.7 (EOL) end of life 3.9 only security fixes stdlib Python modules in the Lib dir topic-multiprocessing type-bug An unexpected behavior, bug, or error

Comments

@viktorivanov
Mannequin

viktorivanov mannequin commented Jul 23, 2021

BPO 44724
Nosy @moreati
Files
  • multi.py: Minimal program leaking resource tracker zombies
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-07-23.13:29:53.414>
    labels = ['3.7', 'library', '3.9', 'performance']
    title = 'multiprocessing: the Resource Tracker process is never reaped'
    updated_at = <Date 2021-11-24.23:24:39.596>
    user = 'https://bugs.python.org/viktorivanov'

    bugs.python.org fields:

    activity = <Date 2021-11-24.23:24:39.596>
    actor = 'Alex.Willmer'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2021-07-23.13:29:53.414>
    creator = 'viktor.ivanov'
    dependencies = []
    files = ['50173']
    hgrepos = []
    issue_num = 44724
    keywords = []
    message_count = 1.0
    messages = ['398053']
    nosy_count = 2.0
    nosy_names = ['Alex.Willmer', 'viktor.ivanov']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue44724'
    versions = ['Python 3.7', 'Python 3.9']

    @viktorivanov
    Mannequin Author

    viktorivanov mannequin commented Jul 23, 2021

    The process started by multiprocessing.resource_tracker is never reaped, which leaves zombie processes behind.

    There is a waitpid() call for the ResourceTracker's pid, but it lives in the private method _stop(), which appears to be called only from some test modules.
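
As a stopgap, a program could call that private method itself once it has finished using multiprocessing. The following is only a minimal sketch: the helper name stop_resource_tracker is hypothetical, and it assumes the module-level _resource_tracker object and its _stop() method exist in the running CPython version. Both are private internals and may change or be absent.

```python
from multiprocessing import resource_tracker


def stop_resource_tracker():
    """Stop and reap the resource tracker, if this interpreter started one.

    Relies on private CPython internals (_resource_tracker and _stop), so it
    may break between versions. Returns True if _stop() was called.
    """
    tracker = getattr(resource_tracker, "_resource_tracker", None)
    stop = getattr(tracker, "_stop", None)
    if stop is None:
        return False  # internals differ from what this sketch assumes
    stop()  # closes the tracker's pipe, then calls waitpid() on its pid
    return True
```

Calling stop_resource_tracker() is a no-op when no tracker was started. It should only run after all pools, shared memory blocks and semaphores have been cleaned up, because those rely on the tracker while they are alive.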

    Most environments have some process that reaps zombies. But if Python is the "main" process (PID 1) in a container, for example, and it runs another Python instance that leaks a ResourceTracker process, zombies start to accumulate.

    This is easy to reproduce with a couple of small Python programs, as long as they are not run from a shell or another parent process that takes care of forgotten children.

    It was originally discovered in a Docker container whose entry point is a Python program (a Celery worker in an Airflow container) that runs other Python programs (dbt).

    The minimal code is available on GitHub: https://github.com/viktorvia/python-multi-issue

    The attached multi.py leaks resource tracker processes, but simply running it from a full-fledged development environment will not show the issue.

    Instead, run it via another Python program from a Docker container:

    Dockerfile:
    ---
    FROM python:3.9

    WORKDIR /usr/src/multi

    COPY . ./

    CMD ["python", "main.py"]
    ---

    main.py:
    ---
    from subprocess import run
    from time import sleep

    while True:
        # Run the leaking program and show its output.
        result = run(["python", "multi.py"], capture_output=True)
        print(result.stdout.decode('utf-8'))
        # List the process tree so accumulated zombies are visible.
        result = run(["ps", "-ef", "--forest"], capture_output=True)
        print(result.stdout.decode('utf-8'), flush=True)
        sleep(1)
    ---

    When the program runs, it accumulates one zombie per iteration:
    ---

    $ docker run -it multi python main.py
    [1, 4, 9]

    UID  PID PPID  C STIME TTY   TIME     CMD
    root   1    0 11 11:33 pts/0 00:00:00 python main.py
    root   8    1  0 11:33 pts/0 00:00:00 [python] <defunct>
    root  17    1  0 11:33 pts/0 00:00:00 ps -ef --forest

    [1, 4, 9]

    UID  PID PPID  C STIME TTY   TIME     CMD
    root   1    0  6 11:33 pts/0 00:00:00 python main.py
    root   8    1  3 11:33 pts/0 00:00:00 [python] <defunct>
    root  19    1  0 11:33 pts/0 00:00:00 [python] <defunct>
    root  28    1  0 11:33 pts/0 00:00:00 ps -ef --forest

    [1, 4, 9]

    UID  PID PPID  C STIME TTY   TIME     CMD
    root   1    0  4 11:33 pts/0 00:00:00 python main.py
    root   8    1  1 11:33 pts/0 00:00:00 [python] <defunct>
    root  19    1  3 11:33 pts/0 00:00:00 [python] <defunct>
    root  30    1  0 11:33 pts/0 00:00:00 [python] <defunct>
    root  39    1  0 11:33 pts/0 00:00:00 ps -ef --forest

    [1, 4, 9]

    UID  PID PPID  C STIME TTY   TIME     CMD
    root   1    0  3 11:33 pts/0 00:00:00 python main.py
    root   8    1  1 11:33 pts/0 00:00:00 [python] <defunct>
    root  19    1  1 11:33 pts/0 00:00:00 [python] <defunct>
    root  30    1  4 11:33 pts/0 00:00:00 [python] <defunct>
    root  41    1  0 11:33 pts/0 00:00:00 [python] <defunct>
    root  50    1  0 11:33 pts/0 00:00:00 ps -ef --forest
    ---

    Running it from a shell script, or from another Python program that handles SIGCHLD by calling wait(), takes care of the zombies.
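
For illustration, here is a minimal sketch of how a parent such as main.py might reap re-parented children. The helper name reap_orphans is hypothetical and not part of the original report. Rather than installing a SIGCHLD handler, this variant polls with os.waitpid() and is meant to be called after each subprocess.run() returns, so that it does not compete with subprocess for the direct child's exit status.

```python
import os


def reap_orphans():
    """Reap every child that has already exited, without blocking.

    Returns the list of pids that were reaped.
    """
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG makes the call non-blocking.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```

In the reproduction above, calling reap_orphans() at the end of each loop iteration would collect any resource tracker that has already exited. A tracker that exits slightly later would be picked up on the next iteration.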

    @viktorivanov viktorivanov mannequin added 3.7 (EOL) end of life 3.9 only security fixes stdlib Python modules in the Lib dir performance Performance or resource usage labels Jul 23, 2021
    @vstinner vstinner changed the title Resource Tracker is never reaped multiprocessing: the Resource Tracker process is never reaped Sep 15, 2021
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @iritkatriel iritkatriel added type-bug An unexpected behavior, bug, or error and removed performance Performance or resource usage labels Sep 1, 2022
    @babaMar

    babaMar commented Nov 7, 2023

    The same happens with python3 -c 'from multiprocessing.semaphore_tracker import main;main(3)': when the parent exits, this process doesn't get reaped.
