Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start the deprecation cycle for subprocess preexec_fn #82616

Open
gpshead opened this issue Oct 10, 2019 · 19 comments
Open

Start the deprecation cycle for subprocess preexec_fn #82616

gpshead opened this issue Oct 10, 2019 · 19 comments
Assignees
Labels
3.13 new features, bugs and security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@gpshead
Copy link
Member

gpshead commented Oct 10, 2019

BPO 38435
Nosy @gpshead, @vstinner, @diekhans, @markmentovai, @serhiy-storchaka, @izbyshev
PRs
  • gh-82616: Add process_group support to subprocess.Popen #23930
  • bpo-38435: Detect preexec_fn=os.setsid as start_new_session=True #23936
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/gpshead'
    closed_at = None
    created_at = <Date 2019-10-10.18:18:17.382>
    labels = ['type-feature', 'library', '3.11']
    title = 'Start the deprecation cycle for subprocess preexec_fn'
    updated_at = <Date 2022-03-16.16:54:37.730>
    user = 'https://github.com/gpshead'

    bugs.python.org fields:

    activity = <Date 2022-03-16.16:54:37.730>
    actor = 'markmentovai'
    assignee = 'gregory.p.smith'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2019-10-10.18:18:17.382>
    creator = 'gregory.p.smith'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 38435
    keywords = ['patch']
    message_count = 18.0
    messages = ['354397', '354404', '354407', '354439', '383708', '383720', '383724', '383725', '383726', '383727', '383728', '383736', '383740', '383863', '383988', '400498', '403026', '415352']
    nosy_count = 6.0
    nosy_names = ['gregory.p.smith', 'vstinner', 'diekhans', 'markmentovai', 'serhiy.storchaka', 'izbyshev']
    pr_nums = ['23930', '23936']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue38435'
    versions = ['Python 3.11']

    @gpshead
    Copy link
    Member Author

    gpshead commented Oct 10, 2019

    subprocess's preexec_fn feature needs to enter PendingDeprecationWarning state in 3.9, to become a DeprecationWarning in 3.10, with a goal of removing it in 3.11.

    Rationale: We now live in a world full of threads, it is entirely unsafe to call back into the python interpreter within the forked child before exec per POSIX specification.

    We've also already made preexec_fn no longer supported from CPython subinterpreters in 3.8.

    If there are not already issues open for desired features of subprocess that do not yet have replacements or workarounds for *specific* actions that preexec_fn is being used for in your application, please open feature requests for those. (ex: calling umask is https://bugs.python.org/issue38417, and group, uid, gid setting has already landed in 3.9)

    @gpshead gpshead added the 3.9 only security fixes label Oct 10, 2019
    @gpshead gpshead self-assigned this Oct 10, 2019
    @gpshead gpshead added the type-feature A feature request or enhancement label Oct 10, 2019
    @vstinner
    Copy link
    Member

    What is the recommanded way to replace preexec_fn?

    @gpshead
    Copy link
    Member Author

    gpshead commented Oct 10, 2019

    With task specific arguments. cwd, start_new_session, group, extra_groups,
    user, etc..

    We cannot provide a general do everything replacement and should not try.
    It not possible.

    @vstinner
    Copy link
    Member

    We cannot provide a general do everything replacement and should not try. It not possible.

    Well, I proposed a solution at:
    https://bugs.python.org/issue38417#msg354242

    I know that this solution has multiple flaws, but a bad solution may be better than no solution: breaking applications when upgrading to Python 3.11.

    @diekhans
    Copy link
    Mannequin

    diekhans mannequin commented Dec 24, 2020

    calling setpgid() is a common post-fork task that would need to be an explicit parameter to Popen when preexec_fn goes away

    @gpshead gpshead added 3.10 only security fixes and removed 3.9 only security fixes labels Dec 25, 2020
    @gpshead
    Copy link
    Member Author

    gpshead commented Dec 25, 2020

    PR up to add setpgid support. From what I've come across, some setpgid() users can use setsid() already available via start_new_session= instead. But rather than getting into the differences between those, making both available makes sense to allow for anyone's case where setsid() isn't desired.

    @gpshead
    Copy link
    Member Author

    gpshead commented Dec 25, 2020

    https://bugs.python.org/issue42736 filed to track adding Linux prctl() support.

    @gpshead
    Copy link
    Member Author

    gpshead commented Dec 25, 2020

    Another preexec_fn use to consider:

     resource.setrlimit(resource.RLIMIT_CORE, (XXX, XXX))

    Using an intermediate shell script wrapper that changes the rlimit and exec's the actual process is also an alternative.

    @gpshead
    Copy link
    Member Author

    gpshead commented Dec 25, 2020

    I'm also seeing a lot of os.setpgrp() calls, though those are more likely able to use start_new_session to do setsid() as a dropin replacement.

    @gpshead
    Copy link
    Member Author

    gpshead commented Dec 25, 2020

    signal.signal use case:

    Calls to signal.signal(x, y) to sometimes to set a non SIG_DFL behavior on a signal. SIGINT -> SIG_IGN for example.

    I see a lot of legacy looking code calling signal.signal in prexec_fn that appears to set SIG_DFL for the signals that Python otherwise modifies. Which restore_signals=True should already be doing.

    @gpshead
    Copy link
    Member Author

    gpshead commented Dec 25, 2020

    Doing the code inspection of existing preexec_fn= users within our codebase at work is revealing. But those seem to be the bulk of uses.

    I expect this deprecation to take a while. Ex: if we mark it as PendingDeprecationWarning in 3.10, I'd still wait until _at least_ 3.13 to remove it.

    Code using it often has a long legacy and may be written to run on a wide variety of Python versions. It's painful to do so when features you need in order to stop using it are still only in modern versions.

    @gpshead gpshead added the stdlib Python modules in the Lib dir label Dec 25, 2020
    @vstinner
    Copy link
    Member

    Using an intermediate shell script wrapper that changes the rlimit and exec's the actual process is also an alternative.

    IMO using Python is more portable than relying on a shell.

    I dislike relying on a shell since shells are not really portable (behave differently), unless you restrict yourself to a strict POSIX subset of the shell programming language. While '/bin/sh' is available on most Unix, Android uses '/system/bin/sh', and Windows and VxWorks have no shell (Windows provides cmd.exe which uses Batch programming language, and there are other scripting languages like VBS or PowerShell: so you need a complete different implementation for Windows).

    For the oslo.concurrency project, I wrote the Python script prlimit.py: a wrapper calling resource.setrlimit() and then execute a new program. It's similar to the Unix prlimit program, but written in Python to be portable (the "prlimit" program is not available on all platforms).

    https://github.com/openstack/oslo.concurrency/blob/master/oslo_concurrency/prlimit.py

    I suggest to not provide a builtin wrapper to replace preexec_fn, but suggest replacements in the subprocess and What's New in Python 3.11 documentation (explain how to port existing code).

    More generally, the whole preeexc_fn feature could be reimplemented a third-party project by spawning a *new* Python process, run the Python code, and *then* execute the final process. The main feature of preexec_fn is to give the ability to run a function of the current process, whereas what I'm discussing would be code written as a string.

    --

    preexec_fn can be used for non-trivial issues like only sharing some Windows HANDLE, see:
    https://www.python.org/dev/peps/pep-0446/#only-inherit-some-handles-on-windows

    Note: This specific problem has been solved the proper way in Python by adding support for PROC_THREAD_ATTRIBUTE_HANDLE_LIST in subprocess.STARTUPINFO: lpAttributeList['handle_list'] parameter.

    @vstinner
    Copy link
    Member

    I just created bpo-42738: "subprocess: don't close all file descriptors by default (close_fds=False)".

    @gpshead
    Copy link
    Member Author

    gpshead commented Dec 27, 2020

    "using Python is more portable than relying on a shell."

    Not in environments I use. :) There isn't an installed python interpreter that can be executed when deploying Python as an embedded interpreter such as anyone using pyoxidizer or similar. Plus "using python" means adding a Python startup time delay to anything that triggered such an action. That added latency isn't acceptable in some situations.

    When I suggest a workaround for something as involving an intermediate shell script, read that to mean "the user needs an intermediate program to do this complicated work for them - how is up to them - we aren't going to provide it from the stdlib". A shell script is merely one easy pretty-fast solution - in environments where that is possible.

    TL;DR - there's no one size fits all solution here. But third party libraries could indeed implement any/all of these options including abstracting how and what gets used when if someone wanted to do that.

    @serhiy-storchaka
    Copy link
    Member

    Would not be more consistent with other parameters to name the new parameter "pgid" or "process_group"?

    And why not use None as default, like for user and group?

    @gpshead
    Copy link
    Member Author

    gpshead commented Aug 28, 2021

    A worthwhile general suggestion on a new path forward for the mess of things between (v)fork+exec from Victor is over in https://bugs.python.org/issue42736#msg383869

    TL;DR creating a subprocess.Preexec() recording object with specific interfaces for things that can be done, an instance of which gets passed in and the recorded actions are done as appropriate.

    @gpshead gpshead added 3.11 bug and security fixes and removed 3.10 only security fixes labels Aug 28, 2021
    @gpshead
    Copy link
    Member Author

    gpshead commented Oct 1, 2021

    Another use case someone had for preexec_fn popped up today:

     prctl(PR_SET_PDEATHSIG, SIGTERM)

    @markmentovai
    Copy link
    Mannequin

    markmentovai mannequin commented Mar 16, 2022

    Another use case for preexec_fn: establishing a new controlling terminal, typically in conjunction with start_new_session=True. A preexec_fn may do something like

    os.close(os.open(os.ttyname(sys.stdin.fileno()), os.O_RDWR))
    

    with discussion at https://chromium-review.googlesource.com/c/chromium/src/+/3524204/comments/59f03e7c_f103cd7e.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    gpshead added a commit that referenced this issue May 5, 2022
    One more thing that can help prevent people from using `preexec_fn`.
    
    Also adds conditional skips to two tests exposing ASAN flakiness on the Ubuntu 20.04 Address Sanitizer Github CI system. When that build is run on more modern systems the "problem" does not show up. It seems ASAN implementation related.
    
    Co-authored-by: Zackery Spytz <zspytz@gmail.com>
    Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
    @gpshead
    Copy link
    Member Author

    gpshead commented May 5, 2022

    calling setpgid() is a common post-fork task that would need to be an explicit parameter to Popen

    Landed as process_group= on Popen for 3.11.

    vstinner added a commit that referenced this issue May 27, 2022
    _posixsubprocess: add a static assertion to ensure that the pid_t
    type is signed.
    
    Replace _Py_IntegralTypeSigned() with _Py_IS_TYPE_SIGNED().
    @gpshead gpshead added 3.13 new features, bugs and security fixes and removed 3.11 bug and security fixes labels Sep 22, 2023
    vain added a commit to bundlewrap/bundlewrap that referenced this issue Nov 10, 2023
    Not very relevant for performance here, but preexec_fn is about to be
    deprecated, so let's just remove it while we're at it.
    
    python/cpython#82616
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.13 new features, bugs and security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants