Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expose posix_spawn(p) #64303

Closed
benjaminp opened this issue Jan 1, 2014 · 70 comments
Closed

expose posix_spawn(p) #64303

benjaminp opened this issue Jan 1, 2014 · 70 comments
Assignees
Labels
3.8 (EOL) end of life extension-modules C modules in the Modules dir type-feature A feature request or enhancement

Comments

@benjaminp
Copy link
Contributor

BPO 20104
Nosy @gpshead, @vstinner, @benjaminp, @ned-deily, @alex, @dhduvall, @vadmium, @serhiy-storchaka, @pablogsal, @miss-islington
PRs
  • bpo-20104: Expose posix_spawn in the os module #5109
  • bpo-20104: Expose posix_spawn in the os module #5109
  • bpo-20104: Fix memory leaks and error handling in posix spawn #5418
  • bpo-20104: Improve error handling and fix a reference leak in os.posix_spawn(). #6332
  • bpo-20104: Add os.posix_spawn documentation. #6334
  • [3.7] bpo-20104: Add os.posix_spawn documentation. (GH-6334) #6336
  • [3.7] bpo-20104: Improve error handling and fix a reference leak in os.posix_spawn(). (GH-6332) #6673
  • bpo-20104: Add flag capabilities to posix_spawn #6693
  • bpo-20104: Change the file_actions parameter of os.posix_spawn(). #6725
  • bpo-20104: Remove posix_spawn from 3.7 #6794
  • Dependencies
  • bpo-33191: Refleak in posix_spawn
  • bpo-33441: Expose the sigset_t converter via private API
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/gpshead'
    closed_at = <Date 2018-09-09.11:40:06.390>
    created_at = <Date 2014-01-01.18:13:20.774>
    labels = ['extension-modules', 'type-feature', '3.8']
    title = 'expose posix_spawn(p)'
    updated_at = <Date 2019-01-07.00:47:02.950>
    user = 'https://github.com/benjaminp'

    bugs.python.org fields:

    activity = <Date 2019-01-07.00:47:02.950>
    actor = 'vstinner'
    assignee = 'gregory.p.smith'
    closed = True
    closed_date = <Date 2018-09-09.11:40:06.390>
    closer = 'vstinner'
    components = ['Extension Modules']
    creation = <Date 2014-01-01.18:13:20.774>
    creator = 'benjamin.peterson'
    dependencies = ['33191', '33441']
    files = []
    hgrepos = []
    issue_num = 20104
    keywords = ['patch']
    message_count = 70.0
    messages = ['207137', '207139', '213344', '222570', '222571', '222572', '269684', '289758', '289769', '289775', '289791', '311040', '311041', '311051', '311081', '311086', '311087', '311089', '311094', '311100', '311102', '311144', '311164', '311165', '311166', '314732', '314770', '314774', '314775', '314777', '314779', '314781', '314783', '314784', '314826', '315795', '315995', '315997', '316014', '316046', '316063', '316068', '316088', '316150', '316235', '316238', '316246', '316247', '316268', '316282', '316489', '316495', '316503', '316522', '316525', '316551', '316588', '316592', '316595', '317071', '317076', '317084', '317510', '323639', '324765', '324832', '324833', '324881', '324883', '333129']
    nosy_count = 13.0
    nosy_names = ['gregory.p.smith', 'vstinner', 'benjamin.peterson', 'ned.deily', 'alex', 'dhduvall', 'neologix', 'gennad', 'martin.panter', 'serhiy.storchaka', 'John Jones', 'pablogsal', 'miss-islington']
    pr_nums = ['5109', '5109', '5418', '6332', '6334', '6336', '6673', '6693', '6725', '6794']
    priority = None
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue20104'
    versions = ['Python 3.8']

    @benjaminp
    Copy link
    Contributor Author

    posix_spawn is a nice, efficient replacement for fork()/exec(). We should expose it and possibly use it in subprocess where possible.

    @benjaminp benjaminp added performance Performance or resource usage extension-modules C modules in the Modules dir labels Jan 1, 2014
    @gpshead
    Copy link
    Member

    gpshead commented Jan 1, 2014

    Unless it could replace the fork+exec code path in its entirety, which I do not believe is possible, I see posix_spawn() as a distraction and additional maintenance burden with no benefit.

    http://pubs.opengroup.org/onlinepubs/7999959899/functions/posix_spawn.html

    Read the RATIONALE section. The posix_spawn API was not created to make subprocess creation easier (i'd argue that it is the same burden to setup a proper call to posix_spawn as it is to do everything right for fork and exec).

    One notable thing posix_spawn() does not support: setsid() (start_new_session=True) of the child process. Obviously it also couldn't handle the arbitrary preexec_fn but preexec_fn is in general considered harmful.

    @benjaminp
    Copy link
    Contributor Author

    Okay, this seems like a bad idea.

    @dhduvall
    Copy link
    Mannequin

    dhduvall mannequin commented Jul 8, 2014

    Our project (the Solaris packaging system, IPS), relies on posix_spawn() primarily for the ability to fork without making a large memory reservation (and possibly failing) because the forking process was itself very large. That's the (a?) bug benefit of posix_spawn() -- it's not a benefit for the programmer using it (who might have to fall back to fork/exec), but for the end-user that benefits from its streamlined operation.

    You're right that it doesn't handle everything that subprocess.Popen() does -- though at least on Solaris there's a way to change the cwd in the file actions, and I'm sure we'd consider adding a way to do the setsid() as well. The rest should be possible cross-platform.

    @alex
    Copy link
    Member

    alex commented Jul 8, 2014

    Danek, you might find https://github.com/dreid/posix_spawn/ useful, it provides bindings and a public API over posix_spawn (it's not complete yet, but if there's stuff missing, feel free to file a ticket!)

    @dhduvall
    Copy link
    Mannequin

    dhduvall mannequin commented Jul 8, 2014

    Cool. We implemented our own version as a straight-up native module (https://java.net/projects/ips/sources/pkg-gate/content/src/modules/pspawn.c), and our Popen replacement is not at present a complete replacement for the one in the stdlib, but it does what we need it to do. We'd absolutely switch if it came in to the stdlib, though, and I think there would be plenty of programs that would benefit from it (certainly a number of large Python programs in Solaris have run into exactly this problem).

    @dhduvall
    Copy link
    Mannequin

    dhduvall mannequin commented Jul 1, 2016

    Oh, for what it's worth, Solaris added setsid support to posix_spawn a few years ago, as a result of this conversation. I still think it would be worthwhile supporting this in the stdlib, since we keep running into processes which have a lot of memory reserved and can't fork some small program because they're running in a memory-limited environment.

    @JohnJones
    Copy link
    Mannequin

    JohnJones mannequin commented Mar 17, 2017

    To prevent subprocess/os.fork() doubling my memory after loading a large numpy array into memory, I now have to start my script with 650 calls to subprocess.Popen(), which just sit their waiting for some stdin to start doing something. This doesn't happen until after the numpy array is loaded, else I quickly run out of memory.

    I think this sort of situation results in more unnecessary complexity compared to using posix_spawn, in spite of the concerns about using posix_spawn. Perhaps subprocess.Popen just needs a "posix_spawn=True" parameter?

    @gpshead
    Copy link
    Member

    gpshead commented Mar 17, 2017

    I think someone wanting this will need to put forward a patch adding it to be reviewed and mulled over. As Alex mentioned in msg22571 - https://github.com/dreid/posix_spawn/ exists as does the code Danek pointed at in the next comment. try those.

    I suggest someone who actively cares about a limited available process address space environment contribute this. (ie: not your typical 64-bit system)

    fork()+exec() does not cause significant memory allocation, only a brief ~doubling of mapped address space, with backing pages being the originals just marked copy on write, but never written to by the child. The exec undoes that mapping.

    It is technically possible to cause the copy on writes to happen if you immediately write to a large amount of memory in the parent process after the fork has happened before the exec has, but that seems like a rare timing problem that could even be worked around by monitoring the forked child to see that the exec has occurred before continuing.

    @gpshead gpshead added the 3.7 (EOL) end of life label Mar 17, 2017
    @JohnJones
    Copy link
    Mannequin

    JohnJones mannequin commented Mar 17, 2017

    I agree with everything you're saying Gregory, however I don't think the significance of the memory doubling is as inconsequential as you might first think. For example, i have on my 64bit Linux system 128Gb of RAM, and a numpy table that's around 70Gb. Spawning a subprocess, even though memory is doubled for a very short period of time, is enough to raise a MemoryError, despite the subprocess i'm spawning using only 2 or 3Mb after the exec().

    I do appreciate that for most Python users however, they will not see much benefit from what I imagine is quite a lot of development work.

    FWIW, I did try the posix_spawn module, but i couldn't figure out how to write data to the stdin of a posix_spawn subprocess, and gave up in place of the commonly recommended solution to this problem (via StackExchange) of spawning lots of subprocesses before you put stuff in memory. Fortunately, for my problem, this was a possible solution. For others I think they're going to have to use posix_spawn, or an entirely different programming language if that doesn't work.

    @gpshead
    Copy link
    Member

    gpshead commented Mar 18, 2017

    All I'm really saying is that someone who wants this should provide a
    patch/PR with unittests. :) I can help review and go from there.

    It does make sense to me for it to be available as part of the subprocess
    API if it is available at all, likely an alternative implementation of or
    behavior flag to _posixsubprocess.fork_exec() with appropriate autoconf and
    conditional compilation based on availability ifdefs.

    On Fri, Mar 17, 2017 at 11:49 AM John Jones <report@bugs.python.org> wrote:

    John Jones added the comment:

    I agree with everything you're saying Gregory, however I don't think the
    significance of the memory doubling is as inconsequential as you might
    first think. For example, i have on my 64bit Linux system 128Gb of RAM, and
    a numpy table that's around 70Gb. Spawning a subprocess, even though memory
    is doubled for a very short period of time, is enough to raise a
    MemoryError, despite the subprocess i'm spawning using only 2 or 3Mb after
    the exec().

    I do appreciate that for most Python users however, they will not see much
    benefit from what I imagine is quite a lot of development work.

    FWIW, I did try the posix_spawn module, but i couldn't figure out how to
    write data to the stdin of a posix_spawn subprocess, and gave up in place
    of the commonly recommended solution to this problem (via StackExchange) of
    spawning lots of subprocesses before you put stuff in memory. Fortunately,
    for my problem, this was a possible solution. For others I think they're
    going to have to use posix_spawn, or an entirely different programming
    language if that doesn't work.

    ----------


    Python tracker <report@bugs.python.org>
    <http://bugs.python.org/issue20104\>


    @gpshead gpshead reopened this Jan 8, 2018
    @gpshead gpshead added type-feature A feature request or enhancement and removed performance Performance or resource usage labels Jan 8, 2018
    @gpshead
    Copy link
    Member

    gpshead commented Jan 29, 2018

    New changeset 6c6ddf9 by Gregory P. Smith (Pablo Galindo) in branch 'master':
    bpo-20104: Expose posix_spawn in the os module (GH-5109)
    6c6ddf9

    @gpshead
    Copy link
    Member

    gpshead commented Jan 29, 2018

    remaining work before closing this:
    Doc/library/os.rst needs to describe os.posix_spawn

    @vadmium
    Copy link
    Member

    vadmium commented Jan 29, 2018

    Pablo’s code looked unfinished to me. As well as missing documentation, I suspect there may be memory leaks and poor error handling.

    The two calls above the “fail:” label look like dead code. The “parse_envlist” result appears to be leaked.

    I’m curious why you never call “posix_spawn_file_actions_destroy”. I saw on Open BSD <https://man.openbsd.org/posix_spawn_file_actions_init.3\> it reclaims memory, and it seems sensible to call it on other platforms as well.

    No error checking on any of the “posix_spawn_file_actions_” methods.

    If “posix_spawn” fails, the “pid” variable will be returned uninitialized.

    @pablogsal
    Copy link
    Member

    I have opened a PR to address this issues:

    #5415

    @vadmium
    Copy link
    Member

    vadmium commented Jan 29, 2018

    Does the PySequence_Fast result need releasing if the following “for” loop fails? There is a Py_DECREF only in the successful case, which seems inconsistent.

    Does Python still support non-UTF-8 locales and bytes filenames? I haven’t been keeping up, but I assumed these things were still supported in many cases. It seems strange to only support UTF-8 in the “addopen” file action.

    Pablo’s second PR currently <https://github.com/python/cpython/pull/5418/commits/e50bdb9\> calls PyErr_SetString(PyExc_OSError, . . .) with a custom error message depending on which OS call failed. But I wonder if it is more important to set the “errno” attribute (which I think should choose an OSError subclass if appropriate). Perhaps you can do that by assigning the return values to “errno” and then calling PyErr_SetFromErrno. A disadvantage might be less context about which stage went wrong.

    @vadmium
    Copy link
    Member

    vadmium commented Jan 29, 2018

    TypeError if “posix_spawn_file_actions_init” fails doesn’t seem right. I suggest OSError, MemoryError, or even plain Exception instead.

    “File_actionsp” is set to point to a local variable “_file_actions”, but the variable goes out of scope before the pointer is used. This is technically undefined behaviour (even if it happens to work). Simplest solution would be to move the variables into the same outer scope.

    @pablogsal
    Copy link
    Member

    Regarding the leak, I was under the assumption that as File_actionsp is pointing to a stack initialized _file_actions and is this last variable the one that is passed to posix_spawn_file_actions_init, it was not needed to explicitly call posix_spawn_file_actions_destroy as the variable will go out of scope and the pointer is pointing to a stack variable. Am I missing something?

    @vadmium
    Copy link
    Member

    vadmium commented Jan 29, 2018

    Your assumption about calling “file_actions_destroy” would be okay if the posix_spawn_file_actions_t object was a simple object or structure. But I imagine most implementations would allocate memory when you call one of the “add” methods. Especially “addopen”, which is supposed to copy the filename string somewhere. Looking at “uclibc” <https://repo.or.cz/uclibc-ng.git/blob/v1.0.28:/include/spawn.h#l263\>, I see it calls “free”, because it allocates an array for all the file actions.

    @vstinner
    Copy link
    Member

    Python master doesn't build on macOS Tiger anymore:

    http://buildbot.python.org/all/#/builders/30/builds/581

    ./Modules/posixmodule.c:251:19: error: spawn.h: No such file or directory
    ./Modules/posixmodule.c: In function ‘os_posix_spawn_impl’:
    ./Modules/posixmodule.c:5174: error: ‘posix_spawn_file_actions_t’ undeclared (first use in this function)
    ./Modules/posixmodule.c:5174: error: (Each undeclared identifier is reported only once
    ./Modules/posixmodule.c:5174: error: for each function it appears in.)
    ./Modules/posixmodule.c:5174: error: ‘file_actionsp’ undeclared (first use in this function)
    ./Modules/posixmodule.c:5176: error: parse error before ‘_file_actions’
    ./Modules/posixmodule.c:5177: warning: implicit declaration of function ‘posix_spawn_file_actions_init’
    ./Modules/posixmodule.c:5177: error: ‘_file_actions’ undeclared (first use in this function)
    ./Modules/posixmodule.c:5232: warning: implicit declaration of function ‘posix_spawn_file_actions_addopen’
    ./Modules/posixmodule.c:5245: warning: implicit declaration of function ‘posix_spawn_file_actions_addclose’
    ./Modules/posixmodule.c:5262: warning: implicit declaration of function ‘posix_spawn_file_actions_adddup2’
    ./Modules/posixmodule.c:5274: warning: implicit declaration of function ‘posix_spawn’
    make: *** [Modules/posixmodule.o] Error 1

    See also bpo-32705: "Current Android does not have posix_spawn" (now fixed). In PR 5413 of this bpo, it was also asked if "#define HAVE_POSIX_SPAWN 1" of posixmodule.c makes sence since there is already a configure check for posix_spawn()?

    #5413 (comment)

    @pablogsal
    Copy link
    Member

    @vstinner I have removed the #define HAVE_POSIX_SPAWN 1 in PR 5418.

    @vstinner
    Copy link
    Member

    FYI bolen-dmg-3.x was also broken by this change:

    http://buildbot.python.org/all/#/builders/69/builds/114

    @vadmium
    Copy link
    Member

    vadmium commented May 14, 2018

    I suggested the “scheduler” tuple to bring the two related parameters (scheduling policy and sched_param) together, similar to how they are paired as the second and third parameters to “os.sched_setscheduler”, and because I thought it would imply that a scheduling policy without sched_param is not valid.

    But separate keyword parameters would also work if you prefer. I might call them “schedpolicy” and “schedparam”, but if you prefer the naming scheme in the current proposal, you could call them “setscheduler” and “setschedparam”. Although in that case, setting the “setscheduler” parameter on its own would not be sufficient to correctly set the POSIX_SPAWN_SETSCHEDULER flag.

    @pablogsal
    Copy link
    Member

    I think that is the biggest argument towards using a tuple: that just setting the priority is not enough (and also is decontextualized as different policies have different priorities). On the other hand one could say that the API is a low level API and therefore the user should be aware of such a problem. Also, using a tuple makes the API a bit asymmetric as Serhiy mention but I think this can compensate for the conceptual map the will be established (as in each priority is associated with a policy) if the tuple API is used.

    @serhiy-storchaka
    Copy link
    Member

    As for the scheduler interface, yet one option is using two mutually exclusive parameters setschedparam and setscheduler. The first take a sched_param, the second takes a pair: int and sched_param. This will not simplify the implementation, but they directly correspond to functions os.sched_setparam() and os.sched_setscheduler(). I don't say that this interface is better than alternatives, I just mention yet one option.

    posix_spawn() can be easily implemented via fork()/exec(). See the reference implementation: http://pubs.opengroup.org/onlinepubs/009695399/xrat/xsh_chap03.html . It can be directly translated to Python. Seems the only value of posix_spawn() is on systems that don't provide fork()/exec(), but provide posix_spawn(). Is posix_spawn() supported on Windows? What are other systems supported by Python that don't provide fork()/exec()?

    I'm for removing os.posix_spawn() in 3.7. Even if we will accept the current interface and merge PR 6693 we can find other problems with the interface or the implementation. And os.posix_spawnp() still is not implemented. It is just too dangerous to add such complex feature between the last beta and RC.

    @pablogsal
    Copy link
    Member

    Notice that #6794 is already open to remove posix_spawn from 3.7.

    @gpshead
    Copy link
    Member

    gpshead commented May 14, 2018

    Pablo, Victor, Ned and I synced up at the pycon2018 sprints. We're removing posix_spawn from 3.7 (#6794) while the API is worked on.

    @gpshead
    Copy link
    Member

    gpshead commented May 14, 2018

    New changeset 8e633a4 by Gregory P. Smith (Pablo Galindo) in branch '3.7':
    bpo-20104: Remove posix_spawn from 3.7 (GH-6794)
    8e633a4

    @gpshead gpshead added stdlib Python modules in the Lib dir and removed 3.7 (EOL) end of life release-blocker labels May 14, 2018
    @pablogsal
    Copy link
    Member

    Some interesting benchmarks of posix spawn:

    https://github.com/rtomayko/posix-spawn/blob/master/README.md

    @pablogsal pablogsal added 3.7 (EOL) end of life release-blocker and removed stdlib Python modules in the Lib dir labels May 14, 2018
    @pablogsal
    Copy link
    Member

    Also this reading may be relevant/interesting:

    https://about.gitlab.com/2018/01/23/how-a-fix-in-go-19-sped-up-our-gitaly-service-by-30x/

    @vstinner
    Copy link
    Member

    posix_spawn can still be found in Python 3.7:

    configure:11243: posix_fallocate posix_fadvise posix_spawn pread preadv preadv2 \
    configure.ac:3470: posix_fallocate posix_fadvise posix_spawn pread preadv preadv2 \
    pyconfig.h.in:710:/* Define to 1 if you have the `posix_spawn' function. */

    This check has been added by the commit 6c6ddf9 and should be removed from Python 3.7 as well, no? It's not a release blocker, but it may be good to remove posix_spawn from configure.ac (and related generated files).

    @pablogsal
    Copy link
    Member

    I originally removed it from the configure script in PR6794 but it was reintroduced in commit 5700952. Check the PR for Greg's rationale.

    @vstinner
    Copy link
    Member

    I originally removed it from the configure script in PR6794 but it was reintroduced in commit 5700952. Check the PR for Greg's rationale.

    Oh ok.

    @vstinner
    Copy link
    Member

    Tests fail on PPC64 Fedora 3.x: bpo-33630.

    @serhiy-storchaka
    Copy link
    Member

    Benjamin, Gregory, could you please look at PR 6693? Is it what you want?

    @pablogsal
    Copy link
    Member

    New changeset 254a466 by Pablo Galindo in branch 'master':
    bpo-20104: Add flag capabilities to posix_spawn (GH-6693)
    254a466

    @serhiy-storchaka
    Copy link
    Member

    New changeset d700f97 by Serhiy Storchaka in branch 'master':
    bpo-20104: Change the file_actions parameter of os.posix_spawn(). (GH-6725)
    d700f97

    @serhiy-storchaka
    Copy link
    Member

    Thank you for your contribution Pablo. This issue have appeared much more complex and less obvious than it looked initially.

    @vstinner
    Copy link
    Member

    vstinner commented Sep 9, 2018

    Thank you for your contribution Pablo. This issue have appeared much more
    complex and less obvious than it looked initially.

    Yeah, thanks Pablo for your tenacy! This function is interesting in term of
    performance and correctness ("atomic" function, signal safe?).

    @vstinner
    Copy link
    Member

    vstinner commented Sep 9, 2018

    I close the issue, it's now done.

    If someone wants to experiment in posix_spawn() in subprocess, please open a new issue.

    @vstinner vstinner closed this as completed Sep 9, 2018
    @vstinner
    Copy link
    Member

    vstinner commented Jan 7, 2019

    Serhiy:

    And os.posix_spawnp() still is not implemented.

    I created bpo-35674 to see how to expose this feature.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 (EOL) end of life extension-modules C modules in the Modules dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    10 participants