Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FreeBSD: Optimize subprocess.Popen(close_fds=True) using closefrom() #82242

Closed
vstinner opened this issue Sep 9, 2019 · 18 comments
Closed

FreeBSD: Optimize subprocess.Popen(close_fds=True) using closefrom() #82242

vstinner opened this issue Sep 9, 2019 · 18 comments
Labels
3.9 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@vstinner
Copy link
Member

vstinner commented Sep 9, 2019

BPO 38061
Nosy @vstinner, @koobs
PRs
  • bpo-38061: os.closerange() uses closefrom() on FreeBSD #19696
  • bpo-38061: subprocess uses closefrom() on FreeBSD #19697
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-04-24.10:14:59.833>
    created_at = <Date 2019-09-09.07:25:36.109>
    labels = ['type-feature', 'library', '3.9']
    title = 'FreeBSD: Optimize subprocess.Popen(close_fds=True) using closefrom()'
    updated_at = <Date 2020-04-24.10:14:59.832>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2020-04-24.10:14:59.832>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-04-24.10:14:59.833>
    closer = 'vstinner'
    components = ['Library (Lib)']
    creation = <Date 2019-09-09.07:25:36.109>
    creator = 'vstinner'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 38061
    keywords = ['patch']
    message_count = 18.0
    messages = ['351351', '351353', '351355', '351356', '351357', '351358', '351367', '351369', '351383', '351457', '351461', '351465', '361750', '367161', '367163', '367183', '367184', '367186']
    nosy_count = 2.0
    nosy_names = ['vstinner', 'koobs']
    pr_nums = ['19696', '19697']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue38061'
    versions = ['Python 3.9']

    @vstinner
    Copy link
    Member Author

    vstinner commented Sep 9, 2019

    The default value of subprocess.Popen "close_fds" parameter changed to True in Python 3. Compared to Python 2, close_fds=True can make Popen 10x slower: see bpo-37790.

    A single close(fd) syscall is cheap. But when MAXFDS (maximum file descriptor number) is high, the loop calling close(fd) on each file descriptor can take several milliseconds.

    On FreeBSD, the _posixsubprocess could use closefrom() to reduce the number of syscalls.

    On Linux, _posixsubprocess lists the content of /proc/self/fd/ to only close open file descriptor, after fork() and before exec().

    @vstinner vstinner added 3.9 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Sep 9, 2019
    @vstinner
    Copy link
    Member Author

    vstinner commented Sep 9, 2019

    A workaround is to use close_fds=False which is only safe if all code used in Python implements the PEP-446.

    This PEP mentions the FreeBSD issue, mentioning bpo-11284 with MAXFD=655,000:

    "The operation can be slow if MAXFD is large. For example, on a FreeBSD buildbot with MAXFD=655,000, the operation took 300 ms: see issue bpo-11284: slow close file descriptors."

    Note: the subprocess module is now able to use posix_spawn() syscall, but it doesn't use it if close_fds=True, so it's outside the scope of this issue.

    @vstinner
    Copy link
    Member Author

    vstinner commented Sep 9, 2019

    See also bpo-13788: "os.closerange optimization".

    @vstinner
    Copy link
    Member Author

    vstinner commented Sep 9, 2019

    On Linux, _posixsubprocess lists the content of /proc/self/fd/ to only close open file descriptor, after fork() and before exec().

    This code is specific to Linux because it uses the SYS_getdents64 syscall.

    FreeBSD has a similar concept using /dev/fd "file-descriptor file system". See fdescfs manual page:
    https://www.freebsd.org/cgi/man.cgi?query=fdescfs&sektion=5

    I'm not sure how it is supposed to work. When I open a file in Python, I don't see its file descriptor in /dev/fd:

    vstinner@freebsd$ python3
    Python 3.6.9 (default, Aug  8 2019, 01:16:42) 
    >>> import os
    >>> os.listdir("/dev/fd")
    ['0', '1', '2']
    >>> f=open("/etc/motd")
    >>> os.listdir("/dev/fd")
    ['0', '1', '2']
    >>> f.fileno()
    3
    
    >>> os.set_inheritable(f.fileno(), True)
    >>> os.listdir("/dev/fd")
    ['0', '1', '2']
    
    >>> import sys, subprocess
    >>> rc=subprocess.call([sys.executable, "-c", "import os; print(os.listdir('/dev/fd')); print(os.fstat(3))"], close_fds=False)
    ['0', '1', '2']
    os.stat_result(st_mode=33188, st_ino=321134, st_dev=83, st_nlink=1, st_uid=0, st_gid=0, st_size=899, st_atime=1568014491, st_mtime=1566390614, st_ctime=1566390614)

    The file descriptor 3 is not listed in /dev/fd/. It is inherited: fstat(3) in a child process doesn't fail and it's not listed in /dev/fd/ in the child process.

    @vstinner
    Copy link
    Member Author

    vstinner commented Sep 9, 2019

    Oh, _posixsubprocess uses /dev/fd/ on FreeBSD, but only if it's mounted file system (expected to be fdescfs filesystem):

    #if defined(__FreeBSD__) || (defined(__APPLE__) && defined(__MACH__))
    # define FD_DIR "/dev/fd"
    #else
    # define FD_DIR "/proc/self/fd"
    #endif
    
    #if defined(__FreeBSD__)
    /* When /dev/fd isn't mounted it is often a static directory populated
     * with 0 1 2 or entries for 0 .. 63 on FreeBSD, NetBSD and OpenBSD.
     * NetBSD and OpenBSD have a /proc fs available (though not necessarily
     * mounted) and do not have fdescfs for /dev/fd.  MacOS X has a devfs
     * that properly supports /dev/fd.
     */
    static int
    _is_fdescfs_mounted_on_dev_fd(void)
    {
        struct stat dev_stat;
        struct stat dev_fd_stat;
        if (stat("/dev", &dev_stat) != 0)
            return 0;
        if (stat(FD_DIR, &dev_fd_stat) != 0)
            return 0;
        if (dev_stat.st_dev == dev_fd_stat.st_dev)
            return 0;  /* / == /dev == /dev/fd means it is static. #fail */
        return 1;
    }
    #endif

    On my FreeBSD 12 VM, MAXFD is around 230k which is quite large.

    vstinner@freebsd$ uname -a
    FreeBSD freebsd 12.0-RELEASE-p10 FreeBSD 12.0-RELEASE-p10 GENERIC amd64

    vstinner@freebsd$ ./python 
    Python 3.9.0a0 (heads/master:4db25d5c39, Sep  9 2019, 07:52:01) 
    >>> import os; os.sysconf("SC_OPEN_MAX")
    229284

    It's easy to measure the overhead of 230k close() syscalls:

    vstinner@freebsd$ python3 -m pyperf timeit -s 'import subprocess; args=["/usr/bin/true"]' 'subprocess.Popen(args, close_fds=False).wait()'
    .....................
    Mean +- std dev: 1.22 ms +- 0.12 ms
    vstinner@freebsd$ python3 -m pyperf timeit -s 'import subprocess; args=["/usr/bin/true"]' 'subprocess.Popen(args, close_fds=True).wait()'
    .....................
    Mean +- std dev: 53.3 ms +- 1.4 ms

    The overhead is 52.08 ms per Popen().wait() call (with close_fds=True).

    If I mount manually the fdescfs filesystem, suddenly, subprocess is efficient again:

    vstinner@freebsd$ sudo mount -t fdescfs /dev/fd
    vstinner@freebsd$ python3 -m pyperf timeit -s 'import subprocess; args=["/usr/bin/true"]' 'subprocess.Popen(args, close_fds=True).wait()'
    .....................
    Mean +- std dev: 1.20 ms +- 0.06 ms

    close_fds=True becomes as efficient as close_fds=False.

    @vstinner
    Copy link
    Member Author

    vstinner commented Sep 9, 2019

    FreeBSD has a similar concept using /dev/fd "file-descriptor file system". (...) I'm not sure how it is supposed to work.

    Sadly, on my FreeBSD VM, it seems like /dev/fd/ is not mounted with fdescfs by default, but as a regular directory with 3 hardcoded files 0, 1 and 2 which are character devices.

    I had to mount fdescfs filesystem manually at /dev/fd/ :-(

    $ cat /etc/fstab 
    # Custom /etc/fstab for FreeBSD VM images
    /dev/gpt/rootfs   /       ufs     rw      1       1
    /dev/gpt/swapfs  none    swap    sw      0       0

    Maybe it's an issue with "FreeBSD VM images" that I chose.

    --

    The FreeBSD CURRENT buildbot worker mounts /dev/fd:

    CURRENT-amd64% cat /etc/fstab
    # Device Mountpoint FStype Options Dump Pass#
    /dev/da0p2 none swap sw 0 0
    /dev/da0p3 / ufs rw 1 1
    fdescfs /dev/fd fdescfs rw 0 0

    CURRENT-amd64% mount|grep fd
    fdescfs on /dev/fd (fdescfs)

    @koobs
    Copy link

    koobs commented Sep 9, 2019

    We're tracking this in our downstream bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221700

    There's a patch there you may base something upstream on

    Open question:

    closefrom(2) test in ./configure && ifdef __FreeBSD__ ?

    @koobs
    Copy link

    koobs commented Sep 9, 2019

    @victor I mounted fdescfs on the buildbot workers to make the tests run faster. You're correct that it's not enabled by default.

    If we could look at the initial support being as simple as possible, we can look to backport this to 2.7 as well

    @vstinner
    Copy link
    Member Author

    vstinner commented Sep 9, 2019

    @victor I mounted fdescfs on the buildbot workers to make the tests run faster. You're correct that it's not enabled by default.

    Would it be possible to modify FreeBSD to enable it by default? Or is there a reason to not enable it by default?

    We're tracking this in our downstream bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221700 There's a patch there you may base something upstream on

    Which patch? https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221700#c3 or something else?

    Open question: closefrom(2) test in ./configure && ifdef __FreeBSD__ ?

    No need to make closefrom test conditional depending on the platform. We have plently of tests which are platform specific but run on all platforms.

    It's common that a function is first available on a single platform, but then available on more platforms.

    It seems like closefrom() is available on:

    @koobs
    Copy link

    koobs commented Sep 9, 2019

    Would it be possible to modify FreeBSD to enable it by default? Or is there a reason to not enable it by default?

    That's very unlikely to happen. I believe it was added as an opt-in feature for some linux compatibility situations. The correct solution is to use closefrom(2), as it is the optimal current solution, and in our case, safe to use.

    Which patch? https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221700#c3 or something else?

    Yep, that one.

    It seems like closefrom() is available on:

    Per bpo-8052 there was some concerns about closefrom(2) not being async-signal safe. [1]

    I can't speak to other implementations, but this is what I requested clarity from our FreeBSD developers on, and confirmed that our implementation is indeed safe. That is the reason why I thought scoping closefrom(2) to __FreeBSD__ may be warranted, just like like the linux specific bits in https://hg.python.org/cpython/rev/61aa484a3e54 from bpo-8052

    But I'll leave the decision as to how its implemented (configure checkls, ifdef'd or not) in your capable hands.

    Summary: FreeBSD's closefrom(2) is safe to anywhere in Python where it needs to close a range of fd's, including the subprocess module.

    [1] https://bugs.python.org/issue8052#msg132307

    @vstinner
    Copy link
    Member Author

    vstinner commented Sep 9, 2019

    That's very unlikely to happen. I believe it was added as an opt-in feature for some linux compatibility situations. The correct solution is to use closefrom(2), as it is the optimal current solution, and in our case, safe to use.

    If you have like 50k open file descriptors and want to pass fd 50000 to a child process using Popen(pass_fds={50000}), you will have to call close() 49 997 times: close(3)..close(49999), then closefrom(50001).

    For the pass_fds case, being able to enumerate open file descriptors (ex: fdescfs) is more efficient than closefrom().

    Well, the best would be a system call "close all file descriptors except of <list of fds>", but it's not available yet. Or maybe at least having a system like close_range(a, b).

    In May 2019, the close_range() syscall was in the Linux kernel !
    https://lwn.net/Articles/789023/
    ... but it seems like the feature was not accepted yet.

    I'm disappointed that posix_spawn() doesn't implement a feature like Python subprocess close_fds=True.

    @koobs
    Copy link

    koobs commented Sep 9, 2019

    On "close_except", close_range and posix_spawn(close_fds=True), I'll talk to my interested FreeBSD colleagues about potential inroads in that regard.

    In the meantime, we're good on closefrom(2) anywhere it can be used in Python

    @vstinner
    Copy link
    Member Author

    FreeBSD change has been merged:
    "lang/python{27,35,36,37,38}: Add closefrom(2) support"
    https://svnweb.freebsd.org/ports?view=revision&revision=518640

    @vstinner
    Copy link
    Member Author

    Links to the FreeBSD downstream patches:

    @vstinner
    Copy link
    Member Author

    Copy of the very interesting https://svnweb.freebsd.org/ports?view=revision&revision=518640 commit message by koobs:

    Log Message:

    lang/python{27,35,36,37,38}: Add closefrom(2) support

    A single close(fd) syscall is cheap, but when MAXFDS (maximum file
    descriptor number) is high, the loop calling close(fd) on each file
    descriptor can take several milliseconds.

    The default value of subprocess.Popen "close_fds" parameter changed to True
    in Python 3. Compared to Python 2, close_fds=True can make Popen 10x
    slower: see bpo-37790 [1]

    The present workaround on FreeBSD to improve performance is to load and
    mount the fdescfs kernel module, but this is not enabled by default.

    This change adds minimum viable (and upstreamable) closefrom(2) syscall
    support to Python's subprocess and posix modules, improving performance
    significantly for loads that involve working with many processes, such as
    diffoscope, ansible, and many others.

    For additional optimizations, upstream recently (3.8) landed posix_spawn(2)
    support [3] and has stated that they will adopt close_range(2) after Linux
    merges it [4]. Linux/FreeBSD developers are already collaborating on
    ensuring compatible implementations, with FreeBSD's implementation pending
    in D21627. [5]

    Thank you emaste, cem, kevans for providing analysis, input,
    clarifications, comms/upstream support and patches.

    [1] https://bugs.python.org/issue37790
    [2] https://bugs.python.org/issue38061
    [3] https://bugs.python.org/issue35537
    [4] https://lwn.net/Articles/789023/
    [5] https://reviews.freebsd.org/D21627

    Additional References:

    https://bugs.python.org/issue8052
    https://bugs.python.org/issue11284
    https://bugs.python.org/issue13788
    https://bugs.python.org/issue1663329
    https://www.python.org/dev/peps/pep-0446/

    PR: 242274, 221700
    Submitted by: kevans (emaste, cem)
    Approved by: koobs (python (maintainer), santa)

    @vstinner
    Copy link
    Member Author

    New changeset 162c567 by Victor Stinner in branch 'master':
    bpo-38061: os.closerange() uses closefrom() on FreeBSD (GH-19696)
    162c567

    @vstinner
    Copy link
    Member Author

    New changeset e6f8abd by Victor Stinner in branch 'master':
    bpo-38061: subprocess uses closefrom() on FreeBSD (GH-19697)
    e6f8abd

    @vstinner
    Copy link
    Member Author

    Thanks Ed Maste, Conrad Meyer, Kyle Evans and Kubilay Kocak!

    I ran a benchmark:

    vstinner@freebsd$ env/bin/python -m pyperf command -v -- ./python -c 'import subprocess, sys; args=[sys.executable, "-sS", "-c", "pass"]; subprocess.run(args)'

    The optimization made subprocess almost 4x faster on FreeBSD!

    vstinner@freebsd$ env/bin/python -m pyperf compare_to ref.json closefrom.json
    Mean +- std dev: [ref] 176 ms +- 1 ms -> [closefrom] 46.6 ms +- 1.2 ms: 3.77x faster (-74%)

    129 ms were wasted in calling 229 284 times close(fd)!

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.9 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants