Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stat cache for import bootstrap #63415

Closed
tiran opened this issue Oct 10, 2013 · 17 comments
Closed

stat cache for import bootstrap #63415

tiran opened this issue Oct 10, 2013 · 17 comments
Labels
performance Performance or resource usage

Comments

@tiran
Copy link
Member

tiran commented Oct 10, 2013

BPO 19216
Nosy @warsaw, @brettcannon, @pitrou, @vstinner, @tiran, @bitdancer, @ericsnowcurrently
Files
  • import_stat_cache.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-05-15.03:13:54.260>
    created_at = <Date 2013-10-10.11:20:50.397>
    labels = ['performance']
    title = 'stat cache for import bootstrap'
    updated_at = <Date 2019-05-15.03:13:54.259>
    user = 'https://github.com/tiran'

    bugs.python.org fields:

    activity = <Date 2019-05-15.03:13:54.259>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-05-15.03:13:54.260>
    closer = 'vstinner'
    components = []
    creation = <Date 2013-10-10.11:20:50.397>
    creator = 'christian.heimes'
    dependencies = []
    files = ['32032']
    hgrepos = []
    issue_num = 19216
    keywords = ['patch']
    message_count = 17.0
    messages = ['199378', '199385', '199387', '199400', '199401', '199410', '199424', '199430', '199431', '199436', '199437', '200496', '200498', '200500', '200502', '200503', '342550']
    nosy_count = 8.0
    nosy_names = ['barry', 'brett.cannon', 'pitrou', 'vstinner', 'christian.heimes', 'Arfrever', 'r.david.murray', 'eric.snow']
    pr_nums = []
    priority = 'normal'
    resolution = 'rejected'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue19216'
    versions = ['Python 3.5']

    @tiran
    Copy link
    Member Author

    tiran commented Oct 10, 2013

    The import library uses excessive stat() calls. I've implemented a simple cache for the bootstrap module that reduces the amount of stat() calls by almost 1/3 (236 -> 159 on Linux).

    @tiran tiran added the performance Performance or resource usage label Oct 10, 2013
    @vstinner
    Copy link
    Member

    See also bpo-14604.

    @pitrou
    Copy link
    Member

    pitrou commented Oct 10, 2013

    Benchmarks?

    @brettcannon
    Copy link
    Member

    A cursory look at the patch suggests that the cache use is permanent and so any dynamic changes to a file or directory after an initial caching will not be picked up. Did you run the test suite with this patch as it should have failed.

    @tiran
    Copy link
    Member Author

    tiran commented Oct 10, 2013

    Is the content of the bootstrap module used after the interpreter is boot strapped? I see ... that's a problem. It's a proof of concept anyway and the speed up is minimal. On my computer with a SSD the speedup barely measurable. I'd like to see if it makes a difference on a Raspbarry Pi or a NFS shares

    I have another idea, too. Could we add an optional 'stat' argument to __init__() of FileLoader and ExtensionFileLoader so we can pass the stat object around and reuse it for loading?

    @brettcannon
    Copy link
    Member

    importlib/_bootstrap.py is importlib, period, so there is no separation of what is used to start Python and what is used after interpreter startup is completed.

    As for adding a 'stat' argument to the loaders, it's possible but as always it comes down to whether it will break someone or not. Since loaders do not necessarily execute immediately you are running the risk of a very stale cached stat object. Plus Eric Snow has his PEP where the API in terms of loader __init__ signature so you would want to look into that.

    @ericsnowcurrently
    Copy link
    Member

    With ModuleSpec (PEP-451), the finder creates the spec object (where it stores the loader). At that point the finder is free to store any stat object you like in spec.loader_state. The spec is made available to the loader during exec (if the loader supports it, which the importlib loaders will). So there is no need to add anything to any loader __init__.

    The only catch is the slim possibility that the stat object will be stale by the time it gets used. I seem to remember a case where something like this happened (related to distros building their system Python or something).

    @ericsnowcurrently
    Copy link
    Member

    For interpreter startup, stats are not involved for builtin and frozen modules[1]. They are tied to imports that involve traversing sys.path (a.k.a. PathFinder). Most stats happen in FileFinder.find_loader. The remainder are for source (.py) files (a.k.a. SourceFileLoader).

    Here's a rough sketch of what typically happens currently during the import of a path-based module[2], as related to stats (and other FS access):

    (lines with FS access start with *)

    def load_module(fullname):
        suffixes = ['.cpython-34m.so', '.abi3.so', '.so', '.py', '.pyc']
        tailname = fullname.rpartition('.')[2]
        for entry in sys.path:
    *       mtime = os.stat(entry).st_mtime
            if mtime != cached_mtime:
    *           cached_listdir = os.listdir(entry)
            if tailname in cached_listdir:
                basename = entry/tailname
    *           if os.stat(basename).st_mode implies directory:  # superfluous?
                    # package?
                    for suffix in suffixes:
                        full_path = basename + suffix
    *                   if os.stat(full_path).st_mode implies file:
                            if is_extension:
    *                           <dlopen>(full_path)
                            elif is_sourceless:
    *                           open(full_path).read()
                            else:
                                load_from_source(full_path)
                            return
            # ...non-package module?
            for suffix in suffixes:
                full_path = entry/tailname + suffix
                if tailname + suffix in cached_listdir:
    *               if os.stat(full_path).st_mode implies file:  # superfluous?
                        if is_extension:
    *                       <dlopen>(full_path)
                        elif is_sourceless:
    *                       open(full_path).read()
                        else:
                            load_from_source(full_path)
    
    def load_from_source(sourcepath):
    *   st = os.stat(sourcepath)
        if st:
    *       open(bytecodepath).read()
        else:
    *       open(sourcepath).read()
    *       os.stat(sourcepath).st_mode
            for parent in ancestor_dirs(sourcepath):
    *           os.stat(parent).st_mode  ->  missing_parents
            for parent in missing_parents:
    *           os.mkdir(parent)
    *       open(tempname).write()
    *       os.replace(tempname, bytecodepath)

    Obviously there are some unix-isms in there. Windows ends up not that different though.

    stat/FS count
    -------------

    load_module (per path entry):
    (add 1 listdir to each if the cache is stale)
    not found: 1 stat
    non-package dir: 7 (num_suffixes + 2 stats)

    package (best): 4/5-9+ (3 stats, 1 read or load_from_source)
    package (worst): 8/9-13+ (num_suffixes + 2 stats, 1 read or load_from_source)
    non-package module 3/4-8+ (best): (2 stats, 1 read or load_from_source)
    non-package module 7/8-12+ (worst): (num_suffixes + 1 stats, 1 read or load_from_source)
    non-package module + dir (best): 10/11-15+ (num_suffixes + 4 stats, 1 read or load_from_source)
    non-package module + dir (best): 14/15-19+ (num_suffixes * 2 + 3 stats, 1 read or load_from_source)
    

    load_from_source:
    cached: 2 (1 stat, 1 read)
    uncached, no parents: 4 (2 stats, 1 write, 1 replace)
    uncached, no missing parents: 5+ (num_parents + 2 stats, 1 write, 1 replace)
    uncached, missing parents: 6+ (num_parents + 2 stats, num_missing mkdirs, 1 write, 1 replace)

    Highlights:

    • the common case is not fast (for the sake of the slight possibility that files may change between imports)--not as much an issue during interpreter startup.
    • up to 5 different suffixes with a separate stat for each (with extension module suffixes tried first).
    • the size and ordering of sys.path has a decided impact on # stats.
    • if a module is cached, a lot less FS access happens.
    • the more nested a module, the more access happen.
    • namespace packages don't have much impact on performance.

    Possible improvements:

    • provide an internal mechanism to turn on/off caching all stats (don't worry about staleness) and maybe expose it via a context manager/API. (not unlike what Christian put in his patch.)
    • at least do some temporally local caching where the risk of staleness is particularly small.
    • Move .py ahead of extension modules (or just behind .cpython-34m.so)?
    • non-packages are more common than packages (?) so look for those first (hard to make effective without breaking key import semantics).
    • remove 2 possibly superfluous stats?

    [1] Maybe we should freeze the stdlib. <0.5 wink>
    [2] importing a module usually involves importing the module's parent and its parent and so forth. Each of those incurs the same stat hits all over again (though usually packages have only 1 path entry to traverse). The stdlib is pretty flat (particularly among modules involved during startup) so this is less of an issue for this ticket.

    @brettcannon
    Copy link
    Member

    So the 2 stat calls in the general case are superfluous, it's just a question of whether they make any performance difference. Turns out that at least on my Macbook their is no performance difference and thus not worth the cost of breaking semantics over it: http://bugs.python.org/issue18810 .

    As for completely turning off stat calls during interpreter startup, that would definitely buy us something, but the question is how much and how do we make it work reliably?

    @ericsnowcurrently
    Copy link
    Member

    I realized those two stats are not superfluous in the case that a directory name has a .py suffix or a file doesn't have any suffix. However, I expect that's pretty uncommon.

    Worst case, these cases cost 2 stats per path entry. In practice they cost nothing due to the dir caching we already do.

    @ericsnowcurrently
    Copy link
    Member

    I forgot to mention that optimizing the default composition of sys.path (from site) could help speed things up, though it might already be optimized in that regard.

    I also forgot to mention the idea of zipping up the stdlib.

    Sorry for the sidetrack. Now, back to the stat discussion...

    @brettcannon brettcannon removed their assignment Oct 18, 2013
    @pitrou
    Copy link
    Member

    pitrou commented Oct 19, 2013

    The real problem here is that the definition of "bootstrap" or "startup" is fuzzy. How do you decide when you stop caching?
    The only workable approach IMO is to adopt a time-based heuristic, which I did in bpo-14067.

    @ericsnowcurrently
    Copy link
    Member

    Would it be feasible to have an explicit (but private?) flag in importlib indicating stat checking (or even all FS checking) should be disabled, defaulting to True? runpy could set it to False after initializing importlib and then back to True when startup is done.

    If that was useful for more than just startup, we could also add a contextmanager for it in importlib.util.

    @pitrou
    Copy link
    Member

    pitrou commented Oct 19, 2013

    Would it be feasible to have an explicit (but private?) flag in
    importlib indicating stat checking (or even all FS checking) should be
    disabled, defaulting to True? runpy could set it to False after
    initializing importlib and then back to True when startup is done.

    I don't really understand the algorithm you're proposing. Also, have you
    read what I've just posted?

    @ericsnowcurrently
    Copy link
    Member

    I don't really understand the algorithm you're proposing.

    In importlib._bootstrap:

    We have some global like "_CHECK_STAT=True". FileFinder would use it to decide on using stat checks or not.

    In Python/pythonrun.c:

    At the end of import_init(), we set importlib._bootstrap _CHECK_STAT to False. Then at the end of _Py_InitializeEx_Private() we set it back to True.

    (As an alternative, we could always not do stat checking for just the standard library)

    Also, have you read what I've just posted?

    About the fuzziness of when startup is finished? As implied above, I'd say at the end of Py_Initialize().

    @pitrou
    Copy link
    Member

    pitrou commented Oct 19, 2013

    > Also, have you read what I've just posted?

    About the fuzziness of when startup is finished? As implied above,
    I'd say at the end of Py_Initialize().

    You only have imported a handful of modules by then. Real-world
    applications will import many more afterwards.
    Here's a little experiment (done with a system install of Python 2.7):

    $ python -v -c pass 2>&1 | grep "^import" | wc -l
    33
    $ python -v `which hg` 2>&1 | grep "^import" | wc -l
    117

    Note that Mercurial has a lazy importer in order to improve startup
    time, otherwise the number would be higher yet.

    @vstinner
    Copy link
    Member

    The benefit of avoiding stat() calls seems to not be obvious to everybody. Moreover, importlib now implements a "path cache". I close the issue.

    The most efficient solution is to pack all your modules and the Python stdlib into a ZIP file: everything is done in memory, no more filesystem access.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants