stat cache for import bootstrap #63415

tiran · 2013-10-10T11:20:50Z

BPO	19216
Nosy	@warsaw, @brettcannon, @pitrou, @vstinner, @tiran, @bitdancer, @ericsnowcurrently
Files	import_stat_cache.patch

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2019-05-15.03:13:54.260>
created_at = <Date 2013-10-10.11:20:50.397>
labels = ['performance']
title = 'stat cache for import bootstrap'
updated_at = <Date 2019-05-15.03:13:54.259>
user = 'https://github.com/tiran'

bugs.python.org fields:

activity = <Date 2019-05-15.03:13:54.259>
actor = 'vstinner'
assignee = 'none'
closed = True
closed_date = <Date 2019-05-15.03:13:54.260>
closer = 'vstinner'
components = []
creation = <Date 2013-10-10.11:20:50.397>
creator = 'christian.heimes'
dependencies = []
files = ['32032']
hgrepos = []
issue_num = 19216
keywords = ['patch']
message_count = 17.0
messages = ['199378', '199385', '199387', '199400', '199401', '199410', '199424', '199430', '199431', '199436', '199437', '200496', '200498', '200500', '200502', '200503', '342550']
nosy_count = 8.0
nosy_names = ['barry', 'brett.cannon', 'pitrou', 'vstinner', 'christian.heimes', 'Arfrever', 'r.david.murray', 'eric.snow']
pr_nums = []
priority = 'normal'
resolution = 'rejected'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'performance'
url = 'https://bugs.python.org/issue19216'
versions = ['Python 3.5']

tiran · 2013-10-10T11:20:50Z

The import library uses excessive stat() calls. I've implemented a simple cache for the bootstrap module that reduces the amount of stat() calls by almost 1/3 (236 -> 159 on Linux).

vstinner · 2013-10-10T13:34:50Z

See also bpo-14604.

pitrou · 2013-10-10T14:07:25Z

Benchmarks?

brettcannon · 2013-10-10T16:50:31Z

A cursory look at the patch suggests that the cache use is permanent and so any dynamic changes to a file or directory after an initial caching will not be picked up. Did you run the test suite with this patch as it should have failed.

tiran · 2013-10-10T17:54:05Z

Is the content of the bootstrap module used after the interpreter is boot strapped? I see ... that's a problem. It's a proof of concept anyway and the speed up is minimal. On my computer with a SSD the speedup barely measurable. I'd like to see if it makes a difference on a Raspbarry Pi or a NFS shares

I have another idea, too. Could we add an optional 'stat' argument to __init__() of FileLoader and ExtensionFileLoader so we can pass the stat object around and reuse it for loading?

brettcannon · 2013-10-10T20:34:11Z

importlib/_bootstrap.py is importlib, period, so there is no separation of what is used to start Python and what is used after interpreter startup is completed.

As for adding a 'stat' argument to the loaders, it's possible but as always it comes down to whether it will break someone or not. Since loaders do not necessarily execute immediately you are running the risk of a very stale cached stat object. Plus Eric Snow has his PEP where the API in terms of loader __init__ signature so you would want to look into that.

ericsnowcurrently · 2013-10-10T22:11:25Z

With ModuleSpec (PEP-451), the finder creates the spec object (where it stores the loader). At that point the finder is free to store any stat object you like in spec.loader_state. The spec is made available to the loader during exec (if the loader supports it, which the importlib loaders will). So there is no need to add anything to any loader __init__.

The only catch is the slim possibility that the stat object will be stale by the time it gets used. I seem to remember a case where something like this happened (related to distros building their system Python or something).

ericsnowcurrently · 2013-10-11T00:53:45Z

For interpreter startup, stats are not involved for builtin and frozen modules[1]. They are tied to imports that involve traversing sys.path (a.k.a. PathFinder). Most stats happen in FileFinder.find_loader. The remainder are for source (.py) files (a.k.a. SourceFileLoader).

Here's a rough sketch of what typically happens currently during the import of a path-based module[2], as related to stats (and other FS access):

(lines with FS access start with *)

def load_module(fullname):
    suffixes = ['.cpython-34m.so', '.abi3.so', '.so', '.py', '.pyc']
    tailname = fullname.rpartition('.')[2]
    for entry in sys.path:
*       mtime = os.stat(entry).st_mtime
        if mtime != cached_mtime:
*           cached_listdir = os.listdir(entry)
        if tailname in cached_listdir:
            basename = entry/tailname
*           if os.stat(basename).st_mode implies directory:  # superfluous?
                # package?
                for suffix in suffixes:
                    full_path = basename + suffix
*                   if os.stat(full_path).st_mode implies file:
                        if is_extension:
*                           <dlopen>(full_path)
                        elif is_sourceless:
*                           open(full_path).read()
                        else:
                            load_from_source(full_path)
                        return
        # ...non-package module?
        for suffix in suffixes:
            full_path = entry/tailname + suffix
            if tailname + suffix in cached_listdir:
*               if os.stat(full_path).st_mode implies file:  # superfluous?
                    if is_extension:
*                       <dlopen>(full_path)
                    elif is_sourceless:
*                       open(full_path).read()
                    else:
                        load_from_source(full_path)

def load_from_source(sourcepath):
*   st = os.stat(sourcepath)
    if st:
*       open(bytecodepath).read()
    else:
*       open(sourcepath).read()
*       os.stat(sourcepath).st_mode
        for parent in ancestor_dirs(sourcepath):
*           os.stat(parent).st_mode  ->  missing_parents
        for parent in missing_parents:
*           os.mkdir(parent)
*       open(tempname).write()
*       os.replace(tempname, bytecodepath)

Obviously there are some unix-isms in there. Windows ends up not that different though.

stat/FS count
-------------

load_module (per path entry):
(add 1 listdir to each if the cache is stale)
not found: 1 stat
non-package dir: 7 (num_suffixes + 2 stats)

package (best): 4/5-9+ (3 stats, 1 read or load_from_source)
package (worst): 8/9-13+ (num_suffixes + 2 stats, 1 read or load_from_source)
non-package module 3/4-8+ (best): (2 stats, 1 read or load_from_source)
non-package module 7/8-12+ (worst): (num_suffixes + 1 stats, 1 read or load_from_source)
non-package module + dir (best): 10/11-15+ (num_suffixes + 4 stats, 1 read or load_from_source)
non-package module + dir (best): 14/15-19+ (num_suffixes * 2 + 3 stats, 1 read or load_from_source)

load_from_source:
cached: 2 (1 stat, 1 read)
uncached, no parents: 4 (2 stats, 1 write, 1 replace)
uncached, no missing parents: 5+ (num_parents + 2 stats, 1 write, 1 replace)
uncached, missing parents: 6+ (num_parents + 2 stats, num_missing mkdirs, 1 write, 1 replace)

Highlights:

the common case is not fast (for the sake of the slight possibility that files may change between imports)--not as much an issue during interpreter startup.
up to 5 different suffixes with a separate stat for each (with extension module suffixes tried first).
the size and ordering of sys.path has a decided impact on # stats.
if a module is cached, a lot less FS access happens.
the more nested a module, the more access happen.
namespace packages don't have much impact on performance.

Possible improvements:

provide an internal mechanism to turn on/off caching all stats (don't worry about staleness) and maybe expose it via a context manager/API. (not unlike what Christian put in his patch.)
at least do some temporally local caching where the risk of staleness is particularly small.
Move .py ahead of extension modules (or just behind .cpython-34m.so)?
non-packages are more common than packages (?) so look for those first (hard to make effective without breaking key import semantics).
remove 2 possibly superfluous stats?

[1] Maybe we should freeze the stdlib. <0.5 wink>
[2] importing a module usually involves importing the module's parent and its parent and so forth. Each of those incurs the same stat hits all over again (though usually packages have only 1 path entry to traverse). The stdlib is pretty flat (particularly among modules involved during startup) so this is less of an issue for this ticket.

brettcannon · 2013-10-11T01:02:39Z

So the 2 stat calls in the general case are superfluous, it's just a question of whether they make any performance difference. Turns out that at least on my Macbook their is no performance difference and thus not worth the cost of breaking semantics over it: http://bugs.python.org/issue18810 .

As for completely turning off stat calls during interpreter startup, that would definitely buy us something, but the question is how much and how do we make it work reliably?

ericsnowcurrently · 2013-10-11T01:46:33Z

I realized those two stats are not superfluous in the case that a directory name has a .py suffix or a file doesn't have any suffix. However, I expect that's pretty uncommon.

Worst case, these cases cost 2 stats per path entry. In practice they cost nothing due to the dir caching we already do.

ericsnowcurrently · 2013-10-11T01:54:58Z

I forgot to mention that optimizing the default composition of sys.path (from site) could help speed things up, though it might already be optimized in that regard.

I also forgot to mention the idea of zipping up the stdlib.

Sorry for the sidetrack. Now, back to the stat discussion...

pitrou · 2013-10-19T21:01:07Z

The real problem here is that the definition of "bootstrap" or "startup" is fuzzy. How do you decide when you stop caching?
The only workable approach IMO is to adopt a time-based heuristic, which I did in bpo-14067.

ericsnowcurrently · 2013-10-19T21:12:26Z

Would it be feasible to have an explicit (but private?) flag in importlib indicating stat checking (or even all FS checking) should be disabled, defaulting to True? runpy could set it to False after initializing importlib and then back to True when startup is done.

If that was useful for more than just startup, we could also add a contextmanager for it in importlib.util.

pitrou · 2013-10-19T21:14:42Z

Would it be feasible to have an explicit (but private?) flag in
importlib indicating stat checking (or even all FS checking) should be
disabled, defaulting to True? runpy could set it to False after
initializing importlib and then back to True when startup is done.

I don't really understand the algorithm you're proposing. Also, have you
read what I've just posted?

ericsnowcurrently · 2013-10-19T21:37:26Z

I don't really understand the algorithm you're proposing.

In importlib._bootstrap:

We have some global like "_CHECK_STAT=True". FileFinder would use it to decide on using stat checks or not.

In Python/pythonrun.c:

At the end of import_init(), we set importlib._bootstrap _CHECK_STAT to False. Then at the end of _Py_InitializeEx_Private() we set it back to True.

(As an alternative, we could always not do stat checking for just the standard library)

Also, have you read what I've just posted?

About the fuzziness of when startup is finished? As implied above, I'd say at the end of Py_Initialize().

pitrou · 2013-10-19T21:45:34Z

> Also, have you read what I've just posted?

About the fuzziness of when startup is finished? As implied above,
I'd say at the end of Py_Initialize().

You only have imported a handful of modules by then. Real-world
applications will import many more afterwards.
Here's a little experiment (done with a system install of Python 2.7):

$ python -v -c pass 2>&1 | grep "^import" | wc -l
33
$ python -v `which hg` 2>&1 | grep "^import" | wc -l
117

Note that Mercurial has a lazy importer in order to improve startup
time, otherwise the number would be higher yet.

vstinner · 2019-05-15T03:13:54Z

The benefit of avoiding stat() calls seems to not be obvious to everybody. Moreover, importlib now implements a "path cache". I close the issue.

The most efficient solution is to pack all your modules and the Python stdlib into a ZIP file: everything is done in memory, no more filesystem access.

tiran assigned brettcannon Oct 10, 2013

tiran added the performance Performance or resource usage label Oct 10, 2013

brettcannon removed their assignment Oct 18, 2013

vstinner closed this as completed May 15, 2019

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

stat cache for import bootstrap #63415

stat cache for import bootstrap #63415

tiran commented Oct 10, 2013

tiran commented Oct 10, 2013

vstinner commented Oct 10, 2013

pitrou commented Oct 10, 2013

brettcannon commented Oct 10, 2013

tiran commented Oct 10, 2013

brettcannon commented Oct 10, 2013

ericsnowcurrently commented Oct 10, 2013

ericsnowcurrently commented Oct 11, 2013

brettcannon commented Oct 11, 2013

ericsnowcurrently commented Oct 11, 2013

ericsnowcurrently commented Oct 11, 2013

pitrou commented Oct 19, 2013

ericsnowcurrently commented Oct 19, 2013

pitrou commented Oct 19, 2013

ericsnowcurrently commented Oct 19, 2013

pitrou commented Oct 19, 2013

vstinner commented May 15, 2019

stat cache for import bootstrap #63415

stat cache for import bootstrap #63415

Comments

tiran commented Oct 10, 2013

tiran commented Oct 10, 2013

vstinner commented Oct 10, 2013

pitrou commented Oct 10, 2013

brettcannon commented Oct 10, 2013

tiran commented Oct 10, 2013

brettcannon commented Oct 10, 2013

ericsnowcurrently commented Oct 10, 2013

ericsnowcurrently commented Oct 11, 2013

brettcannon commented Oct 11, 2013

ericsnowcurrently commented Oct 11, 2013

ericsnowcurrently commented Oct 11, 2013

pitrou commented Oct 19, 2013

ericsnowcurrently commented Oct 19, 2013

pitrou commented Oct 19, 2013

ericsnowcurrently commented Oct 19, 2013

pitrou commented Oct 19, 2013

vstinner commented May 15, 2019