Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Imports are slower than with CPython #2854

Closed
pitrou opened this issue Feb 20, 2019 · 13 comments
Closed

Imports are slower than with CPython #2854

pitrou opened this issue Feb 20, 2019 · 13 comments

Comments

@pitrou
Copy link
Contributor

pitrou commented Feb 20, 2019

If I compile the following functions using Cython:

def _noop_bench():
    pass

def _import_bench():
    import pyarrow.pandas_compat as pdcompat

And then measure their performance:

>>> %timeit lib._noop_bench()                                                                                                                                     
66.3 ns ± 0.379 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
>>> %timeit lib._import_bench()                                                                                                                                   
987 ns ± 21.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
>>> %timeit lib._import_bench()                                                                                                                                   
966 ns ± 17.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

The import overhead (~ 900 ns. per import statement) appears much slower than what CPython achieves:

>>> %timeit import pyarrow.pandas_compat as pdcompat                                                                                                              
253 ns ± 4.52 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
>>> %timeit import pyarrow.pandas_compat as pdcompat                                                                                                              
254 ns ± 3.66 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

This is with Python 3, Cython 0.29.4 and language_level = 3.

@scoder
Copy link
Contributor

scoder commented Feb 20, 2019

Hmm, interesting. My first thought was Py2-style imports, but you're using language_level=3, so that's not it.

Importing has never really been a priority for optimisation, more of a "just make it work across all Python versions" kind of task. But I can't see a reason why Cython should be slower than CPython here, let alone 4x as slow. Definitely worth investigating.

@scoder
Copy link
Contributor

scoder commented Feb 21, 2019

I investigated this. CPython imports the module by dotted name and then looks up attributes on the result until it found the final imported sub-sub-sub-…module. Cython (and probably Pyrex already) took the easy path of passing a non-empty name list into the import function (actually the list ['*']), which then takes that as a signal to return the imported submodule instead of the top-level module. That is much slower. I'll see what I can do about it.

scoder added a commit that referenced this issue Feb 22, 2019
…ally under Py3.7. Cython used to be 3x slower than CPython here.

Also enable absolute imports automatically in top-level (non-package) modules.
@scoder
Copy link
Contributor

scoder commented Feb 22, 2019

Ok, all not that easy, due to CPython's long-grown import C-API, which is entirely backwards. If I want to import a module, then please give me the module and don't leave me standing in the rain with the top-level module that I have to look up attributes on. :(

While many things can be worked around, it becomes tricky for circular and concurrent imports. If one module imports itself, then it might be in sys.modules already, but not yet in the package module, i.e. looking up pkg1.pkg2.module should work, but looking it up as attributed from pkg1.pkg2 -> pkg2.module might not. So, what I did for now, was to ignore the import lock-unlock cycle that CPython does in importlib (to allow other threads to finish their import), and just take the module if it's in sys.modules.

CPython:
$ python3.7 -m timeit 'import pyarrow.pandas_compat as pdcompat'
1 loop, best of 5: 592 nsec per loop
$ python3.6 -m timeit 'import pyarrow.pandas_compat as pdcompat'
10 loops, best of 3: 0.302 usec per loop
$ python3.5 -m timeit 'import pyarrow.pandas_compat as pdcompat'
10 loops, best of 3: 0.946 usec per loop

Cython before:
$ python3.7 -m timeit -s 'from importbench import _import_bench' '_import_bench()'
1 loop, best of 5: 1.37 usec per loop

Now:
$ python3.7 -m timeit -n 100000 -r 50 -s 'from importbench import _import_bench' '_import_bench()'
100000 loops, best of 50: 51.5 nsec per loop
$ python3.6 -m timeit -n 100000 -r 50 -s 'from importbench import _import_bench' '_import_bench()'
100000 loops, best of 50: 0.0424 usec per loop
$ python3.5 -m timeit -n 100000 -r 50 -s 'from importbench import _import_bench' '_import_bench()'
100000 loops, best of 50: 0.0454 usec per loop

@pitrou
Copy link
Contributor Author

pitrou commented Feb 22, 2019

Actually, there is already some code in CPython that you can copy to have a fast path while still guarding against uninitialized modules: see https://bugs.python.org/issue35943#msg335097

Great to see this advancing towards resolution :-)

@scoder scoder added this to the 3.0 milestone Feb 22, 2019
@pitrou
Copy link
Contributor Author

pitrou commented Feb 22, 2019

Of course, if you want to limit the reliance on private APIs, you could just defer to regular importing if the <module>.__spec__._initializing attribute is True.

@scoder
Copy link
Contributor

scoder commented Feb 22, 2019

Thanks, I had already found that code in Python/import.c. As you guessed, the problem is this bit: interp->importlib. We don't currently touch the interpreter state anywhere in Cython, and i would prefer to leave it that way. The C-API of importlib is a bit … limited.

I had already implemented the flag check, but then noticed that I would then also have to duplicate the whole "import importlib, call locking function" part, which is a lot of overhead. (Also, importing importlib in import code feels a bit … meta.)

// TODO: CPython guards against thread-concurrent imports in importlib,
// but importing and calling into importlib just for that rare case would
// be quite costly, so we skip it for now.
return module;
// -- Code copied from Py3.8-pre:
// PyObject *spec = __Pyx_PyObject_GetAttrStrNoError(module, PYIDENT("__spec__"));
// if (likely(spec)) {
// PyObject *unsafe = __Pyx_PyObject_GetAttrStrNoError(spec, PYIDENT("_initializing"));
// if (likely(!unsafe || !__Pyx_PyObject_IsTrue(unsafe))) {
// Py_DECREF(spec);
// spec = NULL;
// }
// Py_XDECREF(unsafe);
// }
// if (likely(!spec)) {
// PyErr_Clear();
// return module;
// }
// Py_DECREF(spec);
// // Need to let PyImport_ImportModuleLevelObject() handle the locking.
// Py_DECREF(module);

@pitrou
Copy link
Contributor Author

pitrou commented Feb 22, 2019

Well, the cost of calling into importlib will almost never be paid, so I'm not sure you should worry about that. Still, correctness is important IMHO.

As I wrote above, you needn't call importlib yourself, you can just fall back on the normal non-optimized path if the initializing attribute is True.

@scoder
Copy link
Contributor

scoder commented Feb 22, 2019

Well, "almost never" does not seem to take circular imports into account, which do not seem all that uncommon to me.

@pitrou
Copy link
Contributor Author

pitrou commented Feb 22, 2019

But you would only pay the price on the first import, not once the module is entirely loaded... The point of the optimization here is to make imports of already-loaded modules fast. Making fresh imports faster is not really achievable within Cython (unless you decide to break import semantics, that is ;-)).

@scoder
Copy link
Contributor

scoder commented Feb 22, 2019

As you can see from the code above, I've actually tried it. The problem then was to find a way how to make CPython use the right fallback. The import C-API is really horrible. Important functionality is scattered across different functions and entry points, some things are implemented in Python, and some are even in ceval.c.

I agree about the correctness and will give it another try.

@pitrou
Copy link
Contributor Author

pitrou commented Feb 22, 2019

Ah... well, I suppose you could do whatever you did before this optimization? :-)

scoder added a commit that referenced this issue Feb 22, 2019
@scoder
Copy link
Contributor

scoder commented Feb 22, 2019

So, the problem I had was really with circular imports. While resolving them, the module is already in sys.modules, but not yet available as an attribute of its package. Thus, the attribute lookup chain from the top-level module after the import fails. Long story short, I added another lookup in sys.modules right after the import to load the module from there. That's not 100% "correct" (at least, it differs from what CPython does), since the package attribute might be different from the module in sys.modules, but I'd argue that a setup where the module you get depends on the way you get at it is inherently broken anyway and thus not worth supporting.

@ncoghlan
Copy link
Contributor

@scoder Falling back to checking sys.modules is pretty close to what CPython does to handle circular imports - it's a last-gasp attempt in the import opcodes when looking up the module attribute throws AttributeError.

I agree on this being spectacularly convoluted though (and will comment further on that in Antoine's capi-sig thread).

scoder added a commit to scoder/cython that referenced this issue Mar 21, 2023
Since cython@4993ba6,
we returned the top-level package module instead of the module that was actually imported
with its dotted name ("collections" instead of "collections.abc").

Closes cython#5308
See cython#2854
scoder added a commit to scoder/cython that referenced this issue Mar 21, 2023
Since cython@4993ba6,
we returned the top-level package module instead of the module that was actually imported
with its dotted name ("collections" instead of "collections.abc").

Closes cython#5308
See cython#2854
scoder added a commit that referenced this issue Mar 22, 2023
GH-5329)

Since 4993ba6,
we returned the top-level package module instead of the module that was actually imported
with its dotted name ("collections" instead of "collections.abc").

Closes #5308
See #2854
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants