import time #11546

Closed
nschloe opened this issue Jul 1, 2018 · 6 comments

@nschloe
Contributor

nschloe commented Jul 1, 2018

Here's an import profile created with Python 3.7's importtime and displayed with tuna:

python3.7 -X importtime -c "import matplotlib" 2> matplotlib.log
tuna matplotlib.log

Looks like lazy-loading urllib.request would pay off. Likewise for pyparsing.

[Image: tuna visualization of the matplotlib import profile]
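The same log can also be summarized without tuna. The following is a minimal sketch, assuming the log uses CPython's `import time: self [us] | cumulative | imported package` line layout; the function name `slowest_imports` is made up for illustration and is not part of tuna or matplotlib.

```python
import re

# Assumed layout of one -X importtime record, e.g.:
#   "import time:       123 |        456 |   json.decoder"
# The header line has no digits in the first column, so it does not match.
_RECORD = re.compile(r"^import time:\s+(\d+)\s+\|\s+(\d+)\s+\|\s+(\S+)\s*$")


def slowest_imports(log_text, top=10):
    """Return up to `top` (module, self_us, cumulative_us) tuples,
    sorted by cumulative time, largest first."""
    rows = []
    for line in log_text.splitlines():
        match = _RECORD.match(line)
        if match:
            self_us, cumulative_us, name = match.groups()
            rows.append((name, int(self_us), int(cumulative_us)))
    return sorted(rows, key=lambda row: row[2], reverse=True)[:top]
```

For the log above, this would be called as `slowest_imports(open("matplotlib.log").read())`.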

@timhoffm
Member

timhoffm commented Jul 1, 2018

Thanks for the analysis! It's really nice to see how the imports run.

The urllib.request import could probably be inlined. Apart from tests, it's only used in _open_file_or_url().
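Inlining here means moving the import into the function body so it only runs when a URL is actually passed. A minimal sketch, using a hypothetical helper `open_text_source` rather than matplotlib's actual `_open_file_or_url()`, whose body is not shown in this thread:

```python
def open_text_source(fname):
    """Hypothetical helper: open a local path or a URL for reading."""
    url_prefixes = ("http://", "https://", "ftp://", "file://")
    if isinstance(fname, str) and fname.startswith(url_prefixes):
        # Deferred import: urllib.request is loaded on first URL use,
        # not when the enclosing module is imported.
        import urllib.request
        return urllib.request.urlopen(fname)
    # Plain local file: no urllib import is triggered on this path.
    return open(fname, encoding="utf-8")
```

The import statement is cheap after the first call because Python caches loaded modules in `sys.modules`.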

pyparsing is a bit more difficult. It's used in FontconfigPatternParser.__init__(), which is executed at startup and is used for validating rc settings. So lazy-loading does not work because we need it at startup.

Generally, I'm not sure if we want to move imports to local contexts just to scrape a few milliseconds off importing matplotlib. That's for someone else to decide, @tacaswell.

@anntzer
Contributor

anntzer commented Jul 3, 2018

Note that support for directly loading from urls was added in #4256 with the following motivation:

I suppose I also should have commented on the motivation here. In Python 2, you could do something like the following to download an image from the internet:
plt.imread(urllib2.urlopen(url))
I noticed that for some reason with my Linux-based Python installations, the following call signature would break in Python 3:
plt.imread(urllib.request.urlopen(url))
However, the following does work in Python 3:
plt.imread(io.BytesIO(urllib.request.urlopen(url).read()))
It seems like rather than having the users tinker with trying to get the proper call signature for their system and python version, it might be more straightforward to implement the logic in imread to handle URLs directly.

which simply doesn't apply anymore today. I would thus like to suggest at least considering deprecating this functionality.

Also, a simple inlining of the import did not clearly yield measurable improvements to import time in a quick experiment, so some measurements (not necessarily in depth) actually establishing the improvement post-inlining would be nice.

@timhoffm
Member

timhoffm commented Jul 3, 2018

There are actually two urllib.request uses:

  • The one in matplotlib._open_file_or_url() I mentioned above.
  • The one in matplotlib.image from @anntzer

Removing or inlining just one of them does not give a performance benefit.

@anntzer
Contributor

anntzer commented Jul 3, 2018

I did try inlining both.

Edit: Added "Needs confirmation" label as I simply cannot reproduce these numbers (in particular pyparsing imports very quickly for me, and I did try inlining the imports), but of course patches with performance numbers are still welcome.

@anntzer
Contributor

anntzer commented Sep 6, 2018

I played a bit with -Ximporttime and noticed that in certain cases, a module taking longer to import was actually due to the gc kicking in; thus, one should call gc.disable() before doing these benchmarks.
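A rough way to check for this effect is to time the import in a fresh interpreter with and without the garbage collector. This is only a sketch under stated assumptions: the helper name `time_import` is made up, it measures wall-clock time for a single run, and it is not how the numbers in this thread were produced.

```python
import subprocess
import sys


def time_import(module, disable_gc=True):
    """Return the seconds taken to import `module` in a fresh interpreter."""
    snippet = (
        "import gc, time\n"
        + ("gc.disable()\n" if disable_gc else "")
        + "start = time.perf_counter()\n"
        + f"import {module}\n"
        + "print(time.perf_counter() - start)\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True, text=True, check=True,
    )
    # Use the last line in case the imported module prints at import time.
    return float(result.stdout.strip().splitlines()[-1])
```

Comparing `time_import("matplotlib")` with `time_import("matplotlib", disable_gc=False)` over several runs gives an idea of how much of the difference is gc noise.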

@jklymak
Member

jklymak commented Feb 9, 2019

I don't see an action on this, and it looks ephemeral, so I'm closing....
