Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce import time #4649

Merged
merged 4 commits into from
Mar 8, 2016
Merged

Reduce import time #4649

merged 4 commits into from
Mar 8, 2016

Conversation

saimn
Copy link
Contributor

@saimn saimn commented Feb 29, 2016

Ref #4598

  • Avoid to import utils.console in the logger setup. The import will be done when the first logging message is printed.
  • I also moved the units import in utils.console to a local one. With the previous change it is useless, but I think it may still be useful on some cases or for command-line scripts (when Spinner or ProgressBar are used).

With these changes, the import time goes down from ~480ms to ~250ms on my laptop. The remaining time is mostly in developer specific stuff (subprocess call to get the git hash for the version string), so the import time will be even better for released package.

utils.console is responsible for almost half of the astropy import time (it
imports IPython and astropy.units). It is used in the logger module to colorize
the logging message, so here this import is moved to a local import.
As units import takes some time and is used only for one specific case in
`human_file_size`, it can be moved to a local import to avoid importing units
when it is not useful.
@eteq eteq added this to the v1.2.0 milestone Feb 29, 2016
@eteq
Copy link
Member

eteq commented Feb 29, 2016

Yep, I confirm the roughly ~2x improvement in import time. Nice find, @saimn!

Two suggestions:

  • Can you add a comment above both of the imports to indicate that they are there specifically as a performance measure? That will prevent someone from moving them back given that the usual import style we prefer is most-stuff-at-the-top.
  • Add a changelog entry, probably under "Other Changes" indicating that the import astropy time has been improved by a factor of 2x.

There's a bit of a question what version this goes in. I tend toward 1.2 because it's not specifically a bug/docs fix. But I'm open to other opinions.

@pllim
Copy link
Member

pllim commented Feb 29, 2016

I was going to ask @MSeifert04 to give this a try when the tests pass but the tests are taking forever.

@MSeifert04
Copy link
Contributor

It's almost 3x faster doing import astropy with these changes (but no improvement when doing e.g. from astropy.nddata import NDData but there units are imported so I guess that's ok).

@saimn
Copy link
Contributor Author

saimn commented Feb 29, 2016

@eteq - Done, 1.2 seems fine.

@MSeifert04 - Yes, it will have only a minor effect for subpackages. For nddata, a quick look seems to show that half of the import time comes from the import of coordinates, which is used only in Cutout2D.

eteq added a commit that referenced this pull request Mar 8, 2016
Reduce "import astropy" time
@eteq eteq merged commit 17247ea into astropy:master Mar 8, 2016
@eteq
Copy link
Member

eteq commented Mar 8, 2016

Alright, merged, thanks @saimn.

@MSeifert04 @saimn - yes, a lot of the long import effects come from the underlying situation that importing everything is always slower than importing some things.

I think that there's a balance to be considered here - there's often a clarity/performance tradeoff when it comes to imports: it may be clearer to put them all at the top at the cost of a bit of performance. This particular one is no-brainer because import astropy is fairly generic, but I'm not so sure we would want to move e.g. the nddata imports, as a lot of use cases for nddata might involve importing coordinates anyway. So in case you're tempted to go hunting for these kinds of optimizations, consider that they we may not want to always do them. Fair warning ;)

@saimn saimn deleted the import-time branch March 9, 2016 08:16
@saimn
Copy link
Contributor Author

saimn commented Mar 9, 2016

@eteq - I agree, this kind of optimization should be reserved for specific cases, e.g. when a module is imported once for an optional feature. It seems to be the case currently for coordinates inside nddata, at least currently, as Cutout is not used in the nddata classes. Actually Cutout and the other stuff in nddata.utils is not used anywhere in nddata, so maybe the whole module could be somewhere else or not imported in nddata, but it may be difficult to change this now. Anyway, @MSeifert04 and others know better that me about nddata and what is planned for this package, so it is just a suggestion ;).

@mwcraig mwcraig added the nddata label Mar 9, 2016
@mwcraig
Copy link
Member

mwcraig commented Mar 9, 2016

Labelled as nddata in the hopes we remember to look at it in a couple weeks...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants