Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Making ObsPy load faster #2748

Closed
ThomasLecocq opened this issue Nov 11, 2020 · 11 comments · Fixed by #2758
Closed

Making ObsPy load faster #2748

ThomasLecocq opened this issue Nov 11, 2020 · 11 comments · Fixed by #2758
Labels
enhancement feature request
Milestone

Comments

@ThomasLecocq
Copy link
Contributor

This has been already discussed in the past, but for another project, I was checking solutions to check import-times.
From this I found out about tuna which allows to graphically see the import stacks/dependencies.

Running it on obspy : python -X importtime -c "import obspy" 2> obspy.log then, tuna obspy.log :

image
shows we have two major culpirts: UTCDateTime, and more surpisingly, obspy.geodetics which seems to have its load time strongly affected by loading a whole scipy.stats stack.

This comes from a single method required by obspy.geodetics.base.mean_longitude which calls scipy.stats.circmean. Moving the import inside the function, and re-running tuna:
image
We win more than 1 second in load time !

Continuing with UTCDateTime, looks like obspy.core.util.deprecation_helpers is responsible alone for 1.1 second (>33%) of the remaining load time:
image

I don't really understand the reason of that big stack, looks even a bit "circular" at some point... Anyway, removing the import of the from obspy.core.util.deprecation_helpers import ObsPyDeprecationWarning from obspy.core.utcdatetime (I know, that would break the code if the Deprecation needs to be raised, but wanted to test):
image

we win > 0.6s or so...

Remaining "heavy" (long) imports are Attribdict, then I guess some of the pkg_resources scanning the entry points etc...

What do you think? The first change (importing scipy.stats only if needed) shouldn't harm.

ps: running the importtime on "from obspy import read" gives +- the same timings.

@ThomasLecocq ThomasLecocq added the enhancement feature request label Nov 11, 2020
@krischer
Copy link
Member

👍 For moving a lot of imports into the respective functions. It seems to me like a large portion of the time is spent importing numpy, scipy, pyproj, requests, and matplotlib. The initial numpy import we'll probably have to just take but the others could likely be moved elsewhere, at least to the point where they are no longer imported by calling import obspy.

Also the plug-in resolution in the pkg_resources is quite expensive but, without a heavy refactoring, this will probably have to stay, at least for now.

@megies
Copy link
Member

megies commented Nov 11, 2020

Absolutely 👍 for the first change.

I doubt we can get around the imports of core.util though, it's used in many core submodules.

@ThomasLecocq
Copy link
Contributor Author

ThomasLecocq commented Nov 11, 2020

I don't really understand why the Deprecation stuff is so heavy, but am really not familiar with it...

this is calling it directly: python -X importtime -c "import obspy.core.util.deprecation_helpers" 2> depr.log

image

but this file is essentially empty...

class ObsPyDeprecationWarning(UserWarning):
    """
    Make a custom deprecation warning as deprecation warnings or hidden by
    default since Python 2.7 and 3.2 and we really want users to notice these.
    """
    pass

so why does calling it, actually calls what's under it ? in the module-listing ? obspy.core.util of course is heavy, but not required to run obspy.core.util.deprecation_helpers

@ThomasLecocq
Copy link
Contributor Author

is it because of obspy.core.util.__init__ ?

@megies
Copy link
Member

megies commented Nov 11, 2020

That import time is all about util in general or am I missing something? And like I said, I don't think there is a way to avoid importing that.

@ThomasLecocq
Copy link
Contributor Author

ThomasLecocq commented Nov 11, 2020

The thing is: for importing the empty Deprecation class, the whole stack is loaded:

- loading obspy.core.util.deprecation_helpers
- loading obspy.core.util.__init__ (and everything it imports)
- loading obspy.core.__init__ (and everything it imports)
- loading obspy.__init__ (and everything it imports)

that's what I don't really get...

edit:
https://stackoverflow.com/questions/21298833/how-to-only-import-sub-module-without-exec-init-py-in-the-package

ok, better... but still, there should be a way to avoid stupid heavy stuff

@ThomasLecocq
Copy link
Contributor Author

ThomasLecocq commented Nov 11, 2020

working on smaller other changes...

requests:

  • removing the check for the requests version in obspy.__init__ -1.2 s
  • moving import requests to the function in obspy.core.util.base : -0.4 s

scipy-related:

  • moving import scipy.interpolate to the right function in core.inventory.response: - 0.4 s

so in the end: only scipy.stats + the two here: from 4.3 to 2.3 s load time for from osbpy.core import read

@ThomasLecocq
Copy link
Contributor Author

do we really need from obspy.scripts.runtests import run_tests # NOQA in obspy.core.__init__.py ?

@ThomasLecocq
Copy link
Contributor Author

Index: obspy/__init__.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- obspy/__init__.py	(revision a8750f46132d937daec3a41ed6d5f38478a4ecc0)
+++ obspy/__init__.py	(date 1605101220991)
@@ -27,7 +27,7 @@
     GNU Lesser General Public License, Version 3
     (https://www.gnu.org/copyleft/lesser.html)
 """
-import requests
+# import requests
 import sys
 import warnings
 
@@ -60,11 +60,11 @@
 _add_format_plugin_table(Inventory.write, "inventory", "write", numspaces=8)
 
 
-if requests.__version__ in ('2.12.0', '2.12.1', '2.12.2'):
-    msg = ("ObsPy has some known issues with 'requests' version {} (see "
-           "github issue #1599). Please consider updating module 'requests' "
-           "to a newer version.").format(requests.__version__)
-    warnings.warn(msg)
+# if requests.__version__ in ('2.12.0', '2.12.1', '2.12.2'):
+#     msg = ("ObsPy has some known issues with 'requests' version {} (see "
+#            "github issue #1599). Please consider updating module 'requests' "
+#            "to a newer version.").format(requests.__version__)
+#     warnings.warn(msg)
 
 if int(sys.version[0]) < 3:
     raise ImportError("""You are running ObsPy >= 2.0 on Python 2
Index: obspy/core/inventory/response.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- obspy/core/inventory/response.py	(revision a8750f46132d937daec3a41ed6d5f38478a4ecc0)
+++ obspy/core/inventory/response.py	(date 1605101509430)
@@ -18,7 +18,6 @@
 import warnings
 
 import numpy as np
-import scipy.interpolate
 
 from .. import compatibility
 from obspy.core.util.base import ComparingObject
@@ -1110,6 +1109,7 @@
         :rtype: :tuple: ( :class:`numpy.ndarray`, chan )
         :returns: frequency response at requested frequencies
         """
+        import scipy.interpolate
         if not self.response_stages:
             msg = ("Can not use evalresp on response with no response "
                    "stages.")
Index: obspy/core/util/base.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- obspy/core/util/base.py	(revision a8750f46132d937daec3a41ed6d5f38478a4ecc0)
+++ obspy/core/util/base.py	(date 1605102003602)
@@ -23,7 +23,7 @@
 
 import numpy as np
 import pkg_resources
-import requests
+
 from pkg_resources import get_entry_info, iter_entry_points
 
 from obspy.core.util.misc import to_int_or_zero, buffered_load_entry_point
@@ -620,6 +620,7 @@
     :param chunk_size: The chunk size in bytes.
     :type chunk_size: int
     """
+    import requests
     # Workaround for old request versions.
     try:
         r = requests.get(url, stream=True)
Index: obspy/geodetics/base.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- obspy/geodetics/base.py	(revision a8750f46132d937daec3a41ed6d5f38478a4ecc0)
+++ obspy/geodetics/base.py	(date 1605093687418)
@@ -12,7 +12,6 @@
 import warnings
 
 import numpy as np
-from scipy.stats import circmean
 
 from obspy.core.util.misc import to_int_or_zero
 
@@ -389,6 +388,7 @@
     :param longitudes: Geographical longitude values ranging from -180 to 180
         in degrees.
     """
+    from scipy.stats import circmean
     mean_longitude = circmean(np.array(longitudes), low=-180, high=180)
     mean_longitude = _normalize_longitude(mean_longitude)
     return mean_longitude
Index: obspy/core/__init__.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- obspy/core/__init__.py	(revision a8750f46132d937daec3a41ed6d5f38478a4ecc0)
+++ obspy/core/__init__.py	(date 1605102300372)
@@ -121,7 +121,7 @@
 from obspy.core.util.attribdict import AttribDict  # NOQA
 from obspy.core.trace import Stats, Trace  # NOQA
 from obspy.core.stream import Stream, read  # NOQA
-from obspy.scripts.runtests import run_tests  # NOQA
+# from obspy.scripts.runtests import run_tests  # NOQA
 
 
 if __name__ == '__main__':

@megies
Copy link
Member

megies commented Nov 12, 2020

👍 on all the changes

The only one that hurts a little is getting rid of the requests warning message. But I guess given the fact that the problem is several years old, we probably can safely assume that people don't setup new environments with the old problematic requests version or have seen the warning message in existing, old environments. 🤷‍♂️

@megies
Copy link
Member

megies commented Nov 12, 2020

Uhm.. if you intend to target master branch, then the requests thing is obsolete anyway, since it was about future and six interacting in a bad way.

@megies megies mentioned this issue Nov 27, 2020
9 tasks
@trichter trichter added this to the 1.3.0 milestone Feb 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement feature request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants