Defer heavy imports for faster package load times#762
Conversation
elenya-grant
left a comment
There was a problem hiding this comment.
Thanks for working on this! I think these changes are good!
| the same for each technology. This includes site information, project parameters, | ||
| control strategy, and finance parameters. | ||
| """ | ||
| import openmdao.api as om |
There was a problem hiding this comment.
does the om import really slow stuff down? If its used in multiple methods then I kind of think it should be left at the top of the file instead of in each method.
There was a problem hiding this comment.
It doesn't really; I was trying out a few different things and this was leftover. I removed all of these and put the import at the top, thank you!
| return new | ||
|
|
||
|
|
||
| supported_models = _ModelRegistry( |
There was a problem hiding this comment.
I think this is slick! I'll be bummed that we won't be able to right click on a package object and have it open the file in VSCode anymore - but I will survive! If this reduces the import time then I'm down with it!
There was a problem hiding this comment.
I agree this obfuscates things a little bit, but I do think the reduced import time is worthwhile. This might also point us towards having more standardized naming conventions for files so you can easily find them in VSCode, e.g. TidalResource could be defined in tidal_resource.py instead of the (as-is) tidal.py, and maintain this same naming throughout H2I.
Do you think that would help quickly find what you're looking for? What other solutions are possible to help here?
| return default | ||
|
|
||
| def values(self): | ||
| return [self[k] for k in super().keys()] |
There was a problem hiding this comment.
I'm not sure I understand why we wouldn't return list(super().values()) or not even override values in the first place. Could you provide a bit more context on this decision making?
There was a problem hiding this comment.
I'm so glad we have PR reviews because this is firmly not needed, good catch! I've removed the values and items overriding.
| return [self[k] for k in super().keys()] | ||
|
|
||
| def items(self): | ||
| return [(k, self[k]) for k in super().keys()] |
There was a problem hiding this comment.
Same general comment as the one I left for values.
RHammond2
left a comment
There was a problem hiding this comment.
I like the simplicity of this approach, and the massive improvements in import speedup. This should actually help a bit with #742 because of the sheer number of tests that import the model, which is a really nice secondary impact.
I have a couple of specific questions before I hit the approve button, and a couple of non-blocking general que
RHammond2
left a comment
There was a problem hiding this comment.
I really like the simplicity of this approach and the all the added speedups. This should actually help a reasonable amount with #742 because of how many times the model is imported throughout the tests.
There are a couple spots in _ModelRegistry where some docstrings would be appreciated, a couple questions about the reimplementation of dict methods, and a non-blocking general question about import structures.
Thanks, Rob! This is ready for re-review, I've implemented some good changes based on your comments. |
Co-authored-by: Rob Hammond <13874373+RHammond2@users.noreply.github.com>
Defer heavy imports for faster package load times
Dramatically reduces
import h2integrateandH2IntegrateModelimport times by deferring heavy transitive dependencies (OpenMDAO, HOPP, PySAM, pyomo, scipy, matplotlib, jsonschema, etc.) to the point of first use rather than loading them at module scope.Old timing:
New timing:
No behavioral changes - all deferred imports are resolved on first use and cached by Python's import system.
Section 1: Type of Contribution
Section 2: Draft PR Checklist
TODO:
h2integrate_model.pysupported_models.pyType of Reviewer Feedback Requested (on Draft PR)
Structural feedback: Is the
_ModelRegistrydict subclass approach acceptable, or would you prefer a different loading pattern?Implementation feedback: Are there any methods in
h2integrate_model.pywhere the localimport openmdao.api as ompattern feels problematic?Other feedback: N/A
Section 3: General PR Checklist
docs/files are up-to-date, or added when necessaryCHANGELOG.md"A complete thought. [PR XYZ]((https://github.com/NatLabRockies/H2Integrate/pull/XYZ)", where
XYZshould be replaced with the actual number.Section 4: Related Issues
N/A
Section 5: Impacted Areas of the Software
Section 5.1: New Files
N/A
Section 5.2: Modified Files
h2integrate/__init__.py__getattr__lazy loader to avoid triggering the full dependency tree onimport h2integrate.h2integrate/core/supported_models.py_ModelRegistry; values don't include theh2integrate.prefix (prepended automatically in_resolve).h2integrate/core/h2integrate_model.pyimport openmdao.api as omand several other heavy imports from module-level to local method-level imports.pyproject.toml*/supported_models.pyto[tool.ruff.lint.per-file-ignores]for E501.Section 6: Additional Supporting Information
The primary bottleneck was
supported_models.py, which imported all ~120 model classes and their transitive dependencies (HOPP, PySAM, pyomo, scipy, matplotlib, etc.) at module scope. The_ModelRegistrysubclass stores import paths as strings and resolves them on first dict access, so only the models actually used in a given run are imported. Python's import caching means each module is only loaded once regardless of how many methods contain local import statements.