Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework list of required dependencies #287

Merged
merged 6 commits into from
Dec 18, 2021
Merged

Rework list of required dependencies #287

merged 6 commits into from
Dec 18, 2021

Conversation

adamjstewart
Copy link
Collaborator

With this PR, I would like to introduce a new definition of required dependencies: anything that is required to import any part of the library. Any optional dependencies, such as those listed under the [options.extras_require] section of setup.cfg must be guarded by try-excepts to notify the user that a certain library is missing and is needed for the feature they are trying to use. Many package managers, including Spack and Conda, will try to import sections of the library to ensure that things were installed correctly. These tests shouldn't fail because optional dependencies are missing. To enforce this, I've added a test that doesn't install the optional dataset dependencies to ensure that the tests still pass.

Closes #72
Related to #230, #249

@adamjstewart adamjstewart added the testing Continuous integration testing label Dec 15, 2021
- name: Run mypy checks
run: mypy .
datasets:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Identical to our other tests but without the optional dataset dependencies to ensure that we are skipping them properly.

@@ -37,6 +37,8 @@ install_requires =
kornia>=0.5.4
matplotlib
numpy
# omegaconf 2.1+ required for to_object method
omegaconf>=2.1
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically omegaconf isn't used in torchgeo proper, only in train.py/benchmark.py and the unit tests, but it seemed weird to keep a train list of optional deps with only one dep...

@@ -57,6 +56,7 @@ def mocked_import(name: str, *args: Any, **kwargs: Any) -> Any:
)

def test_getitem(self, dataset: ADVANCE) -> None:
pytest.importorskip("scipy", minversion="0.9.0")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scipy is only needed for __getitem__, it isn't needed for any other part of the dataset.

@@ -19,6 +19,9 @@
import torchgeo.datasets.utils
from torchgeo.datasets import IDTReeS

pytest.importorskip("pandas", minversion="0.19.1")
pytest.importorskip("laspy", minversion="2")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the tests that don't use the dataset fixture still use IDTReeS and will fail if these deps aren't installed.

@adamjstewart adamjstewart added this to the 0.1.1 milestone Dec 15, 2021
@adamjstewart
Copy link
Collaborator Author

My only other complaint about our imports is that things are currently very slow. Because torchgeo/__init__.py imports things from torchgeo/datasets/__init__.py and torchgeo/trainers/__init__.py, and since these import every file in the library and all of our required deps, it can take close to 10 seconds just to import anything from TorchGeo:

$ time python -c 'from torchgeo.datasets import VHR10'

real	0m8.428s
user	0m5.748s
sys	0m1.856s

If we didn't use import aliases, this would be significantly faster. It's only really an issue when you want to test stuff on the command-line since scripts will take a long time to run regardless.

@isaaccorley
Copy link
Collaborator

I think right now most users are here for the datasets/samplers. Could we not import the trainers/models automatically? I'm wondering how much of a difference just not importing those makes.

@isaaccorley isaaccorley self-requested a review December 17, 2021 22:21
Copy link
Collaborator

@isaaccorley isaaccorley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@adamjstewart
Copy link
Collaborator Author

I'm wondering how much of a difference just not importing those makes.

Currently, because we import datasets/trainers in torchgeo/__init__.py:

$ time python -c 'from torchgeo.datasets.nwpu import VHR10'

real	0m7.549s
user	0m5.553s
sys	0m1.784s

If we remove all imports in torchgeo/__init__.py:

$ time python -c 'from torchgeo.datasets.nwpu import VHR10'

real	0m5.452s
user	0m3.975s
sys	0m1.451s

If we also remove all imports in torchgeo/datasets/__init__.py:

$ time python -c 'from torchgeo.datasets.nwpu import VHR10'

real	0m3.302s
user	0m2.377s
sys	0m0.909s

@adamjstewart
Copy link
Collaborator Author

adamjstewart commented Dec 17, 2021

The imports in torchgeo/__init__.py could be moved to speed things up, but the imports in torchgeo/*/__init__.py are needed if we want to have the same API as torchvision. Note that some of the imports in torchgeo/__init__.py may be necessary to avoid circular import issues: #276

@calebrob6 calebrob6 merged commit be36c1e into main Dec 18, 2021
@calebrob6 calebrob6 deleted the tests/dependencies branch December 18, 2021 00:28
adamjstewart added a commit that referenced this pull request Dec 19, 2021
* Rework list of required dependencies

* Update open3d import error msg

* Style fixes

* Remove extra empty line

* Increase test coverage

* Fix idtrees tests
yichiac pushed a commit to yichiac/torchgeo that referenced this pull request Apr 29, 2023
* Rework list of required dependencies

* Update open3d import error msg

* Style fixes

* Remove extra empty line

* Increase test coverage

* Fix idtrees tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
testing Continuous integration testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Required dependencies, lazy imports
3 participants