
Remove type ignores for PyTorch #460

Merged 18 commits into main on Mar 14, 2022

Conversation

adamjstewart
Collaborator

@adamjstewart adamjstewart commented Mar 12, 2022

PyTorch 1.11 added type hints for most of the library. This PR removes type ignores for the majority of PyTorch functions and fixes other random typing mistakes I found along the way.

Closes #266 (no longer needed)

@adamjstewart adamjstewart added this to the 0.2.1 milestone Mar 12, 2022
@github-actions github-actions bot added datamodules PyTorch Lightning datamodules datasets Geospatial or benchmark datasets losses Geospatial loss functions models Models and pretrained weights testing Continuous integration testing trainers PyTorch Lightning trainers transforms Data augmentation transforms labels Mar 12, 2022
@calebrob6
Member

what was the bug in ETCI2021?

Collaborator Author

@adamjstewart adamjstewart left a comment


I added comments to everything I thought was worth pointing out. 85% of the changes simply remove "type: ignore" from PyTorch functions, 10% fix type hinting mistakes involving pytest and pytorch-lightning, and 5% fix bugs in ETCI 2021 plotting that were uncovered by mypy. I can try to split those 10% of changes into a separate PR if that makes review easier, but I think the other 90% have to be merged first in order to get our unit tests to pass with PyTorch 1.11.

It's crazy that simply removing "type: ignore" and reformatting code that had to be on multiple lines is enough to drop 500+ lines of code from TorchGeo!

build:
os: ubuntu-20.04
tools:
python: "3.9"
Collaborator Author


Had to do some hacky type hint stuff that now requires Python 3.9+ to build the docs or run mypy. Basically, both typing and collections have an OrderedDict, but collections.OrderedDict doesn't support type hints until Python 3.9+ and typing.OrderedDict doesn't exist until Python 3.7.2+. So if we want to support Python 3.6 (will be dropped soon) or 3.7.1 we need to use collections.OrderedDict. This does not affect run-time since I wrapped it in quotes and it's only evaluated by mypy/sphinx.
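The quoted-annotation trick can be sketched in a self-contained form. The function name and return contents below are hypothetical stand-ins, not TorchGeo code; the point is that a string annotation is never evaluated at runtime, so `collections.OrderedDict[str, ...]` (unsubscriptable before Python 3.9) is safe to use:

```python
from collections import OrderedDict


def extract_encoder_stub(path: str) -> "OrderedDict[str, int]":
    # The quoted return annotation is only evaluated statically by
    # mypy/sphinx. At import time it is just a string, so this runs
    # on Python versions where collections.OrderedDict is not
    # subscriptable. The body is a hypothetical stand-in for loading
    # a state dict from ``path``.
    return OrderedDict([("weight", 1), ("bias", 2)])


state = extract_encoder_stub("checkpoint.pth")
print(list(state))  # ['weight', 'bias']
```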

device: torch.device, # type: ignore[name-defined]
metrics: Metric,
device: torch.device,
metrics: MetricCollection,
Collaborator Author


We could use Union[Metric, MetricCollection] but currently we are only using MetricCollection arguments.

@@ -158,7 +159,7 @@ def main(args: argparse.Namespace) -> None:
"loss": model.hparams["loss"],
}
elif issubclass(TASK, SemanticSegmentationTask):
val_row: Dict[str, Union[str, float]] = { # type: ignore[no-redef]
Collaborator Author


No need to redefine this type and then ignore the fact that we redefined it.
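A minimal sketch of the fix pattern, with hypothetical task names and values: annotate the variable once with the broader type, then assign in each branch, instead of re-annotating per branch (which mypy flags as `[no-redef]`):

```python
from typing import Dict, Union


def summarize(kind: str) -> Dict[str, Union[str, float]]:
    # One annotation covers both branches; re-annotating ``row`` in
    # each branch would trigger mypy's [no-redef] error.
    row: Dict[str, Union[str, float]]
    if kind == "classification":
        row = {"task": kind, "accuracy": 0.9}
    else:
        row = {"task": kind, "iou": 0.75}
    return row


print(summarize("segmentation"))  # {'task': 'segmentation', 'iou': 0.75}
```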

@@ -23,38 +23,30 @@ def download_url(url: str, root: str, *args: str) -> None:

class TestADVANCE:
@pytest.fixture
def dataset(
self, monkeypatch: Generator[MonkeyPatch, None, None], tmp_path: Path
Collaborator Author


I don't know why I was under the impression that monkeypatch is a Generator but it is in fact just a MonkeyPatch object. So we no longer need to ignore type hints relating to pytest!
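A minimal sketch of the corrected fixture signature (the fixture body and env var are hypothetical, and this assumes pytest ≥ 6.2, where `MonkeyPatch` is public API):

```python
from pathlib import Path

import pytest
from pytest import MonkeyPatch


@pytest.fixture
def dataset(monkeypatch: MonkeyPatch, tmp_path: Path) -> dict:
    # monkeypatch is a plain MonkeyPatch object, not
    # Generator[MonkeyPatch, None, None], so its methods can be
    # called directly with no "type: ignore".
    monkeypatch.setenv("DATA_ROOT", str(tmp_path))
    return {"root": str(tmp_path)}
```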

Member


This definitely makes more sense -- I skeptically copy+pasted the Generator type several times but was never skeptical enough to investigate.

@@ -263,7 +234,7 @@ def dataset(

def test_getitem(self, dataset: SpaceNet5) -> None:
# Iterate over all elements to maximize coverage
samples = [i for i in dataset] # type: ignore[attr-defined]
samples = [dataset[i] for i in range(len(dataset))]
Collaborator Author


I'm actually pretty confused why this used to work because dataset isn't iterable.
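The likely reason it worked: `iter()` falls back to the legacy sequence protocol when a class defines `__getitem__` but no `__iter__` — it calls `obj[0]`, `obj[1]`, … until `IndexError`. A hypothetical stand-in dataset (not the SpaceNet5 class) demonstrates both styles:

```python
class Dataset:
    """Hypothetical map-style dataset: __getitem__/__len__ only, no __iter__."""

    def __init__(self) -> None:
        self.samples = ["a", "b", "c"]

    def __len__(self) -> int:
        return len(self.samples)

    def __getitem__(self, index: int) -> str:
        return self.samples[index]  # raises IndexError past the end


ds = Dataset()

# iter() falls back to the legacy sequence protocol: it calls ds[0],
# ds[1], ... until IndexError, so the old list comprehension "worked"
# even though the class never defines __iter__. Explicit indexing
# states the intent, and mypy can type-check it.
print([s for s in ds])                   # ['a', 'b', 'c']
print([ds[i] for i in range(len(ds))])   # ['a', 'b', 'c']
```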

@@ -161,7 +159,7 @@ def __init__(
model: Module,
projection_size: int = 256,
hidden_size: int = 4096,
layer: Union[str, int] = -2,
Collaborator Author


When we index self.model.children(), I believe this is a nn.ModuleList, so only integer indices are allowed. So far we're only using integer indices in our code. I don't know of a reason why we need to support string indices.

Member


I believe there was an option here where you could look up a layer by name that I removed for simplicity and forgot to update

# Copying the weights of the old layer to the extra channels
for i in range(in_channels - layer.in_channels):
channel = layer.in_channels + i
new_layer.weight[:, channel : channel + 1, :, :].data[
... # type: ignore[index]
Collaborator Author


mypy doesn't like `...` for some reason; `:` should be equivalent as far as I know.

Member


I think this is okay here

self.save_hyperparameters() # creates `self.hparams` from kwargs

# Creates `self.hparams` from kwargs
self.save_hyperparameters() # type: ignore[operator]
Collaborator Author


For some reason mypy thinks this is a Tensor, not a Callable. We could cast it but I'm not sure exactly what kind of Callable this is. I'm going to chalk this up to "pytorch-lightning is super hacky" and leave it for future work to figure out why this doesn't work.


self.config_task()

def forward(self, x: Tensor) -> Any: # type: ignore[override]
def forward(self, *args: Any, **kwargs: Any) -> Any:
Collaborator Author


It's best to avoid overriding type signatures of functions in subclasses
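A torch-free sketch of the idea (class names hypothetical): matching the base class's broad `*args, **kwargs` signature in the subclass avoids mypy's `[override]` error that a narrowed signature would trigger.

```python
from typing import Any


class Task:
    # Base class accepts anything, mirroring how a framework base
    # class might declare forward.
    def forward(self, *args: Any, **kwargs: Any) -> Any:
        raise NotImplementedError


class ClassificationTask(Task):
    # Same signature as the base class, so no
    # "type: ignore[override]" is needed; the concrete input is
    # unpacked inside the method instead.
    def forward(self, *args: Any, **kwargs: Any) -> Any:
        x = args[0]
        return [v * 2 for v in x]


print(ClassificationTask().forward([1, 2]))  # [2, 4]
```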

@@ -18,7 +18,7 @@
Conv2d.__module__ = "nn.Conv2d"


def extract_encoder(path: str) -> Tuple[str, Dict[str, Tensor]]:
def extract_encoder(path: str) -> Tuple[str, "OrderedDict[str, Tensor]"]:
Collaborator Author


This is the OrderedDict hack I mentioned above. model.load_state_dict requires an OrderedDict, not Dict.

@adamjstewart adamjstewart marked this pull request as ready for review March 13, 2022 19:59
@calebrob6
Member

Lol [image]

Member

@calebrob6 calebrob6 left a comment


For the hparams changes in the trainers, can we just reset self.hparams in the constructor after calling save_hyperparameters to avoid having to make a local copy of the hyperparameters everywhere (just to keep mypy happy)?

@adamjstewart
Collaborator Author

can we just reset self.hparams in the constructor after calling save_hyperparameters

That's what I tried originally but pytorch-lightning doesn't like this:

AttributeError: can't set attribute

I'm guessing it's a read-only @property with a getter but no setter.
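A minimal stand-in showing why the assignment fails (this is an illustration of the read-only-property guess, not pytorch-lightning's actual implementation):

```python
class Module:
    # Hypothetical stand-in for a LightningModule with read-only hparams.
    def __init__(self) -> None:
        self._hparams = {"lr": 1e-3}

    @property
    def hparams(self) -> dict:
        # Getter only; there is no @hparams.setter, so assignment
        # raises AttributeError.
        return self._hparams


m = Module()
try:
    m.hparams = {"lr": 1e-2}
except AttributeError as exc:
    # Message is "can't set attribute" on older Pythons; newer
    # versions say the property has no setter.
    print(exc)
```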

@calebrob6
Member

calebrob6 commented Mar 14, 2022

Can we just make a new self.hyperparameters = self.hparams then?

Note: this isn't a requirement for this PR (I can open a followup with the .long() and .float() changes too), just trying to figure out how to work around having these local hparams hanging around.

calebrob6 previously approved these changes Mar 14, 2022
@adamjstewart adamjstewart enabled auto-merge (squash) March 14, 2022 20:34
@adamjstewart adamjstewart merged commit 9489b61 into main Mar 14, 2022
@adamjstewart adamjstewart deleted the mypy/remove-pytorch-ignores branch March 14, 2022 20:35
adamjstewart added a commit that referenced this pull request Mar 19, 2022
* Remove type ignores for PyTorch

* Mypy fixes for pytest MonkeyPatch

* Black

* Ignore Identity

* Generic fixes

* Remove unused Generator import

* More fixes

* Fix remaining mypy errors

* More typing cleanups

* typing.OrderedDict isn't available until Python 3.7.2+

* Need Python 3.9 to build docs for fancy OrderedDict

* Fix Python 3.8 and earlier support

* Fix BigEarthNet tests

* Fix bug in ETCI 2021 tests

* Remove unused flake8 ignore

* More robust and well-documented trainer steps

* Many functions don't actually use batch_idx

* Store cast hparams in trainers
remtav pushed a commit to remtav/torchgeo that referenced this pull request May 26, 2022
yichiac pushed a commit to yichiac/torchgeo that referenced this pull request Apr 29, 2023
Successfully merging this pull request may close these issues.

Global ignore of torch typing issues