port affine tests #7708

Merged
merged 9 commits into pytorch:main from port/affine on Jun 30, 2023
Conversation

pmeier (Collaborator) commented Jun 28, 2023

pytorch-bot (bot) commented Jun 28, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/vision/7708

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 New Failures

As of commit 980e77b:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@@ -162,7 +165,7 @@ def _check_dispatcher_dispatch(dispatcher, kernel, input, *args, **kwargs):
     if isinstance(input, datapoints._datapoint.Datapoint):
         # Due to our complex dispatch architecture for datapoints, we cannot spy on the kernel directly,
         # but rather have to patch the `Datapoint.__F` attribute to contain the spied on kernel.
-        spy = mock.MagicMock(wraps=kernel)
+        spy = mock.MagicMock(wraps=kernel, name=kernel.__name__)
pmeier (Collaborator, Author):
QoL
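
For illustration, a minimal sketch of why passing name helps (the kernel below is a local stand-in, not an actual torchvision kernel): a named MagicMock shows that name in its repr and therefore in assertion failures, instead of an anonymous <MagicMock id=...>.

from unittest import mock

def horizontal_flip_image_tensor(image):  # hypothetical stand-in for a kernel
    return image

# Wrapping the kernel lets calls pass through while being recorded;
# the name shows up in reprs and failure messages.
spy = mock.MagicMock(wraps=horizontal_flip_image_tensor, name=horizontal_flip_image_tensor.__name__)
spy("input")

print(spy)  # <MagicMock name='horizontal_flip_image_tensor' id=...>
spy.assert_called_once_with("input")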

Comment on lines +479 to +481

         ("dtype", "make_mask"), [(torch.uint8, make_segmentation_mask), (torch.bool, make_detection_mask)]
     )
-    def test_kernel_mask(self, dtype_and_make_mask):
-        dtype, make_mask = dtype_and_make_mask
+    def test_kernel_mask(self, dtype, make_mask):
pmeier (Collaborator, Author):
QoL
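
For reference, a small sketch of the pattern (illustrative names and helpers, not the actual test suite): when pytest.mark.parametrize is given a tuple of argument names, pytest unpacks each parameter set into separate arguments, so the manual dtype, make_mask = ... unpacking inside the test body becomes unnecessary.

import pytest
import torch

def make_segmentation_mask(dtype):  # hypothetical helpers for illustration
    return torch.zeros(4, 4, dtype=dtype)

def make_detection_mask(dtype):
    return torch.zeros(1, 4, 4, dtype=dtype)

@pytest.mark.parametrize(
    ("dtype", "make_mask"),
    [(torch.uint8, make_segmentation_mask), (torch.bool, make_detection_mask)],
)
def test_kernel_mask(dtype, make_mask):
    # pytest passes dtype and make_mask as separate arguments; no manual unpacking needed
    assert make_mask(dtype).dtype == dtype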

@@ -744,7 +746,7 @@ def _make_input(self, input_type, *, dtype=None, device="cpu", spatial_size=(17,
     @pytest.mark.parametrize("dtype", [torch.float32, torch.uint8])
     @pytest.mark.parametrize("device", cpu_and_cuda())
     def test_kernel_image_tensor(self, dtype, device):
-        check_kernel(F.horizontal_flip_image_tensor, self._make_input(torch.Tensor))
+        check_kernel(F.horizontal_flip_image_tensor, self._make_input(torch.Tensor, dtype=dtype, device=device))
pmeier (Collaborator, Author):
We didn't spot that before

Comment on lines +790 to +799

             (torch.Tensor, F.horizontal_flip_image_tensor),
             (PIL.Image.Image, F.horizontal_flip_image_pil),
             (datapoints.Image, F.horizontal_flip_image_tensor),
             (datapoints.BoundingBox, F.horizontal_flip_bounding_box),
             (datapoints.Mask, F.horizontal_flip_mask),
             (datapoints.Video, F.horizontal_flip_video),
         ],
     )
     def test_dispatcher_signature(self, kernel, input_type):
-        check_dispatcher_signatures_match(F.resize, kernel=kernel, input_type=input_type)
+        check_dispatcher_signatures_match(F.horizontal_flip, kernel=kernel, input_type=input_type)
pmeier (Collaborator, Author):
We need to pay more attention when copy-pasting / reviewing. This went undetected.


        return input

    def _adapt_fill(self, value, *, dtype):
pmeier (Collaborator, Author):
This will likely need to be factored out later, since we need this for every transform that supports fill.
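As a rough sketch of what such a shared helper could look like (the scaling rule and names here are assumptions for illustration, not the actual implementation), the idea is to map fill values given in [0.0, 1.0] to the value range of the target dtype:

import torch

def adapt_fill(value, *, dtype):
    # Hypothetical helper: scale a fill value (or sequence of values) specified
    # in [0.0, 1.0] to the value range of an integer dtype, e.g. 1.0 -> 255 for uint8.
    if value is None or dtype.is_floating_point:
        return value
    max_value = float(torch.iinfo(dtype).max)
    if isinstance(value, (int, float)):
        return type(value)(value * max_value)
    return type(value)(type(v)(v * max_value) for v in value)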

@@ -949,18 +878,6 @@ def test_correctness_crop_bounding_box(device, format, top, left, height, width,
     torch.testing.assert_close(output_spatial_size, spatial_size)


-@pytest.mark.parametrize("device", cpu_and_cuda())
-def test_correctness_horizontal_flip_segmentation_mask_on_fixed_input(device):
pmeier (Collaborator, Author):
Forgot about this last time

pmeier marked this pull request as ready for review on June 28, 2023, 21:44
pmeier requested a review from NicolasHug on June 28, 2023, 21:44
NicolasHug (Member) left a comment:

The affine tests run in 1.5 minutes on my laptop. By contrast, the Resize tests take 5 seconds (I understand there are fewer combinations to check there).

Can we do anything to reduce the duration of these tests?

NicolasHug (Member) commented:

> Can we do anything to reduce the duration of these tests?

A significant portion of these tests are tests that:

  • are extremely unlikely to go red once we've checked that they're green
  • do not need to be checked on every single PR. We probably don't even need to check them every single day.

I'm thinking in particular about tests like check_dispatcher*(), check_transform(), and probably others.
Should we start thinking of a test job that would run those on a regular basis, but not on PRs by default? Something similar to test_dataset_download.py? We could also add some kind of GH label that would trigger those tests on a PR, if explicitly requested.

pmeier (Collaborator, Author) commented Jun 29, 2023

Regarding the long test time for this class: test_kernel_image_tensor contributes the largest part with its massive parametrization:

❯ pytest --co -q test/test_transforms_v2_refactored.py::TestAffine | tail -n1
40899 tests collected in 3.50s
❯ pytest --co -q test/test_transforms_v2_refactored.py::TestAffine::test_kernel_image_tensor | tail -n1
35200 tests collected in 2.74s
❯ pytest -q test/test_transforms_v2_refactored.py::TestAffine::test_kernel_image_tensor | tail -n1
35200 passed in 104.84s (0:01:44)
❯ pytest -q test/test_transforms_v2_refactored.py::TestAffine -k "not test_kernel_image_tensor" | tail -n1
5699 passed, 35200 deselected in 19.33s

Second to that is test_kernel_bounding_box:

❯ pytest -q --co test/test_transforms_v2_refactored.py::TestAffine::test_kernel_bounding_box | tail -n1
4800 tests collected in 2.09s
❯ pytest -q test/test_transforms_v2_refactored.py::TestAffine::test_kernel_bounding_box | tail -n1
4800 passed in 17.18s

If we can bring both tests down to a more reasonable duration, the total time of TestAffine is not too bad in general:

❯ pytest -q test/test_transforms_v2_refactored.py::TestAffine -k "not test_kernel_image_tensor and not test_kernel_bounding_box" | tail -n1
899 passed, 40000 deselected in 6.10s
❯ pytest -q test/test_transforms_v2_refactored.py::TestResize | tail -n1
767 passed in 3.23s

To achieve that, we could move away from the full matrix of parameters. That is, we fix most of the parameters and only parametrize over one or a few at a time. This drastically cuts down the overall number of tests, while only leaving minimal gaps that will likely never result in a bug. Note that we are not talking about the correctness tests here. I'll send a commit to implement this strategy.
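
A minimal sketch of that strategy, with made-up parameter values and test names rather than the actual ones:

import pytest
import torch

# Full matrix: 3 angles x 3 translations x 2 dtypes = 18 test cases, and the
# count keeps multiplying with every additional stacked decorator.
@pytest.mark.parametrize("angle", [-30, 0, 30])
@pytest.mark.parametrize("translate", [(0, 0), (1, 1), (2, 2)])
@pytest.mark.parametrize("dtype", [torch.float32, torch.uint8])
def test_full_matrix(angle, translate, dtype):
    ...

# Reduced: keep defaults fixed and sweep one dimension at a time,
# 3 + 3 + 2 = 8 test cases instead of 18.
@pytest.mark.parametrize(
    "kwargs",
    [
        *({"angle": angle} for angle in [-30, 0, 30]),
        *({"translate": translate} for translate in [(0, 0), (1, 1), (2, 2)]),
        *({"dtype": dtype} for dtype in [torch.float32, torch.uint8]),
    ],
)
def test_one_dimension_at_a_time(kwargs):
    ...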

None of this precludes also running tests on a schedule. However, implementing that will take more time, so I'll go for the low-hanging fruit first.

pmeier (Collaborator, Author) commented Jun 29, 2023

After 39b91c8:

❯ pytest -q --co test/test_transforms_v2_refactored.py::TestAffine | tail -n1
1287 tests collected in 0.25s
❯ pytest -q test/test_transforms_v2_refactored.py::TestAffine | tail -n1
1287 passed in 4.32s
❯ pytest -q test/test_transforms_v2_refactored.py::TestAffine -k "not cuda" | tail -n1
1087 passed, 200 deselected in 3.28s
❯ pytest -q test/test_transforms_v2_refactored.py::TestAffine -k "cuda" | tail -n1
200 passed, 1087 deselected in 2.76s

I can live with that without going for a scheduled solution for now.

pmeier requested a review from NicolasHug on June 29, 2023, 11:32
NicolasHug (Member) left a comment:

Thanks Philip, a few comments but LGTM anyway

     )
     @pytest.mark.parametrize("fill", _CORRECTNESS_FILL)
     @pytest.mark.parametrize("seed", list(range(10)))
     def test_transform_image_correctness(self, center, interpolation, fill, seed):
NicolasHug (Member):

I guess we can't unify this test with the functional one, because the parameters are too different (i.e. they're randomly sampled in the transform)?

That being said, I'm curious why the MAE thresholds are different in the two tests.

pmeier (Collaborator, Author):

> I guess we can't unify this test with the functional one, because the parameters are too different (i.e. they're randomly sampled in the transform)?

Correct.

> That being said, I'm curious why the MAE thresholds are different in the two tests.

I just put in the minimal threshold needed for the tests to pass. I guess the transform test sampled other affine params that ultimately led to a higher MAE. I'll unify them.

kwargs["degrees"] = 0

with pytest.raises(
ValueError if isinstance(value, list) else TypeError, match=f"{param} should be a sequence of length 2"
NicolasHug (Member):

Probably not worth addressing in this PR because it's not related to tests, but do you have any thoughts on unifying our parameter error types to always be ValueError? Distinguishing between ValueError and TypeError doesn't seem worth it and leads to this kind of special-casing.

pmeier (Collaborator, Author):

I started out with the strong belief that one should differentiate between the two, based on the documentation:

> Passing arguments of the wrong type (e.g. passing a list when an int is expected) should result in a TypeError, but passing arguments with the wrong value (e.g. a number outside expected boundaries) should result in a ValueError.

However, in practice it turns out that this distinction is not really helpful. I doubt anyone is doing:

try:
    torchvision.transforms.functional.affine(...)
except TypeError:
    # do something
except ValueError:
    # do something else

The most likely scenario is that no one is using the exceptions in their regular control flow, but only when there is actually an error. In that case a human is going to look into it, and thus the message is much more important.

So overall I see few reasons to differentiate the two in torchvision. That might be different for other packages, where the users actually rely on the exception behavior. However, it doesn't bother me enough to actually go and address it. I'll review / stamp a PR if you want to bring one.

As a side note: the distinction between TypeError and ValueError here is terrible IMO, because we even use the same error message.

pmeier merged commit 0e49615 into pytorch:main on Jun 30, 2023
56 of 61 checks passed
pmeier deleted the port/affine branch on June 30, 2023, 08:30
github-actions (bot) commented:

Hey @pmeier!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

facebook-github-bot pushed a commit that referenced this pull request Jul 3, 2023
Reviewed By: vmoens

Differential Revision: D47186575

fbshipit-source-id: 93dcb9c8c60dc702870dfe6cc5c0d2a2a178a77c