# refactor prototype transforms functional tests #5879
Conversation
```python
__all__ = ["assert_close"]


class PILImagePair(TensorLikePair):
```
This is a superset of what the old `ImagePair` did. It includes options to only test the aggregated difference or to check the percentage of differing pixels. That is on par with what we are currently doing in our stable functional tests:

Lines 172 to 174 in a67cc87

```python
def _assert_approx_equal_tensor_to_pil(
    tensor, pil_image, tol=1e-5, msg=None, agg_method="mean", allowed_percentage_diff=None
):
```
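The two comparison modes mentioned above can be sketched dependency-free. Plain lists stand in for image tensors here, and `agg_diff` / `percentage_diff` are hypothetical helper names for illustration, not the actual `PILImagePair` API:

```python
def agg_diff(a, b, agg_method="mean"):
    # aggregate the absolute per-pixel differences into a single number
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return sum(diffs) / len(diffs) if agg_method == "mean" else max(diffs)


def percentage_diff(a, b):
    # fraction of pixels that differ at all
    return sum(x != y for x, y in zip(a, b)) / len(a)


print(agg_diff([0, 0, 4], [0, 0, 1]))         # mean absolute difference: 1.0
print(percentage_diff([0, 0, 4], [0, 0, 1]))  # one of three pixels differs
```

A pair class can then pass if either the aggregated difference stays below a tolerance or the differing-pixel percentage stays below a threshold.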
```python
def from_loader(loader_fn):
    def wrapper(*args, **kwargs):
        loader = loader_fn(*args, **kwargs)
        return loader.load(kwargs.get("device", "cpu"))

    return wrapper


def from_loaders(loaders_fn):
    def wrapper(*args, **kwargs):
        loaders = loaders_fn(*args, **kwargs)
        for loader in loaders:
            yield loader.load(kwargs.get("device", "cpu"))

    return wrapper
```
These functions are mostly for "BC" with our current tests. With them we can turn `make_*_loader{s}` back into `make_*`. For example, `make_images = from_loaders(make_image_loaders)`. This makes the transition period easier, since we don't need to touch the old files.

In the future, most tests should use the loader architecture. Those that don't could simply invoke `TensorLoader(...).load(device)` manually.
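The mechanics of the "BC" wrapper can be shown with a minimal, self-contained sketch. `DummyLoader` and `make_image_loader` here are stand-ins invented for illustration; only the `.load(device)` protocol and the `from_loader` shape follow the PR:

```python
class DummyLoader:
    """Stand-in for TensorLoader: knows its shape, materializes on .load()."""

    def __init__(self, shape):
        self.shape = shape

    def load(self, device):
        # a real TensorLoader would create a tensor on `device` here
        return f"tensor{self.shape}@{device}"


def from_loader(loader_fn):
    def wrapper(*args, **kwargs):
        loader = loader_fn(*args, **kwargs)
        return loader.load(kwargs.get("device", "cpu"))

    return wrapper


def make_image_loader(*, size=(3, 32, 32), device="cpu"):
    return DummyLoader(size)


# `make_image` behaves like an old-style eager factory again:
make_image = from_loader(make_image_loader)
print(make_image())              # loads on the default device
print(make_image(device="cuda"))
```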
```python
    return self.kernel.__name__


def pil_reference_wrapper(pil_kernel):
```
The reference defined in the `KernelInfo` will be passed the same inputs as the kernel. Since we use the PIL kernel as the reference for its tensor counterpart, this is a simple wrapper to avoid defining the same kind of reference function over and over.
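A hedged sketch of what such a wrapper could look like. The `to_pil`/`to_tensor` stand-ins below use plain tuples and lists so the example runs without torch or PIL; the real helpers would convert between `torch.Tensor` and `PIL.Image`:

```python
import functools


def to_pil(tensor):
    # stand-in conversion: pretend "PIL images" are plain Python lists
    return list(tensor)


def to_tensor(pil_image):
    # stand-in conversion back to a "tensor" (a tuple here)
    return tuple(pil_image)


def pil_reference_wrapper(pil_kernel):
    @functools.wraps(pil_kernel)
    def wrapper(image_tensor, *args, **kwargs):
        # same signature as the tensor kernel: convert, run PIL kernel, convert back
        return to_tensor(pil_kernel(to_pil(image_tensor), *args, **kwargs))

    return wrapper


def horizontal_flip_image_pil(image):
    # toy PIL kernel: flip along the last axis
    return image[::-1]


reference = pil_reference_wrapper(horizontal_flip_image_pil)
print(reference((1, 2, 3)))  # (3, 2, 1)
```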
```python
def sample_inputs_horizontal_flip_image_tensor():
    for image_loader in make_image_loaders(dtypes=[torch.float32]):
        yield ArgsKwargs(image_loader.unwrap())
```
Similar to `make_image_loaders`, these functions don't need to `yield`, but it makes the definition less verbose.
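To illustrate the point, here are two equivalent ways to write such a sample-inputs function, using a simplified stand-in for `ArgsKwargs` (the sizes and names are invented for the example):

```python
class ArgsKwargs:
    """Simplified stand-in: just records positional and keyword arguments."""

    def __init__(self, *args, **kwargs):
        self.args, self.kwargs = args, kwargs


def sample_inputs_with_yield():
    # generator form: no intermediate list to build by hand
    for size in [(3, 16, 16), (3, 32, 32)]:
        yield ArgsKwargs(size)


def sample_inputs_without_yield():
    # equivalent eager form, slightly more verbose in realistic cases
    return [ArgsKwargs(size) for size in [(3, 16, 16), (3, 32, 32)]]


print(len(list(sample_inputs_with_yield())))  # 2
print(len(sample_inputs_without_yield()))     # 2
```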
@vfdev-5 The failure comes from a test you wanted to fix:
Given that this PR does not touch this test at all, my best guess is that it was flaky before and depended on a random seed. Plus, we seem to have missed a debug statement:
@pmeier it was fixed. The issue is with an input image size that is too small for the given bounding boxes. Probably, you changed that again somewhere.
@vfdev-5 You were right and I fixed that in a49f0db. However, now we get this failure:
In CI this is only failing on macOS, but it also fails for me locally. Given that we have only a single mismatched element, the test is probably flaky.

I do not see from your commit where the size was fixed. More generally, how do I find the code of a failing test? It is totally obfuscated to me :)
The issue was that while refactoring the
Not sure what you mean. Could you elaborate? Right now there is no failing test in the new tests, so I'll construct one to show what it looks like. Imagine
Traceback
From the test name and parametrization you should find everything:
To debug, you can run this exact test with
I was talking about
Top-most error in the traceback:
To reproduce:
Good to me
```diff
-def make_bounding_box(*, format, image_size=(32, 32), extra_dims=(), dtype=torch.int64):
+def make_bounding_box_loader(*, extra_dims=(), format, image_size=None, dtype=torch.float32):
```
Why doesn't the bounding box have a `num_objects` arg?
Because we defined a bounding box as `(*, 4)`, i.e. `features.BoundingBox([0, 0, 10, 10], ...)` is valid although it only has a single dimension. Thus, if you want multiple boxes, set `extra_dims=(num_objects,)`.
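The `(*, 4)` convention can be illustrated without torch. Plain nested lists stand in for tensors here, and the `shape` helper is invented for the example; the shapes mirror what `features.BoundingBox` would hold:

```python
def shape(nested):
    # helper: compute the shape of a regularly nested list
    s = []
    while isinstance(nested, list):
        s.append(len(nested))
        nested = nested[0]
    return tuple(s)


single_box = [0, 0, 10, 10]            # shape (4,) -- valid on its own
batch_of_boxes = [[0, 0, 10, 10]] * 3  # extra_dims=(3,) -> shape (3, 4)

print(shape(single_box))      # (4,)
print(shape(batch_of_boxes))  # (3, 4)
```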
Reviewed By: jdsgomes
Differential Revision: D39543278
fbshipit-source-id: 413bc5160188c5423d39d9f73387e9a5f25d8af7
This PR refactors our prototype transforms functional tests. They are currently located in `test/test_prototype_transforms_functional.py`. To ease the reviewing process I've added a new `test/test_prototype_transforms_kernels.py` module that contains the refactored tests from this PR. In the end that should replace most parts of the old file, but doing it in a separate module avoids GH diff hell.

## Status quo
Our current implementation was the first attempt to automate our tests. I took some inspiration from the `OpInfo` framework from PyTorch core. The basic idea is to define a `FunctionalInfo` for each functional that stores some metadata about it. The most important info (and for now the only metadata we store) is the `sample_inputs_fn`. It yields call arguments.

With that we can write common tests that can be `@pytest.mark.parametrize`'d over the kernel-call-args combinations. For example, a test that checks the `torch.jit.script`'ed output against its eager counterpart looks like this:

vision/test/test_prototype_transforms_functional.py, lines 583 to 595 in f36f351
## Pros / Cons
This architecture has two main benefits over manually writing these tests:
Plus, and in contrast to the `OpInfo`s from PyTorch core, the tests are easier to debug, since we use `@pytest.mark.parametrize` over `for` loops in the test body to iterate over the call args. If one of our tests fails, we can reproduce the parametrization from the log, whereas in PyTorch core one needs to manually find which call args are responsible for the failure.

However, there are also downsides:

- `@pytest.mark.parametrize` instantiates everything upfront. Especially with tensor inputs, that can quickly become a big chunk of memory. Right now `test/test_prototype_transforms_functional.py` includes ~23k tests that instantiate tensors during collection (`pytest --co test/test_prototype_transforms_functional.py::test_eager_vs_scripted`). They all come from a single common test, namely eager vs. scripted. If we add more tests or more ops, this number will grow fast. This is the reason why PyTorch core does not rely on parametrization for their sample inputs, but rather falls back to `for` loops inside the test.
- The same sample inputs are used by every test, although different tests need different coverage. For example, to check that the scripted `affine_image_tensor` kernel matches its eager counterpart, we only need to test a single set of affine parameters (as long as there is no branching based on them). However, for reference testing, we should test multiple parameter sets to make sure the kernel actually behaves like its reference.

## Design goal
This PR sets out to solve the problems detailed above while retaining all the positive aspects of the current implementation.
Introduce the `TensorLoader` class: it wraps another callable that in the end will create the tensor, but it knows the `shape`, `dtype`, and possibly other feature metadata ahead of time. The `device` will only be passed at runtime to allow us to parametrize over different devices. With this we can continue to rely on the tensor attributes during sample input generation, e.g.

vision/test/test_prototype_transforms_functional.py, line 248 in b5c961d
vision/test/test_prototype_transforms_functional.py, line 264 in b5c961d

without actually instantiating the tensors.

At test time, the tensor can simply be instantiated with `TensorLoader(...).load(device)`. For convenience, `ArgsKwargs` was made aware of `TensorLoader` and got a `.load(device)` method as well. With these, the common tests will look somewhat like this:

This approach of "lazy loading" is similar to the concept of lazy tensors, although stripping out everything we don't need. To avoid confusion, I preferred the term "load" over "lazy" here.
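A minimal, dependency-free sketch of the loader mechanics described above. `FakeTensor` and the `fn(shape, dtype, device)` callable protocol are stand-ins invented for the example; the real `TensorLoader` wraps a tensor-producing callable:

```python
from collections import namedtuple

# stand-in for an actual torch.Tensor
FakeTensor = namedtuple("FakeTensor", "shape dtype device")


class TensorLoader:
    """Knows shape/dtype upfront; only creates the tensor on .load(device)."""

    def __init__(self, fn, *, shape, dtype):
        self.fn = fn        # callable(shape, dtype, device) -> tensor
        self.shape = shape  # available without materializing anything
        self.dtype = dtype

    def load(self, device):
        return self.fn(self.shape, self.dtype, device)


loader = TensorLoader(FakeTensor, shape=(3, 32, 32), dtype="float32")
print(loader.shape)          # sample input generation can use this for free
tensor = loader.load("cpu")  # the "tensor" only exists from here on
print(tensor.device)
```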
Introduce a `reference_inputs_fn` alongside the `sample_inputs_fn`. As the name implies, the former will only be used for reference tests and should be comprehensive with respect to the tested values. In contrast, the `sample_inputs_fn` should only cover all valid code paths. This is on par with what PyTorch core does with their `OpInfo` framework, although they have even more diverse sample input functions, like the `error_inputs_func`.

## Limitations
There are two things that are not included in the current design:
## Todo
This PR mostly introduces the new framework while adding some kernels as examples. There are three ways to add the remaining ones:
My preference is 3. -> 2. -> 1. but I'll leave that up to the reviewers. Here is the list of what kernels are done or missing:
- clamp_bounding_box
- convert_bounding_box_format
- convert_color_space
  - image
- adjust_brightness
  - image
- adjust_contrast
  - image
- adjust_gamma
  - image
- adjust_hue
  - image
- adjust_saturation
  - image
- adjust_sharpness
  - image
- autocontrast
  - image
- equalize
  - image
- invert
  - image
- posterize
  - image
- solarize
  - image
- affine
  - bounding_box
  - image
  - segmentation_mask
- center_crop
  - bounding_box
  - image
  - segmentation_mask
- crop
  - bounding_box
  - image
  - segmentation_mask
- elastic
  - bounding_box
  - image
  - segmentation_mask
- five_crop
  - image
- horizontal_flip
  - bounding_box
  - image
  - segmentation_mask
- pad
  - bounding_box
  - image
  - segmentation_mask
- perspective
  - bounding_box
  - image
  - segmentation_mask
- resize
  - bounding_box
  - image
  - segmentation_mask
- resized_crop
  - bounding_box
  - image
  - segmentation_mask
- rotate
  - bounding_box
  - image
  - segmentation_mask
- ten_crop
  - bounding_box
  - image
  - segmentation_mask
- vertical_flip
  - bounding_box
  - image
  - segmentation_mask