Make the pointwise pooling code more modular #87

kalekundert · 2023-11-01T17:14:44Z

While trying to write a Fourier max pooling layer, I realized that a lot of the general pooling functionality I needed was already implemented by the various pointwise pooling layers, but unfortunately not in a way that could be used by outside classes. To make the pooling code easier to extend, and to reduce code duplication, I made the following changes:

I moved the Gaussian blurring functionality (used by the "antialiased" layers) into it's own standalone module. This eliminates a significant amount of existing code duplication between the PointwiseAvgPoolAntialiased2D, PointwiseAvgPoolAntialiased3D, and _PointwiseMaxPoolAntialiasedND classes, and makes it easier to create new pooling layers that involve Gaussian blurring (which is important for maintaining translational equivariance).
I consolidated the logic in the pointwise evaluate_output_shape() methods into a single standalone function. This again eliminates a significant amount code duplication, this time between pretty much every pointwise pooling class, and again makes it easier to create new pooling layers. ¹
I made an effort to implement forward() using the exact same object returned by export(), to the extent possible. This just helps ensure that the exported model doesn't behave differently from the original model. I also implemented the export() method for the antialiased pooling layers.
I reduced the number of pointwise pooling base classes to three:
- _PointwisePoolND: for directly applying a pytorch pooling layer to a GeometricTensor.
- _PointwiseAvgPoolAntialiasedND: for applying a Gaussian blur.
- _PointwiseMaxPoolAntialiasedND: for applying a max pool followed by a Gaussian blur.
This makes the inheritance hierarchy easier to understand, and eliminates some code duplication between mostly redundant base classes. Note that each of these new base classes pools in a fundamentally different way. A key to being able to architect the code like this is the use of the standalone modules/functions described above, rather than inheritance, to share functionality between the base classes.

To guarantee that this PR does not change any behavior of any of the refactored pooling layers, I added a number of additional tests. These tests fall into two categories:

Starting with a known input tensor, make sure that every element of the output tensor is exactly what it should be. In other words, check that the pooling operation is calculated correctly. These tests are very rigorous, but were tedious to write, because for each I had to calculate the expected result by hand. As a result, I don't test large tensors (i.e. batch and channel dimensions of 1, spatial dimensions of ≈5) and I only test one reasonable/representative set of input parameters for each type of pointwise pooling layer.
Check that every input parameter has the expected result on the shape of the output tensor. While not as rigorous as the above tests, these tests are still good, because the primary effect of most of the input parameters is to change the size of the output. Plus, it's easy to manually calculate expected output sizes for many input parameter variations.

I confirmed that the new tests (and the old tests, which mostly cover equivariance) pass both for the existing code and for this PR. This makes me confident that the proposed changes do not change any existing behavior, which is important.

Minor comments:

The antialiased max pooling layers have a ceil_mode argument that doesn't do anything. This argument affects whether some integer expression divided by the stride is rounded up or down when calculating the output size for a pooling layer. In the antialiased pooling layers, the actual pooling is always done with stride=1, in which case the above quotient is always an integer and rounding has no effect. In the name of backwards compatibility, I kept the argument and added a warning if the user actually specifies it. Eventually, I think this argument should be removed.

Although as an aside, I don't really like that every equivariant module is forced to calculate an output size. This isn't always easy to do, and as far as I can tell nothing in the code base actually uses this information, so it always feels like a waste of time. ↩

- Move the Gaussian blurring functionality (used by the "antialiased" layers) into it's own module. - Consolidate the logic in all of the pointwise `evaluate_output_shape()` methods into a single standalone function. - Implement `forward()` using the exact same object returned by `export()`, to the extent possible. Signed-off-by: Kale Kundert <kale@thekunderts.net>

Signed-off-by: Kale Kundert <kale@thekunderts.net>

kalekundert added 4 commits October 31, 2023 22:05

Make it easier to customize the kernel size of the Gaussian blur

cee4e58

Signed-off-by: Kale Kundert <kale@thekunderts.net>

Add an option to account for edge effects when blurring

ce99035

Signed-off-by: Kale Kundert <kale@thekunderts.net>

fix: use correct device

948307d

Signed-off-by: Kale Kundert <kale@thekunderts.net>

kalekundert force-pushed the feat-pool branch 2 times, most recently from 33a5d8f to 0d230db Compare November 21, 2023 22:58

kalekundert added 2 commits November 21, 2023 17:59

Accommodate johnmarktaylor91/torchlens#18

0d230db

Signed-off-by: Kale Kundert <kale@thekunderts.net>

Allow inputs with different batch sizes

476725c

Signed-off-by: Kale Kundert <kale@thekunderts.net>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make the pointwise pooling code more modular #87

Make the pointwise pooling code more modular #87

kalekundert commented Nov 1, 2023

Make the pointwise pooling code more modular #87

Are you sure you want to change the base?

Make the pointwise pooling code more modular #87

Conversation

kalekundert commented Nov 1, 2023

Footnotes