
Conversation

@swahtz swahtz commented Nov 17, 2025

While fixing fvdb.nn.SimpleUNet for the issue reported in #335, where the previously removed fvdb.nn.ReLU module was still being used, a few other problems were discovered. This PR fixes the following:

  1. In SimpleUNet, replace the use of fvdb.nn.ReLU(inplace=True) with the functional fvdb.relu_
  2. Remove an erroneous check in SparseConvPackInfo::SparseConvPackInfo that threw an exception if a kmap was being built for a 1x1 conv
  3. Fix the semantics of the parameters to transposed convolution to match PyTorch's conventions. Per PyTorch's ConvTranspose module docs, to express the transposed convolution B of a convolution A (i.e. the convolution B that inverts the operation performed by A), the channel parameters of B are the reverse of A's (e.g. if A is in/out 4/16, B is in/out 16/4), while the 'topological' parameters carry over from A unchanged (e.g. if A has stride=2, B also has stride=2, not 1/2).
    Our API, by contrast, seemed to expect inverse expressions of both kinds of parameters (and our SimpleUNet code was expressing neither inversely).
    This PR therefore fixes SparseConvolutionKernelMap::forward and SparseConvolutionKernelMap::backward to treat normal and transposed in/out channel specifications the same way (it is up to the user to reverse them if that's the desired effect, per PyTorch's pattern), and changes SimpleUNet to reverse the ordering of the supplied source/target grids so that they match the source/target grids of the convolution it wants to invert (again following PyTorch's pattern of carrying arguments like stride over from the inverse convolution).
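As a sanity check of the convention described above, PyTorch's documented 1-D output-size formulas can be compared directly. This is a minimal sketch in plain Python (not fvdb code), assuming the standard Conv*d and ConvTranspose*d shape formulas from the PyTorch docs:

```python
def conv_out_len(n, k, stride, pad):
    # PyTorch Conv*d output length: floor((n + 2*pad - k) / stride) + 1
    return (n + 2 * pad - k) // stride + 1

def conv_transpose_out_len(n, k, stride, pad, out_pad=0):
    # PyTorch ConvTranspose*d output length:
    # (n - 1) * stride - 2*pad + k + out_pad
    return (n - 1) * stride - 2 * pad + k + out_pad

# A stride-2 conv A maps 9 voxels -> 5 voxels (channels, say, 4 -> 16).
n_a = conv_out_len(9, k=3, stride=2, pad=1)               # 5
# Its transpose B uses the SAME kernel/stride/pad (and reversed
# channels, 16 -> 4), and maps those 5 voxels back to 9.
n_b = conv_transpose_out_len(n_a, k=3, stride=2, pad=1)   # 9
print(n_a, n_b)
```

Note that the round trip is only exact for sizes where the forward conv's floor division loses nothing; otherwise PyTorch's output_padding parameter is needed to disambiguate.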

Fixes #335

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
…ation, I believe in/out channels should be expressed as the desired in/out (not transposed) and the topology parameters (like stride, source/target grid) should be expressed as if they are the 'original' conv operator this transposed conv is meant to invert.

Signed-off-by: Jonathan Swartz <jonathan@jswartz.info>
@swahtz swahtz changed the title Fix use of fvdb.nn.ReLU in SimpleUNet Fixes for fvdb.nn.SimpleUNet Nov 18, 2025
@swahtz swahtz marked this pull request as ready for review November 18, 2025 23:43
@swahtz swahtz requested a review from a team as a code owner November 18, 2025 23:43
@swahtz swahtz closed this Nov 18, 2025
@swahtz swahtz reopened this Nov 18, 2025
@swahtz swahtz added the core library Core fVDB library. i.e. anything in the _Cpp module (C++) or fvdb python module label Nov 18, 2025
@swahtz swahtz self-assigned this Nov 18, 2025
@swahtz swahtz added this to fVDB Nov 18, 2025
@swahtz swahtz added the bug Something isn't working label Nov 18, 2025
@blackencino (Contributor) left a comment


I want to hold off on the transpose argument reordering until we validate the default transposed convolution with the same rigor that we validated the regular default convolution. I think the errors you're encountering here are in our transpose implementation.

The relu changes are great.

```diff
 def forward(self, data: JaggedTensor, padded_grid: GridBatch, grid: GridBatch) -> JaggedTensor:
     plan = ConvolutionPlan.from_grid_batch_transposed(
-        kernel_size=self.kernel_size, stride=1, source_grid=padded_grid, target_grid=grid
+        kernel_size=self.kernel_size, stride=1, source_grid=grid, target_grid=padded_grid
```
Contributor

This looks backwards to me, and counter to the intention. I need to look at the torch.convtranspose3d code more carefully, and create a testing framework that confirms the ordering. Generally speaking, we should not be reordering arguments like this, it indicates a semantic mismatch. If the transpose plans are wrong compared to pytorch, then we should fix it in the plan, not in the unet.

Contributor Author

Currently, with source_grid=padded_grid, target_grid=grid, an exception is thrown as if the order were incorrect. Whether that's a mistake in the transposed convolution code or in the ordering of the arguments is exactly the question I was raising in the PR description above. If you look at the 'topological' arguments to a PyTorch transposed conv (like stride), a stride of 2 upsamples (inserts zeros into the input), inverting the way a stride of 2 in a regular conv downsamples. Doesn't that imply we should order our source and target arguments to the transposed conv as if they were the source and target of the conv operator being inverted (since we also pass stride=2, etc., with that same meaning, rather than stride=1/2)?

I'm fine with not doing this and letting source/target keep their natural meanings; I'm just trying to determine the intent both of PyTorch and of the transposed conv kmap code as it stands. You could also read the PyTorch docs for transposed conv as redefining what 'stride' means there as essentially 'inverse stride'.
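The "inserts zeros in the input" reading above can be made concrete: a stride-s transposed convolution is equivalent to inserting s-1 zeros between input samples and then running a stride-1 convolution (ignoring padding details at the borders). A 1-D sketch in plain Python, with hypothetical helper names (not fvdb code):

```python
def upsample_with_zeros(xs, stride):
    # Insert (stride - 1) zeros between consecutive samples.
    out = []
    for i, x in enumerate(xs):
        out.append(x)
        if i < len(xs) - 1:
            out.extend([0.0] * (stride - 1))
    return out

def conv1d_valid(xs, kernel):
    # Plain stride-1 'valid' correlation.
    k = len(kernel)
    return [sum(xs[i + j] * kernel[j] for j in range(k))
            for i in range(len(xs) - k + 1)]

x = [1.0, 2.0, 3.0]
up = upsample_with_zeros(x, stride=2)    # [1.0, 0.0, 2.0, 0.0, 3.0]
y = conv1d_valid(up, [1.0, 1.0, 1.0])    # [3.0, 2.0, 5.0]
print(up, y)
```

On this reading, stride=2 on the transposed conv describes the upsampling factor, i.e. the inverse of what stride=2 does on a forward conv, which is consistent with carrying the forward conv's stride over unchanged.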

```cpp
    kernels = kernels.permute({2, 3, 4, 0, 1}).reshape({-1, inC, outC}).contiguous();
}

TORCH_CHECK_VALUE(!transposed ? inFeatures.size(0) == sizes[0] : inFeatures.size(0) == sizes[1],
```
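To unpack what the check is selecting (assuming sizes[0]/sizes[1] are the source/target grid voxel counts): under the semantics this PR proposes, a transposed convolution's input features live on what the forward convolution would call the target grid, so the expected row count flips. A tiny plain-Python sketch of that selection logic, with hypothetical names (not fvdb code):

```python
def expected_input_rows(transposed, source_voxels, target_voxels):
    # Forward conv: input features are laid out on the source grid.
    # Transposed conv: input features are laid out on the grid the
    # forward conv would have produced, i.e. the target grid.
    return target_voxels if transposed else source_voxels

print(expected_input_rows(False, 100, 40))  # 100
print(expected_input_rows(True, 100, 40))   # 40
```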
Contributor

This looks backwards to me. I need to confirm what is expected in torch before agreeing that this is how it should behave. It may be that we need to switch how our transposed convolution works.



Development

Successfully merging this pull request may close these issues.

fvdb.nn.ReLU usage in fvdb.nn.simple_unet
