Inconsistent behavior between torchsparse's conv3d downsampling and MinkowskiEngine's #242

Closed
Jovendish opened this issue Sep 21, 2023 · 5 comments


Jovendish commented Sep 21, 2023

I'm using torchsparse's conv3d to do a downsampling operation with stride 2, but I found that this operation not only reduces the size of the feature tensor but also scales down the coordinates, which is inconsistent with MinkowskiEngine's behavior. I was hoping to find a way to make torchsparse's conv3d downsampling consistent with MinkowskiEngine.

I checked the torchsparse documentation but didn't find a relevant solution.

Is there any other parameter setting or custom operation that can make torchsparse's conv3d downsampling consistent with MinkowskiEngine? I would be very grateful for any suggestions or guidance.

[Screenshot: WechatIMG10]

![WechatIMG11](https://github.com/mit-han-lab/torchsparse/assets/25397930/d9d041e8-1903-4308-847c-b3987b67c739)

@zhijian-liu

When applying a downsampling operation with a stride of 2, the coordinates are effectively halved. If you wish to maintain the original coordinate scale, you can easily achieve this by multiplying the coordinates by 2.
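
For example, something like the following minimal sketch (assuming torchsparse v2.1 with batch-first integer coordinates and a CUDA device; the exact SparseTensor constructor arguments may vary across versions):

import torch
import torchsparse.nn as spnn
from torchsparse import SparseTensor

# [batch, x, y, z] integer coordinates and random features
coords = torch.tensor([[0, 0, 0, 0],
                       [0, 2, 0, 0],
                       [0, 4, 2, 6]], dtype=torch.int32).cuda()
feats = torch.randn(coords.shape[0], 4).cuda()

x = SparseTensor(feats=feats, coords=coords)
conv = spnn.Conv3d(4, 8, kernel_size=2, stride=2).cuda()
y = conv(x)

print(y.C)       # spatial coordinates come out divided by the stride
y.C[:, 1:] *= 2  # multiply by 2 to restore the original coordinate scale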


Jovendish commented Oct 4, 2023

Thanks for your reply.

In my code, I have three downsampling layers. I tried to scale the coordinates back to their original scale by multiplying them by two after each downsampling layer. However, this only works for the first downsampling layer; the subsequent downsampling layers no longer reduce the number of points. I'm unsure whether I made a mistake in my implementation or ran into some specific mechanism in torchsparse.

Partial code:

# imports presumed from the omitted parts of the script
import torch
from torch import nn

import torchsparse.nn as spnn
import torchsparse.nn.functional as F

F.set_conv_mode(2)
F.set_kmap_mode('hashmap')
F.set_downsample_mode('minkowski')

class Encoder(torch.nn.Module):
    def __init__(self, channels=[1, 16, 32, 64, 32, 8]):
        super().__init__()

        self.stack_0 = nn.Sequential(
            spnn.Conv3d(channels[0], channels[1], 3, 1, bias=True),
            spnn.ReLU(inplace=True),
            spnn.Conv3d(channels[1], channels[2], 2, 2, bias=True),  # DownScale
            spnn.ReLU(inplace=True),
        )

        self.stack_1 = nn.Sequential(
            spnn.Conv3d(channels[2], channels[2], 3, 1, bias=True),
            spnn.ReLU(inplace=True),
            spnn.Conv3d(channels[2], channels[3], 2, 2, bias=True),  # DownScale
            spnn.ReLU(inplace=True),
        )

        self.stack_2 = nn.Sequential(
            spnn.Conv3d(channels[3], channels[3], 3, 1, bias=True),
            spnn.ReLU(inplace=True),
            spnn.Conv3d(channels[3], channels[4], 2, 2, bias=True),  # DownScale
            spnn.ReLU(inplace=True),
        )

    def forward(self, x):
        out_0 = self.stack_0(x)
        out_0.C[:, 1:] *= 2  # rescale coordinates back to the original scale

        out_1 = self.stack_1(out_0)
        out_1.C[:, 1:] *= 2

        out_2 = self.stack_2(out_1)
        out_2.C[:, 1:] *= 2

        return [out_2, out_1, out_0]

Results:

[Screenshots: WechatIMG21893, WechatIMG21894, WechatIMG21895, WechatIMG21896]

ys-2020 commented Oct 9, 2023

Hi @Jovendish, in the 2nd and 3rd layers you are downsampling by stride=2 on coordinates that have already been multiplied by 2, so the number of points remains the same as in the previous layer.

A potential solution might be:

def forward(self, x):
    out_0 = self.stack_0(x)
    out_1 = self.stack_1(out_0)
    out_2 = self.stack_2(out_1)

    # rescale each output by its cumulative stride after all downsampling is done
    out_0.C[:, 1:] *= 2
    out_1.C[:, 1:] *= 4
    out_2.C[:, 1:] *= 8

    return [out_2, out_1, out_0]

@Jovendish

Thank you very much for your patience. Actually, I want to scale the coordinates back right after each layer, because I need to do some extra work in between the layers. I'm also wondering why torchsparse v2.1 changed the behavior of the downsampling layer. Were there any particular considerations?

@zhijian-liu

You can follow @ys-2020's approach: clone the coordinate tensor and do the scaling on the copy in the middle. We changed this behavior to follow SpConv.
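
A minimal sketch of that clone-and-scale idea, adapted from the Encoder.forward above, might look like this (the extra_work hook is hypothetical):

def forward(self, x):
    out_0 = self.stack_0(x)
    coords_0 = out_0.C.clone()
    coords_0[:, 1:] *= 2              # scaled copy at the original scale
    # extra_work(out_0.F, coords_0)   # hypothetical per-stage processing
    # out_0.C itself stays at the downsampled scale for the next stack

    out_1 = self.stack_1(out_0)
    coords_1 = out_1.C.clone()
    coords_1[:, 1:] *= 4

    out_2 = self.stack_2(out_1)
    coords_2 = out_2.C.clone()
    coords_2[:, 1:] *= 8

    return [out_2, out_1, out_0], [coords_2, coords_1, coords_0]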
