
Several Questions #383

Closed

qsisi opened this issue Nov 20, 2021 · 17 comments

Comments

@qsisi

qsisi commented Nov 20, 2021

Hello! Thanks for open-sourcing this amazing repository. I have several fundamental questions:

  1. As I understand it, spconv is essentially another implementation of SparseConvNet, so there is basically not much difference between the two implementations, right?
  2. SparseConv3d follows the regular convolution definition, but for SubMConv3d an output is computed only when the kernel center covers an input site. So I don't understand how the param stride affects the output of SubMConv3d; based on my understanding, there is actually no stride in SubMConv3d?
  3. Could you kindly offer a code example for something like a UNet (encoder-decoder)? I found that after stacking SparseConv3d and SparseConvTranspose3d, the resolution of the point cloud just can't be recovered.

Hoping to get your answers!

@FindDefinition
Collaborator

FindDefinition commented Nov 20, 2021

  1. Yes, both spconv and SparseConvNet give the same result.
    There are several open-source implementations: SparseConvNet (the first open-source library with CUDA support), spconv 1.x, MinkowskiEngine, torchsparse, and spconv 2.x. spconv 2.x is the fastest implementation, and it is much faster than the others when using fp16.
  2. You are right; the stride and padding arguments aren't used, they are calculated here.
  3. SparseConvTranspose3d is equivalent to the dense version. For a UNet you should use SparseInverseConv3d; see this for a short description (a minimal sketch also follows below). More examples are a work in progress and not available for now. You can check PointGroup (based on spconv 1.x) to see an example. torch-points3d has more examples, but it doesn't support spconv.
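A minimal sketch of point 3, assuming spconv 2.x; the channel widths and indice_key names are illustrative, not from this thread:

import spconv.pytorch as spconv
import torch.nn as nn

class TinySparseUNet(nn.Module):
    def __init__(self):
        super().__init__()
        # encoder: submanifold conv, then downsample by 2 while recording
        # the indice bookkeeping under the key "down1"
        self.encoder = spconv.SparseSequential(
            spconv.SubMConv3d(1, 16, 3, indice_key="subm1"),
            spconv.SparseConv3d(16, 32, 3, stride=2, indice_key="down1"),
        )
        # decoder: SparseInverseConv3d reuses the "down1" indices, so its
        # output sites are exactly the input sites of the matching conv
        self.decoder = spconv.SparseSequential(
            spconv.SparseInverseConv3d(32, 16, 3, indice_key="down1"),
        )

    def forward(self, x: spconv.SparseConvTensor) -> spconv.SparseConvTensor:
        return self.decoder(self.encoder(x))

This is why SparseConvTranspose3d cannot recover the resolution: it generates new output sites like a dense transposed conv, while the inverse conv restores the exact sparsity pattern recorded by its indice_key partner.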

@qsisi
Author

qsisi commented Nov 20, 2021

Thanks for your explanations, I totally get your points now.

Further, I have a couple of follow-up questions; would you mind taking a look:

  1. I'm currently pre-processing the input point cloud like this:

import numpy as np
import torch
import spconv.pytorch as spconv
from spconv.utils import Point2VoxelCPU3d
from cumm import tensorview as tv

pts = 10 * np.random.uniform(0, 1, (5000, 3))

### voxelization ###
xyzmin = np.percentile(pts, 0, axis=0)
xyzmax = np.percentile(pts, 100, axis=0)
voxel_size = 0.025
voxel_generator = Point2VoxelCPU3d(
    vsize_xyz=[voxel_size] * 3,
    coors_range_xyz=[xyzmin[0], xyzmin[1], xyzmin[2],
                     xyzmax[0], xyzmax[1], xyzmax[2]],
    num_point_features=3,
    max_num_voxels=500000,  # a large number
    max_num_points_per_voxel=1)
pts = tv.from_numpy(np.asarray(pts))
voxels_tv, indices_tv, num_p_in_vx_tv = voxel_generator.point_to_voxel_empty_mean(pts)

### prepare for sparse input ###
indices_np = indices_tv.numpy_view()
features = torch.ones((len(indices_np), 1))
coords = torch.as_tensor(indices_np)
coords = torch.cat((torch.zeros((len(coords), 1)), coords), dim=-1)
coords[:, 1], coords[:, 3] = coords[:, 3], coords[:, 1]
coords_min = np.floor(np.percentile(coords, 0, axis=0)).astype(np.int32)
coords_max = np.ceil(np.percentile(coords, 100, axis=0)).astype(np.int32)
sparse_shape = (coords_max - coords_min)[1:]
input_sp_tensor = spconv.SparseConvTensor(features.cuda(), coords.int().cuda(), sparse_shape, batch_size=1)

Actually, I don't understand what the param sparse_shape means here. Am I doing it right by just passing the coordinate range into it, or is there a better way of doing this?

  2. All of the layers need to be put inside SparseSequential, right?

Again, thank you so much for your help.

@FindDefinition
Collaborator

FindDefinition commented Nov 20, 2021

@qsisi

  1. the spatial shape (sparse_shape) is used to:
  • bound the spatial range of coords
  • convert coords to a scalar int, and convert the scalar back to coords.
  2. No, SparseSequential is only used for auto-conversion when you use nn.BatchNorm1d from torch.nn. You can call batchnorm/relu/... in forward by
x = x.replace_feature(F.relu(self.bn(x.features)))

Actually, we could rewrite BatchNorm to make it accept a SparseConvTensor and then just use nn.Sequential.
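For example, a minimal sketch of that idea (my own illustration, not an existing spconv class):

import torch.nn as nn
from spconv.pytorch import SparseConvTensor

class SparseBatchNorm1d(nn.BatchNorm1d):
    # accept a SparseConvTensor directly, so plain nn.Sequential works
    def forward(self, x: SparseConvTensor) -> SparseConvTensor:
        # x.features is the (num_active_sites, num_channels) matrix
        return x.replace_feature(super().forward(x.features))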

Note: you shouldn't use Point2VoxelCPU3d because it can't be pickled; use spconv.pytorch.utils.PointToVoxel instead.

@qsisi
Author

qsisi commented Nov 20, 2021

  1. I'm now computing the exact range of the x, y, z coordinates and passing it as the spatial shape. But what if I pass a randomly large or small range list as spatial_shape? Constructing the sparse tensor with a random spatial_shape doesn't seem to report any error.
  2. Actually, I don't understand what 'pickled' means, but I just changed Point2VoxelCPU3d to spconv.pytorch.utils.PointToVoxel, and it works fine.

@qsisi
Author

qsisi commented Nov 20, 2021

Same situation for the param coors_range_xyz in PointToVoxel: passing in two different large range lists gives two different numbers of voxels as output. Is there any suggested strategy for selecting coors_range_xyz?

@FindDefinition
Collaborator

The "PointToVoxel" process is actually quantization of real pointcloud. it convert points from real-valued coords to quantized coords starts with [0, 0, 0], ends with spatial shape.
You can select a large range that can cover your point cloud.
The only limit of coord range its that volume of spatial shape need to smaller than 2 ^ 31 for now, I plan to release this limit in future.
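A small sketch of the quantization described above (my own illustration of the arithmetic, not spconv internals):

import numpy as np

voxel_size = np.array([0.025, 0.025, 0.025])
range_min = np.array([0.0, 0.0, 0.0])     # lower corner of coors_range_xyz
range_max = np.array([10.0, 10.0, 10.0])  # upper corner of coors_range_xyz

# the spatial shape is the grid extent implied by the range and voxel size
spatial_shape = np.ceil((range_max - range_min) / voxel_size).astype(np.int64)
assert np.prod(spatial_shape) < 2 ** 31, "grid volume must stay below 2^31"

# each point maps to the integer voxel containing it, starting at [0, 0, 0]
points = np.random.uniform(0.0, 10.0, (5000, 3))
voxel_coords = np.floor((points - range_min) / voxel_size).astype(np.int64)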

@qsisi
Author

qsisi commented Nov 23, 2021

Many thanks for your attention. I have a new problem:

I'm voxelizing the input point cloud to build sparse tensors. The input point cloud (17397 points) looks like this:
[image: screenshot of the input point cloud]
I'm using the following script for voxelization:

# build sparse tensor
xyzmin, xyzmax = np.percentile(pcd, 0, axis=0), np.percentile(pcd, 100, axis=0)
voxel_generator = PointToVoxel(
    vsize_xyz=[self.voxel_size] * 3,
    coors_range_xyz=[xyzmin[0], xyzmin[1], xyzmin[2],
                     xyzmax[0], xyzmax[1], xyzmax[2]],
    num_point_features=3,
    max_num_voxels=500000,
    max_num_points_per_voxel=1)
voxels_tv, indices_tv, _ = voxel_generator(torch.from_numpy(pcd).contiguous())
voxels_pts, voxels_coords = voxels_tv.numpy().squeeze(1), indices_tv.numpy()
voxels_coords[:, 0], voxels_coords[:, 2] = voxels_coords[:, 2], voxels_coords[:, 0]

I assume that voxels_coords (11593 points) holds the discrete coordinates of the point cloud, but when I visualize it using Open3D, this is the result:
[image: screenshot of the visualized voxels_coords, which form a flat plane]
It is clearly a PLANE, not the expected voxelization of the input point cloud. I don't know how to fix this; could you kindly help me with it?

Thanks.

@FindDefinition
Collaborator

@qsisi this is due to this line:
voxels_coords[:, 0], voxels_coords[:, 2] = voxels_coords[:, 2], voxels_coords[:, 0]
You can only perform this kind of swap at the Python-object level, not in place on a NumPy array: the right-hand side holds views into the same array, so the first assignment overwrites the data that the second one reads, leaving two identical columns.
I recommend using this instead:
voxels_coords = voxels_coords[:, [2, 1, 0]]
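A quick demonstration of the aliasing (my own example, not from the thread):

import numpy as np

a = np.array([[1, 2, 3]])
a[:, 0], a[:, 2] = a[:, 2], a[:, 0]  # in-place swap through views
print(a)  # [[3 2 3]] -- columns 0 and 2 end up identical

b = np.array([[1, 2, 3]])
b = b[:, [2, 1, 0]]  # fancy indexing returns a copy, so this swap is safe
print(b)  # [[3 2 1]]

With two identical columns, every coordinate lies on the plane x = z, which is exactly the flat visualization above.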

@qsisi
Author

qsisi commented Nov 23, 2021

Thank you so much for your help! It works for me now.

Also, I have a naive question: can spconv be used for dense feature extraction or point cloud semantic segmentation? It is a little confusing to me because, for every input point cloud, voxelization removes some points from the original cloud, say from N to N' during quantization, so an encoder-decoder network will only output N' features. How can I recover the resolution from N' back to N at the output end?

Thanks for taking the time to answer these questions.

@Number1JT

@qsisi
A naive solution: let N be the number of points and M the number of voxels. Since you know the mapping between each point and its voxel, you can query the corresponding voxel feature and append it to the point feature; the concatenated features can then be used for the downstream segmentation task. (A sketch follows below.)
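A hedged sketch of that gather (the name pc_voxel_id is illustrative here; spconv's actual per-point voxel-id output is discussed just below):

import torch

N, M, C_pt, C_vox = 5000, 1200, 3, 64
point_feats = torch.randn(N, C_pt)       # per-point input features
voxel_feats = torch.randn(M, C_vox)      # network output, one row per voxel
pc_voxel_id = torch.randint(0, M, (N,))  # voxel id of each point

# gather each point's voxel feature and concatenate it with the point feature
fused = torch.cat([point_feats, voxel_feats[pc_voxel_id]], dim=1)  # (N, C_pt + C_vox)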

@FindDefinition
Collaborator

To support the method above, I need to add a voxel_id_of_pc output to PointToVoxel; it will be added in the next bug-fix release (v2.1.12).

@qsisi
Author

qsisi commented Nov 23, 2021

@qsisi A naive solution: let N be the number of points and M the number of voxels. Since you know the mapping between each point and its voxel, you can query the corresponding voxel feature and append it to the point feature; the concatenated features can then be used for the downstream segmentation task.

Thanks for your answer! Actually, I already did this when I was using MinkowskiEngine as my network backbone; Mink has an API for exactly this situation:

### MinkowskiEngine
output.features_at_coordinates(query)

As for spconv, I assumed there had to be another (better) way of doing this, so I came straight here to ask.

@qsisi
Author

qsisi commented Nov 23, 2021

To support the method above, I need to add a voxel_id_of_pc output to PointToVoxel; it will be added in the next bug-fix release (v2.1.12).

That would be much better than writing a slow Python interface for such a querying operation.

Looking forward to that functionality.

@FindDefinition
Collaborator

@qsisi The voxel id is available now; see this example.

@qsisi
Author

qsisi commented Nov 24, 2021

Thank you so much for the quick support, can't wait to try it out.

@qsisi
Author

qsisi commented Nov 26, 2021

I noticed that when using spconv==2.1.13, this error:

RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

shows up again; downgrading spconv to 2.1.11 makes it go away.

@FindDefinition
Collaborator

My mistake, I added "get_cuda_stream" back into the CPU voxel generator in spconv 2.1.12...
For now you can remove that line from the package folder installed in your env. I need to debug a serious bug in another issue and will fix both in the next release.
