
Several Questions #383

Closed

qsisi opened this issue Nov 20, 2021 · 17 comments

Comments

@qsisi

qsisi commented Nov 20, 2021

Hello! Thanks for open-sourcing this amazing repository. I have several fundamental questions:

  1. As I understand it, spconv is essentially another implementation of SparseConvNet, so there is basically not much difference between the two implementations, right?
  2. SparseConv3d follows the regular convolution definition, but for SubMConv3d an output is computed only when the kernel center covers an input site. So I don't understand how the param stride affects the output of SubMConv3d; based on my understanding, there is actually no stride in SubMConv3d?
  3. Could you kindly offer a code example for something like a UNet (encoder-decoder)? I found that after stacking SparseConv3d and SparseConvTranspose3d, the resolution of the point cloud just can't be recovered.

Hoping to get your answers!

@FindDefinition
Collaborator

FindDefinition commented Nov 20, 2021

  1. Yes, both spconv and SparseConvNet give the same result.
    There are several open-source implementations: SparseConvNet (the first open-source library with CUDA support), spconv 1.x, MinkowskiEngine, torchsparse, and spconv 2.x. spconv 2.x is the fastest implementation, and it is much faster than the others when using fp16.
  2. You are right; the stride and padding arguments aren't used, they are calculated here.
  3. SparseConvTranspose3d is equivalent to the dense version. For a UNet you should use SparseInverseConv3d; see this for a short description (a minimal sketch also follows below). More examples are a work in progress and not available for now. You can check PointGroup (based on spconv 1.x) to see an example. torch-points3d has more examples, but it doesn't support spconv.
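A minimal sketch of point 3, assuming spconv 2.x; the channel widths and indice_key names are illustrative, not from this thread:

import spconv.pytorch as spconv
import torch.nn as nn

class TinySparseUNet(nn.Module):
    def __init__(self):
        super().__init__()
        # encoder: submanifold conv, then downsample by 2 while recording
        # the indice bookkeeping under the key "down1"
        self.encoder = spconv.SparseSequential(
            spconv.SubMConv3d(1, 16, 3, indice_key="subm1"),
            spconv.SparseConv3d(16, 32, 3, stride=2, indice_key="down1"),
        )
        # decoder: SparseInverseConv3d reuses the "down1" indices, so its
        # output sites are exactly the input sites of the matching conv
        self.decoder = spconv.SparseSequential(
            spconv.SparseInverseConv3d(32, 16, 3, indice_key="down1"),
        )

    def forward(self, x: spconv.SparseConvTensor) -> spconv.SparseConvTensor:
        return self.decoder(self.encoder(x))

This is why SparseConvTranspose3d cannot recover the resolution: it generates new output sites like a dense transposed conv, while the inverse conv restores the exact sparsity pattern recorded by its indice_key partner.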

@qsisi
Author

qsisi commented Nov 20, 2021

Thanks for your explanations, I totally get your points now.

Further, I have a couple of follow-up questions; would you mind taking a look:

  1. I'm currently pre-processing the input point cloud like this:

import numpy as np
import torch
import spconv.pytorch as spconv
from spconv.utils import Point2VoxelCPU3d
from cumm import tensorview as tv

pts = 10 * np.random.uniform(0, 1, (5000, 3))

### voxelization ###
xyzmin = np.percentile(pts, 0, axis=0)
xyzmax = np.percentile(pts, 100, axis=0)
voxel_size = 0.025
voxel_generator = Point2VoxelCPU3d(
    vsize_xyz=[voxel_size] * 3,
    coors_range_xyz=[xyzmin[0], xyzmin[1], xyzmin[2],
                     xyzmax[0], xyzmax[1], xyzmax[2]],
    num_point_features=3,
    max_num_voxels=500000,  # a large number
    max_num_points_per_voxel=1)
pts = tv.from_numpy(np.asarray(pts))
voxels_tv, indices_tv, num_p_in_vx_tv = voxel_generator.point_to_voxel_empty_mean(pts)

### prepare for sparse input ###
indices_np = indices_tv.numpy_view()
features = torch.ones((len(indices_np), 1))
coords = torch.as_tensor(indices_np)
coords = torch.cat((torch.zeros((len(coords), 1)), coords), dim=-1)
coords[:, 1], coords[:, 3] = coords[:, 3], coords[:, 1]
coords_min = np.floor(np.percentile(coords, 0, axis=0)).astype(np.int32)
coords_max = np.ceil(np.percentile(coords, 100, axis=0)).astype(np.int32)
sparse_shape = (coords_max - coords_min)[1:]
input_sp_tensor = spconv.SparseConvTensor(features.cuda(), coords.int().cuda(), sparse_shape, batch_size=1)

Actually, I don't understand what the param sparse_shape means here. Am I doing it right by just passing the coordinate range into it, or is there a better way of doing this?

  2. All of the layers need to be put inside SparseSequential, right?

Again, thank you so much for your help.

@FindDefinition
Collaborator

FindDefinition commented Nov 20, 2021

@qsisi

  1. the spatial shape (sparse_shape) is used to:
  • bound the spatial range of coords
  • convert coords to a scalar int, and convert the scalar back to coords.
  2. No, SparseSequential is only used for auto-conversion when you use nn.BatchNorm1d from torch.nn. You can call batchnorm/relu/... in forward by
x = x.replace_feature(F.relu(self.bn(x.features)))

Actually, we could rewrite BatchNorm to make it accept a SparseConvTensor and then just use nn.Sequential.
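For example, a minimal sketch of that idea (my own illustration, not an existing spconv class):

import torch.nn as nn
from spconv.pytorch import SparseConvTensor

class SparseBatchNorm1d(nn.BatchNorm1d):
    # accept a SparseConvTensor directly, so plain nn.Sequential works
    def forward(self, x: SparseConvTensor) -> SparseConvTensor:
        # x.features is the (num_active_sites, num_channels) matrix
        return x.replace_feature(super().forward(x.features))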

Note: you shouldn't use Point2VoxelCPU3d because it can't be pickled; use spconv.pytorch.utils.PointToVoxel instead.

@qsisi
Author

qsisi commented Nov 20, 2021

  1. I'm now computing the exact range of the x, y, z coordinates and passing it as the spatial shape. But what if I pass a randomly large or small range list as spatial_shape? Constructing the sparse tensor with a random spatial_shape doesn't seem to report any error.
  2. Actually, I don't understand what 'pickled' means, but I just changed Point2VoxelCPU3d to spconv.pytorch.utils.PointToVoxel, and it works fine.

@qsisi
Author

qsisi commented Nov 20, 2021

Same situation for the param coors_range_xyz in PointToVoxel: passing in two different large range lists gives two different numbers of voxels as output. Is there any suggested strategy for selecting coors_range_xyz?

@FindDefinition
Collaborator

The "PointToVoxel" process is actually quantization of real pointcloud. it convert points from real-valued coords to quantized coords starts with [0, 0, 0], ends with spatial shape.
You can select a large range that can cover your point cloud.
The only limit of coord range its that volume of spatial shape need to smaller than 2 ^ 31 for now, I plan to release this limit in future.
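A small sketch of the quantization described above (my own illustration of the arithmetic, not spconv internals):

import numpy as np

voxel_size = np.array([0.025, 0.025, 0.025])
range_min = np.array([0.0, 0.0, 0.0])     # lower corner of coors_range_xyz
range_max = np.array([10.0, 10.0, 10.0])  # upper corner of coors_range_xyz

# the spatial shape is the grid extent implied by the range and voxel size
spatial_shape = np.ceil((range_max - range_min) / voxel_size).astype(np.int64)
assert np.prod(spatial_shape) < 2 ** 31, "grid volume must stay below 2^31"

# each point maps to the integer voxel containing it, starting at [0, 0, 0]
points = np.random.uniform(0.0, 10.0, (5000, 3))
voxel_coords = np.floor((points - range_min) / voxel_size).astype(np.int64)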

@qsisi
Author

qsisi commented Nov 23, 2021

Many thanks for your attention. I have a new problem:

I'm voxelizing the input point cloud to build sparse tensors. The input point cloud (17397 points) looks like this:
[image: screenshot of the input point cloud]
I'm using the following script for voxelization:

# build sparse tensor
xyzmin, xyzmax = np.percentile(pcd, 0, axis=0), np.percentile(pcd, 100, axis=0)
voxel_generator = PointToVoxel(
    vsize_xyz=[self.voxel_size] * 3,
    coors_range_xyz=[xyzmin[0], xyzmin[1], xyzmin[2],
                     xyzmax[0], xyzmax[1], xyzmax[2]],
    num_point_features=3,
    max_num_voxels=500000,
    max_num_points_per_voxel=1)
voxels_tv, indices_tv, _ = voxel_generator(torch.from_numpy(pcd).contiguous())
voxels_pts, voxels_coords = voxels_tv.numpy().squeeze(1), indices_tv.numpy()
voxels_coords[:, 0], voxels_coords[:, 2] = voxels_coords[:, 2], voxels_coords[:, 0]

I assume that voxels_coords (11593 points) holds the discrete coordinates of the point cloud, but when I visualize it using Open3D, this is the result:
[image: screenshot of the visualized voxels_coords, which form a flat plane]
It is clearly a PLANE, not the expected voxelization of the input point cloud. I don't know how to fix this; could you kindly help me with it?

Thanks.

@FindDefinition
Collaborator

@qsisi this is due to this line:
voxels_coords[:, 0], voxels_coords[:, 2] = voxels_coords[:, 2], voxels_coords[:, 0]
You can only perform this kind of swap at the Python-object level, not in place on a NumPy array: the right-hand side holds views into the same array, so the first assignment overwrites the data that the second one reads, leaving two identical columns.
I recommend using this instead:
voxels_coords = voxels_coords[:, [2, 1, 0]]
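A quick demonstration of the aliasing (my own example, not from the thread):

import numpy as np

a = np.array([[1, 2, 3]])
a[:, 0], a[:, 2] = a[:, 2], a[:, 0]  # in-place swap through views
print(a)  # [[3 2 3]] -- columns 0 and 2 end up identical

b = np.array([[1, 2, 3]])
b = b[:, [2, 1, 0]]  # fancy indexing returns a copy, so this swap is safe
print(b)  # [[3 2 1]]

With two identical columns, every coordinate lies on the plane x = z, which is exactly the flat visualization above.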

@qsisi
Author

qsisi commented Nov 23, 2021

Thank you so much for your help! It works for me now.

Also, I have a naive question: can spconv be used for dense feature extraction or point cloud semantic segmentation? It is a little confusing to me because, for every input point cloud, voxelization removes some points from the original cloud, say from N to N' during quantization, so an encoder-decoder network will only output N' features. How can I recover the resolution from N' back to N at the output end?

Thanks for taking the time to answer these questions.

@Number1JT

@qsisi
A naive solution: let N be the number of points and M the number of voxels. Since you know the mapping between each point and its voxel, you can query the corresponding voxel feature and append it to the point feature; the concatenated features can then be used for the downstream segmentation task. (A sketch follows below.)
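A hedged sketch of that gather (the name pc_voxel_id is illustrative here; spconv's actual per-point voxel-id output is discussed just below):

import torch

N, M, C_pt, C_vox = 5000, 1200, 3, 64
point_feats = torch.randn(N, C_pt)       # per-point input features
voxel_feats = torch.randn(M, C_vox)      # network output, one row per voxel
pc_voxel_id = torch.randint(0, M, (N,))  # voxel id of each point

# gather each point's voxel feature and concatenate it with the point feature
fused = torch.cat([point_feats, voxel_feats[pc_voxel_id]], dim=1)  # (N, C_pt + C_vox)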

@FindDefinition
Collaborator

To support the method above, I need to add a voxel_id_of_pc output to PointToVoxel; it will be added in the next bug-fix release (v2.1.12).

@qsisi
Author

qsisi commented Nov 23, 2021

@qsisi A naive solution: let N be the number of points and M the number of voxels. Since you know the mapping between each point and its voxel, you can query the corresponding voxel feature and append it to the point feature; the concatenated features can then be used for the downstream segmentation task.

Thanks for your answer! Actually, I already did this when I was using MinkowskiEngine as my network backbone; Mink has an API for exactly this situation:

### MinkowskiEngine
output.features_at_coordinates(query)

As for spconv, I assumed there had to be another (better) way of doing this, so I came straight here to ask.

@qsisi
Author

qsisi commented Nov 23, 2021

To support the method above, I need to add a voxel_id_of_pc output to PointToVoxel; it will be added in the next bug-fix release (v2.1.12).

That would be much better than writing a slow Python interface for such a querying operation.

Looking forward to that functionality.

@FindDefinition
Collaborator

@qsisi The voxel id is available now; see this example.

@qsisi
Author

qsisi commented Nov 24, 2021

Thank you so much for the quick support, can't wait to try it out.

@qsisi
Author

qsisi commented Nov 26, 2021

I noticed that when using spconv==2.1.13, this error:

RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

shows up again; downgrading spconv to 2.1.11 makes it go away.

@FindDefinition
Collaborator

My mistake, I added "get_cuda_stream" back into the CPU voxel generator in spconv 2.1.12...
For now you can remove that line from the package folder installed in your env. I need to debug a serious bug in another issue and will fix both in the next release.
