
Details about the code in model.py #14

Open
taylover-pei opened this issue Aug 18, 2021 · 4 comments

@taylover-pei

Thanks a lot for sharing the code. You have done great work!

I have some questions about your code: In the model.py file, can you provide more details about the get_geometry function and the voxel_pooling function? I'm so confused about how they actually work.

Thanks a lot!

@manueldiaz96

manueldiaz96 commented Aug 22, 2022

The get_geometry function uses the intrinsic and extrinsic matrices of each camera, together with the set of predefined depths, to work out where each pixel's projection ray lands in 3D space.
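
For intuition, here is a minimal sketch of that unprojection for a single pixel using the standard pinhole model (this is not the repo's exact code; intrin, rot and tran stand for the per-camera intrinsics and camera-to-ego extrinsics):

import torch

def unproject_pixel(u, v, depth, intrin, rot, tran):
    """Lift pixel (u, v) at a given depth to a 3D point in the ego frame.

    intrin: 3x3 intrinsic matrix K
    rot, tran: camera-to-ego rotation (3x3) and translation (3,)
    """
    # Scale the homogeneous pixel coordinates by depth, then undo the
    # intrinsics: P_cam = K^-1 @ (d * [u, v, 1]).
    pix = torch.tensor([u * depth, v * depth, depth])
    p_cam = torch.linalg.inv(intrin) @ pix
    # Apply the extrinsics to move from the camera frame to the ego frame.
    return rot @ p_cam + tran

get_geometry does essentially this, but vectorized over every pixel of the feature map and every depth plane at once, producing the frustum of 3D points that voxel_pooling later pools into the BEV grid.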
The voxel_pooling method first takes a cumulative sum of the features (akin to an integral image), then finds which consecutive features fall into the same BEV cell (identified by ranks in the function) and keeps only the last feature of each cell. Finally, it recovers the sum of the features projected onto each cell by subtracting from each kept cumulative value the one before it, i.e. by shifting the array by one position. Let me give you an example with some pseudocode:

Let's say we have some features and take their cumulative sum:

import torch

feats = torch.tensor([[1,1], [1,1], [2,2], [2,2], [0,0], [0,0], [1,1], [1,1], [2,2], [2,2]])
ft_cumsum = feats.cumsum(0)
>>> [[1,1], [2,2], [4,4], [6,6], [6,6], [6,6], [7,7], [8,8], [10,10], [12,12]]

Now the rank array (which tells us which cell each feature corresponds to, in flattened indexing) is used to build the boolean mask kept by checking for repeated ranks. When a rank is repeated, only the right-most entry is kept, since that one holds the sum of all the features that fall into that cell:

ranks = torch.tensor([0, 0, 2, 3, 3, 4, 5, 5, 6, 7])
kept = torch.ones(feats.shape[0], dtype=torch.bool)
kept[:-1] = (ranks[1:] != ranks[:-1])
>>> [False, True, True, False, True, True, False, True, True, True]

So as you can see, since the features at positions 0 and 1, 3 and 4, and 6 and 7 fall into the same cells (cells 0, 3 and 5, respectively), we keep only the position in the array that holds the sum of all the features falling within each cell; this is why it is called cumulative-sum pooling.

Having the features sum-pooled, we take the difference between the kept tensor without its first entry and the same tensor without its last entry, prepending the first entry unchanged (since there is nothing before it). This recovers the true per-cell sums:

ft_cumsum = ft_cumsum[kept]
ft_cumsum = torch.cat((ft_cumsum[:1], ft_cumsum[1:] - ft_cumsum[:-1]))
>>> [[2,2], [2,2], [2,2], [0,0], [2,2], [2,2], [2,2]]

If you sum by hand the features that fall into each cell, you will find that they match this result.
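
In case you want to run the whole thing yourself, here is the example above as one self-contained PyTorch script (a minimal sketch of the trick, not the repo's exact code):

import torch

# Per-point features and the flattened cell index (rank) each point falls into.
feats = torch.tensor([[1, 1], [1, 1], [2, 2], [2, 2], [0, 0],
                      [0, 0], [1, 1], [1, 1], [2, 2], [2, 2]])
ranks = torch.tensor([0, 0, 2, 3, 3, 4, 5, 5, 6, 7])

# 1) Cumulative sum along the point dimension (points must be sorted by rank).
ft_cumsum = feats.cumsum(0)

# 2) Keep only the last point of each run of equal ranks.
kept = torch.ones(feats.shape[0], dtype=torch.bool)
kept[:-1] = ranks[1:] != ranks[:-1]

# 3) Differences of consecutive kept cumulative sums = per-cell feature sums.
ft_cumsum = ft_cumsum[kept]
pooled = torch.cat((ft_cumsum[:1], ft_cumsum[1:] - ft_cumsum[:-1]))
print(pooled)  # [[2,2], [2,2], [2,2], [0,0], [2,2], [2,2], [2,2]]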

I know this answer comes a bit late, but I hope it helps others!
Good luck!

@Deephome

@manueldiaz96 Well done!

@VeeranjaneyuluToka

VeeranjaneyuluToka commented Sep 23, 2022

@manueldiaz96, I am wondering if the get_geometry implementation is based on some formula you could point me to? Thanks!

@manueldiaz96

Take a look at this paper, where equation 2 describes how the projection from the camera to 3D is done.
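
For reference, the underlying formula is the standard pinhole back-projection: given pixel coordinates $(u, v)$, a depth $d$, the intrinsic matrix $K$, and the camera-to-ego rotation and translation $R$ and $\mathbf{t}$, the 3D point is

$$\mathbf{P} = R \, K^{-1} \, d \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} + \mathbf{t}$$

get_geometry applies essentially this, vectorized, to every pixel and every depth plane at once.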

In their case, the depth comes from stereo depth estimation. For Lift-Splat-Shoot, as I answered you on issue #31, the network instead predicts, for each pixel, the certainty that the pixel is located at each depth plane. So rather than a single depth value per pixel, LSS uses a set of depth planes from 4 m to 45 m spaced 1 m apart, and the context vector is scaled by the classification score at each depth.
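
Concretely, the lift step is an outer product between the per-pixel depth distribution and the context vector. A minimal sketch with hypothetical tensor names and typical sizes (D depth bins, C context channels):

import torch

D, C, H, W = 41, 64, 8, 22        # depth bins, context channels, feature map size
x = torch.randn(1, D + C, H, W)   # hypothetical output of the depth/context head

depth = x[:, :D].softmax(dim=1)   # per-pixel distribution over the D depth planes
context = x[:, D:(D + C)]         # per-pixel context vector
# Outer product: each context vector is scaled by its score at every depth plane,
# giving one C-dimensional feature per (depth, pixel) location in the frustum.
frustum_feats = depth.unsqueeze(1) * context.unsqueeze(2)   # shape (1, C, D, H, W)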

If you want to understand this better, modify the code in this line to multiply by a ones tensor instead of x[:, self.D:(self.D + self.C)].unsqueeze(2), and see how the variable x looks after it is returned by get_voxels and before it is processed by the bevencode module. Use matplotlib or another visualization library to look at the arrays, and you will see what I am explaining.
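
Something like this, assuming the surrounding variable names from that part of the code (treat it as a sketch):

# Original lift: scale the context vector by the predicted depth scores.
new_x = depth.unsqueeze(1) * x[:, self.D:(self.D + self.C)].unsqueeze(2)

# Debug variant: multiply by ones instead, so only the depth distribution
# gets splatted into the BEV grid, which makes it easy to visualize.
ones = torch.ones_like(x[:, self.D:(self.D + self.C)].unsqueeze(2))
new_x = depth.unsqueeze(1) * ones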

If you want to understand this in more depth, I would recommend looking at how an image is formed in a camera, to see how the geometry that projects something in 3D down to a 2D image works. On issue #31 I also linked a series of blog posts which explain this using the camera matrices.
