Point cloud transform block different from paper #16

Closed
patrick-llgc opened this issue Mar 13, 2019 · 1 comment

patrick-llgc commented Mar 13, 2019

Hi!

Thanks for sharing the code. While reading the paper, I found that the point cloud transform block (the T-Net of the original PointNet paper) is implemented differently in the code from what the paper describes.

In the paper, as shown in Fig. 3, the coordinate difference between each point and its k nearest neighbors is concatenated with the coordinates of the point itself (therefore n x k x (3+3) = n x k x 6).
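For readers following along, here is a minimal NumPy sketch of how such an n x k x 6 tensor could be built. It is not the repository's code: the function name `knn_edge_features`, the brute-force neighbor search, the choice to exclude each point from its own neighbor list, and the concatenation order (center coordinates first, then the difference) are all assumptions made for illustration.

```python
import numpy as np

def knn_edge_features(points, k):
    """Build (n, k, 6) edge features from an (n, 3) point cloud.

    Hypothetical helper for illustration only.
    """
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    # Assumption: a point is not counted as its own neighbor.
    np.fill_diagonal(dist2, np.inf)
    # Indices of the k nearest neighbors of every point, shape (n, k).
    idx = np.argsort(dist2, axis=1)[:, :k]
    neighbors = points[idx]                             # (n, k, 3)
    center = np.repeat(points[:, None, :], k, axis=1)   # (n, k, 3)
    # Concatenate point coordinates with the neighbor offsets: 3 + 3 = 6.
    return np.concatenate([center, neighbors - center], axis=-1)

rng = np.random.default_rng(0)
pts = rng.normal(size=(32, 3))
feats = knn_edge_features(pts, k=8)
print(feats.shape)
```

With n = 32 and k = 8 the result has shape (32, 8, 6), matching the n x k x (3+3) count above.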

However, in the code there is one additional max pooling along the number-of-points axis compared to the original PointNet implementation. I do not quite understand why.

If someone can enlighten me or share their thoughts on this, I would be very grateful.

Edit:
I think I figured out why the input of the T-Net is n x k x 6: the input has already gone through the feature transformation defined here. However, I am still puzzled by the additional max pooling operation compared to the original PointNet implementation.

@WangYueFt
Owner

We initially wanted to use EdgeConv in the spatial transformer as we did for the main backbone.
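
To make the shape bookkeeping concrete, here is a rough NumPy sketch of why an EdgeConv-style transform block would need one more pooling step than a PointNet-style T-Net. It assumes the extra step is the EdgeConv aggregation over the k neighbors, which is then followed by the usual global pooling over the n points. The function names are hypothetical, and the learned layers that a real network applies between the two poolings are omitted.

```python
import numpy as np

def pool_edges_then_points(edge_feats):
    """Reduce (n, k, c) edge features to a (c,) global descriptor."""
    # EdgeConv-style aggregation: max over the k neighbors of each point.
    per_point = edge_feats.max(axis=1)   # (n, k, c) -> (n, c)
    # PointNet-style global pooling: max over the n points.
    return per_point.max(axis=0)         # (n, c) -> (c,)

def pool_points_only(point_feats):
    """PointNet-style T-Net pooling on (n, c) per-point features."""
    return point_feats.max(axis=0)       # (n, c) -> (c,)

rng = np.random.default_rng(0)
edge_feats = rng.normal(size=(32, 8, 6))   # n=32, k=8, c=6
g = pool_edges_then_points(edge_feats)
print(g.shape)
```

A PointNet-style block starts from n x c features and pools once. A block fed with n x k x c edge features has an extra neighbor axis to collapse before the global descriptor can be formed.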
