Added parallel implementation of PrincipalCurvaturesEstimation #6048
mvieth merged 4 commits into PointCloudLibrary:master from alexnavtt:master
Hey @alexnavtt, I just noticed this comment in the documentation: I wonder if it returns the same information? I haven't looked deeply into this. Edit:
As for having both PrincipalCurvaturesEstimation and a PrincipalCurvaturesEstimationOMP: I think the latest direction was to move the algorithm into the base class only, and to keep in the OMP subclass only the methods for setting the number of threads etc. So could PrincipalCurvaturesEstimation be made non-stateful as well, so that we keep only a single implementation? What's your opinion, @mvieth?
I think it could make sense to just parallelize the existing class, and not add another class for the parallel implementation. The only concern I would have is whether changing
Alright, this all sounds good. I'll run a quick benchmark today or tomorrow to see if it makes any difference (personally I'm expecting it won't, since statically sized Eigen matrices are allocated on the stack) and then I'll move the edits to the base class. The only remaining question now is whether the parallelized PrincipalCurvaturesEstimation should have a default thread number of 0 or 1. I'm leaning towards leaving it at 1 to maintain API stability; people can manually change that if they want parallelization.
That is true, but I could imagine that it might make a difference due to the
I am fine with either since the output is the same, and I would not consider parallelization a breaking change. But if you want 1 to be the default, that is also okay.
I've migrated the changes to the base class. Benchmarking showed about a 1% slowdown using local variables compared to storing them as class members, but that seems like a fair trade to me with the option for parallelization. I kept the default thread number at 1.
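For illustration, the trade-off described in this comment (per-call local variables instead of class members, so that the loop can be parallelized) could look roughly like the sketch below. It does not use PCL: `CurvatureSketch`, `computePoint` and `computeAll` are hypothetical names, the 2x2 eigenvalue computation is only a toy stand-in for the real per-point work, and the `nr_threads` / `chunk_size` defaults simply mirror the ones in this PR. If OpenMP is not enabled at compile time, the pragma is ignored and the loop runs serially.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-point output (not PCL's point type).
struct CurvatureSketch {
  float pc1;
  float pc2;
};

// Hypothetical per-point computation. Every intermediate value is a local,
// so concurrent calls for different indices share no mutable state.
inline CurvatureSketch computePoint(const std::vector<float>& in, std::size_t i)
{
  // Toy symmetric 2x2 matrix [[a, b], [b, c]] built from the input value.
  const float a = in[i];
  const float b = 0.5f * in[i];
  const float c = 2.0f;
  // Closed-form eigenvalues of a symmetric 2x2 matrix.
  const float mean = 0.5f * (a + c);
  const float radius = std::sqrt(0.25f * (a - c) * (a - c) + b * b);
  return {mean + radius, mean - radius};
}

// Each iteration writes only to its own output slot, which is what makes
// the loop safe to distribute across threads.
inline std::vector<CurvatureSketch> computeAll(const std::vector<float>& in,
                                               unsigned int nr_threads = 1,
                                               int chunk_size = 256)
{
  const int threads = static_cast<int>(nr_threads);
  assert(threads >= 1 && chunk_size >= 1);
  std::vector<CurvatureSketch> out(in.size());
  const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(in.size());
#pragma omp parallel for schedule(dynamic, chunk_size) num_threads(threads)
  for (std::ptrdiff_t i = 0; i < n; ++i) {
    out[static_cast<std::size_t>(i)] = computePoint(in, static_cast<std::size_t>(i));
  }
  return out;
}
```

Because no iteration depends on another, calling `computeAll(in, 1)` and `computeAll(in, 4, 64)` on the same input should give identical values per point; only the scheduling differs.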
The above are only suggestions, as we try to move towards https://clang.llvm.org/extra/clang-tidy/checks/modernize/use-default-member-init.html
     * low will lead to more parallelization overhead. Setting it too high
     * will lead to a worse balancing between the threads.
     */
    PrincipalCurvaturesEstimation (unsigned int nr_threads = 1, int chunk_size = 256) :
This would mean that it is no longer possible to set nr_threads in the constructor, and that it is no longer possible to change chunk_size at all? Btw the clang-tidy CI already checks use-default-member-init https://github.com/PointCloudLibrary/pcl/blob/master/.clang-tidy#L18
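One way both goals might be reconciled is sketched below: the members carry default member initializers, while the constructor still accepts `nr_threads` and `chunk_size`. `EstimatorSketch` and its accessors are hypothetical and not the class from this PR. The handling of `nr_threads == 0` as "choose automatically" is an assumption, and `std::thread::hardware_concurrency()` is used here only as a stand-in for whatever OpenMP query the real class uses. As far as I understand the check, it targets members initialized from constants in the constructor, so initializing from a constructor parameter should not be flagged.

```cpp
#include <cassert>
#include <thread>

// Hypothetical class, only meant to illustrate the initialization pattern.
class EstimatorSketch {
public:
  // Constructor arguments override the defaults declared on the members.
  explicit EstimatorSketch(unsigned int nr_threads = 1, int chunk_size = 256)
  : chunk_size_(chunk_size)
  {
    setNumberOfThreads(nr_threads);
  }

  // Assumption: 0 means "choose automatically". hardware_concurrency() may
  // itself return 0 when the value is unknown, hence the fallback to 1.
  void setNumberOfThreads(unsigned int nr_threads = 0)
  {
    if (nr_threads == 0) {
      const unsigned int hw = std::thread::hardware_concurrency();
      threads_ = (hw == 0) ? 1u : hw;
    }
    else {
      threads_ = nr_threads;
    }
  }

  unsigned int getNumberOfThreads() const { return threads_; }
  int getChunkSize() const { return chunk_size_; }

private:
  // Default member initializers: one place that documents the defaults.
  unsigned int threads_{1};
  int chunk_size_{256};
};
```

With this shape, `EstimatorSketch est(4, 64);` still works, and a default-constructed object keeps 1 thread and a chunk size of 256.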
I was using PrincipalCurvaturesEstimation for a project and noticed that it didn't have an OMP variant like a lot of the other features, and the runtime on this algorithm is fairly slow. I took inspiration from the NormalEstimation -> NormalEstimationOMP comparison. I've tested my implementation on a sample pointcloud, and verified that all curvatures were exactly equal to those calculated using the single-threaded version, and it was calculated in parallel as expected.
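A check of the kind described here ("exactly equal" between the single-threaded and the parallel run) could be sketched as follows. `sameValue` and `allSame` are hypothetical helpers, and the flat `std::vector<float>` is a simplification of the real output cloud. The sketch assumes that invalid points may be written as NaN, in which case a plain `==` would report a mismatch even when both runs agree.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Exact comparison with no tolerance; two NaNs count as matching so that
// points marked invalid in both runs do not fail the check.
inline bool sameValue(float a, float b)
{
  return a == b || (std::isnan(a) && std::isnan(b));
}

// Compares two flattened result arrays element by element.
inline bool allSame(const std::vector<float>& reference,
                    const std::vector<float>& candidate)
{
  if (reference.size() != candidate.size()) {
    return false;
  }
  for (std::size_t i = 0; i < reference.size(); ++i) {
    if (!sameValue(reference[i], candidate[i])) {
      return false;
    }
  }
  return true;
}
```

Exact equality is a reasonable expectation only because each point is computed independently; a reduction across threads would generally need a tolerance instead.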
I do have a couple of questions to double-check with a reviewer, to make sure I did this right:
getNormalVector3fMap() where applicable. I just want to verify that any class with a normal is guaranteed to have that method available.