New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Small Feature] Parallelize KD Tree build #4559
[Small Feature] Parallelize KD Tree build #4559
Conversation
I would suggest to remove the parallel tag in the very first example. There should be a section at the end on parallelism, where you build with the parallel tag and at the same time show a parallel for loop for performing parallel queries. |
Why is |
I updated the doc with a Version 2 following @afabri's review. I'm currently investigating the performances behavior with respect to the bucket size: we present results with the default value (10) and with 20, and the performances with 20 are always better in our performance table (it was already the case before this small feature), which probably makes users wonder why 20 is not the default value. |
So I did some more experiment regarding the computation time VS bucket size, and it turns out that for the 800k points case used in the manual, the bucket size that gives the shortest computation time is more around 100: (Violet is sequential, green is parallel.) The manual does explain that the bucket size should increase with the size of the point cloud, which is confirmed by the following experiment: if I randomly simplify the point cloud to keep only 10% of points (80k), then the optimal bucket size drops to around 50. I'm not sure what we should do about this. It looks like the bucket size could be roughly estimated as a factor of the logarithm of the number of points, but do we want to include such an automatic selection in the KD Tree? Or do we keep it as is, meaning it's the responsibility of users to choose the best bucket size according to their needs? |
…ree_build-GF' into Spatial_searching-Parallelize_kd_tree_build-GF
The optimal bucket size may also depend on the kernel (and in particular the dimension). |
|
…lize_kd_tree_build-GF [Small Feature] Parallelize KD Tree build
Successfully tested in https://cgal.geometryfactory.com/CGAL/testsuite/results-5.1-Ic-130.shtml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change.md must be updated + I agree that the commented code in remove should be enabled.
@@ -51,15 +51,12 @@ namespace CGAL { | |||
typedef typename Kdt::iterator iterator; | |||
typedef typename Kdt::D D; | |||
|
|||
bool leaf; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is that change required in this PR ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not clear to me that this will not negatively impact the performance so except if you show that it is faster you should undo that change.
Co-Authored-By: Sebastien Loriot <sloriot.ml@gmail.com>
…ree_build-GF' into Spatial_searching-Parallelize_kd_tree_build-GF
Did you read my comment in this PR? Do you disagree with it? |
Sorry I missed it. |
Successfully tested in https://cgal.geometryfactory.com/CGAL/testsuite/results-5.1-Ic-139.shtml. |
Rationale
Many algorithms (notably in the Point Set Processing package) rely on the Spatial Searching package for fast neighbor queries. Although many PSP algorithms take advantage of trivial parallelisation to speed-up computation by querying the
kd
tree in parallel for each point, a substantial time can be spend building thekd
tree, especially for very large point clouds. Up to a third of the computation time of algorithms likeCGAL::jet_estimate_normals()
can be spent building thekd
tree when queries are done in parallel.This feature adds a parallel variant of the
KD_tree::build()
member function. On a quad-core processor, this parallel algorithm is experimentally 2 to 3 times faster than the sequential one, depending on the point cloud.Summary of API changes
A template parameter
ConcurrencyTag
is added to theKD_tree:build()
member function, with possible values beingCGAL::Sequential_tag
,CGAL::Parallel_tag
andCGAL::Parallel_if_available_tag
.For backward compatibility, an overload of
KD_tree::build()
without template parameter is provided, falling back to the sequential version. Thebuild()
called internally if the tree is not built at first query is also the sequential one. The rational is that parallelism should always be explicitly specified by users (also to avoid hidden changes of behavior in existing codes).License and copyright ownership
(No change)
CHANGES.md
(todo)
Submission
Version 1 (outdated)
build()
withCGAL::Parallel_if_available_tag
kd
tree design sectionVersion 2
build()
withCGAL::Parallel_tag
and a TBB loop for queriesStatus