Implement nearest query for BruteForce #1053

aprokop · 2024-04-03T21:02:47Z

This is a straightforward, unoptimized version of the nearest query for BruteForce. I think the k = 1 case should be separated out in the future, as it can be performed in a tiled manner similar to the spatial search. The same, however, can't be said about the k > 1 case.

Right now, a single thread is allocated per predicate, and that thread goes through all indexables. This limitation is due to the fact that we don't have a multi-threaded PriorityQueue.
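For readers unfamiliar with the approach, the per-predicate work described above can be sketched in plain serial C++. This is only an illustration, not the code in this PR: `nearest`, `Point`, `IndexDistance` and `CompareDistance` are hypothetical names, the indexables are assumed to be 1D points, and `std::priority_queue` stands in for ArborX's PriorityQueue.

```cpp
#include <queue>
#include <utility>
#include <vector>

// Hypothetical 1D "indexable"; the real code works on general geometries.
using Point = float;

// (index, distance) pair, in the spirit of PairIndexDistance.
using IndexDistance = std::pair<int, float>;

// Orders by distance so that the heap's top is the farthest candidate.
struct CompareDistance
{
  bool operator()(IndexDistance const &a, IndexDistance const &b) const
  {
    return a.second < b.second;
  }
};

// Work done for a single predicate: scan all points, keep the k closest.
std::vector<IndexDistance> nearest(std::vector<Point> const &points,
                                   Point query, int k)
{
  std::vector<IndexDistance> result;
  if (k < 1)
    return result;

  std::priority_queue<IndexDistance, std::vector<IndexDistance>,
                      CompareDistance>
      heap;
  for (int i = 0; i < static_cast<int>(points.size()); ++i)
  {
    float const d = points[i] > query ? points[i] - query : query - points[i];
    if (static_cast<int>(heap.size()) < k)
    {
      heap.push({i, d});
    }
    else if (d < heap.top().second)
    {
      // Replace the current farthest candidate with a closer one.
      heap.pop();
      heap.push({i, d});
    }
  }

  // Popping a max-heap yields candidates from farthest to nearest.
  while (!heap.empty())
  {
    result.push_back(heap.top());
    heap.pop();
  }
  return result;
}
```

Because the heap is private to one query, nothing needs to be shared between threads, which is why one thread per predicate works without a concurrent priority queue.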

I re-enabled the nearest query tests for BruteForce.

My overall motivation for implementing this is to try using BruteForce as the top tree in the DistributedTree.

test/tstKokkosToolsAnnotations.cpp

test/CMakeLists.txt

dalg24 · 2024-04-03T23:26:50Z

src/details/ArborX_DetailsNearestBufferProvider.hpp

+  Kokkos::View<PairIndexDistance *, MemorySpace> _buffer;
+  Kokkos::View<int *, MemorySpace> _offset;
+
+  NearestBufferProvider() = default;


Why do we need a default constructor?

Technically, we don't need to allocate the storage when called for an empty tree. So, we could avoid doing the scan over the predicates' k values in that case.

But we never call it, do we?

We implicitly do, in the TreeTraversal constructor.

src/details/ArborX_DetailsNearestBufferProvider.hpp

src/details/ArborX_DetailsBruteForceImpl.hpp

dalg24 · 2024-04-03T23:34:20Z

src/details/ArborX_DetailsBruteForceImpl.hpp

+          if (k < 1)
+            return;


Note to self: this really should be a precondition.
This was brought up in the past, but I see we still do not enforce it in TreeTraversal.

dalg24 · 2024-04-03T23:42:00Z

src/details/ArborX_DetailsBruteForceImpl.hpp

+          using PairIndexDistance =
+              typename NearestBufferProvider<MemorySpace>::PairIndexDistance;


As a follow-up, we should think about making this a struct with named members.
It is really ugly below when we refer to "second" to mean the distance.

That would be fine with me. The thing I am starting to dislike is having all those PairValueIndex, PairIndexRank, PairIndexDistance thingies floating around. Wonder if there's a better way to handle that.
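A minimal sketch of what such a struct could look like. The names `IndexDistance`, `index`, `distance` and `closer` are hypothetical and only illustrate the readability argument; this is not a proposed ArborX interface.

```cpp
// Hypothetical replacement for a pair-like PairIndexDistance.
struct IndexDistance
{
  int index;
  float distance;
};

// The comparison now reads as "distance" rather than "second".
inline bool closer(IndexDistance const &a, IndexDistance const &b)
{
  return a.distance < b.distance;
}
```

This addresses the naming concern only; it does not by itself reduce the number of pair-like types floating around.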

dalg24 · 2024-04-03T23:47:32Z

src/details/ArborX_DetailsBruteForceImpl.hpp

+          while (!heap.empty())
+          {
+            callback(predicate, values(heap.top().first));
+            heap.pop();
+          }


We should probably comment that this is sorting the heap.
We technically did not intend to guarantee any order for nearest queries, but I suppose we do sort as well in TreeTraversal.

Except it does not sort the heap in increasing order. Rather, it is in decreasing order. So the callbacks here would be called in a different order than in BVH (where they are called in increasing distance order).

Right...
Why did you choose to do that instead of just looping over the elements of the underlying storage?
I know this escapes the control of the data structure but it is more efficient because it skips the heap operations.
(not blocking nor asking you to change at this time, just trying to figure out why you did it this way)

Wonder if we should do

sortHeap(heap.data(), heap.data() + heap.size(), heap.valueComp());
for (decltype(heap.size()) i = 0; i < heap.size(); ++i)
  _callback(predicate, values((heap.data() + i)->first));

We could skip sortHeap, but I wonder if we should. If we keep it, we replicate the behavior of the BVH, in that the callback is called in order from nearest to farthest.

Implemented with sorting.
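The sort-then-iterate idea can be illustrated with the standard library. This is a sketch under assumptions, not the ArborX implementation: `std::sort_heap` stands in for `sortHeap`, and `visitNearestFirst` and `IndexDistance` are hypothetical names.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

using IndexDistance = std::pair<int, float>;

// Calls callback(index) for each entry, nearest first.
// `storage` must already be a max-heap with respect to distance.
template <typename Callback>
void visitNearestFirst(std::vector<IndexDistance> &storage,
                       Callback const &callback)
{
  auto const byDistance = [](IndexDistance const &a, IndexDistance const &b) {
    return a.second < b.second;
  };
  // std::sort_heap turns a max-heap into an ascending range in place,
  // so no extra buffer is needed.
  std::sort_heap(storage.begin(), storage.end(), byDistance);
  for (auto const &entry : storage)
    callback(entry.first);
}
```

Dropping the `std::sort_heap` call would still visit every entry, but in the heap's internal layout order rather than by increasing distance.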

dalg24 · 2024-04-03T23:48:32Z

src/details/ArborX_DetailsNearestBufferProvider.hpp

+{
+
+template <typename MemorySpace>
+struct NearestBufferProvider


Note to self to get back to this.

masterleinad · 2024-04-04T16:31:45Z

src/details/ArborX_DetailsNearestBufferProvider.hpp

+    Kokkos::parallel_for(
+        "ArborX::NearestBufferProvider::scan_queries_for_numbers_of_neighbors",
+        Kokkos::RangePolicy<ExecutionSpace>(space, 0, n_queries),
+        KOKKOS_CLASS_LAMBDA(int i) { _offset(i) = getK(predicates(i)); });
+    KokkosExt::exclusive_scan(space, _offset, _offset, 0);
+    int const buffer_size = KokkosExt::lastElement(space, _offset);


Any good reason not to do all of this in one parallel_scan call? Do we expect getK to be more expensive than launching another kernel?

We need to measure whether the performance gain is worth the added code complexity, but yes, using a parallel_scan with a trailing return value argument is a good suggestion.

The main thing is that we don't have a function with a good interface that returns the trailing value. It certainly is not performance-critical.

If we do decide to do something about it, we should talk about the interface. I would propose not doing it in this PR.
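For reference, the fused computation under discussion amounts to an exclusive prefix sum that also hands back the total. The sketch below is a serial, host-only analog in plain C++, not Kokkos code; `computeOffsets` is a hypothetical name, and `ks[i]` plays the role of `getK(predicates(i))`.

```cpp
#include <cstddef>
#include <vector>

// Returns offsets of size ks.size() + 1, where offsets[i] is the start of
// query i's slice of the buffer and offsets.back() is the total buffer size.
std::vector<int> computeOffsets(std::vector<int> const &ks)
{
  std::vector<int> offsets(ks.size() + 1);
  int running = 0;
  for (std::size_t i = 0; i < ks.size(); ++i)
  {
    offsets[i] = running; // exclusive prefix sum
    running += ks[i];
  }
  offsets[ks.size()] = running; // trailing value, i.e. the buffer size
  return offsets;
}
```

In a single pass this produces both the offsets and the buffer size, which is the information the three separate steps in the diff compute.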

dalg24

I think it is good enough

aprokop · 2024-04-04T19:19:13Z

I'm not sure why CUDA-Clang failed. Seems totally unrelated.

masterleinad · 2024-04-04T19:43:40Z

/opt/boost/include/boost/mpl/assert.hpp:83:5: note: candidate function template not viable: no known conversion from 'mpl_::failed ************(boost::mpl::is_sequence<std::tuple<Kokkos::Device<Kokkos::Cuda, Kokkos::CudaSpace>, Kokkos::Device<Kokkos::Threads, Kokkos::HostSpace>>>::************)' to 'typename assert<false>::type' (aka 'mpl_::assert<false>') for 1st argument
int assertion_failed( typename assert<C>::type );
    ^
1 error generated when compiling for sm_70.
test/CMakeFiles/ArborX_Test_DetailsTreeConstruction.exe.dir/build.make:91: recipe for target 'test/CMakeFiles/ArborX_Test_DetailsTreeConstruction.exe.dir/tstIndexableGetter.cpp.o' failed
make[2]: *** [test/CMakeFiles/ArborX_Test_DetailsTreeConstruction.exe.dir/tstIndexableGetter.cpp.o] Error 1
make[2]: Leaving directory '/var/jenkins/workspace/ArborX_PR-1053/build'

aprokop · 2024-04-04T22:10:58Z

@masterleinad Right. I saw that, and it does not make sense to me. It does not have anything to do with this PR.

aprokop · 2024-04-05T04:45:12Z

I'm not sure why CUDA-Clang failed. Seems totally unrelated.

It was likely introduced in e21c55a. This failure is already present in master.

aprokop added 4 commits April 3, 2024 15:55

Implement nearest query for BruteForce

8e88b76

Reenable nearest query for testing with BruteForce

668c9b7

Move out the common nearest buffer allocation part

c0deec3

Improve performance by removing some if statements

bd02e09

aprokop added the enhancement New feature or request label Apr 3, 2024

aprokop requested a review from dalg24 April 3, 2024 21:05

aprokop mentioned this pull request Apr 3, 2024

Changelog 1.6 #985

Closed

dalg24 reviewed Apr 3, 2024

aprokop added 3 commits April 4, 2024 00:33

Few minor changes based on review

9167cf7

Few more changes

d43722c

Remove old code artifact

6fd2cfa

masterleinad reviewed Apr 4, 2024

dalg24 approved these changes Apr 4, 2024

masterleinad approved these changes Apr 4, 2024

Add missing headers

7e95982

aprokop force-pushed the brute_force_nearest branch from 650d077 to 7e95982 Compare April 4, 2024 18:40

aprokop merged commit d48fe3d into arborx:master Apr 5, 2024
1 of 2 checks passed

aprokop deleted the brute_force_nearest branch April 5, 2024 04:45
