ANN: Optimize host-side refine #1651

achirkin · 2023-07-18T14:29:19Z

Prior to this change, raft's host-side implementation of the raft::neighbors::refine operation used a suboptimal OpenMP thread configuration by default: it spawned as many threads as there are available cores, even when only one thread did any work (per-query parallelism with a batch size of one).
This change fixes that and adds a few optimizations alongside:

Use the number of threads = min(cores, queries)
Use templates to push the metric-type condition outside the distance computation loop (this should also make it easier to implement new metrics later)
Force the tree-optimize compilation flag in the hope that the compiler performs the vectorization
Split out the host implementation in separate files to be able to compile it without nvcc.
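
The first two items can be illustrated with a minimal sketch. This is not the code from the PR: `Metric`, `DistOp`, `refine_num_threads`, `refine_top1`, and `refine_top1_dispatch` are hypothetical names, and the example keeps only the single best candidate per query rather than a full top-k. It assumes row-major `float` data and compiles with or without OpenMP enabled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <thread>
#include <vector>

// Hypothetical metric tag for this sketch (not raft's actual enum).
enum class Metric { L2Expanded, InnerProduct };

// Thread count = min(cores, queries), but never less than one.
inline int refine_num_threads(std::size_t n_queries)
{
  // hardware_concurrency() may return 0 when the core count is unknown.
  const std::size_t cores = std::max<std::size_t>(1, std::thread::hardware_concurrency());
  return static_cast<int>(std::max<std::size_t>(1, std::min(cores, n_queries)));
}

// One specialization per metric: the choice is made at compile time,
// so the inner loop contains no metric-type branch.
template <Metric M>
struct DistOp;

template <>
struct DistOp<Metric::L2Expanded> {
  static float eval(float acc, float a, float b)
  {
    const float d = a - b;
    return acc + d * d;
  }
};

template <>
struct DistOp<Metric::InnerProduct> {
  // Negated so that "smaller is better" holds for both metrics.
  static float eval(float acc, float a, float b) { return acc - a * b; }
};

// For each query, pick the closest of its candidate rows in `dataset`.
template <Metric M>
void refine_top1(const std::vector<float>& dataset,
                 const std::vector<float>& queries,
                 const std::vector<std::vector<int>>& candidates,
                 std::size_t dim,
                 std::vector<int>& out)
{
  assert(dim > 0);
  const std::ptrdiff_t n_queries = static_cast<std::ptrdiff_t>(candidates.size());
  out.assign(candidates.size(), -1);
  const int n_threads = refine_num_threads(candidates.size());
  (void)n_threads;  // unused when OpenMP is disabled
#ifdef _OPENMP
#pragma omp parallel for num_threads(n_threads)
#endif
  for (std::ptrdiff_t i = 0; i < n_queries; ++i) {
    float best = std::numeric_limits<float>::max();
    for (int c : candidates[i]) {
      float acc = 0.0f;
      for (std::size_t k = 0; k < dim; ++k) {
        acc = DistOp<M>::eval(acc, queries[i * dim + k], dataset[static_cast<std::size_t>(c) * dim + k]);
      }
      if (acc < best) {
        best   = acc;
        out[i] = c;
      }
    }
  }
}

// The only runtime branch on the metric sits outside all loops.
inline void refine_top1_dispatch(Metric metric,
                                 const std::vector<float>& dataset,
                                 const std::vector<float>& queries,
                                 const std::vector<std::vector<int>>& candidates,
                                 std::size_t dim,
                                 std::vector<int>& out)
{
  switch (metric) {
    case Metric::L2Expanded: refine_top1<Metric::L2Expanded>(dataset, queries, candidates, dim, out); break;
    case Metric::InnerProduct: refine_top1<Metric::InnerProduct>(dataset, queries, candidates, dim, out); break;
  }
}
```

With a batch of one query, `refine_num_threads(1)` yields a single thread, so no idle OpenMP threads are spawned. Supporting another metric in this sketch would mean adding one `DistOp` specialization and one `case` in the dispatcher.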

tfeher

Thanks Artem for the PR, it looks good to me (apart from the issue with the unnecessary extra compile flag)!

achirkin · 2023-07-19T09:34:02Z

On the performance note: the new code gives a ridiculous speedup of up to 500x-1000x for small batches (n_query = 1); I'm not sure why. Also, one cannot rely on the performance reporting in the current prims bench, because it uses CUDA events to measure the time, while the host-only refine obviously does not use CUDA streams (#1653). I will bring the necessary updates to the benchmarks in a follow-up PR.
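
For host-only code, one alternative to CUDA events is a plain wall-clock timer. The helper below is a minimal sketch of that idea, not the benchmark update mentioned above; `time_host_ms` is a hypothetical name, and the sketch assumes the measured callable is synchronous and can safely be run several times.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: average wall-clock time of `fn` in milliseconds over `repeats` runs.
template <typename F>
double time_host_ms(F&& fn, int repeats = 1)
{
  assert(repeats > 0);
  // steady_clock is monotonic, so the measurement is not affected by system clock adjustments.
  const auto start = std::chrono::steady_clock::now();
  for (int r = 0; r < repeats; ++r) {
    fn();
  }
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count() / repeats;
}
```

A call such as `time_host_ms([&] { /* host-side refine call */ }, 10)` would report the average time per run on the host. Any GPU work launched inside the callable would need explicit synchronization before the timer stops.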

cjnolet · 2023-07-19T17:44:48Z

/merge

neighbors: Optimize host-side refine

96ca91a

achirkin added the labels 3 - Ready for Review, improvement (Improvement / enhancement to an existing function), and non-breaking (Non-breaking change) on Jul 18, 2023

achirkin self-assigned this Jul 18, 2023

github-actions bot added the labels cpp and CMake on Jul 18, 2023

achirkin added the label 2 - In Progress (Currently a work in progress) and removed the labels 3 - Ready for Review, cpp, and CMake on Jul 18, 2023

Fix missing includes

dfac331

github-actions bot added the labels cpp and CMake on Jul 18, 2023

achirkin marked this pull request as ready for review July 18, 2023 16:58

achirkin requested review from a team as code owners July 18, 2023 16:58

achirkin added the label 3 - Ready for Review and removed the label 2 - In Progress (Currently a work in progress) on Jul 18, 2023

tfeher approved these changes Jul 18, 2023

Merge branch 'branch-23.08' into enh-optimize-host-refine

2e5aeb6

achirkin added 2 commits July 19, 2023 15:46

Update cpp/CMakeLists.txt

bd895b1

Merge branch 'branch-23.08' into enh-optimize-host-refine

379210c

rapids-bot bot merged commit f0e75f2 into rapidsai:branch-23.08 Jul 19, 2023
59 checks passed
