Commit 311171c

mikejuliet13 authored and svoj committed
MDEV-37107 - Optimise dot_product by loop-unrolling by a factor of 4

This patch introduces loop unrolling by a factor of 4 in the dot_product() function used in vector-based distance calculations. The goal is to improve SIMD utilization and overall performance during high-throughput vector operations, particularly in indexing and search routines that rely on this function.

Observations from benchmarking (ann-benchmark):
- Query Performance (QPS) improved by 4–10% across datasets.
- Indexation time reduced by 22–28%.
- Loop unrolling factors of 8 or 16 yielded similar performance to factor-4 but made the code less readable. Hence, a factor of 4 was chosen to maintain a balance between performance and code clarity.

This change is architecture-specific (PowerPC) and should not introduce any behavioral regressions or side effects in unrelated parts of the codebase.

Signed-off-by: Manjul Mohan <manjul.mohan@ibm.com>
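For readers unfamiliar with the directive, the following is a minimal, portable sketch of how `#pragma GCC unroll 4` attaches to a loop. It is not the MariaDB code: `dot_product_sketch` and `kLanes` are hypothetical names, `kLanes = 8` is an assumed stand-in for `POWER_dims`, and a scalar inner loop replaces the AltiVec intrinsics. The pragma is only a request to the compiler; it should not change the computed result, and compilers that do not recognise it will typically ignore it, possibly with a warning.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for POWER_dims (assumed value, for illustration only).
constexpr std::size_t kLanes = 8;

// Scalar sketch of a dot product over int16 values.
// Precondition: len is a multiple of kLanes (the caller pads the input).
std::int64_t dot_product_sketch(const std::int16_t *a,
                                const std::int16_t *b,
                                std::size_t len)
{
  std::int64_t sum = 0;

  // Ask GCC to replicate the body of the loop that follows four times.
  // The pragma must sit immediately before the loop it applies to.
#pragma GCC unroll 4
  for (std::size_t i = 0; i < len; i += kLanes)
  {
    // One "vector" worth of work per outer iteration.
    for (std::size_t j = 0; j < kLanes; j++)
      sum += static_cast<std::int64_t>(a[i + j]) * b[i + j];
  }
  return sum;
}
```

Because the pragma changes only how the loop is compiled, the function can be checked against a plain loop on the same input; any speed-up would have to be measured on the target machine.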
1 parent 311b444 commit 311171c

1 file changed: +1 -0 lines changed

sql/vector_mhnsw.cc

Lines changed: 1 addition & 0 deletions
@@ -241,6 +241,7 @@ struct FVector
     // Round up to process full vector, including padding
     size_t base= ((len + POWER_dims - 1) / POWER_dims) * POWER_dims;
 
+    #pragma GCC unroll 4
     for (size_t i= 0; i < base; i+= POWER_dims)
     {
       vector short x= vec_ld(0, &v1[i]);
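
The `base` computation in the context lines rounds `len` up to the next multiple of `POWER_dims`, so the loop always consumes whole vectors, padding included. A small sketch of that arithmetic follows; `round_up` is a hypothetical helper name, and the value 8 used in the checks is an assumption, since the diff does not show how `POWER_dims` is defined.

```cpp
#include <cstddef>

// Round len up to the next multiple of dims (dims must be non-zero).
// Mirrors the expression ((len + POWER_dims - 1) / POWER_dims) * POWER_dims.
constexpr std::size_t round_up(std::size_t len, std::size_t dims)
{
  return ((len + dims - 1) / dims) * dims;
}

// Compile-time checks with an assumed width of 8 elements per vector.
static_assert(round_up(0, 8) == 0, "empty input needs no iterations");
static_assert(round_up(13, 8) == 16, "partial vector is rounded up");
static_assert(round_up(16, 8) == 16, "exact multiple is unchanged");
```

With this rounding the loop bound is always a multiple of the step, so the loop never stops partway through a vector; it does rely on the buffers being padded to that length.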
