util/bufferiszero: Remove useless prefetches
Use of prefetching in bufferiszero.c is quite questionable:

- prefetches are issued just a few CPU cycles before the corresponding
  line would be hit by demand loads;

- they are done for simple access patterns, i.e. where hardware
  prefetchers can perform better;

- they compete for load ports in loops that should be limited by load
  port throughput rather than ALU throughput.

Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Mikhail Romanov <mmromanov@ispras.ru>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240206204809.9859-5-amonakov@ispras.ru>
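
To make the rationale above concrete, here is a minimal sketch of a prefetch-free zero-check loop. It is not the QEMU implementation: `load_u64` and `sketch_buffer_is_zero` are hypothetical names introduced for illustration, and the 64-byte step assumes a typical cache line size. The loop reads memory strictly sequentially, which is the simple access pattern the commit message says hardware prefetchers already handle, so a software prefetch here would only add one more load-port operation per iteration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from QEMU): 8-byte load with no alignment
 * requirement. Compilers typically lower this memcpy to a single load. */
static uint64_t load_u64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Hypothetical sketch: return true if all len bytes at buf are zero. */
static bool sketch_buffer_is_zero(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    const unsigned char *end = p + len;

    /* Main loop: 64 bytes per iteration (assumed cache line size).
     * The eight loads are sequential demand loads; there is deliberately
     * no __builtin_prefetch() here. */
    while ((size_t)(end - p) >= 64) {
        uint64_t t = 0;
        for (int i = 0; i < 8; i++) {
            t |= load_u64(p + 8 * i);
        }
        if (t) {
            return false;
        }
        p += 64;
    }

    /* Tail: fewer than 64 bytes remain, check them one at a time. */
    for (; p < end; p++) {
        if (*p) {
            return false;
        }
    }
    return true;
}
```

Each iteration does eight loads and seven ORs, so the loop is bounded by how many loads the core can issue per cycle. That is the situation the third bullet describes: an extra prefetch instruction competes for the same load ports without fetching anything the demand loads would not request a few cycles later.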
amonakov authored and rth7680 committed Apr 8, 2024
1 parent 782ff6a commit 82b35c7
Showing 1 changed file with 0 additions and 3 deletions.
3 changes: 0 additions & 3 deletions util/bufferiszero.c
@@ -50,7 +50,6 @@ static bool buffer_is_zero_integer(const void *buf, size_t len)
const uint64_t *e = (uint64_t *)(((uintptr_t)buf + len) & -8);

for (; p + 8 <= e; p += 8) {
- __builtin_prefetch(p + 8);
if (t) {
return false;
}
@@ -80,7 +79,6 @@ buffer_zero_sse2(const void *buf, size_t len)

/* Loop over 16-byte aligned blocks of 64. */
while (likely(p <= e)) {
- __builtin_prefetch(p);
t = _mm_cmpeq_epi8(t, zero);
if (unlikely(_mm_movemask_epi8(t) != 0xFFFF)) {
return false;
@@ -111,7 +109,6 @@ buffer_zero_avx2(const void *buf, size_t len)

/* Loop over 32-byte aligned blocks of 128. */
while (p <= e) {
- __builtin_prefetch(p);
if (unlikely(!_mm256_testz_si256(t, t))) {
return false;
}