eal/x86: improve multiple of 64 bytes memcpy performance
[ upstream commit 2ef17be88e8b26f871cfb0265227341e36f486ea ]

In rte_memcpy_aligned(), the 64-byte block copy loop takes one redundant round when
the size is a multiple of 64, because the unconditional catch-up copy that follows
the loop copies the last 64 bytes again. Let the catch-up copy handle the last
64 bytes in this case instead.

Fixes: f547270 ("eal: optimize aligned memcpy on x86")

Suggested-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Ninja-Mobius authored and bluca committed Jun 14, 2023
1 parent 206434a commit 4154fc9
Showing 1 changed file with 1 addition and 1 deletion.
lib/librte_eal/x86/include/rte_memcpy.h (1 addition, 1 deletion)
@@ -846,7 +846,7 @@ rte_memcpy_aligned(void *dst, const void *src, size_t n)
 	}
 
 	/* Copy 64 bytes blocks */
-	for (; n >= 64; n -= 64) {
+	for (; n > 64; n -= 64) {
 		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
 		dst = (uint8_t *)dst + 64;
 		src = (const uint8_t *)src + 64;
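For context, below is a minimal sketch of the tail handling around this loop, paraphrased from rte_memcpy_aligned() rather than taken from this diff, so the exact layout may differ from the upstream source. The loop is followed by an unconditional catch-up rte_mov64() that copies the last 64 bytes ending at dst + n. With the old "n >= 64" condition and, say, n == 128, the loop ran twice and the catch-up then rewrote bytes 64..127; with "n > 64" the loop runs once and the catch-up writes bytes 64..127 exactly once.

	/* Copy 64-byte blocks; with the fixed condition the loop exits
	 * with 1 <= n <= 64, leaving the final block to the catch-up. */
	for (; n > 64; n -= 64) {
		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
		dst = (uint8_t *)dst + 64;
		src = (const uint8_t *)src + 64;
	}

	/* Catch-up: copy the last 64 bytes ending at dst + n. Under the
	 * old ">=" condition, a multiple-of-64 size reached this point
	 * with n == 0, so these bytes had already been copied by the
	 * loop and this store was redundant. */
	rte_mov64((uint8_t *)dst - 64 + n, (const uint8_t *)src - 64 + n);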
