Commit a048d3a

ubizjak authored and Ingo Molnar committed
x86/percpu: Rewrite arch_raw_cpu_ptr() to be easier for compilers to optimize
Implement arch_raw_cpu_ptr() as a load from this_cpu_off and then add
the ptr value to the base. This way, the compiler can propagate the
addend to the following instruction and simplify the address calculation.

E.g., the address calculation in amd_pmu_enable_virt() improves from:

    48 c7 c0 00 00 00 00    mov    $0x0,%rax
                    87b7: R_X86_64_32S      cpu_hw_events
    65 48 03 05 00 00 00    add    %gs:0x0(%rip),%rax
    00
                    87bf: R_X86_64_PC32     this_cpu_off-0x4
    48 c7 80 28 13 00 00    movq   $0x0,0x1328(%rax)
    00 00 00 00

to:

    65 48 8b 05 00 00 00    mov    %gs:0x0(%rip),%rax
    00
                    8798: R_X86_64_PC32     this_cpu_off-0x4
    48 c7 80 00 00 00 00    movq   $0x0,0x0(%rax)
    00 00 00 00
                    87a6: R_X86_64_32S      cpu_hw_events+0x1328

The compiler also eliminates additional redundant loads from
this_cpu_off, reducing the number of percpu offset reads from 1668 to
1646 on a test build, a 1.3% reduction.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Cc: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20231015202523.189168-1-ubizjak@gmail.com
1 parent e29aad0 commit a048d3a

1 file changed: +4 -2 lines changed


arch/x86/include/asm/percpu.h

Lines changed: 4 additions & 2 deletions
@@ -56,9 +56,11 @@
 #define arch_raw_cpu_ptr(ptr)				\
 ({							\
 	unsigned long tcp_ptr__;			\
-	asm ("add " __percpu_arg(1) ", %0"		\
+	asm ("mov " __percpu_arg(1) ", %0"		\
 	     : "=r" (tcp_ptr__)				\
-	     : "m" (__my_cpu_var(this_cpu_off)), "0" (ptr));	\
+	     : "m" (__my_cpu_var(this_cpu_off)));	\
+							\
+	tcp_ptr__ += (unsigned long)(ptr);		\
 	(typeof(*(ptr)) __kernel __force *)tcp_ptr__;	\
 })
 #else /* CONFIG_SMP */
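
The difference between the two forms can be tried outside the kernel with a minimal user-space sketch. Everything named here is hypothetical and only for illustration: `fake_this_cpu_off` is an ordinary global standing in for the %gs-relative `this_cpu_off`, and `ptr_add_in_asm()` / `ptr_add_in_c()` imitate the old and new shapes of the macro. On non-x86-64 compilers the sketch falls back to plain C, so it shows only the structure, not the generated code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for this_cpu_off: a plain global, not a %gs-relative
 * per-CPU variable as in the kernel. */
static uintptr_t fake_this_cpu_off;

/* Old shape: the addition happens inside the asm, so the compiler sees only
 * an opaque result and cannot fold a constant offset into a later access. */
static inline void *ptr_add_in_asm(void *ptr)
{
	uintptr_t p;
#if defined(__x86_64__) && defined(__GNUC__)
	__asm__ ("add %1, %0"
		 : "=r" (p)
		 : "m" (fake_this_cpu_off), "0" ((uintptr_t)ptr));
#else
	p = (uintptr_t)ptr + fake_this_cpu_off;	/* portable fallback */
#endif
	return (void *)p;
}

/* New shape: the asm only loads the base; the addition is ordinary C that
 * the compiler is free to combine with later address arithmetic. */
static inline void *ptr_add_in_c(void *ptr)
{
	uintptr_t p;
#if defined(__x86_64__) && defined(__GNUC__)
	__asm__ ("mov %1, %0"
		 : "=r" (p)
		 : "m" (fake_this_cpu_off));
#else
	p = fake_this_cpu_off;			/* portable fallback */
#endif
	p += (uintptr_t)ptr;
	return (void *)p;
}

/* Both shapes must yield the same pointer for any input. */
static inline void check_shapes_agree(void *ptr)
{
	assert(ptr_add_in_asm(ptr) == ptr_add_in_c(ptr));
}
```

Both helpers return the same address; what changes is how much of the computation the optimizer can see. Compiling a caller that writes to a fixed member offset through each helper at `-O2` and comparing the disassembly is one way to observe the kind of folding described in the commit message, though the exact output depends on the compiler and version.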
