add missing memory barrier for Apple Silicon targets
Signed-off-by: Anjan Roy <hello@itzmeanjan.in>
itzmeanjan committed Jan 27, 2024
1 parent 5943af1 commit a964b6e
Showing 1 changed file with 7 additions and 5 deletions.
12 changes: 7 additions & 5 deletions src/dudect.h
@@ -77,6 +77,7 @@ extern "C" {
#include <x86intrin.h>
#elif defined __APPLE__
#include <mach/mach_time.h>
#include <stdatomic.h>
#else
#include <time.h>
#endif
@@ -293,11 +294,11 @@ static inline int64_t cpucycles(void) {
/*
Returns current CPU cycle count from aarch64 *P*erformance *M*onitors *C*ycle Counter (PMCCNTR_EL0).
To enforce CPU to complete all pending memory access operations, appearing before PMCCNTR_EL0, we issue a
*D*ata *S*ynchronization *B*arrier instruction right before reading CPU cycle counter.
Note, reading PMCCNTR_EL0 from the userspace will probably result in panicking with
a message "illegal instruction executed". So we've to install a Linux Kernel Module. I've tested the
LKM @ https://github.com/jerinjacobk/armv8_pmu_cycle_counter_el0 and it works fine.
See PMCCNTR_EL0 documentation @ https://developer.arm.com/documentation/ddi0595/2021-09/External-Registers/PMCCNTR-EL0--Performance-Monitors-Cycle-Counter?lang=en
@@ -315,17 +316,18 @@ static inline int64_t cpucycles(void) {

/*
Returns the number of "mach time units" elapsed since system startup, on non-x86_64 Apple targets.
See https://github.com/google/benchmark/blob/4682db08/src/cycleclock.h#L63-L73
*/
static inline int64_t cpucycles(void) {
atomic_thread_fence(memory_order_seq_cst);
return (int64_t)mach_absolute_time();
}

#else

/*
Returns approximate measurement of CPU time consumed by the calling process (i.e., CPU time
consumed by all threads in the process).
See https://www.man7.org/linux/man-pages/man3/clock_gettime.3.html
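For context, here is a minimal sketch of how a fenced timestamp read like the patched Apple `cpucycles()` might bracket a measured call. The names `fenced_timestamp`, `time_once` and `spin_workload` are hypothetical and do not appear in dudect; the `clock_gettime` fallback is an assumption added only so the sketch builds on non-Apple hosts. The fence orders earlier memory accesses before the timestamp read; it is not claimed to serialize instruction execution.

```c
#if !defined(__APPLE__)
#define _POSIX_C_SOURCE 200809L /* for clock_gettime on non-Apple hosts */
#endif

#include <stdatomic.h>
#include <stdint.h>
#if defined(__APPLE__)
#include <mach/mach_time.h>
#else
#include <time.h>
#endif

/* Hypothetical helper (not in dudect): issue a sequentially consistent
   fence, then read a timestamp, mirroring the patched cpucycles(). */
static inline int64_t fenced_timestamp(void) {
  atomic_thread_fence(memory_order_seq_cst);
#if defined(__APPLE__)
  return (int64_t)mach_absolute_time(); /* "mach time units" */
#else
  /* Assumed fallback: monotonic clock in nanoseconds. */
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
#endif
}

/* Hypothetical workload: spin for *arg iterations. */
static inline void spin_workload(void *arg) {
  volatile uint64_t sink = 0;
  const uint64_t n = *(const uint64_t *)arg;
  for (uint64_t i = 0; i < n; i++) {
    sink += i;
  }
}

/* Hypothetical helper: elapsed timestamp units for one call of fn(arg). */
static inline int64_t time_once(void (*fn)(void *), void *arg) {
  const int64_t start = fenced_timestamp();
  fn(arg);
  const int64_t end = fenced_timestamp();
  return end - start;
}
```

A caller could pass `spin_workload` with an iteration count to `time_once` and collect the differences over many runs. The returned units differ per platform (mach time units on Apple, nanoseconds in the assumed fallback), so values are only comparable within one build.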
