False Sharing Microbenchmark

This is an adjustable false sharing tool, eventually to be used for evaluating incoherent architectures. It allocates a shared buffer the size of one L1 cache line and aligns it on a cache line boundary. A variable number of threads (up to the cache line size in bytes) each write to a given byte within the cache line. There are no overlapping reads or writes, so the coherence traffic (at least for the memory references contained within the worker() function) is due purely to false sharing.
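The idea above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the tool's actual source: `LINE_SIZE`, `struct worker_arg`, `sketch_worker()` and `run_false_sharing()` are hypothetical names, and the line size is hard-coded to 64 bytes rather than detected.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define LINE_SIZE 64 /* assumed L1 line size in bytes (hypothetical constant) */

struct worker_arg {
    volatile uint8_t *slot; /* the one byte of the shared line this thread owns */
    uint8_t value;          /* value this thread keeps storing */
    uint64_t refs;          /* number of writes to issue */
};

/* Each thread stores to its own byte only; volatile keeps every store. */
static void *sketch_worker(void *p)
{
    struct worker_arg *arg = p;
    for (uint64_t i = 0; i < arg->refs; i++)
        *arg->slot = arg->value;
    return NULL;
}

/*
 * Runs nthreads writers against one aligned, line-sized buffer and copies
 * the final LINE_SIZE bytes into out. Returns 0 on success, -1 on failure.
 */
int run_false_sharing(int nthreads, uint64_t refs, uint8_t *out)
{
    if (nthreads < 1 || nthreads > LINE_SIZE)
        return -1;

    /* One cache line, aligned on a cache line boundary. */
    uint8_t *buf = aligned_alloc(LINE_SIZE, LINE_SIZE);
    if (buf == NULL)
        return -1;
    assert((uintptr_t)buf % LINE_SIZE == 0);
    memset(buf, 0, LINE_SIZE);

    pthread_t tids[LINE_SIZE];
    struct worker_arg args[LINE_SIZE];
    int started = 0;
    for (int i = 0; i < nthreads; i++) {
        args[i].slot = &buf[i];           /* thread i owns byte i */
        args[i].value = (uint8_t)(i + 1);
        args[i].refs = refs;
        if (pthread_create(&tids[i], NULL, sketch_worker, &args[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    /* Reading here is safe: all writers have been joined. */
    memcpy(out, buf, LINE_SIZE);
    free(buf);
    return started == nthreads ? 0 : -1;
}
```

Calling `run_false_sharing(4, 1000000, out)` with a `uint8_t out[LINE_SIZE]` buffer leaves bytes 0 through 3 set to 1 through 4 and the rest zero. No two threads ever touch the same byte, so the line bounces between cores purely because of false sharing. Build with `-pthread`.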

You can run it like so:

[you@machine] make
[you@machine] ./false -n 64 -a 1000000

This creates 64 threads (which on most machines corresponds to one thread owning a single byte within the cache line), and each thread repeatedly issues a write to that cache line.

Thread counts greater than the cache line size are (currently) clipped, so on a machine with a standard 64-byte line size, this:

[you@machine] ./false -n 200 -a 100000

will be the same as:

[you@machine] ./false -n 64 -a 100000
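A minimal sketch of the clipping behavior described above, assuming the line size is queried at run time. `detect_line_size()` and `clip_threads()` are hypothetical helpers, not functions from the tool; `_SC_LEVEL1_DCACHE_LINESIZE` is a glibc extension, so the sketch falls back to an assumed 64 bytes when it is unavailable or reports nothing useful.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define FALLBACK_LINE_SIZE 64 /* assumption used when the query is unavailable */

/* Ask the C library for the L1 data cache line size, if it can tell us. */
size_t detect_line_size(void)
{
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
    long n = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
    if (n > 0)
        return (size_t)n;
#endif
    return FALLBACK_LINE_SIZE;
}

/* One thread per byte at most: cap the requested count at the line size. */
size_t clip_threads(size_t requested, size_t line_size)
{
    assert(line_size > 0);
    return requested > line_size ? line_size : requested;
}
```

With a 64-byte line, `clip_threads(200, 64)` and `clip_threads(64, 64)` both yield 64, which is why the two invocations above behave identically.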

To make sure things are working as expected, you can use perf and PEBS to get an idea of how many inter-core line transfers are occurring. Note that AFAIK this will only work on an Intel machine with a fairly recent Linux kernel (with PEBS support). Here's an example on a 64-core Skylake (Linux 4.19.9):

kyle@jebe ~/false-sharing> perf c2c record --user ./false -n 32 -a 500000000000
# false sharing experiment config:
#                threads : 32
#           L1 line size : 64B
#            memory refs : 500000000000
Allocated aligned shared buf (0x17a0cc0)
[ perf record: Woken up 362 times to write data ]
[ perf record: Captured and wrote 107.782 MB (441253 samples) ]
kyle@jebe ~/false-sharing> perf c2c repor

