Compare the performance of different methods of striping across the data in a large buffer.

The purpose of this project is to test different memory access patterns across large buffers. Caches generally work best with sequential access; the hardware can prefetch the next cache line before it is needed, for instance. So reading from start to finish ought to be fastest, but that isn't always an option. Besides the sequential case, then, what is fast and what is slow?

I figured, let's measure different patterns on the same buffer. In this project, a pattern is essentially a different grouping of one buffer into subarrays. Each test stripes over the data in column order, treating the buffer as a two-, three-, or four-axis array. Every pattern accesses the same set of addresses in the buffer, so each one does exactly the same work, just in a different order.
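To make the idea of "same addresses, different order" concrete, here is a minimal sketch in plain C. It is not taken from KCMemoryAccessTests.m; the function names, the `uint32_t` element type, and the row-major layout are all assumptions made for illustration. Each function sums the same buffer, so equal results confirm that the patterns touch the same elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical baseline: visit every element in address order. */
uint64_t sum_sequential(const uint32_t *buf, size_t count) {
    uint64_t sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += buf[i];
    }
    return sum;
}

/* Hypothetical two-axis pattern: view the buffer as rows x cols (row-major)
   and walk it in column order, so consecutive reads are `cols` elements apart. */
uint64_t sum_column_order_2d(const uint32_t *buf, size_t count,
                             size_t rows, size_t cols) {
    assert(rows * cols == count); /* the grouping must cover the whole buffer */
    uint64_t sum = 0;
    for (size_t c = 0; c < cols; c++) {
        for (size_t r = 0; r < rows; r++) {
            sum += buf[r * cols + c];
        }
    }
    return sum;
}

/* Hypothetical three-axis pattern: view the buffer as planes x rows x cols
   and iterate with the slowest-varying axis innermost, so consecutive reads
   are `rows * cols` elements apart. */
uint64_t sum_column_order_3d(const uint32_t *buf, size_t count,
                             size_t planes, size_t rows, size_t cols) {
    assert(planes * rows * cols == count);
    uint64_t sum = 0;
    for (size_t c = 0; c < cols; c++) {
        for (size_t r = 0; r < rows; r++) {
            for (size_t p = 0; p < planes; p++) {
                sum += buf[(p * rows + r) * cols + c];
            }
        }
    }
    return sum;
}
```

All three functions read every element exactly once; only the stride between consecutive reads changes, which is the variable the tests are meant to isolate.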

To see the measurements, run the project under the Time Profiler instrument, locate the +runTests method in the call tree, and drill in to view the code of that method. Each line of +runTests is a different pattern of memory access, and since the profiler shows how much time is spent on each line, it shows you the relative cost of each pattern. The Time Profiler can be configured to show total sample counts (by default it takes one sample per millisecond), and I found that more meaningful than just a percentage of the total.
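If Instruments isn't available, a much coarser alternative is to time each pattern by hand. The sketch below does that with the standard C `clock()` function; it is a different technique from the per-line sampling described above, and the names `pattern_fn`, `walk_sequential`, `walk_strided`, `time_pattern_ms`, and the stride of 1024 elements are hypothetical choices, not part of this project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical signature shared by every access pattern under test. */
typedef uint64_t (*pattern_fn)(const uint32_t *buf, size_t count);

/* Assumed stride, in elements, for the non-sequential pattern. */
#define WALK_STRIDE 1024

/* Visit every element in address order. */
uint64_t walk_sequential(const uint32_t *buf, size_t count) {
    uint64_t sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += buf[i];
    }
    return sum;
}

/* Visit every element once, jumping WALK_STRIDE elements between reads. */
uint64_t walk_strided(const uint32_t *buf, size_t count) {
    uint64_t sum = 0;
    for (size_t start = 0; start < WALK_STRIDE; start++) {
        for (size_t i = start; i < count; i += WALK_STRIDE) {
            sum += buf[i];
        }
    }
    return sum;
}

/* Run one pattern and return the CPU time it used, in milliseconds.
   The sum is handed back through out_sum so the compiler cannot
   discard the reads as dead code. */
double time_pattern_ms(pattern_fn fn, const uint32_t *buf, size_t count,
                       uint64_t *out_sum) {
    assert(fn != NULL && out_sum != NULL);
    clock_t start = clock();
    *out_sum = fn(buf, count);
    clock_t end = clock();
    return 1000.0 * (double)(end - start) / CLOCKS_PER_SEC;
}
```

Call `time_pattern_ms` once per pattern on the same buffer and compare the results. `clock()` has limited resolution, so this is only useful on buffers large enough that each pass takes many milliseconds, and it is worth repeating each measurement several times.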

Read KCMemoryAccessTests.m for more details and test configuration. In particular, you can adjust the size of the buffer to approximate the problem you're trying to solve, and then see how different patterns of memory access affect your performance. Don't forget to measure your own app. :)