
libc - Add poor man's cache coloring optimization to nmalloc module.

* A series of large allocations in excess of 32KB will be offset by 4KB from
  each other.  This fixes performance issues on SandyBridge and later CPUs
  related to large matrix operations.

  This eats an extra 4KB of VM for such allocations but does not eat any
  additional real memory.

* Greatly improves large FP matrix benchmarks.  Real-world effects are more
  questionable.

* SandyBridge and later CPUs use a virtually indexed, physically tagged
  L1 cache and tend to be sensitive to substantially different memory
  addresses winding up on the same cache line.  Matrix operations (primarily
  benchmarks) can cause these sorts of effects.

Reported-by: alexh
commit 8120f5e2a46e669c06a7afdd7de60fa6d6996f9d 1 parent 65221c7
Matthew Dillon authored
Showing with 8 additions and 0 deletions.
  1. +8 −0 lib/libc/stdlib/nmalloc.c
8 lib/libc/stdlib/nmalloc.c
@@ -821,7 +821,15 @@ _slaballoc(size_t size, int flags)
 		bigalloc_t big;
 		bigalloc_t *bigp;
+		/*
+		 * Page-align and cache-color in case of virtually indexed
+		 * physically tagged L1 caches (aka SandyBridge). No sweat
+		 * otherwise, so just do it.
+		 */
 		size = (size + PAGE_MASK) & ~(size_t)PAGE_MASK;
+		if ((size & 8191) == 0)
+			size += 4096;
+
 		chunk = _vmem_alloc(size, PAGE_SIZE, flags);
 		if (chunk == NULL)
 			return(NULL);
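
The size adjustment in this hunk can be tried in isolation. The following is only a sketch of the arithmetic, not the nmalloc code itself: `colored_size`, `SKETCH_PAGE_SIZE` and `SKETCH_PAGE_MASK` are hypothetical names introduced here, and a 4096-byte page is assumed.

```c
#include <stddef.h>

/* Hypothetical stand-ins for PAGE_SIZE and PAGE_MASK; 4096-byte pages assumed. */
#define SKETCH_PAGE_SIZE ((size_t)4096)
#define SKETCH_PAGE_MASK (SKETCH_PAGE_SIZE - 1)

/*
 * Round a request up to a whole number of pages, then add one more
 * page whenever the rounded size is a multiple of 8192 bytes.  This
 * mirrors the two statements added in the hunk above.
 */
size_t
colored_size(size_t size)
{
	size = (size + SKETCH_PAGE_MASK) & ~SKETCH_PAGE_MASK;
	if ((size & 8191) == 0)
		size += 4096;
	return size;
}
```

Under these assumptions a 65536-byte request becomes 69632 bytes, while a 36864-byte request (nine pages) is left alone, so the result is always an odd number of pages. If such chunks happened to be placed back to back, which is an assumption about the VM layout and not something the commit states, their base addresses would no longer all be multiples of 8192.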