When working with CPU tensors on Linux, transparent huge pages (THP) can provide a significant speedup. For example, I see a 15% increase in speed for my code when I turn THP on. However, on many distributions such as Ubuntu, the default THP mode is "madvise", which means madvise must be called with the proper flags on each memory segment we want THP for. NumPy enables THP via madvise for any array of 4 MB or larger on Linux 4.6+ (initial commit: numpy/numpy@7180479). It would be great to have similar behavior in candle. I'm not exactly sure how this would be implemented, since CpuStorage takes a Vec that may have already been paged in, but calling madvise for all usages of CpuStorage in cpu_backend.rs would probably cover the broad strokes.
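For illustration, here is a rough sketch of what such a hint could look like in Rust. This is not candle's API; the function name, the 4 MB cutoff (borrowed from NumPy), and the hardcoded page size are all assumptions, and the `madvise` binding is declared directly rather than pulled from the `libc` crate to keep the example self-contained:

```rust
use std::ffi::c_void;

// Linux-only advice value for THP, from <linux/mman.h> (assumption: Linux 4.6+).
const MADV_HUGEPAGE: i32 = 14;
// Minimum allocation size before we bother asking for huge pages,
// matching NumPy's 4 MB threshold.
const THP_THRESHOLD: usize = 4 << 20;

extern "C" {
    fn madvise(addr: *mut c_void, length: usize, advice: i32) -> i32;
}

/// Hint the kernel that `buf` should be backed by transparent huge pages.
/// madvise requires a page-aligned address, so round the start of the
/// buffer down to its containing page and extend the length to match.
fn advise_huge_pages(buf: &mut [u8]) {
    if buf.len() < THP_THRESHOLD {
        return;
    }
    let page = 4096usize; // could be queried via sysconf(_SC_PAGESIZE)
    let start = buf.as_mut_ptr() as usize;
    let aligned = start & !(page - 1);
    let len = buf.len() + (start - aligned);
    unsafe {
        // madvise is only a hint; ignore failures (e.g. THP disabled).
        let _ = madvise(aligned as *mut c_void, len, MADV_HUGEPAGE);
    }
}

fn main() {
    // An 8 MB buffer, large enough to cross the hypothetical threshold.
    let mut v = vec![0u8; 8 << 20];
    advise_huge_pages(&mut v);
    println!("advised {} bytes", v.len());
}
```

The catch mentioned above still applies: by the time a `Vec` reaches `CpuStorage`, its pages may already be faulted in as small pages, so the hint helps most when applied right after allocation, before the buffer is written.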
Interesting, could you maybe provide a way to replicate your 15% speedup? I'm pretty curious about which parts actually get accelerated by THP: whether it's more the loading of the tensors vs. the actual ops, and if it's the ops, which ones benefit the most from it.
Here are some operations and their speeds without THP (left) and with THP (right):

- Tensor::ones((5000, 5000), ...): 22 vs. 63 iters/sec
- a + a, where a is a 5,000x5,000 tensor: 19 vs. 42 iters/sec
- a.matmul(a), where a is a 5,000x5,000 tensor: 1.65 vs. 1.73 iters/sec