This is a drop-in replacement for malloc/free that is designed from the ground up to be used in scalable server software.
Why another memory allocator?
Here are the advantages of our allocator:
It's thread-friendly. It supports a practically-unlimited number of concurrent threads, without locking or performance degradation.
It's efficient, especially in a multi-threaded environment. Compared to a stock libc allocator, we see a significant performance boost.
It does NOT fragment or leak memory, unlike a stock libc allocator.
It wastes less memory. For small objects (less than 8kb in size), the overhead is around 0 bytes. (!)
It is designed from the ground-up for 64-bit architectures.
It is elegant. The whole codebase is only around 700 lines of fairly clean C++. (!)
It fully stand-alone; it does not rely on pthreads or libc at runtime.
Works only with a x86_64 CPU.
It does not try to economise on virtual memory mapping space.
Memory used for small objects (those less than 128kb in size) is cached within the allocator and is never released back to the OS. In practice, this means that your app will stabilize at some peak usage of memory and will, from that moment on, rarely ask the OS for memory allocation or deallocation.
Note: there is only limited support for the functions memalign, posix_memalign and valloc.
For building the library, you will need:
- g++ or clang with C++11 support.
- An x86_64 CPU.
- An OS with sane mmap/munmap support. We have production-tested only on Linux. (But it should work with FreeBSD too.)
Edit Makefile to set your build options; type
make to build a static and a shared version of the library.
Note: we recommend integrating the allocator into your own project(s). The enclosed Makefile is intended only as a demonstration, not a complete build solution.
Using the library
There are five options:
Compile into a statically-linked application. For this, you must:
- Compile your application's object files as normal.
lite-malloc.cpp, or use the
liblite-malloc-static.agenerated with the enclosed Makefile.
- Link your app with the
-staticflag, and these linker flags for wrapping:
-Wl,--wrap,malloc -Wl,--wrap,free -Wl,--wrap,calloc -Wl,--wrap,realloc -Wl,--wrap,memalign -Wl,--wrap,valloc -Wl,--wrap,posix_memalign
Link with a shared library. Linking with
Via standard glibc malloc hooks. (Deprecated.) To enable, just include
lite-hooks.has the first line of your program.
Via the magic
$ LD_PRELOAD=<full pathname>/liblite-malloc-shared.so <your application>
Compile a malloc hook static library into your dynamically-linked application. Follow the instructions in option 1, except omit the
-staticflag, and add these linker flags:
-Wl,--defsym=malloc=__wrap_malloc -Wl,--defsym=free=__wrap_free -Wl,--defsym=calloc=__wrap_calloc -Wl,--defsym=realloc=__wrap_realloc -Wl,--defsym=memalign=__wrap_memalign -Wl,--defsym=valloc=__wrap_valloc -Wl,--defsym=posix_memalign=__wrap_posix_memalign
Linux has an OS-imposed limit on the number of virtual memory mappings.
If you want to allocate large amounts of memory, you may need to adjust this
The allocator can be tuned with compile-time defines. Here are the supported options:
(default is 32)
Mmap blocks of size
LITE_MALLOC_SUPERBLOCK_CACHE_SIZE * 4096 bytes and less will be cached. (With a separate cache per size / 4096 bytes.)
(default is 1024)
Allocations of size
LITE_MALLOC_MINIBLOCK_CACHE_SIZE * 8 bytes and less will be cached. (With a separate cache per size / 8 bytes.)
(default is 32)
This many independent allocator engines will be used.
(default is 1)
If 1, pin a specific allocator engine to each thread. If 0, choose an allocator engine round-robin for each allocation.
(Use 1 unless you have threads with highly unbalanced allocation workloads and are seeing dramatic overhead.)
This library is licensed under the GNU LGPL. Please see the file LICENSE.