This project is a from-scratch implementation of calloc / free / malloc / realloc in C, inspired by glibc malloc internals.
It is intended as an educational and experimental allocator, focusing on understanding real-world allocator design tradeoffs rather than covering every edge case required by production allocators.
The codebase is heavily instrumented and unit-tested, and many design choices mirror glibc behavior (e.g. its laziness in some bookkeeping paths).
-
We no longer override the global libc allocator symbols (malloc, etc.) inside the allocator's own translation unit. Testing now relies on mock syscalls that operate on a pre-allocated stack buffer for custom allocator calls. The previous version, which used function interposition and overridden global symbols, lives in the branch first_find_free_list_with_deprecated_sbrk.
-
Latest branch with mocked syscalls and deprecated `brk` APIs
Deep Dive: Linux vs macOS Dynamic Linking Behavior
See the full investigation in the project wiki: the full process (ELF symbol resolution, Mach-O two-level namespaces, PLT/GOT, dyld interposition, and the allocator dynamics), with proofs, is documented in Dynamic Linking Deep Dive.
Goals:
- Implement a realistic allocator with behavior comparable to glibc
- Support `calloc`, `free`, `malloc`, `realloc`
- Model fastbins, unsorted bins, and coalescing semantics
- Support sbrk-backed and mmap-backed allocations
- Make allocator invariants explicit and testable
- Prioritize clarity and debuggability over raw performance
Non-goals:
- Full POSIX / glibc ABI compatibility
- Thread safety
- Absolute peak performance
Chunk: The 'chunk' of memory requested by the user, usually implied to be contiguous. I explicitly separate a requested memory chunk from the block it lives in. The chunk is the contiguous part of memory that the user starts using immediately; it is the allocation from the user's point of view.
Block: The structure enveloping the chunk requested by the user, where the chunk sits at the tail of the block. The block's header is the handle to the allocated memory in its entirety. During allocation, a block is carved out of the available memory by requesting slightly more than the user's desired size so that the header fits.
+-----------------------------+
| flagged size                | ┐                   ┐
| next                        | ├── Block's header  │
| prev                        | ┘                   │
+-----------------------------+                     ├── Block
| user memory                 | ┐                   │
| of aligned                  | ├── Chunk           │
| requested size              | ┘                   ┘
+-----------------------------+
Note: The allocator usually searches for blocks according to the metadata encoded in their headers and, once a block is found or newly allocated, returns its chunk. The user is not aware of the block structure or the header, although they can reconstruct it by reading the implementation.
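The relation between a block and its chunk comes down to pointer arithmetic around the header. The following is a minimal sketch, assuming a three-field header as drawn above; the type `block_t` and the helpers `chunk_from_block`, `block_from_chunk`, and `block_total_size` are hypothetical names used for illustration, not the project's actual API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header layout; field names are illustrative only. */
typedef struct block {
    size_t        flagged_size; /* payload size with flag bits in the LSBs */
    struct block *next;         /* bin link, meaningful only while free */
    struct block *prev;         /* bin link, meaningful only while free */
} block_t;

/* The chunk handed to the user starts right after the header. */
static inline void *chunk_from_block(block_t *b) {
    assert(b != NULL);
    return (unsigned char *)b + sizeof(block_t);
}

/* Recover the header from a user pointer by stepping back one header. */
static inline block_t *block_from_chunk(void *chunk) {
    assert(chunk != NULL);
    return (block_t *)((unsigned char *)chunk - sizeof(block_t));
}

/* Bytes to carve out of available memory for a payload of `payload` bytes. */
static inline size_t block_total_size(size_t payload) {
    return sizeof(block_t) + payload;
}
```

In this sketch the two conversions are exact inverses, which is what lets free and realloc work from nothing but the user's pointer.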
Arena: The data structure that holds blocks and occasional additional metadata about the heap's state.
Bin: An array that holds blocks of a specific size or of a specific range of sizes.
The allocator maintains a primary arena which owns:
- Fastbins (singly-linked lists for small chunks)
- Unsorted bin
- Bookkeeping metadata (bin bitmaps, total allocated memory, debug markers)
There is no multi-arena or per-thread arena support; the only additional arena is the one for mmapped memory chunks.
Each block in the heap has the following layout:
| prev true size              | ← previous block's true size (flags removed), if it is free
+-----------------------------+
| flagged size                | ← bytes in payload, with LSBs marking <is prev. free><is free><is mmapped>
| next                        | ← pointer to next block if free and in any bin
| prev                        | ← pointer to previous block if free and in any bin
+-----------------------------+
|                             | ← user memory of aligned requested size, also referred to as the memory chunk
|                             |
|                             |
| true size                   | ← size (flags removed), i.e. true size, if the block is free
+-----------------------------+
Each allocation is represented by a Block:
- Header contains size and flags.
- User memory immediately follows the header; it is also referred to as the memory chunk.
- When a block is free, its size is also written to its footer so that the next contiguous header can easily retrieve it when coalescing/fusing.
This enables:
- O(1) backward coalescing
- Fewer cache misses under heavy pressure (to be proved with benchmarks)
- Encoding `prev_free` information without storing an explicit prev pointer
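The footer scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: `block_t`, `FLAG_MASK`, `true_size`, `write_footer`, and `prev_block` are hypothetical names, the footer is assumed to occupy the last `size_t` of a free block's payload, and three flag bits are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical header, matching the layout drawn above. */
typedef struct block {
    size_t        flagged_size;
    struct block *next;
    struct block *prev;
} block_t;

#define FLAG_MASK ((size_t)0x7) /* assumed: three flag bits in the LSBs */

static size_t true_size(const block_t *b) {
    return b->flagged_size & ~FLAG_MASK;
}

/* On free: store the true size in the last size_t of the payload. */
static void write_footer(block_t *b) {
    size_t size = true_size(b);
    assert(size >= sizeof(size_t));
    unsigned char *payload = (unsigned char *)b + sizeof(block_t);
    memcpy(payload + size - sizeof(size_t), &size, sizeof(size));
}

/* O(1) backward step: valid only when the previous block is free,
 * because only free blocks carry a footer. */
static block_t *prev_block(block_t *b) {
    size_t prev_size;
    memcpy(&prev_size, (unsigned char *)b - sizeof(size_t), sizeof(prev_size));
    return (block_t *)((unsigned char *)b - prev_size - sizeof(block_t));
}
```

A caller would consult the `prev_free` bit of the current header before calling `prev_block`, since the bytes before the header belong to the user whenever the previous block is in use.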
Flags & Metadata
- Allocation state is encoded using the low bits of the size field (alignment requirements spare the 3-4 least significant bits).
- `prev_free` is propagated eagerly or lazily depending on context.
- Fastbin chunks are treated as "in use" to avoid premature fusion until they are consolidated.
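As an illustration of packing flags into the size field, here is a minimal sketch. The bit order follows the `<is prev. free><is free><is mmapped>` description above, but the macro and function names are assumptions made for this example and may differ from the real header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed bit assignment; names are hypothetical. */
#define FLAG_MMAPPED   ((size_t)1 << 0)
#define FLAG_FREE      ((size_t)1 << 1)
#define FLAG_PREV_FREE ((size_t)1 << 2)
#define FLAG_MASK      (FLAG_MMAPPED | FLAG_FREE | FLAG_PREV_FREE)

/* Combine an aligned size with flags; alignment keeps the low bits clear. */
static size_t pack_size(size_t aligned_size, size_t flags) {
    assert((aligned_size & FLAG_MASK) == 0);
    assert((flags & ~FLAG_MASK) == 0);
    return aligned_size | flags;
}

/* Strip the flags to get the true size. */
static size_t unpack_size(size_t flagged) {
    return flagged & ~FLAG_MASK;
}

static bool has_flag(size_t flagged, size_t flag) {
    return (flagged & flag) != 0;
}
```

Because sizes are always multiples of the alignment, the flag bits never collide with size bits, so no extra header word is needed.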
Given the requested size `size`:
- Align the size.
- If the aligned size is larger than `MIN_CAP_FOR_MMAP` (128 KiB), mmap.
- Else if the arena does not have any blocks, allocate the aligned size with sbrk.
- Else
  - If the aligned size is small/eligible for fast bins:
    - Try the fast bin; if a block is found, return it.
    - If no block is found in the fastbins, try the small bins (check the small bin for the exact size); if found, return that chunk.
    - If no block is found in that small bin, do not go to the next larger small bin (glibc does the same); try the unsorted bin. If a block as big as or larger than the aligned size is found, return that chunk (split before returning if it is larger).
    - If still no block is found, consolidate the fastbins (fuse them and put them into the unsorted bin).
    - Try the unsorted bin again as in the step above.
    - If still no block is found, request new memory for the block from the OS via sbrk, i.e. sysmalloc.
  - If the aligned size is large:
    - Consolidate the fast bins.
    - Try the unsorted bin and fuse as you go to satisfy the required aligned size; if found, return that chunk. If the found block is larger than needed, split first.
    - If not found, try the large bin with the appropriate range of sizes; if found, split the block if necessary and return the chunk.
Note: The unsorted bin searches mentioned above fuse the block at hand where possible; if the result does not satisfy the request, it is put into the appropriate bin (either a small or a large bin).
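The first decisions of the allocation path (align, then route by size) can be sketched as below. Only `MIN_CAP_FOR_MMAP` and its 128 KiB value come from the text; `ALIGNMENT`, `FASTBIN_MAX`, `route_t`, `align_up`, and `route_request` are assumptions for illustration. The empty-arena check and the bin searches themselves are left out.

```c
#include <assert.h>
#include <stddef.h>

#define ALIGNMENT        ((size_t)16)         /* assumed alignment */
#define MIN_CAP_FOR_MMAP ((size_t)128 * 1024) /* 128 KiB, as stated above */
#define FASTBIN_MAX      ((size_t)128)        /* assumed fastbin limit */

typedef enum { ROUTE_MMAP, ROUTE_SMALL, ROUTE_LARGE } route_t;

/* Round n up to the next multiple of ALIGNMENT (a power of two). */
static size_t align_up(size_t n) {
    assert(n <= (size_t)-1 - (ALIGNMENT - 1)); /* no overflow */
    return (n + ALIGNMENT - 1) & ~(ALIGNMENT - 1);
}

/* Decide which path a request takes once it has been aligned. */
static route_t route_request(size_t requested) {
    size_t aligned = align_up(requested);
    if (aligned > MIN_CAP_FOR_MMAP)
        return ROUTE_MMAP;  /* strictly larger than the threshold */
    if (aligned <= FASTBIN_MAX)
        return ROUTE_SMALL; /* fastbin / small bin / unsorted path */
    return ROUTE_LARGE;     /* consolidate, unsorted, large bin path */
}
```

Note that a request of exactly 128 KiB stays on the sbrk side in this sketch, since the text says mmap is used only for sizes larger than the threshold.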
- Sizes eligible for fastbins are placed into fastbins on free
- Fastbins are singly-linked and LIFO
- No immediate coalescing
- Fastbins are periodically consolidated into the unsorted bin
- During consolidation:
- Blocks are moved one-by-one
- Full forward and backward coalescing is performed
- Bin bitmaps are updated lazily (mirroring glibc behavior)
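A singly-linked LIFO bin, as described for fastbins above, needs only a head pointer. The sketch below is a generic illustration; `fb_node_t`, `fastbin_t`, `fastbin_push`, and `fastbin_pop` are hypothetical names, and the real allocator presumably links through the block header rather than a separate node type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical intrusive node; stands in for the block header's next link. */
typedef struct fb_node {
    struct fb_node *next;
} fb_node_t;

typedef struct {
    fb_node_t *head; /* most recently freed block */
} fastbin_t;

/* free path: push at the head, no coalescing. */
static void fastbin_push(fastbin_t *bin, fb_node_t *node) {
    assert(bin != NULL && node != NULL);
    node->next = bin->head;
    bin->head = node;
}

/* malloc path: pop the most recently freed block, or NULL if empty. */
static fb_node_t *fastbin_pop(fastbin_t *bin) {
    assert(bin != NULL);
    fb_node_t *node = bin->head;
    if (node != NULL) {
        bin->head = node->next;
        node->next = NULL;
    }
    return node;
}
```

Both operations are O(1), and LIFO order means the block most likely to still be in cache is reused first.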
- Requests above `MIN_CAP_FOR_MMAP` are fulfilled via mmap
- Large reallocations may transition between sbrk and mmap
realloc supports all four transitions:
- SBRK → SBRK (in-place growth or move)
- SBRK → MMAP
- MMAP → SBRK
- MMAP → MMAP (via mremap)
If in-place growth is not possible:
- A new block is allocated
- User memory is deep-copied
- Metadata flags are transferred
- The old block is freed
- Check for edge cases
  - If the given pointer is null, malloc the given size
  - If the given size is 0, free the pointer
  - If the block header cannot be reconstructed from the given memory chunk, return NULL
  - If the given size equals the block's size (the size of the chunk; the header size is not included), do nothing and return the given pointer
- Switch over the 4 possibilities given above
  - SBRK → SBRK
    - Try growing in place by fusing with forward blocks until enough size is obtained (larger is fine; split before returning)
    - If the request still cannot be satisfied, perform the malloc new, deep-copy, free old routine
    - If it is satisfied, split if too large and return the chunk (no memory move needed)
  - SBRK → MMAP
    - Perform the malloc new (mmap), deep-copy, free sbrk block routine (release occurs only if the block is the top block, i.e. tangent to the heap's BRK)
  - MMAP → SBRK
    - Perform the malloc new (sbrk), deep-copy, free (munmap) block routine
  - MMAP → MMAP (via mremap)
    - mremap (handles the deep copy and the freeing itself)
Any syscall error fails and halts the current allocation attempt.
Note: The deep-copy routine uses memmove, which handles overlapping memory regions.
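Choosing among the four transitions and performing the deep-copy step can be sketched as follows. This assumes the backing of the new block is decided purely by comparing the aligned size against `MIN_CAP_FOR_MMAP`; `transition_t`, `classify_transition`, and `copy_payload` are hypothetical names, not the project's functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MIN_CAP_FOR_MMAP ((size_t)128 * 1024) /* 128 KiB, as stated above */

typedef enum {
    SBRK_TO_SBRK,
    SBRK_TO_MMAP,
    MMAP_TO_SBRK,
    MMAP_TO_MMAP
} transition_t;

/* Pick the transition from the old block's backing and the new aligned size. */
static transition_t classify_transition(bool old_is_mmapped, size_t new_aligned) {
    bool new_is_mmapped = new_aligned > MIN_CAP_FOR_MMAP;
    if (old_is_mmapped)
        return new_is_mmapped ? MMAP_TO_MMAP : MMAP_TO_SBRK;
    return new_is_mmapped ? SBRK_TO_MMAP : SBRK_TO_SBRK;
}

/* Deep-copy step of the "malloc new, deep-copy, free old" routine.
 * Copies min(src_size, dst_size) bytes; memmove tolerates overlap.
 * Returns the number of bytes copied. */
static size_t copy_payload(void *dst, size_t dst_size,
                           const void *src, size_t src_size) {
    assert(dst != NULL && src != NULL);
    size_t n = src_size < dst_size ? src_size : dst_size;
    memmove(dst, src, n);
    return n;
}
```

Copying only the smaller of the two sizes matters when realloc shrinks a block: the tail of the old payload is simply dropped.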
- Handle edge cases
  - If the given pointer is null, do nothing and return silently
  - If the header cannot be reconstructed, do nothing and return silently (in tests, `FREE_ON_BAD_PTR` is marked)
  - If the block is already free (double free), print a message and do nothing (during testing, `DOUBLE_FREE` is marked and fails tests if not checked and cleared)
- Mark the block as free and propagate the information by putting the true size in the footer and, if it is not the top block, setting the next block's bit flag
- If mmapped, munmap
- If not the top block and eligible for fast bins (size fits), insert the block into the appropriate fast bin and return
- If the top block or large, fuse with contiguous chunks
- If not the top block, insert the block into the unsorted bin and return
- If the top block, release the block, i.e. return the memory region to the OS
Note: Modern kernels rarely shrink the program break. To make testing this path easier, we diverge here.
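The decision order of the free path can be condensed into a small classifier. The sketch below is a simplification under stated assumptions: `block_view_t`, `free_action_t`, `classify_free`, and `FASTBIN_MAX` are hypothetical, and the effect of fusion (a fused block may itself become the top block) is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FASTBIN_MAX ((size_t)128) /* assumed fastbin size limit */

typedef enum {
    FREE_IGNORE,      /* null pointer or unrecognizable header */
    FREE_DOUBLE,      /* block already free: report, then do nothing */
    FREE_MUNMAP,      /* mmapped block: unmap it */
    FREE_TO_FASTBIN,  /* small and not top: push to a fastbin */
    FREE_TO_UNSORTED, /* fuse with neighbors, then insert into unsorted bin */
    FREE_RELEASE_TOP  /* fuse, then give the top block back to the OS */
} free_action_t;

/* Hypothetical summary of what the header tells us about a block. */
typedef struct {
    bool   header_valid;
    bool   is_free;
    bool   is_mmapped;
    bool   is_top;
    size_t size;
} block_view_t;

static free_action_t classify_free(const void *ptr, const block_view_t *b) {
    if (ptr == NULL)
        return FREE_IGNORE;
    assert(b != NULL);
    if (!b->header_valid)
        return FREE_IGNORE;
    if (b->is_free)
        return FREE_DOUBLE;
    if (b->is_mmapped)
        return FREE_MUNMAP;
    if (!b->is_top && b->size <= FASTBIN_MAX)
        return FREE_TO_FASTBIN;
    return b->is_top ? FREE_RELEASE_TOP : FREE_TO_UNSORTED;
}
```

Keeping the decision separate from the side effects in this way makes each branch easy to assert on in a unit test.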
- Deterministic, single-process Acutest test suite. Tests don't fork; everything runs inside one process for determinism.
- Tests cover:
- Allocation and freeing
- Fastbin behavior
- Coalescing correctness
- Reallocation edge cases
- Accounting invariants
- Extensive internal debug markers (MM_MARK)
- Optional verbose logging
- Designed to be run under:
- UBSan
- gdb / lldb
- System call wrappers (`mm_sbrk`, `mm_brk`, `mm_mmap`, etc.) are used to mock the syscalls for deterministic behavior in testing.
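To illustrate how a mocked program break can work, here is a minimal sketch of an sbrk-like function backed by a fixed buffer. It is not the project's `mm_sbrk`; the name `mock_sbrk`, the static buffer, and its 4096-byte capacity are assumptions chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_HEAP_CAP 4096 /* assumed capacity of the fake heap */

static _Alignas(16) unsigned char mock_heap[MOCK_HEAP_CAP];
static size_t mock_brk_offset = 0; /* current break, relative to mock_heap */

/* sbrk-like contract: returns the previous break on success,
 * or (void *)-1 with errno set to ENOMEM on failure. */
static void *mock_sbrk(intptr_t increment) {
    assert(mock_brk_offset <= MOCK_HEAP_CAP);
    void *old_brk = mock_heap + mock_brk_offset;
    if (increment < 0) {
        size_t dec = (size_t)0 - (size_t)increment; /* magnitude, no overflow */
        if (dec > mock_brk_offset) {
            errno = ENOMEM;
            return (void *)-1;
        }
        mock_brk_offset -= dec;
        return old_brk;
    }
    if ((size_t)increment > MOCK_HEAP_CAP - mock_brk_offset) {
        errno = ENOMEM;
        return (void *)-1;
    }
    mock_brk_offset += (size_t)increment;
    return old_brk;
}
```

Because the break lives in a plain variable, a test can grow, shrink, and exhaust the heap deterministically without touching the real program break.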
Note: AddressSanitizer is intentionally disabled when overriding system malloc.
- GNU Make
- GCC or Clang
- Linux or macOS
`make clean test`
`make test-container`
If `USE_GDB` is set, this execs directly into a gdb session of the test binary; otherwise it execs into the container:
`make investigation-container [USE_GDB=]`
Common flags:
- `ENABLE_LOG`: enable verbose allocator logging
- `TESTING`: enable test-only hooks and assertions
- `SHOW_SBRK_RELEASE_SUCCEEDS`: emulate successful memory release in tests
This project is actively developed and frequently refactored as allocator behavior is refined and better understood.
Expect:
- Breaking internal changes
- Additional invariants
- More glibc-inspired behavior over time
This allocator is not intended for production use.
It is designed for:
- Learning how real allocators work
- Experimenting with allocator policies
- Testing ideas in a controlled environment
Use at your own risk.
The following invariants are relied upon throughout the codebase. Many unit tests implicitly assert these properties.
- Blocks are contiguous within an arena (if sbrk-backed)
- Each block knows its true size via its header
- If a block is free, its size is also written to its footer
- A block can only be coalesced with neighbors that are also free (fastbins excluded)
- Allocation state is encoded in low bits of the size field
- `prev_free` information is projected into the next block's metadata
- Fastbin chunks are temporarily marked as "in use" to prevent premature fusion
- Fastbins are singly-linked and LIFO
- Unsorted bin may temporarily contain blocks of any size
- Bin bitmaps may be stale until a full consolidation pass
Note: As in glibc, bitmaps are bookkept lazily. An unset bit definitely indicates an empty bin, but a set bit does not guarantee that the bin is populated. The first lookup that exposes a stale bit unsets it.
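The lazy bitmap rule can be illustrated with a small sketch. The structure `arena_sketch_t`, the 32-bin layout, and the helpers `mark_bin` and `probe_bin` are assumptions for this example and do not reflect the project's actual arena layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBINS 32 /* assumed bin count, one bit per bin */

typedef struct {
    uint32_t binmap;      /* bit i set: bin i may be populated */
    void    *bins[NBINS]; /* head of each bin; NULL means empty */
} arena_sketch_t;

/* Set on insertion; nothing clears the bit when a bin drains. */
static void mark_bin(arena_sketch_t *a, unsigned idx) {
    assert(idx < NBINS);
    a->binmap |= (uint32_t)1 << idx;
}

/* Returns the head of bin idx, or NULL. A stale set bit is cleared
 * the first time a probe finds the bin empty. */
static void *probe_bin(arena_sketch_t *a, unsigned idx) {
    assert(idx < NBINS);
    uint32_t bit = (uint32_t)1 << idx;
    if ((a->binmap & bit) == 0)
        return NULL;        /* unset bit: definitely empty */
    if (a->bins[idx] == NULL) {
        a->binmap &= ~bit;  /* set bit was stale: repair it now */
        return NULL;
    }
    return a->bins[idx];
}
```

The trade-off is that removal paths never pay for bitmap maintenance; the cost is deferred to at most one wasted probe per stale bit.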
- Invalid backward coalescing
- Footer corruption
- Crashes during fastbin consolidation
A rough guide to where things live:
- block.*: block layout, headers, footers, and basic navigation
- malloc.c: main allocation/free paths and fastbin logic
- arena.*: arena state, bin maps, and global bookkeeping
- mm_debug.*: debug counters, markers, and instrumentation
- tests/: unit tests (Acutest-based)
The core allocator logic resides in malloc.c; start there to understand the allocator's core behavior.
malloc/
├── README.md  ← you are here
├── Makefile  ← builds allocator + tests
├── Dockerfile  ← reproducible build environment
├── Dockerfile.investigation  ← reproducible investigation environment
├── include/
│   └── malloc/malloc.h  ← exported API symbols
├── src/
│   ├── alignment.c  ← alignment helpers
│   ├── arena.c  ← procedures concerning Arenas
│   ├── arena.h  ← macros, declarations and structures concerning Blocks
│   ├── block.c  ← procedures concerning Blocks
│   ├── block.h  ← declarations and structures concerning Blocks
│   ├── internal.h  ← block structure + allocator internals
│   ├── malloc.c  ← main allocator
│   ├── mm_debug.*  ← debug counters (TESTING)
│   ├── non_allocating_print.c
│   ├── probes.c  ← test inspection helpers
│   └── sys_call_wrappers.c  ← brk/sbrk/mmap wrappers
├── tests/
│   ├── acutest.h
│   ├── log.c  ← logging mechanisms for testing
│   └── test_malloc.c
└── githooks/  ← githooks (here for version control), `make install-git-hooks` to install them
    └── pre-push  ← pre-push hook, runs `test-interpose` if on mac + `make test-container` as guard before pushing
This allocator intentionally mirrors several glibc behaviors:
- Lazy bin bitmap updates
- Fastbins delaying coalescing
- Unsorted bin as a staging area
- Small and large bin behaviors, granularity
However, it also diverges deliberately:
- Single arena only (mmap-arena is separate)
- No thread safety
- Reduced header size experiments
- Strong emphasis on explicit invariants
These differences are intentional and serve educational clarity.
- Prefer running tests under UBSan
- Enable ENABLE_LOG when debugging fusion issues
- Use MM_MARK counters to trace allocator decisions
- When debugging corruption, verify:
- footer placement
- prev_free propagation
- fastbin → unsorted transitions
Planned or possible extensions:
- Giving the space of the `next`, `prev` pointers to the user while a block is in use, as glibc does to squeeze out more performance.
- Reducing worst-case large bin insertions/retrievals with 2D linked lists, grouping same-size chunks as a 'fork' in that bin's linked list.
- Experimenting with some heuristics e.g. moving averages of allocated sizes, rate of change in allocated sizes.
- Additional integrity checks in debug builds
- Better visualization of arena state
- Experimental policies (e.g. different fastbin thresholds).
The project is expected to evolve as allocator understanding deepens.