Skip to content

mplaczek99/PA5

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

PA5 Custom Memory Allocator

PA5 is a fixed-size C++ heap allocator that implements explicit free/used metadata, next-fit allocation, block splitting, and bidirectional coalescing, with a Linux x86 (-m32) workflow that preserves the original assignment behavior.

This repository includes the allocator implementation, assignment-provided test harnesses, Windows reference output, and a Linux-friendly build and benchmarking workflow.

Overview

The allocator manages a single 50 KiB region (Mem::TotalSize) using in-heap metadata (Heap, Free, Used) and explicit linked-list management for both free and used blocks.

This repository focuses on behavior preservation first and measured optimization second.

Engineering Focus

Building and validating this allocator exercises:

  • low-level memory layout and metadata design
  • fragmentation behavior, splitting, and coalescing tradeoffs
  • allocator performance under stress workloads
  • portability constraints between MSVC/Windows and g++/Linux

Highlights

  • Custom allocator implementation in PA5/Mem.cpp
  • Next-fit style free-block search with a single wrap
  • Address-sorted free-list insertion for deterministic coalescing checks
  • Block splitting with a minimum residual threshold (sizeof(Free) + 8)
  • Upward and downward coalescing of physically adjacent free blocks
  • Incremental heap statistics tracking for used/free blocks and memory
  • Linux x86 build path with -m32 and fair stress benchmark target
  • Compatibility shims in Framework/ for MSVC/Windows-specific assumptions

Allocator instrumentation options:

Macro Purpose
MEM_PROFILE Counts allocator operations (calls, traversal steps, merge counts)
MEM_PROFILE_TIMING Adds timing accumulation for key allocator paths
MEM_VERIFY_FREE_STATS Verifies free-list stats against list traversal (debug validation)

Heap Layout (ASCII)

Managed region: Mem::TotalSize = 50 KiB

low address
+--------------------------------------------------------------+
| Heap header (Heap, 32 bytes)                                |
| pUsedHead | pFreeHead | pNextFit | used/free statistics     |
+--------------------------------------------------------------+  <-- first block

Each block in heap order:
+-------------------------------+------------------------------+
| Block header (Free or Used)   | Payload                      |
| mData (size + flags)          | user bytes                   |
| pNext                         | ...                          |
| pPrev                         | [last 4 bytes] secret ptr    |
+-------------------------------+------------------------------+

mData bit layout (uint32_t):
- bit 0: Type      (1 = Free, 0 = Used)
- bit 1: AboveFree (1 = previous/above block is free)
- bits 31..2: payload size (mask with ~0x3)

high address

Project Layout

.
├── PA5/
│   ├── Mem.cpp / Mem.h        # allocator logic and public API
│   ├── Heap.cpp / Heap.h      # heap state metadata
│   ├── Free.cpp / Free.h      # free block metadata
│   └── Used.cpp / Used.h      # used block metadata
├── PA5_ReadOnly/
│   ├── main.cpp
│   └── MemReadOnly.cpp        # assignment-provided runtime/harness
├── Test/
│   ├── Mem_Test_01.cpp ... Mem_Test_17.cpp
│   └── Mem_Test_Stress.cpp
├── Framework/                 # framework utilities + Linux compatibility headers
├── Makefile                   # Linux build and benchmark targets
└── Reference_Output.txt       # Windows reference benchmark output

Build and Test on Linux

Prerequisites

You need a 32-bit capable C++ toolchain.

Example for Debian/Ubuntu:

sudo apt update
sudo apt install -y build-essential g++-multilib libc6-dev-i386

Full debug tests

make full && ./build/pa5_debug_x86

Fair stress-release benchmark

make stress-release && ./build/pa5_stress_release_x86

For fair Linux benchmarking, this repository uses:

  • -m32
  • -O3
  • -DNDEBUG
  • -flto
  • -fno-builtin-malloc
  • -fno-builtin-free

Testing

  • Functional assignment tests are in Test/Mem_Test_01.cpp through Test/Mem_Test_17.cpp
  • The stress benchmark is in Test/Mem_Test_Stress.cpp
  • Recommended workflow:
    1. run full debug tests
    2. run the fair stress-release benchmark
    3. use median over multiple runs for performance comparisons

Benchmark Results

Windows reference benchmarks (Reference_Output.txt)

Mode System allocator (Mem Lib) Custom allocator (Custom Memory) Reported ratio (times faster)
Windows x86 Debug 4118.881748 ms 3374.176539 ms 1.220707x
Windows x86 Release 871.623310 ms 249.820803 ms 3.488994x

Linux fair stress-release benchmark (current)

Mode Custom allocator median Ratio median (supplemental)
Linux x86 stress-release ~195.99 ms ~0.72x

Raw custom-time comparison (primary cross-platform metric)

Baseline Custom allocator time
Windows x86 Release custom baseline 249.820803 ms
Linux fair stress-release custom median ~195.99 ms

The Linux custom allocator raw benchmark time is about 21.5% faster than the Windows x86 Release custom baseline.

Notes on Windows vs Linux Benchmarking

The times faster ratio is computed against each platform's system allocator baseline.

Because Linux malloc/free and the Windows CRT allocator behave differently, Windows and Linux ratios are not directly apples-to-apples.

For cross-platform comparison, use raw custom allocator time as the primary metric:

  • ratio is useful as a within-platform indicator
  • raw custom time is the Windows-vs-Linux baseline comparison

Example Allocator Usage

#include "Mem.h"

int main()
{
    Mem mem(Mem::Guard::Type_A);
    mem.initialize();

    void* p = mem.malloc(256);
    if (p != nullptr)
    {
        mem.free(p);
    }

    return 0;
}

Future Improvements and Lessons Learned

  • Keep optimization scope narrow and measurable; preserve allocator semantics before changing allocator policy.
  • Add an optional heap-integrity pass that validates header flags, list links, block sizes, and adjacency invariants after stress runs.
  • Evaluate alternate placement policies (for example best-fit or size-segregated free lists) behind compile-time switches while keeping assignment behavior as a regression gate.
  • Replace ad hoc benchmark sampling with scripted multi-run statistics (median, p95, variance) and fixed CPU affinity to reduce noise.
  • Improve reproducibility with CI jobs that build and run both full and stress-release on a 32-bit-capable Linux toolchain.

Project Background

This project began as an assignment-style custom memory allocator exercise and was later extended with a Linux build path, benchmarking workflow, and measured performance improvements.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages