Add compile-time configurable pool size #19

Merged: jserv merged 1 commit into master from configurable on Feb 9, 2026

@jserv (Collaborator) commented Feb 9, 2026

This introduces three performance/size tuning knobs:

  • TLSF_MAX_POOL_BITS: clamp the FL index to reduce the size of the tlsf_t control structure (e.g. -DTLSF_MAX_POOL_BITS=20 for a 1 MB max pool). Runtime guards in arena_{grow,append_pool} prevent a merged block from overflowing the mapping function.
  • TLSF_SPLIT_THRESHOLD: configurable minimum remainder for trimming (default BLOCK_SIZE_MIN preserves existing behavior). Raising it avoids tiny free blocks whose metadata overhead exceeds payload.
  • Reorder tlsf_t fields: move hot arena/size before the large block[][] matrix so bitmap scans and pool checks stay in the same cache lines.
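
To make the TLSF_SPLIT_THRESHOLD item above concrete, here is a minimal sketch of a trimming decision. It is not the code from this commit: `should_split`, its parameters, and the sample sizes are hypothetical, and the real allocator may account for header overhead differently. The idea is only that a block is split when the leftover can hold a block header plus at least the threshold's worth of payload.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the commit: decide whether a free
 * block of block_size bytes should be split after serving `request` bytes.
 * `overhead` stands in for the per-block header size and `threshold` for
 * TLSF_SPLIT_THRESHOLD (which defaults to BLOCK_SIZE_MIN in the PR).
 */
static bool should_split(size_t block_size, size_t request,
                         size_t overhead, size_t threshold)
{
    if (block_size < request)
        return false; /* block cannot satisfy the request at all */

    size_t remainder = block_size - request;

    /* Split only if the remainder holds a header plus a useful payload. */
    return remainder >= overhead && remainder - overhead >= threshold;
}
```

With an assumed 8-byte header, a 128-byte block serving a 96-byte request is split at a threshold of 24 but handed out whole at a threshold of 64. That is the trade the knob exposes: a little internal slack instead of a tiny free block.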

FL_COUNT is now computed from FL_MAX and FL_SHIFT rather than hardcoded per architecture. Static asserts validate all constraints.
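
A rough sketch of how such a derivation and its checks could look is shown below. Only the names TLSF_MAX_POOL_BITS, FL_MAX, FL_SHIFT, and FL_COUNT come from the description above. The values chosen for ALIGN_SHIFT and SL_SHIFT, the `POOL_SIZE_MAX` macro, the exact size limit, and the `grow_fits` helper are assumptions for illustration, and the actual header may define them differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Build-time knob from the PR; 20 is the example value (1 MB max pool). */
#ifndef TLSF_MAX_POOL_BITS
#define TLSF_MAX_POOL_BITS 20
#endif

/* Assumed constants for illustration; the real header may differ. */
#define ALIGN_SHIFT 3 /* 8-byte alignment */
#define SL_SHIFT 4    /* 16 second-level lists per first-level class */
#define SL_COUNT (1U << SL_SHIFT)

#define FL_SHIFT (SL_SHIFT + ALIGN_SHIFT)
#define FL_MAX TLSF_MAX_POOL_BITS
#define FL_COUNT (FL_MAX - FL_SHIFT + 1) /* derived, not hardcoded */

/* Assumed limit: the largest size the mapping function can index. */
#define POOL_SIZE_MAX ((size_t) 1 << FL_MAX)

/* Compile-time validation of the configuration. */
static_assert(FL_MAX > FL_SHIFT, "pool bits must exceed FL_SHIFT");
static_assert(FL_MAX < sizeof(size_t) * 8, "pool bits must fit in size_t");
static_assert(FL_COUNT <= 32, "first-level bitmap must fit in 32 bits");

/*
 * Hypothetical runtime guard: true if growing a pool of `current` bytes
 * by `extra` bytes stays within POOL_SIZE_MAX. The subtraction form
 * avoids overflow in current + extra.
 */
static bool grow_fits(size_t current, size_t extra)
{
    return current <= POOL_SIZE_MAX && extra <= POOL_SIZE_MAX - current;
}
```

Under these assumed constants, building with -DTLSF_MAX_POOL_BITS=20 yields an FL_COUNT of 14, and a request to grow past 1 MB is rejected by the guard rather than producing a block the mapping function cannot index.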


Summary by cubic

Make the allocator size and split behavior configurable at compile time to reduce memory footprint and fragmentation. Defaults keep current behavior, with static and runtime checks to enforce limits.

  • New Features
    • TLSF_MAX_POOL_BITS: caps max pool size and shrinks tlsf_t by clamping the FL index (e.g., -DTLSF_MAX_POOL_BITS=20 for 1 MB). Runtime guards in arena_grow/arena_append_pool prevent overflow.
    • TLSF_SPLIT_THRESHOLD: sets the minimum remainder when trimming blocks (default BLOCK_SIZE_MIN) to avoid tiny, wasteful free blocks.
  • Refactors
    • Reordered tlsf_t for better cache locality (arena/size before block[][]); FL_COUNT now derived from FL_MAX and FL_SHIFT; added static asserts and updated README.
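
The field reordering mentioned above can be pictured with a small layout sketch. This is a hypothetical struct, not the real tlsf_t: the field types, the `tlsf_sketch_t` name, the `hot_prefix_bytes` helper, and the FL_COUNT/SL_COUNT values are assumptions chosen only to show the small, frequently read fields sitting ahead of the large matrix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes for illustration (roughly a 1 MB pool configuration). */
#define FL_COUNT 14
#define SL_COUNT 16

struct block; /* opaque free-block header */

/* Hypothetical layout: small, frequently read fields first, the large
 * free-list matrix last. */
typedef struct {
    uint32_t fl;                             /* first-level bitmap */
    uint32_t sl[FL_COUNT];                   /* second-level bitmaps */
    void *arena;                             /* hot: pool base (assumed type) */
    size_t size;                             /* hot: pool size */
    struct block *block[FL_COUNT][SL_COUNT]; /* large matrix, kept last */
} tlsf_sketch_t;

/* The hot fields must precede the matrix for the layout to help. */
static_assert(offsetof(tlsf_sketch_t, arena) < offsetof(tlsf_sketch_t, block),
              "arena should precede block[][]");
static_assert(offsetof(tlsf_sketch_t, size) < offsetof(tlsf_sketch_t, block),
              "size should precede block[][]");

/* Bytes occupied by everything that precedes the matrix. */
static size_t hot_prefix_bytes(void)
{
    return offsetof(tlsf_sketch_t, block);
}
```

On a typical LP64 target the prefix in this sketch comes to 80 bytes against 1792 bytes for the matrix, so the bitmaps and the arena/size pair span at most a couple of cache lines instead of being separated by the matrix.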

Written for commit 2c7f75e. Summary will update on new commits.

github-actions Bot commented Feb 9, 2026

WCET Results (x86-64)

TLSF WCET Analysis
==================
Timer:      cycles
Cache:      hot
Pool:       4194304 bytes (4.0 MB)
Iterations: 5000 (warmup: 500)
Sizes:      16 64 256 1024 4096 bytes

--- malloc_worst (small alloc from single huge block) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         73         98         98        123        196        270       98.7        8.0
      64         74         98         98        123        196      28836      104.8      406.5
     256         73         98        122        123        220        269      100.7       10.7
    1024         73         98         98        123        123        245       94.5       12.6
    4096         73         98         98        123        123        245       95.8       12.4

--- malloc_best (exact bin hit, no split) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         49         73         74         74         74        147       70.7        8.2
      64         49         73         74         98         98         98       69.7       10.1
     256         49         73         74         74         98         98       64.3       12.1
    1024         49         73         74         74         74         74       66.7       11.0
    4096         49         73         74         74         74        147       67.4       10.7

--- free_worst (sandwiched between two free blocks) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         49         74         98         98        122        269       75.9       10.9
      64         49         74         98         98        147        196       75.6       10.4
     256         49         74         74         98        147        220       75.0        8.5
    1024         49         74         98         98         98        245       76.3       10.7
    4096         49         74         98         98        147        196       79.9       12.8

--- free_best (no merge (used neighbors)) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         49         73         74         74        122        123       63.2       12.4
      64         49         73         74         74        123        147       63.3       12.5
     256         49         49         74         74        123        221       59.3       12.8
    1024         49         73         74         74        123      18693       67.0      263.8
    4096         49         49         74         74        123        196       59.2       12.6

--- worst/best ratio (p99) ---
    size     malloc       free
      16      1.66x      1.32x
      64      1.66x      1.32x
     256      1.32x      1.32x
    1024      1.66x      1.32x
    4096      1.66x      1.32x

github-actions Bot commented Feb 9, 2026

WCET Results (arm64)

TLSF WCET Analysis
==================
Timer:      ticks
Cache:      hot
Pool:       4194304 bytes (4.0 MB)
Iterations: 5000 (warmup: 500)
Sizes:      16 64 256 1024 4096 bytes

--- malloc_worst (small alloc from single huge block) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         16         32         32         32         40         40       31.0        2.8
      64         24         32         32         32         40         40       30.9        2.9
     256         24         32         32         40         40         48       30.8        3.2
    1024         24         32         32         40         40       1896       31.1       26.6
    4096         16         32         32         40         40         48       30.8        3.1

--- malloc_best (exact bin hit, no split) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         16         24         32         32         32         40       25.2        2.9
      64         16         24         32         32         32         40       25.3        2.9
     256         16         24         32         32         32         40       24.8        2.5
    1024         16         24         32         32         32         32       24.8        2.5
    4096         16         24         32         32         32         32       24.8        2.5

--- free_worst (sandwiched between two free blocks) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         16         24         32         32         32         48       25.8        3.3
      64          8         24         32         32         32         40       26.0        3.5
     256         24         24         32         32         32         40       25.7        3.3
    1024         16         24         32         32         32         40       25.8        3.3
    4096         24         24         32         32         32       1728       26.4       24.3

--- free_best (no merge (used neighbors)) ---
    size        min        p50        p90        p99      p99.9        max       mean     stddev
      16         16         24         24         24         24         32       20.5        4.0
      64          8         24         24         24         24         32       20.5        4.0
     256         16         24         24         24         24         24       20.5        4.0
    1024          8         24         24         24         24         32       20.5        4.0
    4096         16         24         24         24         24         24       20.6        4.0

--- worst/best ratio (p99) ---
    size     malloc       free
      16      1.00x      1.33x
      64      1.00x      1.33x
     256      1.00x      1.33x
    1024      1.25x      1.33x
    4096      1.25x      1.33x

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 3 files

@jserv jserv merged commit b60b174 into master Feb 9, 2026
10 checks passed
@jserv jserv deleted the configurable branch February 9, 2026 06:31