⚡ Optimize the context layout and allocation strategy #95

Merged
kammce merged 1 commit into main from context-allocation-speedup
Apr 20, 2026

kammce commented Apr 20, 2026

Summary

Replace std::span<stack_word> m_stack with two raw pointers (m_stack_pointer, m_stack_end), saving one word in the context object and bringing hot fields into the first cache line. Capacity is stored in the sentinel slot at m_stack_end rather than in span metadata.

Changes

  • Replace m_stack span with m_stack_pointer + m_stack_end pointers
  • Store capacity in m_stack.back() and read it back via *m_stack_end
  • Reorder members to pack m_state and m_sleep_time before the pointers, fitting more into the first cache line
  • Fix inplace_context constructor to call initialize_stack_memory() in the body rather than the initializer list. C++ initializes base classes before data members, so the zero-init of the member storage runs after the base constructor and would otherwise clobber the capacity sentinel written into m_stack.back()
  • Update sync_wait tests to capture ctx.capacity() at construction instead of comparing against the raw template argument (capacity is now N - 1 due to the reserved sentinel slot)

Test Plan

  • Pre-commit checks pass
  • New/updated tests cover the changes
  • Tested locally with conan create .

kammce force-pushed the context-allocation-speedup branch from 5f7a19c to 9d0113b on April 20, 2026 23:18
kammce merged commit fe26e2c into main on Apr 20, 2026
7 of 8 checks passed
kammce deleted the context-allocation-speedup branch on April 22, 2026 22:59