Skip to content

C++ modernisation: build hygiene, UNIV fixes, sync primitives, networking safety#419

Merged
benswift merged 15 commits intomasterfrom
cpp-modernisation
Apr 19, 2026
Merged

C++ modernisation: build hygiene, UNIV fixes, sync primitives, networking safety#419
benswift merged 15 commits intomasterfrom
cpp-modernisation

Conversation

@benswift
Copy link
Copy Markdown
Collaborator

@benswift benswift commented Apr 19, 2026

Applies phases 1-5 of the 2026-04-19 C++ audit. See docs/superpowers/plans/2026-04-19-cpp-modernisation.md for the phased plan.

Summary

  • Build hygiene-Wall -Wextra -Wshadow with -Werror=return-type and -Werror=uninitialized; SYSTEM-scoped LLVM/portaudio/networking-TS includes; EXTEMPORE_SANITIZE=asan|ubsan|tsan build option; .clang-format and .clang-tidy configs plus a tidy make target.
  • UNIV.cpp correctness — fix sys_slurp_file off-by-one + unchecked fread + missing terminator; fix cname_decode pad-buffer leak (shadow of outer d2 meant the free was always a no-op); add a bounds guard to rsplit; clean up the base64_decode/cname_decode ternary ladders.
  • Kill hand-rolled sync primitives — delete EXTMutex, EXTMonitor, EXTCondition; callers now use std::recursive_mutex + std::condition_variable_any + std::lock_guard directly. Recursive semantics preserved everywhere (some call paths rely on it, e.g. llvm_mutex_* exposed to xtlang).
  • Networking safety — replace 4 gethostbyname call sites with a new extemp::net_util::resolve_ipv4 helper backed by getaddrinfo (thread-safe); fix SchemeREPL::m_serverSocket sentinel (was 0, could cause close(stdin) on a failed connect); make sig_handler async-signal-safe; remove freopen("/dev/null", "w", stderr) (was hiding all LLVM diagnostics).
  • Test harness — adds tests/cpp-unit/ with GoogleTest via FetchContent, 8 unit tests covering the slurp and network helpers. Wired into CI via the cpp-unit ctest label.

All four platforms green (Linux x86_64, Linux aarch64, macOS aarch64, Windows x86_64).

Out of scope (tracked as follow-up backlog tasks)

  • TASK-039 RAII audio/scheme thread ownership
  • TASK-040 LLVM_SCHEME_FF_MAP / FFI table thread safety
  • TASK-041 zone allocator mutex on audio thread (RT safety)
  • TASK-042 audio dispatcher race conditions
  • TASK-043 OSC.cpp UDP/TCP × Win/POSIX consolidation
  • TASK-044 EXTThread → std::jthread migration
  • TASK-045 rreplace/rsplit API redesign with explicit capacity

Test plan

  • CI green on Linux x86_64 / Linux aarch64 / macOS aarch64 / Windows x86_64
  • 8 cpp-unit tests passing (slurp file behaviour + IPv4 resolver)
  • libs-core / libs-external / compiler-unit / examples-core / examples-audio ctest labels green
  • Zero warnings from extempore's own code under -Wall -Wextra -Wshadow
  • extempore binary runs under --batch mode

benswift added 15 commits April 19, 2026 10:41
Reflect the v0.9.0-beta landscape: LLVM 22 via FetchContent, s7, ORC JIT,
WebGPU graphics, Linux aarch64, --repl, CMake 3.28, VS2022. Strip
references to removed or unmaintained paths (OpenGL, legacy ASIO,
VS2019, Catalina signing workaround, old PortAudio layout).
Turns on -Wall -Wextra -Wshadow as warnings and promotes
-Werror=return-type and -Werror=uninitialized. Silences noisy
categories that fire on vendored code (-Wno-unused-parameter,
-Wno-unused-variable, -Wno-unused-function, -Wno-missing-field-initializers,
-Wno-sign-compare, -Wno-deprecated-declarations).

Marks LLVM, portaudio, and networking-ts-impl includes as SYSTEM
to suppress 6000+ transitive warnings from third-party headers.

MSVC gets /W3 with /we4715 (missing-return → error) and -Wno-*
suppressions for the noisy width-conversion warnings.

Currently ~22 warnings remain in Extempore's own code, all of which
are either fixable in follow-up commits or point at real bugs
(e.g. the d2 shadow in UNIV.cpp:173 is cname_decode's pad-branch leak).
Developer-facing CMake cache variable. Accepts asan, ubsan, tsan,
or combinations like asan+ubsan. No-op on Windows (uses MSVC).
Available for editor-on-save and incremental use on changed lines.
A mass-reformat commit is deliberately deferred to TASK-023 so git
blame history isn't poisoned in this refactor branch.
Conservative initial config - only bug-finding checks (bugprone-*,
const-correctness, performance-*, select cppcoreguidelines/readability).
WarningsAsErrors is empty so running tidy never breaks the build.

HeaderFilterRegex excludes pcre, s7, networking-ts-impl, ffi/, shims/,
and linenoise (vendored third-party).

The 'tidy' custom target filters EXTEMPORE_SOURCES to non-vendored
files and runs clang-tidy against build/compile_commands.json.
Also forces CMAKE_EXPORT_COMPILE_COMMANDS on globally.
Fetches googletest v1.15.2 via FetchContent when BUILD_TESTS is on.
tests/cpp-unit/ holds a smoke test for now; real UNIV tests land in
follow-up commits.

Registered via add_subdirectory from extras/cmake/tests.cmake so it
slots into the existing ctest structure. Adds cpp-unit to the CI
label regex so the 4-platform matrix runs the tests.
…ator

The old implementation had three overlapping bugs:
- allocated file_size bytes and wrote the terminator at buf[file_size-1],
  truncating the last content byte
- ignored fread's return, silently returning a partially-populated buffer
  on short reads
- no room for a terminator past the content

Extracts the pure logic into extemp::file_util::slurp_file() in a new
header-only ext/FileUtil.h so cpp-unit tests can exercise it without
dragging the rest of UNIV.cpp. sys_slurp_file becomes a thin wrapper
that layers the SHARE_DIR fallback.

Adds 5 cpp-unit tests covering ASCII, empty, last-byte preservation,
embedded nulls, and missing files.
The inner 'char* d2 = ...' shadowed the outer 'char* d2 = nullptr;', so
the 'if (d2) free(d2)' at the bottom of the function never fired and
every cname_decode() call with unaligned input length leaked the pad
buffer. Renames the inner variable to pad_buf and frees it on every
exit path.

Also refactors the four-line base64 sextet-fetch ladder in both
cname_decode and base64_decode into a single local lambda for
readability. The old expression 'data[i] == '$' ? 0 & i++ : table[data[i++]]'
was valid under C++17 statement-level sequence points but was genuinely
hard to read.

Silences the -Wshadow warning that flagged the original bug.
Add a 2048-byte ceiling check to rsplit, matching the salloc size
used by every known xtlang caller (libs/core/adt.xtm). Without this
the function writes straight into the caller's 2048-byte buffer with
no size information. Returns false instead of overrunning.

Proper fix (passing explicit capacity, or returning std::string) is
tracked as a follow-up backlog task — it needs coordinated changes
in runtime/bitcode.ll and callers in libs/core/adt.xtm.
The three EXT* headers wrapped std::recursive_mutex, std::mutex +
std::condition_variable_any, and std::condition_variable_any respectively,
adding only a debug name string. They predated std::lock_guard so they
invented their own ScopedLock, and used std::recursive_mutex unconditionally.

Call sites now use std::recursive_mutex / std::condition_variable_any and
std::lock_guard directly. Recursive semantics are preserved everywhere
since some code paths (notably llvm_mutex_* exposed to xtlang, and the
scheduler guard) lock the same mutex more than once from the same thread.

TaskScheduler keeps a named Monitor struct with lock/unlock/wait/signal
because the scheduler thread needs to wait on the guard; everything else
uses the bare std:: types.

EXTThread is unchanged for now — it provides platform-specific realtime
scheduling (SCHED_RR on Linux, THREAD_TIME_CONSTRAINT_POLICY on macOS) and
a subsume-current-thread mode that std::thread doesn't natively offer.
Migration to std::jthread tracked as a follow-up backlog task.

All libs-core and cpp-unit tests pass locally.
gethostbyname is deprecated, not thread-safe, and IPv4-only. Four call
sites (EXTLLVM.cpp, OSC.cpp, SchemeREPL.cpp, LinenoiseREPL.cpp) now go
through a new extemp::net_util::resolve_ipv4() helper in
include/ext/NetUtil.h that wraps getaddrinfo.

Also:
- SchemeREPL::m_serverSocket now uses -1 (POSIX) or nullptr (Windows)
  as its sentinel, not 0. The old 0-as-invalid pattern meant a failed
  connect could leave FD 0 in the field, and a subsequent closeREPL()
  would close stdin.
- sig_handler is now async-signal-safe: uses write(2) + _Exit instead
  of printf + exit(). Drops descriptive messages in favour of one-line
  writes.
- Removes the freopen("/dev/null", "w", stderr) call that silenced
  all LLVM / assert diagnostics for the life of the process. Users who
  want quiet stderr can shell-redirect.

Adds 3 cpp-unit tests for resolve_ipv4 covering localhost, null, and
non-resolvable input.
Captures the phased plan that #419 is executing and spins off the
deferred audit findings as discrete backlog tasks so they don't get
lost:

- TASK-038 onboarding UX work (parked from the same audit)
- TASK-039 RAII conversion (audio/scheme thread ownership)
- TASK-040 LLVM_SCHEME_FF_MAP / FFI table thread safety
- TASK-041 zone allocator mutex on audio thread (RT safety)
- TASK-042 audio dispatcher race conditions
- TASK-043 OSC.cpp UDP/TCP × Win/POSIX consolidation
- TASK-044 EXTThread migration to std::jthread where safe
- TASK-045 rreplace/rsplit API redesign with explicit capacity

.gitignore gets .remember/ and build-*san/ directories.
Bonus fixes riding on top of the Phase 1 warning flag turn-on:

- SchemeTask constructor parameter 'Type' renamed to 'TaskType' — the
  parameter was shadowing the enum class of the same name, producing
  the warning in every TU that included SchemeProcess.h
- OSC.cpp:671 'struct stat buf' renamed to 'st' (outer 'char buf[]'
  was in scope)
- OSC.cpp:964 inner loop 'int i' renamed to 'j' (outer parameter 'int* i'
  was in scope)
- SchemeProcess.cpp:506 'auto res(accept(...))' renamed to 'sock' (the
  inline select() result above was already named 'res')
- SchemeProcess.cpp:402,529 prompt buffer widened from 32 to 64 bytes
  and values reduced modulo sensible ranges to keep the printed time
  readable ('%.2u' on a raw uint can print 10 digits)

extempore code now compiles cleanly under -Wall -Wextra -Wshadow
-Werror=return-type -Werror=uninitialized.
Applying /DEF:extempore.def via CMAKE_EXE_LINKER_FLAGS makes every
executable inherit it, which breaks extempore_cpp_unit.exe on Windows
(it can't satisfy the 27 extempore symbols the .def file exports).

Move to target_link_options on the extempore target so other executables
(cpp-unit tests) link cleanly.
On Windows, getaddrinfo silently returns WSANOTINITIALISED until
WSAStartup has been called. The extempore binary initialises Winsock
elsewhere (via portaudio / networking-TS), but the cpp-unit test
executable does not, so ResolveIpv4.LocalhostResolves failed on Windows.

Add a std::call_once wrapper that calls WSAStartup the first time
resolve_ipv4 runs. No WSACleanup — the process exits and the OS cleans
up. Matches the existing no-op-on-UNIX pattern.
@benswift benswift marked this pull request as ready for review April 19, 2026 08:15
@benswift benswift changed the title C++ modernisation: phases 1-4 (WIP) C++ modernisation: build hygiene, UNIV fixes, sync primitives, networking safety Apr 19, 2026
benswift added a commit that referenced this pull request Apr 19, 2026
Two small ownership fixes from the #419 follow-up backlog:

- AudioDevice::m_threads[MAX_RT_AUDIO_THREADS] was a C array of raw
  owning EXTThread* that were `new`ed in init and never `delete`d.
  Now a std::array<std::unique_ptr<EXTThread>, N>.  The dtor detaches
  each running thread so unique_ptr destruction doesn't call
  std::terminate on the underlying std::thread (the RT threads
  intentionally run `while(true)` and are cleaned up at process exit).
  The unused getMTThreads() accessor is removed; no callers existed.

- The imp_randd PRNG was a heap-allocated thread_local pointer that
  leaked on thread exit and required a null check per call.  Now a
  value-typed thread_local seeded with time(nullptr) on first access
  per thread, keeping the existing per-thread seeding semantics.
benswift added a commit that referenced this pull request Apr 19, 2026
- TASK-040 (FFI thread safety): mark Done; moved to completed/.
- TASK-039: tick off m_threads and PRNG; leave singletons and
  SchemeTask::m_ptr as remaining scope.
- TASK-044: tick off the two REPL threads in Extempore.cpp; list the
  call sites that must stay on EXTThread (RT priority / setSubsume /
  xtlang ABI) and the deferred candidates (TaskScheduler, OSC).
@benswift benswift merged commit 8451ca3 into master Apr 19, 2026
4 checks passed
benswift added a commit that referenced this pull request Apr 19, 2026
Captures the phased plan that #419 is executing and spins off the
deferred audit findings as discrete backlog tasks so they don't get
lost:

- TASK-038 onboarding UX work (parked from the same audit)
- TASK-039 RAII conversion (audio/scheme thread ownership)
- TASK-040 LLVM_SCHEME_FF_MAP / FFI table thread safety
- TASK-041 zone allocator mutex on audio thread (RT safety)
- TASK-042 audio dispatcher race conditions
- TASK-043 OSC.cpp UDP/TCP × Win/POSIX consolidation
- TASK-044 EXTThread migration to std::jthread where safe
- TASK-045 rreplace/rsplit API redesign with explicit capacity

.gitignore gets .remember/ and build-*san/ directories.
benswift added a commit that referenced this pull request Apr 19, 2026
Two small ownership fixes from the #419 follow-up backlog:

- AudioDevice::m_threads[MAX_RT_AUDIO_THREADS] was a C array of raw
  owning EXTThread* that were `new`ed in init and never `delete`d.
  Now a std::array<std::unique_ptr<EXTThread>, N>.  The dtor detaches
  each running thread so unique_ptr destruction doesn't call
  std::terminate on the underlying std::thread (the RT threads
  intentionally run `while(true)` and are cleaned up at process exit).
  The unused getMTThreads() accessor is removed; no callers existed.

- The imp_randd PRNG was a heap-allocated thread_local pointer that
  leaked on thread exit and required a null check per call.  Now a
  value-typed thread_local seeded with time(nullptr) on first access
  per thread, keeping the existing per-thread seeding semantics.
benswift added a commit that referenced this pull request Apr 19, 2026
- TASK-040 (FFI thread safety): mark Done; moved to completed/.
- TASK-039: tick off m_threads and PRNG; leave singletons and
  SchemeTask::m_ptr as remaining scope.
- TASK-044: tick off the two REPL threads in Extempore.cpp; list the
  call sites that must stay on EXTThread (RT priority / setSubsume /
  xtlang ABI) and the deferred candidates (TaskScheduler, OSC).
benswift added a commit that referenced this pull request Apr 19, 2026
Two small ownership fixes from the #419 follow-up backlog:

- AudioDevice::m_threads[MAX_RT_AUDIO_THREADS] was a C array of raw
  owning EXTThread* that were `new`ed in init and never `delete`d.
  Now a std::array<std::unique_ptr<EXTThread>, N>.  The dtor detaches
  each running thread so unique_ptr destruction doesn't call
  std::terminate on the underlying std::thread (the RT threads
  intentionally run `while(true)` and are cleaned up at process exit).
  The unused getMTThreads() accessor is removed; no callers existed.

- The imp_randd PRNG was a heap-allocated thread_local pointer that
  leaked on thread exit and required a null check per call.  Now a
  value-typed thread_local seeded with time(nullptr) on first access
  per thread, keeping the existing per-thread seeding semantics.
benswift added a commit that referenced this pull request Apr 19, 2026
- TASK-040 (FFI thread safety): mark Done; moved to completed/.
- TASK-039: tick off m_threads and PRNG; leave singletons and
  SchemeTask::m_ptr as remaining scope.
- TASK-044: tick off the two REPL threads in Extempore.cpp; list the
  call sites that must stay on EXTThread (RT priority / setSubsume /
  xtlang ABI) and the deferred candidates (TaskScheduler, OSC).
@benswift benswift deleted the cpp-modernisation branch April 19, 2026 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant