Add: opt-in compiler sanitizers (ASAN/UBSan + TSAN) for host targets by ChaoWao · Pull Request #915 · hw-native-sys/simpler

ChaoWao · 2026-05-30T09:10:50Z

Summary

Implements #904: opt-in compiler sanitizers for host-compiled code, driven by a single --sanitizer parameter (no env vars), covering the runtime + per-test kernels/orchestration, plus a nightly CI sweep.

Scope rule — sanitize iff host-compiled (auto-enforced by Toolchain.is_host): sim instruments host+aicpu+aicore + sim_context/log + kernels + orchestration; onboard instruments host only; device toolchains (ccec/aarch64) never. Device custom arenas bypass ASAN redzones (documented limitation).
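The scope rule above can be sketched as a single gate on the toolchain. This is a minimal illustration, not the project's code: the `Toolchain` dataclass and `sanitizer_flags` helper below are hypothetical stand-ins, and the exact flag set is an assumption based on the "frame pointers, `-O1`" description in the review walkthrough.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Toolchain:
    """Hypothetical stand-in for the project's Toolchain class."""

    name: str
    is_host: bool = False  # the PR describes False as the default


def sanitizer_flags(toolchain: Toolchain, tokens: str) -> list[str]:
    """Return extra compiler flags; device toolchains get none."""
    if not toolchain.is_host or not tokens:
        return []
    # Assumed flag set: sanitizer tokens plus frame pointers and -O1.
    return [f"-fsanitize={tokens}", "-fno-omit-frame-pointer", "-O1"]


GXX15 = Toolchain("g++-15", is_host=True)  # host toolchain: instrumented
CCEC = Toolchain("ccec")                   # device toolchain: never instrumented

print(sanitizer_flags(GXX15, "address,undefined"))
print(sanitizer_flags(CCEC, "address,undefined"))
```

Keeping the decision in one predicate means a new device toolchain is excluded by default, since `is_host` has to be set explicitly.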

Key design points

- Single --sanitizer parameter, no env vars. Install-time: pip install --config-settings=cmake.define.SIMPLER_SANITIZER=asan . At test time, --sanitizer compiles the per-test kernels to match and fails fast with the exact LD_PRELOAD command (both pytest and standalone paths).
- g++-15 unification under sanitizer: sim kernels are always g++-15, so the runtime/helpers/orchestration build with g++-15 too when sanitizing, because mixing g++ and g++-15 sanitizer runtimes is an ABI mismatch that fails at .so load. Onboard host stays g++ (its kernels are device, never instrumented).
- ASAN+UBSan together; TSAN separate (mutually exclusive). cmake/sanitizers.cmake is generic (-fsanitize=${SIMPLER_SANITIZERS}); -Wno-error=tsan keeps TSAN's "atomic_thread_fence unsupported" warning from failing the AICPU -Werror build.
- Sanitizers are Linux-only (macOS gcc-15's libasan isn't found by Apple ld).
- Nightly CI is a SEPARATE workflow (.github/workflows/sanitizers.yml, schedule + workflow_dispatch) so the cron fires only the sanitizer jobs, never the PR/self-hosted pipeline. Not a PR gate.
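The preset and mutual-exclusion behaviour described in these points could look roughly like the sketch below. The table contents are assumptions: in particular, mapping the `asan` preset to `address,undefined` is inferred from "ASAN+UBSan together", and the names `PRESETS`, `resolve`, and `validate` only mirror what the review mentions for `simpler_setup/sanitizers.py` without reproducing it.

```python
from __future__ import annotations

# Assumed preset table; the real one lives in simpler_setup/sanitizers.py.
PRESETS = {
    "none": "",
    "asan": "address,undefined",  # assumption: the ASAN preset bundles UBSan
    "ubsan": "undefined",
    "tsan": "thread",
}

# Tokens that cannot be combined with -fsanitize=thread.
THREAD_INCOMPATIBLE = {"address", "leak", "memory"}


def resolve(name: str) -> str:
    """Expand a preset name into a comma-separated -fsanitize token list."""
    try:
        return PRESETS[name]
    except KeyError:
        raise ValueError(f"unknown sanitizer preset: {name!r}") from None


def validate(tokens: str) -> None:
    """Reject TSAN combined with ASAN/LSan/MSan tokens."""
    toks = {t for t in tokens.split(",") if t}
    clash = sorted(toks & THREAD_INCOMPATIBLE)
    if "thread" in toks and clash:
        raise ValueError(f"thread cannot be combined with: {', '.join(clash)}")


validate(resolve("asan"))  # passes silently
validate(resolve("tsan"))  # passes silently
```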

Testing (validated on Linux via docker)

- ASAN end-to-end: a2a3sim build + run, dynamic_register::test_register_after_init_then_run 6/6 green (~1.7× overhead). Build instrumentation confirmed via nm (host/aicpu/aicore + helpers); --sanitizer none shows 0 instrumented symbols.
- TSAN build: a2a3sim host+aicpu+aicore all build with __tsan instrumentation.
- tests/ut/py/test_sanitizers.py (preset/validate/preload logic): 20/20.
- pre-commit green (ruff/pyright/markdownlint/check-yaml).
- TSAN full run not executed locally: it is 5–15× slower and amplifies the pre-existing sim-oversubscription flake (#884, "[Bug] Dynamic Register/Unregister Instability In A2A3 Sim CI"), so it is deferred to the nightly job (1200s timeout). The flake is more frequent under the sanitizers' slower timing; it is handled by a generous timeout plus manual rerun, and the job is not a PR gate.

coderabbitai · 2026-05-30T09:11:02Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

📝 Walkthrough

This PR introduces comprehensive ASAN/TSAN/UBSAN sanitizer support. A new sanitizer module validates preset/token combinations and maps to runtime libraries. Build and test systems thread sanitizer flags through CMake targets and compiler invocations. CI adds a nightly job to run instrumented scene tests on both platforms.

Changes

Sanitizer Support (ASAN/TSAN/UBSAN) Infrastructure

| Layer / File(s) | Summary |
| --- | --- |
| Sanitizer infrastructure and validation module `simpler_setup/sanitizers.py` | New module provides preset expansion (`none`/`asan`/`ubsan`/`tsan` → `-fsanitize` tokens), validation to prevent illegal combinations (thread vs. address/leak/memory), and runtime library mapping for `LD_PRELOAD`. |
| Toolchain host capability marking `simpler_setup/toolchain.py` | Adds `is_host` boolean attribute to `Toolchain` (default `False`), set to `True` for `Gxx15Toolchain` and `GxxToolchain` to identify host-capable toolchains that may carry host sanitizer runtimes. |
| CMake sanitizer helper and target configuration `cmake/sanitizers.cmake`, `src/a2a3/platform/onboard/host/CMakeLists.txt`, `src/a2a3/platform/sim/{aicore,aicpu,host}/CMakeLists.txt`, `src/a5/platform/onboard/host/CMakeLists.txt`, `src/a5/platform/sim/{aicore,aicpu,host}/CMakeLists.txt`, `src/common/{log,sim_context}/CMakeLists.txt` | Defines shared CMake function `simpler_apply_sanitizers(tgt)` that conditionally applies sanitizer compile/link flags when `SIMPLER_SANITIZERS` is set. Platform CMakeLists files conditionally invoke this helper for all host runtime and simulation kernel targets. |
| Compiler pipeline sanitizer threading `simpler_setup/runtime_compiler.py`, `simpler_setup/kernel_compiler.py` | `RuntimeCompiler` stores sanitizer tokens in `_sanitizers` and emits CMake definitions only for host toolchains via `_sanitizer_cmake_args()`. `BuildTarget.gen_cmake_args()` accepts optional `sanitizers` parameter. `KernelCompiler._sanitizer_flags()` produces host-only compiler flags (frame pointers, `-O1`) and appends to orchestration and simulation kernel compilation commands. |
| Build system sanitizer configuration entry point `CMakeLists.txt`, `simpler_setup/build_runtimes.py` | Root `CMakeLists.txt` adds `SIMPLER_SANITIZER` cache option (default `none`) and threads it to `build_runtimes.py` CLI. Build script accepts `--sanitizer`, resolves/validates via `simpler_setup.sanitizers`, configures `RuntimeCompiler._sanitizers`, and passes to CMake configure for host targets. |
| Test framework sanitizer configuration and verification `conftest.py`, `simpler_setup/scene_test.py` | Pytest `conftest.py` adds `--sanitizer` CLI option, configures `KernelCompiler._sanitizers`, and verifies expected sanitizer runtime is mapped in-process (via `/proc/self/maps` on Linux) before running tests. Scene test runner adds `--sanitizer` CLI support and propagates setting to subprocess calls. Both modules raise `pytest.UsageError` with OS-specific `LD_PRELOAD` re-run instructions on failure. |
| CI workflow and user documentation `.github/workflows/ci.yml`, `docs/ci.md`, `docs/testing.md` | Adds nightly scheduled workflow trigger (18:00 UTC / 02:00 Beijing) that runs only new `sanitizer-sim` job. Job matrices over ASAN/TSAN and a2a3sim/a5sim, configures with `SIMPLER_SANITIZER`, preloads sanitizer runtimes via `LD_PRELOAD`, sets sanitizer environment options, and runs extended-timeout scene tests. Documents nightly behavior, sanitizer/runtime matching, `LD_PRELOAD` usage, and TSAN/ASAN mutual exclusion. |
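The in-process check via `/proc/self/maps` mentioned in the table could be approximated as follows. This is a sketch under assumptions: the helper names `runtime_is_mapped`, `read_self_maps`, and `preload_hint` are hypothetical, and the hint string is only one plausible way to phrase the re-run instruction (it relies on the compiler's `-print-file-name` option to locate the runtime library).

```python
from __future__ import annotations

from pathlib import Path


def runtime_is_mapped(lib_stem: str, maps_text: str) -> bool:
    """True if a mapped file's basename starts with lib_stem, e.g. 'libasan.so'."""
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6 and Path(fields[5]).name.startswith(lib_stem):
            return True
    return False


def read_self_maps() -> str:
    """Read this process's mappings; empty string where /proc is unavailable."""
    try:
        return Path("/proc/self/maps").read_text()
    except OSError:
        return ""


def preload_hint(lib_stem: str, compiler: str = "g++-15") -> str:
    """Build a re-run hint; the exact wording used by the project is not shown here."""
    return f"LD_PRELOAD=$({compiler} -print-file-name={lib_stem}) <your test command>"


if not runtime_is_mapped("libasan.so", read_self_maps()):
    print("ASAN runtime not loaded; try:", preload_hint("libasan.so"))
```

Matching on the basename prefix rather than the full path tolerates versioned file names such as `libasan.so.8`.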

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

#904: This PR directly implements the sanitizer infrastructure and CI job integration proposed in the issue, including CMake sanitizer toggles, toolchain support, compiler/runtime configuration, and nightly sanitizer-sim job.

Poem

🐰 Through the build in silence creeping,
Sanitizer threads come leaping—
ASAN guards and TSAN calls,
Validating through the halls!
Tokens dance in CMake's might,
Making host-side catches bright. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 76.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'Add: opt-in compiler sanitizers (ASAN/UBSan + TSAN) for host targets' accurately and specifically describes the main change in the PR. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description comprehensively explains the sanitizer implementation, design decisions, scope, and testing approach directly relevant to the changeset. |

gemini-code-assist

Code Review

This pull request introduces compiler sanitizer support (ASAN, TSAN, and UBSAN) for host-compiled targets, including the simulation runtime, kernels, orchestration, and onboard host runtime. It adds build-time and run-time configuration options, a centralized sanitizer helper module, and updates to the testing documentation and CI schedule. Feedback on these changes suggests adding --no-build-isolation to the documented pip command in CMakeLists.txt, importing from __future__ import annotations in Python files using PEP 585 generic collections to ensure compatibility with Python versions earlier than 3.10, and quoting CMake path variables like ${SIMPLER_CMAKE_DIR} to robustly handle directories containing spaces.

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/ci.yml:
- Around line 561-575: The checkout step currently uses actions/checkout@v5 and
setup-python@v6 unpinned and leaves the workflow token persisted; update the
checkout invocation referenced as "actions/checkout" to include
persist-credentials: false and replace both "uses: actions/checkout@v5" and
"uses: actions/setup-python@v6" with their corresponding immutable commit SHAs
(pin to a specific commit SHA) so the workflow uses pinned action versions and
does not persist credentials.
- Around line 6-9: The workflow-level schedule cron currently triggers the
entire workflow (causing jobs like pre-commit, packaging-matrix, ut, and
detect-changes to run on cron), but only sanitizer-sim was intended; update the
workflow so that either the sanitizer-sim job is the only one gated to the
schedule or move the cron into a separate workflow: add explicit job-level
guards such as if: github.event_name == 'pull_request' to jobs pre-commit,
packaging-matrix, ut, and detect-changes (or relocate the schedule entry into a
new workflow containing only the sanitizer-sim job) so cron runs only execute
the intended sanitizer-sim job.

In `@simpler_setup/sanitizers.py`:
- Around line 51-58: The validate function currently allows the 'thread'
sanitizer everywhere; update validate(tokens: str) to detect non-Linux hosts and
raise a ValueError when 'thread' (TSAN) is requested on non-Linux platforms:
check the platform (e.g., via sys.platform or platform.system()) at the start of
validate and if 'thread' is present in toks and the platform is not Linux, raise
a clear ValueError explaining TSAN is Linux-only; keep the existing
_THREAD_INCOMPATIBLE check and error message for the other incompatibilities.
- Around line 61-75: The preload_lib function currently ignores the raw token
"leak", so callers like resolve() that accept raw token lists never trigger
preloading for leak builds; update preload_lib(tokens: str) to treat "leak" the
same as "address" (return "libasan.so") or alternatively add upfront validation
in resolve() to reject unsupported raw tokens (including "leak") with a clear
error. Locate the preload_lib function and either add "leak" to the toks checks
alongside "address" or implement token validation in resolve() to raise on
unsupported tokens so the sanitizer runtime requirement isn't silently missed.
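The two `simpler_setup/sanitizers.py` findings above (a Linux-only guard for TSAN, and rejecting raw tokens that have no preload mapping) could be addressed along these lines. This is an illustrative sketch only: `check_tokens` and `SUPPORTED_TOKENS` are hypothetical names, and the supported set is an assumption rather than the project's actual list.

```python
from __future__ import annotations

import sys

# Assumption: only these raw tokens have a known runtime/preload mapping.
SUPPORTED_TOKENS = frozenset({"address", "undefined", "thread"})


def check_tokens(tokens: str, platform: str = sys.platform) -> set[str]:
    """Validate a comma-separated token list up front and return it as a set."""
    toks = {t.strip() for t in tokens.split(",") if t.strip()}
    unsupported = sorted(toks - SUPPORTED_TOKENS)
    if unsupported:
        # Fail loudly instead of silently skipping the preload requirement.
        raise ValueError(f"unsupported sanitizer token(s): {', '.join(unsupported)}")
    if "thread" in toks and not platform.startswith("linux"):
        raise ValueError("TSAN (-fsanitize=thread) is treated as Linux-only here")
    return toks
```

Passing `platform` as a parameter keeps the guard testable on any host.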

In `@simpler_setup/scene_test.py`:
- Around line 1355-1368: After resolving and validating the sanitizer tokens via
_san_resolve and _san_validate, perform the same "sanitizer runtime preloaded"
guard that the conftest path uses before assigning KernelCompiler._sanitizers;
if the runtime is not preloaded, call parser.error with an explanatory message
(similar to the conftest guard) so standalone runs (invoking _san_tokens and
KernelCompiler._sanitizers) fail early with an actionable error instead of
hitting dlopen/runtime failures.
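For the standalone path, the early guard suggested above might be wired into argument parsing roughly as shown here. The function `parse_args` and the injected `is_preloaded` callable are hypothetical; the real `scene_test.py` resolves and validates tokens through its own helpers, which are not reproduced.

```python
from __future__ import annotations

import argparse
from typing import Callable


def parse_args(argv: list[str], is_preloaded: Callable[[str], bool]) -> argparse.Namespace:
    """Parse --sanitizer and fail early if its runtime is not preloaded."""
    parser = argparse.ArgumentParser(prog="scene_test")
    parser.add_argument(
        "--sanitizer",
        default="none",
        choices=["none", "asan", "ubsan", "tsan"],
    )
    args = parser.parse_args(argv)
    if args.sanitizer != "none" and not is_preloaded(args.sanitizer):
        # parser.error prints usage plus the message, then exits with status 2.
        parser.error(
            f"--sanitizer {args.sanitizer} needs the matching sanitizer runtime "
            "preloaded; re-run with LD_PRELOAD set"
        )
    return args
```

Failing in the parser keeps the error next to the flag that caused it, before any kernel compilation or `dlopen` is attempted.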

📥 Commits

Reviewing files that changed from the base of the PR and between d48fe39 and aa275d2.

📒 Files selected for processing (22)

.github/workflows/ci.yml
CMakeLists.txt
cmake/sanitizers.cmake
conftest.py
docs/ci.md
docs/testing.md
simpler_setup/build_runtimes.py
simpler_setup/kernel_compiler.py
simpler_setup/runtime_compiler.py
simpler_setup/sanitizers.py
simpler_setup/scene_test.py
simpler_setup/toolchain.py
src/a2a3/platform/onboard/host/CMakeLists.txt
src/a2a3/platform/sim/aicore/CMakeLists.txt
src/a2a3/platform/sim/aicpu/CMakeLists.txt
src/a2a3/platform/sim/host/CMakeLists.txt
src/a5/platform/onboard/host/CMakeLists.txt
src/a5/platform/sim/aicore/CMakeLists.txt
src/a5/platform/sim/aicpu/CMakeLists.txt
src/a5/platform/sim/host/CMakeLists.txt
src/common/log/CMakeLists.txt
src/common/sim_context/CMakeLists.txt

Closes part of hw-native-sys#904. Sanitizers are host-toolchain-only, threaded through a single --sanitizer parameter (no env vars), covering runtime + per-test kernels/orchestration, plus a nightly CI sweep.

- cmake/sanitizers.cmake: generic helper, -fsanitize=${SIMPLER_SANITIZERS}; -Wno-error=tsan so TSAN's "atomic_thread_fence not supported" warning doesn't fail the AICPU target's -Werror build (known/acceptable TSAN limitation)
- simpler_setup/sanitizers.py: preset table + mutual-exclusion validation + preload lib/command + host_cxx; one source of truth (unit-tested)
- toolchain.py: is_host marker; GxxToolchain prefer_g15: under a sanitizer the sim runtime/helpers/orchestration build with g++-15 to MATCH the sim kernels (mixing g++ and g++-15 sanitizer runtimes is an ABI mismatch that fails at .so load). Onboard host stays g++ (its kernels are device, never instrumented)
- runtime + kernel_compiler: --sanitizer -> host targets only (is_host gate)
- conftest.py + scene_test.py: --sanitizer compiles kernels to match, validates, and BOTH (pytest + standalone) fail fast with the exact LD_PRELOAD command
- top CMakeLists: SIMPLER_SANITIZER cache var (pip install --config-settings)
- .github/workflows/sanitizers.yml: SEPARATE nightly workflow (schedule + workflow_dispatch) so cron fires ONLY the sanitizer jobs, never the PR/self-hosted pipeline (asan/tsan × a2a3sim/a5sim, ubuntu-only)
- docs/testing.md + ci.md; tests/ut/py/test_sanitizers.py

Sanitizers are Linux-only (macOS gcc-15's libasan isn't found by Apple ld).

Validated on Linux (docker): a2a3sim ASAN build + run, dynamic_register 6/6 green (~1.7x overhead); TSAN host/aicpu/aicore build. The pre-existing sim-oversubscription flake is more frequent under ASAN's slower timing, so the nightly job uses a generous timeout and is not a PR gate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

gemini-code-assist Bot reviewed May 30, 2026

Comment thread CMakeLists.txt Outdated

Comment thread simpler_setup/kernel_compiler.py

Comment thread simpler_setup/runtime_compiler.py

Comment thread src/common/log/CMakeLists.txt Outdated

Comment thread src/common/sim_context/CMakeLists.txt Outdated

coderabbitai Bot reviewed May 30, 2026

Comment thread .github/workflows/ci.yml Outdated

Comment thread .github/workflows/ci.yml Outdated

Comment thread simpler_setup/sanitizers.py

Comment thread simpler_setup/sanitizers.py

Comment thread simpler_setup/scene_test.py

ChaoWao force-pushed the feat/sanitizers branch 2 times, most recently from a6e1c15 to bd91022 Compare May 30, 2026 10:09

ChaoWao force-pushed the feat/sanitizers branch from bd91022 to 58d5bad Compare May 30, 2026 12:05

ChaoWao merged commit fa4a993 into hw-native-sys:main May 30, 2026
16 checks passed

ChaoWao mentioned this pull request May 31, 2026

CI: make the nightly sanitizer sweep pass (ASAN gate, TSAN informational) #931

Open

5 tasks

ChaoWao merged 1 commit into hw-native-sys:main from ChaoWao:feat/sanitizers
