ci: fix gcc-ASAN workflow — adopt r-hub container on x86#262
Merged
Conversation
main's ASan.yml never received the vm.mmap_rnd_bits=28 fix from 867df93 (2026-03-24) -- that commit, and everything after it (container/arm experiments), landed only on unmerged feature branches. Main was still on the pre-fix Sept-2025 config, so GitHub's current ubuntu-24.04 runners hit "Shadow memory range interleaves with an existing memory mapping" (google/sanitizers#856) on every dispatch. Also add remotes to Config/Needs/memcheck: the vignettes leg was failing separately on "The package 'remotes' is required" from devtools::build_vignettes(). Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
vm.mmap_rnd_bits=28 (the original 867df93 fix) no longer prevents the shadow-memory clash on today's ubuntu-24.04 runners -- verified run 28644456099 still aborted with "Shadow memory range interleaves" in both tests and examples legs even with the sysctl applied (confirmed applied: vm.mmap_rnd_bits = 28 in the log). The March fix was already probabilistic ("vignettes happened to avoid the collision"), not deterministic. Lower rnd_bits to 8 and additionally wrap the ASAN Rscript invocation in `setarch $(uname -m) -R`, which disables ASLR outright (personality ADDR_NO_RANDOMIZE) for that process tree -- a deterministic fix rather than a smaller random window. Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
28 is the minimum vm.mmap_rnd_bits the kernel accepts on x86_64 (sysctl rejected 8 with "Invalid argument", killing all three legs in ~13s). Keep the sysctl at 28 and rely on the setarch -R wrapper from 9a2ef4c as the actual deterministic ASLR fix. Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
… path The bare-runner + manual-makevars approach on main cannot be made reliable: it hits a kernel ASLR/shadow-memory clash (google/sanitizers #856, "Shadow memory range interleaves") that neither sudo sysctl vm.mmap_rnd_bits=28 (commits 8a5de02) nor `setarch -R` to disable ASLR outright (9a2ef4c/f8b1ac05) prevents on today's ubuntu-24.04 runners — both were verified to still abort before reaching any R code. Switch to the r-hub gcc-asan container via ms609/actions/asan@main, which ships R-devel already built with ASAN/UBSAN and sidesteps the shadow-memory clash by construction. This is the exact config TreeTools ran green for 42 min on 2026-06-16 (commit decfe3e9), before an unrelated ubuntu-latest->ubuntu-24.04-arm switch broke it (the r-hub image has no arm64 variant; the container dies at init). We stay on ubuntu-latest (x86) and add exclude-packages: Rogue to skip Rogue's ~30-min Rfast source build. Also reverts the now-unnecessary DESCRIPTION Config/Needs/memcheck remotes addition: the action installs devtools with its hard deps (which include remotes), so build_vignettes() finds it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Verified run 28646564439 under the r-hub container: tests and examples passed clean under real ASAN, but vignettes failed on "The package 'remotes' is required" from devtools::build_vignettes(). devtools 2.5.0+ moved remotes to Suggests, so the action's hard-deps install of Config/Needs/memcheck (devtools) does not pull it. Name it explicitly. (TreeTools does not hit this — its memcheck uses pkgdown, not devtools.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ms609
added a commit
that referenced
this pull request
Jul 3, 2026
The r-hub gcc-asan container image has no arm64 variant; on ubuntu-24.04-arm the container dies at init before any R work runs (every run since 2026-06-19 failed in ~30-40s). Cherry-picked from TreeSearch main's fix (PR #262) so the T-329 fix can be verified under a working ASan CI.
ms609
added a commit
that referenced
this pull request
Jul 4, 2026
* fix(constraint): reject impossible constraints, guard collapsed-split OOB (T-329) An impossible (non-laminar) constraint reached random_constrained_tree() unfiltered: .PrepareConstraint() only filtered splits by size, never checked pairwise compatibility. In random_constrained_tree(), a split that loses all its tips to tighter non-laminar splits collapses (split_root = -1); if it still has a strict-superset parent, the parent's child-split loop pushed that -1 as a phantom item, causing tree.parent[-1] and tree.left/right[n_tip-1] out-of-bounds writes. Primary fix: .PrepareConstraint() now validates pairwise split compatibility (disjoint or nested) and stops with a clear error before reaching the C++ builder. Defensive fix: guard split_root[j] >= 0 in random_constrained_tree()'s child-split collection, mirroring the existing root-level guard. Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com> * ci(ASan): switch mem-check runner to ubuntu-latest (x86) The r-hub gcc-asan container image has no arm64 variant; on ubuntu-24.04-arm the container dies at init before any R work runs (every run since 2026-06-19 failed in ~30-40s). Cherry-picked from TreeSearch main's fix (PR #262) so the T-329 fix can be verified under a working ASan CI. * ci(asan): install phangorn for tests leg, exclude MaxMin from vignettes leg The tests leg only installs hard deps + Config/Needs/memcheck (Suggests are skipped for tests/examples), but tests/testthat/*.R make heavy unguarded use of phangorn::phyDat() etc. -> "there is no package called 'phangorn'". Add it to Config/Needs/memcheck so it installs regardless of leg. The vignettes leg installs all immediate Suggests via a single pak::pkg_install() batch, which resolves plain package names against CRAN/Bioconductor only - it doesn't see the local DESCRIPTION's Remotes mapping. MaxMin is GitHub-only (Remotes: ms609/MaxMin), so that batched solve fails outright with "Can't find package called MaxMin", taking every other Suggest down with it. Exclude MaxMin, mirroring the existing Rogue exclusion: its only use (WideSample()) is requireNamespace-guarded and no vignette/example exercises it. * fix: guard zero-length memcpy in TreeState::load_tip_states Datasets whose characters are all uninformative (e.g. every observed state is a singleton) collapse to zero blocks and a zero-length tip_states vector in build_dataset(). std::vector::data() on an empty vector may legally return nullptr, which memcpy's nonnull-parameter attribute forbids — flagged by UBSan (gcc-ASAN run 28662381835) as a null-pointer violation reached via ts_char_steps() -> init_from_edge() -> load_tip_states(). Skip the copy when there is nothing to copy. Covered by the existing "ConcordantInformation() works" regression test in test-Concordance.R, which already exercises an all-singleton dataset via ConcordantInformation() -> CharacterLength() -> ts_char_steps(). Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com> * fix(constraint): use four-gamete split compatibility, not laminar-subset (T-329 follow-up) T-329 (19f4f97) added a laminarity gate to .PrepareConstraint() that tested each constraint character's "1"-side as a full bipartition vs everything-else and accepted only disjoint-or-nested pairs. This rejected SATISFIABLE constraints: it (1) folded wildcard "?" and unconstrained tips into the "0" side though they are free to plot either side, and (2) tested only three of the four split-compatibility cases, missing the case where the two "0" sides are disjoint (the "1" sides jointly cover the constrained tips). The gcc-ASAN CI (run 28662381835) surfaced this on the pre-existing Morphy test and the tree-search vignette "complex-constraints" chunk (ab|cef & abcd|ef, g free) -- both displayable on ((a,b),(d,(c,(e,f))))+g. T-329's own "impossible" example {t1,t2,t3}&{t3,t4,t5} was likewise satisfiable (coexist on ((t1,t2),t3,(t4,t5))). Fix: build a companion 0-group matrix and apply the four-gamete test on both groups (wildcards excluded) -- incompatible iff all four group intersections are non-empty. Valid constraints return to their working pre-T-329 path; genuinely-impossible ones still error. The C++ OOB guard (ts_wagner.cpp split_root[j] >= 0) is the real crash fix and is retained unchanged. Tests: corrected the first T-329 case to a genuinely four-gamete-incompatible constraint ({t1,t2,t3}&{t2,t3,t4}); added a positive regression asserting the non-laminar-but-compatible case is accepted. Verified locally: Morphy suite, vignette chunk, and 446 constraint tests all green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(wagner): guard zero-Fitch-words start-tree build (all-hierarchy HSJ/xform) When every character belongs to a hierarchy block, the recoded equal-weights dataset has zero blocks, so DataSet::total_words == 0 and the per-word Fitch state vectors (ds.tip_states, edge_set) are empty. wagner_tree() unconditionally took the address of element 0 of those empty vectors (&ds.tip_states[tip * total_words] == &ds.tip_states[0], and likewise &edge_set[node * tw]) to feed the incremental insertion-cost machinery. That is undefined behaviour (address of a past-the-end element of an empty vector) and aborted under the hardened libstdc++ assertions in the gcc-ASAN CI (run 28662381835, ts-xform group; _GLIBCXX_ASSERTIONS operator[] __n < size()). With no Fitch characters every insertion cost is identically zero, so guard the per-word work behind have_words = ds.total_words > 0: skip the edge-set precompute and evaluate the indirect length as 0, while still running the DFS so constraints are honoured and the first legal edge is chosen. The search then optimises the tree from the Sankoff (xform) / hierarchy-DP (HSJ) term alone. The have_words == true path is behaviourally identical to before. Regression tests (test-ts-xform.R, test-ts-hsj.R) run an all-hierarchy dataset through MaximizeParsimony() so the hardened/ASAN CI covers the zero-words path. Verified locally with a -D_GLIBCXX_ASSERTIONS -O0 -g build: the whole ts-xform group (incl. SK-01, which previously aborted) and ts-hsj group now pass with no abort; a normal equal-weights search is unaffected. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
The
gcc-ASANworkflow onmainhas been broken anddisabled_manually.main'sASan.ymlwas frozen since Sept 2025 on the bare-runner +manual-makevars approach, which hits a kernel ASLR/shadow-memory clash on
current
ubuntu-24.04runners:This aborts before any R code runs. (Note: the arm-runner/r-hub-container
breakage seen on unmerged branches never affected
main— that was aseparate red herring.)
Rejected fixes (empirically)
Both standard ASLR workarounds were tried and verified to still abort:
sudo sysctl vm.mmap_rnd_bits=288a5de02setarch "$(uname -m)" -R(disable ASLR outright)9a2ef4c/f8b1ac0The shadow-memory clash is x86-bare-runner-specific and not reliably
fixable this way on today's runner images.
Fix
Adopt the r-hub
gcc-asancontainer onubuntu-latest(x86) viams609/actions/asan@main— the container ships R-devel already built withASAN/UBSAN and sidesteps the shadow-memory clash by construction. This is the
exact config TreeTools runs green (42 min on 2026-06-16, and a fresh 17-min
green run today). Adds
exclude-packages: Rogue(skips Rogue's ~30-minRfastsource build) and
remotestoConfig/Needs/memcheck(the vignettes leg'sdevtools::build_vignettes()needs it; devtools 2.5.0+ demotedremotestoSuggests).
Verification
tests + examples green under real ASAN; vignettes failed only on the
remotesdependency.all three legs green (tests/examples/vignettes) after the
remotesfix.No memory-safety findings surfaced in TreeSearch's own C++.
Merge note
Please squash-merge: the branch carries 3 abandoned sysctl/setarch commits
(kept for the empirical record) before the working container fix (
562dd70)and the
remotesfix (8e9c510).🤖 Generated with Claude Code