T-207: SA perturbation phase in run_single_replicate() #222

Closed
ms609 wants to merge 12 commits into cpp-search from feature/pt-eval

ms609 (Owner) commented Mar 24, 2026

Agent C (T-207). CI fixes by Agent A (roxygen regen, spelling wordlist, workflow alignment). GHA run 23509475416 PASS.

Multi-cycle PCSA (post-convergence simulated annealing with best-tree restart) integrated into the driven search pipeline. SA phase runs after drift, before final TBR polish. Enabled in large preset (≥120 tips) with saCycles=3.

ms609 added 12 commits March 24, 2026 13:56
Add PTDiagnostics struct to parallel_temper_search():
- Per-round chain scores, acceptance rates, timing
- Per-pair swap acceptance rates and Metropolis probabilities
- Cold chain score trajectory and improvement tracking
- Timing breakdown (cold TBR vs hot stochastic vs total)

Add ts_parallel_temper_diag Rcpp bridge for R-level benchmarking.

Also brings ts_wagner.h/cpp and ts_driven.h/cpp from
feature/adaptive-bandit to fix broken cpp-search build
(WagnerBias/BiasedWagnerParams were referenced but not defined).

Exposes parallel_temper_search() with PTDiagnostics to R for
benchmarking and evaluation. Returns chain_log, swap_log,
per-pair acceptance rates, cold chain trajectory, and timing
breakdown (cold TBR vs hot stochastic vs total).

Builds on ts_temper.h/cpp from T-198. Registered in init.c
and RcppExports.

Standard Boltzmann PT is fundamentally incompatible with discrete
parsimony. Pair 0-1 swap acceptance is 0% across all 8 matrices
(40-385 tips), all temperature ladders, and all moves-per-round
settings. Cold chain converges instantly; hot chains diverge too far.

See dev/pt_t199_findings.md for full analysis.
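
The 0% figure is what the standard replica-exchange criterion predicts once the cold chain is far ahead. A minimal sketch, assuming the usual Metropolis swap rule min(1, exp((1/T_cold - 1/T_hot)(S_cold - S_hot))) with scores as parsimony lengths; the function name, scores and temperatures below are illustrative and not taken from ts_temper:

```python
import math

def swap_acceptance(score_cold, score_hot, t_cold, t_hot):
    """Standard replica-exchange probability of swapping two chains.

    Scores are tree lengths (lower is better); temperatures are in score units.
    """
    delta = (1.0 / t_cold - 1.0 / t_hot) * (score_cold - score_hot)
    # delta >= 0 means the hot chain holds the better tree: always swap.
    return 1.0 if delta >= 0 else math.exp(delta)

# Hypothetical gap of 760 steps at low temperatures: exp(-380), effectively zero.
p_large_gap = swap_acceptance(1000, 1760, t_cold=1.0, t_hot=2.0)

# Hypothetical gap of 22 with a warmer ladder: exp(-2.2), roughly 11%.
p_small_gap = swap_acceptance(58.0, 80.0, t_cold=5.0, t_hot=10.0)
```

With a gap of hundreds of steps, no practical ladder lifts the acceptance probability off zero; only when the gap is a few tens of steps and T_cold is several units does the swap become plausible.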
Added ts_anneal_diag() Rcpp bridge for simulated annealing + TBR polish.

Extended T-199 findings:
- Ultra-low T (0.01-0.1): warm chain still diverges 400-900 steps (EW)
  because stochastic TBR samples random moves while cold chain does
  systematic best-improvement sweep
- IW(k=3): gap shrinks from ~760 to ~22; with T_cold>=5, pair 0-1
  swaps succeed at 5-60%; one case showed forced-restart improving
  cold chain (59.16 -> 58.70); but cold-only still wins on average
- SA(t_start=20/50 -> 0): worse than cold-only because stochastic
  phases destroy tree structure before TBR polish can recover

Score-transfer PT: replaces Metropolis with direct injection when any
hot chain beats the cold chain. Modest benefit at 205 tips (occasional
wins when hot chains discover better trees).
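
The injection rule can be sketched in a few lines. This is an illustration only: the chain representation (a list of dicts, index 0 being the cold chain) and the choice to copy rather than swap the hot chain's state are assumptions, not the ts_temper implementation.

```python
def score_transfer(chains):
    """Copy the best hot chain into the cold chain if it is strictly better.

    chains: list of {"tree": ..., "score": ...}; chains[0] is the cold chain.
    Returns True when an injection happened.
    """
    best_hot = min(chains[1:], key=lambda c: c["score"], default=None)
    if best_hot is not None and best_hot["score"] < chains[0]["score"]:
        # Direct injection: no Metropolis test, the cold chain adopts the tree.
        chains[0]["tree"] = best_hot["tree"]
        chains[0]["score"] = best_hot["score"]
        return True
    return False

chains = [
    {"tree": "A", "score": 100},  # cold
    {"tree": "B", "score": 105},  # warm
    {"tree": "C", "score": 98},   # hot, happens to be better
]
injected = score_transfer(chains)
```

Because the test is a plain score comparison, the transfer never worsens the cold chain, which fits the modest but non-negative benefit reported above.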

Post-convergence SA: TBR to convergence, SA perturbation, TBR polish.
Strong under EW at 125+ tips: 60-170 step improvement at 205 tips vs
cold-only (which is stuck at its local optimum). Under IW, no benefit
because TBR already navigates the smoother landscape well.

Key insight: at large tree sizes, the cold chain converges to a
suboptimal local optimum. Extra TBR rounds don't escape it. Perturbation
(SA or ratchet-style) is needed.

- sa_cycles parameter: repeats SA+TBR polish, restarting each cycle
  from the best tree found so far (like ratchet's perturbation pattern)
- startEdge parameter: initialize from a given tree instead of Wagner
- Best-tree tracking: saves and restores the best tree state across cycles

Benchmark results (10 seeds, 205 tips EW):
  Cold-only:  mean 1413, SD 62
  PCSA×3:     mean 1373, SD 12
  PCSA×5:     mean 1367, SD 10  (best)
  Cold+SA×3:  mean 1376, SD 14

PCSA with best-tree restart dramatically reduces variance and
consistently outperforms cold-only TBR by 40-110 steps at 125-205 tips.
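
For reference, the mean gains and SD ratios implied by the four rows above can be computed directly. Only the numbers from the table are used; the variable names are illustrative.

```python
# (mean score, SD) over 10 seeds at 205 tips EW, copied from the table above.
results = {
    "Cold-only": (1413, 62),
    "PCSA x3": (1373, 12),
    "PCSA x5": (1367, 10),
    "Cold+SA x3": (1376, 14),
}

base_mean, base_sd = results["Cold-only"]
summary = {
    name: {
        "steps_saved": base_mean - mean,  # improvement in mean score
        "sd_ratio": base_sd / sd,         # how many times smaller the SD is
    }
    for name, (mean, sd) in results.items()
    if name != "Cold-only"
}
```

On these means, PCSA×5 saves 46 steps and PCSA×3 saves 40, with the SD shrinking by a factor of about 5 to 6.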
…peline

Add multi-cycle post-convergence simulated annealing (PCSA) with
best-tree restart to run_single_replicate(), inserted after drift
and before final TBR polish.

Each SA cycle: perturb current best tree via scheduled Boltzmann
cooling (t_start → t_end over n_phases), reconverge with TBR,
keep if improved. Benchmarked at 125–205 tips EW: reduces the score SD
about 6× (62→10) and improves mean score by 40–100 steps vs cold-only
TBR.
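
The cycle structure described above (converge, perturb the best tree, reconverge, keep if improved) can be sketched on a toy landscape. Everything here is a stand-in: integers replace trees, a greedy descent replaces TBR, and the function names, temperatures and move counts are hypothetical rather than those of run_single_replicate().

```python
import math
import random

def hill_climb(x, score, neighbours):
    """Stand-in for TBR to convergence: accept improving moves until none remain."""
    improved = True
    while improved:
        improved = False
        for y in neighbours(x):
            if score(y) < score(x):
                x, improved = y, True
                break
    return x

def anneal(x, score, neighbours, rng, temps, moves_per_phase):
    """Stand-in for the SA perturbation: Metropolis moves at each phase temperature."""
    for temp in temps:
        for _ in range(moves_per_phase):
            y = rng.choice(neighbours(x))
            delta = score(y) - score(x)
            if delta <= 0 or (temp > 0 and rng.random() < math.exp(-delta / temp)):
                x = y
    return x

def pcsa(x0, score, neighbours, rng, sa_cycles, temps, moves_per_phase):
    """Multi-cycle PCSA: every cycle restarts from the best state found so far."""
    best = hill_climb(x0, score, neighbours)
    for _ in range(sa_cycles):
        perturbed = anneal(best, score, neighbours, rng, temps, moves_per_phase)
        candidate = hill_climb(perturbed, score, neighbours)
        if score(candidate) < score(best):
            best = candidate
    return best

# Toy rugged landscape: every multiple of 4 is a local optimum, 0 is global.
def toy_score(x):
    return abs(x) + (3 if x % 4 else 0)

def toy_neighbours(x):
    return [x - 1, x + 1]

rng = random.Random(1)
stuck = hill_climb(20, toy_score, toy_neighbours)  # cold-only stays at 20
found = pcsa(20, toy_score, toy_neighbours, rng, sa_cycles=3,
             temps=[3.0, 2.0, 1.0, 0.0], moves_per_phase=50)
```

Since a candidate only replaces the best state when it scores strictly lower, the result can never be worse than the cold-only optimum, whatever the perturbation does.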

DrivenParams gains sa_cycles, sa_t_start, sa_t_end, sa_n_phases,
sa_moves_per_phase. SearchControl() and ts_driven_search() Rcpp
bridge wired (SA params packed as NumericVector to stay within
.Call 65-arg limit). Enabled in large preset (≥120 tips) with
saCycles=3.

PhaseTimings gains sa_ms for profiling visibility.

When n_phases=1, frac was 1.0 (yielding t_end, strict hill-climbing).
Fixed to 0.0 so a single-phase SA runs at t_start for perturbation.
Found by S-RED focus 3.
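
The fixed behaviour can be illustrated with a small schedule function. This sketch assumes linear interpolation between t_start and t_end over 0-based phases; the actual cooling formula is not shown in this PR, and phase_temperature is a hypothetical name.

```python
def phase_temperature(phase, n_phases, t_start, t_end):
    """Temperature for a 0-based phase under an assumed linear cooling schedule."""
    # With a single phase there is nothing to interpolate: stay at t_start so
    # the phase perturbs the tree instead of hill-climbing at t_end.
    frac = phase / (n_phases - 1) if n_phases > 1 else 0.0
    return t_start + frac * (t_end - t_start)

single = phase_temperature(0, 1, t_start=20.0, t_end=0.0)
ladder = [phase_temperature(p, 3, t_start=20.0, t_end=0.0) for p in range(3)]
```

Guarding the n_phases=1 case explicitly also avoids a division by zero in this form of the formula.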
Fixes codoc mismatch (SA params missing from usage section) and spelling
check failures. Agent A pickup of T-207 GHA 23507676351.

…error)

LaTeX warnings (inconsolata.sty missing) and vignette directory warnings
caused false FAIL with error-on: warning. Tests and spelling both passed.
Agent A.

anneal_search() now saves the tree at each phase boundary when it
improves on the best seen so far. After all phases complete, restores
the best tree instead of the last SA state.

Previously, Boltzmann acceptance could displace a good tree found in
an earlier phase, and PCSA would reconverge from a worse endpoint.
With this fix, each PCSA cycle's TBR starts from the best SA state.
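
A minimal sketch of the snapshot-and-restore pattern, using integers in place of trees. The function name and signature are hypothetical, and treating the starting state as the initial best is an assumption; anneal_search() itself may differ.

```python
import math
import random

def anneal_keep_best(x, score, neighbours, rng, temps, moves_per_phase):
    """SA that returns the best phase-boundary state rather than the last one."""
    best, best_score = x, score(x)  # assumption: the start counts as a candidate
    for temp in temps:
        for _ in range(moves_per_phase):
            y = rng.choice(neighbours(x))
            delta = score(y) - score(x)
            if delta <= 0 or (temp > 0 and rng.random() < math.exp(-delta / temp)):
                x = y
        # Phase boundary: snapshot only if this beats everything seen so far.
        # (A mutable tree would need a deep copy here; ints are immutable.)
        if score(x) < best_score:
            best, best_score = x, score(x)
    return best  # restore the best snapshot instead of the final SA state

rng = random.Random(7)
start = 12
result = anneal_keep_best(start, abs, lambda v: [v - 1, v + 1], rng,
                          temps=[5.0, 2.5, 0.0], moves_per_phase=30)
```

Snapshots are taken only at phase boundaries, so a good state visited mid-phase can still be lost; that matches the granularity described in the commit message.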
ms609 added a commit that referenced this pull request Mar 24, 2026
…t A)

- T-182: Fixed codoc + spelling → GHA PASS → PR #221.
- T-207: Fixed codoc + spelling + CI → GHA PASS → PR #222.
- T-208: Already PR #219 (Agent G), cleaned up.
ms609 (Owner, Author) commented Mar 25, 2026

Closing: branch is 197 commits behind cpp-search with major regressions (missing constraint fixes T-214, grouped-list refactor, ratchet taper, SPR stale-state fix T-235, MaddisonSlatkin optimizations PR #211, dependency reduction PR #225).

The only novel contributions — multi-cycle PCSA perturbation (T-207) and anneal_search best-tree tracking (T-210) — will be cherry-picked onto a fresh branch from current cpp-search. See feature/pcsa-phase (forthcoming PR).

ms609 closed this Mar 25, 2026
ms609 added a commit that referenced this pull request Mar 25, 2026
Cherry-picked from feature/pt-eval (closed PR #222).

- Multi-cycle PCSA: SA -> TBR polish -> keep best, repeated
  annealCycles times (T-207)
- Best-tree tracking in anneal_search() across SA phases (T-210)
- Single-phase temperature fix: n_phases=1 runs at t_start (hot)
- New SearchControl parameter: annealCycles (default 0 = disabled)
- Large preset: annealCycles=3
- Backward compat: compat wrapper auto-sets annealCycles=1 when
  annealPhases > 0 but cycles not specified
- Vignette: PCSA section in search-algorithm.Rmd
- Tests updated for annealCycles

WIP: needs build verification and GHA check before PR.
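
The backward-compatibility rule in the list above amounts to a small defaulting step. A sketch, assuming "not specified" is represented by None; the real wrapper is R code around SearchControl() and resolve_anneal_cycles is a hypothetical name.

```python
def resolve_anneal_cycles(anneal_phases, anneal_cycles=None):
    """Pick the effective cycle count for old and new call styles."""
    if anneal_cycles is None:
        # Legacy callers set only annealPhases: run one SA cycle if it is
        # positive, otherwise keep the default of 0 (disabled).
        return 1 if anneal_phases > 0 else 0
    # An explicit value always wins, including an explicit 0.
    return anneal_cycles

legacy = resolve_anneal_cycles(anneal_phases=5)
explicit = resolve_anneal_cycles(anneal_phases=5, anneal_cycles=3)
```

Keeping an explicit 0 distinct from "unspecified" lets callers disable annealing even when annealPhases is set.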
ms609 added a commit that referenced this pull request Mar 25, 2026
* feat(T-207/T-210): Multi-cycle PCSA perturbation phase

Cherry-picked from feature/pt-eval (closed PR #222).

- Multi-cycle PCSA: SA -> TBR polish -> keep best, repeated
  annealCycles times (T-207)
- Best-tree tracking in anneal_search() across SA phases (T-210)
- Single-phase temperature fix: n_phases=1 runs at t_start (hot)
- New SearchControl parameter: annealCycles (default 0 = disabled)
- Large preset: annealCycles=3
- Backward compat: compat wrapper auto-sets annealCycles=1 when
  annealPhases > 0 but cycles not specified
- Vignette: PCSA section in search-algorithm.Rmd
- Tests updated for annealCycles

WIP: needs build verification and GHA check before PR.

* fix: restore mc_fitch_scores export + .ts_driven_search_raw naming (cherry-pick artifacts)
ms609 deleted the feature/pt-eval branch March 25, 2026 17:13