fix(checker): self-driving stack-depth measurement + CI guard (#37)#80

Merged
hyperpolymath merged 2 commits into main from claude/bold-noether-8uSuC
May 30, 2026
Conversation

@hyperpolymath
Owner

Closes the one remaining open DoD item on #37: the Windows-CI leg of the stack-budget measurement that confirms MAX_EXPR_DEPTH = 128 is safe on the 1 MiB msvc main-thread stack.

Background

Subtleties 1 & 3 of #37 (general iterative AST teardown; MAX_EXPR_DEPTH re-derived 256 → 128) were fixed in #43. That PR explicitly left #37 open for a single datapoint: "the Windows-CI leg of the stack measurement — needs CI access … Leave #37 open until that datapoint confirms the 128 budget on the msvc toolchain."

The blocker was that examples/measure_depth.rs was a single-shot probe — it ran one walk at one depth/stack and relied on a human reading the exit code across many runs. Nothing in CI drove it, so the Windows number was never produced.

What this does

examples/measure_depth.rs — rewritten into a self-driving measurement. A stack overflow aborts the process, so it can't be caught in-process; the example now re-execs itself as worker subprocesses and binary-searches the overflow cliff for each of: recursive Drop, the guarded check_expr walk, and the iterative teardown. It prints per-platform bytes/level and exits non-zero if MAX_EXPR_DEPTH no longer fits the 1 MiB floor with headroom. The old worker mode (measure_depth <mode> <depth> <stack_kib>) is preserved as the subprocess entry point.
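The cliff search described above can be sketched as a plain binary search over a "does this depth survive?" predicate. This is a minimal illustration, not the code in examples/measure_depth.rs: the name `find_cliff` and its signature are assumptions, and in the real driver the predicate would be a worker-subprocess run rather than a closure.

```rust
/// Returns the largest depth for which `survives` reports true.
///
/// Preconditions (assumed, not checked): `survives(lo)` is true,
/// `survives(hi)` is false, and the predicate is monotone, i.e. once a
/// depth overflows every larger depth overflows too.
fn find_cliff<F: FnMut(usize) -> bool>(mut lo: usize, mut hi: usize, mut survives: F) -> usize {
    // Invariant: `lo` survives, `hi` does not.
    while hi - lo > 1 {
        let mid = lo + (hi - lo) / 2;
        if survives(mid) {
            lo = mid; // still below the cliff
        } else {
            hi = mid; // at or past the cliff
        }
    }
    lo
}
```

Each probe costs one subprocess launch, so bisecting a range of a million depths needs about 20 launches rather than a linear scan.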

.github/workflows/stack-depth.yml — new. Runs the measurement + the regression test on ubuntu-latest and windows-latest (same SHA-pinned checkout / rust-toolchain actions as checker-scaling.yml). The Windows datapoint is now produced and locked in on every change — a future bump that breaks the budget fails CI on the affected OS.

crates/my-lang/tests/stack_depth_37.rs — new. Asserts, on a thread pinned to exactly 1 MiB (the Windows budget), that the depth-MAX_EXPR_DEPTH guarded walk and a 1,000,000-deep iterative teardown both survive. This is the OS-portable regression guard.
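A rough sketch of what such a guard can look like, under stated assumptions: `Node`, `build_chain`, `walk` and `run_on_1mib` are hypothetical stand-ins, not the crate's real AST or checker. It shows a thread pinned to 1 MiB via `std::thread::Builder::stack_size`, a depth-guarded recursive walk, and a `Drop` that tears the chain down iteratively.

```rust
use std::thread;

const MAX_EXPR_DEPTH: usize = 128;
const ONE_MIB: usize = 1024 * 1024;

/// Hypothetical stand-in for a nested expression node.
struct Node {
    child: Option<Box<Node>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        // Iterative teardown: detach each child before dropping it, so no
        // drop call ever recurses more than one level.
        let mut next = self.child.take();
        while let Some(mut node) = next {
            next = node.child.take();
            // `node` is dropped here with no child attached.
        }
    }
}

/// Builds a chain with `depth` nested children, without recursion.
fn build_chain(depth: usize) -> Node {
    let mut root = Node { child: None };
    for _ in 0..depth {
        root = Node { child: Some(Box::new(root)) };
    }
    root
}

/// Recursive walk that refuses to descend past MAX_EXPR_DEPTH.
fn walk(node: &Node, depth: usize) -> Result<usize, String> {
    if depth > MAX_EXPR_DEPTH {
        return Err(format!("expression nesting exceeds {MAX_EXPR_DEPTH}"));
    }
    match &node.child {
        Some(child) => walk(child, depth + 1),
        None => Ok(depth),
    }
}

/// Runs `f` on a thread whose stack is exactly 1 MiB. Returns None if the
/// thread panicked; a real stack overflow aborts the whole process instead.
fn run_on_1mib<T, F>(f: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    thread::Builder::new()
        .stack_size(ONE_MIB)
        .spawn(f)
        .expect("failed to spawn thread")
        .join()
        .ok()
}
```

Because an overflow kills the test process rather than failing an assertion, a guard of this shape reports a regression as a crashed test binary, which CI still treats as a failure.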

checker.rs — doc update. The MAX_EXPR_DEPTH doc-comment now records that the budget is reconfirmed automatically on both OSes, not by a one-off manual run.

Verification (Linux, this run)

recursive Drop      : cliff ~4016 levels @ 256 KiB  (~65 B/level)
checker @ depth 128 : fits in >= 111 KiB  (~888 B/level)
iterative teardown  : survives 1000000 levels @ 256 KiB: true
PASS: MAX_EXPR_DEPTH=128 is safe within 819 KiB of the 1024 KiB floor on linux.
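The per-level figures in this output follow from integer arithmetic on the measured sizes, sketched below. The helper names are hypothetical. The 80% headroom rule is an assumption, chosen only because it reproduces the 819 KiB shown above; the actual rule in measure_depth.rs is not quoted in this PR.

```rust
const MAX_EXPR_DEPTH: usize = 128;
const FLOOR_KIB: usize = 1024; // 1 MiB msvc main-thread stack

/// Average stack bytes consumed per nesting level.
fn bytes_per_level(stack_kib: usize, levels: usize) -> usize {
    stack_kib * 1024 / levels
}

/// ASSUMPTION: the usable budget is a fixed percentage of the floor.
fn budget_kib(floor_kib: usize, percent: usize) -> usize {
    floor_kib * percent / 100
}

/// True if the measured requirement fits inside the budget.
fn fits(needed_kib: usize, budget_kib: usize) -> bool {
    needed_kib <= budget_kib
}
```

With the Linux numbers above, `bytes_per_level(256, 4016)` gives 65, `bytes_per_level(111, MAX_EXPR_DEPTH)` gives 888, and `fits(111, budget_kib(FLOOR_KIB, 80))` holds.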

cargo test -p my-lang --test stack_depth_37 --release → 2 passed, 0 failed. The Windows leg runs once this lands and the windows-latest job executes.

DoD status (#37)

Once the Stack Depth (#37) workflow goes green on windows-latest, #37 can close.

https://claude.ai/code/session_013JnrUmkCpMHsABRmhLVyd2


Generated by Claude Code

claude added 2 commits May 30, 2026 20:55
Closes the one open DoD item on #37: the Windows-CI leg of
the stack-budget measurement confirming MAX_EXPR_DEPTH=128 is safe on the 1 MiB
msvc main-thread stack. Subtleties 1 & 3 were fixed in #43; the value was left
open until that datapoint, and nothing yet ran the probe.

- examples/measure_depth.rs: rewrite the single-shot probe into a self-driving
  measurement. It re-execs itself as worker subprocesses, binary-searches the
  overflow cliff for the recursive Drop, the guarded checker walk and the
  iterative teardown, prints per-platform bytes/level, and exits non-zero if
  MAX_EXPR_DEPTH no longer fits the 1 MiB floor with headroom. One command now
  produces and asserts the datapoint on any platform (was a probe needing a
  manual exit-code wrapper).
- .github/workflows/stack-depth.yml: run that measurement and the regression
  test on ubuntu-latest + windows-latest (same pinned actions as
  checker-scaling.yml), so the Windows datapoint is produced and locked in on
  every change instead of relying on a manual run.
- tests/stack_depth_37.rs: assert the depth-128 guarded walk and a 1e6-deep
  iterative teardown both survive a 1 MiB stack.
- checker.rs: document that the budget is now reconfirmed automatically on both
  OSes rather than by a one-off manual measurement.

https://claude.ai/code/session_013JnrUmkCpMHsABRmhLVyd2

Review of the self-driving measurement flagged that survives() collapsed
both subprocess spawn-failure and any non-zero exit into false ("cliffed"),
with no positive proof the walk ran — an infra failure (or a future path
that exits 0 without doing the work) could skew a measured cliff or risk a
false PASS.

- survives() now distinguishes three outcomes: spawn-failure / exit-0-without-
  running-the-walk are fatal (exit 3, never a datapoint); a survived run
  requires exit 0 AND the worker's "OK ..." completion line; a non-zero exit
  is the genuine overflow cliff signal.
- Document that overflow-aborts-the-process is the cliff mechanism and that
  join().is_err() only covers the unwinding-panic path (default panic=unwind;
  abort would also exit non-zero, so either strategy is safe).
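The three-way outcome can be sketched as follows. This is an illustration under assumptions, not the code in this commit: `Outcome`, `classify` and `fatal` are hypothetical names, and the only details taken from the description are the "OK" completion line, exit code 3 for fatal cases, and the `<mode> <depth> <stack_kib>` worker arguments.

```rust
use std::process::Command;

#[derive(Debug, PartialEq)]
enum Outcome {
    Survived, // exit 0 AND the worker printed its "OK ..." line
    Cliffed,  // non-zero exit: the overflow killed the worker
}

/// Pure classification of a finished worker run. `Err` marks a result that
/// must never be recorded as a datapoint.
fn classify(exited_zero: bool, stdout: &str) -> Result<Outcome, String> {
    let saw_ok = stdout.lines().any(|line| line.starts_with("OK"));
    match (exited_zero, saw_ok) {
        (true, true) => Ok(Outcome::Survived),
        (true, false) => Err("worker exited 0 without its completion line".into()),
        (false, _) => Ok(Outcome::Cliffed),
    }
}

/// Reports an infrastructure failure and stops the whole measurement.
#[allow(dead_code)]
fn fatal(msg: &str) -> ! {
    eprintln!("fatal: {msg}");
    std::process::exit(3)
}

/// Re-executes the current binary as a worker and classifies the result.
#[allow(dead_code)]
fn survives(mode: &str, depth: usize, stack_kib: usize) -> bool {
    let exe = match std::env::current_exe() {
        Ok(path) => path,
        Err(e) => fatal(&format!("cannot locate own executable: {e}")),
    };
    let output = match Command::new(exe)
        .arg(mode)
        .arg(depth.to_string())
        .arg(stack_kib.to_string())
        .output()
    {
        Ok(out) => out,
        Err(e) => fatal(&format!("cannot spawn worker: {e}")),
    };
    let stdout = String::from_utf8_lossy(&output.stdout);
    match classify(output.status.success(), &stdout) {
        Ok(Outcome::Survived) => true,
        Ok(Outcome::Cliffed) => false,
        Err(msg) => fatal(&msg),
    }
}
```

Keeping `classify` free of process handling makes the decision table testable without spawning anything; only `survives` touches the OS.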

Driver still PASSes locally (MAX_EXPR_DEPTH=128 safe within 819 KiB of the
1 MiB floor); the binary searches were independently verified correct.

https://claude.ai/code/session_013JnrUmkCpMHsABRmhLVyd2
@hyperpolymath hyperpolymath marked this pull request as ready for review May 30, 2026 22:06
@hyperpolymath hyperpolymath merged commit efba529 into main May 30, 2026
15 of 24 checks passed
@hyperpolymath hyperpolymath deleted the claude/bold-noether-8uSuC branch May 30, 2026 22:06
hyperpolymath added a commit that referenced this pull request May 30, 2026
@github-actions

🔍 Hypatia Security Scan

Findings: 96 issues detected

Severity Count
🔴 Critical 6
🟠 High 39
🟡 Medium 51

⚠️ Action Required: Critical security issues found!

View findings
[
  {
    "reason": "Action perpolymath/standards/.github/workflows/governance-reusable.yml@main\n needs attention",
    "type": "unpinned_action",
    "file": "governance.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in cflite_batch.yml",
    "type": "missing_timeout_minutes",
    "file": "cflite_batch.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in cflite_batch.yml",
    "type": "missing_timeout_minutes",
    "file": "cflite_batch.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in cflite_pr.yml",
    "type": "missing_timeout_minutes",
    "file": "cflite_pr.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in cflite_pr.yml",
    "type": "missing_timeout_minutes",
    "file": "cflite_pr.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in checker-scaling.yml",
    "type": "missing_timeout_minutes",
    "file": "checker-scaling.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in codeql.yml",
    "type": "missing_timeout_minutes",
    "file": "codeql.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in governance.yml",
    "type": "missing_timeout_minutes",
    "file": "governance.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in hypatia-scan.yml",
    "type": "missing_timeout_minutes",
    "file": "hypatia-scan.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in mirror.yml",
    "type": "missing_timeout_minutes",
    "file": "mirror.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  }
]

Powered by Hypatia Neurosymbolic CI/CD Intelligence
