feat(tools): perf-harness.sh — multi-run boot-to-end aggregation (P10) by cemililik · Pull Request #21 · HodeTech/Tyrne

cemililik · 2026-05-08T10:25:11Z

Summary

P10 wall-clock benchmark harness lands. Promotes single-run boot-to-end anecdotes (the wide ~4–6.5 ms band that absorbed PR #17's 4.1 ms vs PR #16's 5.8 ms without flagging) to a multi-run percentile band measured against a tight statistical envelope — load-bearing for measuring B2 (T-016 MMU activation) regressions against a sound baseline rather than the pre-P10 envelope.

Two commits:

1de8143 — feat(tooling): add perf-harness.sh for multi-run boot-to-end aggregation. Pure bash + awk (macOS bash 3.2 + BSD awk compatible). Portable per-run watchdog (no GNU timeout dependency).
abf26b9 — docs: P10 baseline report + roadmap / perf-rebaseline cross-references. New baseline report; new "Performance harness" section in docs/standards/infrastructure.md; new 2026-05-08 banner in docs/roadmap/current.md (older banners preserved); one-line cross-reference in the 2026-05-07 perf re-baseline §"Post-T-015 amendment".
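
The "portable per-run watchdog" mentioned for commit 1de8143 is not shown in this PR page, so the following is only a minimal sketch of one way to bound a command's run time without GNU timeout. The function name `run_with_timeout` appears later in the thread, but the body, the whole-second polling loop, and the 143 return convention here are assumptions for illustration.

```shell
# Hypothetical sketch: run a command under a wall-clock limit using only
# bash builtins plus sleep and kill (no GNU timeout binary).
# Usage: run_with_timeout SECONDS CMD [ARGS...]
# Returns CMD's exit status; 143 (128 + SIGTERM) means the watchdog fired.
run_with_timeout() {
    local secs=$1 cmd_pid wd_pid rc i
    shift

    "$@" &                      # the long-running command (QEMU in the PR)
    cmd_pid=$!

    # Watchdog: count whole seconds, then terminate the command.
    (
        i=0
        while [ "$i" -lt "$secs" ]; do
            sleep 1
            i=$((i + 1))
        done
        kill -TERM "$cmd_pid" 2>/dev/null
    ) >/dev/null 2>&1 &
    wd_pid=$!

    wait "$cmd_pid" 2>/dev/null
    rc=$?

    # The command finished or was killed: retire the watchdog either way.
    kill "$wd_pid" 2>/dev/null
    wait "$wd_pid" 2>/dev/null
    return "$rc"
}
```

A caller would treat status 143 as "timed out" only if the wrapped command does not trap SIGTERM itself; a command that handles the signal and exits with its own code would need a different marker.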

Measured baseline

Debug build, 20 iterations, 5 s per-run timeout, QEMU TCG, all 20 valid:

Metric	ms
p10	3.884
p50 (median)	4.642
p90	5.584
p99	6.558
mean	4.711
stddev	0.709

Brackets the previous "~4–6.5 ms typical" anecdote tightly. From now on, current.md quotes the measured band instead of single-run anecdotes; B2 changes are measured against this envelope.
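
For readers who want to see how such a band can be derived, here is a minimal sketch of nearest-rank percentile aggregation in bash + awk. The thread states that the harness uses nearest-rank and a top-level `pct` awk function; the function name `summarize`, the population (not sample) standard deviation, and the exact output layout are assumptions.

```shell
# Hypothetical sketch: read one integer sample per line on stdin and
# print "key value" lines for the percentile band.
summarize() {
    sort -n | awk '
        # Top-level definition: BSD awk rejects nested function definitions.
        # Nearest-rank percentile: rank = ceil(p * n / 100), clamped to [1, n].
        function pct(p,    r) {
            r = int((p * n + 99) / 100)
            if (r < 1) r = 1
            if (r > n) r = n
            return v[r]
        }
        { v[++n] = $1; sum += $1 }
        END {
            if (n == 0) exit 1            # no samples: nothing to aggregate
            mean = sum / n
            for (i = 1; i <= n; i++) ss += (v[i] - mean) * (v[i] - mean)
            print "min", v[1]
            print "p10", pct(10)
            print "p50", pct(50)
            print "p90", pct(90)
            print "p99", pct(99)
            print "max", v[n]
            printf "mean %.3f\n", mean
            printf "stddev %.3f\n", sqrt(ss / n)   # population stddev (assumption)
        }'
}
```

Under nearest-rank, the p99 rank is ceil(0.99 × n), which equals n whenever n < 100, so for a 20-sample run p99 and max are the same sample.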

Harness usage

# Default (20 iterations, 5 s timeout, no report file):
./tools/perf-harness.sh

# Custom iteration count + timeout + named report:
./tools/perf-harness.sh --iterations=50 --timeout=10 --report=2026-05-09-post-t-016

Flags: --iterations=K (default 20), --timeout=SECONDS (default 5), --release (forwards to tools/run-qemu.sh), --quiet (suppress per-iteration progress), --report=CONTEXT (write a markdown report to docs/analysis/reports/perf-baseline-<context>.md).

What's not in this PR

CI integration — no .github/workflows/ change. Harness is maintainer-launched, matching the existing tools/run-qemu.sh convention.
Regression-detection threshold — the harness prints the band; future enhancement: a --baseline=<file> flag that compares against a stored baseline and exits non-zero on drift.
IPC round-trip-specific kernel benchmark (kernel-side micro-bench under cfg(feature = "bench")) — that's a future P10 extension; v1 P10 is multi-run aggregation of the existing boot-to-end emission only.

Deviations from the brief

Report-filename construction strips a leading YYYY-MM-DD- from --report=CONTEXT so the user-supplied context 2026-05-08-post-pr-19-pre-adr-0027 produces perf-baseline-2026-05-08-post-pr-19-pre-adr-0027.md, not the doubled-date form. Same de-duplication on the H1 title.
Threshold uses (N+1)/2 — at least 50 % valid (10/20), per the brief's literal "fewer than 50 % failure → exit non-zero".
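
The date de-duplication described in the first deviation can be sketched with a plain parameter expansion, which works on bash 3.2. The helper name `strip_leading_date` is hypothetical, and how the harness reassembles the final file name around the stripped context is not shown in this PR.

```shell
# Hypothetical helper: drop one leading YYYY-MM-DD- prefix from a report
# context, leaving contexts without such a prefix unchanged.
strip_leading_date() {
    local ctx=$1
    # "#" removes the shortest leading match of the digit pattern.
    printf '%s\n' "${ctx#[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-}"
}

strip_leading_date "2026-05-08-post-pr-19-pre-adr-0027"
strip_leading_date "post-t-016"
```

The pattern only checks the digit shape, not calendar validity, which seems sufficient for avoiding a doubled date in a file name.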

Project-side observations for the maintainer

The repo's origin URL still points at cemililik/UmbrixOS.git; GitHub is redirecting to cemililik/Tyrne.git. Worth running git remote set-url origin https://github.com/cemililik/Tyrne.git so future pushes don't depend on the redirect.
macOS default awk rejects nested function definitions; this is now handled by lifting pct to top-level inside the awk program. Worth keeping in mind for any future bash + awk tooling.

Test plan

Harness runs cleanly on macOS bash 3.2 + BSD awk + no GNU timeout
20-iteration baseline produces a valid percentile band (no failed iterations)
cargo fmt --all -- --check clean (no Rust changes)
cargo host-test 159/159 (no source changes)
cargo +nightly miri test 159/159
tools/run-qemu.sh smoke unaffected (full demo trace; baseline report includes the same trace 20 times)
Baseline report (docs/analysis/reports/perf-baseline-2026-05-08-post-pr-19-pre-adr-0027.md) cross-references match the harness's live output

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Introduced a performance benchmarking harness for capturing and analyzing boot-to-end timing metrics across multiple iterations with statistical aggregation (min, percentiles, max, mean, stddev).
Documentation
- Established performance measurement standards, including guidelines for percentile-based reporting and baseline capture methodology.
- Generated initial performance baseline report.
- Updated roadmap with performance infrastructure implementation status.

Single-run boot-to-end claims have ~15-30 % run-to-run variance under QEMU TCG (the counter advances on emulated instructions, not wall-clock time, so translation-cache state leaks into the number). The 2026-05-07 multi-axis review's Track D §D2 promoted the queued P10 proposal — a multi-run harness — from "queued" to "load-bearing before B2 ADR-0027 implementation" so future perf-relevant changes land against a tight percentile band rather than a single anecdote.

The harness wraps tools/run-qemu.sh in an iteration loop with a portable per-run watchdog (the kernel halts in WFI after the demo; QEMU never exits on its own), parses the kernel's "boot-to-end elapsed = X ns" emission out of each run, and prints min / p10 / p50 / p90 / p99 / max / mean / stddev in both ns and ms. It optionally writes a markdown report under docs/analysis/reports/ using the perf-review master plan's Inputs / Methodology / Metric / Verdict shape.

Pure bash + awk; macOS bash 3.2 compatible (matches run-qemu.sh's existing idioms; no GNU `timeout` binary required). A run aborts non-zero if fewer than half of its iterations produced a valid sample — that threshold is treated as environmental rather than a measurement worth aggregating, per the brief.

Out of scope: CI integration (maintainer-launched only, like the QEMU smoke), automatic regression detection vs a stored baseline, the kernel-side `cfg(feature = "bench")` IPC microbench (a future P10 extension; v1 is multi-run aggregation of the existing boot-to-end emission).

Refs: P10 (2026-05-06 Track D review), 2026-05-07 multi-axis review §D2
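
The validity threshold (at least half of the iterations must yield a sample, computed as (N+1)/2 with integer division) can be sketched as a small predicate. The function name `enough_samples` is hypothetical; only the (N+1)/2 formula and the 10-of-20 example come from the PR description.

```shell
# Hypothetical predicate: succeed when the number of valid samples meets
# the (N+1)/2 threshold (integer division), fail otherwise.
enough_samples() {
    local valid=$1 total=$2
    local need=$(( (total + 1) / 2 ))   # 20 iterations need 10; 5 need 3
    [ "$valid" -ge "$need" ]
}

# Usage: abort the aggregation when too few iterations produced a sample.
if ! enough_samples 9 20; then
    echo "too few valid samples; treating the run as environmental" >&2
fi
```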

Records the first measured boot-to-end band produced by tools/perf-harness.sh and wires the harness into the project's documentation surface.

- docs/analysis/reports/perf-baseline-2026-05-08-post-pr-19-pre-adr-0027.md — auto-generated baseline at HEAD `aa7e6c5`; debug build, 20 iterations, 5 s per-run timeout, QEMU TCG. Headline band: p10=3.884 ms / p50=4.642 ms / p90=5.584 ms / p99=6.558 ms, mean 4.711 ms, stddev 0.709 ms. Brackets the prior "~4-6.5 ms typical" anecdote tightly but is now a measured band on this host rather than an order-of-magnitude observation.
- docs/standards/infrastructure.md — new "Performance harness" section names tools/perf-harness.sh as the canonical source for boot-to-end timing claims and deprecates single-run anecdotes in PR bodies.
- docs/roadmap/current.md — new 2026-05-08 banner above the 2026-05-07 banner records the harness landing and the measured band; old banner preserved as the historical record (matches the append-only update discipline already used in this file).
- docs/analysis/reviews/performance-optimization-reviews/2026-05-07-B1-closure.md — one-line cross-reference appended at the end of the "Post-T-015 amendment" section pointing at the new baseline report.

The existing single-run claims throughout the 2026-05-07 baseline are deliberately preserved (the brief explicitly asked for the historical record to stay intact). The 2026-05-08 banner does not promote any task to In Progress / Done — the harness is tooling, not a roadmap task. B2 prep (ADR-0027 drafting) remains the active thread.

Refs: P10 (2026-05-06 Track D review), 2026-05-07 multi-axis review §D2

coderabbitai · 2026-05-08T10:25:22Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 30edee82-efe4-43ac-a7be-ad4efe94d3de

📥 Commits

Reviewing files that changed from the base of the PR and between abf26b9 and ef30b5c.

📒 Files selected for processing (1)

tools/perf-harness.sh

📝 Walkthrough

Walkthrough

This PR introduces a performance harness tool for multi-run boot timing aggregation. The changes add tools/perf-harness.sh (a Bash script that runs QEMU multiple times, extracts nanosecond samples, and computes percentiles), establish collection standards in docs/standards/infrastructure.md, generate the first baseline report, and update related documentation to deprecate single-run claims and record the new methodology.

Changes

Performance Harness Tool and Standards

Layer / File(s)	Summary
Performance Collection Standard `docs/standards/infrastructure.md`	New "Performance harness" section defines `tools/perf-harness.sh` as canonical source for boot-to-end timing claims, describes iteration/sample extraction/aggregation, specifies 50% validity threshold for non-zero exit, establishes reporting discipline for PR claims, and documents QEMU TCG `now_ns()` measurement caveats.
Harness Script Implementation `tools/perf-harness.sh`	New Bash harness orchestrates multi-run iterations under per-run timeout, extracts first `boot-to-end elapsed = <ns> ns` sample per run, computes min/p10/p50/p90/p99/max/mean/stddev via awk, formats output with thousands separators, and optionally generates Markdown baseline reports with run inputs, methodology, metrics table, and raw samples.
First Baseline Report `docs/analysis/reports/perf-baseline-2026-05-08-post-pr-19-pre-adr-0027.md`	New baseline report records run timestamp, iterations, timeout, build/kernel/git/environment details, extraction methodology, computed statistics table in ns/ms, ordered raw sample values, and verdict marking as baseline-only and instructing future comparisons to cite percentile bands.
Integration Updates `docs/analysis/reviews/performance-optimization-reviews/2026-05-07-B1-closure.md`, `docs/roadmap/current.md`	Append-only notes documenting P10 harness landing, linking to first measured baseline, recording p10/p50/p90/p99 figures, and noting deprecation of single-run boot-to-end claims under infrastructure policy.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'feat(tools): perf-harness.sh — multi-run boot-to-end aggregation (P10)' directly and clearly summarizes the main change: adding a performance harness tool for multi-run boot-to-end timing aggregation.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

The 2026-05-08 banner originally claimed P10 lands in PR #20; the actual PR-number assignment landed P10 at #21 (after path-drift PR #19 and ADR-0027 PR #20). Single-line correction; nothing else changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

gemini-code-assist

Code Review

This pull request introduces a performance harness tool (tools/perf-harness.sh) for multi-run boot-to-end timing aggregation and updates documentation to establish a measured performance baseline. The feedback focuses on optimizing the bash script by replacing inefficient piped commands and subshells with more direct awk and while read patterns to improve performance and readability.

gemini-code-assist · 2026-05-08T10:26:40Z

+    SAMPLE=$(printf '%s\n' "$OUTPUT" \
+        | grep -oE 'boot-to-end elapsed = [0-9]+ ns' \
+        | head -n 1 \
+        | grep -oE '[0-9]+' \
+        | head -n 1 || true)


The extraction of the nanosecond sample can be significantly simplified using a single awk command. This approach is more efficient as it avoids multiple piped processes and stops processing after the first match is found. Using a here-string (<<<) is also more idiomatic in Bash for passing variable content to a command.

Suggested change

-    SAMPLE=$(printf '%s\n' "$OUTPUT" \
-        | grep -oE 'boot-to-end elapsed = [0-9]+ ns' \
-        | head -n 1 \
-        | grep -oE '[0-9]+' \
-        | head -n 1 || true)
+    SAMPLE=$(awk '/boot-to-end elapsed = [0-9]+ ns/ { print $4; exit }' <<< "$OUTPUT")

gemini-code-assist · 2026-05-08T10:26:40Z

+STATS=$(read_stats)
+STAT_MIN=$(echo "$STATS"    | awk '$1=="min"    {print $2}')
+STAT_P10=$(echo "$STATS"    | awk '$1=="p10"    {print $2}')
+STAT_P50=$(echo "$STATS"    | awk '$1=="p50"    {print $2}')
+STAT_P90=$(echo "$STATS"    | awk '$1=="p90"    {print $2}')
+STAT_P99=$(echo "$STATS"    | awk '$1=="p99"    {print $2}')
+STAT_MAX=$(echo "$STATS"    | awk '$1=="max"    {print $2}')
+STAT_MEAN=$(echo "$STATS"   | awk '$1=="mean"   {print $2}')
+STAT_STDDEV=$(echo "$STATS" | awk '$1=="stddev" {print $2}')


Parsing the STATS output by repeatedly calling awk in subshells is inefficient. A single while read loop can process the key-value pairs in one pass, improving performance and readability while remaining compatible with Bash 3.2.

Suggested change

-STATS=$(read_stats)
-STAT_MIN=$(echo "$STATS"    | awk '$1=="min"    {print $2}')
-STAT_P10=$(echo "$STATS"    | awk '$1=="p10"    {print $2}')
-STAT_P50=$(echo "$STATS"    | awk '$1=="p50"    {print $2}')
-STAT_P90=$(echo "$STATS"    | awk '$1=="p90"    {print $2}')
-STAT_P99=$(echo "$STATS"    | awk '$1=="p99"    {print $2}')
-STAT_MAX=$(echo "$STATS"    | awk '$1=="max"    {print $2}')
-STAT_MEAN=$(echo "$STATS"   | awk '$1=="mean"   {print $2}')
-STAT_STDDEV=$(echo "$STATS" | awk '$1=="stddev" {print $2}')
+while read -r key val; do
+    case "$key" in
+        min)    STAT_MIN=$val ;;
+        p10)    STAT_P10=$val ;;
+        p50)    STAT_P50=$val ;;
+        p90)    STAT_P90=$val ;;
+        p99)    STAT_P99=$val ;;
+        max)    STAT_MAX=$val ;;
+        mean)   STAT_MEAN=$val ;;
+        stddev) STAT_STDDEV=$val ;;
+    esac
+done <<EOF
+$(read_stats)
+EOF

coderabbitai

🧹 Nitpick comments (1)

tools/perf-harness.sh (1)

420-421: 💤 Low value

Pre-compute relative paths to silence SC2295.

${KERNEL_ELF#${REPO_ROOT}/} (line 421) and ${REPORT_PATH#${REPO_ROOT}/} (line 492) treat $REPO_ROOT as a glob pattern inside the # operator, so a path containing [, ], *, or ? would silently mismatch. While REPO_ROOT is a realpath-resolved string unlikely to carry glob characters, the fix is a one-liner pre-compute before each use-site.

🔧 Proposed fix

+    KERNEL_ELF_REL="${KERNEL_ELF#"$REPO_ROOT"/}"
     {
         ...
-        echo "| Kernel ELF | \`${KERNEL_ELF#${REPO_ROOT}/}\` |"
+        echo "| Kernel ELF | \`${KERNEL_ELF_REL}\` |"

-    echo "report written: ${REPORT_PATH#${REPO_ROOT}/}"
+    REPORT_PATH_REL="${REPORT_PATH#"$REPO_ROOT"/}"
+    echo "report written: $REPORT_PATH_REL"
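
To make the SC2295 concern concrete, here is a small demonstration with a made-up repository path that contains glob characters. The path `/tmp/repo[1]` and the variable values are purely illustrative; nothing here touches the filesystem.

```shell
# Hypothetical values: a repo root whose name contains a bracket expression.
REPO_ROOT='/tmp/repo[1]'
KERNEL_ELF="$REPO_ROOT/target/kernel.elf"

# Unquoted: inside the # operator, [1] is a pattern matching the character
# "1", so the prefix "/tmp/repo[1]/" does not match and nothing is removed.
# shellcheck disable=SC2295
unquoted=${KERNEL_ELF#${REPO_ROOT}/}

# Quoted: the expansion is taken literally, so the prefix is stripped.
quoted=${KERNEL_ELF#"$REPO_ROOT"/}

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"
```

With a root free of glob characters both forms agree, which is why the review rates the finding as low value.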

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/perf-harness.sh` around lines 420 - 421, Pre-compute the relative paths
instead of using parameter expansion with REPO_ROOT inline to avoid glob
interpretation: create variables like kernel_rel by stripping REPO_ROOT from
KERNEL_ELF (e.g., compute kernel_rel from KERNEL_ELF and REPO_ROOT before the
echo) and similarly compute report_rel from REPORT_PATH and REPO_ROOT before its
use; then use kernel_rel and report_rel in the echo lines that currently
reference `${KERNEL_ELF#${REPO_ROOT}/}` and `${REPORT_PATH#${REPO_ROOT}/}` so
the `#` operator never sees REPO_ROOT as a glob pattern.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0b289f79-0dc6-4fa3-852d-3b738b6ac4c0

📥 Commits

Reviewing files that changed from the base of the PR and between aa7e6c5 and abf26b9.

📒 Files selected for processing (5)

docs/analysis/reports/perf-baseline-2026-05-08-post-pr-19-pre-adr-0027.md
docs/analysis/reviews/performance-optimization-reviews/2026-05-07-B1-closure.md
docs/roadmap/current.md
docs/standards/infrastructure.md
tools/perf-harness.sh

…ctor)

Both gemini-code-assist inline comments on PR #21 flagged efficiency improvements:

1. **L213 sample extraction (4-pipe → single awk).** Replaced the `printf | grep | head | grep | head` pipeline with one awk invocation. Departures from gemini's literal suggestion:
   - Used sub/sub stripping rather than `print $4` because the kernel's boot-to-end line has the format "tyrne: boot-to-end elapsed = NUM ns" — `$4` is the equals sign; `$5` is the number. The sub/sub pattern is format-shift-tolerant (any "...= NUM ns..." shape resolves to NUM regardless of preceding fields), avoiding the fragility of a positional field index that would break if a future kernel build added a prefix or moved the line.
   - Confirmed by 5-iteration sanity run: band 3.973 / 4.559 / 5.642 ms p10/p50/p90 (consistent with the existing baseline).

2. **L309 STATS parsing (8 echo|awk → single while-read loop).** Replaced eight `echo "$STATS" | awk '$1=="key" {print $2}'` invocations with one `while read -r key val; do case "$key" in ... esac done <<EOF $(read_stats) EOF` loop. Bash 3.2 compatible (heredoc keeps the loop in the parent shell so variable assignments persist past the loop end). Saves 7 fork+exec per harness invocation; runs once per harness invocation so the absolute saving is small, but the pattern is cleaner and more maintainable.

Verification: cargo fmt clean, host-test 159/159 (no source changes), harness 5-iter sanity run produces consistent stats; refactor is behaviour-preserving.

Refs: PR #21 review-round (gemini-code-assist)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
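
The sub/sub extraction described in item 1 is not reproduced on this page, so the following is a sketch of what such an awk call could look like. The function name `extract_ns` and the two regular expressions are assumptions; only the line format "tyrne: boot-to-end elapsed = NUM ns" comes from the commit message.

```shell
# Hypothetical sketch: print the NUM from the first
# "... boot-to-end elapsed = NUM ns ..." line on stdin, then stop.
extract_ns() {
    awk '/boot-to-end elapsed = [0-9]+ ns/ {
        sub(/^.*boot-to-end elapsed = /, "")   # drop everything up to the number
        sub(/ ns.*$/, "")                      # drop the unit and any suffix
        print
        exit                                   # first match only
    }'
}

line='tyrne: boot-to-end elapsed = 4642000 ns'
printf '%s\n' "$line" | extract_ns

# For comparison, the positional field from the literal suggestion:
printf '%s\n' "$line" | awk '{ print $4 }'
```

Because the stripping is anchored on the text around the number rather than on a field index, a prefix added before "tyrne:" would not change the result.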

cemililik · 2026-05-08T12:41:48Z

Superseded by the integration PR for the B2 prep bundle: #19 + #20 + #21 + 2026-05-08 review artefacts + T3-M1 phase-b status-block fix all land together. Branch p10-wall-clock-bench-harness preserved for reference; the integration branch integration-2026-05-08-b2-prep-bundle carries this PR's content (3 commits including the gemini-review-round refactor) unchanged.

…losure status

Re-verified all 13 §Follow-up backlog items from the 2026-05-07 PR #12-#17 multi-axis review against the current integration-branch state. All 9 hygiene items + the 1 forward-flagged P10 harness item are now closed; 3 remain forward-flagged on appropriate downstream venues.

Two updates to the consolidated review file:

1. **Top-of-file closure-status banner** — readers see the per-item disposition at-a-glance without scrolling to the bottom backlog.
2. **§Follow-up backlog per-item closure annotations** — items 1-9 gain ✅ + closing-PR + closing-commit references; item 11 (P10 harness) gains ✅ + integration-PR reference + measured-baseline numbers; items 10/12/13 keep their forward-flagged status with "status unchanged 2026-05-08" markers.

Re-verification at integration-branch HEAD (per item):

| # | Item | Closing PR / commit | Verification |
|---|---|---|---|
| 1 | current.md + perf re-baseline `.text 22,020` | PR #18 / `94a6c0f` | grep "22,020 bytes" current.md → 1 hit |
| 2 | cancel_recv_on_recv_complete test | PR #18 / `25854a1` | grep test name in ipc/mod.rs → 1 hit; host-test 159/159 |
| 3 | ipc_cancel_recv doc-rider on cap-bearing state | PR #18 / `25854a1` | grep "destroy-drain callers (Phase B2+)" → 1 hit |
| 4 | cancel-block SAFETY wording | PR #18 / `25854a1` | grep "caller_table.*shared.*reborrow" → 1 hit |
| 5 | UNSAFE-2026-0014 SHA back-fill (c30f4ee, 7a402cb) | PR #18 / `94a6c0f` | grep both SHAs in unsafe-log.md → 2 hits each |
| 6 | unsafe-policy.md §3 mechanical-edit exemption | PR #18 / `94a6c0f` | grep "Mechanical-edit exemption" → 1 hit |
| 7 | ADR-0026 §Simulation chronology rider | PR #18 / `94a6c0f` | grep "§Simulation rule was retro-extracted" → 1 hit |
| 8 | master-plan AC cross-reference | PR #18 / `94a6c0f` | grep "Closure-trio coordination cross-reference" in security + perf master-plans → 1 hit each |
| 9 | ADR-0026 §skill-clause reconciliation rider | PR #18 / `94a6c0f` | grep "single-commit Propose+Accept landing reconciliation" → 1 hit |
| 11 | P10 wall-clock harness | this integration PR (replaces #19/#20/#21) | tools/perf-harness.sh exists; baseline report exists; band p10=3.884/p50=4.642/p90=5.584 ms |

Forward-flagged (status unchanged):
- Item 10: RecvWaiting waiter-identity gap — ADR-0030 / ADR-0019 venue
- Item 12: cancel-on-cap-bearing-state destroy-drain ADR — first userspace-destroy task venue
- Item 13: B5+ preemption-rollback re-validation of ADR-0032 — B5+ preemption ADR venue

This commit only touches the consolidated review's annotation; track files preserved as historical artefacts (their per-track verdicts are the snapshot at the moment of the review, not subject to back-edits). The review's per-item findings (Track-A NIT-2 SchedQueue::new doc rename; Track-G MIN-G1/G2/G3; Track-H MIN-1/MIN-2; Track-A MIN-2 ipc_cancel_recv doc-rider; Track-D D1; Track-F §F-1) are all closed in PR #18 + this integration PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… 2026-05-08 + 2026-05-07 follow-ups

Closes the 12 remaining items flagged by the 2026-05-08 multi-axis review's §Follow-up backlog (3 Track-2 Majors, 1 Track-3 Major fix already closed in `59c08e9`, 6 Track-3 / Track-2 Minors, 2 Track-4 Minors, 2 Track-2 Nits) plus the carry-forward 2026-05-07 Track-H NIT-1.

**Track 2 Majors (forward-flagged → ADR-0027 / phase-b ledger riders):**
- M1 (escape-hatch doc): ADR-0027 §Decision outcome (c) gains a bullet documenting `mem::forget` / `ManuallyDrop` / `let _ = ...` as deliberate-but-rare escape hatches, mirroring the `x86_64::structures::paging::MapperFlush` precedent.
- M2 (MMU-instance binding): ADR-0027 §Decision outcome (c) gains a bullet noting `MapperFlush::flush(self, mmu: &impl Mmu)` accepts any `Mmu`; multi-`Mmu` deployments (B3+ per-task `AddressSpace`, Phase C multi-CPU) will need a stronger token type. Out of scope for v1.
- M3 (ADR-0034 placeholder): ADR-0027 §Decision outcome adds an "ADR-0034 (kernel-image section permissions) placeholder" block alongside ADR-0033, and `phase-b.md` ADR ledger gains rows for both ADR-0033 and ADR-0034 with named-but-unallocated discipline.

**Track 3 Minors (governance / wording polish):**
- m1 + m2 (current.md L52): drop "Phase-2" prefix on "§Simulation table" (the table walks Steps 0–4, not a "Phase 2"); "Accept will be" → "Accept landed as" + actual commit SHA `bb0a6ba`.
- m3 (commit-style.md PR-numbering rider): new §"PR-number references in committed artefacts" subsection naming the recurrence (PR #18 + PR #20 each had a one-commit PR-number fix-up) and codifying three acceptable disciplines (defer banner authoring; reference branch slug; or use commit SHA).
- n1 (framing alignment): "first to apply Simulation forward" wording in current.md (banner + Active decisions row) and phase-b.md §B2 status block aligned to the precise "first non-recovery-primitive state-machine ADR drafted under §Simulation" phrasing — ADR-0032's Propose did land with a table; the prior framing was technically defensible only under a narrow reading of "retro-extracted".

**Track 2 Nits (substance riders in ADR-0027):**
- #4 (DSB ISH vs DSB NSH rationale): §Simulation gains a rationale paragraph after the table — `ISH` is forward-compatible with the eventual SMP boot, sub-microsecond cost on single-core, matches Linux aarch64 `arch/arm64/mm/proc.S` for the same reason.
- #5 (TCR_EL1.AS wording tighten): line 59 reworded — `AS = 0` selects 8-bit ASID *size*, not "the ASID value is 0" (the value is `TTBR0_EL1.ASID = 0` and is what's "globally used in v1").
- #7 (line 17 §-citation precision) + Track-3 n1: "first non-recovery-primitive state-machine" framing now precise.
- #8 (memory-management.md L88 page-table descriptor cosmetic): ASCII bit-field diagram redrawn to match L2 block-descriptor reality (OutputAddress[47:21], not [47:12]); explanatory note added for L1-block / L3-page variants.

**Track 4 Minors (perf-harness.sh):**
- #1 (Ctrl-C cleanup trap): new `cleanup_in_flight` shell function + `trap '...' EXIT INT TERM` that kills any in-flight QEMU + watchdog PIDs tracked in `CURRENT_CMD_PID` / `CURRENT_WATCHDOG_PID` shell globals. `run_with_timeout` updates the globals at every call so the trap addresses whichever pair is currently in flight; clears them at every clean exit so the trap is a no-op outside iterations.
- #2 (p99 small-N reporting hygiene): generated baseline report Methodology section gains a "**Note on p99 at small `n`**" paragraph explaining that under nearest-rank `p99 == max` for `n < 100` and callers should not over-read it as a tail-latency signal until `n >= 100`.

**Track 4 Nit #3 (read_stats refactor):** **Already closed by PR #21 review-round commit `ef30b5c`** — the 8 `echo | awk` parses became a single `while read` loop. Verified at HEAD (`grep -c "while read -r key val" tools/perf-harness.sh` → 1 hit).

**2026-05-07 Track-H NIT-1 (Pending Amendment closure-path indexing):** UNSAFE-2026-0019 / 0020 / 0021 each gain a 2026-05-08 "closure-path indexed" Amendment naming the canonical clearance trigger (B5 Milestone, ADR-0030 entry-point, deadline-arming syscall) explicitly, so a future reader of `unsafe-log.md` alone has the full picture without leaving the file. No semantic change; co-locates information that was previously distributed across `phase-b.md` cross-references.

Verification gates re-run on the integration branch:
- `cargo fmt --all -- --check` clean
- `cargo host-test` 159/159 (25 + 100 + 34)
- `cargo host-clippy` clean (-D warnings)
- `cargo kernel-clippy` clean
- `cargo kernel-build` clean
- `tools/perf-harness.sh --iterations=3` runs end-to-end with the new trap + Methodology note + while-read parsing intact

This commit + the prior 2026-05-08 review's recommendations close all follow-ups identified by both 2026-05-07 and 2026-05-08 multi-axis reviews. Forward-flagged items (10/12/13 from 2026-05-07) and any review-round bot input on this integration branch remain the only open items.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
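
The Ctrl-C cleanup described under Track 4 Minor #1 could look roughly like the sketch below. The names `cleanup_in_flight`, `CURRENT_CMD_PID`, and `CURRENT_WATCHDOG_PID` come from the commit message; the function body, the use of three separate traps, and the 130 / 143 exit codes are assumptions, since the actual trap body is elided in the source.

```shell
# Globals naming whichever command/watchdog pair is currently in flight;
# empty strings mean "nothing to clean up".
CURRENT_CMD_PID=""
CURRENT_WATCHDOG_PID=""

# Hypothetical body: kill and reap any tracked PIDs, then clear the globals
# so a second call (for example from the EXIT trap) is a no-op.
cleanup_in_flight() {
    local pid
    for pid in "$CURRENT_CMD_PID" "$CURRENT_WATCHDOG_PID"; do
        if [ -n "$pid" ]; then
            kill "$pid" 2>/dev/null
            wait "$pid" 2>/dev/null
        fi
    done
    CURRENT_CMD_PID=""
    CURRENT_WATCHDOG_PID=""
}

# Assumed trap layout: clean up on every exit path, and turn INT / TERM
# into a conventional 128 + signal exit status.
trap 'cleanup_in_flight' EXIT
trap 'cleanup_in_flight; exit 130' INT
trap 'cleanup_in_flight; exit 143' TERM
```

In this layout the iteration loop would set both globals right after launching a run and clear them once the run has been reaped, matching the "no-op outside iterations" behaviour the commit describes.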

…bundle B2 prep integration bundle (replaces #19, #20, #21) — ADR-0027 + T-016 Draft + P10 harness + path-drift sweep + 2026-05-08 review

cemililik added 2 commits May 8, 2026 13:08

This was referenced May 8, 2026

docs: sweep relative-path drift in 2026-05-06 comprehensive review (180/180 fixed) #19

Closed

ADR-0027 — kernel virtual memory layout (B2) + open T-016 #20

Closed

cemililik closed this May 8, 2026

cemililik mentioned this pull request May 8, 2026

B2 prep integration bundle (replaces #19, #20, #21) — ADR-0027 + T-016 Draft + P10 harness + path-drift sweep + 2026-05-08 review #22

Merged

8 tasks

cemililik added a commit that referenced this pull request May 8, 2026

Merge pull request #22 from cemililik/integration-2026-05-08-b2-prep-…

6494ed2

…bundle B2 prep integration bundle (replaces #19, #20, #21) — ADR-0027 + T-016 Draft + P10 harness + path-drift sweep + 2026-05-08 review

cemililik deleted the p10-wall-clock-bench-harness branch May 25, 2026 12:49

feat(tools): perf-harness.sh — multi-run boot-to-end aggregation (P10) #21
cemililik wants to merge 3 commits into main from p10-wall-clock-bench-harness

-STATS=$(read_stats)
-STAT_MIN=$(echo "$STATS"    | awk '$1=="min"    {print $2}')
-STAT_P10=$(echo "$STATS"    | awk '$1=="p10"    {print $2}')
-STAT_P50=$(echo "$STATS"    | awk '$1=="p50"    {print $2}')
-STAT_P90=$(echo "$STATS"    | awk '$1=="p90"    {print $2}')
-STAT_P99=$(echo "$STATS"    | awk '$1=="p99"    {print $2}')
-STAT_MAX=$(echo "$STATS"    | awk '$1=="max"    {print $2}')
-STAT_MEAN=$(echo "$STATS"   | awk '$1=="mean"   {print $2}')
-STAT_STDDEV=$(echo "$STATS" | awk '$1=="stddev" {print $2}')
+while read -r key val; do
+    case "$key" in
+        min)    STAT_MIN=$val ;;
+        p10)    STAT_P10=$val ;;
+        p50)    STAT_P50=$val ;;
+        p90)    STAT_P90=$val ;;
+        p99)    STAT_P99=$val ;;
+        max)    STAT_MAX=$val ;;
+        mean)   STAT_MEAN=$val ;;
+        stddev) STAT_STDDEV=$val ;;
+    esac
+done <<EOF
+$(read_stats)
+EOF