docs(dev-flow): Phase 5 live bake - items #1 #2 #4 #51
Merged
…s PR)

Preemptively mark the Label-trigger / Same-SHA integrity / Pre-fast-gate enforcement validations in the §10.3 Phase 5 checklist as baked. This commit lands only if the live bake on this PR succeeds; if any check fails, the ticks get reverted before the PR is merged.

Bake plan (executed on this PR):
1. This commit (docs-only) runs `PR Fast CI` → green.
2. Apply `preview-binaries` label → validates item #1 (trigger wiring).
3. Wait for preview workflow to complete → download `manifest-<sha>` + `windows-preview-<sha>` artifacts → verify `manifest.git_sha == PR head SHA` and `files[].sha256 == sha256sum(downloaded)` → validates item #2 (integrity).
4. Push a temporary sabotage commit (`exit 1` in `fmt` job of `pr-fast.yml`) → `PR Fast CI / required` goes red → preview workflow re-triggers on synchronize → `verify-pr-fast-green` detects the failure and aborts preview before Windows runner minutes are spent → validates item #4 (🔴 Critical gate enforcement).
5. Revert the sabotage; CI returns to green; PR merged.

Item #3 downgraded from deferred to partial-satisfied: preview's own `smoke-windows` job already does the archive round-trip on `windows-latest` against the pinned SHA. The bullet now tracks the remaining external-box verification gap only.
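The integrity check in step 3 of the bake plan can be sketched roughly as below. This is only an illustration: the helper names `sha_matches` and `same_commit` are hypothetical, and it assumes the `git_sha` and per-file `sha256` values have already been read out of the downloaded manifest by some other means.

```shell
# sha_matches FILE EXPECTED_SHA256
# Succeeds only if the SHA-256 digest of FILE equals the expected hex string.
sha_matches() {
  actual=$(sha256sum "$1" | awk '{print $1}')
  [ "$actual" = "$2" ]
}

# same_commit MANIFEST_GIT_SHA PR_HEAD_SHA
# Succeeds only if both values are non-empty and identical.
same_commit() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Hypothetical usage, with values taken from the manifest and the PR:
#   same_commit "$manifest_git_sha" "$pr_head_sha" || exit 1
#   sha_matches ./downloaded-file "$manifest_file_sha256" || exit 1
```

Rejecting empty values in `same_commit` is a deliberate choice here: two missing SHAs should not count as a match.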
Bug caught by the live Phase 5 bake on this scratch PR.
Root cause: `cargo nextest --version` on 0.9.132 emits multiple lines, and awk's `$2` evaluates to `0.9.132` on more than one of them. The `$(...)` command substitution preserves the inner newline, producing a multi-line value that GitHub Actions' output-file parser rejects with:
Error: Unable to process file command `output` successfully.
Error: Invalid format `0.9.132`
Fix: `awk 'NR==1 {print $2}'`, which restricts processing to the first line so no banner / upgrade-notice / self-check line nextest might add later can pollute $GITHUB_OUTPUT.
Regression-guard comment added inline so the failure mode stays legible and the fragility is not silently reintroduced by a future refactor.
Also adds a §10.5 Deviations log entry documenting the discovery and fix. The same PR that surfaces the bug also lands the Phase 5 validation ticks in §10.3 once the re-bake passes.
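A minimal reproduction of the failure mode, assuming a two-line version banner. The function `fake_nextest_version` is a hypothetical stand-in for `cargo nextest --version`, and its second line is illustrative rather than copied from real output:

```shell
# Hypothetical stand-in for `cargo nextest --version`; the banner text is assumed.
fake_nextest_version() {
  printf 'cargo-nextest 0.9.132\nrelease: 0.9.132\n'
}

# Fragile: field 2 is printed for every line, so the captured value
# contains an embedded newline.
fragile=$(fake_nextest_version | awk '{print $2}')

# Robust: only the first line is processed, so the value is a single token.
robust=$(fake_nextest_version | awk 'NR==1 {print $2}')

printf 'fragile spans %s line(s)\n' "$(printf '%s\n' "$fragile" | wc -l | tr -d ' ')"
printf 'robust=%s\n' "$robust"
```

Writing `version=$fragile` to `$GITHUB_OUTPUT` would put a bare `0.9.132` on its own line, which matches the `Invalid format` error quoted above.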
Second bug caught by the live Phase 5 bake on this scratch PR.
Root cause: the polling cap (60 × 10 s = 10 min, `timeout-minutes: 12`) was calibrated implicitly for the docs-only / short-circuited case where PR Fast CI finishes in under 2 min. On any full-matrix PR (infra-change or rust-change) the `tests` job alone runs 10–15 min cold plus `test-build` sequentially before it, pushing the aggregator completion to minute 20–25. The poller keeps seeing `status=missing` because `PR Fast CI / required` is not yet registered as a check-run, and at retry 60 it fails with `⏱️ Timed out waiting for PR Fast CI / required`. That is a **false negative**: the PR would have gone green 5 min later.
This was anticipated in the plan's Phase 5 notes ("if PR-fast is slower than 10 min, increase the cap in one commit") but nobody had exercised it against a real full-matrix PR before today.
Fix: bump to 120 × 15 s = 30 min polling, `timeout-minutes: 32`. Factored the magic numbers into `MAX_RETRIES` / `RETRY_DELAY_MS` constants so the next recalibration is a one-line bump rather than a hunt through an inline loop. Expanded the job header comment with: (a) the 2026-04-23 incident reference, (b) a guardrail against dropping the budget below p99 PR Fast CI wall-clock without adding explicit queue-awareness.
§10.5 Deviations log gains a second entry for this. §10.3 Phase 5 Notes updated: the stale "10 minutes" claim now reads "120 × 15 s = 30 min" with a cross-ref to §10.5.
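The retry budget described above can be sketched as a shell loop. This is an analogue, not the workflow's actual poller: `RETRY_DELAY_S` stands in for `RETRY_DELAY_MS` because `sleep` takes seconds, and the status command passed to `poll_required` is a hypothetical hook that prints one of `success`, `failure`, `pending`, or `missing`.

```shell
# Budget constants; defaults mirror the 120 x 15 s figure from the fix.
MAX_RETRIES=${MAX_RETRIES:-120}
RETRY_DELAY_S=${RETRY_DELAY_S:-15}

# Total polling budget in whole minutes.
budget_minutes() {
  echo $(( MAX_RETRIES * RETRY_DELAY_S / 60 ))
}

# poll_required STATUS_CMD
# Returns 0 on success, 1 on failure, 2 when the retry budget is exhausted.
poll_required() {
  status_cmd=$1
  attempt=1
  while [ "$attempt" -le "$MAX_RETRIES" ]; do
    status=$("$status_cmd")
    case "$status" in
      success) echo "green at retry $attempt/$MAX_RETRIES"; return 0 ;;
      failure) echo "red at retry $attempt/$MAX_RETRIES"; return 1 ;;
      *)       sleep "$RETRY_DELAY_S" ;;  # pending or missing: keep waiting
    esac
    attempt=$(( attempt + 1 ))
  done
  echo "timed out after $MAX_RETRIES retries"
  return 2
}
```

Keeping the timeout as a distinct return code makes a budget problem distinguishable from a genuinely red check, which is the distinction the false negative above blurred.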
Live bake on this scratch PR surfaced a deeper upstream blocker than any of the three fixes landed here:
* `cargo nextest archive` defaults to debug profile
* Debug xcompile to x86_64-pc-windows-msvc produces a ~5.5 GB `polars-ops` rlib
* That rlib exceeds the COFF archive format's string-table offset capacity
* `lld-link` (and likely native `link.exe`) dies with "truncated or malformed archive"

Root-cause + fix recipe already live at `docs/xwin-msvc-rlib-size-root-cause-and-workarounds.md` (dedicated `xwin-dev` profile + per-package polars overrides). Being worked on a concurrent branch.

Impact on Phase 5 validation bake:
- Item #1 (Label-trigger path): still ✓. The preview workflow correctly triggers on the `labeled` event, and the `gate` + `verify-pr-fast-green` jobs both ran against the pinned PR head SHA before the pipeline failed downstream. Trigger wiring is proven.
- Item #2 (Same-SHA integrity): reverted to ❌ with a blocker note pointing at the polars issue + §10.5 log entry. Can't validate manifest integrity without a completed preview pipeline. Re-bake on the next preview run after the polars fix lands.
- Item #3 (Nextest round-trip): "partially satisfied by smoke-windows" claim softened to "will be, once the polars blocker is resolved", since smoke-windows depends on build-test-archive.
- Item #4 (Pre-fast-gate enforcement, 🔴 Critical): UNaffected by the polars blocker. Validated separately via the sabotage commit that follows this one on the same PR.

Plan updates:
- §10.3 Phase 5 checklist: items #2 and #3 revised as above.
- §10.5 Deviations: new entry consolidating the investigation (proximal lib.exe error → xwin subcommand gap → polars-rlib ceiling) and documenting why my attempted "move build-test-archive to windows-latest" was rolled back (it would have shifted the failure mode to a polars-rlib error instead of a lib.exe error, not actually fixed anything, and would have collided with the concurrent branch's xwin-centric direction).
- §10.6 Active: prepended the polars blocker as the top-priority active item.
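One way to spot an rlib that is at risk of hitting this ceiling is to scan the target directory for oversized archives. The sketch below is an assumption-laden aid, not part of the fix recipe: the helper name `oversized_rlibs` is hypothetical, and the default threshold of 4 GiB (2^32 bytes) is a guess based on 32-bit archive offsets; the source only states that the ~5.5 GB rlib is too large.

```shell
# oversized_rlibs DIR [LIMIT_BYTES]
# Lists *.rlib files under DIR that are strictly larger than LIMIT_BYTES.
# The 4 GiB default is an assumption, not a figure taken from the investigation.
oversized_rlibs() {
  dir=$1
  limit_bytes=${2:-4294967296}
  find "$dir" -type f -name '*.rlib' -size +"${limit_bytes}c"
}

# Hypothetical usage against a cross-compiled debug tree:
#   oversized_rlibs target/x86_64-pc-windows-msvc/debug/deps
```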
Adds `- run: exit 1` as the first step of the `file-size` job so `PR Fast CI / required` goes red on the pinned SHA.

The preview workflow should re-trigger on synchronize (preview-binaries label still applied) and `verify-pr-fast-green` should correctly detect the failed aggregator and fail the preview at the gate, before `build-windows` / `build-test-archive` / `smoke-windows` start. That's the 🔴 Critical Phase 5 item #4 validation.

REMOVE-BEFORE-MERGE. Next commit on this branch is the revert. Squash-merge cancels both.
Replaces the preemptive placeholder note with the actual evidence from the live sabotage bake on this same PR:
- Sabotage target: `file-size` job (not `fmt`; fmt doesn't run on infra-only changes due to its `if: rust=true` gate).
- Sabotaged SHA: 0600ce6.
- `PR Fast CI / required` = FAILURE on that SHA.
- `verify-pr-fast-green` detected the red aggregator at poll retry 48/120 and set `core.setFailed`.
- Downstream `build-windows` / `build-test-archive` / `smoke-windows` / `manifest` all correctly stayed `skipped`: zero Windows runner minutes spent on a red PR.

This is precisely the 🔴 Critical behavior the gate was designed for. The previous commit's revert undoes the sabotage so the file-size policy check returns to normal.
githubrobbi added a commit that referenced this pull request on Apr 24, 2026
… spawn (bug #4)

Surfaced by the preview re-bake on PR #52 (run 24873105115, SHA dbdbbb7). `build-windows` failed with:

    winresource: failed to embed icon + manifest: Os { code: 2, kind: NotFound, message: "No such file or directory" }

from `crates/uffs-cli/build.rs:106`.

## Root cause

`winresource v0.1.31` at `src/lib.rs:735-736` hardcodes `PathBuf::from("llvm-rc")` on `cfg(unix)` and spawns it unqualified. `cargo-xwin` wires MSVC CRT/SDK env but does NOT prepend any LLVM `bin/` dir to PATH. On ubuntu-22.04 runners `llvm-rc` is preinstalled but lives at `/usr/lib/llvm-<N>/bin/llvm-rc` (not default PATH). Net: `res.compile()` spawns `"llvm-rc"` → `execvp` → ENOENT → panic.

## Why this is bug #4 in the same log jam

Three bugs prevented the preview lane from reaching `build-windows` on prior runs:

#1. nextest multi-line output → `$GITHUB_OUTPUT` parse error (PR #51)
#2. 10-min polling budget → `verify-pr-fast-green` false-negative (PR #51)
#3. `build-test-archive` on ubuntu-22.04 + xwin gap → `lib.exe` NotFound (this PR, earlier commit)
#4. `winresource` hardcoded `llvm-rc` + missing PATH → ENOENT (this commit)

Each bug masked the next. Bug #4 stayed latent because bug #3 always aborted the preview before `build-windows` finished compiling polars-ops to reach the `uffs-cli` build.rs call.

## Fix

Added a `Locate llvm-rc` step to `build-windows` that scans `/usr/lib/llvm-*/bin/llvm-rc`, picks the highest-versioned match via `sort -V | tail -1`, and exports `RC_PATH` to `$GITHUB_ENV`. `winresource` honors `RC_PATH` at `lib.rs:733-734` ahead of the hardcoded fallback, so no crate patch needed. Version-robust against runner image bumps: if LLVM 15 is replaced with 16 tomorrow, the sort still picks up the new binary.

## Docs

Added §10.5 row (2026-04-24) for bug #4 with full crate/runner anchoring.
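The `Locate llvm-rc` step can be approximated as follows. This is a sketch of the described technique rather than the step's actual script: the function name `locate_llvm_rc` and its optional root argument are hypothetical, added so the lookup can be exercised outside a runner.

```shell
# locate_llvm_rc [ROOT]
# Prints the highest-versioned executable llvm-rc under ROOT/llvm-*/bin,
# or nothing if none is found. ROOT defaults to /usr/lib.
locate_llvm_rc() {
  root=${1:-/usr/lib}
  for candidate in "$root"/llvm-*/bin/llvm-rc; do
    [ -x "$candidate" ] && printf '%s\n' "$candidate"
  done | sort -V | tail -1
}

# Export RC_PATH for later steps only when running under GitHub Actions
# and a binary was actually found.
rc_path=$(locate_llvm_rc)
if [ -n "$rc_path" ] && [ -n "${GITHUB_ENV:-}" ]; then
  echo "RC_PATH=$rc_path" >> "$GITHUB_ENV"
fi
```

Version sort matters here: a plain lexical sort would rank `llvm-9` after `llvm-15`, whereas `sort -V` compares the numeric parts and keeps the newest toolchain last.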
What
Scratch PR that doubles as the live validation bench for three Phase 5 checklist items currently unticked in @/Users/rnio/Private/Github/UltraFastFileSearch/docs/architecture/dev-flow-implementation-plan.md:1904-1924:
- `preview-binaries` label fires the preview workflow
- `manifest.git_sha` + per-file sha256 match PR head
- `PR Fast CI` failure blocks preview

Bake sequence on this PR
1. Docs-only commit: `PR Fast CI / required` → green.
2. Apply `preview-binaries` label → preview workflow triggers on `labeled` event. Validates #1.
3. Preview workflow runs `gate` → `verify-pr-fast-green` (passes, since PR Fast CI is green) → `build-windows` + `build-test-archive` → `smoke-windows` (executes nextest archive on `windows-latest`) → `manifest`.
4. Download `manifest-<sha>` + `windows-preview-<sha>` artifacts. Verify `manifest.git_sha == PR head SHA` and `files[].sha256 == sha256sum(downloaded)`. Validates #2.
5. Push a temporary sabotage commit (`exit 1` first step of `fmt` job in `pr-fast.yml`). `PR Fast CI / required` goes red on the new SHA.
6. Preview workflow re-triggers on `synchronize` (label still applied). `verify-pr-fast-green` polls check-runs for the new SHA, detects `PR Fast CI / required = failure`, aborts preview before `build-windows` / `build-test-archive` start. Validates #4 🔴.
7. Revert the sabotage; `PR Fast CI / required` returns to green.
8. Squash-merge to `main` (sabotage and revert cancel).

What does NOT get validated here
`nextest_version`. The in-CI `smoke-windows` job already exercises the archive against a `windows-latest` runner on the pinned SHA; the checklist item is downgraded from "deferred" to "partial-satisfied" to reflect this and only the external-box case remains.

Close / do-not-merge contract
If ANY step of the bake fails unexpectedly (e.g. preview builds succeed on the sabotaged commit, which would be the catastrophic false-success case for #4), I close this PR without merging, revert nothing, and open a bug report against `preview-artifacts.yml`.

Checklist
- Branch (`test/phase-5-preview-bake`) does not include any production code change: only plan-doc ticks + transient `pr-fast.yml` sabotage+revert that cancels in squash.
- `preview-binaries` label created on repo (one-time setup).