chore(quality): umbrella tracker for quality-automation increment (8 chunks) by Luis85 · Pull Request #130 · Luis85/agentonomous

Luis85 · 2026-04-26T11:56:43Z

Role — umbrella tracker for the quality-automation increment

This PR is the umbrella tracker (planning side) for the
quality-automation increment. It stays draft until every
downstream chunk PR has merged. Implementation lives in those
downstream PRs, cut from develop.

Durable tracker: Issue #131.
The issue is the long-lived record of this increment — it survives
PR merges. This PR ships the plan files; the issue auto-flips its
task list as each downstream chunk PR merges and stays open until
the plans are archived.

Documents in this PR

Umbrella plan:
docs/plans/2026-04-26-quality-automation-routines.md.
Covers goal, architecture, coordination with PR #129 (Add Pre-v1 Demo Evolution Plan and product-demo rename strategy), tracker table,
downstream-PR contract, shared resolve_action_sha helper, risk
register.
8 chunk plans under docs/plans/2026-04-26-quality-*.md —
one per shippable row:
- Row 1 · quality-codeql.md — Weekly + push CodeQL JS/TS scan
- Row 2 · quality-dep-triage-bot.md — Weekly Dependabot triage cloud routine + dependabot.yml grouping
- Row 3 · quality-actions-bump-bot.md — Weekly action-SHA bump cloud routine
- Row 4 · quality-plan-recon-bot.md — Monthly plan reconciliation cloud routine
- Row 5 · quality-bundle-trend.md — Weekly bundle-size trend snapshot
- Row 6 · quality-determinism-replay.md — Weekly + push hash-pinned replay across 8 seeds
- Row 7 · quality-mutation-testing.md — Weekly StrykerJS mutation run
- Row 8 · quality-demo-smoke.md — Nightly Playwright demo smoke

Downstream PR contract

Every PR cut from these chunk plans MUST:

Branch off develop (NOT off this tracker's branch
chore/quality-automation-routines). Worktree under
.worktrees/<branch-slug> per CLAUDE.md.
Include BOTH body lines:
- Tracks: #130 (planning PR — supplies the chunk plan).
- Tracks: #131 (durable issue tracker — auto-ticks on merge).
Tick its row in the umbrella plan's tracker table in the same
diff (no follow-up "tracker update" PRs).
Pass npm run verify locally before opening.
Pin every new GHA uses: reference to a 40-char commit SHA via
the umbrella's peel-aware resolve_action_sha helper — never
gh api .../refs/tags/<tag> --jq '.object.sha' directly.
Skip changesets (tooling-only PRs, pre-1.0).
Address Codex review per
MEMORY.md → feedback_pr_codex_polling.md /
feedback_codex_signal_endpoints.md.
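
The SHA-pinning rule in the contract above lends itself to a mechanical check. The following is a minimal sketch, assuming workflow files are read as plain text; `findUnpinnedUses` is a hypothetical helper name and is not part of this repo or of the umbrella plan.

```javascript
// Hypothetical helper (not in the repo): lists `uses:` references that are
// not pinned to a full 40-character commit SHA.
const USES_LINE = /^\s*(?:-\s*)?uses:\s*([^\s#]+)/;
const FULL_SHA = /^[0-9a-f]{40}$/;

function findUnpinnedUses(workflowText) {
  const unpinned = [];
  for (const line of workflowText.split(/\r?\n/)) {
    const match = USES_LINE.exec(line);
    if (!match) continue;
    const ref = match[1];
    // Local actions (./path) and docker:// images carry no git ref to pin.
    if (ref.startsWith('./') || ref.startsWith('docker://')) continue;
    const at = ref.lastIndexOf('@');
    const version = at === -1 ? '' : ref.slice(at + 1);
    if (!FULL_SHA.test(version)) unpinned.push(ref);
  }
  return unpinned;
}

// Illustrative input only; the 40-character value is a placeholder, not a real SHA.
const sampleWorkflow = [
  'steps:',
  '  - uses: actions/checkout@v6.0.2',
  '  - uses: github/codeql-action/init@' + 'a'.repeat(40) + ' # v4',
  '  - uses: ./.github/actions/local',
].join('\n');

console.log(findUnpinnedUses(sampleWorkflow));
```

The check only confirms that the pin has the shape of a commit SHA. It cannot tell a commit SHA from a tag-object SHA, which is why the contract routes resolution through the peel-aware helper.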

Tracker — downstream PRs (auto-flips on merge)

Add - [ ] #NNN — short description rows here as downstream PRs
are opened. Issue #131 carries the parallel task list as the durable
record.

(no downstream PRs opened yet)

Coordination with PR #129

Wave 0 of PR #129
atomically renames examples/nurture-pet/ → examples/product-demo/.
Only row 8 (demo smoke) touches the demo workspace. The umbrella
plan documents the sequencing rule: the demo-smoke chunk uses
examples/nurture-pet/ literal paths today; if Wave 0 lands first,
the chunk substitutes mechanically before opening its PR.

Why an umbrella + a durable issue, not one big PR

Earlier drafts of this increment bundled all eight rows on a single
branch. After three Codex review cycles the plan had grown past
1390 lines and review latency was the dominant cost. Splitting into
an umbrella + 8 chunk plans:

Lets independent agents work rows in parallel.
Keeps each Codex review pass narrowly scoped.
Bounds blast radius if any single row needs revert.

The PR is the planning surface — it ships the plan files and
closes once they land. The issue (#131) is the durable surface
— it tracks every chunk PR, captures the original motivation, and
stays open until the plans are archived.

When this PR merges

When every chunk PR is merged into develop:

Mark this PR ready-for-review.
Squash-merge.
After merge, git mv docs/plans/2026-04-26-quality-*.md docs/archive/plans/ in a single follow-up chore(docs) commit
on develop (the only doc move that ships outside its
originating PR — the originating PR here IS the tracker).
Close issue #131 (Quality automation increment tracker) once the archive sweep lands.

🤖 Generated with Claude Code

Adds an actionable plan covering 8 quality / supply-chain / drift automation surfaces (CodeQL, Dependabot triage prompt, action-SHA bump prompt, plan-reconciliation prompt, bundle-size trend, determinism replay, mutation testing, demo Playwright smoke). Stale-branch cleaner intentionally excluded. This commit lands only the plan; subsequent commits on this branch implement each row top-to-bottom. PR opens draft and is flipped to ready-for-review only after every checkbox in the plan is ticked and 'npm run verify' is green.

Luis85 · 2026-04-26T12:02:14Z

@codex review

chatgpt-codex-connector · 2026-04-26T12:02:52Z

Summary

Added a new pinned-SHA CodeQL workflow (.github/workflows/codeql.yml) that runs on push/PR for develop + main and on a weekly cron, with security-and-quality queries for JS/TS. (.github/workflows/codeql.yml L1-L36)
Updated Dependabot npm update grouping for both the root package and examples/nurture-pet so minor/patch updates are batched into npm-non-major. (.github/dependabot.yml L10-L55)
Added the dependency-triage routine prompt + README scaffold under docs/dep-triage-bot/. (docs/dep-triage-bot/PROMPT.md L1-L71, docs/dep-triage-bot/README.md L1-L26)
Added the action-SHA bump routine prompt + README scaffold under docs/actions-bump-bot/. (docs/actions-bump-bot/PROMPT.md L1-L51, docs/actions-bump-bot/README.md L1-L21)
Added the plan-reconciliation routine prompt + README scaffold under docs/plan-recon-bot/. (docs/plan-recon-bot/PROMPT.md L1-L44, docs/plan-recon-bot/README.md L1-L22)
Committed on the current branch (work) as 5e916a1, and created a PR via the make_pr tool with title: “chore(quality): add CodeQL workflow and routine prompt scaffolding”.

Testing

✅ npm run verify
⚠️ docker run --rm -v "$PWD":/repo -w /repo rhysd/actionlint:latest -color .github/workflows/codeql.yml (warning: docker is not installed in this environment)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 516a2f4289

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-26T12:05:50Z

+gh api repos/actions/checkout/git/refs/tags/v6.0.2 --jq '.object.sha'
+gh api repos/github/codeql-action/git/refs/tags/v4 --jq '.object.sha'


Resolve action tags to commit SHAs before pinning

Step 1.1 tells the worker to use gh api .../git/refs/tags/... --jq '.object.sha' as the workflow pin, but annotated tags can return a tag-object SHA instead of a commit SHA, which can make uses: owner/repo@<sha> invalid or non-reproducible. This repo’s own resolver in scripts/bump-actions.mjs already handles this by checking object.type and dereferencing /git/tags/<sha> to the underlying commit, so following this plan literally can generate broken action pins.
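
The peel step described above can be sketched as follows. This is an illustration under stated assumptions, not the repo's scripts/bump-actions.mjs: `resolveActionSha`, `getRef` and `getTag` are hypothetical names, and the two lookups are injected stand-ins for the ref lookup and the /git/tags/<sha> lookup, each assumed to return an `{ object: { type, sha } }` shape.

```javascript
// Sketch only. `getRef(tag)` stands in for the tag-ref lookup and
// `getTag(sha)` for the annotated-tag-object lookup; both are hypothetical
// injected functions returning { object: { type, sha } }.
function resolveActionSha(tag, getRef, getTag) {
  let object = getRef(tag).object;
  // An annotated tag ref points at a tag object, not a commit. Peel until a
  // commit is reached; the depth bound guards against malformed data.
  for (let depth = 0; object.type === 'tag' && depth < 5; depth += 1) {
    object = getTag(object.sha).object;
  }
  if (object.type !== 'commit') {
    throw new Error(`tag ${tag} did not resolve to a commit (got ${object.type})`);
  }
  return object.sha;
}

// Fake data for illustration: v4 is annotated, v6.0.2 is lightweight.
const fakeRefs = {
  v4: { object: { type: 'tag', sha: 'tag-object-sha' } },
  'v6.0.2': { object: { type: 'commit', sha: 'commit-sha-b' } },
};
const fakeTags = {
  'tag-object-sha': { object: { type: 'commit', sha: 'commit-sha-a' } },
};

console.log(resolveActionSha('v4', (t) => fakeRefs[t], (s) => fakeTags[s]));
console.log(resolveActionSha('v6.0.2', (t) => fakeRefs[t], (s) => fakeTags[s]));
```

Reading `.object.sha` straight off the ref would have returned `tag-object-sha` for `v4`, which is the failure mode this finding describes.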

chatgpt-codex-connector · 2026-04-26T12:05:50Z

+- [ ] **Step 6.5: Generate `baseline.json` once**
+
+```bash
+node --import tsx scripts/run-determinism-replay.mjs --write-baseline


Add tsx before invoking node --import tsx

The determinism baseline command depends on tsx, but this plan does not include adding tsx to dependencies in the same row; on a clean checkout the command fails with ERR_MODULE_NOT_FOUND before baseline generation runs. That means step 6.5 (and the corresponding script wiring later in this plan) is not executable as written unless tsx is explicitly installed or the command is switched to a runtime that is already available.

Two P1 line-level findings on the initial plan commit (516a2f4) addressed:

1. Annotated-tag SHA resolution. The original Step 1.1 told the worker to pin via 'gh api .../git/refs/tags/<tag> --jq .object.sha', which returns a tag-object SHA for annotated tags rather than a commit SHA - producing invalid 'uses: owner/repo@<sha>' references. Added a peel-aware 'resolve_action_sha' Bash helper to the verification-gate section (mirrors the logic already in scripts/bump-actions.mjs tagToCommitSha) and updated Step 1.1 to call it instead. Loud anti-pattern callout retained.
2. Missing 'tsx' runtime for determinism baseline. The original Step 6.5 ran 'node --import tsx scripts/run-determinism-replay.mjs --write-baseline' but tsx was never added to devDeps - the command would fail with ERR_MODULE_NOT_FOUND on a clean checkout. Dropped the harness script and the tsx requirement entirely; baseline write is now done by the same vitest test file via an argv flag passed through 'npx vitest -- --write-baseline'. Cross-platform via process.argv (no env-var prefix needed, no cross-env dep).

Side effects: file structure table no longer lists 'scripts/run-determinism-replay.mjs'; modified-files table for package.json drops the '@playwright/test' entry (it was already documented as living only in examples/nurture-pet/package.json) and the implicit tsx mention. Risk-register baseline note updated to reference the 'npm run determinism:baseline' command directly.

No code changes; plan-only commit on chore/quality-automation-routines.

Luis85 · 2026-04-26T12:13:37Z

Addressed both P1 findings in 0adb8c5:

P1 #1 (line 170, SHA peel): Added peel-aware resolve_action_sha Bash helper to the verification-gate section (mirrors scripts/bump-actions.mjs#tagToCommitSha). Step 1.1 + every other SHA-resolve step now calls the helper. Loud anti-pattern callout retained against gh api .../refs/tags/<tag> --jq '.object.sha'.
P1 #2 (line 679, tsx missing): Dropped tsx + scripts/run-determinism-replay.mjs entirely. Baseline write moved into the same vitest file behind npx vitest -- --write-baseline (argv flag, cross-platform via process.argv, so no env-var prefix or cross-env dep). File-structure table + Step 6.3/6.5/6.8/6.11 all updated to match.
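
As a rough illustration of the argv-flag approach, the sketch below shows one function that either rewrites the baseline or checks against it. The names `replayMode` and `reconcileBaseline`, the per-seed hash map shape, and the hash strings are all assumptions made for this example. It takes the plan's statement that the flag reaches process.argv at face value and does not verify how Vitest forwards arguments.

```javascript
// Sketch only; function names and data shapes are hypothetical.
function replayMode(argv) {
  return argv.includes('--write-baseline') ? 'write' : 'check';
}

// `computed` and `baseline` map seed -> hash string.
function reconcileBaseline(argv, computed, baseline) {
  if (replayMode(argv) === 'write') {
    // Write mode: the freshly computed hashes become the new baseline.
    return { mode: 'write', baseline: { ...computed }, mismatches: [] };
  }
  // Check mode: report every seed whose hash drifted from the pinned value.
  const mismatches = Object.keys(computed).filter(
    (seed) => baseline[seed] !== computed[seed],
  );
  return { mode: 'check', baseline, mismatches };
}

const pinned = { 1: 'hash-a', 2: 'hash-b' };
const fresh = { 1: 'hash-a', 2: 'hash-c' };

console.log(reconcileBaseline(['node', 'vitest'], fresh, pinned).mismatches);
console.log(reconcileBaseline(['node', 'vitest', '--write-baseline'], fresh, pinned).baseline);
```

In a real test file the check-mode result would feed an assertion that `mismatches` is empty, and the write-mode result would be serialized to baseline.json.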

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0adb8c5c77

chatgpt-codex-connector · 2026-04-26T12:17:55Z

+- Create: `tests/determinism/replay.bench.ts` *(actually a vitest
+  test, not a bench — using `.bench.ts` would skew CI timing)*


Rename replay file to a runtime test suffix

Using replay.bench.ts here makes the determinism check non-runnable under the plan’s own commands: Vitest’s runtime test discovery defaults to .test/.spec patterns, while .bench.* files are benchmark-oriented (vitest bench). Because later steps call vitest run tests/determinism/replay.bench.ts, the replay assertions/baseline write path can be skipped or fail before executing any tests unless test.include is explicitly changed (not documented in this plan). Please switch the file to a test suffix (for example replay.test.ts) or add an explicit include override.
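
To make the discovery rule concrete, here is a small sketch that classifies a path the way the default patterns would. The two regular expressions are approximations written for this example, assuming Vitest's defaults are `**/*.{test,spec}.?(c|m)[jt]s?(x)` for tests and `**/*.{bench,benchmark}.?(c|m)[jt]s?(x)` for benchmarks; `classifyVitestFile` is a hypothetical name, and a project with a custom test.include would behave differently.

```javascript
// Regex approximations of the assumed default include globs; not Vitest's
// own matching code.
const TEST_FILE = /\.(test|spec)\.[cm]?[jt]sx?$/;
const BENCH_FILE = /\.(bench|benchmark)\.[cm]?[jt]sx?$/;

function classifyVitestFile(path) {
  if (TEST_FILE.test(path)) return 'test';   // picked up by `vitest run`
  if (BENCH_FILE.test(path)) return 'bench'; // picked up by `vitest bench`
  return 'ignored';
}

console.log(classifyVitestFile('tests/determinism/replay.bench.ts'));
console.log(classifyVitestFile('tests/determinism/replay.test.ts'));
```

Under these assumptions the first path is routed to the benchmark runner and the second to the normal test run, which is the distinction the finding turns on.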

Codex flagged that vitest's runtime test discovery defaults to .test/.spec patterns; .bench.* files are routed to the benchmark runner ('vitest bench') and are skipped by 'vitest run'. The plan's own commands ('npx vitest run tests/determinism/replay.bench.ts') would therefore be no-ops without an explicit test.include override. Renamed the file across the file-structure table, Chunk 4 Files block, Steps 6.3 / 6.4 / 6.5 / 6.6 / 6.7 / 6.8 / 6.11. Dropped the now-stale parenthetical caveat at the Files block ('actually a vitest test, not a bench — using .bench.ts would skew CI timing') since the rename eliminates the underlying confusion. Renamed Chunk 4 heading from 'Determinism replay benchmark' to 'Determinism replay' to keep terminology aligned. No code changes; plan-only commit on chore/quality-automation-routines.

Luis85 · 2026-04-26T12:19:41Z

Addressed P1 finding in 10651dc:

P1 (line 671, .bench.ts non-discoverable): Renamed tests/determinism/replay.bench.ts → tests/determinism/replay.test.ts across the entire plan (file-structure table, Chunk 4 Files block, Steps 6.3/6.4/6.5/6.6/6.7/6.8/6.11). Dropped the stale '(actually a vitest test, not a bench…)' caveat — the rename moots it. Renamed Chunk 4 heading 'Determinism replay benchmark' → 'Determinism replay' for terminology consistency. Vitest run now picks the file up via the default .test glob; no test.include override needed.

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 10651dc612

chatgpt-codex-connector · 2026-04-26T12:22:22Z

+- [ ] **Step 5.5: Create the empty trend file**
+
+```bash
+touch docs/metrics/bundle-trend.jsonl


Create docs/metrics before touching trend file

Step 5.5 runs touch docs/metrics/bundle-trend.jsonl, but this repository does not contain a docs/metrics/ directory, so the command fails with No such file or directory and blocks the row from being completed as written. Add a preceding mkdir -p docs/metrics (or otherwise ensure the directory exists) so the plan is executable without manual deviation.

Counter-arg: this is a stale finding. Step 5.5 in commit 2544067 (and current HEAD a02d414) already runs mkdir -p docs/metrics immediately before touch docs/metrics/bundle-trend.jsonl in the same fenced bash block (lines 575–576):

mkdir -p docs/metrics
touch docs/metrics/bundle-trend.jsonl

The 10651dc → 2544067 commit is the one that introduced the mkdir; the original 10651dc finding was correct, but this re-flag on 2544067/a02d414 misses the fix. No further change required.

Codex flagged Step 5.5 ('touch docs/metrics/bundle-trend.jsonl') as non-executable on a clean develop checkout because docs/metrics/ does not yet exist; touch does not create parent directories and fails with 'No such file or directory'. Inserted 'mkdir -p docs/metrics' before the touch and added an explicit 'git add docs/metrics/bundle-trend.jsonl' + 'git status --short' check afterwards so a Windows zero-byte staging quirk does not silently drop the new file from the commit. No code changes.

Luis85 · 2026-04-26T12:25:16Z

Addressed P1 finding in 2544067:

P1 (line 572, missing parent dir): Step 5.5 now runs mkdir -p docs/metrics before touch docs/metrics/bundle-trend.jsonl. Added a follow-up git add + git status --short docs/metrics/ check to defend against Windows zero-byte staging quirks (some IDE git filters elide empty-file adds from the staged set).

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 254406747e

chatgpt-codex-connector · 2026-04-26T12:28:48Z

+      - run: npm run build
+      - name: Append snapshot row
+        env:
+          GITHUB_SHA: ${{ github.sha }}


Record checked-out SHA for bundle snapshot metadata

This step injects GITHUB_SHA from ${{ github.sha }} even though the job explicitly checks out develop. On schedule runs, GitHub sets github.sha to the default-branch tip, so in repositories where the default branch is not develop, the JSONL row will store a SHA that does not match the code that size-limit actually measured. That produces silently incorrect trend metadata and makes later regressions harder to trace; derive the SHA from the checked-out tree (git rev-parse) instead of the event context.

Codex flagged that the bundle-size-trend workflow injects GITHUB_SHA from ${{ github.sha }} while the checkout step pins ref: develop. On schedule events github.sha points at the default-branch tip; this repo's default is main, so the JSONL row would store a main SHA while size-limit actually measured the develop tree - silently wrong trend metadata. Replaced with an explicit 'Resolve checked-out SHA' step that runs 'git rev-parse HEAD' post-checkout and feeds its output to the append-snapshot step via steps.head.outputs.sha. Inline comment explains why github.sha is wrong here. No code changes.

Luis85 · 2026-04-26T12:32:24Z

Addressed P2 finding in a02d414:

P2 (line 618, github.sha vs checkout ref): Added a 'Resolve checked-out SHA' step that runs git rev-parse HEAD post-checkout and feeds its output to the append-snapshot step via steps.head.outputs.sha instead of ${{ github.sha }}. Inline comment in the YAML explains why github.sha is wrong here (default branch is main, checkout pins develop).

The companion P1 ('Create docs/metrics before touching trend file') was a stale re-flag — see thread reply on the line comment for the counter-arg. Step 5.5 already mkdirs the directory.

@codex review

chatgpt-codex-connector · 2026-04-26T12:34:29Z

Codex Review: Didn't find any major issues. Can't wait for the next one!

…uct-demo) PR #129 is the umbrella tracker for the pre-v1 demo evolution increment. Wave 0 of that increment is an atomic single-PR rename of examples/nurture-pet/ -> examples/product-demo/, with the rename PR sweeping every reference across scripts, Pages workflow, README, CI and tests. This plan touches the demo workspace in Task 8 (Playwright smoke), so the two PRs need an explicit coordination contract.

Choices:
(a) Pre-rename in this plan to anticipate Wave 0
(b) Stay on examples/nurture-pet/ and let the rename PR sweep us

Picked (b). Reasoning: develop currently contains only nurture-pet/; shipping a plan that references product-demo/ before Wave 0 merges would point at a path that does not exist, breaking Step 8.0 (path sanity check) and Step 8.4 (local Playwright run against preview server). A single mechanical sed in the Wave 0 PR converges everything; pre-renaming would force manual reconciliation in two places.

Changes:
- Architecture bullet now points at the demo workspace via that description rather than a hardcoded directory name; cross-refs the new coordination section.
- New 'Coordination with PR #129 (demo rename)' section between Out of scope and File structure: explains the rename, the path policy table, and the explicit sequencing rule (decide at start of Task 8; if Wave 0 has merged, merge develop into this branch and substitute paths; if Wave 0 has not merged, proceed verbatim and ping the Wave 0 PR when it opens). Cites MEMORY.md -> feedback_parallel_pr_plan_conflicts.md for the merge vs rebase preference.
- Task 8 gains a 'Demo path' callout block + a new Step 8.0 that greps origin/develop for the demo dir name and tells the worker whether to substitute.
- Risk register: new bullet 'Demo rename in flight (PR #129 / Wave 0)' warning specifically against shipping examples/product-demo/ paths from this branch before Wave 0 lands.

Plan-only commit; no library or workflow code touched.
Path strings remain examples/nurture-pet/ literal across all eight rows by design.

Luis85 · 2026-04-26T12:39:25Z

Cross-PR coordination added in 3e52bb7 for the demo rename in PR #129.

PR #129 introduces a Wave 0 atomic rename of examples/nurture-pet/ → examples/product-demo/. Task 8 of this plan (Playwright smoke) touches the demo workspace, so the two need an explicit contract.

Decision: keep all eight rows on examples/nurture-pet/ literal paths. develop currently contains only nurture-pet/, so pre-renaming here would point at a non-existent directory and break Step 8.0/8.4. A single mechanical sed in the Wave 0 rename PR converges everything.

Plan additions:

New 'Coordination with PR #129 (demo rename)' section between Out-of-scope and File structure: rename rationale, path-policy table, explicit sequencing rule (if Wave 0 merged first → merge develop into this branch + substitute paths; if Task 8 merged first → ping Wave 0 PR for sweep coverage).
Task 8: new 'Demo path' callout + Step 8.0 to grep origin/develop for the active demo dir before doing anything else.
Risk register: explicit bullet warning against shipping examples/product-demo/ paths before Wave 0 lands.

Cites MEMORY.md → feedback_parallel_pr_plan_conflicts.md for the merge-not-rebase preference on plan conflicts.

cc PR #129 maintainers — open to swapping (b) → (a) if owner prefers anticipating the rename, but that means PR #130 cannot merge until Wave 0 is in.

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3e52bb7206

chatgpt-codex-connector · 2026-04-26T12:42:35Z

+Argv: `--target <path>`. Reads stdin → JSON. Builds `{iso, sha,
+entries}`. Appends one line. Idempotent on identical content (skip if
+last line equals new line minus `iso`).


Preserve weekly bundle snapshots when payload is unchanged

The plan currently asks append-size-snapshot to skip appending when the last row matches except for iso, but this conflicts with the stated goal of a weekly trend file. On weeks where develop and bundle sizes are unchanged, the cron run would emit no new row, creating gaps in the time series and making weekly trend analysis impossible (you only record change events, not snapshots). Consider limiting dedupe to same-run retries (e.g., same date+sha) instead of dropping rows that differ only by timestamp.

chatgpt-codex-connector · 2026-04-26T12:42:35Z

+"mutation": "stryker run",
+"mutation:report": "open reports/mutation/mutation.html"


Use a cross-platform command for mutation report viewing

The proposed mutation:report script uses open, which is macOS-specific. Contributors running Linux or Windows will get a command-not-found failure when following this plan, even though the rest of the document explicitly tries to stay cross-platform (for example, Windows-compatible argv handling in Task 6). Use a portable launcher (or document OS-specific alternatives) so the scripted workflow works across supported dev environments.
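
One way to picture the "portable launcher" option is a small branch on the platform name. This is a sketch only: `reportOpenCommand` is a hypothetical helper, the command strings are meant for a shell, and it covers just the three platforms named in this thread.

```javascript
// Hypothetical helper: returns the shell command that would open `file`
// with the default viewer on the given Node platform string.
function reportOpenCommand(platform, file) {
  const quoted = `"${file}"`;
  switch (platform) {
    case 'darwin':
      return `open ${quoted}`;
    case 'win32':
      // The empty "" is the window-title argument that `start` expects
      // before a quoted path.
      return `start "" ${quoted}`;
    default:
      // Assumes a Linux desktop with xdg-utils installed.
      return `xdg-open ${quoted}`;
  }
}

console.log(reportOpenCommand(process.platform, 'reports/mutation/mutation.html'));
```

Wiring this into an npm script would mean one more script file to maintain, which is the trade-off a plan has to weigh against simply documenting the per-OS commands.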

Restructures the plan from one monolithic 1388-line file under one big PR into an umbrella-tracker pattern matching PR #129:

- The original docs/plans/2026-04-26-quality-automation-routines.md is now the umbrella tracker (~268 lines): role, goals, coordination with PR #129, tracker table, downstream-PR contract, shared resolve_action_sha helper, risk register, archive policy. No implementation steps live in this file anymore.
- Eight new self-contained chunk plans under docs/plans/:
  - 2026-04-26-quality-codeql.md (~158 lines, row 1)
  - 2026-04-26-quality-dep-triage-bot.md (~141 lines, row 2)
  - 2026-04-26-quality-actions-bump-bot.md (~100 lines, row 3)
  - 2026-04-26-quality-plan-recon-bot.md (~108 lines, row 4)
  - 2026-04-26-quality-bundle-trend.md (~230 lines, row 5)
  - 2026-04-26-quality-determinism-replay.md (~252 lines, row 6)
  - 2026-04-26-quality-mutation-testing.md (~175 lines, row 7)
  - 2026-04-26-quality-demo-smoke.md (~263 lines, row 8)

Each chunk plan stands alone: Files block, full step list with bite-sized TDD where applicable, acceptance criteria, and a per-chunk Tracks: #130 + tracker-row-tick contract.

Why split now:
- Reviewer cost. After three Codex passes the monolithic plan had grown past 1390 lines and review latency was dominating the iteration loop.
- Parallelism. Eight chunk plans can be picked up by eight independent agents/sessions concurrently from origin/develop, all ticking back into the same umbrella tracker.
- Bounded blast radius. Each row's downstream PR is small enough to Codex-review fast, revert cleanly if needed, and ship without blocking the others.

Downstream PR contract codified in the umbrella section 'Downstream PR contract': branch off develop (NOT off this tracker branch), include 'Tracks: #130' body line, tick the tracker row in the same diff (no follow-up tracker-update PRs), pass npm run verify, pin every new GHA uses: ref via the umbrella's resolve_action_sha helper, no changeset (tooling-only).

The PR #129 demo-rename coordination is preserved in both the umbrella plan AND the demo-smoke chunk's own coordination section. Shared resolve_action_sha helper lives once in the umbrella; chunks link to it instead of duplicating. This commit is plan-restructure only: no library, workflow, or script code is added by it. Each downstream chunk PR delivers its own implementation.

Luis85 · 2026-04-26T12:50:10Z

Restructured PR #130 from monolithic plan into umbrella tracker + 8 chunk plans (commit 16f23ac).

What changed:

Original plan compacted to umbrella tracker (~268 lines): goal, architecture, coordination with PR #129, tracker table, downstream-PR contract, shared resolve_action_sha helper, risk register.
8 new self-contained chunk plans under docs/plans/2026-04-26-quality-*.md, one per shippable row. Each carries its own Files block, full step list (TDD where applicable), acceptance criteria, and Tracks: #130 contract.
PR body rewritten to umbrella form (mirrors PR #129's shape): document index, downstream PR contract, tracker table, sequencing notes.
PR title updated.

Downstream PR pattern (per umbrella section 'Downstream PR contract'):

Branch off develop (NOT this tracker branch).
Body line Tracks: #130.
Tick the tracker row in the same diff (no follow-up tracker-update PRs).
Pass npm run verify + pin GHA refs via peel-aware helper + skip changeset.

This PR stays draft until every row in the tracker table is [x]. After merge, the 9 plan files git mv to docs/archive/plans/ in a single follow-up commit on develop.

@codex review

P2 #1 (bundle-trend chunk, line 605 of pre-split plan): the dedupe policy 'skip if last line equals new line minus iso' would drop weeks where bundle sizes are unchanged, turning the JSONL into a change-log instead of a snapshot time series and breaking weekly trend analysis.
Fix: dedupe only on (sha, calendar-date) tuple. Re-runs of the same cron firing on the same commit + same UTC day are no-ops; identical payload week-over-week with a new sha still appends a row. Added two new tests to lock in the semantics:
- dedupes a same-day same-sha re-run (workflow_dispatch retry)
- appends a new row when entries are unchanged but sha differs
Updated the implementation pseudocode in Step 3 to show the (sha, date) tuple comparison, and the Acceptance criteria to call out the dedupe-policy assertions explicitly.

P2 #2 (mutation chunk, line 982 of pre-split plan): the proposed 'mutation:report': 'open reports/mutation/mutation.html' npm script is macOS-only - 'open' is not on PATH on Linux (xdg-open) or Windows (start), so contributors get 'command not found' when following the plan.
Fix: drop the convenience script entirely. The HTML report is the deliverable; CI uploads it as a 30-day artifact (Step 6) so reviewers consume it from the GitHub UI anyway. Added a per-OS command block to the plan documenting macOS / Linux / Windows incantations for contributors who want to open the file locally. Cross-platform launcher (open-cli or process.platform branch) would buy ~3 keystrokes for one extra devDep or script file - not worth it for this surface.

Plan-only commits in two of the eight chunk plans; umbrella + remaining 6 chunks unchanged.

Luis85 · 2026-04-26T12:52:44Z

Addressed both P2 findings on commit 3e52bb7 in 1b3d6e5 (post-split, applied to the relevant chunk plans):

P2 #1 (bundle-trend dedupe over-broad): Changed the dedupe policy from 'skip if last line equals new line minus iso' to 'skip only if (sha, calendar-date) tuple matches'. Identical bundle payload week-over-week with a different sha now correctly appends a new row, preserving the time series. Added two tests in scripts/append-size-snapshot.test.mjs to lock the semantics: same-day-same-sha retry is a no-op; identical-payload + new-sha appends. Acceptance criteria updated.
P2 #2 (open reports/... not cross-platform): Dropped the mutation:report npm script entirely (it was macOS-only; open fails on Linux/Windows). The HTML report is the deliverable; CI uploads it as the mutation-report artifact. Added a per-OS command block (macOS / Linux / Windows) for local viewing.
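
The (sha, calendar-date) dedupe rule can be sketched as below. This is not the plan's scripts/append-size-snapshot implementation: `shouldAppend` and `appendSnapshot` are hypothetical names, the row shape `{iso, sha, entries}` is taken from the plan text, and `iso` is assumed to be a UTC ISO 8601 timestamp so its first ten characters are the calendar date.

```javascript
// Sketch only; assumes `iso` is a UTC ISO 8601 string (e.g. from toISOString()).
function utcDate(iso) {
  return iso.slice(0, 10); // YYYY-MM-DD
}

function shouldAppend(lastLine, row) {
  if (!lastLine) return true; // empty trend file: always record
  const last = JSON.parse(lastLine);
  // Skip only a same-commit, same-UTC-day re-run; anything else is a snapshot.
  return !(last.sha === row.sha && utcDate(last.iso) === utcDate(row.iso));
}

// Takes the current JSONL text and returns the text after the append decision.
function appendSnapshot(jsonl, row) {
  const lines = jsonl.split('\n').filter((line) => line.trim() !== '');
  const lastLine = lines[lines.length - 1];
  if (!shouldAppend(lastLine, row)) return jsonl;
  return lines.concat(JSON.stringify(row)).join('\n') + '\n';
}

let trend = '';
trend = appendSnapshot(trend, { iso: '2026-04-26T06:00:00.000Z', sha: 'aaa', entries: [] });
trend = appendSnapshot(trend, { iso: '2026-04-26T09:30:00.000Z', sha: 'aaa', entries: [] }); // retry, skipped
trend = appendSnapshot(trend, { iso: '2026-05-03T06:00:00.000Z', sha: 'bbb', entries: [] }); // new week, kept
console.log(trend.trim().split('\n').length);
```

Comparing only the last line keeps the check cheap, at the cost of not catching a duplicate that is separated from its twin by an intervening row.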

Both changes plan-only; touch docs/plans/2026-04-26-quality-bundle-trend.md + docs/plans/2026-04-26-quality-mutation-testing.md. Umbrella + 6 other chunks unchanged.

@codex review

chatgpt-codex-connector · 2026-04-26T12:56:11Z

Codex Review: Didn't find any major issues. What shall we delve into next?

Adds a long-lived GitHub Issue (#131) as the durable tracker that survives PR merges. PR #130 ships the plan files; the issue tracks every downstream chunk PR via auto-flipping task list and stays open until the plans are archived to docs/archive/plans/.

Changes:
- Umbrella plan header gains a 'Durable tracker' callout linking Issue #131 with the close-when criteria.
- Umbrella's downstream-PR contract section now requires BOTH body lines on chunk PRs: Tracks: #130 (planning PR, supplies the chunk plan) and Tracks: #131 (durable issue tracker, auto-ticks on merge).
- All 8 chunk plan headers updated to cite both #130 (umbrella plan) and #131 (durable issue tracker).
- All 8 chunks' embedded 'gh pr create --body ...' blocks now emit both Tracks lines so worker agents inherit the contract verbatim.

The Issue itself was created via gh issue create. It captures:
- Origin (the 2026-04-26 'are there more things we can or should automate' question).
- Resulting plan layout (umbrella + 8 chunks).
- Auto-flipping task list mirroring the umbrella tracker table.
- Same coordination with #129 + same risk register summary.
- Closes-when criteria: every chunk PR merged + plans archived.

PR #130 body also rewritten to point at #131 and explain the PR-vs-issue split (PR is the planning surface and closes; issue is the durable surface and survives). Plan-only commit. No code, no workflow, no script changes.

Luis85 · 2026-04-26T12:56:46Z

Per owner request, set up Issue #131 as the durable, long-lived tracker that survives this PR's merge.

Split of responsibilities:

- PR #130 (this PR): planning surface. Ships the umbrella + 8 chunk plans. Closes after squash-merge.
- Issue #131 (Quality automation increment tracker): durable surface. Captures the original motivation ('are there more things we can or should automate'), the 8-row task list, coordination with PR #129 (Add Pre-v1 Demo Evolution Plan and product-demo rename strategy), the risk register, and the close-when criteria. Auto-flips its task entries when chunk PRs merge. Stays open until the plans are archived to docs/archive/plans/.

Wiring (commit 66676e8):

- Umbrella plan header gains a 'Durable tracker' callout linking Issue #131 + close-when criteria.
- Downstream-PR contract now requires BOTH Tracks: #130 AND Tracks: #131 body lines on every chunk PR.
- All 8 chunk plan headers cite both #130 (umbrella) and #131 (durable).
- All 8 chunks' embedded gh pr create --body ... blocks emit both Tracks lines so worker agents inherit the contract verbatim.
- PR #130 body rewritten to explain the PR-vs-issue split.

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 66676e823f

chatgpt-codex-connector · 2026-04-26T13:02:00Z

+npm run build && npm run preview &
+PREVIEW_PID=$!
+npx playwright test --headed --project=chromium


Start preview synchronously before launching Playwright

The Step 4 command npm run build && npm run preview & backgrounds the entire build && preview chain, so npx playwright test can start while the preview server is still building (or before it binds the port). In that timing window, Playwright may try to spawn its own webServer command or fail to connect, making this selector-wiring step flaky and non-deterministic for contributors following the plan.

Codex flagged a race condition in the demo-smoke chunk's local selector-wiring step:

    cd examples/nurture-pet
    npm run build && npm run preview &
    PREVIEW_PID=$!
    npx playwright test --headed --project=chromium
    kill $PREVIEW_PID

Backgrounding the 'build && preview' chain forks the shell immediately, so 'npx playwright test' can launch while vite is still building (or before it has bound port 4173). Selector wiring becomes flaky for contributors following the plan.

Fix: rely on the webServer block already declared in the Step 2 playwright.config.ts (it spawns 'npm run preview ...' AND polls http://127.0.0.1:4173 until it responds before sending any test traffic, with reuseExistingServer: !process.env.CI to support 'leave dev server open in another terminal' workflows).

Step 4 now:

    cd examples/nurture-pet
    npm run build
    npx playwright test --headed --project=chromium
    cd -

Build synchronously first because the preview server needs dist/. Then hand control to playwright; it manages the server lifecycle. Removes the PREVIEW_PID dance + the kill at the end.

No code changes; demo-smoke chunk plan only.

Luis85 · 2026-04-26T13:03:21Z

Addressed P2 finding in 26ef13d:

P2 (demo-smoke Step 4, line 133, race on preview startup): Backgrounding npm run build && npm run preview & returns as soon as the shell forks the chain, not when port 4173 is bound — npx playwright test could fire before vite was listening, making the selector-wiring loop flaky. Replaced with a synchronous npm run build followed by npx playwright test --headed and let the existing webServer block in playwright.config.ts (Step 2) own the server lifecycle. The webServer config already polls the URL until it responds and supports reuseExistingServer: !process.env.CI for 'leave dev server open' workflows. Dropped the PREVIEW_PID / kill dance entirely.
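For readers who have not seen the Step 2 config, the following is a minimal sketch of a `webServer` block with the properties described above. It is not the plan's actual file: the preview command in the plan carries extra arguments that are elided in this thread, and the `timeout` value is a hypothetical choice.

```typescript
import { defineConfig } from "@playwright/test";

export default defineConfig({
  webServer: {
    // Assumption: the plan's real command has additional arguments not shown in this thread.
    command: "npm run preview",
    // Playwright polls this URL and starts sending test traffic only once it responds.
    url: "http://127.0.0.1:4173",
    // Locally, attach to a preview server that is already running; in CI, always spawn one.
    reuseExistingServer: !process.env.CI,
    // Hypothetical startup budget in milliseconds; not taken from the plan.
    timeout: 60_000,
  },
});
```

With a block like this in place, `npm run build` followed by `npx playwright test` needs no manual backgrounding, because Playwright starts the server, waits for it, and stops it after the run.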

@codex review

chatgpt-codex-connector · 2026-04-26T13:07:23Z

Codex Review: Didn't find any major issues. Bravo.

Tracks: #130
Tracks: #131

## Summary

Adds a CodeQL JS/TS scan that runs on every push and PR to develop/main plus a Mondays-06:00-UTC weekly cron, using the `security-and-quality` query suite. All actions pinned to 40-char commit SHAs per the repo's supply-chain rule. Ticks row 1 of the umbrella tracker.

## Test plan

- [x] `npm run verify` green (523/523 tests, lint/typecheck/build all pass)
- [x] `actionlint` clean on `.github/workflows/codeql.yml`
- [x] Action SHAs resolved via the umbrella's `resolve_action_sha` peel-aware helper (annotated tags handled)
- [ ] Post-merge: confirm `CodeQL / Analyze (javascript-typescript)` job runs green on the merge push to `develop`

Co-authored-by: Luis Mendez <hallo@luis-mendez.de>

Tracks: #130
Tracks: #131

Adds the weekly Dependabot triage cloud-routine prompt + README scaffold under `docs/dep-triage-bot/`. Mirrors the existing `docs/review-bot/` and `docs/docs-review-bot/` shape (system prompt + README + rolling tracker issue, idempotency via canonical SHA list in issue body).

## What's in the diff

- `docs/dep-triage-bot/PROMPT.md`: system prompt covering role, scope, triage policy table (dev-deps minor/patch auto-merge; runtime minor/patch approval-comment; majors + peer-deps owner-only), hard rules, output format (rolling tracker + per-PR table), idempotency, dry-run guard, failure handling.
- `docs/dep-triage-bot/README.md`: routine setup, sinks, iteration workflow, known tradeoffs.
- `.github/dependabot.yml`: added `npm-non-major` group block to BOTH npm ecosystem entries (root + `examples/nurture-pet`) so minor + patch updates land as one weekly bundled PR per ecosystem instead of N. Replaced the prior `dev-dependencies` group on the root entry; the new group covers both runtime + dev minors/patches under the same bundling policy. Major bumps remain ungrouped (one PR per major, owner-reviewed). GitHub Actions ecosystem grouping (`actions:`) is unchanged.
- `docs/plans/2026-04-26-quality-automation-routines.md`: ticked row 2 (this row) to `- [x] shipped` per the umbrella's tick rule.

## Out-of-repo follow-up (does NOT gate merge)

Once this PR lands, the owner needs to:

1. Create the rolling tracker issue `Dependency triage — develop` with label `dep-triage-bot` and the seed body documented in `docs/dep-triage-bot/README.md` setup checklist.
2. Schedule the cloud routine (Claude Cloud) at `0 8 * * 1` (Monday 08:00 UTC, one hour after Dependabot fires) using the wrapper prompt in the README.
3. Ensure repo allows auto-merge (`Settings → General → Pull Requests → Allow auto-merge`) so the dev-deps minor/patch path can actually `gh pr merge --auto --squash`.

All three are spelled out in the README's Setup checklist.

## Verification

- `npm run verify`: green locally on this branch.
- Doc-only diff (plus a config grouping change in `.github/dependabot.yml`); no `src/**`, no changeset.

Ticks row 2 of the umbrella tracker.

---------

Co-authored-by: Luis Mendez <hallo@luis-mendez.de>
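The triage policy table summarized in this commit message can be read as a small decision function. The sketch below is only an illustration of that table; the type names and the `triage` function are hypothetical and do not come from `PROMPT.md`, which expresses the policy in prose for the cloud routine.

```typescript
// Hypothetical labels for illustration; the real prompt describes these in prose.
type DepKind = "dev" | "runtime" | "peer";
type Bump = "patch" | "minor" | "major";
type TriageAction = "auto-merge" | "approval-comment" | "owner-only";

// Mirrors the policy as stated in the commit message:
//   dev-deps minor/patch     -> auto-merge
//   runtime minor/patch      -> approval comment
//   majors and peer-deps     -> owner-only
function triage(kind: DepKind, bump: Bump): TriageAction {
  if (bump === "major" || kind === "peer") {
    return "owner-only";
  }
  return kind === "dev" ? "auto-merge" : "approval-comment";
}
```

Checking the owner-only cases first keeps the function safe by default: anything that is a major bump or a peer dependency never reaches the auto-merge branch.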

Tracks: #130
Tracks: #131

Adds a Sundays-04:00-UTC weekly workflow that snapshots npx size-limit --json for the develop tip into docs/metrics/bundle-trend.jsonl. SHA is resolved from the checked-out tree (git rev-parse HEAD), NOT github.sha (which would point at the default-branch tip on schedule events). Dedupe policy: same (sha, calendar-date) re-run is a no-op; identical-payload week-over-week with a new sha appends a new row. Ticks row 5 of the umbrella tracker.
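The dedupe policy can be sketched as a pure function over the JSONL text. This is a minimal illustration under stated assumptions: the `TrendRow` shape and the `appendSnapshot` name are hypothetical, since the real row schema of `bundle-trend.jsonl` is not shown in this thread.

```typescript
// Hypothetical row shape; the actual schema is not shown in this thread.
interface TrendRow {
  sha: string; // commit resolved from the checked-out tree (git rev-parse HEAD)
  date: string; // calendar date, e.g. "2026-04-26"
  payload: unknown; // parsed output of `npx size-limit --json`
}

// Returns the new file contents. A re-run with the same (sha, date) pair
// returns the input unchanged; any other row is appended, even when its
// payload is identical to the previous week's.
function appendSnapshot(existingJsonl: string, row: TrendRow): string {
  const rows = existingJsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as TrendRow);

  const alreadyRecorded = rows.some(
    (r) => r.sha === row.sha && r.date === row.date,
  );
  if (alreadyRecorded) {
    return existingJsonl;
  }

  // Make sure the existing content ends with a newline before appending.
  const prefix =
    existingJsonl === "" || existingJsonl.endsWith("\n")
      ? existingJsonl
      : existingJsonl + "\n";
  return prefix + JSON.stringify(row) + "\n";
}
```

Keying on (sha, date) rather than on the payload is what lets an unchanged bundle still produce a weekly data point, so the trend line has no gaps.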

Tracks: #130
Tracks: #131

Adds the monthly plan-reconciliation cloud-routine prompt + README scaffold under docs/plan-recon-bot/. Distinct from docs-review-bot (which audits prose drift); this routine answers 'is this plan done?' and archives finished plans via git mv. Failure issues land under the plan-recon-bot label per the umbrella's cloud-routine output convention. Ticks row 4 of the umbrella tracker.

---------

Co-authored-by: Luis Mendez <hallo@luis-mendez.de>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Tracks: #130
Tracks: #131

Adds the weekly action-SHA-bump cloud-routine prompt + README scaffold under docs/actions-bump-bot/. The routine wraps the existing scripts/bump-actions.mjs (which already peels annotated tags via tagToCommitSha) into a weekly PR-opening workflow. Failure issues land under the actions-bump-bot label per the umbrella's cloud-routine output convention. Ticks row 3 of the umbrella tracker.

---------

Co-authored-by: Luis Mendez <hallo@luis-mendez.de>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
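Peeling matters because, for an annotated tag, the ref returned by GitHub's `git/refs/tags/<tag>` endpoint points at a tag object rather than at the commit, so its `.object.sha` is not the SHA to pin. The sketch below shows one way that lookup could work. It is not the repo's `tagToCommitSha` or `resolve_action_sha`; the function name, the injected `getJson` fetcher, and the hop limit are all assumptions made for illustration.

```typescript
// Minimal shape shared by the Git-refs and Git-tags API responses.
interface GitObject {
  type: string; // "commit" for lightweight tags, "tag" for annotated tags
  sha: string;
}
interface ObjectResponse {
  object: GitObject;
}

// Hypothetical fetcher, injected so the sketch does not depend on a specific HTTP client.
type GetJson = (path: string) => Promise<ObjectResponse>;

async function resolveTagToCommit(
  repo: string, // "owner/name"
  tag: string,
  getJson: GetJson,
): Promise<string> {
  let { object } = await getJson(`repos/${repo}/git/refs/tags/${tag}`);

  // Follow tag objects until a commit is reached (tags can point at other tags).
  let hops = 0;
  while (object.type === "tag") {
    hops += 1;
    if (hops > 5) {
      throw new Error(`too many nested tag objects for ${repo}@${tag}`);
    }
    ({ object } = await getJson(`repos/${repo}/git/tags/${object.sha}`));
  }

  if (object.type !== "commit") {
    throw new Error(`${repo}@${tag} resolves to a ${object.type}, not a commit`);
  }
  return object.sha;
}
```

In a real script, `getJson` could wrap `gh api` or `fetch`; injecting it keeps the peeling logic testable without network access.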

chatgpt-codex-connector Bot reviewed Apr 26, 2026

Luis85 changed the title ~~chore(quality): add quality-automation routines (CodeQL, mutation, Playwright, …)~~ chore(quality): umbrella tracker for quality-automation increment (8 chunks) Apr 26, 2026

Luis85 mentioned this pull request Apr 26, 2026

Quality automation increment — tracker #131

Open

8 tasks

chatgpt-codex-connector Bot reviewed Apr 26, 2026

Luis85 marked this pull request as ready for review April 26, 2026 13:07

Luis85 merged commit d52e10c into develop Apr 26, 2026
16 checks passed

Luis85 deleted the chore/quality-automation-routines branch April 26, 2026 13:15

Luis85 mentioned this pull request Apr 26, 2026

ci(codeql): add weekly + push CodeQL JS/TS scan #133

Merged

4 tasks

Luis85 mentioned this pull request Apr 26, 2026

docs(routine): add weekly Dependabot triage prompt #136

Merged

This was referenced Apr 26, 2026

docs(routine): add weekly action-SHA bump prompt #142

Merged

docs(routine): add monthly plan reconciliation prompt #143

Merged

ci(metrics): weekly bundle-size trend snapshot #144

Merged

Luis85 added the roadmap:quality-baseline Quality automation: CodeQL, Stryker, determinism replay, demo smoke, review-finding fixes label May 5, 2026

		gh api repos/actions/checkout/git/refs/tags/v6.0.2 --jq '.object.sha'
		gh api repos/github/codeql-action/git/refs/tags/v4 --jq '.object.sha'

		- Create: `tests/determinism/replay.bench.ts` *(actually a vitest
		test, not a bench — using `.bench.ts` would skew CI timing)*

		"mutation": "stryker run",
		"mutation:report": "open reports/mutation/mutation.html"
