chore(quality): umbrella tracker for quality-automation increment (8 chunks) #130

Merged: Luis85 merged 10 commits into develop from chore/quality-automation-routines on Apr 26, 2026

Luis85 (Owner) commented Apr 26, 2026

Role — umbrella tracker for the quality-automation increment

This PR is the umbrella tracker (planning side) for the
quality-automation increment. It stays draft until every
downstream chunk PR has merged. Implementation lives in those
downstream PRs, cut from develop.

Durable tracker: Issue #131.
The issue is the long-lived record of this increment — it survives
PR merges. This PR ships the plan files; the issue auto-flips its
task list as each downstream chunk PR merges and stays open until
the plans are archived.

Documents in this PR

Downstream PR contract

Every PR cut from these chunk plans MUST:

  1. Branch off develop (NOT off this tracker's branch
    chore/quality-automation-routines). Worktree under
    .worktrees/<branch-slug> per CLAUDE.md.
  2. Include BOTH body lines:
    • Tracks: #130 (planning PR — supplies the chunk plan).
    • Tracks: #131 (durable issue tracker — auto-ticks on merge).
  3. Tick its row in the umbrella plan's tracker table in the same
    diff (no follow-up "tracker update" PRs).
  4. Pass npm run verify locally before opening.
  5. Pin every new GHA uses: reference to a 40-char commit SHA via
    the umbrella's peel-aware resolve_action_sha helper — never
    gh api .../refs/tags/<tag> --jq '.object.sha' directly.
  6. Skip changesets (tooling-only PRs, pre-1.0).
  7. Address Codex review per
    MEMORY.md → feedback_pr_codex_polling.md /
    feedback_codex_signal_endpoints.md.

Tracker — downstream PRs (auto-flips on merge)

Add - [ ] #NNN — short description rows here as downstream PRs
are opened. Issue #131 carries the parallel task list as the durable
record.

  • (no downstream PRs opened yet)

Coordination with PR #129

Wave 0 of PR #129
atomically renames examples/nurture-pet/ → examples/product-demo/.
Only row 8 (demo smoke) touches the demo workspace. The umbrella
plan documents the sequencing rule: the demo-smoke chunk uses
examples/nurture-pet/ literal paths today; if Wave 0 lands first,
the chunk substitutes mechanically before opening its PR.

Why an umbrella + a durable issue, not one big PR

Earlier drafts of this increment bundled all eight rows on a single
branch. After three Codex review cycles the plan had grown past
1390 lines and review latency was the dominant cost. Splitting into
an umbrella + 8 chunk plans:

  • Lets independent agents work rows in parallel.
  • Keeps each Codex review pass narrowly scoped.
  • Bounds blast radius if any single row needs revert.

The PR is the planning surface — it ships the plan files and
closes once they land. The issue (#131) is the durable surface
— it tracks every chunk PR, captures the original motivation, and
stays open until the plans are archived.

When this PR merges

When every chunk PR is merged into develop:

  1. Mark this PR ready-for-review.
  2. Squash-merge.
  3. After merge, git mv docs/plans/2026-04-26-quality-*.md docs/archive/plans/ in a single follow-up chore(docs) commit
    on develop (the only doc move that ships outside its
    originating PR — the originating PR here IS the tracker).
  4. Close issue Quality automation increment — tracker #131 once the archive sweep lands.

🤖 Generated with Claude Code

Adds an actionable plan covering 8 quality / supply-chain / drift
automation surfaces (CodeQL, Dependabot triage prompt, action-SHA bump
prompt, plan-reconciliation prompt, bundle-size trend, determinism
replay, mutation testing, demo Playwright smoke). Stale-branch cleaner
intentionally excluded.

This commit lands only the plan; subsequent commits on this branch
implement each row top-to-bottom. PR opens draft and is flipped to
ready-for-review only after every checkbox in the plan is ticked and
'npm run verify' is green.

Luis85 commented Apr 26, 2026

@codex review

@chatgpt-codex-connector

Summary

Testing

  • npm run verify
  • ⚠️ docker run --rm -v "$PWD":/repo -w /repo rhysd/actionlint:latest -color .github/workflows/codeql.yml (warning: docker is not installed in this environment)


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 516a2f4289

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +169 to +170
gh api repos/actions/checkout/git/refs/tags/v6.0.2 --jq '.object.sha'
gh api repos/github/codeql-action/git/refs/tags/v4 --jq '.object.sha'

P1: Resolve action tags to commit SHAs before pinning

Step 1.1 tells the worker to use gh api .../git/refs/tags/... --jq '.object.sha' as the workflow pin, but annotated tags can return a tag-object SHA instead of a commit SHA, which can make uses: owner/repo@<sha> invalid or non-reproducible. This repo’s own resolver in scripts/bump-actions.mjs already handles this by checking object.type and dereferencing /git/tags/<sha> to the underlying commit, so following this plan literally can generate broken action pins.
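The peel step described above can be sketched in JavaScript. This is a minimal illustration, not the repo's actual `tagToCommitSha` or the plan's Bash helper: `resolveActionSha` and the injected `ghApi` function are hypothetical names, and `ghApi` is assumed to take a REST path and return the parsed JSON body (for example by wrapping `gh api`). The sketch uses the single-ref endpoint `git/ref/tags/<tag>` and the `git/tags/<sha>` endpoint from the GitHub REST API.

```javascript
// Sketch only. `ghApi` is a hypothetical injected function: REST path in,
// parsed JSON out (e.g. a thin wrapper around the `gh api` CLI).
function resolveActionSha(repo, tag, ghApi) {
  // A lightweight tag points straight at a commit. An annotated tag points
  // at a tag object, which must be dereferenced ("peeled") to reach the commit.
  let { object } = ghApi(`repos/${repo}/git/ref/tags/${tag}`);
  let hops = 0;
  while (object.type === "tag") {
    // Guard against pathological chains of nested annotated tags.
    if (++hops > 5) throw new Error(`too many nested tags for ${repo}@${tag}`);
    ({ object } = ghApi(`repos/${repo}/git/tags/${object.sha}`));
  }
  if (object.type !== "commit") {
    throw new Error(`${repo}@${tag} resolves to a ${object.type}, not a commit`);
  }
  return object.sha; // safe to use in `uses: owner/repo@<sha>`
}
```

Injecting `ghApi` keeps the function testable without network access; a real caller would still need to handle API errors and rate limits.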

- [ ] **Step 6.5: Generate `baseline.json` once**

```bash
node --import tsx scripts/run-determinism-replay.mjs --write-baseline
```

P1: Add tsx before invoking node --import tsx

The determinism baseline command depends on tsx, but this plan does not include adding tsx to dependencies in the same row; on a clean checkout the command fails with ERR_MODULE_NOT_FOUND before baseline generation runs. That means step 6.5 (and the corresponding script wiring later in this plan) is not executable as written unless tsx is explicitly installed or the command is switched to a runtime that is already available.

Two P1 line-level findings on the initial plan commit
(516a2f4) addressed:

1. Annotated-tag SHA resolution. The original Step 1.1 told the worker
   to pin via 'gh api .../git/refs/tags/<tag> --jq .object.sha', which
   returns a tag-object SHA for annotated tags rather than a commit
   SHA - producing invalid 'uses: owner/repo@<sha>' references. Added a
   peel-aware 'resolve_action_sha' Bash helper to the verification-gate
   section (mirrors the logic already in scripts/bump-actions.mjs
   tagToCommitSha) and updated Step 1.1 to call it instead. Loud
   anti-pattern callout retained.

2. Missing 'tsx' runtime for determinism baseline. The original Step
   6.5 ran 'node --import tsx scripts/run-determinism-replay.mjs
   --write-baseline' but tsx was never added to devDeps - the command
   would fail ERR_MODULE_NOT_FOUND on a clean checkout. Dropped the
   harness script and the tsx requirement entirely; baseline write is
   now done by the same vitest test file via an argv flag passed
   through 'npx vitest -- --write-baseline'. Cross-platform via
   process.argv (no env-var prefix needed, no cross-env dep).
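The argv-flag approach from item 2 could look roughly like the sketch below. The function name `replayOutcome` and the result shape are hypothetical; only the `--write-baseline` flag name comes from the plan. Whether Vitest forwards arguments after `--` into `process.argv` inside a test file is an assumption that should be verified against the installed version.

```javascript
// Sketch only: decides what the replay test should do for one run.
// `actual` is the freshly computed replay result, `baseline` the stored one.
function replayOutcome(actual, baseline, argv = process.argv) {
  if (argv.includes("--write-baseline")) {
    // Baseline-write mode: the caller persists `value` instead of asserting.
    return { action: "write", value: actual };
  }
  // Compare mode. JSON.stringify is key-order sensitive, which is acceptable
  // only if both sides are produced by the same serialisation path.
  const same = JSON.stringify(actual) === JSON.stringify(baseline);
  return { action: same ? "pass" : "fail", value: baseline };
}
```

Reading the flag from an argument array rather than an environment variable is what avoids the `VAR=1 cmd` prefix, which does not work in Windows `cmd.exe` without a helper such as cross-env.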

Side effects: the file-structure table no longer lists
'scripts/run-determinism-replay.mjs'; the modified-files entry for
package.json drops '@playwright/test' (already documented as living
only in examples/nurture-pet/package.json) and the implicit tsx
mention. The risk-register baseline note now references the
'npm run determinism:baseline' command directly.

No code changes; plan-only commit on chore/quality-automation-routines.

Luis85 commented Apr 26, 2026

Addressed both P1 findings in 0adb8c5:

  • P1 #1 (line 170, SHA peel): Added peel-aware resolve_action_sha Bash helper to the verification-gate section (mirrors scripts/bump-actions.mjs#tagToCommitSha). Step 1.1 + every other SHA-resolve step now calls the helper. Loud anti-pattern callout retained against gh api .../refs/tags/<tag> --jq '.object.sha'.
  • P1 #2 (line 679, tsx missing): Dropped tsx + scripts/run-determinism-replay.mjs entirely. Baseline write moved into the same vitest file behind npx vitest -- --write-baseline (argv flag, cross-platform via process.argv; no env-var prefix or cross-env dep). File-structure table + Step 6.3/6.5/6.8/6.11 all updated to match.

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0adb8c5c77

Comment on lines +670 to +671
- Create: `tests/determinism/replay.bench.ts` *(actually a vitest
test, not a bench — using `.bench.ts` would skew CI timing)*

P1: Rename replay file to a runtime test suffix

Using replay.bench.ts here makes the determinism check non-runnable under the plan’s own commands: Vitest’s runtime test discovery defaults to .test/.spec patterns, while .bench.* files are benchmark-oriented (vitest bench). Because later steps call vitest run tests/determinism/replay.bench.ts, the replay assertions/baseline write path can be skipped or fail before executing any tests unless test.include is explicitly changed (not documented in this plan). Please switch the file to a test suffix (for example replay.test.ts) or add an explicit include override.
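To make the discovery rule concrete, here is a small sketch that classifies a file name the way the finding describes. The two regexes are hand-written approximations of Vitest's documented default globs (`**/*.{test,spec}.?(c|m)[jt]s?(x)` for tests and `**/*.{bench,benchmark}.?(c|m)[jt]s?(x)` for benchmarks); treat them as an assumption and check the docs for the installed version. `classify` is a hypothetical helper, not part of Vitest.

```javascript
// Approximate Vitest's default include patterns as regexes (assumption).
const TEST_FILE = /\.(test|spec)\.[cm]?[jt]sx?$/;
const BENCH_FILE = /\.(bench|benchmark)\.[cm]?[jt]sx?$/;

function classify(file) {
  if (TEST_FILE.test(file)) return "test";   // discovered by `vitest run`
  if (BENCH_FILE.test(file)) return "bench"; // discovered by `vitest bench`
  return "ignored";
}

console.log(classify("tests/determinism/replay.bench.ts")); // "bench"
console.log(classify("tests/determinism/replay.test.ts"));  // "test"
```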

Codex flagged that vitest's runtime test discovery defaults to
.test/.spec patterns; .bench.* files are routed to the benchmark
runner ('vitest bench') and are skipped by 'vitest run'. The plan's
own commands ('npx vitest run tests/determinism/replay.bench.ts') would
therefore be no-ops without an explicit test.include override.

Renamed the file across the file-structure table, Chunk 4 Files block,
Steps 6.3 / 6.4 / 6.5 / 6.6 / 6.7 / 6.8 / 6.11. Dropped the now-stale
parenthetical caveat at the Files block ('actually a vitest test, not
a bench — using .bench.ts would skew CI timing') since the rename
eliminates the underlying confusion. Renamed Chunk 4 heading from
'Determinism replay benchmark' to 'Determinism replay' to keep
terminology aligned.

No code changes; plan-only commit on chore/quality-automation-routines.

Luis85 commented Apr 26, 2026

Addressed P1 finding in 10651dc:

  • P1 (line 671, .bench.ts non-discoverable): Renamed tests/determinism/replay.bench.ts → tests/determinism/replay.test.ts across the entire plan (file-structure table, Chunk 4 Files block, Steps 6.3/6.4/6.5/6.6/6.7/6.8/6.11). Dropped the stale '(actually a vitest test, not a bench…)' caveat; the rename moots it. Renamed Chunk 4 heading 'Determinism replay benchmark' → 'Determinism replay' for terminology consistency. vitest run now picks the file up via the default .test glob; no test.include override needed.

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 10651dc612

- [ ] **Step 5.5: Create the empty trend file**

```bash
touch docs/metrics/bundle-trend.jsonl
```

P1: Create docs/metrics before touching trend file

Step 5.5 runs touch docs/metrics/bundle-trend.jsonl, but this repository does not contain a docs/metrics/ directory, so the command fails with No such file or directory and blocks the row from being completed as written. Add a preceding mkdir -p docs/metrics (or otherwise ensure the directory exists) so the plan is executable without manual deviation.

Luis85 (Owner, Author) replied:

Counter-arg: this is a stale finding. Step 5.5 in commit 2544067 (and current HEAD a02d414) already runs mkdir -p docs/metrics immediately before touch docs/metrics/bundle-trend.jsonl in the same fenced bash block (lines 575–576):

mkdir -p docs/metrics
touch docs/metrics/bundle-trend.jsonl

The 10651dc → 2544067 commit is the one that introduced the mkdir; the original 10651dc finding was correct, but this re-flag on 2544067/a02d414 misses the fix. No further change required.

Codex flagged Step 5.5 ('touch docs/metrics/bundle-trend.jsonl') as
non-executable on a clean develop checkout because docs/metrics/ does
not yet exist; touch does not create parent directories and fails with
'No such file or directory'.

Inserted 'mkdir -p docs/metrics' before the touch and added an explicit
'git add docs/metrics/bundle-trend.jsonl' + 'git status --short' check
afterwards so a Windows zero-byte staging quirk does not silently drop
the new file from the commit. No code changes.

Luis85 commented Apr 26, 2026

Addressed P1 finding in 2544067:

  • P1 (line 572, missing parent dir): Step 5.5 now runs mkdir -p docs/metrics before touch docs/metrics/bundle-trend.jsonl. Added a follow-up git add + git status --short docs/metrics/ check to defend against Windows zero-byte staging quirks (some IDE git filters elide empty-file adds from the staged set).

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 254406747e

- run: npm run build
- name: Append snapshot row
  env:
    GITHUB_SHA: ${{ github.sha }}

P2: Record checked-out SHA for bundle snapshot metadata

This step injects GITHUB_SHA from ${{ github.sha }} even though the job explicitly checks out develop. On schedule runs, GitHub sets github.sha to the default-branch tip, so in repositories where the default branch is not develop, the JSONL row will store a SHA that does not match the code that size-limit actually measured. That produces silently incorrect trend metadata and makes later regressions harder to trace; derive the SHA from the checked-out tree (git rev-parse) instead of the event context.

Codex flagged that the bundle-size-trend workflow injects GITHUB_SHA
from ${{ github.sha }} while the checkout step pins ref: develop. On
schedule events github.sha points at the default-branch tip; this
repo's default is main, so the JSONL row would store a main SHA while
size-limit actually measured the develop tree - silently wrong trend
metadata.

Replaced with an explicit 'Resolve checked-out SHA' step that runs
'git rev-parse HEAD' post-checkout and feeds its output to the
append-snapshot step via steps.head.outputs.sha. Inline comment
explains why github.sha is wrong here. No code changes.
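The same idea, expressed on the script side rather than in workflow YAML, might look like the sketch below. It is illustrative only: `resolveCheckedOutSha`, `snapshotMeta`, and the injected `runGit` runner are hypothetical names, and the 40-hex check assumes a SHA-1 repository.

```javascript
// Sketch only. `runGit` is a hypothetical injected runner: argument array in,
// stdout string out, e.g.
//   (args) => execFileSync("git", args, { encoding: "utf8" })
// using execFileSync from node:child_process.
function resolveCheckedOutSha(runGit) {
  // Ask the working tree what is actually checked out, instead of trusting
  // the event context (github.sha), which can point at another branch.
  const sha = runGit(["rev-parse", "HEAD"]).trim();
  if (!/^[0-9a-f]{40}$/.test(sha)) {
    throw new Error(`unexpected rev-parse output: ${JSON.stringify(sha)}`);
  }
  return sha;
}

// Builds the metadata half of a snapshot row: timestamp plus measured commit.
function snapshotMeta(runGit, now = new Date()) {
  return { iso: now.toISOString(), sha: resolveCheckedOutSha(runGit) };
}
```

Deriving the SHA from the checkout means the recorded commit always matches the tree that was measured, whichever event triggered the run.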

Luis85 commented Apr 26, 2026

Addressed P2 finding in a02d414:

  • P2 (line 618, github.sha vs checkout ref): Added a 'Resolve checked-out SHA' step that runs git rev-parse HEAD post-checkout and feeds its output to the append-snapshot step via steps.head.outputs.sha instead of ${{ github.sha }}. Inline comment in the YAML explains why github.sha is wrong here (default branch is main, checkout pins develop).

The companion P1 ('Create docs/metrics before touching trend file') was a stale re-flag — see thread reply on the line comment for the counter-arg. Step 5.5 already mkdirs the directory.

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Can't wait for the next one!

…uct-demo)

PR #129 is the umbrella tracker for the pre-v1 demo evolution increment.
Wave 0 of that increment is an atomic single-PR rename of
examples/nurture-pet/ -> examples/product-demo/, with the rename PR
sweeping every reference across scripts, Pages workflow, README, CI
and tests.

This plan touches the demo workspace in Task 8 (Playwright smoke), so
the two PRs need an explicit coordination contract. Choices:

(a) Pre-rename in this plan to anticipate Wave 0
(b) Stay on examples/nurture-pet/ and let the rename PR sweep us

Picked (b). Reasoning: develop currently contains only nurture-pet/;
shipping a plan that references product-demo/ before Wave 0 merges
would point at a path that does not exist, breaking Step 8.0 (path
sanity check) and Step 8.4 (local Playwright run against
preview server). A single mechanical sed in the Wave 0 PR converges
everything; pre-renaming would force manual reconciliation in two
places.

Changes:
- Architecture bullet now points at the demo workspace via that
  description rather than a hardcoded directory name; cross-refs the
  new coordination section.
- New 'Coordination with PR #129 (demo rename)' section between Out
  of scope and File structure: explains the rename, the path policy
  table, and the explicit sequencing rule (decide at start of Task 8;
  if Wave 0 has merged, merge develop into this branch and substitute
  paths; if Wave 0 has not merged, proceed verbatim and ping the Wave
  0 PR when it opens). Cites
  MEMORY.md -> feedback_parallel_pr_plan_conflicts.md for the merge
  vs rebase preference.
- Task 8 gains a 'Demo path' callout block + a new Step 8.0 that
  greps origin/develop for the demo dir name and tells the worker
  whether to substitute.
- Risk register: new bullet 'Demo rename in flight (PR #129 / Wave 0)'
  warning specifically against shipping examples/product-demo/ paths
  from this branch before Wave 0 lands.

Plan-only commit; no library or workflow code touched. Path strings
remain examples/nurture-pet/ literal across all eight rows by design.
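The Step 8.0 decision (which demo path is live on origin/develop) can be sketched as a pure function. `activeDemoDir` is a hypothetical name, and the input is assumed to be a list of directory paths under `examples/`, for instance collected from a `git ls-tree` listing of origin/develop; the plan's own step uses a grep instead.

```javascript
// Sketch only: picks the demo workspace path to use in Task 8.
// `paths` is assumed to be entries such as "examples/nurture-pet".
function activeDemoDir(paths) {
  // Normalise each entry to its final path segment, ignoring trailing slashes.
  const names = new Set(paths.map((p) => p.replace(/\/+$/, "").split("/").pop()));
  // If Wave 0 has landed, the renamed directory wins.
  if (names.has("product-demo")) return "examples/product-demo/";
  if (names.has("nurture-pet")) return "examples/nurture-pet/";
  throw new Error("no known demo workspace under examples/");
}
```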

Luis85 commented Apr 26, 2026

Cross-PR coordination added in 3e52bb7 for the demo rename in PR #129.

PR #129 introduces a Wave 0 atomic rename of examples/nurture-pet/ → examples/product-demo/. Task 8 of this plan (Playwright smoke) touches the demo workspace, so the two need an explicit contract.

Decision: keep all eight rows on examples/nurture-pet/ literal paths. develop currently contains only nurture-pet/, so pre-renaming here would point at a non-existent directory and break Step 8.0/8.4. A single mechanical sed in the Wave 0 rename PR converges everything.

Plan additions:

  • New 'Coordination with PR #129 (demo rename)' section between Out-of-scope and File structure: rename rationale, path-policy table, explicit sequencing rule (if Wave 0 merged first → merge develop into this branch + substitute paths; if Task 8 merged first → ping Wave 0 PR for sweep coverage).
  • Task 8: new 'Demo path' callout + Step 8.0 to grep origin/develop for the active demo dir before doing anything else.
  • Risk register: explicit bullet warning against shipping examples/product-demo/ paths before Wave 0 lands.

Cites MEMORY.md → feedback_parallel_pr_plan_conflicts.md for the merge-not-rebase preference on plan conflicts.

cc PR #129 maintainers — open to swapping (b) → (a) if owner prefers anticipating the rename, but that means PR #130 cannot merge until Wave 0 is in.

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3e52bb7206

Comment on lines +603 to +605
Argv: `--target <path>`. Reads stdin → JSON. Builds `{iso, sha,
entries}`. Appends one line. Idempotent on identical content (skip if
last line equals new line minus `iso`).

P2: Preserve weekly bundle snapshots when payload is unchanged

The plan currently asks append-size-snapshot to skip appending when the last row matches except for iso, but this conflicts with the stated goal of a weekly trend file. On weeks where develop and bundle sizes are unchanged, the cron run would emit no new row, creating gaps in the time series and making weekly trend analysis impossible (you only record change events, not snapshots). Consider limiting dedupe to same-run retries (e.g., same date+sha) instead of dropping rows that differ only by timestamp.

Comment on lines +981 to +982
"mutation": "stryker run",
"mutation:report": "open reports/mutation/mutation.html"

P2: Use a cross-platform command for mutation report viewing

The proposed mutation:report script uses open, which is macOS-specific. Contributors running Linux or Windows will get a command-not-found failure when following this plan, even though the rest of the document explicitly tries to stay cross-platform (for example, Windows-compatible argv handling in Task 6). Use a portable launcher (or document OS-specific alternatives) so the scripted workflow works across supported dev environments.

Restructures the plan from one monolithic 1388-line file under one big
PR into an umbrella-tracker pattern matching PR #129:

- The original docs/plans/2026-04-26-quality-automation-routines.md is
  now the umbrella tracker (~268 lines): role, goals, coordination
  with PR #129, tracker table, downstream-PR contract, shared
  resolve_action_sha helper, risk register, archive policy. No
  implementation steps live in this file anymore.

- Eight new self-contained chunk plans under docs/plans/:
    2026-04-26-quality-codeql.md              (~158 lines, row 1)
    2026-04-26-quality-dep-triage-bot.md      (~141 lines, row 2)
    2026-04-26-quality-actions-bump-bot.md    (~100 lines, row 3)
    2026-04-26-quality-plan-recon-bot.md      (~108 lines, row 4)
    2026-04-26-quality-bundle-trend.md        (~230 lines, row 5)
    2026-04-26-quality-determinism-replay.md  (~252 lines, row 6)
    2026-04-26-quality-mutation-testing.md    (~175 lines, row 7)
    2026-04-26-quality-demo-smoke.md          (~263 lines, row 8)

  Each chunk plan stands alone: Files block, full step list with
  bite-sized TDD where applicable, acceptance criteria, and a
  per-chunk Tracks: #130 + tracker-row-tick contract.

Why split now:

- Reviewer cost. After three Codex passes the monolithic plan had
  grown past 1390 lines and review latency was dominating the
  iteration loop.
- Parallelism. Eight chunk plans can be picked up by eight
  independent agents/sessions concurrently from origin/develop, all
  ticking back into the same umbrella tracker.
- Bounded blast radius. Each row's downstream PR is small enough to
  Codex-review fast, revert cleanly if needed, and ship without
  blocking the others.

Downstream PR contract codified in the umbrella section
'Downstream PR contract': branch off develop (NOT off this tracker
branch), include 'Tracks: #130' body line, tick the tracker row in
the same diff (no follow-up tracker-update PRs), pass npm run verify,
pin every new GHA uses: ref via the umbrella's resolve_action_sha
helper, no changeset (tooling-only).

The PR #129 demo-rename coordination is preserved in both the
umbrella plan AND the demo-smoke chunk's own coordination section.
Shared resolve_action_sha helper lives once in the umbrella; chunks
link to it instead of duplicating.

This commit is plan-restructure only — no library, workflow, or
script code is added by it. Each downstream chunk PR delivers its
own implementation.
Luis85 changed the title from "chore(quality): add quality-automation routines (CodeQL, mutation, Playwright, …)" to "chore(quality): umbrella tracker for quality-automation increment (8 chunks)" on Apr 26, 2026

Luis85 commented Apr 26, 2026

Restructured PR #130 from monolithic plan into umbrella tracker + 8 chunk plans (commit 16f23ac).

What changed:

  • Original plan compacted to umbrella tracker (~268 lines): goal, architecture, coordination with #129, tracker table, downstream-PR contract, shared resolve_action_sha helper, risk register.
  • 8 new self-contained chunk plans under docs/plans/2026-04-26-quality-*.md, one per shippable row. Each carries its own Files block, full step list (TDD where applicable), acceptance criteria, and Tracks: #130 contract.
  • PR body rewritten to umbrella form (mirrors #129's shape): document index, downstream PR contract, tracker table, sequencing notes.
  • PR title updated.

Downstream PR pattern (per umbrella section 'Downstream PR contract'):

  1. Branch off develop (NOT this tracker branch).
  2. Body line Tracks: #130.
  3. Tick the tracker row in the same diff (no follow-up tracker-update PRs).
  4. Pass npm run verify + pin GHA refs via peel-aware helper + skip changeset.

This PR stays draft until every row in the tracker table is [x]. After merge, the 9 plan files are moved with git mv to docs/archive/plans/ in a single follow-up commit on develop.

@codex review

P2 #1 (bundle-trend chunk, line 605 of pre-split plan): the dedupe
policy 'skip if last line equals new line minus iso' would drop
weeks where bundle sizes are unchanged, turning the JSONL into a
change-log instead of a snapshot time series and breaking weekly
trend analysis.

Fix: dedupe only on (sha, calendar-date) tuple. Re-runs of the same
cron firing on the same commit + same UTC day are no-ops; identical
payload week-over-week with a new sha still appends a row. Added two
new tests to lock in the semantics:

  - dedupes a same-day same-sha re-run (workflow_dispatch retry)
  - appends a new row when entries are unchanged but sha differs

Updated the implementation pseudocode in Step 3 to show the (sha,
date) tuple comparison, and the Acceptance criteria to call out the
dedupe-policy assertions explicitly.
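A minimal sketch of the (sha, calendar-date) dedupe rule follows. It is not the plan's actual pseudocode: `appendSnapshot` is a hypothetical name, rows are assumed to carry `iso` as a UTC ISO-8601 string (as produced by `Date.prototype.toISOString`), and the comparison is made against the last existing line only.

```javascript
// Sketch only: returns the JSONL lines after (possibly) appending `row`.
// `existingLines` is an array of JSON strings, one per prior snapshot.
function appendSnapshot(existingLines, row) {
  const utcDay = (iso) => iso.slice(0, 10); // "YYYY-MM-DD" prefix of a UTC ISO string
  const last = existingLines.length
    ? JSON.parse(existingLines[existingLines.length - 1])
    : null;
  // Same commit on the same UTC day means a retry of the same run: no-op.
  if (last && last.sha === row.sha && utcDay(last.iso) === utcDay(row.iso)) {
    return existingLines;
  }
  // Otherwise always append, even if `entries` is byte-for-byte unchanged,
  // so the file stays a time series rather than a change log.
  return [...existingLines, JSON.stringify(row)];
}
```

Keying on the tuple rather than on payload equality is what preserves one row per weekly run while still tolerating workflow_dispatch retries.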

P2 #2 (mutation chunk, line 982 of pre-split plan): the proposed
'mutation:report': 'open reports/mutation/mutation.html' npm script
is macOS-only - 'open' is not on PATH on Linux (xdg-open) or
Windows (start), so contributors get 'command not found' when
following the plan.

Fix: drop the convenience script entirely. The HTML report is the
deliverable; CI uploads it as a 30-day artifact (Step 6) so
reviewers consume it from the GitHub UI anyway. Added a per-OS
command block to the plan documenting macOS / Linux / Windows
incantations for contributors who want to open the file locally.
Cross-platform launcher (open-cli or process.platform branch) would
buy ~3 keystrokes for one extra devDep or script file - not worth
it for this surface.
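For reference, the `process.platform` branch that was weighed and rejected would be roughly the following. This is a sketch under assumptions, not something the plan ships: `reportOpenCommand` is a hypothetical name, and the report path is taken from the originally proposed npm script.

```javascript
// Sketch only: returns [command, args] for opening the mutation report
// with the platform's default handler.
function reportOpenCommand(platform = process.platform) {
  const file = "reports/mutation/mutation.html";
  switch (platform) {
    case "darwin":
      return ["open", [file]];
    case "win32":
      // `start` is a cmd.exe builtin; the empty string is its window-title argument.
      return ["cmd", ["/c", "start", "", file]];
    default:
      // Most Linux desktops provide xdg-open; headless machines may not.
      return ["xdg-open", [file]];
  }
}
```

The extra branch and its failure modes on headless machines illustrate why documenting per-OS commands was judged cheaper than maintaining a launcher.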

Plan-only commits in two of the eight chunk plans; umbrella +
remaining 6 chunks unchanged.

Luis85 commented Apr 26, 2026

Addressed both P2 findings on commit 3e52bb7 in 1b3d6e5 (post-split, applied to the relevant chunk plans):

  • P2 #1 (bundle-trend dedupe over-broad): Changed the dedupe policy from 'skip if last line equals new line minus iso' to 'skip only if (sha, calendar-date) tuple matches'. Identical bundle payload week-over-week with a different sha now correctly appends a new row, preserving the time series. Added two tests in scripts/append-size-snapshot.test.mjs to lock the semantics: same-day-same-sha retry is a no-op; identical-payload + new-sha appends. Acceptance criteria updated.

  • P2 #2 (open reports/... not cross-platform): Dropped the mutation:report npm script entirely (it was macOS-only; open fails on Linux/Windows). The HTML report is the deliverable; CI uploads it as the mutation-report artifact. Added a per-OS command block (macOS / Linux / Windows) for local viewing.

Both changes plan-only; touch docs/plans/2026-04-26-quality-bundle-trend.md + docs/plans/2026-04-26-quality-mutation-testing.md. Umbrella + 6 other chunks unchanged.

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. What shall we delve into next?

Adds a long-lived GitHub Issue (#131) as the durable tracker that
survives PR merges. PR #130 ships the plan files; the issue tracks
every downstream chunk PR via auto-flipping task list and stays open
until the plans are archived to docs/archive/plans/.

Changes:

- Umbrella plan header gains a 'Durable tracker' callout linking
  Issue #131 with the close-when criteria.
- Umbrella's downstream-PR contract section now requires BOTH body
  lines on chunk PRs:
    Tracks: #130   (planning PR — supplies the chunk plan)
    Tracks: #131   (durable issue tracker — auto-ticks on merge)
- All 8 chunk plan headers updated to cite both #130 (umbrella plan)
  and #131 (durable issue tracker).
- All 8 chunks' embedded 'gh pr create --body ...' blocks now emit
  both Tracks lines so worker agents inherit the contract verbatim.

The Issue itself was created via gh issue create. It captures:
- Origin (the 2026-04-26 'are there more things we can or should
  automate' question).
- Resulting plan layout (umbrella + 8 chunks).
- Auto-flipping task list mirroring the umbrella tracker table.
- Same coordination with #129 + same risk register summary.
- Closes-when criteria: every chunk PR merged + plans archived.

PR #130 body also rewritten to point at #131 and explain the
PR-vs-issue split (PR is the planning surface and closes; issue is
the durable surface and survives).

Plan-only commit. No code, no workflow, no script changes.
@Luis85
Owner Author

Luis85 commented Apr 26, 2026

Per owner request, set up Issue #131 as the durable, long-lived tracker that survives this PR's merge.

Split of responsibilities:

Wiring (commit 66676e8):

@codex review

chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 66676e823f

Comment on lines +131 to +133
npm run build && npm run preview &
PREVIEW_PID=$!
npx playwright test --headed --project=chromium

P2: Start preview synchronously before launching Playwright

The Step 4 command npm run build && npm run preview & backgrounds the entire build && preview chain, so npx playwright test can start while the preview server is still building (or before it binds the port). In that timing window, Playwright may try to spawn its own webServer command or fail to connect, making this selector-wiring step flaky and non-deterministic for contributors following the plan.

Codex flagged a race condition in the demo-smoke chunk's local
selector-wiring step:

  cd examples/nurture-pet
  npm run build && npm run preview &
  PREVIEW_PID=$!
  npx playwright test --headed --project=chromium
  kill $PREVIEW_PID

Backgrounding the 'build && preview' chain forks the shell
immediately, so 'npx playwright test' can launch while vite is
still building (or before it has bound port 4173). Selector wiring
becomes flaky for contributors following the plan.

Fix: rely on the webServer block already declared in the Step 2
playwright.config.ts (it spawns 'npm run preview ...' AND polls
http://127.0.0.1:4173 until it responds before sending any test
traffic, with reuseExistingServer: !process.env.CI to support
'leave dev server open in another terminal' workflows). Step 4 now:

  cd examples/nurture-pet
  npm run build
  npx playwright test --headed --project=chromium
  cd -

Build synchronously first because 'npm run preview' needs dist/.
Then hand control to playwright — it manages the server lifecycle.
Removes PREVIEW_PID dance + the kill at the end. No code changes;
demo-smoke chunk plan only.
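
For orientation, the webServer block the fix relies on might look roughly like the sketch below. This is not the Step 2 playwright.config.ts itself: it is written as a plain object so it stands alone, and the preview command is shown without the extra arguments the plan passes, because they are elided above.

```javascript
// Sketch only: a plain object holding the webServer slice of a Playwright
// config. In a real playwright.config.ts this object would be the default
// export (optionally wrapped in defineConfig from @playwright/test).
const config = {
  webServer: {
    // Assumption: the plan's full command has more arguments after
    // "npm run preview"; they are elided in the source and omitted here.
    command: 'npm run preview',
    // Playwright polls this URL and starts tests only once it responds.
    url: 'http://127.0.0.1:4173',
    // Locally, reuse a server that is already listening on the URL;
    // in CI, always start a fresh one.
    reuseExistingServer: !process.env.CI,
  },
};

console.log(config.webServer.url);
```

Because Playwright waits for the URL before running tests, no backgrounded shell job or manual kill is needed.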
@Luis85
Owner Author

Luis85 commented Apr 26, 2026

Addressed P2 finding in 26ef13d:

  • P2 (demo-smoke Step 4, line 133, race on preview startup): Backgrounding npm run build && npm run preview & returns as soon as the shell forks the chain, not when port 4173 is bound — npx playwright test could fire before vite was listening, making the selector-wiring loop flaky. Replaced with a synchronous npm run build followed by npx playwright test --headed and let the existing webServer block in playwright.config.ts (Step 2) own the server lifecycle. The webServer config already polls the URL until it responds and supports reuseExistingServer: !process.env.CI for 'leave dev server open' workflows. Dropped the PREVIEW_PID / kill dance entirely.

@codex review

@Luis85 Luis85 marked this pull request as ready for review April 26, 2026 13:07
@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Bravo.

@Luis85 Luis85 merged commit d52e10c into develop Apr 26, 2026
16 checks passed
@Luis85 Luis85 deleted the chore/quality-automation-routines branch April 26, 2026 13:15
Luis85 added a commit that referenced this pull request Apr 26, 2026
Tracks: #130
Tracks: #131

## Summary

Adds a CodeQL JS/TS scan that runs on every push and PR to develop/main
plus a Mondays-06:00-UTC weekly cron, using the `security-and-quality`
query suite. All actions pinned to 40-char commit SHAs per the repo's
supply-chain rule.

Ticks row 1 of the umbrella tracker.

## Test plan

- [x] `npm run verify` green (523/523 tests, lint/typecheck/build all
pass)
- [x] `actionlint` clean on `.github/workflows/codeql.yml`
- [x] Action SHAs resolved via the umbrella's `resolve_action_sha`
peel-aware helper (annotated tags handled)
- [ ] Post-merge: confirm `CodeQL / Analyze (javascript-typescript)` job
runs green on the merge push to `develop`

Co-authored-by: Luis Mendez <hallo@luis-mendez.de>
Luis85 added a commit that referenced this pull request Apr 26, 2026
Tracks: #130
Tracks: #131

Adds the weekly Dependabot triage cloud-routine prompt + README scaffold
under `docs/dep-triage-bot/`. Mirrors the existing `docs/review-bot/`
and `docs/docs-review-bot/` shape (system prompt + README + rolling
tracker issue, idempotency via canonical SHA list in issue body).

## What's in the diff

- `docs/dep-triage-bot/PROMPT.md` — system prompt: role, scope, triage
policy table (dev-deps minor/patch auto-merge; runtime minor/patch
approval-comment; majors + peer-deps owner-only), hard rules, output
format (rolling tracker + per-PR table), idempotency, dry-run guard,
failure handling.
- `docs/dep-triage-bot/README.md` — routine setup, sinks, iteration
workflow, known tradeoffs.
- `.github/dependabot.yml` — added `npm-non-major` group block to BOTH
npm ecosystem entries (root + `examples/nurture-pet`) so minor + patch
updates land as one weekly bundled PR per ecosystem instead of N.
Replaced the prior `dev-dependencies` group on the root entry — the new
group covers both runtime + dev minors/patches under the same bundling
policy. Major bumps remain ungrouped (one PR per major, owner-reviewed).
GitHub Actions ecosystem grouping (`actions:`) is unchanged.
- `docs/plans/2026-04-26-quality-automation-routines.md` — ticked row 2
(this row) to `- [x] shipped` per the umbrella's tick rule.
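
The triage policy table summarised in the `PROMPT.md` bullet above can be read as a small decision function. The sketch below is illustrative only: the routine is prompt-driven, so the function name, the input shape, and the string labels are assumptions rather than anything in the repo.

```javascript
// Hypothetical decision function mirroring the policy table:
//   dev-deps minor/patch    -> auto-merge
//   runtime minor/patch     -> approval-comment
//   majors and peer-deps    -> owner-only
// Assumption: anything unrecognised falls back to owner-only.
function triageAction({ depType, updateType }) {
  const isMinorOrPatch = updateType === 'minor' || updateType === 'patch';
  if (!isMinorOrPatch) return 'owner-only'; // majors and unknown bump types
  if (depType === 'dev') return 'auto-merge';
  if (depType === 'runtime') return 'approval-comment';
  return 'owner-only'; // peer-deps and unknown dependency types
}

console.log(triageAction({ depType: 'dev', updateType: 'patch' }));     // auto-merge
console.log(triageAction({ depType: 'runtime', updateType: 'minor' })); // approval-comment
console.log(triageAction({ depType: 'dev', updateType: 'major' }));     // owner-only
console.log(triageAction({ depType: 'peer', updateType: 'patch' }));    // owner-only
```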

## Out-of-repo follow-up (does NOT gate merge)

Once this PR lands, the owner needs to:

1. Create the rolling tracker issue `Dependency triage — develop` with
label `dep-triage-bot` and the seed body documented in
`docs/dep-triage-bot/README.md` setup checklist.
2. Schedule the cloud routine (Claude Cloud) at `0 8 * * 1` (Monday
08:00 UTC, one hour after Dependabot fires) using the wrapper prompt in
the README.
3. Ensure repo allows auto-merge (`Settings → General → Pull Requests →
Allow auto-merge`) so the dev-deps minor/patch path can actually `gh pr
merge --auto --squash`.

All three are spelled out in the README's Setup checklist.

## Verification

- `npm run verify` — green locally on this branch.
- Doc-only diff (plus a config grouping change in
`.github/dependabot.yml`); no `src/**`, no changeset.

Ticks row 2 of the umbrella tracker.

---------

Co-authored-by: Luis Mendez <hallo@luis-mendez.de>
Luis85 added a commit that referenced this pull request Apr 26, 2026
Tracks: #130
Tracks: #131

Adds a Sundays-04:00-UTC weekly workflow that snapshots npx size-limit
--json for the develop tip into docs/metrics/bundle-trend.jsonl. SHA
is resolved from the checked-out tree (git rev-parse HEAD), NOT
github.sha (which would point at the default-branch tip on schedule
events). Dedupe policy: same (sha, calendar-date) re-run is a no-op;
identical-payload week-over-week with a new sha appends a new row.

Ticks row 5 of the umbrella tracker.
Luis85 added a commit that referenced this pull request Apr 26, 2026
Tracks: #130
Tracks: #131

Adds the monthly plan-reconciliation cloud-routine prompt + README
scaffold under docs/plan-recon-bot/. Distinct from docs-review-bot
(which audits prose drift); this routine answers 'is this plan done?'
and archives finished plans via git mv. Failure issues land under the
plan-recon-bot label per the umbrella's cloud-routine output
convention.

Ticks row 4 of the umbrella tracker.

---------

Co-authored-by: Luis Mendez <hallo@luis-mendez.de>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Luis85 added a commit that referenced this pull request Apr 26, 2026
Tracks: #130
Tracks: #131

Adds the weekly action-SHA-bump cloud-routine prompt + README scaffold
under docs/actions-bump-bot/. The routine wraps the existing
scripts/bump-actions.mjs (which already peels annotated tags via
tagToCommitSha) into a weekly PR-opening workflow. Failure issues land
under the actions-bump-bot label per the umbrella's cloud-routine
output convention.
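
As background on why peeling matters: a lightweight tag ref points straight at a commit, while an annotated tag ref points at a tag object that in turn points at the commit. The sketch below shows the idea only. It is not the repo's `tagToCommitSha`; the function name, the injected `getTag` lookup (standing in for GitHub's `GET /repos/{owner}/{repo}/git/tags/{tag_sha}`), and the synchronous shape are assumptions to keep the example self-contained.

```javascript
// Hypothetical helper: follow a tag ref's `object` until it reaches a commit.
// refObject: the `object` field of a git ref response ({ type, sha }).
// getTag:    injected lookup returning a tag object with its own `object` field.
function peelToCommitSha(refObject, getTag, maxDepth = 5) {
  let current = refObject;
  for (let depth = 0; depth <= maxDepth; depth += 1) {
    if (current.type === 'commit') return current.sha; // lightweight tag, or fully peeled
    if (current.type !== 'tag') {
      throw new Error(`unexpected object type: ${current.type}`);
    }
    current = getTag(current.sha).object; // annotated tag: step one level down
  }
  throw new Error('tag chain too deep');
}

// Illustrative fake data; real shas come from the GitHub API.
const commitSha = 'c'.repeat(40);
const tagObjectSha = 'a'.repeat(40);
const fakeTags = {
  [tagObjectSha]: { object: { type: 'commit', sha: commitSha } },
};
const getTag = (sha) => fakeTags[sha];

// Lightweight tag: already a commit.
console.log(peelToCommitSha({ type: 'commit', sha: commitSha }, getTag) === commitSha); // true
// Annotated tag: the ref's sha is the tag object, not the commit.
console.log(peelToCommitSha({ type: 'tag', sha: tagObjectSha }, getTag) === commitSha); // true
```

Pinning the unpeeled sha of an annotated tag would record the tag object's sha rather than a commit sha, which is why a peel step is needed before pinning.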

Ticks row 3 of the umbrella tracker.

---------

Co-authored-by: Luis Mendez <hallo@luis-mendez.de>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Luis85 Luis85 added the roadmap:quality-baseline Quality automation: CodeQL, Stryker, determinism replay, demo smoke, review-finding fixes label May 5, 2026