
chore(ci): add OS x tfjs-backend matrix (test-core + test-tfjs split) #113

Merged
Luis85 merged 2 commits into develop from chore/ci-backend-and-os-matrix
Apr 25, 2026
Conversation


@Luis85 Luis85 commented Apr 25, 2026

Summary

Row 4 of the polish + harden roadmap (docs/plans/2026-04-25-comprehensive-polish-and-harden.md).

  • Splits the existing single test job in .github/workflows/ci.yml into two:
    • test-core runs the full vitest suite across the OS matrix (ubuntu-latest, macos-latest, windows-latest). Coverage runs on ubuntu only — identical report across OSes; v8 instrumentation triples runtime on the other cells for no extra signal.
    • test-tfjs runs only the four tfjs-touching test files (TfjsReasoner.test.ts, TfjsLearner.test.ts, TfjsSnapshot.test.ts, learningMode.train.test.ts) under the OS × backend matrix, with backends cpu and wasm: 3 × 2 = 6 cells.
  • strategy.fail-fast: false on both so a windows-only flake or a wasm-only kernel issue does not suppress unrelated cells.
  • New tests/setup/tfjsBackend.ts exports TEST_BACKEND derived from process.env.TFJS_BACKEND (default cpu). tests/setup/tfjsBackendSetup.ts is wired in vite.config.ts as a vitest setupFiles entry. It side-effect-imports the matching @tensorflow/tfjs-backend-* package, then calls tf.setBackend(...) and tf.ready() BEFORE any test file's static imports run.
  • Test bodies pass backend: TEST_BACKEND into every new TfjsReasoner(...) / TfjsReasoner.fromJSON(...) via local newReasoner / fromJSONReasoner helpers, so the constructor's "requested backend must match active backend" guard passes on every matrix cell. The single explicit backend: 'webgl' mismatch test is left untouched.
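A minimal sketch of how TEST_BACKEND could be derived from the environment. The PR only states the variable name, the two allowed values, and the cpu default; the function name `resolveTestBackend`, the trimming and lower-casing, and the choice to throw on an unknown value are assumptions made for illustration, not the contents of tests/setup/tfjsBackend.ts.

```typescript
// Hypothetical sketch; only TFJS_BACKEND, TEST_BACKEND and the cpu default
// come from the PR description. Everything else is an assumption.
type TestBackend = "cpu" | "wasm";

function resolveTestBackend(
  env: Record<string, string | undefined>,
): TestBackend {
  // Treat a missing or empty variable as "use the default backend".
  const raw = (env["TFJS_BACKEND"] ?? "").trim().toLowerCase();
  if (raw === "") return "cpu";
  if (raw === "cpu" || raw === "wasm") return raw;
  // Assumption: fail loudly instead of silently falling back, so a typo in
  // the CI matrix cannot turn a wasm cell into a second cpu cell.
  throw new Error(
    `Unsupported TFJS_BACKEND "${raw}"; expected "cpu" or "wasm"`,
  );
}

// In a setup module this would typically be evaluated once:
//   export const TEST_BACKEND = resolveTestBackend(process.env);
```

Taking the environment as a parameter keeps the resolver testable without mutating process.env.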

Matrix shape chosen

Shape (b) from the row-4 brief — split test-core (OS-only) from test-tfjs (OS × backend). The full vitest suite has zero opinion about the tfjs backend axis (it always imports @tensorflow/tfjs-backend-cpu statically); running it under the backend axis would burn 2× runtime for no extra signal.

Cell count: 3 OSes (test-core) + 6 cells (test-tfjs) = 9 total runner invocations on each push, vs 1 today.
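The arithmetic above can be checked by enumerating the cells directly. This is an illustrative sketch only: the `Cell` shape and `enumerateCells` are hypothetical names, while the OS and backend values are the ones listed in the summary.

```typescript
// Values taken from the PR summary; the data shape is an assumption.
const OSES = ["ubuntu-latest", "macos-latest", "windows-latest"] as const;
const BACKENDS = ["cpu", "wasm"] as const;

interface Cell {
  job: "test-core" | "test-tfjs";
  os: string;
  backend?: string; // only set for test-tfjs cells
}

function enumerateCells(): Cell[] {
  const cells: Cell[] = [];
  // test-core: one cell per OS, no backend axis.
  for (const os of OSES) {
    cells.push({ job: "test-core", os });
  }
  // test-tfjs: full OS x backend product.
  for (const os of OSES) {
    for (const backend of BACKENDS) {
      cells.push({ job: "test-tfjs", os, backend });
    }
  }
  return cells;
}

const cells = enumerateCells();
console.log(cells.length); // 9 = 3 (test-core) + 3 * 2 (test-tfjs)
```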

Cost: ~1 extra runner-minute per push (matches roadmap estimate). The tfjs slice is small enough that the wasm cells finish in ~1s each.

Out of scope

  • webgl deliberately dropped — its factory throws on first use in headless Node and a real GPU runner is beyond row-4 scope. The activation-based detectBestBackend chain in src/ (which already covers webgl gracefully) is unchanged.
  • No src/ library change. The env-var wire lives entirely under tests/setup/.
  • No changeset — CI-only.

Dependencies

  • Row 20 (feat/tfjs-detect-backend-and-picker) already shipped TfjsReasoner.probeBackend / detectBestBackend, so the matrix defends a real consumer surface (the demo's backend picker).
  • Rows 21 + 22 are running in parallel agents in their own worktrees — they edit vite.config.ts and package.json respectively. Expect a 1-line plan-table merge conflict here when one of those merges first; resolution is to order rows numerically and re-verify.

Test plan

  • npm run verify green locally on develop-base
  • npx vitest run tests/unit/cognition/adapters/TfjsReasoner.test.ts — 23/23 passing under default cpu
  • TFJS_BACKEND=wasm npx vitest run tests/unit/cognition/adapters/TfjsReasoner.test.ts — 23/23 passing
  • TFJS_BACKEND=wasm npx vitest run tests/examples/learningMode.train.test.ts — 25/25 passing
  • node scripts/bump-actions.mjs — confirmed no new unpinned action references
  • CI: 9-cell matrix (3 + 6) all green on first run
  • ci-gate correctly aggregates test-core + test-tfjs (replaces previous test need)
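The aggregation rule the last item relies on can be sketched as a small predicate. How ci-gate is actually implemented in ci.yml is not shown in this PR, so `gatePasses` is a hypothetical model of the intended behaviour; the four result strings are the values GitHub Actions reports for `needs.<job>.result`.

```typescript
// Possible values of needs.<job>.result in GitHub Actions.
type JobResult = "success" | "failure" | "cancelled" | "skipped";

// Hypothetical model of the gate: pass only when every needed job succeeded.
// An empty needs map fails, so a misconfigured gate cannot pass vacuously.
function gatePasses(needs: Record<string, JobResult>): boolean {
  const names = Object.keys(needs);
  return names.length > 0 && names.every((name) => needs[name] === "success");
}

console.log(gatePasses({ "test-core": "success", "test-tfjs": "success" })); // true
console.log(gatePasses({ "test-core": "success", "test-tfjs": "failure" })); // false
```

Treating "skipped" and "cancelled" as failures is a design choice in this sketch: with fail-fast disabled, a matrix job only reports success once all of its cells have passed.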

…-tfjs)

- `test` job split into `test-core` (full vitest, OS matrix only) and
  `test-tfjs` (tfjs adapter slice, OS x cpu|wasm matrix). `fail-fast:
  false` on both so platform-specific failures don't suppress each
  other. Coverage runs on ubuntu-latest only — identical report across
  OSes; v8 instrumentation triples runtime on the others for no signal.
- `tests/setup/tfjsBackend.ts` exports `TEST_BACKEND` derived from
  `process.env.TFJS_BACKEND` (cpu | wasm; default cpu).
  `tests/setup/tfjsBackendSetup.ts` is wired in vite.config as a
  vitest setupFiles entry — side-effect-imports the matching tfjs
  backend package + activates it via `tf.setBackend` BEFORE any test
  file's static imports run, so `wasm` matrix cells don't pull cpu
  in via the historical `import '@tensorflow/tfjs-backend-cpu'`.
- `TfjsReasoner.test.ts` and `learningMode.train.test.ts` use
  `TEST_BACKEND` in `beforeAll` and inject `backend: TEST_BACKEND` via
  thin `newReasoner` / `fromJSONReasoner` helpers so the constructor's
  "requested backend must match active backend" guard passes on every
  matrix cell. The single `new TfjsReasoner({..., backend: 'webgl'})`
  case that asserts the throw path is left as-is.
- `webgl` deliberately dropped — its factory throws on first use in
  headless Node and a real GPU runner is out of scope.
- `build` / `ci-gate` `needs:` updated from `test` to
  `test-core, test-tfjs`. No `src/` library change.
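To illustrate why the thin helpers are needed, here is a self-contained sketch of the guard and a `newReasoner`-style wrapper. `StandInReasoner`, `activateBackend`, `makeNewReasoner`, the `hiddenUnits` option, and the assumption that an omitted `backend` means cpu are all hypothetical stand-ins; the real TfjsReasoner constructor is not shown in this PR.

```typescript
type Backend = "cpu" | "wasm" | "webgl";

interface ReasonerOptions {
  backend?: Backend;
  hiddenUnits?: number; // hypothetical extra option, for illustration only
}

// Stand-in for the backend the vitest setup file activated.
let activeBackend: Backend = "cpu";
function activateBackend(backend: Backend): void {
  activeBackend = backend;
}

// Stand-in for the real class; models only the backend guard.
class StandInReasoner {
  readonly backend: Backend;
  constructor(options: ReasonerOptions = {}) {
    // Assumption: an omitted backend is treated as "cpu".
    const requested = options.backend ?? "cpu";
    if (requested !== activeBackend) {
      throw new Error(
        `requested backend "${requested}" must match active backend "${activeBackend}"`,
      );
    }
    this.backend = requested;
  }
}

// Thin helper: every test constructs reasoners through this wrapper, so the
// matrix-selected backend is injected in exactly one place.
function makeNewReasoner(testBackend: Backend) {
  return (options: Omit<ReasonerOptions, "backend"> = {}): StandInReasoner =>
    new StandInReasoner({ ...options, backend: testBackend });
}

// Simulate a wasm matrix cell.
activateBackend("wasm");
const newReasoner = makeNewReasoner("wasm");
console.log(newReasoner({ hiddenUnits: 8 }).backend); // "wasm"
```

Under these assumptions a bare `new StandInReasoner()` would throw on the wasm cell, which is the failure mode the helpers avoid; an explicit `backend: 'webgl'` still throws on every cell, so the mismatch test keeps asserting the throw path.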

Plan row 4 marked shipped in
docs/plans/2026-04-25-comprehensive-polish-and-harden.md, mirroring
row 3's section-B "What's already shipped" pattern.

Luis85 commented Apr 25, 2026

@codex review


github-actions Bot commented Apr 25, 2026

size-limit report 📦

| Path | Size |
| --- | --- |
| dist/index.js (gzip) | 37.87 KB (0%) |
| dist/integrations/excalibur/index.js (gzip) | 1.4 KB (0%) |
| dist/cognition/adapters/mistreevous/index.js (gzip) | 1.14 KB (0%) |
| dist/cognition/adapters/js-son/index.js (gzip) | 1.36 KB (0%) |
| dist/cognition/adapters/tfjs/index.js (gzip) | 8.48 KB (0%) |


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: df1d8c24eb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tests/examples/learningMode.train.test.ts
`tests/examples/learningMode.train.test.ts` runs the demo's
`mountCognitionSwitcher` flow, whose probe loop seeds `cpu` first and
promotes the persisted backend afterwards. Under `TFJS_BACKEND=wasm`
the suite still exercises the cpu-fallback path (no localStorage seed
in beforeEach), so the wasm cell added zero new coverage versus cpu —
it just doubled runtime on a cpu-by-design suite.

Reasoner-level wasm coverage already lives in the three
`tests/unit/cognition/adapters/Tfjs*.test.ts` files, which the
matrix-selected backend genuinely flows through. Test-core still runs
the demo file once on every OS cell via the default `npm test`.

Luis85 commented Apr 25, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Hooray!


@Luis85 Luis85 merged commit 42ede76 into develop Apr 25, 2026
22 checks passed
@Luis85 Luis85 deleted the chore/ci-backend-and-os-matrix branch April 25, 2026 22:21
Luis85 added a commit that referenced this pull request Apr 26, 2026
## Summary

Doc-audit pass over `docs/plans` + `docs/specs`. Three things land
together:

- **`docs/archive/{plans,specs}/`** — new home for plans whose roadmap
  rows have all shipped (or whose goals were folded into a successor)
  and specs whose design is now reflected in code. Includes a
  `README.md` explaining the policy; `CLAUDE.md` documents the
  convention.
- **`git mv` 23 plans + 3 specs into the archive.** The active live
  set is now the comprehensive polish-and-harden plan plus three
  specs (post-tfjs improvements, mvp-demo, vision), each with a
  refreshed status banner.
- **Refresh the live comprehensive plan** against current `develop`:
  - PR column updated for rows 16/19/20/3/4/22 (now shipped via
    PRs #91 / #98 / #104 / #110 / #113 / #111).
  - New "Post-roadmap follow-ups" section covers PRs #92–#125
    (review-bot infra, tracker findings, demo + tfjs hotfixes,
    tooling).
  - Stale prose-baked counts dropped (size budgets now reference
    `package.json#size-limit` only).
  - Coverage-thresholds section gains a pointer to the sticky PR
    comment shipped in PR #124.

## Other doc fixes

- `README.md`: drop the unverifiable "Phase A milestones (M0–M15) are
  all green" claim — the milestones don't exist as documented IDs
  anywhere; replace with a pointer to the live polish plan.
- `vision.md`: refresh cadence note (was pinned to 2026-04-19 + "next
  review at 1.0").
- `2026-04-24-post-tfjs-improvements.md`: mark recommended-order items
  that have shipped (PRs #61, #76, #77, #83, #84, #91, #94, #96,
  #104, #113), link the active roadmap as the heir.
- `mvp-demo.md`: status banner explaining where active polish work is
  now tracked.

## Mechanical

- Update inline cross-refs in `CLAUDE.md`, `eslint.config.js`,
  `src/agent/{Agent,AgentModule}.ts`, `tests/unit/exports.test.ts`,
  and `docs/daily-reviews/2026-04-25.md` to point at the new
  `docs/archive/` paths so links keep resolving.

No code change beyond comment-path updates.

## Test plan

- [x] `npm run verify` green (`format:check` + `lint` + `typecheck` +
  `test` + `build` + `docs`). 523 tests pass; the 2 lint warnings
  are pre-existing (`CognitionPipeline.invokeSkillAction` complexity
  + `scoreFailure` param count) and on the ratchet menu.
- [x] `git ls-files docs/archive/` shows the moved files; renames are
  preserved (`git log --follow` works for any moved file).
- [ ] Codex review: clean, no blockers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Luis Mendez <hallo@luis-mendez.de>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2 participants