chore(ci): add OS x tfjs-backend matrix (test-core + test-tfjs split) by Luis85 · Pull Request #113 · Luis85/agentonomous

Luis85 · 2026-04-25T20:59:04Z

Summary

Row 4 of the polish + harden roadmap (docs/plans/2026-04-25-comprehensive-polish-and-harden.md).

Splits the existing single test job in .github/workflows/ci.yml into two:
- test-core runs the full vitest suite across the OS matrix (ubuntu-latest, macos-latest, windows-latest). Coverage runs on ubuntu only — identical report across OSes; v8 instrumentation triples runtime on the other cells for no extra signal.
- test-tfjs runs only the four tfjs-touching test files (TfjsReasoner.test.ts, TfjsLearner.test.ts, TfjsSnapshot.test.ts, learningMode.train.test.ts) under the OS × backend matrix: 3 OSes × 2 backends (cpu, wasm) = 6 cells.
strategy.fail-fast: false on both so a windows-only flake or a wasm-only kernel issue does not suppress unrelated cells.
New tests/setup/tfjsBackend.ts exports TEST_BACKEND derived from process.env.TFJS_BACKEND (default cpu). tests/setup/tfjsBackendSetup.ts is wired in vite.config.ts as a vitest setupFiles entry and side-effect-imports the matching @tensorflow/tfjs-backend-* package + calls tf.setBackend(...) + tf.ready() BEFORE any test file's static imports run.
Test bodies pass backend: TEST_BACKEND into every new TfjsReasoner(...) / TfjsReasoner.fromJSON(...) via local newReasoner / fromJSONReasoner helpers, so the constructor's "requested backend must match active backend" guard passes on every matrix cell. The single explicit backend: 'webgl' mismatch test is left untouched.
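
As a rough illustration of the env-var wiring described above, the following is a minimal sketch and not the actual contents of tests/setup/tfjsBackend.ts. The names `resolveTestBackend` and `withTestBackend` are hypothetical, and throwing on an unsupported value is an assumption; the PR only states that the default is cpu.

```typescript
// Hypothetical sketch; the real tests/setup/tfjsBackend.ts may differ.
type TestBackend = 'cpu' | 'wasm';

const SUPPORTED_BACKENDS: readonly TestBackend[] = ['cpu', 'wasm'];

/**
 * Resolve the backend for a matrix cell from an env-like record.
 * An unset or empty TFJS_BACKEND falls back to cpu.
 */
function resolveTestBackend(env: Record<string, string | undefined>): TestBackend {
  const raw = env['TFJS_BACKEND'] || 'cpu';
  const match = SUPPORTED_BACKENDS.find((candidate) => candidate === raw);
  if (match === undefined) {
    // Assumption: unknown values fail loudly instead of silently using cpu.
    throw new Error(`Unsupported TFJS_BACKEND: ${raw}`);
  }
  return match;
}

/**
 * Merge the resolved backend into an options object, roughly what a
 * newReasoner-style test helper would do before calling the constructor.
 */
function withTestBackend<T extends object>(
  options: T,
  backend: TestBackend,
): T & { backend: TestBackend } {
  return { ...options, backend };
}
```

In a real setup file the record would presumably be `process.env`, and the resolved value would be exported as `TEST_BACKEND`; taking the record as a parameter just keeps the sketch testable in isolation.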

Matrix shape chosen

Shape (b) from the row-4 brief — split test-core (OS-only) from test-tfjs (OS × backend). The full vitest suite has zero opinion about the tfjs backend axis (it always imports @tensorflow/tfjs-backend-cpu statically); running it under the backend axis would burn 2× runtime for no extra signal.

Cell count: 3 OSes (test-core) + 6 cells (test-tfjs) = 9 total runner invocations on each push, vs 1 today.
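
The cell arithmetic can be checked with a small enumeration. This is only a sketch of the matrix shape described in this PR, not the workflow file itself; the `Cell` type and `enumerateCells` function are hypothetical names introduced for illustration.

```typescript
const OSES = ['ubuntu-latest', 'macos-latest', 'windows-latest'] as const;
const BACKENDS = ['cpu', 'wasm'] as const;

type Os = (typeof OSES)[number];
type Backend = (typeof BACKENDS)[number];

// Hypothetical shape for one runner invocation.
interface Cell {
  job: 'test-core' | 'test-tfjs';
  os: Os;
  backend?: Backend; // only set on test-tfjs cells
  coverage: boolean; // per the PR, coverage runs on ubuntu test-core only
}

function enumerateCells(): Cell[] {
  const cells: Cell[] = [];
  // test-core: one cell per OS, no backend axis.
  for (const os of OSES) {
    cells.push({ job: 'test-core', os, coverage: os === 'ubuntu-latest' });
  }
  // test-tfjs: full OS x backend cross product.
  for (const os of OSES) {
    for (const backend of BACKENDS) {
      cells.push({ job: 'test-tfjs', os, backend, coverage: false });
    }
  }
  return cells;
}
```

Calling `enumerateCells()` yields 3 test-core cells plus 6 test-tfjs cells, which matches the 9 runner invocations quoted above.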

Cost: ~1 extra runner-minute per push (matches roadmap estimate). The tfjs slice is small enough that the wasm cells finish in ~1s each.

Out of scope

webgl deliberately dropped — its factory throws on first use in headless Node and a real GPU runner is beyond row-4 scope. The activation-based detectBestBackend chain in src/ (which already covers webgl gracefully) is unchanged.
No src/ library change. The env-var wire lives entirely under tests/setup/.
No changeset — CI-only.

Dependencies

Row 20 (feat/tfjs-detect-backend-and-picker) already shipped TfjsReasoner.probeBackend / detectBestBackend, so the matrix defends a real consumer surface (the demo's backend picker).
Rows 21 + 22 are in progress in parallel agents, each in its own worktree; they edit vite.config.ts and package.json respectively. Expect a one-line merge conflict in the plan table here when one of them merges first; to resolve it, order the rows numerically and re-verify.

Test plan

npm run verify green locally on develop-base
npx vitest run tests/unit/cognition/adapters/TfjsReasoner.test.ts — 23/23 passing under default cpu
TFJS_BACKEND=wasm npx vitest run tests/unit/cognition/adapters/TfjsReasoner.test.ts — 23/23 passing
TFJS_BACKEND=wasm npx vitest run tests/examples/learningMode.train.test.ts — 25/25 passing
node scripts/bump-actions.mjs — confirmed no new unpinned action references
CI: 9-cell matrix (3 + 6) all green on first run
ci-gate correctly aggregates test-core + test-tfjs (replaces previous test need)

…-tfjs)

- `test` job split into `test-core` (full vitest, OS matrix only) and `test-tfjs` (tfjs adapter slice, OS x cpu|wasm matrix). `fail-fast: false` on both so platform-specific failures don't suppress each other. Coverage runs on ubuntu-latest only: the report is identical across OSes, and v8 instrumentation triples runtime on the others for no signal.
- `tests/setup/tfjsBackend.ts` exports `TEST_BACKEND` derived from `process.env.TFJS_BACKEND` (cpu | wasm; default cpu). `tests/setup/tfjsBackendSetup.ts` is wired in vite.config as a vitest setupFiles entry. It side-effect-imports the matching tfjs backend package and activates it via `tf.setBackend` BEFORE any test file's static imports run, so `wasm` matrix cells don't pull cpu in via the historical `import '@tensorflow/tfjs-backend-cpu'`.
- `TfjsReasoner.test.ts` and `learningMode.train.test.ts` use `TEST_BACKEND` in `beforeAll` and inject `backend: TEST_BACKEND` via thin `newReasoner` / `fromJSONReasoner` helpers so the constructor's "requested backend must match active backend" guard passes on every matrix cell. The single `new TfjsReasoner({..., backend: 'webgl'})` case that asserts the throw path is left as-is.
- `webgl` deliberately dropped: its factory throws on first use in headless Node and a real GPU runner is out of scope.
- `build` / `ci-gate` `needs:` updated from `test` to `test-core, test-tfjs`.

No `src/` library change. Plan row 4 marked shipped in docs/plans/2026-04-25-comprehensive-polish-and-harden.md, mirroring row 3's section-B "What's already shipped" pattern.

Luis85 · 2026-04-25T20:59:08Z

@codex review

github-actions · 2026-04-25T21:01:35Z

size-limit report 📦

Path	Size
dist/index.js (gzip)	37.87 KB (0%)
dist/integrations/excalibur/index.js (gzip)	1.4 KB (0%)
dist/cognition/adapters/mistreevous/index.js (gzip)	1.14 KB (0%)
dist/cognition/adapters/js-son/index.js (gzip)	1.36 KB (0%)
dist/cognition/adapters/tfjs/index.js (gzip)	8.48 KB (0%)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: df1d8c24eb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

`tests/examples/learningMode.train.test.ts` runs the demo's `mountCognitionSwitcher` flow, whose probe loop seeds `cpu` first and promotes the persisted backend afterwards. Under `TFJS_BACKEND=wasm` the suite still exercises the cpu-fallback path (no localStorage seed in beforeEach), so the wasm cell added zero new coverage versus cpu — it just doubled runtime on a cpu-by-design suite. Reasoner-level wasm coverage already lives in the three `tests/unit/cognition/adapters/Tfjs*.test.ts` files, which the matrix-selected backend genuinely flows through. Test-core still runs the demo file once on every OS cell via the default `npm test`.

Luis85 · 2026-04-25T21:16:06Z

@codex review

chatgpt-codex-connector · 2026-04-25T21:20:28Z

Codex Review: Didn't find any major issues. Hooray!

## Summary

Doc-audit pass over `docs/plans` + `docs/specs`. Three things land together:

- **`docs/archive/{plans,specs}/`**: new home for plans whose roadmap rows have all shipped (or whose goals were folded into a successor) and specs whose design is now reflected in code. Includes a `README.md` explaining the policy; `CLAUDE.md` documents the convention.
- **`git mv` 23 plans + 3 specs into the archive.** The active live set is now the comprehensive polish-and-harden plan plus three specs (post-tfjs improvements, mvp-demo, vision), each with a refreshed status banner.
- **Refresh the live comprehensive plan** against current `develop`:
  - PR column updated for rows 16/19/20/3/4/22 (now shipped via PRs #91 / #98 / #104 / #110 / #113 / #111).
  - New "Post-roadmap follow-ups" section covers PRs #92 → #125 (review-bot infra, tracker findings, demo + tfjs hotfixes, tooling).
  - Stale prose-baked counts dropped (size budgets now reference `package.json#size-limit` only).
  - Coverage-thresholds section gains a pointer to the sticky PR comment shipped in PR #124.

## Other doc fixes

- `README.md`: drop the unverifiable "Phase A milestones (M0–M15) are all green" claim; the milestones don't exist as documented IDs anywhere. Replace with a pointer to the live polish plan.
- `vision.md`: refresh cadence note (was pinned to 2026-04-19 + "next review at 1.0").
- `2026-04-24-post-tfjs-improvements.md`: mark recommended-order items that have shipped (PRs #61, #76, #77, #83, #84, #91, #94, #96, #104, #113), link the active roadmap as the heir.
- `mvp-demo.md`: status banner explaining where active polish work is now tracked.

## Mechanical

- Update inline cross-refs in `CLAUDE.md`, `eslint.config.js`, `src/agent/{Agent,AgentModule}.ts`, `tests/unit/exports.test.ts`, and `docs/daily-reviews/2026-04-25.md` to point at the new `docs/archive/` paths so links keep resolving. No code change beyond comment-path updates.

## Test plan

- [x] `npm run verify` green (`format:check` + `lint` + `typecheck` + `test` + `build` + `docs`). 523 tests pass; the 2 lint warnings are pre-existing (`CognitionPipeline.invokeSkillAction` complexity + `scoreFailure` param count) and on the ratchet menu.
- [x] `git ls-files docs/archive/` shows the moved files; renames are preserved (`git log --follow` works for any moved file).
- [ ] Codex review: clean, no blockers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Luis Mendez <hallo@luis-mendez.de>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector Bot reviewed Apr 25, 2026

Comment thread tests/examples/learningMode.train.test.ts

Luis85 merged commit 42ede76 into develop Apr 25, 2026
22 checks passed

Luis85 deleted the chore/ci-backend-and-os-matrix branch April 25, 2026 22:21

This was referenced Apr 25, 2026

demo: promote develop @ 42ede76 #114

Merged

docs: audit + archive shipped plans/specs #127

Merged

This was referenced Apr 27, 2026

Docs review — 2026-04-27 (d9b4b85) #157

Closed

Docs review — 2026-05-04 (0f72ad7) #191

Open

Luis85 merged 2 commits into develop from chore/ci-backend-and-os-matrix
