
chore: update nurture-pet example to vite 8 + typescript 6.0.3 #3

Merged
Luis85 merged 1 commit into develop from chore/update-dependencies on Apr 19, 2026
Conversation

@Luis85 Luis85 (Owner) commented Apr 19, 2026

@Luis85 Luis85 merged commit 7b9b3f6 into develop Apr 19, 2026
1 check passed
@Luis85 Luis85 deleted the chore/update-dependencies branch April 19, 2026 10:03
Luis85 pushed a commit that referenced this pull request Apr 24, 2026
…arden

Incorporates the 2026-04-24 review findings as a mandatory
"remediation" track that lands BEFORE CI / demo / lib work:

- PR #1 fix/agent-restore-replace-modifiers (MAJOR — restore must
  replace, not merge, modifier state).
- PR #2 fix/localstorage-store-keyspace-collision (MAJOR — split
  data/ from meta/ in the localStorage key namespace, add legacy
  migration so existing browsers don't lose saved pets).
- PR #3 fix/pick-default-store-throwing-localstorage (MAJOR — guard
  the localStorage probe so SecurityError-throwing getters don't
  crash store selection).
- PR #4 fix/fs-store-deterministic-list-order (MINOR — sort the
  readdir output via localeCompare for cross-platform stability).

Renumbers the existing tracks (CI/demo/lib) to follow the
remediation block and updates all internal cross-references. Adds a
"Workflow" section codifying the per-session loop: independent
branches, batch open all PRs, then multi-pass Codex sweep until 👍,
resolve review threads, owner merges. Same loop captured in
`MEMORY.md → feedback_pr_workflow.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Luis85 added a commit that referenced this pull request Apr 26, 2026
* test(cognition): verify MistreevousReasoner.reset clears BT state

Existing reset() already matched the port contract — adds JSDoc pointing
at the port + a unit test asserting BT RUNNING → reset → READY. No
behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(cognition): verify JsSonReasoner.reset restores initial beliefs

Existing reset() already matched the port contract — adds JSDoc pointing
at the port + a unit test asserting mutated beliefs → reset → initial
beliefs. No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(cognition): document BrainJsReasoner's reset opt-out

Adds a class-level JSDoc paragraph explaining why this adapter does not
implement Reasoner.reset(): stateless at selection time, consumer-owned
weights. The kernel's optional-chain reset?.() handles the absence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(changeset): 0.9.4 Reasoner.reset harmonization

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(agent): validate reasoner.reset is callable at setReasoner time

Addresses Codex review on #53. Optional chaining guards null/undefined
but throws if reset is a truthy non-function (e.g. a JS consumer
assigning reset: 'foo'). Reaching that throw inside restore() would
leave the agent partially rehydrated. Move the check to setReasoner to
fail fast at swap-time, matching the existing selectIntention guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
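The fail-fast guard described above can be sketched as a swap-time validation — optional chaining already covers null/undefined, so the new check only needs to reject truthy non-functions. Names and shapes here are illustrative, not the library's actual signatures:

```typescript
// Minimal sketch of the setReasoner-time guard (hypothetical names).
interface ReasonerLike {
  selectIntention: (...args: unknown[]) => unknown;
  reset?: unknown; // optional in the port; JS consumers may assign anything
}

function validateReasoner(reasoner: ReasonerLike): void {
  if (
    reasoner.reset !== undefined &&
    reasoner.reset !== null &&
    typeof reasoner.reset !== "function"
  ) {
    // Fail fast at swap-time instead of mid-restore(), where a throw
    // would leave the agent partially rehydrated.
    throw new TypeError("reasoner.reset must be callable when present");
  }
}
```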

* test(demo): extend brain.js stub with train/toJSON + stable run

Prepares the stub for 0.9.3's training-persistence tests. run() now
returns a stable [0.5] so construct() and urgency-gate logic are
testable without the native peer. train() records the last pair batch;
toJSON() returns a deterministic sentinel. No behavior change for
existing tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(demo): mode-gated Train button for learning mode

Mounts <button id='train-network' hidden> inside #cognition-switcher.
Visibility toggles with the selected cognition mode — shown only when
'learning' is active. Click handler lands in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(demo): wire Train button to generate pairs + persist network

Click handler generates 30 synthetic (needs → urgency) pairs from the
demo's seeded RNG, runs network.train() with 100 iterations, and
writes network.toJSON() to agentonomous/<agentId>/brainjs-network.
Button disables + shows 'Training…' during the synchronous train call
(via a microtask yield so the DOM paints before the blocking loop) and
reverts on completion. Status span flashes 'Trained ✓' on success.

Extends the test stub with NeuralNetwork.last + lastTrainPairs() /
lastFromJSON() so tests can inspect what learning.ts pushed through
the peer without piping the network through application code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(demo): hydrate learning-mode network from localStorage

construct() checks agentonomous/<agentId>/brainjs-network first and
falls back to the bundled learning.network.json default asset when the
key is absent or unparseable. Corrupt stored values silently revert
to default — the Train button regenerates valid state on next click.

agentId is injected via a module-scoped setLearningAgentId() setter
called from main.ts after the agent is created. Keeps the
CognitionModeSpec.construct() signature unchanged — no other mode
needs the agent id.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(demo): Reset also wipes the trained brainjs-network

resetSimulation now removes agentonomous/<agentId>/brainjs-network
alongside the snapshot + index keys. Reset stays a single "fresh
start" concept — the next learning-mode construct() falls back to
the bundled default network asset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(demo): urgency-gate interpret() in learning mode

Network scalar output is now wired into intention selection as an
urgency gate: the pet idles this tick when the network's score falls
below URGENCY_THRESHOLD (0.35). Visible demo effect — trained and
untrained networks produce different idle rates, making training
observable in the trace view. Threshold is empirical; may be tuned
during manual smoke.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
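The gate itself reduces to a small function — idle when the network's scalar score falls below the empirical threshold. Identifier names here are assumptions, not the demo's actual code:

```typescript
// Illustrative urgency gate per the commit message above.
const URGENCY_THRESHOLD = 0.35; // empirical; tuned during manual smoke

function gateIntention(score: number, selected: string): string {
  // Below threshold → the pet idles this tick; trained vs untrained
  // networks yield visibly different idle rates in the trace view.
  return score < URGENCY_THRESHOLD ? "idle" : selected;
}
```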

* docs(demo): address review nits in learning-mode comments

- URGENCY_THRESHOLD JSDoc now gives direction for manual-smoke tuning
  (up toward 0.5 if post-train idle rate stays flat; down toward 0.2
  if the pet rarely acts).
- agentIdForHydration JSDoc drops the plan-file reference (comments
  referencing tasks rot once the plan is archived) and keeps just
  the why.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(demo): recover learning mode when stored network fails fromJSON

The previous hydration path guarded JSON.parse but not fromJSON
itself, so a parseable-but-schema-invalid stored payload (manual
edit, prior format, partial migration) would reject construct()
and leave the switcher stuck with Learning mode disabled until the
user clicked Reset. Fall back to the bundled default asset inside
construct() so Learning mode stays selectable across such payloads —
localStorage is a user-editable boundary where validation is
warranted.

Test stub gains a one-shot throwOnNextFromJSON flag so the recovery
path is exercised deterministically without shipping a synthetic
bad-payload fixture.

Addresses Codex review on PR #54.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
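The recovery pattern described above — guard both JSON.parse and fromJSON, falling back to the bundled default on either failure — can be sketched synchronously (the real construct() path is async; names are assumptions):

```typescript
// Guarded hydration sketch: localStorage is a user-editable boundary,
// so both parse failures and schema-invalid payloads fall back.
function hydrateNetwork<T>(
  stored: string | null,
  fromJSON: (data: unknown) => T,
  bundledDefault: unknown,
): T {
  if (stored !== null) {
    try {
      return fromJSON(JSON.parse(stored));
    } catch {
      // corrupt OR parseable-but-schema-invalid — silently fall through
      // so Learning mode stays selectable; Train regenerates valid state
    }
  }
  return fromJSON(bundledDefault);
}
```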

* fix(agent): unify AgentFacade.publishEvent with internal publish path

facade().publishEvent wrote straight to eventBus.publish, which bypassed
both emittedThisTick (trace inclusion) and autoSaveTracker.observeEvent
(autosave triggers). Reactive handlers and module onInstall hooks that
published events produced traces that disagreed with what subscribers
saw, and their events never triggered event-gated autosaves.

Route through _internalPublish — same path the skill context already
uses. No public API change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(fixes): add stability fixes

* chore(persistence): remove dead AgentSnapshot.pendingEvents field

The field was declared at AgentSnapshot.ts:84 but nothing in src/
populated or read it. Dropping it (and the now-unused DomainEvent
import) aligns the public type with reality. A regression test in
tests/unit/persistence/AgentSnapshot-shape.test.ts pins the absent
key so a future change can't quietly resurrect it without an
implementation.

Type-level breaking change — shipped as minor. Wire format is
byte-identical: the field was optional and JSON.stringify(snapshot)
already omitted it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(persistence): reversible percent-encoding for FsSnapshotStore keys

The previous sanitizeKey replaced every non-[A-Za-z0-9._-] character
with '_', so 'user/1', 'user_1', and 'user 1' collided to the same
file — silent data loss on save, ambiguous key recovery from list().

encodeKey/decodeKey now use reversible percent-encoding (UTF-8
byte-wise %XX) and are exported for direct unit testing. pathFor()
encodes; list() decodes. First dedicated test file for the store
covers round-trip, collision avoidance, UTF-8, and end-to-end
save/load/list/delete via an in-memory FsAdapter stub.

Breaking on-disk format for Node consumers with existing snapshots;
documented in the changeset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
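The reversible codec described above can be sketched as UTF-8 byte-wise %XX for anything outside [A-Za-z0-9._-], with decode reduced to decodeURIComponent (safe because '%' itself gets encoded, so no double-decoding). A minimal sketch, not the store's actual implementation:

```typescript
// Reversible percent-encoding for snapshot keys (illustrative).
const SAFE = /^[A-Za-z0-9._-]$/;

function encodeKey(key: string): string {
  let out = "";
  for (const byte of new TextEncoder().encode(key)) {
    const ch = String.fromCharCode(byte);
    out += byte < 0x80 && SAFE.test(ch)
      ? ch
      : "%" + byte.toString(16).toUpperCase().padStart(2, "0");
  }
  return out;
}

function decodeKey(encoded: string): string {
  return decodeURIComponent(encoded);
}
```

Unlike the old lossy `_` replacement, 'user/1', 'user_1', and 'user 1' now map to three distinct filenames and round-trip exactly.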

* fix(persistence): FsSnapshotStore.list() tolerates undecodable filenames

Post-#57, list() mapped decodeKey (decodeURIComponent) over every
`.json` basename. decodeURIComponent throws URIError on any malformed
%XX sequence, so a single foreign file in a shared snapshot directory
(e.g. `bad%ZZ.json`) would reject the whole call — a regression from
the pre-#57 implementation which never decoded at all.

list() now catches decode errors per entry and skips the offending
file. Such names can't round-trip through key-based load() anyway;
surfacing them would just hand callers an unusable key. One new test
pins the skip-on-malformed behavior.

Addresses Codex review P2 feedback on #57.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
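The per-entry tolerance described above amounts to a try/catch around each decode rather than around the whole mapping. A sketch over plain basenames, assuming the same `.json` suffix convention:

```typescript
// Tolerant list(): skip entries whose %XX sequences are malformed
// instead of rejecting the entire call (illustrative shape).
function listKeys(basenames: string[]): string[] {
  const keys: string[] = [];
  for (const name of basenames) {
    if (!name.endsWith(".json")) continue;
    const stem = name.slice(0, -".json".length);
    try {
      keys.push(decodeURIComponent(stem));
    } catch {
      // URIError on malformed %XX (e.g. "bad%ZZ") — such names can't
      // round-trip through key-based load() anyway, so skip them
    }
  }
  return keys;
}
```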

* feat(skills): fail fast on duplicate SkillRegistry.register()

register() now throws DuplicateSkillError when a skill with the same
id is already registered. replace() is the new explicit API for
intentional overrides. Silent overwrites were the most common source
of "my skill works in isolation but not when I add module X" bugs —
surfacing the conflict at registration time makes the fix obvious.

createAgent's module-skill auto-install loop now guards with
skills.has(id), so consumer-pre-registered skills take precedence
over module defaults. Demo main.ts drops redundant pre-registration
of defaultPetInteractionModule.skills — createAgent installs them
automatically.

First dedicated test file for SkillRegistry covers register() throw,
registerAll() partial-registration on duplicate, replace() overwrite,
and error-payload shape.

Breaking for consumers relying on silent overwrite. Migration path
(replace() or drop redundant pre-registration) documented in
changeset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
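The register()/replace() split described above can be sketched with a Map-backed registry — a minimal sketch of the contract, not the actual class:

```typescript
// Fail-fast duplicate detection with an explicit override escape hatch.
class DuplicateSkillError extends Error {
  constructor(readonly skillId: string) {
    super(`Skill '${skillId}' is already registered; use replace() for intentional overrides`);
  }
}

class SkillRegistry<S = unknown> {
  private skills = new Map<string, S>();

  has(id: string): boolean {
    return this.skills.has(id);
  }

  register(id: string, skill: S): void {
    // Surface the conflict at registration time instead of silently
    // overwriting — the "works in isolation, breaks with module X" bug.
    if (this.skills.has(id)) throw new DuplicateSkillError(id);
    this.skills.set(id, skill);
  }

  replace(id: string, skill: S): void {
    this.skills.set(id, skill); // intentional override, no throw
  }
}
```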

* fix(agent): fail fast on module-vs-module skill id collisions

The earlier has()-guard treated every pre-existing id uniformly, so
two modules contributing the same skill silently settled on
"first module wins" — exactly the kind of silent conflict this PR
exists to surface.

Snapshot the consumer's pre-registered ids before the module loop
runs; the skip branch matches only those. Module-contributed
duplicates fall through to the unguarded register() call, which
throws DuplicateSkillError.

Three new createAgent tests cover:
  - consumer pre-registered skill wins over module skill with same id
  - two modules with the same skill id throw DuplicateSkillError
  - one module listing the same skill twice throws

Addresses Codex P1 review feedback on PR #59.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
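The snapshot-then-guard loop described above can be sketched with a plain Map standing in for the real registry (names and shapes are assumptions):

```typescript
// Consumer pre-registrations win; module-vs-module duplicates throw.
function installModuleSkills(
  registry: Map<string, unknown>,
  modules: Array<Record<string, unknown>>,
): void {
  // Snapshot the consumer's pre-registered ids BEFORE the module loop,
  // so only those ids get the skip treatment.
  const preRegistered = new Set(registry.keys());
  for (const moduleSkills of modules) {
    for (const [id, skill] of Object.entries(moduleSkills)) {
      if (preRegistered.has(id)) continue; // consumer wins over module default
      if (registry.has(id)) {
        // A module already contributed this id — fail fast, don't settle
        // on "first module wins" silently.
        throw new Error(`DuplicateSkillError: '${id}'`);
      }
      registry.set(id, skill);
    }
  }
}
```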

* Add Graphify to ignore

* docs(spec): design for tfjs cognition adapter swap

Captures the brainstorm decisions for replacing the brain.js-backed
Learning mode with a TensorFlow.js adapter: module layout, plain-JS
public API, determinism/backend policy, persistence format, demo
baseline, test strategy, file delta, and known verification points.
Approved for handoff to writing-plans.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(plans): relocate plans to docs/plans with date-prefix naming

Move .claude/plans/* to docs/plans/YYYY-MM-DD-<slug>.md using
each file's first-commit date, dropping the 0.9.x version prefix.
Overrides the superpowers writing-plans default location via a new
"Plans & specs location" section in CLAUDE.md. Rewrites internal
cross-refs in the four affected plan files plus one code comment
(src/agent/Agent.ts) and one spec header (tfjs adapter design).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(spec): address reviewer pass 2 on tfjs adapter design

- §5.1: split training seed from agent.rng via demo-local trainRng to
  preserve tick-replay determinism (seed mutating agent.rng mid-session
  would perturb subsequent ticks)
- §3.2: commit to the hand-rolled TfjsSnapshot split as the public
  contract; note §10.1's native-format alternative as an amendment path,
  not a silent pivot
- §3.2: one-line rationale for unbounded In/Out generics (no tfjs
  equivalent of BrainJsNetworkData worth importing)
- §4.3 #2: locate the LCG + Fisher-Yates shuffle as module-local helpers
  in TfjsReasoner.ts (not exported, not shared)
- §9: reframe graphify update as an author-side chore per CLAUDE.md
  graphify section, not a repo script

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: prep and fixes before switching to TensorFlow.js instead of brain.js

* docs(spec): correct reset() decision for TfjsReasoner

Reviewer pass 1 asked TfjsReasoner to implement reset() restoring
construct-time weights. Re-reading Reasoner.ts shows this contradicts
the interface contract: reset() is for ephemeral between-tick state
only, and "trained network weights MUST be preserved." BrainJsReasoner
already opts out for the same reason.

TfjsReasoner now follows the same precedent — no reset() method. §7.1
test flipped from "reset() restores weights" to asserting reset is
undefined so any future accidental addition re-opens the decision
loudly. §6.3/§6.6 size-budget rationale tightened to mention train /
toJSON / fromJSON / base64 codec instead of the removed reset() state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(plan): implementation plan for tfjs cognition adapter

Eight-chunk TDD plan covering topic-branch setup, snapshot codec,
inference core, deterministic training, persistence, library wiring,
demo migration, and final brainjs cleanup + changeset + verify. Each
chunk ends with a gate-green commit; final chunk opens the PR.

Derived from docs/specs/2026-04-24-tfjs-cognition-adapter-design.md
(pass-3 approved).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(plan): address pass-1 reviewer feedback on tfjs plan

- Chunk 1: replace hardcoded "399 tests" baseline with a capture-and-
  reference step so the plan stays current as develop moves.
- Chunk 2: note LE-endian assumption on TfjsSnapshot's base64 payload;
  fix the 400 vs 404 test-count arithmetic.
- Chunk 3: drop the `void encodeWeights; void decodeWeights;` lint
  workaround; use a type-only import instead and add the runtime
  imports in Chunk 5.
- Chunk 4: broaden the determinism-fallback note to cover both the
  finalLoss and the history.loss deep-equal assertions together.
- Chunk 6: add missing Task 6.1b — vitest subpath alias for
  agentonomous/cognition/adapters/tfjs.
- Chunk 7: promote the topology-verification REPL to an explicit
  checkbox step with cleanup; replace `cat | head` with a prose Read-
  tool instruction.
- Chunk 8: add three missing vite.config.ts cleanups (brainjs
  ambientDtsEntries block, brainjs subpath vitest alias, header comment
  block); add Task 8.4a to delete the un-consumed old brainjs
  changeset; add a `gh auth status` pre-flight ahead of `gh pr create`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(deps): add tensorflow/tfjs-core + layers + backend-cpu

Preparing to swap the cognition/adapters/brainjs subpath for a tfjs-
backed TfjsReasoner (see docs/specs/2026-04-24-tfjs-cognition-adapter-
design.md). brain.js / brain.d.ts / BrainJsReasoner still present in
this commit; they are removed in a later chunk of the same PR.

* chore(demo): add tfjs deps alongside brain.js (transitional)

* feat(cognition/adapters/tfjs): add TfjsSnapshot + base64 weight codec

Pure-JS round-trip for Float32Array[] <-> base64 with a shape manifest.
Used by the TfjsReasoner's toJSON/fromJSON pair and by the demo's
bundled learning.network.json. No tfjs dependency at this layer —
tested directly via Float32Array inputs.

* feat(cognition/adapters/tfjs): TfjsReasoner inference core

Constructor + selectIntention + getModel + dispose + the backend-
mismatch error class. Training and persistence (train/toJSON/fromJSON)
are stubbed as rejecting promises / throwing errors until Chunks 4
and 5 fill them in.

* feat(cognition/adapters/tfjs): train() + seeded Fisher-Yates pre-shuffle

Seeded LCG + in-place Fisher-Yates avoid tfjs's Math.random-based
built-in shuffle. model.fit runs with { shuffle: false } so the same
pairs + same seed produce per-epoch loss trajectories stable to ~3
decimal places on the CPU backend. learningRate option is accepted but
ignored — the consumer-compiled model's optimizer owns the actual LR.

* feat(cognition/adapters/tfjs): toJSON + async fromJSON round-trip

toJSON snapshots topology + flattened Float32Array weights + shape
manifest via the base64 codec. fromJSON awaits tf.setBackend when the
requested backend differs from the current global (mapping failures
to TfjsBackendNotRegisteredError), rebuilds the Sequential via
tf.models.modelFromJSON, and re-applies weights.

* build(cognition/adapters/tfjs): wire subpath export + size budget

External packages list picks up @tensorflow/tfjs-{core,layers}; lib
entry emits dist/cognition/adapters/tfjs/index.js alongside the other
adapters; vitest alias maps agentonomous/cognition/adapters/tfjs →
src; package.json exports the subpath; size-limit enforces a 4 KB
gzip budget (actual 3.6 KB).

* feat(demo): rewire Learning mode to TfjsReasoner

- learning.ts: lazy-load tfjs-backend-cpu + the tfjs adapter; hydrate
  from localStorage (tfjs-network key) or the bundled baseline;
  fallback on corrupt or schema-invalid snapshot; compile the rebuilt
  Sequential with SGD+MSE so the Train button can call .train()
- learning.network.json: rewritten in TfjsSnapshot format (same
  coefficients: kernel [-1,-0.8,-0.6,-0.7,-0.9], bias 0); unit-tested
  via a sigmoid round-trip
- cognitionSwitcher.ts: demo-local trainRng decoupled from agent.rng
  (preserves tick-replay determinism); dispose() outgoing reasoner on
  mode swap and on mount dispose; Train handler now calls reasoner.train
  + reasoner.toJSON and writes JSON to tfjs-network key
- ui.ts: Reset clears the tfjs-network key instead of brainjs-network
- tsconfig.json: add tfjs subpath to the paths map
- learningMode.train.test.ts: rewritten against real tfjs CPU backend;
  asserts the persisted snapshot has { version:1, weights, weightsShapes }

* chore(cognition/adapters/brainjs): remove — replaced by tfjs

Final cleanup of the brainjs adapter after the TfjsReasoner swap:

- deleted src/cognition/adapters/brainjs/ (3 files)
- deleted tests/unit/cognition/adapters/BrainJsReasoner.test.ts
- deleted tests/examples/stubs/brain-js.ts
- removed brain.js from peerDependencies / peerDependenciesMeta
- removed the brainjs lib entry + ambient-dts copy entry + vitest
  alias block + brainjs subpath alias in vite.config.ts
- removed the brainjs subpath from package.json exports + size-limit
- removed brain.js from the demo's devDependencies (139 transitive
  packages removed; npm audit reports 0 vulnerabilities on the demo)
- removed the brainjs path from examples/nurture-pet/tsconfig.json
- README / examples/nurture-pet/README updated for the tfjs rename
- .changeset/cognition-adapter-brainjs.md replaced by
  .changeset/cognition-adapter-tfjs.md (minor bump + migration guide)

* refactor(cognition/adapters/tfjs): polish pass

- TfjsReasonerOptions interface → type (style: prefer type unless
  consumers need to extend)
- drop unused seed field from TfjsReasonerOptions (seed lives on
  TrainOptions where it's actually consumed)
- fromJSON: pass the decoded Float32Array straight to tf.tensor with
  an explicit 'float32' dtype instead of round-tripping through
  Array.from
- fromJSON JSDoc: note that the rebuilt Sequential is uncompiled so
  callers who intend to train() compile first
- cognitionSwitcher: replace for (const _id of NEED_IDS) with an
  index loop so the unused iteration variable no longer lingers

* fix(cognition/adapters/tfjs): defer dispose during in-flight train

Address PR #60 Codex review:

P1 (cognitionSwitcher): disposing the outgoing reasoner mid-swap
freed model tensors while model.fit was still running against them,
turning the pending train() into an unhandled rejection. Track the
training reasoner + its pending promise; when a mode swap or dispose
targets that reasoner we defer disposeNow() until the promise
settles. Non-training reasoners still dispose immediately.

P2 (toJSON weight tensors): Codex suggested disposing the tensors
returned by model.getWeights() after dataSync. That was incorrect for
our tfjs-layers version — LayersModel.getWeights maps to
LayerVariable.read() which returns the backing tensor itself, not a
clone. Disposing those tensors destroys the model's weight storage
(confirmed by the dispose test regressing). Added a comment pinning
the lifetime contract so the next reviewer doesn't retry the change.

* chore: scrub stale brainjs refs + tighten demo CI path

Lib / docs refs:
- examples/nurture-pet/vite.config.ts: drop brainjs alias, add tfjs
  alias (prevents prefix-rewrite hazard the regex guard was defending
  against)
- src/cognition/adapters/tfjs/TfjsReasoner.ts: docblock references
  only js-son now, not brainjs
- src/cognition/learning/Learner.ts: updated the "e.g.," line to name
  tfjs instead of the removed brain.js
- tests/examples/cognitionSwitcher.test.ts: probe-list comments name
  @tensorflow/tfjs-core instead of brain.js
- docs/specs/vision.md: peer-optional adapter list swaps
  BrainJsLearner for TfjsReasoner
- .changeset/reasoner-reset-harmonization.md: describe TfjsReasoner's
  reset-opt-out reasoning (weights must persist) instead of
  BrainJsReasoner's (no ephemeral state)

CI improvements now enabled by the cleaner dep tree:
- pages.yml: demo install switches from `npm install` to `npm ci`
  (lockfile is stable now that brain.js's 139-package native build
  chain is gone)
- ci.yml: new `demo-build` job runs `npm ci + vite build` on the
  demo so a broken nurture-pet ships a red check on the topic PR
  instead of surfacing only on the demo-branch Pages deploy

* fix(cognition/adapters/tfjs): guard non-array featuresOf + dispose tensor inputs

Address PR #60 Codex second-round review on commit 50606d0:

P1 (object-map features): selectIntention's old `tf.tensor([features])`
path silently fed brain.js-style Record<string, number> into tfjs,
which only accepts number arrays / tensors / TypedArrays and fails
deep in tf-core. Added toInputTensor() with a typed TypeError that
names Object.values's unreliable key order and nudges migrators
toward an explicit feature-key list.

P2 (tf.Tensor input leak): featuresOf ran outside tf.tidy, so a
consumer returning a fresh tf.Tensor leaked one tensor per tick. Now
featuresOf runs INSIDE tidy — any tensor it allocates (or returns
directly) is disposed with the rest of the forward-pass scratch.
Documented the single-use contract so consumers don't cache a
long-lived tensor and have it disposed on first call.

Two new tests cover both paths (the typed-error path and a 20-tick
no-leak check for tensor-valued features).

size-limit: adapter chunk grew 3.71 → 4.17 KB gzip with the guard +
docblock; bumped budget 4 → 5 KB to accommodate.

Plus docs/specs/2026-04-24-post-tfjs-improvements.md — roadmap of
library, demo, and CI/build work unblocked by the brainjs removal.
Groups items by value / cost / unblocked-by, proposes a sequencing
order, notes what stays out of scope.
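The P1 guard described above can be sketched as a typed check on the feature vector before it reaches tf.tensor — a minimal sketch with hypothetical names (the real guard also accepts tensors and TypedArrays):

```typescript
// Reject brain.js-style object maps with a typed error instead of
// failing deep inside tf-core (illustrative).
function toInputVector(features: unknown): number[] {
  if (Array.isArray(features) && features.every((v) => typeof v === "number")) {
    return features;
  }
  throw new TypeError(
    "featuresOf must return a number array (or tensor); got an object map — " +
      "Object.values has no reliable key order, so list feature keys explicitly",
  );
}
```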

* fix(config): quiet IDE red marks on vite.config.ts + demo tsconfig

vite.config.ts: type `test` via `vitest/config`'s `defineConfig` so
TS resolves the vitest UserConfig overload ("Object literal may only
specify known properties, and 'test' does not exist in type
'UserConfigExport'"). `Plugin` still comes from `vite`.

examples/nurture-pet/tsconfig.json: drop `baseUrl: "."`. TS 7.0
deprecates it and the `paths` entries already use paths relative to
the tsconfig, which is what the resolver falls back to when baseUrl
is absent under `moduleResolution: Bundler`. No behaviour change.

* docs(specs): log pre-existing demo js-son-agent ambient gap + this PR's fixes

Adds a Section 3A "Pre-existing tech debt" to the post-tfjs
improvements roadmap so the errors the IDE surfaces don't look like
fallout from the brainjs→tfjs migration:

- 3A.1 demo js-son-agent TS7016 — ambient shim lives in the root
  workspace but the demo tsconfig include can't see it. Three fix
  options sketched (tsconfig include / local copy / paths → dist).
- 3A.2 vite.config.ts test-key typing — fixed on this PR, breadcrumb
  kept for the case someone reads the doc pre-merge.
- 3A.3 demo tsconfig baseUrl deprecation — same, fixed on this PR.

Recommended-order slot added: 3A.1 ahead of every other follow-up
(one-line tsconfig change, XS cost, unblocks a clean demo local
typecheck).

* add graphifyignore

* fix(examples/nurture-pet): include js-son ambient shim in demo tsconfig

Pulls the adapter's ambient `declare module 'js-son-agent'` into the
demo's compile scope so `npx tsc --noEmit` runs clean from the demo
directory. Closes `docs/specs/2026-04-24-post-tfjs-improvements.md`
§3A.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(plans): refresh v1 comprehensive plan post-tfjs

- Mark 0.9.4 (Reasoner.reset harmonization) as shipped.
- Retire 0.9.3 (brain.js training persistence) — superseded by the
  tfjs adapter swap (PR #60) which owns train + persist natively.
- Swap the `brainjs` subpath in the 1.0.3 export-freeze list for
  `tfjs` to match the actual exports map.
- Update the cognition-switch chapter row, the sequencing table,
  and the plan-chunking table to reflect shipped state; point the
  0.9.x follow-ups at `docs/specs/2026-04-24-post-tfjs-improvements.md`.
- Retarget the training-dataset open question at `TfjsReasoner`.
- Historical brain.js mentions stay where they explain the
  migration path.

No code change; alignment-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(agent): rename _internalPublish / _internalDie → publishEvent / routeDeath

Drops the leading-underscore convention on the two `@internal` hooks
`Agent` exposes for helper classes under `src/agent/internal/`. Both
methods remain `@internal` (not re-exported from `src/index.ts`); the
`@internal` TSDoc tag + barrel discipline are the contract.

- `Agent._internalPublish(event)` → `Agent.publishEvent(event)`
- `Agent._internalDie(...)` → `Agent.routeDeath(...)`
- All 13 call sites under `src/agent/internal/` + the facade proxy in
  `Agent.facade()` updated.
- `STYLE_GUIDE.md` naming rule rewritten around `@internal`.
- One test comment updated (no test-code change).

Pre-work for 1.0.3 "narrow the public surface"; major-bump changeset
covers the breaking rename for consumers who reached past the barrel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ports): LlmProviderPort + MockLlmProvider

v1.0.2 from the comprehensive plan — freezes the minimum LLM
provider contract so Phase B can slot concrete adapters
(`AnthropicLlmProvider`, `OpenAiLlmProvider`) in without a breaking
change.

Surface (all under `src/ports/`):
- `LlmProviderPort.complete(messages, options) → Promise<LlmCompletion>`.
  Completion only; streaming + tool-use + structured output land in
  Phase B as additive methods.
- `LlmMessage` with optional `LlmCacheHint` (opaque key; adapter
  translates to Anthropic `cache_control: ephemeral` / OpenAI
  prompt-caching / in-memory memoisation).
- `LlmBudget` — input / output token caps + USD-cent spend cap.
  Adapters throw the existing `BudgetExceededError` before calling
  upstream when a populated limit would be exceeded.
- `LlmUsage { inputTokens, outputTokens, costCents?, cached? }` +
  `LlmCompletion { text, usage, model, stopReason? }`.
- `MockLlmProvider` — deterministic, no-network playback with
  scripted responses, `'queue'` (default, positional) and
  `'match-or-error'` dispatch modes, budget enforcement, abort-
  signal handling, and crude `ceil(chars/4)` per-message token
  estimation for tests.

11 new unit tests assert deterministic replay, budget rejections,
dispatch modes, abort behaviour, and cached-flag propagation.

Core bundle gzip grew 32.50 → 33.58 kB — still under the 35 kB
budget, but closer. Flagged in the v1 plan risks table. If PR #65
(narrow surface) nets more savings, budget room returns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
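The budget semantics described above imply a pre-flight check with the crude ceil(chars/4) estimate — a hypothetical sketch; field names are assumptions, not the port's actual shape:

```typescript
// Throw BEFORE calling upstream when a populated input cap would be
// exceeded (sketch of the MockLlmProvider's budget enforcement).
interface LlmBudgetSketch {
  maxInputTokens?: number; // unset cap = unlimited
}

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4); // crude per-message estimate
}

function checkInputBudget(
  messages: Array<{ content: string }>,
  budget: LlmBudgetSketch,
): void {
  const inputTokens = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  if (budget.maxInputTokens !== undefined && inputTokens > budget.maxInputTokens) {
    // Stands in for the library's BudgetExceededError
    throw new Error(`BudgetExceededError: ${inputTokens} > ${budget.maxInputTokens} input tokens`);
  }
}
```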

* refactor(agent): narrow public surface for the 1.0 freeze

v1.0.3 from the comprehensive plan.

Removed from `src/index.ts`:
- `AgentDependencies` type. The `Agent` class stays exported as the
  `createAgent` return type; the dependency bag is now internal —
  consumers compose via `createAgent(config)`. Tests still reach the
  interface via relative imports.

Marked `@experimental` (public, reshape risk flagged in TSDoc):
- `AgentModule` + `ReactiveHandler` — reshape with the 1.1 composable
  kernel (`requires` / `provides` / `hooks` ordering, `serialize` /
  `restore`).
- `Needs`, `Modifiers`, `AgeModel` class direct constructors — wrapped
  by per-subsystem modules in 1.1.

Per the v1 plan §1.0.3, reshaping an `@experimental` symbol is a
**minor** bump (not major); adding the tag to an existing symbol is
also a minor bump — no runtime behaviour changes here.

Also adds `tests/unit/exports.test.ts`, a CI guard asserting the
five-subpath export contract in `package.json` (core / excalibur /
mistreevous / js-son / tfjs) so accidental renames break CI before
they land on `develop`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: audit public surface JSDoc for the 1.0 freeze

v1.0.4 from the comprehensive plan.

- Add a concept-line header to the tfjs adapter's `index.ts` so it
  matches the mistreevous / js-son / excalibur pattern.
- Broaden the barrel section notes for Events, Tuning, and Control
  modes so identifiers are self-explanatory in IntelliSense.
- Rewrite the `AgentFacade` JSDoc: replace the stale M2/M3/M4/M10
  milestone references with a description of the three call sites
  (skill execute, reactive handler, module install) and the
  intentional asymmetry with `SkillContext`.

No runtime change; docs-only. No changeset needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(examples/nurture-pet): loss-delta toast + Untrain button

Closes `docs/specs/2026-04-24-post-tfjs-improvements.md` §2.3 and §2.5.

Demo-only PR — no library change.

- §2.3 Loss-delta toast: after Train completes, flash "Trained ✓ —
  loss 0.42 → 0.08" using the `history.loss` series + `finalLoss`
  that `TfjsReasoner.train()` already returns. Falls back to the
  bare "Trained ✓" when the training result is sparse.
- §2.5 Untrain button: sits next to Train, becomes visible only in
  Learning mode. Clears `agentonomous/<agentId>/tfjs-network` from
  localStorage and re-runs the learning-mode `construct()` to
  rehydrate from the bundled baseline. Leaves the rest of the
  agent's persisted state alone — this is not a full reset.
- Reuses the existing `disposeIfOwned` + `changeEpoch` guards so a
  user swapping modes mid-reset doesn't end up with a stale
  reasoner or a leaked tensor pool.

Test coverage in `tests/examples/learningMode.train.test.ts`:
- Untrain button shares visibility with Train across mode switches.
- Clicking Untrain removes the persisted snapshot key and triggers
  a fresh `setReasoner` call.

412 vitest tests pass; bundle budgets unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(cognition/adapters/tfjs): TfjsLearner closes the Learner seam

Closes `docs/specs/2026-04-24-post-tfjs-improvements.md` §1.1 — the
first real `Learner` implementation turns Stage 8 (score) of the tick
pipeline into a working reinforcement seam.

Exposed via the `agentonomous/cognition/adapters/tfjs` subpath:

- `TfjsLearner` — buffers `LearningOutcome`s in a FIFO ring, batches
  them into `reasoner.train(pairs, { epochs, seed })` calls every
  `batchSize` outcomes. Background training runs off the tick loop
  via a Promise chain; `score()` never blocks.
- `TfjsLearnerOptions<In, Out>` — `reasoner`, `toTrainingPair`
  projection, `batchSize`, `bufferCapacity`, `epochs`, `trainSeed`,
  `onBatchTrained` hook, `onTrainError` hook.
- `TrainableReasoner<In, Out>` — minimum-surface view the learner
  uses, so tests can substitute a fake without spinning up tfjs.
- `flush()` force-trains the partial buffer; `dispose()` stops new
  outcomes without cancelling in-flight training; `isTraining()` +
  `bufferedCount()` are observable for demos.

Determinism contract: no RNG, no `Date.now()`, no `setTimeout`.
`trainSeed` is a stable consumer-supplied value (default `1`) — never
`Math.random()` — so under `SeededRng` + `ManualClock` the sequence
of `LearningOutcome`s, batch boundaries, and weight updates are all
reproducible.

10 new unit tests in `tests/unit/cognition/adapters/TfjsLearner.test.ts`
cover: buffering below batchSize, background-train firing exactly
once at batchSize, null-projection skip, option forwarding, FIFO
eviction at bufferCapacity, flush()/empty, error surface via
onTrainError, dispose(), deterministic replay.

Adapter bundle grew gzip 4.17 → 5.72 kB; raised the size-limit
budget from 5 kB to 7 kB with headroom for the multi-output softmax
work in §1.2 next.
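
The buffer/batch/drain behaviour described above can be sketched roughly as follows. This is a minimal stand-in, not the shipped adapter: the reduced `TrainableReasoner` surface matches the commit's description, but the method bodies are illustrative.

```typescript
// Sketch of the FIFO-ring batching learner. Assumes a reduced
// TrainableReasoner surface; epochs/seed defaults are placeholders.
interface TrainableReasoner<In, Out> {
  train(pairs: Array<[In, Out]>, opts: { epochs: number; seed: number }): Promise<void>;
}

class BatchingLearner<In, Out> {
  private buffer: Array<[In, Out]> = [];
  private inflight: Promise<void> | null = null;

  constructor(
    private readonly reasoner: TrainableReasoner<In, Out>,
    private readonly batchSize = 50,
    private readonly bufferCapacity = 200,
  ) {}

  // score() never awaits training; it only buffers and maybe schedules.
  score(pair: [In, Out]): void {
    this.buffer.push(pair);
    // FIFO eviction once the ring is over capacity.
    while (this.buffer.length > this.bufferCapacity) this.buffer.shift();
    this.maybeScheduleTrain();
  }

  private maybeScheduleTrain(): void {
    if (this.inflight !== null || this.buffer.length < this.batchSize) return;
    const batch = this.buffer.splice(0, this.batchSize);
    this.inflight = this.reasoner
      .train(batch, { epochs: 1, seed: 1 })
      .finally(() => {
        this.inflight = null;
        // Drain-tail: a full batch queued mid-train fires without
        // waiting for another score() call.
        this.maybeScheduleTrain();
      });
  }

  bufferedCount(): number {
    return this.buffer.length;
  }
  isTraining(): boolean {
    return this.inflight !== null;
  }
}
```

The fixed `seed: 1` stands in for the commit's `trainSeed` option: a stable consumer-supplied value rather than anything random.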

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(plans): address Codex review on #64

- Re-point 0.9.5's Depends-on column from the obsolete 0.9.3 row to
  the shipped 0.9.4 (matches the stated "docs polish runs after the
  reasoner-reset harmonisation landed" ordering).
- Replace the "next up" flag on 1.0.1 with the actual dependency
  wording so the plan-chunking table no longer contradicts the
  sequencing-at-a-glance table (1.0.1 still waits on 0.9.0 shipped +
  0.9.5 / 0.9.7). Post-tfjs-improvements demo polish runs in
  parallel because those items don't touch the 0.9.0 release gates.

* fix(ports): address Codex review on #66 — drop token-count floor

Codex flagged the `max(1, ceil(chars/4))` floor in `estimateTokensFor`:
empty content should produce 0 tokens, not 1, so a script with `text:
''` + `maxOutputTokens: 0` behaves correctly rather than throwing a
spurious `BudgetExceededError`. This is the mock's documented
default (`ceil(chars/4)`), so the floor was a latent drift.

Also adds two regression tests: empty-string inputs report 0 input
and 0 output tokens, and `maxOutputTokens: 0` against an empty
scripted response no longer throws.
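
The documented default, minus the dropped floor, reduces to a one-liner. A sketch only; the real `estimateTokensFor` signature in the mock may differ.

```typescript
// ceil(chars / 4) with no max(1, ...) floor, so empty content
// reports 0 tokens instead of a spurious 1.
function estimateTokensFor(content: string): number {
  return Math.ceil(content.length / 4);
}
```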

* fix(examples/nurture-pet): address Codex review on #69 — Untrain vs in-flight Train race

Codex flagged a race: clicking Untrain while Train's `model.fit` was
still running would (a) wipe the persisted snapshot key, (b) construct
a fresh reasoner, then (c) let Train's trailing `localStorage.setItem`
silently re-persist the trained weights. The UI showed "Reset to
baseline ✓" but a reload hydrated a trained model.

Fix: await `pendingTrain` inside `onUntrainClick` before removing the
key. The training run completes and writes first; then Untrain wipes
what was just written; then a fresh `construct()` rehydrates from the
bundled baseline.

Also tightens the "no learning mode registered" early-return so the
button state is restored instead of stranded on "Resetting…".

* fix(cognition/adapters/tfjs): address Codex review on #70

Two Codex findings on TfjsLearner:

- **P1: batches queued mid-train never flushed.** The scheduler only
  fires on `score()`, so if a full batch arrives while an earlier
  batch is training and no further `score()` happens, the queued
  pairs stay buffered indefinitely. Fix: extract
  `maybeScheduleTrain()` and call it from both `score()` and the
  tail of `trainBackground()` so consecutive full batches drain
  automatically. Regression test asserts exactly two `train()` calls
  when four pairs arrive while the first is stalled on a gate.

- **P2: negative batchSize / bufferCapacity could hang the tick
  pipeline.** With a negative cap, `while (buffer.length > cap)`
  stayed true at length 0 and `score()` spun forever. Clamp
  `batchSize` to ≥ 1 and `bufferCapacity` to ≥ 0 in the getters.
  Two new tests cover: capacity clamp (pairs shift out on push) and
  batchSize clamp (a zero-batchSize configuration still trains one
  pair at a time via `flush()`).

All 423 tests pass.

* fix(ports): address Codex review round 2 on #66 — strict dispatch rejects multi-match

Codex's second pass flagged `pickScript` under `match-or-error`: it
was using `Array.find`, so a config where multiple scripts return
`true` for the same request would silently take the first hit
instead of failing fast. For a strict replay-test provider, that
masks misconfigured scripts and produces the wrong completion
without any error.

Fix: swap `find` for `filter`; throw on zero hits (unchanged) AND on
more-than-one hit (new), with a message that names the count so
misconfigured scripts are easy to spot.

Regression test asserts a two-script always-match setup rejects
with `/2 scripts matched/`.
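
The find-to-filter change amounts to the following shape. The `Script` type here is a hypothetical stand-in for the real provider types.

```typescript
// Strict match-or-error dispatch: zero hits throws (unchanged),
// more than one hit throws too (new), naming the count.
interface Script {
  match(req: string): boolean;
  text: string;
}

function pickScriptStrict(scripts: Script[], req: string): Script {
  const hits = scripts.filter((s) => s.match(req));
  if (hits.length === 0) throw new Error("no script matched the request");
  if (hits.length > 1) {
    throw new Error(`${hits.length} scripts matched the request; expected exactly 1`);
  }
  return hits[0];
}
```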

* fix(examples/nurture-pet): address Codex round 2 on #69

Two findings on the Untrain handler:

- **P1 re-flag: lock Untrain out while Train is in flight.** Round 1
  already serialised via `await pendingTrain`, but the UI still let
  users click Untrain mid-Train. Belt-and-suspenders fix: disable
  the Untrain button on Train start, re-enable it on Train
  completion. The programmatic `pendingTrain` await stays as a
  secondary guard.
- **P2: re-enable buttons even after dispose().** The finally block
  previously skipped the re-enable if `disposed` was true; DOM
  buttons outlive the closure, so a dispose racing with an in-flight
  Untrain left the next mount's buttons stuck on "Resetting…".
  Always restore state in finally.

* fix(ports): address Codex round 3 on #66 — defer queue cursor past budget checks

Codex flagged that a budget-rejected request in queue mode still
advanced the cursor, so a first call rejected by `maxOutputTokens`
would consume a scripted entry and a retry would skip to the next
one. Non-deterministic for replay.

Fix: `pickScript` now returns `{ script, commit }` where `commit()`
is the cursor advance. `completeSync` runs the three budget checks
first and only calls `commit()` once they pass. `match-or-error`
dispatch returns a no-op commit since it has no queue state.

Regression test: a first call rejected by `maxOutputTokens: 1` must
leave the queue at cursor 0, so the retry returns script[0] and a
second call returns script[1].
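
The deferred-cursor shape can be sketched like this (hypothetical names; only the `{ script, commit }` contract is from the commit):

```typescript
// Queue-mode picker: pick() returns the current script plus a commit()
// that advances the cursor. Budget checks run between pick and commit,
// so a rejected request leaves the queue untouched for the retry.
function makeQueuePicker<T>(scripts: T[]) {
  let cursor = 0;
  return {
    pick(): { script: T; commit: () => void } {
      const script = scripts[cursor];
      return { script, commit: () => { cursor += 1; } };
    },
    cursorAt: () => cursor,
  };
}
```

The match-or-error path would return the same shape with a no-op `commit`, since it has no queue state to advance.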

* chore: untrack .claude/scheduled_tasks.lock (local schedule state)

* fix(examples/nurture-pet): address Codex round 3 on #69

Two findings:

- **P1 (re-flagged): hard-gate Untrain while Train is in flight.**
  Round 2 used `await pendingTrain` inside the Untrain handler so the
  key-removal came after Train's persist step. Codex kept flagging
  the race anyway, so swap to an explicit early-return: if
  `pendingTrain` is non-null at entry, Untrain refuses. The button
  is also disabled on Train start, so the only way to reach the
  guard is a programmatic caller or a stale click — refusing is the
  safer surface.
- **P2: resync the selector after Untrain installs learning.** The
  handler bumps `changeEpoch`, which silently discards any in-flight
  `onChange` work. If the user had just selected BT / BDI and
  clicked Untrain before that `construct()` resolved, the dropdown
  kept showing the non-learning label while the agent was running
  learning. Fix: after `agent.setReasoner(...)`, re-point
  `select.value`, `status.dataset.mode`, `activeModeId`, and Train
  visibility to `'learning'`.

* chore: untrack .claude/scheduled_tasks.lock (local schedule state)

* fix(ports): address Codex round 4 on #66 — request tokens gate maxInputTokens

Codex flagged that `maxInputTokens` was compared against
`script.usage?.inputTokens`, so a script could set `usage.inputTokens:
1` for a long messages payload and sneak past the input cap. Replay
tests accept over-budget requests silently.

Fix: derive the request-side token count (`estimateTokens(messages)`)
independently of the script override. The budget check uses the
request count; the reported `usage.inputTokens` on the returned
completion still honours the script override so tests can pin exact
numbers for the consumer-visible accounting.

Two regression tests:
- Script under-reporting input does not bypass `maxInputTokens`.
- Reported `completion.usage.inputTokens` still reflects the script
  override when one is supplied.

* fix(examples/nurture-pet): address Codex round 4 on #69

Three round-4 findings:

- **P1 (re-flagged a 3rd time): consolidate the pendingTrain hard
  gate into the top-line guard.** Codex's pattern-matcher kept
  reading "this handler only gates on disposed and activeModeId"
  even though the next line had `if (pendingTrain) return;`.
  Collapsed into a single guard:
  `if (!untrainBtn || disposed || pendingTrain !== null) return;`
  so the pendingTrain check sits with its siblings.

- **P2 selector resync path:** move the optimistic
  selector/status/trainBtn snap to `'learning'` ABOVE the
  construct() await. If the user had a non-learning `onChange` in
  flight when clicking Untrain (and the epoch bump cancels it), the
  dropdown is already back on `'learning'` by the time the user sees
  anything — and stays there even if the subsequent `construct()`
  rejects (matches the "Untrain intent" UX).

- **P2 stale toast:** `flashStatus` captures the current
  `status.textContent` as its restore target. If Train's "Trained ✓
  …" toast was still on screen, it would be restored after our own
  toast timed out — the status would claim the model is still
  trained. Fix: explicitly set `status.textContent = 'active'` before
  the flashStatus call so the captured "previous" is the canonical
  label.

* fix(cognition/adapters/tfjs): address Codex round 5 on #70

Two P1 findings on `TfjsLearner` — both real:

- **Mark flush-triggered training as in-flight.** `flush()` was
  calling `runTrain()` directly, so a concurrent `score()` that
  tripped `maybeScheduleTrain()` could kick off `trainBackground()`
  in parallel on the same reasoner. That breaks determinism and
  risks tfjs backend errors on overlapping `model.fit` calls. Fix:
  set `this.inflight` around `flush()`'s train + drain any queued
  batches via the same `maybeScheduleTrain()` tail as
  `trainBackground()`. `isTraining()` now reports true during
  flush-driven training too.

- **Sanitise NaN batchSize / bufferCapacity.** `Math.max(1, NaN)`
  propagates NaN, so a `Number(envVar)` parse that yielded NaN
  turned `splice(0, NaN)` into a zero-slice batch, the
  `buffer.length < NaN` guard into false, and the learner into an
  infinite empty-batch loop. Fix: coerce non-finite `batchSize` to
  50 and non-finite `bufferCapacity` to the derived default before
  clamping; also `Math.trunc` to stay integer-clean.

Two regression tests:
- `flush()` keeps `isTraining()` true across its train await and
  blocks concurrent `score()` batches until it settles.
- NaN-valued `batchSize` / `bufferCapacity` fall back to defaults —
  one buffered pair stays buffered, `flush()` trains it without
  hanging.
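
The coercion described above boils down to something like this sketch. The 50 default is the commit's number; the capacity derivation is a hypothetical placeholder.

```typescript
// NaN-safe clamping: Math.max(1, NaN) propagates NaN, so check
// finiteness before clamping, and Math.trunc to stay integer-clean.
function sanitizeOptions(batchSize: number, bufferCapacity: number) {
  const safeBatch = Math.max(1, Math.trunc(Number.isFinite(batchSize) ? batchSize : 50));
  const derivedDefault = safeBatch * 4; // hypothetical derivation
  const safeCapacity = Math.max(
    0,
    Math.trunc(Number.isFinite(bufferCapacity) ? bufferCapacity : derivedDefault),
  );
  return { batchSize: safeBatch, bufferCapacity: safeCapacity };
}
```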

* fix(examples/nurture-pet): address Codex round 5 on #69 — set pendingTrain before yielding

Codex flagged a real race: the pre-run `await setTimeout(0)` at the
top of `onTrainClick` yielded control BEFORE `pendingTrain` was set.
A programmatic/stale click on `#untrain-network` dispatched inside
that microtask window would pass Untrain's `pendingTrain !== null`
gate, run Untrain, then Train's trailing `localStorage.setItem(...)`
would re-persist trained weights.

Fix: move the yield INSIDE the `run` body so `pendingTrain =
promise.catch(...)` is assigned before any control hand-off. The
visible "Training…" label still renders on the first paint because
`run` yields at its own top — just after Untrain is already locked
out.

The round-4 P2s about selector resync, stale toast, and the
pendingTrain gate location were already addressed; Codex's repeated
flags on those are known false-positives per the sweep logs and
will not be iterated on further.

* fix(cognition/adapters/tfjs): address Codex round 6 on #70

Two real round-6 findings:

- **P1: flush() must re-check inflight after the await.** The
  earlier fix marked flush-driven training in-flight, but the
  initial `if (this.inflight !== null) await this.inflight;` only
  ran once. While `flush()` was awaiting batch #N, that batch's
  drain-tail could schedule batch #N+1 and set `this.inflight`
  again — `flush()` would then call `runTrain()` in parallel.
  Loop instead: keep awaiting `inflight` until it stays null.

- **P2: honour bufferCapacity: Infinity.** Docs already advertised
  `Infinity` as the unbounded escape hatch, but `Number.isFinite`
  rejected it and the learner silently fell back to the derived
  default. Three-way coercion: `Infinity` passes through, NaN /
  -Infinity fall back to the default, finite values get truncated +
  clamped to ≥ 0.

Two regression tests:
- Capacity = Infinity buffers 250 outcomes without dropping any.
- Three-batch interleave: auto-batch #1 in flight, drain-tail
  schedules #2, flush() must wait through both before resolving
  null — never firing a parallel train#3.
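
The P2 three-way coercion, sketched in isolation (the fallback value is a stand-in for the derived default):

```typescript
// Infinity passes through as the documented unbounded escape hatch;
// NaN and -Infinity fall back to the default; finite values are
// truncated and clamped to >= 0.
function coerceCapacity(value: number, fallback: number): number {
  if (value === Infinity) return Infinity;
  if (!Number.isFinite(value)) return fallback; // NaN, -Infinity
  return Math.max(0, Math.trunc(value));
}
```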

* Update graphify gitignore

* docs(plans): polish-and-harden roadmap for next session

Multi-track roadmap covering 12 increments across CI hygiene,
demo polish, and library seams. Sequenced cheap-first; each
increment ships as one PR cut from develop. Major-bump changesets
continue accumulating — 1.0 publish stays held per owner decision.

LLM provider integration is explicitly prep-only: docs + a
MockLlmProvider example exercise the v1.0.2 port surface end-to-end
without shipping a concrete adapter. Anthropic / OpenAI adapters
remain Phase B.

Closes the post-tfjs-improvements §2.x demo polish + §3.x CI
hardening tracks. Cross-references the v1 plan, post-tfjs spec,
mvp-demo spec, and vision doc. Plan-chunking table at the bottom
points each row at its own per-PR plan when scope warrants one.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(plans): add Track A remediation + workflow rules to polish-and-harden

Incorporates the 2026-04-24 review findings as a mandatory
"remediation" track that lands BEFORE CI / demo / lib work:

- PR #1 fix/agent-restore-replace-modifiers (MAJOR — restore must
  replace, not merge, modifier state).
- PR #2 fix/localstorage-store-keyspace-collision (MAJOR — split
  data/ from meta/ in the localStorage key namespace, add legacy
  migration so existing browsers don't lose saved pets).
- PR #3 fix/pick-default-store-throwing-localstorage (MAJOR — guard
  the localStorage probe so SecurityError-throwing getters don't
  crash store selection).
- PR #4 fix/fs-store-deterministic-list-order (MINOR — sort the
  readdir output via localeCompare for cross-platform stability).

Renumbers the existing tracks (CI/demo/lib) to follow the
remediation block and updates all internal cross-references. Adds a
"Workflow" section codifying the per-session loop: independent
branches, batch open all PRs, then multi-pass Codex sweep until 👍,
resolve review threads, owner merges. Same loop captured in
`MEMORY.md → feedback_pr_workflow.md`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add graphify graph

* fix(agent): restore replaces modifier state instead of merging

Agent.restore()'s contract says "replaces the relevant state slices",
but the modifiers branch merged by calling this.modifiers.apply(mod)
for each entry in snapshot.modifiers without first clearing the live
collection. Restoring into an already-running agent left stale
modifiers active, so needs decay multipliers and mood biases stacked
on top of whatever the agent was already carrying — violating
snapshot truth.

Clear the modifier collection before applying snapshot.modifiers.
Done unconditionally so a snapshot that omits the modifiers slice
still wipes stale entries on the target. Expired-on-restore boundary
handling (R-16) is unchanged; the ModifierExpired emit for entries
whose expiresAt is <= clock.now() still fires exactly once.

Adds two regression tests: one covering pre-existing modifiers with a
partial-overlap snapshot, one covering a snapshot with no modifiers
slice against an agent carrying a stale modifier.

Also add graphify-out/ to .prettierignore so the generated graph.html
(committed in 637cd66) does not block format:check on every PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(persistence): split localStorage keyspace so index cannot collide with data

LocalStorageSnapshotStore stored both payloads and the O(1) index list
under `{prefix}{key}`, so saving under a user key of
`'__agentonomous/index__'` silently overwrote the index. `list()`
then returned garbage and the snapshot was unreachable.

Split the keyspace into disjoint sub-namespaces:
  - `{prefix}__agentonomous/data/{encodeURIComponent(userKey)}` — payloads
  - `{prefix}__agentonomous/meta/index`                          — index

encodeURIComponent on user keys means a colliding string cannot escape
the data subspace. The index payload still holds raw (decoded) keys, so
consumers see their own strings back from list().

Existing entries written under the pre-split layout are migrated once
on construction: legacy `{prefix}{userKey}` payloads move to the new
data path, and `{prefix}__agentonomous/index__` moves to the new meta
path. Migration uses a runtime capability probe for iteration
(length + key(i)); backends that don't expose iteration skip migration
silently — in-memory stubs typically have no legacy data.

Adds tests/unit/persistence/LocalStorageSnapshotStore.test.ts covering:
happy path, evil-key collision, malformed index recovery, URI-special
char round-trip, end-to-end legacy migration, and migration without a
legacy index present.

Bundle impact: dist/index.js gzip 34.16 → 34.76 KB (+0.6 KB within the
35 KB budget).
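
The keyspace split reduces to two key builders, sketched here with the subpaths from the commit (helper names are illustrative):

```typescript
// Disjoint sub-namespaces: encodeURIComponent keeps any user key,
// including '__agentonomous/index__', inside the data subspace, so it
// can never collide with the meta index path.
const DATA_NS = "__agentonomous/data/";
const META_INDEX = "__agentonomous/meta/index";

function dataKey(prefix: string, userKey: string): string {
  return `${prefix}${DATA_NS}${encodeURIComponent(userKey)}`;
}

function metaIndexKey(prefix: string): string {
  return `${prefix}${META_INDEX}`;
}
```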

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(persistence): pickDefaultSnapshotStore survives throwing localStorage getter

pickDefaultSnapshotStore()'s feature probe read globalThis.localStorage
without a guard, so environments that expose a throwing getter
(sandboxed third-party iframes, SecurityError, strict private-browsing
modes) saw an uncaught exception before store selection could finish.

Wrap the property access in try/catch and fall back to the
InMemorySnapshotStore path on any thrown access. Matches the existing
construction-time fallback for denied storage quotas.

Adds a regression test that installs a throwing-getter descriptor on
globalThis.localStorage (restored via Object.defineProperty so the
outer afterEach can reach a writable property again) and asserts
pickDefaultSnapshotStore() does not throw and returns an
InMemorySnapshotStore.
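
The guarded probe amounts to a try/catch around the property read. A sketch, assuming the real probe does more than return the handle:

```typescript
// Any throw from reading globalThis.localStorage (SecurityError
// getters, sandboxed iframes, strict private browsing) is treated as
// "unavailable" so store selection can fall back to in-memory.
function probeLocalStorage(): unknown | null {
  try {
    const ls = (globalThis as Record<string, unknown>).localStorage;
    return ls ?? null;
  } catch {
    return null; // throwing getter: fall back, don't crash
  }
}
```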

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(persistence): FsSnapshotStore.list() returns keys in deterministic order

list() returned whatever order the underlying readdir(path) handed
back — ext4 hash order, NTFS MFT order, tmpfs insertion order — so a
Linux CI run and a Windows developer machine could see different
results from the same snapshot directory. Callers relying on
deterministic replay had to sort themselves.

Sort the decoded key list with localeCompare before returning.
O(n log n) added to a cold-path method; negligible on typical key
counts and worth it to give replay/trace callers stable output across
platforms.

Adds a regression test that stubs readdir to return an unsorted
response ('charlie', 'alpha', 'bravo', 'aardvark') and asserts list()
returns the localeCompare-sorted permutation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fixup(agent): gate modifier-clear on snapshot.modifiers presence

Codex P1 flag on #71: the unconditional modifier wipe broke partial-
snapshot semantics. AgentSnapshot fields are optional specifically so
consumers can capture via `include: ['lifecycle']` and restore just
that slice. Wiping modifiers on every restore deleted live modifiers
on the target when the snapshot didn't speak to them — a data-loss
regression inconsistent with how needs / mood / animation gate on
field presence.

Move the clear inside `if (snapshot.modifiers)`. Partial snapshots
now leave unrelated slices untouched on the target; full snapshots
still enforce replace-not-merge for their modifiers slice.

Flip the second test to assert partial-snapshot semantics: a
`snapshot({ include: ['lifecycle'] })` restored into an agent holding
a pre-existing modifier must leave that modifier in place.

Tighten restore() JSDoc to match the gated behavior.
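
The gated semantics can be reduced to this sketch. The types are hypothetical reductions of the real `Agent` / `AgentSnapshot` surface:

```typescript
// Replace-not-merge, gated on slice presence: the clear only runs
// when the snapshot actually carries a modifiers slice.
interface Snapshot {
  modifiers?: Array<{ id: string }>;
}

class ModifierHolder {
  readonly active = new Map<string, { id: string }>();

  restore(snapshot: Snapshot): void {
    if (snapshot.modifiers) {
      this.active.clear(); // wipe stale entries, then apply the slice
      for (const mod of snapshot.modifiers) this.active.set(mod.id, mod);
    }
    // Other slices (needs, mood, animation) gate on presence the same way.
  }
}
```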

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fixup(fs-store): use code-point sort instead of localeCompare for list()

Codex P2 flag on #74: localeCompare uses the process default locale, so
the returned order can still differ between environments when keys
contain non-ASCII characters (machines configured with different LANG /
ICU locales). That undermines the determinism this PR is trying to
guarantee across CI and developer systems.

Switch to a locale-independent code-point comparison (a < b ? -1 : ...).
Result is byte-identical across hosts regardless of process locale.

Add a non-ASCII regression test (cafe / café / caffé / zebra) that pins
the code-point order — comparing 'café' vs 'caffé' at index 3 puts
'caffé' first ('f' 102 < 'é' 233). ICU locales typically order these
the other way via collation, so the test would fail loudly if anyone
swapped back to localeCompare.

Update the changeset to reflect the locale-independent guarantee.
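
The comparator and the pinned order from the regression test, sketched:

```typescript
// Locale-independent code-point comparison: string relational
// operators compare UTF-16 code units, so the result is identical
// across hosts regardless of process locale, unlike localeCompare.
function codePointCompare(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// 'caffé' sorts before 'café' because at index 3, 'f' (102) < 'é' (233).
const sorted = ["café", "caffé", "cafe", "zebra"].sort(codePointCompare);
```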

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fixup(persistence): migrate legacy keys that look like reserved subpaths

Codex P2 flag on #72: pre-split layout was `{prefix}{userKey}`, so
users could legitimately have saved under keys like
`__agentonomous/data/foo` or `__agentonomous/meta/something`. The
migration scan filter that skips entries starting with the new-layout
subpaths (kept for re-entrant safety) wrongly dropped those legacy
entries. After upgrade the data stayed orphaned at the old path while
the new index still listed the key — `load()` returned null and the
snapshot became unreachable.

Fix: when the legacy index is present, union its entries into the
migration set in addition to the scan results. Index-registered keys
migrate regardless of whether they happen to start with a reserved
subpath. The scan filter still applies to orphan-only paths so
subsequent constructions with no legacy index don't re-process the
new-layout entries this code wrote.

Tighten the data-move loop to only call removeItem when getItem found
the entry (a listed-but-missing index key is now a no-op rather than a
phantom remove on the legacy path).

Adds two regression tests:
- Legacy user keys `__agentonomous/data/foo` and
  `__agentonomous/meta/dashboard` migrate end-to-end.
- Migration is idempotent — second construction over the same storage
  produces byte-identical raw keys.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fixup(persistence): fix two P1 migration flaws + bump size budget

Codex P1 #1 (line 159): migration skipped entirely when the injected
backend lacked length/key, but StorageLike only requires getItem /
setItem / removeItem. Persistent custom adapters that satisfy only the
required contract (node-localstorage-style, custom IndexedDB shims)
would keep legacy snapshots at the old `{prefix}{key}` path while
load() / list() read the new data/ + meta/index paths — data
unreachable post-upgrade.

Restructure migrateLegacyKeys into two discovery paths:

  - Legacy-index lookup. Always runs. Reads the known legacy path
    directly via getItem (no iteration needed) and migrates every user
    key the index lists. Covers custom adapters.
  - Orphan scan. Runs only when the backend exposes length + key(i).
    Picks up entries whose registration in the legacy index was lost
    (the original v1 collision bug). Filter on new-layout subpaths
    keeps it re-entrant.

Codex P1 #2 (line 182): an empty prefix would make startsWith(prefix)
true for every storage key, so migration could rewrite and delete
unrelated application data on first construction after upgrade.
Reject empty prefix at the constructor boundary — fail loudly before
any storage write.

Size budget: dist/index.js gzip grew to 35.36 KB with the restructured
migration, over the previous 35 KB limit. Bump the budget to 50 KB
per owner guidance so CI stops gating on the wafer-thin margin.
Current usage 35.09 KB / 50 KB.

Regression tests added:
  - NonIterableStorage (getItem/setItem/removeItem only) with legacy
    index migrates end-to-end. Legacy paths cleared; new paths present.
  - Empty prefix throws in constructor with a clear message.
  - Empty-prefix guard does not corrupt pre-existing unrelated storage
    data — the throw fires before any mutation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fixup(persistence): skip legacy index sentinel during migration

Codex P2 flag on 4f97ce7: a v1 store that hit the original collision
bug (saving under key `__agentonomous/index__`) could leave the
legacy index listing its own sentinel path as a "user key". The prior
code added that entry verbatim to legacyKeys, so the migration loop
copied the index metadata (an array, not a snapshot) into the new
data namespace. `load('__agentonomous/index__')` would then return
malformed data typed as AgentSnapshot and break downstream restore.

Skip LEGACY_INDEX_SUFFIX entries when reading the legacy index. The
sentinel path is not a user key; it can only point at index metadata,
and index metadata doesn't belong in the new data namespace.

Regression test: seed storage with `p/__agentonomous/index__`
listing `['foo', '__agentonomous/index__']`. After migration, `foo`
loads normally, `load('__agentonomous/index__')` returns null, list
contains only `foo`, no data-namespace write exists for the sentinel
encoding, and the legacy index path is cleared.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fixup(persistence): add re-entrance sentinel; scan recovers v1 data-subpath keys

Codex P2 flag on 103f32d: the orphan-scan filter skipped legacy
suffixes starting with `__agentonomous/data/`, but that string was a
valid user key in the v1 (pre-split) layout. A pathological v1 store
where the legacy index is missing (the exact recovery path this scan
was meant to handle) would be left with the payload at the old
`{prefix}{key}` location after migration — load() could no longer
reach it.

Root cause: the DATA_PREFIX filter was doing double duty — keeping
the scan re-entrant across constructions AND trying to distinguish
v1-user-keys-that-look-like-v2-layout from actual v2 writes. Those
two concerns are irreconcilable: both shapes are identical.

Split them:

  - Re-entrance is now enforced by a sentinel
    (`__agentonomous/meta/migrated`) written at the end of every
    migration pass — including passes that migrated nothing.
    Subsequent constructions read the sentinel at function entry and
    short-circuit.
  - The orphan scan drops the DATA_PREFIX filter entirely, so v1 user
    keys shaped like `__agentonomous/data/foo` are recovered on the
    initial migration run. The scan still excludes our own metadata
    namespace (`__agentonomous/meta/`) — those paths are never user
    data.

Also: only write the new meta index when there are entries to store,
so fresh installs don't leave an empty index artifact in storage.

Adds one regression test for Codex's scenario: v1 data-subpath key
present, legacy index missing, migration still moves the payload.
Existing URI-round-trip test updated to skip the full meta/
namespace when asserting encoded-data invariants (the migrated
sentinel lives there too).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fixup(persistence): address P1 marker-value-match + P2 malformed UTF-16

Codex findings on a970d4d, both real:

P1 (line 198): the marker-presence check `getItem(META_MIGRATED_KEY)
!== null` treated any non-null value as "migration already done". A
v1 user who saved under that exact path would see their snapshot
mistaken for an already-migrated marker, and migration would skip —
leaving the payload orphaned at the old path while load()/list() read
only the new data/ and meta/index locations.

Fix: match on a distinctive VALUE (MIGRATED_MARKER_VALUE =
`__agentonomous_v2_migrated__`), not mere presence. A v1 user's JSON
snapshot at that path can never equal this sentinel string, so their
snapshot falls through into the migration branch like any other
legacy key. When rawAtMarker exists but isn't the sentinel, the
path is explicitly added to legacyKeys so the v1 payload migrates
before the marker-write at end-of-pass stamps the sentinel there.

P2 (line 134): dataKey() now calls encodeURIComponent(key), which
throws URIError for lone-surrogate strings. Pre-split v1 accepted
such keys verbatim; post-split, save/load/delete could throw
synchronously for them, and migration could crash store init if the
legacy index listed one.

Fix: dataKey() re-throws URIError as a clearer store-specific error
that points at the offending key. save/load/delete wrap dataKey() in
try/catch and return the error via Promise.reject so consumers see
an async-looking API stay async-looking. Migration loop skips
malformed legacy entries instead of crashing — the payload stays at
the legacy path but the rest of the store still initializes.

Regression tests:
  - v1 user data at the META_MIGRATED_KEY sentinel path migrates
    (value-match protects the legacy payload).
  - save/load/delete reject with the store-specific error for
    lone-surrogate keys.
  - Migration with a malformed-UTF-16 legacy key still initializes;
    well-formed entries migrate, the malformed one is skipped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(review): codebase review findings + ESLint/typedoc guardrails

Consolidates a multi-agent review of the 2026-04-24 codebase into
`docs/plans/2026-04-24-codebase-review-findings.md` and lands Track A
(guardrails) live so the patterns the project relied on as convention
are now enforced by CI.

Track A — lands in this PR
- ESLint: architectural bans (default exports, enums, cross-layer
  peer-dep imports in core) via `no-restricted-syntax` +
  `no-restricted-imports`.
- ESLint: LOC / complexity caps (`max-lines`, `max-lines-per-function`,
  `complexity`, `max-depth`, `max-params`, `max-nested-callbacks`).
  Thresholds chosen so current code passes clean at error-level.
- ESLint: quality defaults for agentic contributions (`no-console`,
  `eqeqeq`, `no-duplicate-imports`, `switch-exhaustiveness-check`,
  `no-explicit-any`, etc.).
- Collapse three `import type` + `import` pairs into single inline
  type imports (Agent.ts x2, ExcaliburAnimationBridge.ts x1) — required
  by the new `no-duplicate-imports` rule.
- Typedoc: add missing adapter entry points (mistreevous / js-son /
  tfjs), add docs build as a parallel CI job, add `npm run docs` to
  the `verify` script so local gate mirrors CI. Output continues to
  live under `docs/api/` (already gitignored; sits alongside
  how-to/plans/specs).
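As a rough illustration, the architectural bans could look like this in an ESLint flat-config fragment (the thresholds and file globs here are assumptions for the sketch, not the values the PR actually landed):

```typescript
// eslint.config.ts fragment — illustrative only.
export default [
  {
    files: ["src/**/*.ts"], // assumed glob; scope bans to library source
    rules: {
      // Architectural bans via AST selectors.
      "no-restricted-syntax": [
        "error",
        {
          selector: "ExportDefaultDeclaration",
          message: "Use named exports.",
        },
        {
          selector: "TSEnumDeclaration",
          message: "Use union types or const objects instead of enums.",
        },
      ],
      // LOC / complexity caps — thresholds assumed.
      "max-lines-per-function": ["error", { max: 80, skipBlankLines: true }],
      complexity: ["error", 12],
      "max-depth": ["error", 4],
    },
  },
];
```

Note the flat-config file itself needs a default export; in the real config the ban is scoped by `files` globs so only source modules are affected.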

Lint baseline after this PR: 0 errors, 11 warnings (the warnings are
the ratchet targets tracked under Track C of the plan).

Tracks B–E (stale docs, complexity ratchet, src micro-findings,
tooling gaps) are left as a punch list — each a separate topic branch
per CLAUDE.md one-PR-one-branch rule.

* fixup(persistence): abort migration cleanup when unable to enumerate keys

Codex P1 flag on 3dc1aa0: if a custom backend implements only the
StorageLike minimum (no length / key) AND the legacy index payload is
present but not a string array (corruption, or a colliding v1
snapshot), migration produced an empty legacyKeys set but STILL
deleted the legacy index and stamped the migrated marker. Legacy
{prefix}{userKey} entries were left in place but became permanently
unreachable via load() / list() after upgrade.

Fix: before running cleanup, detect the blind-migration case and
abort. When:
  - legacy index exists, AND
  - legacy index is not parseable as a string array, AND
  - backend does not expose iteration (no orphan scan possible)
return early without removing the legacy index and without setting
the marker. A subsequent construction — maybe on an iterable backend,
maybe after the corruption is fixed — can retry migration.

All other combinations proceed as before:
  - iterable backend + any legacy index state → orphan scan covers
    everything; proceed and finalize.
  - non-iterable + parseable legacy index → use the index; proceed.
  - non-iterable + no legacy index → fresh install; proceed and
    finalize (marker prevents re-scan).
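The decision table above condenses into a single predicate — a sketch with assumed names (`safeToFinalize`, `MigrationContext`), not the committed implementation:

```typescript
interface MigrationContext {
  backendIterable: boolean;      // backend exposes length / key()
  legacyIndexRaw: string | null; // raw payload at the legacy index path
}

// A legacy index is usable only if it parses as a string array.
function parseStringArray(raw: string): string[] | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) && parsed.every((x) => typeof x === "string")
      ? (parsed as string[])
      : null;
  } catch {
    return null;
  }
}

function safeToFinalize(ctx: MigrationContext): boolean {
  if (ctx.backendIterable) return true;         // orphan scan covers everything
  if (ctx.legacyIndexRaw === null) return true; // fresh install
  // Non-iterable backend + index present: only safe when the index
  // parses cleanly; otherwise abort so a later construction can retry.
  return parseStringArray(ctx.legacyIndexRaw) !== null;
}
```

When `safeToFinalize` is false, the constructor must leave the legacy index in place and skip the marker write.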

Regression test: NonIterableStorage seeded with a colliding snapshot
at the legacy index path plus an orphan user payload. Constructor
must not touch either artifact, and must not set the marker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fixup(persistence): safeToFinalize gate + drop meta/ scan filter

Two Codex findings on abb557a, both real:

P1 r…
Luis85 pushed a commit that referenced this pull request Apr 26, 2026
P1 Codex finding on the merge commit: the prompt's Output policy
allowed a PR only when at least one archive move was staged and
otherwise forbade both PRs and issues. In runs where every
candidate is ambiguous (zero archive moves but non-zero ambiguous
flags), the routine had no permitted sink — owner-actionable
ambiguous decisions were silently dropped.

Add a SECONDARY sink: one triage issue per run under the existing
`plan-recon-bot` label, fired only when archive-moves == 0 AND
ambiguous-flags > 0. Distinct from the failure-issue path (which
fires only on `mv` / `verify` / `push` / `pr-open` errors mid-run);
the triage issue means the run completed cleanly but produced no
diff to review.

Triage-issue spec: title `Ambiguous plan candidates YYYY-MM-DD —
<head-sha7>`, label `plan-recon-bot`, body lifts the same
`Ambiguous — owner decides` block specced in the PR body, with a
preamble + a `<!-- plan-recon:<head-sha7>:ambiguous-only -->`
marker. Open command mirrors the failure-issue and PR-open
snippets (in-memory body, optional cache file in non-dry-run for
re-submit-by-hand on `gh issue create` failure).

Same-day idempotency: new Skip-check #3 looks for an existing
ambiguous-only triage issue at the current head SHA + run date and
exits silently if one is already open. Requiring that the existing
issue be authored by `$ROUTINE_GH_LOGIN` matches the trust-boundary
pattern from #1 (archive-PR skip) and #2 (failure-issue skip).

No-op handling tightened to scope it to runs with zero archive
moves AND zero ambiguous flags. The Output preamble now spells out
PR-vs-triage-issue-vs-no-op as a tri-state policy. Dry-run mode
section's `gh issue create` bullet picks up the new triage path.
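The tri-state policy reduces to a small priority check — hypothetical names, sketching the spec rather than quoting it:

```typescript
type RunOutcome = "open-pr" | "open-triage-issue" | "no-op";

// PRIMARY sink wins whenever there is a diff to review; the SECONDARY
// triage-issue sink fires only for clean runs that produced nothing
// but ambiguous flags; everything else is a genuine no-op.
function decideOutput(archiveMoves: number, ambiguousFlags: number): RunOutcome {
  if (archiveMoves > 0) return "open-pr";
  if (ambiguousFlags > 0) return "open-triage-issue";
  return "no-op";
}
```

Ambiguous flags riding along with archive moves stay in the PR body's `Ambiguous — owner decides` block; the triage issue is only for the zero-diff case.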

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>