feat(demo): grow learning-mode feature vector to 13 dims (row 18) #96
Extends `featuresFromNeeds` from the 5 need levels to 13 dims:

- 5 need levels (existing)
- 4 mood one-hot dims (`MOOD_KEYS = ['happy', 'sad', 'sleepy', 'playful']`) — off-roster moods collapse to all-zero on this section.
- 1 active-modifier count, normalized via `min(count, 5) / 5`.
- 3 recent-event counts (`SkillCompleted` / `SkillFailed` / `NeedCritical`) over a 30-tick sliding window, each normalized via the same cap.

Mood + recent-event state lives in module-scoped variables populated by the standard event bus: `setLearningAgentId(id)` was renamed to `setLearningAgent(agent)` (pre-1.0 clean break, no compat shim). The new function subscribes to `MoodChanged`, `AgentTicked`, `SkillCompleted`, `SkillFailed`, and `NeedCritical`, and tears down via `null` invocation.

`projectLearningOutcome` now produces 13-dim training pairs at outcome time. Mood + modifier count come from `agent.getState()`; event counts come from the same module-scoped window.

`cognitionSwitcher.ts`'s synthetic Train-button generator appends 8 uniform-`[0, 1]` noise dims to each archetype sample so the network learns to ignore them under the synthetic regime — real predictive weight on those dims comes from the in-game `TfjsLearner` reinforcement loop.

`scripts/seed-learning-network.ts` now builds a `[13, 16, 7]` Sequential and pads training pairs with the matching uniform-noise tail; the bundled `learning.network.json` was regenerated. The hydration guard's existing input + output dim check (PR #94) lets old 5-input snapshots fail fast and fall back to the new baseline.

Library impact: none — `featuresOf` is consumer-supplied. No changeset. Plan rows 17 + 18 marked shipped in the same diff.
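The 13-dim layout above can be sketched end to end. This is an illustrative reconstruction, not the demo's actual code: `buildFeatures`, `RecentEvent`, and the parameter shapes are invented for the example; only `MOOD_KEYS` and the `min(count, 5) / 5` cap come from the PR.

```typescript
const MOOD_KEYS = ['happy', 'sad', 'sleepy', 'playful'] as const;
const COUNT_NORM_CAP = 5;

type RecentEvent = { tick: number; kind: 'completed' | 'failed' | 'critical' };

function buildFeatures(
  needLevels: number[],             // 5 dims, already in [0, 1]
  moodCategory: string | undefined, // current mood category, if any
  modifierCount: number,            // active modifiers on the agent
  recentEvents: RecentEvent[],      // pruned to the 30-tick window elsewhere
): number[] {
  // 4 mood one-hot dims; off-roster moods leave all four at zero.
  const moodOneHot = MOOD_KEYS.map((key) => (moodCategory === key ? 1 : 0));
  // 1 dim: active-modifier count, capped then normalized.
  const modDim = Math.min(modifierCount, COUNT_NORM_CAP) / COUNT_NORM_CAP;
  // 3 dims: per-kind event counts over the window, same cap/normalization.
  const countOf = (kind: RecentEvent['kind']) =>
    Math.min(recentEvents.filter((e) => e.kind === kind).length, COUNT_NORM_CAP) /
    COUNT_NORM_CAP;
  return [
    ...needLevels,
    ...moodOneHot,
    modDim,
    countOf('completed'),
    countOf('failed'),
    countOf('critical'),
  ];
}
```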
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 116b20b86c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```ts
const state = agent.getState();
const moodCategory = state.mood?.category;
const moodOneHot = MOOD_KEYS.map((key) => (moodCategory === key ? 1 : 0));
const modCount = Math.min(state.modifiers.length, COUNT_NORM_CAP) / COUNT_NORM_CAP;
let completed = 0;
```
Use pre-skill modifier count when projecting training features
`projectLearningOutcome` labels a sample for the action chosen from pre-skill state, but this path reads `state.modifiers.length` after the skill has already executed. In this codebase several default skills mutate modifiers (for example `FeedSkill` adds `well-fed`, `CleanSkill` removes `dirty`, and `ScoldSkill` adds/removes modifiers), so training inputs can encode post-action modifier state while inference (`featuresFromNeeds`) uses pre-action `ctx.modifiers`. That mismatch leaks action effects into inputs and can train a policy that does not match runtime feature semantics.
Fixed in cfdc8c4. `featuresFromNeeds` now snapshots the 8-dim rich suffix into `lastRichFeatureSuffix` at Stage 4 (pre-skill); `projectLearningOutcome` reuses that snapshot at Stage 8 instead of recomputing from `agent.getState()` (which would encode the `FeedSkill`/`CleanSkill`/`ScoldSkill` modifier mutations). The fallback path recomputes from current state only when no `featuresFromNeeds` call has run yet — test harnesses scoring outcomes without driving the reasoner first.
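The snapshot-and-reuse shape of that fix can be sketched as follows. Illustrative only: `projectFeatureSuffix` and the parameter shapes are invented, and the thread later replaces this cache in bbbe6a8.

```typescript
// Module-scoped cache: set at featurize time (pre-skill), read at outcome time.
let lastRichFeatureSuffix: number[] | null = null;

function featuresFromNeeds(needDims: number[], richSuffix: number[]): number[] {
  lastRichFeatureSuffix = richSuffix; // Stage 4 snapshot, before the skill runs
  return [...needDims, ...richSuffix];
}

function projectFeatureSuffix(recomputeFromCurrentState: () => number[]): number[] {
  // Stage 8: prefer the pre-skill snapshot; recompute only when no
  // featuresFromNeeds call has run yet (e.g. a test harness scoring
  // outcomes without driving the reasoner first).
  return lastRichFeatureSuffix ?? recomputeFromCurrentState();
}
```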
```ts
case 'SkillCompleted':
  recentEvents.push({ tick: currentTick, kind: 'completed' });
  return;
```
De-duplicate SkillCompleted events in recent-event features
This listener counts every `SkillCompleted` event, but a successful feed emits two such events in this repo (one from `FeedSkill.execute`, one from `CognitionPipeline.invokeSkillAction`). As a result, `recent:SkillCompleted` increments twice for feed and once for most other skills, so the new feature no longer represents true completion frequency and can bias both prediction and learning toward event-emitter quirks.
Fixed in cfdc8c4. Added a per-tick dedupe `Set` keyed by `${kind}:${skillId | needId}`, cleared on `AgentTicked`. Default skills that emit `SkillCompleted` from `execute()` (`FeedSkill` etc.) plus the kernel's emission now count once per `skillId` per tick. The same dedupe applies to `SkillFailed` (`skillId`) and `NeedCritical` (`needId`).
Two Codex findings on PR #96 (commit 116b20b):

**P1** — `projectLearningOutcome` was reading `state.modifiers.length` post-skill, but several default skills mutate modifiers during `execute()` (`FeedSkill` adds `well-fed`, `CleanSkill` removes `dirty`, `ScoldSkill` adds/removes). Training inputs would encode action effects while inference (`featuresFromNeeds`) used pre-action state. Fix: snapshot the 8-dim rich-feature suffix during `featuresFromNeeds` at Stage 4 (pre-skill) into `lastRichFeatureSuffix`, then reuse it at Stage 8 outcome projection. Fall back to a from-state recompute only when no `featuresFromNeeds` call has run yet (test harnesses).

**P2** — `recent:SkillCompleted` etc. counted every emission, but the default `FeedSkill` (and others) emit `SkillCompleted` from inside `execute()` AND the kernel emits a second one from `CognitionPipeline.invokeSkillAction`. The doubled count for affected skills biased the feature toward emitter quirks. Fix: per-tick dedupe set keyed by `${kind}:${skillId | needId}`, cleared on every `AgentTicked` event.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cfdc8c48bb
```ts
// arrived before any `featuresFromNeeds` call (e.g. the consumer's
// first invocation of `learner.score` happens without a tick
// boundary, like the demo's test harness).
const suffix = lastRichFeatureSuffix ?? recomputeRichFeatureSuffixFromAgent(agent);
```
Snapshot rich suffix per scored outcome
`projectLearningOutcome` always reuses `lastRichFeatureSuffix` when it is non-null, but that suffix is only refreshed in `featuresFromNeeds` (Stage 4). Player interactions are executed from the reactive handler path in Stage 1 (`InteractionRequested` → `invokeSkill`) before Stage 4 runs for that tick, so those outcomes are paired with current `preNeeds` and a stale mood/modifier/event suffix from a previous tick. This creates feature/label misalignment for interaction-driven training samples and can push the learner toward incorrect correlations.
Fixed in bbbe6a8. Took a different approach than the Stage-4 cache: added a small library snapshot (`details.preModifierCount`) in `CognitionPipeline.invokeSkillAction`, and dropped `lastRichFeatureSuffix` entirely. `projectLearningOutcome` now recomputes the suffix at outcome time — mood from current state (`MoodModel` reconciles at Stage 2.7, before skill execution), modifier count from `details.preModifierCount`, event counts from the current `recentEvents` window minus 1 for the matching kind to compensate for self-emission. Player-click paths get the same pre-skill snapshot the reasoner-driven path gets, since the kernel snapshots before invoking the skill regardless of caller. Minor-bump changeset added.
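The two pre-skill recoveries described here can be sketched in isolation. Only the `details.preModifierCount` field name comes from the PR; the helper names and shapes are invented for this example.

```typescript
const CAP = 5; // same cap/normalization as the other count dims

function eventCountDim(windowCount: number, isSelfKind: boolean): number {
  // The outcome's own emission already landed in the window (the
  // per-tick dedupe guarantees exactly +1), so subtract it back out
  // to recover the pre-skill count for the matching kind.
  const preSkill = isSelfKind ? Math.max(windowCount - 1, 0) : windowCount;
  return Math.min(preSkill, CAP) / CAP;
}

function modifierDim(
  preModifierCount: number | undefined, // kernel snapshot on details
  currentCount: number,                 // fallback for consumer-emitted outcomes
): number {
  return Math.min(preModifierCount ?? currentCount, CAP) / CAP;
}
```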
```ts
const key = `completed:${e.skillId ?? '?'}`;
if (processedThisTick.has(key)) return;
processedThisTick.add(key);
```
Don't collapse distinct same-skill events in one tick
The per-tick dedupe key uses only `completed:${skillId}` (with the same pattern for failed/critical), so multiple legitimate events of the same kind and id in one tick are treated as duplicates. This happens when users queue repeated interactions quickly or when multiple matching actions run in the same tick, causing `recentEvents` to undercount true frequency and weakening the new recent-event feature dimensions.
Kept the per-tick `${kind}:${skillId | needId}` dedupe — the actual case it addresses (`FeedSkill` etc. emit `SkillCompleted` from `execute()` AND the kernel emits a second one) is real and consistent across default skills, while the legitimate distinct-twice-per-tick case is hypothetical at the demo's pacing. Reasoned through in the bbbe6a8 commit message + JSDoc; the new -1 self-emission compensation in `projectLearningOutcome` relies on this single-increment-per-tick invariant. If a consumer skill genuinely emits multiple distinct `SkillCompleted` events for the same `skillId` in one tick, the recent-event count under-represents by 1 per pair.
Two Codex findings on PR #96 commit cfdc8c4:

**P1** — `lastRichFeatureSuffix` was only refreshed in `featuresFromNeeds` (Stage 4). Player-click paths skip the reasoner, so outcomes scored from button clicks reused a stale suffix from a prior tick.

Library: `CognitionPipeline.invokeSkillAction` now snapshots `agent.modifiers.list().length` alongside `preNeeds` and includes it on every `LearningOutcome.details` payload as `preModifierCount`. Mirrors the existing `preNeeds` mechanism. Minor bump (additive field).

Demo: dropped the `lastRichFeatureSuffix` cache. `projectLearningOutcome` now recomputes the suffix at outcome time:

- Mood from `agent.getState().mood?.category` — `MoodModel` reconciles at Stage 2.7 (before Stage 7 skill execution), so this is still pre-skill state for any single-skill invocation.
- Modifier count from `details.preModifierCount` (kernel-supplied); falls back to current state for consumer-emitted outcomes.
- Event counts from the current `recentEvents` window with `completed` decremented by 1 to compensate for the outcome's own emission (dedupe ensures only +1 was added).

**P2** — The earlier dedupe key collapsed all same-skill events per tick. Kept the dedupe but acknowledged in JSDoc that legitimate distinct same-skill emissions in one tick (rare at the demo's pacing) collapse to one increment. The trade-off: the actual kernel+skill double-emit is consistent for default skills, while legitimate twice-per-tick emission is hypothetical. The -1 self-emission compensation in `projectLearningOutcome` relies on this single-increment-per-tick invariant.
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bbbe6a8846
```ts
const key = `completed:${e.skillId ?? '?'}`;
if (processedThisTick.has(key)) return;
processedThisTick.add(key);
```
Preserve distinct same-skill events in one tick
This dedupe key collapses all `SkillCompleted` events for the same `skillId` within a tick, not just the duplicate kernel+skill emission from a single invocation. In this repo, `Agent.dispatchReactiveHandlers` processes every queued `InteractionRequested` event in order (`src/agent/Agent.ts`), so two quick identical interactions before one tick can legitimately invoke the same skill twice; the second completion is dropped here. That undercounts `recentEvents` features and skews `projectLearningOutcome`'s pre-skill event-count compensation for later invocations in the same tick.
Fixed in 77e0979 by changing the counting source. `SkillCompleted`/`SkillFailed` are no longer subscribed via `agent.subscribe` — they're counted via the learner's own outcome stream (`buildLearningLearner` now wraps `toTrainingPair` to project first, then call `recordOutcomeForFeatureWindow`). Kernel invocations produce exactly one outcome each → one count per invocation. `dispatchReactiveHandlers`'s back-to-back same-skill case now counts both correctly; the kernel+skill `SkillCompleted` double-emit no longer double-counts (only the kernel's score call reaches the learner). Dropped `processedThisTick` + the -1 self-emission compensation. `NeedCritical` stays on the subscription path (single emitter).
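The project-then-record ordering can be sketched generically. Names and shapes are invented; the real wrapper is `buildLearningLearner`'s wrapping of `toTrainingPair`.

```typescript
type Outcome = { kind: 'completed' | 'failed' };

// Wraps a projection function so each outcome is projected against the
// counts as they stood BEFORE this outcome, then recorded into the window.
function wrapToTrainingPair(
  project: (o: Outcome, counts: Record<string, number>) => number[],
  counts: Record<string, number>,
) {
  return (outcome: Outcome): number[] => {
    const pair = project(outcome, { ...counts });          // pre-increment view
    counts[outcome.kind] = (counts[outcome.kind] ?? 0) + 1; // then record
    return pair;
  };
}
```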
Codex P2 round 3 on PR #96 commit bbbe6a8: the per-tick `${kind}:${skillId}` dedupe collapses legitimate distinct same-skill events. `Agent.dispatchReactiveHandlers` can process two queued `InteractionRequested` events in one tick, legitimately invoking the same skill twice; the second completion was dropped.

Switched the `SkillCompleted`/`SkillFailed` counting source from `agent.subscribe('SkillCompleted' | 'SkillFailed')` to the learner's own outcome stream. `buildLearningLearner` now wraps `toTrainingPair` to:

1. Project the outcome (sees pre-this-outcome counts).
2. THEN call `recordOutcomeForFeatureWindow` to push the matching kind into `recentEvents`.

Each kernel invocation produces exactly one outcome → exactly one push. Kernel + skill double-emits don't double-count (only the kernel's score call reaches the learner). Reactive-handler back-to-back invocations DO count separately. Both Codex P2 round 1 (double-emit) and round 3 (legitimate distinct same-skill in one tick) are addressed.

Dropped the `processedThisTick` Set + the -1 self-emission compensation in `projectLearningOutcome` (no longer needed since projection runs before the increment). `NeedCritical` stays on the subscription path — `NeedsTicker` is the only emitter.
@codex review
Codex Review: Didn't find any major issues. Hooray!
## Summary
Doc-audit pass over `docs/plans` + `docs/specs`. Three things land
together:
- **`docs/archive/{plans,specs}/`** — new home for plans whose roadmap
rows have all shipped (or whose goals were folded into a successor)
and specs whose design is now reflected in code. Includes a
`README.md` explaining the policy; `CLAUDE.md` documents the
convention.
- **`git mv` 23 plans + 3 specs into the archive.** The active live
set is now the comprehensive polish-and-harden plan plus three
specs (post-tfjs improvements, mvp-demo, vision), each with a
refreshed status banner.
- **Refresh the live comprehensive plan** against current `develop`:
- PR column updated for rows 16/19/20/3/4/22 (now shipped via
PRs #91 / #98 / #104 / #110 / #113 / #111).
- New "Post-roadmap follow-ups" section covers PRs #92 → #125
(review-bot infra, tracker findings, demo + tfjs hotfixes,
tooling).
- Stale prose-baked counts dropped (size budgets now reference
`package.json#size-limit` only).
- Coverage-thresholds section gains a pointer to the sticky PR
comment shipped in PR #124.
## Other doc fixes
- `README.md`: drop the unverifiable "Phase A milestones (M0–M15) are
all green" claim — the milestones don't exist as documented IDs
anywhere; replace with a pointer to the live polish plan.
- `vision.md`: refresh cadence note (was pinned to 2026-04-19 + "next
review at 1.0").
- `2026-04-24-post-tfjs-improvements.md`: mark recommended-order items
that have shipped (PRs #61, #76, #77, #83, #84, #91, #94, #96,
#104, #113), link the active roadmap as the heir.
- `mvp-demo.md`: status banner explaining where active polish work is
now tracked.
## Mechanical
- Update inline cross-refs in `CLAUDE.md`, `eslint.config.js`,
`src/agent/{Agent,AgentModule}.ts`, `tests/unit/exports.test.ts`,
and `docs/daily-reviews/2026-04-25.md` to point at the new
`docs/archive/` paths so links keep resolving.
No code change beyond comment-path updates.
## Test plan
- [x] `npm run verify` green (`format:check` + `lint` + `typecheck` +
`test` + `build` + `docs`). 523 tests pass; the 2 lint warnings
are pre-existing (`CognitionPipeline.invokeSkillAction` complexity
+ `scoreFailure` param count) and on the ratchet menu.
- [x] `git ls-files docs/archive/` shows the moved files; renames are
preserved (`git log --follow` works for any moved file).
- [ ] Codex review: clean, no blockers.
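As a sanity check, the rename-following behavior relied on above can be reproduced in a throwaway repo. The archive path mirrors this PR's layout; the file name is invented.

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
mkdir -p docs/plans docs/archive/plans
echo plan > docs/plans/example-plan.md
git add .
git -c user.email=ci@example.com -c user.name=ci commit -q -m "add plan"
# git mv records the rename so history stays walkable
git mv docs/plans/example-plan.md docs/archive/plans/example-plan.md
git -c user.email=ci@example.com -c user.name=ci commit -q -m "archive plan"
# --follow walks through the rename back to the original commit
git log --follow --format=%s -- docs/archive/plans/example-plan.md
```

Without `--follow`, the log on the new path would stop at the "archive plan" commit.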
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Luis Mendez <hallo@luis-mendez.de>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## Summary

- Extends `featuresFromNeeds` from the 5 need levels to 13 dims: 5 need levels + 4 mood one-hot dims (`MOOD_KEYS = ['happy', 'sad', 'sleepy', 'playful']`) + 1 normalized active-modifier count + 3 normalized recent-event counts (`SkillCompleted` / `SkillFailed` / `NeedCritical`) over a 30-tick window.
- `setLearningAgentId(id)` → `setLearningAgent(agent)` (pre-1.0 clean break, no compat shim). The new function subscribes to the standard event bus to populate module-scoped mood + recent-event state without widening the adapter's `helpers` shape.
- `projectLearningOutcome` now produces 13-dim training pairs at outcome time; `cognitionSwitcher.ts`'s synthetic Train-button generator + `scripts/seed-learning-network.ts` both pad with uniform-`[0, 1]` noise so labels stay conditioned on the 5 need dims.
- `examples/nurture-pet/src/cognition/learning.network.json` regenerated at the new `[13, 16, 7]` topology. Hydration's existing input + output dim guard (PR #94, "feat(demo,tfjs): N-way softmax learning mode (row 17)") lets old 5-input snapshots fail fast and fall back to the new baseline.

## Test plan

- `npm run verify` green (format / lint / typecheck / 491 tests / build / docs)
- `weightsShapes` topology test updated to `[13, 16] / [16] / [16, 7] / [7]` (`tests/examples/learningMode.train.test.ts`, `tests/unit/cognition/adapters/TfjsReasoner.test.ts`)
- Updated test input `[0.05, 0.6, 0.6, 0.6, 0.6, 0, 0, 0, 0, 0, 0, 0, 0]` still asserts softmax invariants (7-vector, ∈ [0, 1], sums to 1)
- `feedback_docs_alongside_pr`

## Notes for review

- `featuresOf` is consumer-supplied; no changeset.
- Off-roster moods (`content`, `angry`, `bored`, `sick`, `scared`) collapse to all-zero on that section. This is intentional — the network treats those as "uninformative" rather than mistakenly grouping them with one of the 4 picked categories.
- The synthetic generator pads uniform-`[0, 1]` noise on the 8 trailing dims because the labels don't condition on them. This trains the network to ignore those dims under the synthetic regime; predictive weight on mood / modifier / event counts comes only from the in-game `TfjsLearner` reinforcement loop fed by real outcomes.
- `projectLearningOutcome`'s mood / modifier-count / event-count dims are read at outcome time (post-skill). The need dims still come from `details.preNeeds` to preserve the row-17 invariant that need-level features reflect pre-skill state. Mood + modifier counts are stable across a single skill invocation; the event-count window includes the outcome's own emission, but that bias is uniform across every training pair so the net gradient direction is unaffected.

🤖 Generated with Claude Code
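For reference, the softmax invariants asserted in the test plan (7-vector, values in [0, 1], sum to 1) can be sketched standalone. This is not the actual tfjs test code; the helper names are invented.

```typescript
function softmax(logits: number[]): number[] {
  const m = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - m)); // shift by max for stability
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

function assertSoftmaxInvariants(out: number[]): void {
  if (out.length !== 7) throw new Error('expected a 7-vector');
  if (!out.every((p) => p >= 0 && p <= 1)) throw new Error('probs must be in [0, 1]');
  const sum = out.reduce((a, b) => a + b, 0);
  if (Math.abs(sum - 1) > 1e-6) throw new Error('probs must sum to 1');
}
```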