Context
Tracking issue for deferred testing work from #26.
The evolution rework adds archetype-driven personalization to three unprompted surfaces:
- Heartbeat seed — dominant archetype (1.3×-clear) drives proactive check-in topic selection; sub-threshold falls back to neutral "general productivity" content
- Memory consolidation — nightly/weekly tasks weight archetype-relevant content when dominant is 1.3×-clear
- Skill loading priority — archetype-relevant skills get a tiebreaker priority boost under token budget pressure, but never override task-relevant skills
In #26 these are implemented as wiring changes. The pure-logic deep modules (scorer, xp_curve, plan_emission, signature) are unit-tested in that PRD's scope.
Integration tests for the three surfaces are intentionally skipped in #26 because the wires are still physically being connected.
What this issue tracks
Once #26 ships and the three surfaces are wired up end-to-end, write integration tests that exercise the full path:
- Heartbeat seed test — given an Ops-dominant user (with archetype clearly above 1.3× threshold), assert the next heartbeat picks an Ops-flavored topic. Given a hybrid user (no archetype clears 1.3×), assert the heartbeat falls back to neutral content. End-to-end through the heartbeat scheduler.
- Memory consolidation test — given a session with mixed Ops + Marketing observations and an Ops-dominant user, assert the consolidated long-term memory entries preferentially preserve the Ops content. Given a hybrid user, assert consolidation falls back to recency-only weighting.
- Skill loading priority test — given an Ops-dominant user under token-budget pressure, assert Docker/K8s skills load before lower-priority skills. Given a task-relevant signal pointing to a different domain (e.g., user asks about marketing), assert task-relevant skills still load regardless of archetype.
Test quality bar
Per CLAUDE.md:
- Real code paths, not mock orchestration
- Each test must have a real failure mode (not tautological)
- Cover the dominance-clear path AND the sub-threshold fallback path for each surface
Dependency
Blocked by #26 landing. Reopen / promote to active work once that PR merges and the heartbeat / memory consolidation / skill loader changes are in place.
🤖 Generated with Claude Code
Context
Tracking issue for deferred testing work from #26.
The evolution rework adds archetype-driven personalization to three unprompted surfaces:
In #26 these are implemented as wiring changes. The pure-logic deep modules (
scorer,xp_curve,plan_emission,signature) are unit-tested in that PRD's scope.Integration tests for the three surfaces are intentionally skipped in #26 because the wires are still physically being connected.
What this issue tracks
Once #26 ships and the three surfaces are wired up end-to-end, write integration tests that exercise the full path:
Test quality bar
Per
CLAUDE.md:Dependency
Blocked by #26 landing. Reopen / promote to active work once that PR merges and the heartbeat / memory consolidation / skill loader changes are in place.
🤖 Generated with Claude Code