push#2
Conversation
|
Caution Review failedThe pull request is closed. WalkthroughThe changes introduce enhanced validation and configuration handling across three files. In the server logic, a safeguard is added to check for the existence of a tool validator before accessing it, preventing potential runtime errors. In a sample hook script, a previously commented-out guard clause is removed from a validator function, ensuring the function now always performs its intended validation. Additionally, the assistant server configuration JSON is updated to explicitly disable a server entry by adding an "enabled": false property. Changes
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (3)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Connects the implementation of the previous commits into the running session so the feature actually fires when configured. Implements §9 step 4 of the design note. services.SelfLearn (services.go) - New field on the app's services struct that holds the live Reviewer when at least one arm is enabled. Nil ⇒ zero overhead: no goroutine, no counters, no extra model calls (§3.1 promise). wireSelfLearn (internal/app/selflearn.go, new) - Called from ensureAgentSession after Agent.Start so the ReviewFunc can capture the live System. - Reads setting.SelfLearn via selflearn.ResolveSettings, which also runs Validate() so an illegal config is logged and skipped instead of crashing the session. - Builds NewMemoryStoreWithCap + NewSkillManager with the resolved cap and permissions, threads both into a ForkConfig the ReviewFunc constructs at fire time. - Rebuilds the LLM client from the captured BuildParams.Provider/ModelID /MaxTokens so the fork hits the same prefix-cache key as the parent (§6 invariant #2) without sharing the parent's request cycle. - Seeds the memory arm's counter from the preloaded user-turn history via SeedTurns (invariant #8 resume). - Review summary is logged for Phase 1; user-visible MessageEvent + status-bar surface is Phase 1.5 (§"User-visible surface"). OnTurnEnd hook (model_agent_events.go) - Forwards every turn Result to forwardTurnToSelfLearn unconditionally. The Reviewer gates on StopEndTurn internally so cancelled / interrupted turns are silently skipped at one place rather than scattered up the call stack. StopAgentSession cleanup (agent.go) - clearSelfLearn nils out services.SelfLearn so a future ReviewFunc closure can't hold references to a torn-down agent. In-flight goroutines complete on their own via DefaultForkDeadline. systemOnlyAgent shim - Minimal core.Agent wrapper that exposes only System(). The ReviewFunc needs the parent's System for prefix-cache parity but nothing else from the live agent — keeps the dependency surface narrow and lets us avoid extending the core.Agent interface with an LLM() accessor that nothing else needs. ReviewKind.String (reviewer.go) - Stable log-friendly labels (none / memory / skill / memory+skill) for the wire-up's review log entry. Covered by TestReviewKindString. -race passes across selflearn, setting, reminder, and app packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#1 mergeSettings dropped SelfLearn (setting/merger.go) - The merger enumerated every Data field except the new SelfLearn block, so every Load + every SaveToFile silently wiped selfLearn from disk. The L1 feature was effectively unreachable from settings.json. Added mergeSelfLearn with field-level semantics (int coalesce, bool OR for enable/deny — safety-biased), plus a regression test that exercises base→overlay → base+overlay merges and the symmetric base-only case. #2 /config Space on rowHeader panicked (app/input/on_config.go) - Cursor started at index 0 (the Memory header) whose toggle field is nil, so the first Space keystroke after /config crashed the TUI. Up/Down didn't skip non-editable rows either. Fixed by (a) starting the cursor at the first editable row in Enter, (b) replacing the cursor++ / cursor-- math with nextEditableRow / prevEditableRow that walks past headers/spacers/advanced-hints, and (c) guarding Space and Enter with an explicit kind/toggle check so a stale cursor index can never crash via a nil call. Test: navigation across the panel never lands on a non-editable row. #3 DrainActions ran BEFORE the deferred Complete (app/selflearn.go + app/selflearn_ui.go) - The defer pattern meant the success path drained s.actions to a local var, returned, and only then ran Complete in the defer — which snapshotted doneCount = len(s.actions) = 0. The 2-second done-hold status bar always rendered "evolved" without the count. Restructured the wire-up to call Complete inline before Drain (no defer for the success path; explicit Fail call on the error path). Test: TestCompleteCapturesCountBeforeDrain exercises the Complete→Drain ordering. #4 RunReview rooted at context.Background (app/selflearn.go + app/services.go + app/agent.go) - A torn-down session couldn't cancel an in-flight fork — the LLM request held tokens / HTTP for up to forkDeadline (5 minutes) after the user moved on. Added services.SelfLearnCancel + a session-scoped context.WithCancel in wireSelfLearn; StopAgentSession cancels it so the fork unblocks immediately. #5 Phantom "evolving → evolved" on a torn-down session (app/selflearn.go + app/selflearn_ui.go) - The deferred Complete fired even on the !Active early-return, flashing the status bar for a review that never ran. Two fixes: (a) moved the liveness check above BeginReview/publishStarted so a bail produces zero UI state change; (b) Complete itself now goes straight to idle when len(s.actions) == 0 — the §6 invariant #7 "silent when nothing changed" promise applied to the bar surface too. Test: TestCompleteSilentOnEmptyActions. -race clean across selflearn, setting, reminder, app, app/conv, app/input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two CI failures on this branch, both fixed: #1 internal/core/system → internal/session (layer violation) - AutoMemoryDir called session.EncodePath. core ranks 3 and session ranks 2 (feature), so this was an upward import in the layered architecture (caught by tools/layercheck). Inlined a 5-line encodeProjectPath helper in core/system that mirrors session.EncodePath — both functions are tiny string-ops and must stay in lockstep so memory and transcript stores resolve to the same project partition; comment makes the dependency explicit. #2 internal/selflearn unmapped in layercheck - Added "internal/selflearn": "feature" to tools/layercheck/main.go's layerOf map. selflearn imports core (system, agent) and infra (markdown) — both downward, consistent with feature-layer peers. Pre-push tooling - New `make ci` target runs every step the .github workflow runs in the same order (format-check → build-all → lint → unit tests → integration tests). Catches format/vet/layercheck/test failures locally instead of round-tripping through Actions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…om /code-review Findings from the max-effort /code-review pass on this branch: - #1 config disable can't persist: UpdateSelfLearnAt now replaces the selfLearn block instead of OR-merging it via mergeSelfLearn, so a true->false toggle from /config actually sticks (other settings in the file are preserved). Cross-level Load layering still ORs by design. - #2/#3 lifecycle race + leak: write observers gate on a captured atomic.Bool instead of racing on m.services.SelfLearn; new teardownSelfLearn() cancels the prior fork context before a re-wire (the agent-toggle rebuild path bypassed StopAgentSession) and on stop. - #4 skill name traversal: resolve() validates name with skillNameRe so patch/edit/delete/write_file/remove_file can't escape the skills dir. - #5 yamlScalar under-quoting: emit the description via yaml.Marshal so a value like "[draft] ..." can't produce invalid frontmatter that bricks every later parseSkill. - #7 applyPatch overlap: lineWindowReplace collects non-overlapping fuzzy windows (advance by w on a hit), mirroring the exact tier. - #8 stale settings: reload the in-memory Settings handle on ConfigSavedMsg. - #10 memory delimiter: reject a bare "§" line that fuses with the entry delimiter and shifts boundaries on read. - #9 relax Validate: "create + update allowed, delete denied" is now a legal config (delete is already agent-created-only, so opting out removes no safety). Rule, test, and design note updated. Adds regression tests for the config round-trip, traversal, description round-trip, and fuzzy-overlap cases (previously untested). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Connects the implementation of the previous commits into the running session so the feature actually fires when configured. Implements §9 step 4 of the design note. services.SelfLearn (services.go) - New field on the app's services struct that holds the live Reviewer when at least one arm is enabled. Nil ⇒ zero overhead: no goroutine, no counters, no extra model calls (§3.1 promise). wireSelfLearn (internal/app/selflearn.go, new) - Called from ensureAgentSession after Agent.Start so the ReviewFunc can capture the live System. - Reads setting.SelfLearn via selflearn.ResolveSettings, which also runs Validate() so an illegal config is logged and skipped instead of crashing the session. - Builds NewMemoryStoreWithCap + NewSkillManager with the resolved cap and permissions, threads both into a ForkConfig the ReviewFunc constructs at fire time. - Rebuilds the LLM client from the captured BuildParams.Provider/ModelID /MaxTokens so the fork hits the same prefix-cache key as the parent (§6 invariant #2) without sharing the parent's request cycle. - Seeds the memory arm's counter from the preloaded user-turn history via SeedTurns (invariant #8 resume). - Review summary is logged for Phase 1; user-visible MessageEvent + status-bar surface is Phase 1.5 (§"User-visible surface"). OnTurnEnd hook (model_agent_events.go) - Forwards every turn Result to forwardTurnToSelfLearn unconditionally. The Reviewer gates on StopEndTurn internally so cancelled / interrupted turns are silently skipped at one place rather than scattered up the call stack. StopAgentSession cleanup (agent.go) - clearSelfLearn nils out services.SelfLearn so a future ReviewFunc closure can't hold references to a torn-down agent. In-flight goroutines complete on their own via DefaultForkDeadline. systemOnlyAgent shim - Minimal core.Agent wrapper that exposes only System(). The ReviewFunc needs the parent's System for prefix-cache parity but nothing else from the live agent — keeps the dependency surface narrow and lets us avoid extending the core.Agent interface with an LLM() accessor that nothing else needs. ReviewKind.String (reviewer.go) - Stable log-friendly labels (none / memory / skill / memory+skill) for the wire-up's review log entry. Covered by TestReviewKindString. -race passes across selflearn, setting, reminder, and app packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#1 mergeSettings dropped SelfLearn (setting/merger.go) - The merger enumerated every Data field except the new SelfLearn block, so every Load + every SaveToFile silently wiped selfLearn from disk. The L1 feature was effectively unreachable from settings.json. Added mergeSelfLearn with field-level semantics (int coalesce, bool OR for enable/deny — safety-biased), plus a regression test that exercises base→overlay → base+overlay merges and the symmetric base-only case. #2 /config Space on rowHeader panicked (app/input/on_config.go) - Cursor started at index 0 (the Memory header) whose toggle field is nil, so the first Space keystroke after /config crashed the TUI. Up/Down didn't skip non-editable rows either. Fixed by (a) starting the cursor at the first editable row in Enter, (b) replacing the cursor++ / cursor-- math with nextEditableRow / prevEditableRow that walks past headers/spacers/advanced-hints, and (c) guarding Space and Enter with an explicit kind/toggle check so a stale cursor index can never crash via a nil call. Test: navigation across the panel never lands on a non-editable row. #3 DrainActions ran BEFORE the deferred Complete (app/selflearn.go + app/selflearn_ui.go) - The defer pattern meant the success path drained s.actions to a local var, returned, and only then ran Complete in the defer — which snapshotted doneCount = len(s.actions) = 0. The 2-second done-hold status bar always rendered "evolved" without the count. Restructured the wire-up to call Complete inline before Drain (no defer for the success path; explicit Fail call on the error path). Test: TestCompleteCapturesCountBeforeDrain exercises the Complete→Drain ordering. #4 RunReview rooted at context.Background (app/selflearn.go + app/services.go + app/agent.go) - A torn-down session couldn't cancel an in-flight fork — the LLM request held tokens / HTTP for up to forkDeadline (5 minutes) after the user moved on. Added services.SelfLearnCancel + a session-scoped context.WithCancel in wireSelfLearn; StopAgentSession cancels it so the fork unblocks immediately. #5 Phantom "evolving → evolved" on a torn-down session (app/selflearn.go + app/selflearn_ui.go) - The deferred Complete fired even on the !Active early-return, flashing the status bar for a review that never ran. Two fixes: (a) moved the liveness check above BeginReview/publishStarted so a bail produces zero UI state change; (b) Complete itself now goes straight to idle when len(s.actions) == 0 — the §6 invariant #7 "silent when nothing changed" promise applied to the bar surface too. Test: TestCompleteSilentOnEmptyActions. -race clean across selflearn, setting, reminder, app, app/conv, app/input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two CI failures on this branch, both fixed: #1 internal/core/system → internal/session (layer violation) - AutoMemoryDir called session.EncodePath. core ranks 3 and session ranks 2 (feature), so this was an upward import in the layered architecture (caught by tools/layercheck). Inlined a 5-line encodeProjectPath helper in core/system that mirrors session.EncodePath — both functions are tiny string-ops and must stay in lockstep so memory and transcript stores resolve to the same project partition; comment makes the dependency explicit. #2 internal/selflearn unmapped in layercheck - Added "internal/selflearn": "feature" to tools/layercheck/main.go's layerOf map. selflearn imports core (system, agent) and infra (markdown) — both downward, consistent with feature-layer peers. Pre-push tooling - New `make ci` target runs every step the .github workflow runs in the same order (format-check → build-all → lint → unit tests → integration tests). Catches format/vet/layercheck/test failures locally instead of round-tripping through Actions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…om /code-review Findings from the max-effort /code-review pass on this branch: - #1 config disable can't persist: UpdateSelfLearnAt now replaces the selfLearn block instead of OR-merging it via mergeSelfLearn, so a true->false toggle from /config actually sticks (other settings in the file are preserved). Cross-level Load layering still ORs by design. - #2/#3 lifecycle race + leak: write observers gate on a captured atomic.Bool instead of racing on m.services.SelfLearn; new teardownSelfLearn() cancels the prior fork context before a re-wire (the agent-toggle rebuild path bypassed StopAgentSession) and on stop. - #4 skill name traversal: resolve() validates name with skillNameRe so patch/edit/delete/write_file/remove_file can't escape the skills dir. - #5 yamlScalar under-quoting: emit the description via yaml.Marshal so a value like "[draft] ..." can't produce invalid frontmatter that bricks every later parseSkill. - #7 applyPatch overlap: lineWindowReplace collects non-overlapping fuzzy windows (advance by w on a hit), mirroring the exact tier. - #8 stale settings: reload the in-memory Settings handle on ConfigSavedMsg. - #10 memory delimiter: reject a bare "§" line that fuses with the entry delimiter and shifts boundaries on read. - #9 relax Validate: "create + update allowed, delete denied" is now a legal config (delete is already agent-created-only, so opting out removes no safety). Rule, test, and design note updated. Adds regression tests for the config round-trip, traversal, description round-trip, and fuzzy-overlap cases (previously untested). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…#75) * feat(selflearn): L1 trigger core + selfLearn settings First slice of the self-learning loop (#52). Package internal/selflearn (not a cryptic "l1") holds the per-turn Reviewer: - Two-signal trigger: memory on user-turn cadence, skills on tool-iteration threshold; combined when both trip the same turn. - Only StopEndTurn turns count (cancelled/interrupted/max-turns skipped). - ≤1 review in flight; a trigger that arrives mid-review is dropped and NOT reset, so it fires again on the next clean turn. - The fork+review is an injected ReviewFunc so the trigger is unit-testable without an LLM; SeedTurns hydrates cadence on resume. - selfLearn settings (memory/skills enabled + intervals), off by default. Tests cover memory cadence, skill threshold, combined, interrupted-turn skip, concurrency cap (drop + retry), and seed. Build + format-check + tests green. Next slices: the fork (restricted core.Agent), memory_write + skill_manage tools, the memory store + injection source, skill origin field, review prompts, and Task.Start wiring. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(selflearn): project-partitioned memory store + memory_write tool Add the L1 reviewer's durable-memory write surface: a project-partitioned MemoryStore under ~/.gen/projects/<encoded-cwd>/memory/ with add/replace/ remove-by-substring entry semantics (mirroring the hermes memory tool), atomic temp+rename writes, exact-duplicate dedup, a per-file char budget, and a coarse injection/exfiltration scan since entries are injected into a future system prompt. The memory_write tool exposes these to the fork only. Add the read side as a distinct source so agent-written memory and user-authored GEN.md/CLAUDE.md never mix: system.AutoMemoryDir/IndexPath and LoadAutoMemory (capped, line-boundary truncation). Reminder-layer injection is wired in a later slice. Also fix two stale "l1:" log prefixes left from the package rename. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(selflearn): skill_manage tool + agent-created provenance Add the Origin frontmatter field to Skill (absent = user-created) so the reviewer can tell apart skills it owns. Add the L1-only skill_manage tool over gen-code's existing user/project skill scopes (no agent-created/ subdir): create / patch / edit / write_file / remove_file / delete, all refusing to touch user-created skills. patch uses a pragmatic fuzzy-match chain (exact -> line-trimmed -> whitespace-collapsed) with an escape-drift guard and a unique-match requirement; the block-anchor and context-similarity tiers from the design are deferred. Class-level kebab names are enforced (also a traversal guard), and support files are confined to references/templates/scripts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(selflearn): reviewer fork + review prompts Add RunReview: forks a restricted core.Agent over the just-completed turn snapshot to capture durable learnings. The fork inherits the parent's system prompt verbatim via a fresh System (avoids clobbering the parent's telemetry observer that NewAgent installs), is granted only memory_write + skill_manage behind a static allow-only permission policy (never prompts the TUI), runs headless (OutboxBuf -1) with a bounded MaxTurns, and is driven via SetMessages + Append + ThinkAct. Add the three review prompts (memory / skill / combined) selected by which arms fired; each embeds the current memory store and skill inventory so the fork refreshes and dedupes rather than blindly appending. Thread the turn snapshot through ReviewFunc so the reviewer reviews the turn it observed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(selflearn): scan skill writes for injection; refresh doc anchors Security — extend the injection scan from memory writes to all four skill write paths (Create, Edit, Patch, WriteFile). Skills are loaded into a future system prompt the same way memory is, so a poisoned skill body or description is the same stored-injection vector with the same fix. Refactor scan.go to expose two checks: - scanContent — rejects empty + threats; for required content (memory entries, skill bodies). - scanForThreats — threats only, allows empty; for optional fields (skill descriptions, support-file contents, patch results that may legitimately remove text). Add TestSkillRejectsInjectionContent covering create / edit / patch / write_file paths plus the rollback guarantee (a rejected write must not mutate the on-disk skill). memory.go's two existing callsites are renamed scanMemoryContent → scanContent to match the refactored API. Docs alignment (post-rebase) — update three stale §3a / §3b code-comment references to the merged design doc's new structure: §4 Memory flow, §5.2 Provenance / write scope, §5.3 Tool surface. invariant #8 and §3 references remain valid and are untouched. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(selflearn): action permission system per design §5.5 Implement the action-level permission scheme defined in the merged design note: three booleans (allowCreate / allowUpdate / allowDelete) gate the corresponding skill_manage actions on the agent-created scope; one advanced opt-in (allowUpdateUserCreated) extends patch — and only patch — to user-created skills. Config (internal/setting/settings.go) - SelfLearnMemory grows a MaxKB field with the 25 KB upper-bound invariant (§4.2). Constants: SelfLearnMaxMemoryKB, SelfLearnDefaultMemoryKB. - SelfLearnSkills grows four permission fields. The three core ones are *bool so unset (nil) cleanly distinguishes from explicit false; default resolves to true via AllowCreateOr / AllowUpdateOr / AllowDeleteOr. - SelfLearnSettings.Validate enforces the three rejected combinations documented in §3.1: create-without-update, only-add-never-subtract, advanced-opt-in-without-update-base; plus the maxKB range check. - MaxKBOr returns the default fallback so callers don't repeat the zero-means-default branch. SkillManager (internal/selflearn/skill.go) - ActionPermissions struct + DefaultActionPermissions() constructor. Carries the four booleans. NewSkillManager now takes ActionPermissions explicitly so the caller's intent is visible at every construction site. - requirePatchable extends the existing requireAgentOwned guard: patch on user-created is allowed iff AllowUpdateUserCreated is set. All other update-shaped actions (edit, write_file, remove_file) stay on requireAgentOwned — only patch crosses the user-created boundary (§5.2 / §5.5 invariant). - Each of the six action entry points (Create, Edit, Patch, WriteFile, RemoveFile, Delete) opens with a uniform errActionDenied early-return when its allow flag is off, so the model sees a consistent "permission denied" shape on the veto path. Tests - TestSelfLearnValidate covers the four legal combinations (default, freeze, patch-only, feature-off), the three rejected combinations, and the maxKB range — both the upper bound and the negative case. - TestSelfLearnAllowAccessors confirms the nil→true / explicit-pass-through contract every downstream consumer depends on. - TestSelfLearnMemoryMaxKBOr exercises the default fallback. - TestSkillActionPermissions runs each allow* flag through its action(s), asserting both the error message and the rollback guarantee (a denied write must not mutate disk). - TestSkillAllowUpdateUserCreated asserts the advanced opt-in extends patch to user-created, but never Edit, Delete, or WriteFile — the design invariant that only patch crosses the boundary. Existing helpers (newTestSkillManager, fork tests) are updated to pass DefaultActionPermissions, preserving prior behavior. -race passes for both selflearn and setting packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(selflearn): complete remaining design-doc surface area Wraps up the Phase 1 implementation surface of the L1 design note (notes/active/l1-background-review.md): configurable memory cap, fork deadline, defensive snapshot copy, permission-aware prompt synthesis, the setting→runtime bridge, and the memory-auto reminder provider. Memory cap configurable (memory.go) - Rename memoryFileCharLimit → DefaultMemoryFileCharLimit and store it on MemoryStore via NewMemoryStoreWithCap(cwd, maxFile). Add/Replace enforce per-instance maxFile instead of the package constant, so the app can lower the cap via memory.maxKB (§3.1) without forking the store. NewMemoryStore preserves the old default for callers that don't know about §4.2 yet. Fork deadline (fork.go) - ForkConfig grows a Deadline (default 5 min, DefaultForkDeadline). RunReview wraps the caller's ctx in context.WithTimeout, satisfying invariant #5: a hung provider call now returns instead of leaving inFlight=true and silently disabling all future reviews. Snapshot copy (reviewer.go) - Observe copies result.Messages into a fresh slice before handing it to the background goroutine. The main agent loop reuses its message slice; without the copy the goroutine would race on truncate-on-compact / append. -race passes for concurrent Observe (concurrency_test.go). Prompt synthesis per permissions (prompts.go) - skillSection constant replaced with skillSectionFor(mgr): action steps the SkillManager will veto at dispatch are stripped from the prompt. AllowCreate=false swaps "create is last resort" for "creation is disabled"; AllowDelete=false drops the retire-skill bullet entirely; AllowUpdateUserCreated=true widens the scope sentence to "patch any existing skill (including user-created)". Permission layer remains the hard floor — this is steering, not enforcement. Setting → runtime bridge (config_setting.go, new) - ResolveSettings(setting.SelfLearnSettings) → Resolved{Config, Perms, MemoryMaxChars}. Single conversion point with Validate() up front so wire-up code never branches on per-field defaults. Auto-memory reminder provider (reminder + app/agent.go) - New ProviderMemoryAuto constant. agent.go's reminder registration picks up the L1 store via system.LoadAutoMemory(cwd) and emits it under scope="auto" — distinct from memory-user / memory-project so agent-written and user-authored memory never mix. Wraps the same PostCompact / cwd-change refresh path the other memory providers use. Tests - concurrency_test.go: snapshot-copy isolation + concurrent Observe hammered by 8×20 goroutines under -race. - prompts_test.go: 5 permission-scenario coverage of skillSectionFor. - config_setting_test.go: happy path, validation error propagation, MaxKB default fallback and pass-through. -race passes for selflearn, setting, reminder, and app packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(selflearn): wire L1 reviewer into the app session lifecycle Connects the implementation of the previous commits into the running session so the feature actually fires when configured. Implements §9 step 4 of the design note. services.SelfLearn (services.go) - New field on the app's services struct that holds the live Reviewer when at least one arm is enabled. Nil ⇒ zero overhead: no goroutine, no counters, no extra model calls (§3.1 promise). wireSelfLearn (internal/app/selflearn.go, new) - Called from ensureAgentSession after Agent.Start so the ReviewFunc can capture the live System. - Reads setting.SelfLearn via selflearn.ResolveSettings, which also runs Validate() so an illegal config is logged and skipped instead of crashing the session. - Builds NewMemoryStoreWithCap + NewSkillManager with the resolved cap and permissions, threads both into a ForkConfig the ReviewFunc constructs at fire time. - Rebuilds the LLM client from the captured BuildParams.Provider/ModelID /MaxTokens so the fork hits the same prefix-cache key as the parent (§6 invariant #2) without sharing the parent's request cycle. - Seeds the memory arm's counter from the preloaded user-turn history via SeedTurns (invariant #8 resume). - Review summary is logged for Phase 1; user-visible MessageEvent + status-bar surface is Phase 1.5 (§"User-visible surface"). OnTurnEnd hook (model_agent_events.go) - Forwards every turn Result to forwardTurnToSelfLearn unconditionally. The Reviewer gates on StopEndTurn internally so cancelled / interrupted turns are silently skipped at one place rather than scattered up the call stack. StopAgentSession cleanup (agent.go) - clearSelfLearn nils out services.SelfLearn so a future ReviewFunc closure can't hold references to a torn-down agent. In-flight goroutines complete on their own via DefaultForkDeadline. systemOnlyAgent shim - Minimal core.Agent wrapper that exposes only System(). The ReviewFunc needs the parent's System for prefix-cache parity but nothing else from the live agent — keeps the dependency surface narrow and lets us avoid extending the core.Agent interface with an LLM() accessor that nothing else needs. ReviewKind.String (reviewer.go) - Stable log-friendly labels (none / memory / skill / memory+skill) for the wire-up's review log entry. Covered by TestReviewKindString. -race passes across selflearn, setting, reminder, and app packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(selflearn): runtime UI — evolving status + completion notice Implements two of the three §"User-visible surface" pieces from the design note (the /config Self-Learning panel remains for a follow-up): Status-bar "evolving …" indicator (conv/message.go + view.go) - New SelfLearnEvolving flag on OperationModeParams and a paired renderSelfLearnIndicator producing a muted, italic "evolving …" segment in the bottom-left when a review fork is in flight. Plain text — no emoji, no spinner — so it reads as a static piece of structural UI rather than a decoration (matches the design's text-not-icon ask). - view.go reads services.SelfLearnRunning at render time. Hidden in both the idle and disabled cases — the status bar pixel-matches a feature-off session whenever no review is actively running. Completion / failure notices (selflearn.go + services.go) - services.SelfLearnRunning (*atomic.Bool) is the bridge between the ReviewFunc goroutine and the render path; pointer-typed so the surrounding services struct stays freely copy-able via the View() path. - ReviewFunc sets/clears the flag with defer, then on success publishes a hub.Event{Target: "main", Subject: header, Data: summary} that the app's main event loop routes through onMainEvent → injectNotification, surfacing the recap as a notice + content block in the conversation flow. Failures publish a parallel selflearn.review.failed event with the trimmed error message — silent swallowing would leave the user wondering what happened. - "Nothing to save" stays silent (the caller already filters on empty summary). Tests - TestRenderModeStatusShowsSelfLearnIndicatorWhenEvolving locks in the visibility contract: hidden when the flag is false, visible — and only then — when true. Deferred to a follow-up - Braille / target-name spinner with tea.Tick animation and per-tool-call status updates (requires the fork to publish progress events back). - /config multi-panel Self-Learning panel. -race passes across selflearn, setting, reminder, and the app + conv packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(selflearn): braille spinner + 4-phase status state + write observers Promotes the bottom-left "evolving" indicator from a static text to the full §"User-visible surface" specification: hidden → "evolving ⠋" (review started) → "evolving ⠋ go-testing" (each successful tool call) → "evolved · N changes" (2 s after success) → "evolving failed" (3 s after failure) → hidden State machine (selflearn_ui.go, new) - SelfLearnUIState packs phase + target + frame + change-count behind a single mutex. BeginReview / Step / Complete / Fail are called from the reviewer fork's goroutine; Tick is called from the Bubble Tea Update goroutine. Snapshot returns a consistent triple for the render path. - Step debounces target swaps at ≥ 400 ms so rapid tool calls don't flicker the bar; the change-counter is unaffected and counts every write. - Done/failed phases auto-decay back to idle after 2 s / 3 s respectively; Tick reports active so the dispatcher knows when to stop re-arming. Write observers (selflearn package) - MemoryStore.SetWriteObserver and SkillManager.SetWriteObserver register goroutine-safe callbacks fired after each successful write. The fork package itself stays UI-agnostic; the app layer hooks the callbacks to Step the UI state with the affected name (skill kebab name) or "memory · <topic>" (basename minus .md, MEMORY.md collapses to bare "memory"). memoryTopicSuffix encodes that mapping. Tick + hub plumbing (update.go, model_turn_queue.go, selflearn.go) - selflearnTickMsg + scheduleSelflearnTick drive the spinner at ~100 ms. - ReviewFunc publishes a "selflearn.review.started" hub event the instant it begins so the main loop schedules the first tick — the indicator is visible from frame zero rather than waiting for a render. - onMainEvent intercepts the started event and routes it to the tick scheduler instead of letting it become a user-visible notice (the recap block at completion is still routed normally for "selflearn.review.done" / "selflearn.review.failed"). - handleSelflearnTick advances the state and re-arms while non-idle; the loop quiesces automatically when the state decays to idle. services / view - services.SelfLearnUI replaces the previous atomic.Bool. Always non-nil (constructed by newServices) so render can take Snapshot without a nil-check and idle just renders empty. - conv.OperationModeParams.SelfLearnSegment carries the pre-rendered text so the conv layer doesn't have to import the app-side state struct. Tests - TestSelfLearnUIPhaseTransitions: idle→reviewing→done lifecycle, render format, and decay timing. - TestSelfLearnUIFailDecay: failed-phase hold outlasts done-hold. - TestSelfLearnUIStepDebouncesTarget: rapid Step calls keep the previous target but still bump the change counter. - TestSelfLearnUITickFrameAdvances: spinner moves each tick. - TestMemoryTopicSuffix: index / topic file → status tail mapping. -race passes for selflearn, setting, reminder, and the app + conv + input packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(selflearn): /config Self-Learning panel Implements the §"User-visible surface" /config slash command — the third and last runtime UI surface in the design note (status-bar spinner and completion notice landed earlier). A multi-panel sidebar (Provider, Permissions, Appearance, …) is planned; for now Self-Learning is the single panel and the scaffolding is shaped so siblings can be added without breaking the slash command path. setting.UpdateSelfLearnAt (loader.go) - New helper that validates the config (§3.1) before writing and merges the change into either ~/.gen/settings.json (user-wide) or ./.gen/settings.json (project-local) via the existing atomic saveToFile path. Re-uses Validate() so disk never ends up holding an illegal combination, and clears the loadedSettings cache so the next Snapshot picks up the new values. input/on_config.go (new) — the panel - ConfigSelector struct holding a working snapshot of the L1 config that the user mutates in place; Save merges back to disk, Esc discards. - rows() materializes the panel layout as a flat list of typed rows: header / spacer / bool / int / advanced-hint / save. The same function drives navigation bounds and Render so they can never disagree. - Boolean toggles cycle the *bool fields through nil→true/false so users can land on any of the four §5.5 legal combinations. Int rows enter/edit/commit with backspace + digit handling and clamp the result to the row's [min, max] range — MaxKB clamps to SelfLearnMaxMemoryKB so the §4.2 invariant is enforced from the UI. - Render lays the panel out with rounded borders, a cursor caret, and a ≈ N EN words / N 中文字 hint next to the MaxKB row. Validation errors surface inline beneath the rows; the Save row reads as muted/grey when the snapshot is illegal and green when it is legal, so the user has a pre-write affordance for the same check setting.UpdateSelfLearnAt applies. - Tab toggles the save target between user-wide and project-local; surfaced in the title row so the action key is one keystroke. Wiring - input.Model grows a Config field; NewConfigSelector takes deps.Setting. - /config registered in slash_command.go's dispatch table; the handler just calls Config.Enter(width, height). - ConfigSelector added to popups() so the active panel intercepts keys before the textarea. - Update handles ConfigSavedMsg by adding a "Self-learning config saved (<scope>)" notice to the conversation flow. Tests - TestConfigSelectorIsActivated: Enter/Esc lifecycle. - TestConfigSelectorTogglesBool: space and enter both flip a bool row. - TestConfigSelectorIntEditAndClamp: int edit flow + clamp to row max. - TestConfigSelectorRenderShowsValidationError: illegal §3.1 combination surfaces the underlying error message in the rendered panel. - TestConfigSelectorTabFlipsScope: Tab cycles user ↔ project. -race passes for selflearn, setting, reminder, and the app + conv + input packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(selflearn): compliance — silent on Nothing-to-save, env override, prompt rewrites Closes four §-level discrepancies between the L1 implementation and the design note: §6 invariant #7 — silent on "Nothing to save" - selflearn.go now suppresses the user-visible recap when the model's reply is empty or a case/whitespace/period variant of "Nothing to save". The previous code only checked summary == "" so any "Nothing to save." reply produced a noisy notice. Helper isNothingToSave + TestIsNothingToSave covers the variants. §3.1 — GEN_DISABLE_SELF_LEARN env override - New const selfLearnDisableEnv. wireSelfLearn returns early when the env var is "1" / "true" regardless of settings.json. Matches the Claude-Code-style CLAUDE_CODE_DISABLE_AUTO_MEMORY documented in §3.1. §4.1 — eviction-first memory prompt - memorySection rewritten with the 4-step decision flow the design diagram encodes: (1) retire stale entries first; (2) replace, don't duplicate; (3) at-cap forces another prune; (4) only then add. The "a pass that only adds is a missed pruning opportunity" framing is now explicit. §5.1 — full trigger conditions in skill prompt - skillSectionFor now enumerates the table's CREATE-ALL-FOUR criteria (non-trivial + no coverage + class-level name + not anti-pattern), the UPDATE three paths (including the previously-missing "user-voiced style/format/workflow correction"), and the DELETE three paths verbatim. Preference order UPDATE > DELETE > CREATE is stated once at the top instead of being implicit in step ordering. - Tests updated for the new prose anchors. reviewPreamble + reviewClosing also adjusted: the model no longer writes a free-form summary line — the user-visible recap is built from the actual tool calls (the action log via the existing observers), so the closing instruction reduces to "reply with 'Nothing to save.' or empty". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(selflearn): code-quality cleanup — dead exports, redundant locks, stale comments Mostly the audit's "should-fix" bucket — nothing changes behavior, but the package's surface and read-flow tighten up: - systemOnlyAgent shim deleted. currentSystem() now returns (core.System, bool) directly; the previous shim was a core.Agent that lied about its capabilities for one caller. - forwardTurnToSelfLearn / clearSelfLearn one-line wrappers inlined at their two call sites. - MaxFileChars (zero callers) and SelfLearnUISnapshot.IsActive (no non-test callers; redundant with Tick's bool return) deleted. - NewMemoryStoreWithCap merged into NewMemoryStore (one constructor, one default behavior). Two test call sites updated to pass 0 explicitly. - Observer fields (MemoryStore.onWrite, SkillManager.onWrite) drop their RWMutex. The reviewer fork is single-flight per session (§6 invariant #8), so the field is written once at wire-up and read on every write by the same goroutine — no race. Type docs now state the "SetWriteObserver before first write" contract. - m.services.Agent == nil dead-branch dropped from currentSystem (Agent is set at newServices and never nilled). - Empty `if !replaceAll {}` block in skillpatch.go removed; comment moved to the outer loop where it actually applies. - errActionDenied's stale §3.1 anchor corrected to §5.5 (action permissions live there in the merged design doc). - Validate "0 < maxKB <= 25" error message corrected to "0 <= maxKB <= 25 (0 = default)" — 0 is intentionally accepted as "unset". - ConfigSelector.Render caches Validate() once instead of calling it twice per frame (Save row label + footer error line). - Trivial restate-the-name comments on Complete/Fail/Dir trimmed; the comments that remain explain why, not what. Snapshot-copy test rewritten to exercise the actual failure mode (caller truncates the slice while the goroutine still holds the original) rather than mutating the immutable element it used to. -race passes across selflearn, setting, reminder, and app + conv + input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(selflearn): A+D — structured recap from action log; dedup parses A · MessageEvent-style recap block (§"User-visible surface") - New ReviewAction type + SelfLearnUIState.RecordAction supersede the earlier Step(target). Each successful tool call now records (verb, kind, target) into the per-pass action log; the spinner-tail target still swaps with the existing 400 ms debounce. - MemoryWriteObserver gains an action parameter ("add" | "replace" | "remove"); SkillWriteObserver already had one. The wire-up maps both to verb / kind / target rows via memoryVerb / skillVerb. - formatRecapBlock turns the drained log into the delimiter-bounded block from the design example: ───────────────────────────────────────────────── · updated skill go-testing · saved memory memory · debugging · retired skill outdated-thing ───────────────────────────────────────────────── Empty input ⇒ "" so the no-write pass keeps quiet (§6 invariant #7). - publishSelfLearnSummary now takes []ReviewAction (not the model's one-line summary) and routes through the existing hub → notice path, so the recap surfaces as structural data rather than LLM self-report. - DrainActions clears the log after the publish so back-to-back passes don't cross-contaminate. D · de-duplicate the double frontmatter parse - New parsed{path, origin, fm, body} plus parseSkill(name) — guards requireAgentOwned and requirePatchable now return the full parsed struct, so Edit and Patch no longer call ParseFrontmatterFile a second time for the same file. WriteFile / RemoveFile / Delete ignore fm/body (they only need the path). - The mutex range itself is unchanged: ≤1 in-flight per session (§6 invariant #8) means the broad mu doesn't actually contend, and collapsing the locks would invite TOCTOU on origin checks. Tests - TestFormatRecapBlock locks the delimiter shape + per-row layout. - TestVerbMapping covers every memory_write / skill_manage action's verb mapping so the recap stays in sync with the design example. - TestSelfLearnUIPhaseTransitions / TestSelfLearnUIStepDebouncesTarget updated to the RecordAction API. -race passes across selflearn, setting, reminder, app + conv + input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(selflearn): C — invert permission polarity to Deny-encoded zero defaults settings.json public schema change. The three skill permission flags switch from *bool tri-state (nil = default true) to Deny-encoded bool (zero = allow). Eliminates the AllowCreateOr/AllowUpdateOr/AllowDeleteOr accessor sprawl and three local boolPtr / ptr helpers that existed only to construct the previous shape. settings.json schema (§3.1) - skills.allowCreate / allowUpdate / allowDelete *bool fields → DenyCreate / DenyUpdate / DenyDelete bool fields. Zero ⇒ allowed (Go idiom "zero value should be sensible default"). Every omitted field reads as "allow"; setting true is the explicit lockdown. - skills.allowUpdateUserCreated stays an Allow-encoded bool — it is an opt-in for an off-by-default capability, so the polarity stays positive. - Three new read-side methods AllowCreate() / AllowUpdate() / AllowDelete() return the negated stored value. Downstream code reads through these so the polarity inversion is invisible past this layer. Validate (§3.1) - Same three rejected combinations, error messages updated to the new field names: · denyUpdate=true requires denyCreate=true (created-but-never-refined) · denyDelete=true requires denyCreate=true (only-add-never-subtract) · allowUpdateUserCreated=true requires denyUpdate=false Callers - ResolveSettings (selflearn package) reads via the new accessors. - ConfigSelector's three skill rows toggle s.Skills.DenyX directly (no more *bool ceremony). The toggleAllowCreate/Update/Delete helpers and boolPtr in on_config.go are dropped; the row toggle is now a one-line lambda. - Three test helpers (boolPtr / ptr in selflearn / setting / input test files) deleted alongside the *bool fields they were built for. Design doc (notes/active/l1-background-review.md) - §3.1 config block, fields table, validation table, and rationale paragraph all switched to the Deny-encoded schema. - §5.1 trigger-conditions table and §5.0 trigger-layers row also updated so the permission-check column shows the new field names. - §5.5 meaningful-combinations table now shows the effective Allow permissions in the columns (preserving the at-a-glance reading) but annotates each row with the corresponding denyX field setting so the config-side correspondence is unambiguous. -race passes across selflearn, setting, reminder, app + conv + input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor(selflearn): /simplify cleanup — fix two regressions + drop dead code Four parallel /simplify agents found ~40 items. Applying the high-leverage subset; skipping the rest with notes below. Two correctness regressions caught by the audit (both introduced by this branch, not pre-existing): - setting/settings.go Data.Clone() never copied the new SelfLearn field, so Settings.Snapshot() silently returned the on-disk default instead of the user's saved config. /config would always have shown stale defaults. Restored the row + extended clone_test.go to lock it down alongside the rest of the scalar-drift guard. - app/selflearn.go publishSelfLearnSummary/Failure put the recap block in hub.Event.Data, which routes through injectNotification's Content branch into SubmitToAgent — every successful review would have re-prompted the LLM with its own audit trail, breaking §6's "out-of-band" promise. Recap now goes in Subject (Notice-only path). Documented the trap so the next caller doesn't relearn it. Simplifications: - isNothingToSave removed. The action log is authoritative (the design doc says so explicitly); empty log ⇒ no writes ⇒ silent. Substring matching the model's text was redundant + undermined the invariant. - currentSystem(*model) inlined at its one caller. The systemOnlyAgent shim it once defended against has been deleted upstream; the helper no longer earned its existence. - ForkConfig.MaxTurns / Deadline override fields removed (always zero-defaulted in practice; no caller sets them). Exported DefaultMaxTurns / DefaultForkDeadline collapsed to package-private forkMaxTurns / forkDeadline constants used inline. - selflearn.readOrigin (dead outside tests) removed. The two test call sites use the production parseSkill helper now. - configRow.tail (write-only field) removed. - SelfLearnUIState.changes (kept in lockstep with len(actions)) merged into a single doneCount snapshot that's only needed for the done-hold render after DrainActions clears the live log. - Stale "Step" reference in the package doc comment fixed (the method was renamed to RecordAction earlier). Reuse: - brailleSpinnerFrames promoted to internal/app/kit.BrailleSpinnerFrames. on_provider_view.go's providerSpinnerFrames is now a one-line alias so the two consumers share the table and any non-Unicode fallback lands in one place. Skipped (each noted, not argued): - Inventory() disk re-read on every prompt build — would need a per-write cache invalidation hook that crosses observers; out of scope for /simplify, queued for a perf pass. - Tick re-arming during done/failed hold — same; the cleanest fix schedules a single deadline tick instead of polling, which is a small but real refactor of the state machine. - Snapshot fast-path on idle (lock-free check) — would require an atomic phase field that mirrors the mutex-guarded one; the cost of divergence outweighs the per-frame lock saving. - ResolveSettings / ActionPermissions / SkillManager observer constructor — the three "drop the indirection" findings each touch multiple packages; they read cleanly but would be a separate refactor commit, not a quality cleanup. - hub.Event "selflearn.review.started" special case in onMainEvent — the proper fix is a dedicated hub target with its own subscriber; out of /simplify scope. - yaml.Marshal vs hand-rolled buildSkillMD — a careful refactor; hand-rolled quoter works for current inputs. - Inventory via skill.Registry — already documented as deliberate (registry is stale-cached against same-session writes). -race passes across selflearn, setting, reminder, and the app + conv + input + kit packages. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(selflearn): batch 1 — critical correctness from /code-review #1 mergeSettings dropped SelfLearn (setting/merger.go) - The merger enumerated every Data field except the new SelfLearn block, so every Load + every SaveToFile silently wiped selfLearn from disk. The L1 feature was effectively unreachable from settings.json. Added mergeSelfLearn with field-level semantics (int coalesce, bool OR for enable/deny — safety-biased), plus a regression test that exercises base→overlay → base+overlay merges and the symmetric base-only case. #2 /config Space on rowHeader panicked (app/input/on_config.go) - Cursor started at index 0 (the Memory header) whose toggle field is nil, so the first Space keystroke after /config crashed the TUI. Up/Down didn't skip non-editable rows either. Fixed by (a) starting the cursor at the first editable row in Enter, (b) replacing the cursor++ / cursor-- math with nextEditableRow / prevEditableRow that walks past headers/spacers/advanced-hints, and (c) guarding Space and Enter with an explicit kind/toggle check so a stale cursor index can never crash via a nil call. Test: navigation across the panel never lands on a non-editable row. #3 DrainActions ran BEFORE the deferred Complete (app/selflearn.go + app/selflearn_ui.go) - The defer pattern meant the success path drained s.actions to a local var, returned, and only then ran Complete in the defer — which snapshotted doneCount = len(s.actions) = 0. The 2-second done-hold status bar always rendered "evolved" without the count. Restructured the wire-up to call Complete inline before Drain (no defer for the success path; explicit Fail call on the error path). Test: TestCompleteCapturesCountBeforeDrain exercises the Complete→Drain ordering. #4 RunReview rooted at context.Background (app/selflearn.go + app/services.go + app/agent.go) - A torn-down session couldn't cancel an in-flight fork — the LLM request held tokens / HTTP for up to forkDeadline (5 minutes) after the user moved on. Added services.SelfLearnCancel + a session-scoped context.WithCancel in wireSelfLearn; StopAgentSession cancels it so the fork unblocks immediately. #5 Phantom "evolving → evolved" on a torn-down session (app/selflearn.go + app/selflearn_ui.go) - The deferred Complete fired even on the !Active early-return, flashing the status bar for a review that never ran. Two fixes: (a) moved the liveness check above BeginReview/publishStarted so a bail produces zero UI state change; (b) Complete itself now goes straight to idle when len(s.actions) == 0 — the §6 invariant #7 "silent when nothing changed" promise applied to the bar surface too. Test: TestCompleteSilentOnEmptyActions. -race clean across selflearn, setting, reminder, app, app/conv, app/input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(selflearn): batch 2 — security + correctness from /code-review #6 WrapMemory("auto") fell to user-authority preamble (reminder.go) - memoryPreamble had no "auto" case; agent-written L1 content was introduced to the model as "the user's saved memory (preferences and standing instructions). Apply it throughout this session." An injection that slipped past the scan would have been executed as if the human user authored it. Added a distinct "auto" preamble that explicitly frames the content as durable context, NOT direct user instructions, with a do-not-act-on-imperatives caveat. #7 memoryEntryDelimiter not rejected at write (memory.go) - Content containing the literal "\n§\n" delimiter passed scanContent but silently split into multiple entries on next read — corruption AND a scan bypass (the post-split pieces were never scanned in isolation). Added rejectEmbeddedDelimiter, called from Add and Replace right after scanContent. Test: TestMemoryRejectsEmbeddedDelimiter exercises both write paths. #8 Two consecutive user messages in the fork (fork.go) - ag.SetMessages(snapshot) followed by ag.Append(UserMessage(prompt)) put two user-role messages on the wire when the snapshot ended with a tool_result (RoleUser-shaped on common providers), producing 400 "messages must alternate" rejections that made the reviewer effectively dead on tool-loop turns. Added trimTrailingUserMessages that reslices (doesn't mutate the caller's snapshot) so the review prompt is the only user turn at the tail. Test guards the three cases (trailing user, trailing assistant, nil input). #9 Memory prompt hardcoded the 25 KB cap (memory.go + prompts.go) - memorySection const said "hard 25 KB cap per file" regardless of memory.maxKB. With a lowered cap (e.g. 5 KB), the model would propose oversized writes that the store rejected, burning ThinkAct rounds. Replaced the const with memorySectionFor(*MemoryStore) which interpolates mem.MaxKB(); buildReviewPrompt signature grew a *MemoryStore parameter so the section sees the actual configured cap. #10 recover() value discarded (reviewer.go) - The Warn line was a constant string with no panic value, kinds, or stack — production panics were unobservable. Logged zap.Any("panic", rec), zap.String("kinds", …), zap.Stack so the operator has actionable info. -race clean across selflearn, setting, reminder, app, app/conv, app/input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(selflearn): batch 3 — lifecycle + perf from /code-review #11 Observer outlives session (app/selflearn.go) - The closures registered via SetWriteObserver captured SelfLearnUI by pointer and continued to fire after StopAgentSession had nilled SelfLearn — mutating UI state for a session the user terminated. Added a `m.services.SelfLearn == nil` early-return inside both closures so a write that lands post-stop is silently dropped. #12 Tick chain stacking on back-to-back reviews (app/selflearn_ui.go + app/model_turn_queue.go) - Every selflearn.review.started event unconditionally scheduled a fresh tick chain; if review B started before review A's done-hold finished, two chains advanced the spinner together at 2× cadence. Added a tickArmed bool guarded by the existing mutex plus an ArmTick() reservation method; onMainEvent only schedules a new tick when ArmTick returns true. tickArmed is cleared in Tick when the state decays to idle, so subsequent reviews rearm cleanly. #13 itersSinceSkill unbounded during in-flight (reviewer.go) - A 4-minute review during which 50 turns each contributed 10 tool iters left the counter at 500, then fired on every turn for many turns after the release. Capped the accumulator at 2× the threshold so a stuck review produces at most two immediate refires once it releases — preserves the "fire again next clean turn" intent without the post-release burst. Test: TestSkillIterCounterCappedDuringInFlight. #14 Snapshot took mutex on every render frame (app/selflearn_ui.go) - TUI repaints on every key press, resize, and inbound message; Snapshot's unconditional Lock/Unlock was pure overhead in the dominant idle case. Added phaseAtomic (sync/atomic.Int32) mirroring phase; writers update both together under the mutex; Snapshot fast-paths to the empty value when phaseAtomic reads idle without touching the mutex. #15 Tick re-armed at 100 ms during done/failed hold (app/selflearn_ui.go + app/selflearn.go) - ~20 useless tick round-trips during the 2 s done-hold and ~30 during the 3 s failed-hold, each forcing a full tea.Cmd → scheduler → Msg → Update dispatch. Tick now returns the delay for the next scheduled tick: selflearnTickInterval while reviewing, REMAINING hold time while done/failed (one deadline tick instead of polling). handleSelflearnTick uses the returned delay to schedule via tea.Tick directly. Test call sites updated for the new (delay, active) signature. -race clean across selflearn, setting, reminder, app, app/conv, app/input. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(selflearn): gofmt CI fix format-check was failing in CI on three files left over from manual edits across the recent batches: - internal/app/input/on_config.go - internal/app/services.go - internal/app/view.go Pure whitespace / alignment changes — no behavior delta. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(selflearn): CI green — layer violation + make ci target Two CI failures on this branch, both fixed: #1 internal/core/system → internal/session (layer violation) - AutoMemoryDir called session.EncodePath. core ranks 3 and session ranks 2 (feature), so this was an upward import in the layered architecture (caught by tools/layercheck). Inlined a 5-line encodeProjectPath helper in core/system that mirrors session.EncodePath — both functions are tiny string-ops and must stay in lockstep so memory and transcript stores resolve to the same project partition; comment makes the dependency explicit. #2 internal/selflearn unmapped in layercheck - Added "internal/selflearn": "feature" to tools/layercheck/main.go's layerOf map. selflearn imports core (system, agent) and infra (markdown) — both downward, consistent with feature-layer peers. Pre-push tooling - New `make ci` target runs every step the .github workflow runs in the same order (format-check → build-all → lint → unit tests → integration tests). Catches format/vet/layercheck/test failures locally instead of round-tripping through Actions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(selflearn): batch 4 — config persistence, lifecycle, hardening from /code-review Findings from the max-effort /code-review pass on this branch: - #1 config disable can't persist: UpdateSelfLearnAt now replaces the selfLearn block instead of OR-merging it via mergeSelfLearn, so a true->false toggle from /config actually sticks (other settings in the file are preserved). Cross-level Load layering still ORs by design. - #2/#3 lifecycle race + leak: write observers gate on a captured atomic.Bool instead of racing on m.services.SelfLearn; new teardownSelfLearn() cancels the prior fork context before a re-wire (the agent-toggle rebuild path bypassed StopAgentSession) and on stop. - #4 skill name traversal: resolve() validates name with skillNameRe so patch/edit/delete/write_file/remove_file can't escape the skills dir. - #5 yamlScalar under-quoting: emit the description via yaml.Marshal so a value like "[draft] ..." can't produce invalid frontmatter that bricks every later parseSkill. - #7 applyPatch overlap: lineWindowReplace collects non-overlapping fuzzy windows (advance by w on a hit), mirroring the exact tier. - #8 stale settings: reload the in-memory Settings handle on ConfigSavedMsg. - #10 memory delimiter: reject a bare "§" line that fuses with the entry delimiter and shifts boundaries on read. - #9 relax Validate: "create + update allowed, delete denied" is now a legal config (delete is already agent-created-only, so opting out removes no safety). Rule, test, and design note updated. Adds regression tests for the config round-trip, traversal, description round-trip, and fuzzy-overlap cases (previously untested). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * selflearn: clearer names and trimmed comments Rename `SelfLearnSegment` field → `SelfLearnStatus`, `SelfLearnUIState` → `SelfLearnIndicator` (file `selflearn_ui.go` → `selflearn_indicator.go`), `ArmTick` → `TryStartTicker`, `tickArmed` → `tickerRunning`, and `internal/selflearn/config_setting.go` → `resolve.go`. Trim verbose doc blocks across the L1 surface (selflearn.go, indicator, reviewer, fork, memory, skill, resolve) — keep WHY comments where they're load-bearing. No behaviour change. * selflearn: /simplify pass - Rename `selflearn.Resolved` → `Runtime` (clearer: it's the runtime bundle ResolveSettings returns) - Extract `SelfLearnIndicator.decayPhase` to dedupe done/failed branches in Tick - Extract `ConfigSelector.renderCursor` helper (3 inline call sites) - Use `s.Memory.MaxKBOr()` in /config panel instead of local `defaultIfZero` - `slices.Delete` for memory entry removal - `countUserTurns([]core.ChatMessage)` reads `m.conv.Messages` directly, avoiding a `ConvertToProvider()` allocation just for counting * selflearn: clearer types and grouped services - Flatten `selflearn.Config`: drop the awkward `Resolved`/`Runtime`/`Triggers` wrappers; the resolved bundle is just `Config` with `Memory`, `Skills`, `Perms`, `MemoryMaxChars` at the top level. - Rename `internal/selflearn/resolve.go` → `config.go` to match. - Group the four `services.SelfLearn*` fields under a `SelfLearnServices` sub-struct (`services.SelfLearn.Reviewer / .Cancel / .Live / .Indicator`) — cleaner field cluster and aligned with how the data moves together. * selflearn: /config panel — extensible shell + spacious layout - Extract a Panel interface so /config can host multiple sub-panels in the future. ConfigSelector becomes the popup shell: breadcrumb, optional tab strip when >1 panel is registered, esc/ctrl-tab handling. Today only Self-Learning is registered; adding a sibling panel (Provider, Permissions, …) is a one-liner in NewConfigSelector. - Move all Self-Learning UI into a private selfLearnPanel that implements Panel: rows, scope, snapshot, validate, save. - Adopt the /plugin overlay layout via kit.Panel — centered floating box, Padding(1, 2), full-width separator lines, kit.PanelTab tab strip, kit.HintLine footer. Right-aligned value column on int rows. - Add "/config › Self-Learning" breadcrumb so the popup advertises which command opened it. - Replace on_config.go / on_config_test.go with config_panel.go + config_selflearn.go + config_selflearn_test.go. * selflearn: /config layout polish + register slash command Layout: - Cap popup width to ~84 chars so on a wide terminal the form values stay near their labels instead of drifting to the far right. - Move the cursor inside the indent so ▸ sits right next to the row it marks, not orphaned in the leftmost column. - Flatten the section hierarchy: promote "Allowed actions" and "Advanced" from sub-headers under Skills to top-level sections. Result is four clean sections (Memory / Skills / Allowed actions / Advanced). - "Review cadence (user turns)" → "Run every" with the unit ("user turns", "tool iterations", "KB") rendered as a muted suffix next to the value. Reads naturally: "Run every 10 user turns". - The "Max size" footnote (≈ words / 中文字 equivalence) moves under the label column instead of right-indented. Slash: - Register /config in the builtin command list so it shows in the command picker — handler was already wired, but the metadata entry was missing so autocomplete never offered it. * selflearn: /config column-aligned layout - Hoist explicit layout columns: every row's label starts at the same column (labelCol=8) regardless of row type; numeric values right-end at valueCol=40 with units flowing past in muted style. Bool rows render "[ ] " between cursor and label; int rows pad an equivalent gutter so the label lands on the same column. Result: labels align vertically across the form, and "10" / "25" line up in a clean value column instead of trailing off to the box edge. - Slightly wider cap (84→92) and taller (32→34) so the form has room to breathe at "expanded" sizes. - Save button is column-aligned with the row content above it. * selflearn: /config compact value layout — value next to label Drop the right-alignment that pushed values to col 40 with a wide gap between label and value. Render int rows as a compact phrase instead: "Run every 10 user turns". The value sits right next to the label (highlighted), the unit follows in muted style — no floating value in the middle of empty space. * selflearn: /config — value-first rows, inline advanced hint - Int rows render value-first as a compact phrase with the underline highlighting the number, aligned in the same column as "[": 10 user turns - Run every 25 KB - Max size ~ 4500 EN words / 8500 中文字 (UTF-8) - "Max size" footnote moves inline after the label with "~" separator instead of a separate line below. - Merge the rowAdvHint row into the bool row via a new advHint field. "Update user-authored skills ⚠ rewrites your authored skill files" renders on one line — at narrow widths the hint wraps naturally. - Save button aligns at the same indent as the "[ ]" rows. * selflearn: /config — only Memory and Skills as top sections Restore the proper hierarchy: only Memory and Skills are top-level sections. "Allowed actions" and "Advanced" move back under Skills as sub-headers, with their bool rows nested one indent deeper. The layout math is rewritten around a single `contentCol(indent) = indent * 4` formula so adding deeper nesting is just bumping an indent value. Result reads as one Skills section with two grouped sub-blocks instead of four independent top-level sections. * selflearn: /config — wider box cap Bump the popup width cap from 92 → 120 (and floor 60 → 70) so on a wide terminal the form gets more horizontal room. The Advanced row's inline "⚠ rewrites your authored skill files" hint now stays on one line at typical widths instead of wrapping. * selflearn: /config — production-grade visual polish Pushes the form from "tidy monospace" toward a production terminal UI: - **Section dividers**: section labels render as caps with a hairline rule filling the rest of the row ("MEMORY ─────…", "SKILLS ─────…"). - **Vertical rail**: each section's rows get a left-edge "│" colored by the section's enabled state — green when on, muted when off. Lets the user scan section status without reading. - **Segmented scope control**: "scope user · project" with the active segment in bold + underline + accent (no background fill). - **Keycap-style hint line**: "[ ↑↓ ] navigate [ space ] toggle …" reads as labeled keys rather than prose. - **Unsaved indicator**: "● unsaved" tag pinned top-right (warning color), shown only when the working snapshot diverges from the baseline captured at Enter() time. - **Save row**: "[ Save ] or [ esc ] to discard" — the discard hint shares the keycap aesthetic for consistency. Layout: rail is prepended to each section row, consuming the first blank column so the column math (contentCol(indent) × 4) stays intact. Section header rows get enabledFn so the rail color can swap with the section's toggle. * selflearn: /config — label-first phrase for int rows Switch int rows from "value first - label" back to label-first prose so they read naturally instead of inverted ("Run every 10 user turns", "Max size 25 KB ~ ..."). Label aligns under the bool-row labels at the same indent. The number stays underlined to mark it as a value. * selflearn: /config — filled chips for Save, scope, values Push the design closer to the mockup with explicit chip/button surfaces: - Save: filled accent-bg pill ("Save" white on green) instead of bracketed text. Disabled state uses a muted-bg pill. - Scope segments: active segment is a filled accent pill ("user" on green); inactive is a flat padded label. Reads as a real segmented control. - Editable values: wrapped in muted "(N)" chip brackets so they stand out as input fields ("Run every (10) user turns"). Bracket choice "( )" avoids collision with bool-row "[ ]" checkboxes. * selflearn: /config — rail weight signals on/off state Use different rail glyphs for enabled vs disabled sections so the on/off signal carries shape AND color: - enabled → "┃" (heavy vertical) in accent green - disabled → "╎" (dashed light) in muted text-dim This way the active section's rail clearly outweighs the section- divider hairlines ("MEMORY ────"), making "which one is running" the loudest signal on the panel — a glance reads it without parsing the [✓] checkbox. * selflearn: /config — fainter section-divider hairlines Apply lipgloss Faint(true) on top of TextDim so the "MEMORY ────" and "SKILLS ────" hairlines recede further into the background. The active vertical rail (┃ in accent green) is the loudest signal again — the horizontal dividers are just whispered structural cues. * selflearn: /config — left-align, rail closure, dimmed children, key-cap bg Four targeted fixes from the latest design review: 1. Left-align the popup (lipgloss.Place uses Left, not Center) so the box shares a left edge with the surrounding app chrome — kills the L-shaped hole on wide terminals. 2. End the vertical rail at the last content row of each section. The trailing rowSpacer no longer emits a rail char, so MEMORY's "┃" closes after Max size and SKILLS' "╎" closes after the Advanced row instead of dangling down to Save. 3. Mute disabled-section children. When a section's enabledFn returns false, every row underneath (bools, ints, sub-headers, hints) gets wrapped in Faint(true) so they no longer read as "active". The parent toggle reads honestly as off. 4. Keycaps swap brackets for a gray bg fill — " space " on kit.SearchBg looks like a physical key cap, and the "[ ]" form is freed up to mean checkbox only. Removes the same-glyph-three- meanings overload in the bottom hint line. Also drop the redundant "\n" prefix before subsequent section headers (the trailing rowSpacer already inserts the gap). * selflearn: /config — extended rules, centered popup, drop scope label Three layout tweaks per the latest review: - Top/bottom horizontal rules span the full box width (boxWidth dashes) instead of just the inner content width, so they read as section chrome rather than as another indented row. The body keeps its 2-col gutter on each side, but the rules now bracket the whole box. - Re-center the popup on screen (lipgloss.Place uses Center again). The wider rules + centered placement remove the leftward weight that the previous left-align introduced. - Drop the "scope" label in front of the segmented control — the two visible segments ("user · project") are self-evident once they're styled as a real selector. Implementation note: vertical box padding (1, 0) is now the only padding lipgloss applies; the 2-col horizontal gutter is added per line in indent() so the rules can ignore it. * selflearn: /config — pin "● unsaved" to the breadcrumb line Promote the dirty signal to the Panel interface (new Dirty() method) so the shell can render the "● unsaved" tag at the right edge of the header line instead of the panel body adding its own row. Result: when there are unsaved edits, the indicator just appears next to the breadcrumb — no extra blank line above the scope control, no spacing jolt as the panel goes from clean to dirty. The panel body no longer renders renderUnsaved (deleted), and the top-right tag styles move out of the panel and into the shell. * selflearn: /config — fill the terminal like /plugin Drop the width and height caps in boxSize so the popup expands to match the surrounding /plugin and /model overlays: w = max(60, terminal_width - 6) (no upper bound) h = max(18, terminal_height - 4) (no upper bound) On a wide terminal the form now reads as a full overlay with a 6-col margin on each side, instead of a narrow centered card. * selflearn: validation errors speak UI language The /config panel surfaces Validate() errors inline; the previous messages exposed raw JSON field names (denyUpdate, allowUpdateUserCreated, denyCreate) that don't appear anywhere on screen, leaving the user wondering which option to flip. Rewrite the three messages to reference the on-panel labels: - ⚠ "Create new skills" needs "Update existing skills" — otherwise created skills could never be refined - ⚠ "Update user-authored skills" needs "Update existing skills" — the base update permission - ⚠ memory size must be between 1 and 25 KB (got N) Test expectations updated to match the new wording. * chore: bump version to 1.19.2 Signed-off-by: Meng Yan <yanmxa@gmail.com> * selflearn: simpler spinner, drop 'evolving' text Swap braille spinner for the classic |/-\ ASCII set and remove the 'evolving' prefix: - reviewing → '/ Skill · docx — reading SKILL.md' (just the spinner + target) - done → '✓ N changes' (still suppressed for zero-write passes) - failed → '× review failed' The braille glyphs render as wide cells in some PTYs, jittering the surrounding label width. ASCII frames stay 1-cell across the board. * selflearn: record L1 fork in main session transcript (sidechain) The L1 background reviewer fork was invisible to the inspector — it created a fresh core.Agent with no OnEvent hook, so its inference requests/responses never landed in the session transcript. Wire it into the existing session-recorder pipeline as a sidechain: - selflearn.ForkConfig gains an OnEvent field that RunReview threads through to core.NewAgent. - session.Setup.NewSidechainRecorder builds a fresh Recorder bound to the current session but flagged Sidechain=true, with a caller- supplied AgentID ("selflearn-review"). The recorder isn't cached because forks can run concurrently and each wants its own ID. - session.Recorder carries agentID + isSidechain on the struct and passes them through to AppendMessageCommand so the fork's messages land with the right attribution. In wireSelfLearn the ReviewFunc closure constructs the sidechain recorder per pass and threads its OnAgentEvent into ForkConfig.OnEvent. Inspector behaviour: by default IncludeSidechain=false so the main thread view stays clean; opening "selflearn-review" surfaces every PreInfer/PostInfer/OnAppend the fork emitted. * selflearn: LLM-supplied done summary + uniform kind labels The post-review status tag was "✓ N changes" — generic and bland. Two changes give it a one-line concrete recap: 1. The reviewer prompt now asks the model to close with a single short sentence ("trimmed go-testing SKILL.md by 1.8KB") instead of being forbidden from writing one. The fork's reply text flows through selflearn.RunReview → Indicator.Complete(summary). 2. Indicator.Complete takes the LLM summary and stores it as doneSummary, falling back to a templated phrase built from the action log if the LLM is silent or its line exceeds 80 cells. Action-log fallback shapes: - 1 action → "updated skill · go-testing" - 2 actions (≤60c) → "saved memory · debugging, updated skill …" - 3+ or too long → "saved 2 memory entries, updated 1 skill, …" The in-progress tail also gets a small polish: ReviewAction.Target is now the bare identifier (memory topic, skill name) and the indicator's actionLabel() formats "memory" / "memory · debugging" / "skill · go- testing" at display time. Skill labels finally carry the "skill · " prefix that memory always had, so both kinds read uniformly. Recap block, RecordAction debounce, and the silent-on-zero-writes invariant are preserved. Added an indicator preview test so the status-line look is inspectable via go test -v. * selflearn: /selflearn-demo for hand-testing the indicator Adds a hidden slash command that drives the L1 status-bar indicator through one scripted lifecycle on a background goroutine — useful for eyeballing the spinner / target tail / done summary in a real terminal without firing a live LLM review. Usage in the running app: /selflearn-demo Timeline (~6 s): T+0 reviewing | (spinner) T+0.8s \ memory — index write T+2.0s \ memory · debugging — topic write T+3.2s \ skill · go-testing — skill update T+4.4s \ skill · python-typing — skill create T+5.2s ✓ trimmed go-testing SKILL.md by 1.8KB · saved 2 notes T+7.2s (decays back to idle) The command is intentionally NOT in builtinCommands(), so it stays out of the slash picker — type it manually when you want the demo. * selflearn: italic-dim recap with 💭 prefix, drop chrome Three follow-ups from the live demo: 1. /selflearn-demo no longer prints a "demo started" toast — the indicator itself is the entire signal. 2. Drop the "Self-improvement review (memory+skill)" header and the bracketing "─────" rules from the recap. They added visual weight without information. 3. Reformat each recap line as italic + TextDim with a 💭 thinking icon prefix — recap reads as a quiet "background thought" sequence rather than a section box: 💭 saved memory · debugging 💭 updated skill · go-testing 💭 created skill · python-typing * selflearn: per-action notes + dialog-style recap Replaces the "💭 saved memory · debugging" recap rows with a small self-improvement dialog block, and adds the per-action description the user asked for ("memory · <file> : <what changed>"). The dialog stays out of the spotlight when nothing changed. How it threads end-to-end: 1. Tools (memory_write + skill_manage) gain a required "note" parameter — one short clause describing what THIS specific call changed. The review prompt instructs the model: - "added 3 race-condition repro tips" - "trimmed examples section by 1.8KB" - "removed vague tooling guidance" 2. Store/manager method signatures (Add/Replace/Remove/Create/Patch/ Edit/WriteFile/RemoveFile/Delete) accept the note. fireWrite passes it to the observer. Tests pass "" for note. 3. MemoryWriteObserver / SkillWriteObserver become (action, name, note). wireSelfLearn's two observers route the note into the new ReviewAction.Note field. 4. The recap publish renders as: 💬 Self-improvement memory · debugging: added 3 race-condition repro tips skill · go-testing: trimmed examples section by 1.8KB skill · python-typing: new skill, typing-hints Header + rows all italic + TextDim so the block reads as a quiet background thought, never competing with the chat content. Rows without a note degrade gracefully to "<kind · target>" (no trailing colon). 5. Zero-action passes skip the publish entirely — the dialog never appears for silent reviews. * selflearn: drop 💬 from recap…
Summary by CodeRabbit
Bug Fixes
Chores