feat(memos-local-plugin): one-round-one-card UI + language-aware knowledge + L2/L3 boundary prompts #1516

Merged — hijzy merged 1 commit into MemTensor:main on Apr 22, 2026
UI: one user turn = one memory card

- New `traces.turn_id INTEGER` column (migration 013) stamped by `step-extractor` with the user turn's ts; every sub-step of the same user message shares the same turnId.
- `MemoryGroup` aggregation in `web/src/views/MemoriesView.tsx` collapses rows by (episodeId, turnId): one card per turn, role pill chosen by a group-level rule (any tool → "tool"), aggregate V/α displayed as the member-row mean.
- Drawer rewritten as `<StepList>`: every member step renders as a collapsible `<details>` block with its own ts / V / α / agentThinking / toolCalls / reflection. The first step is expanded, the rest collapsed, so a 10-tool turn doesn't drown the user.
- Bulk actions (select / delete / share / export) operate on whole cards: the card checkbox toggles the full set of member ids; delete / share / export run over `g.ids`, so a card never half-disappears.
- Algorithm layer untouched — every L1 trace stays step-level, so V/α reflection-weighted backprop, L2 incremental association, Tier-2 error-signature retrieval, and Decision Repair keep their per-step granularity (V7 §0.1).

Per-tool reasoning capture (carryover, see PR MemTensor#1515)

- ToolCallDTO carries `value` / `reflection` / `thinkingBefore`, so the drawer's per-step section can show per-tool intermediate thinking and any LLM-assigned per-tool score without a schema change.
- `StepCandidate.meta.turnId` / `subStep` / `subStepIdx` / `subStepTotal` threaded through capture.ts → `traces.turn_id`; `pickTurnId` falls back to the trace's own ts, so old fixtures still produce singleton groups instead of crashing.

Knowledge generation in the user's language

- `core/llm/prompts/index.ts` adds `detectDominantLanguage(samples, {minSignal})` — counts CJK ideographs + ASCII letters and returns "zh" / "en" / "auto" (allocation-free, runs on every gen call).
- All five knowledge-generation sites now emit a `languageSteeringLine` system message keyed off their evidence:
  - core/capture/alpha-scorer.ts ← reflection-quality reason
  - core/capture/batch-scorer.ts ← per-step batch reflections
  - core/memory/l2/induce.ts ← L2 policy fields
  - core/memory/l3/abstract.ts ← L3 (ℰ, ℐ, C) bullets
  - core/skill/crystallize.ts ← skill body + scope
- Effect: a Chinese-speaking user no longer gets a half-English skill card, and an English user no longer gets a Chinese-mixed reflection.

L2 / L3 prompts: hard boundary against drift

- `L2_INDUCTION_PROMPT` v1 → v2: an explicit "what NOT to write" guard rejects environment topology, declarative behavioural rules, and generic taboos. A new same-fact-two-framings example shows how to re-fold an env fact into a state-level trigger or step-level caveat.
- `L3_ABSTRACTION_PROMPT` v1 → v2: bans imperative verbs (do / should / use / install / run) under any of ℰ/ℐ/C; all three example sets reworked to pure declarative ("loading a glibc-linked binary wheel inside Alpine raises a dynamic-link error" instead of "if pip fails, install dev libs and retry"). A same-fact contrast example is included.
- Test mock keys updated v1 → v2 in induce.test.ts / l2.integration.test.ts / openclaw-full-chain.test.ts / v7-full-chain.e2e.test.ts. Historical `inducedBy` audit strings intentionally left at v1 — they're metadata recording the prompt version a row was generated under, not call-time keys.

Retrieval injector: heading hierarchy

- `# User's conversation history (from memory system)` is now H1, with `## Memories` / `## Skills` / `## Environment Knowledge` as H2, so the injected block has a clean outline in the LLM's context (previously the inner sections used H1 too, breaking the visual hierarchy).

Migration runner: SQLite defensive mode

- better-sqlite3 ≥ v11 enables `SQLITE_DBCONFIG_DEFENSIVE`, which blocks writes to `sqlite_master` even with `PRAGMA writable_schema=ON`. Migration 012 (status unification) needs that pragma to swap CHECK constraints in place. `runMigrations` now flips `db.raw.unsafeMode` on at the outer boundary if any pending migration uses `writable_schema`, then off again in `finally`. Migrations ship with the plugin (never user input), so this is safe.
- Migration 012 SQL rewritten to use single-quote string literals with doubled inner quotes (instead of double quotes, which better-sqlite3 strict mode treats as identifiers).

Documentation

- New `docs/GRANULARITY-AND-MEMORY-LAYERS.md` — a mental-model alignment doc explaining: how the three granularities (step / turn / task) relate; scoring granularity (per-step α/V, per-task R_human; a "turn" has no independent score); retrieval granularity (skills / single steps / sub-task sequences / environment knowledge — there is no per-turn recall); the generation chain (step → experience → environment knowledge → skill); and a §6 "experience vs environment knowledge boundary pruning" section answering the "should they be merged?" question with 7 reasons against merging, a comparison of three compromise options, and a same-fact multi-framing discrimination table.
- `docs/Reflect2Skill_算法设计核心.md` gains a reading-order note at the top, pointing newcomers to the granularity-alignment doc first.
- `docs/README.md` index updated in sync, with GRANULARITY-AND-MEMORY-LAYERS bolded.

Tests

- `tests/unit/capture/step-extractor.test.ts`: turnId stability assertions across sub-steps; a multi-tool turn shares one turnId.
- All other test fixtures' LLM mock keys synchronized with the new prompt versions; non-mock `inducedBy` audit fields kept at v1 by design.
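The per-turn card aggregation described above can be sketched roughly as follows. The row/group shapes and the `groupByTurn` helper are illustrative assumptions, not the plugin's actual code; only `pickTurnId`'s fallback-to-`ts` behavior is taken from this PR:

```typescript
// Illustrative shapes — real trace rows carry far more fields.
interface TraceRow {
  id: number;
  episodeId: string;
  ts: number;
  meta?: { turnId?: number };
  role: "assistant" | "tool";
}
interface MemoryGroup { episodeId: string; turnId: number; ids: number[]; role: string; }

// Prefer the stamped meta.turnId; fall back to the trace's own ts so old
// fixtures still produce singleton groups instead of crashing.
function pickTurnId(row: TraceRow): number {
  return row.meta?.turnId ?? row.ts;
}

// One card per (episodeId, turnId); any tool step makes the card's role pill "tool".
function groupByTurn(rows: TraceRow[]): MemoryGroup[] {
  const groups = new Map<string, MemoryGroup>();
  for (const row of rows) {
    const turnId = pickTurnId(row);
    const key = `${row.episodeId}:${turnId}`;
    const g = groups.get(key);
    if (!g) {
      groups.set(key, { episodeId: row.episodeId, turnId, ids: [row.id], role: row.role });
    } else {
      g.ids.push(row.id);
      if (row.role === "tool") g.role = "tool";
    }
  }
  return [...groups.values()];
}
```

Because grouping keys on ids rather than array positions, bulk delete / share / export can simply iterate a card's `ids` and never half-remove a turn.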
Summary
This PR continues the v2 Reflect2Evolve plugin work merged in #1515 with three orthogonal improvements that landed together because they share the same trace fixtures and tests:
- One user turn = one memory card: the UI groups traces by (episodeId, turnId). The algorithm layer (V/α backprop, L2 induction, Tier-2 retrieval, Decision Repair) keeps step-level granularity per V7 §0.1.
- Every knowledge-generation site now emits a `languageSteeringLine`, so a Chinese user no longer gets half-English memos.
- `L2_INDUCTION_PROMPT` and `L3_ABSTRACTION_PROMPT` bumped v1 → v2 with explicit "what NOT to write" guards plus same-fact-two-framings examples to keep procedural ↔ declarative knowledge cleanly separated.

Plus two infrastructure fixes the v2 plugin needed to actually run on better-sqlite3 ≥ v11 (defensive-mode block on `sqlite_master`), and a documentation alignment doc explaining the step/turn/task + experience/environment-knowledge/skill mental model, so future contributors stop conflating UI, storage, and algorithm granularities.

What changed
`traces.turn_id` + per-turn UI grouping

- `013-trace-turn-id.sql`: adds `turn_id INTEGER` + an `idx_traces_episode_turn` index.
- `step-extractor.ts` stamps every sub-step from the same user message with the user turn's `ts` as `meta.turnId`; `capture.ts::pickTurnId` threads it into `traces.turn_id`.
- `MemoriesView.tsx` introduces `MemoryGroup` aggregation + a `<StepList>` drawer, so a 5-tool turn renders as one card with five collapsible step blocks (each carrying its own V / α / reflection / toolCalls) instead of five sibling cards. Bulk select / delete / share / export operate at card level.
- Rows without `turn_id` fall back to per-row rendering.

Language-aware knowledge generation
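A minimal sketch of the kind of detector this section describes. The counting heuristic (CJK ideographs vs ASCII letters) is taken from the PR's description; the exact signature, Unicode ranges, and `minSignal` semantics in `core/llm/prompts/index.ts` are assumptions:

```typescript
type Lang = "zh" | "en" | "auto";

// Counts CJK ideographs vs ASCII letters across the evidence samples.
// Returns "auto" when there is too little signal to decide.
function detectDominantLanguage(samples: string[], opts: { minSignal: number }): Lang {
  let cjk = 0;
  let ascii = 0;
  for (const s of samples) {
    for (let i = 0; i < s.length; i++) {
      const c = s.charCodeAt(i);
      if (c >= 0x4e00 && c <= 0x9fff) cjk++; // CJK Unified Ideographs block
      else if ((c >= 0x41 && c <= 0x5a) || (c >= 0x61 && c <= 0x7a)) ascii++;
    }
  }
  if (cjk + ascii < opts.minSignal) return "auto"; // not enough evidence
  return cjk >= ascii ? "zh" : "en";
}
```

Counting char codes in a flat loop keeps the check allocation-free, which matters since it runs on every generation call.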
- `core/llm/prompts/index.ts`: new `detectDominantLanguage(samples, {minSignal})` — counts CJK ideographs vs ASCII letters, returns `"zh" | "en" | "auto"`. Allocation-free, runs on every gen call.
- All five generation sites now emit a `languageSteeringLine`:
  - `capture/alpha-scorer.ts` — reflection-quality reason
  - `capture/batch-scorer.ts` — per-step batch reflections
  - `memory/l2/induce.ts` — L2 policy fields
  - `memory/l3/abstract.ts` — L3 (ℰ, ℐ, C) bullets
  - `skill/crystallize.ts` — skill body + scope

L2 / L3 boundary prompts (v1 → v2)
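The imperative-verb ban can be pictured as a simple lint over candidate ℰ/ℐ/C bullets. This regex helper is purely hypothetical — in the PR the guard lives in the prompt text, not in code — and a word-boundary match is a simplification (it would also flag "use" as a noun):

```typescript
// Hypothetical lint check: v2 L3 bullets must stay declarative, so any of
// the banned imperative verbs marks a bullet as procedural leakage.
const BANNED_IMPERATIVES = /\b(do|should|use|install|run)\b/i;

function isDeclarativeBullet(bullet: string): boolean {
  return !BANNED_IMPERATIVES.test(bullet);
}
```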
- `L2_INDUCTION_PROMPT`: new "Boundaries — what NOT to write" section explicitly rejects environment topology / declarative behavioural rules / generic taboos. Includes a same-fact-two-framings example (procedural vs declarative for the same underlying truth).
- `L3_ABSTRACTION_PROMPT`: bans imperative verbs (do / should / use / install / run) under any of ℰ/ℐ/C. All three example sets rewritten as pure declarative ("loading a glibc-linked binary wheel inside Alpine raises a dynamic-link error" instead of "if pip fails, install dev libs and retry").
- Historical `inducedBy` audit strings intentionally left at v1 (they record the prompt version a row was generated under, not a call-time match key).

Retrieval injector heading hierarchy
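With the fix, the injected block's outline looks like this (section bodies elided):

```markdown
# User's conversation history (from memory system)

## Memories
…

## Skills
…

## Environment Knowledge
…
```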
- `# User's conversation history (from memory system)` is H1; `## Memories` / `## Skills` / `## Environment Knowledge` are H2 — restores the visual outline the LLM consumes.

Migration runner: better-sqlite3 ≥ v11 compatibility
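A sketch of the guard this section describes. The `RawDb` stand-in, `Migration` shape, and `exec` callback are assumptions (the real code drives a better-sqlite3 `Database` handle), but the flip-on / flip-off-in-`finally` pattern matches the description:

```typescript
// Minimal stand-in for the better-sqlite3 raw handle's unsafeMode toggle.
interface RawDb { unsafeMode(on: boolean): void; }
interface Migration { name: string; sql: string; }

function runMigrations(raw: RawDb, pending: Migration[], exec: (sql: string) => void): void {
  // Only lift SQLITE_DBCONFIG_DEFENSIVE if some pending migration needs it.
  const needsWritableSchema = pending.some((m) => m.sql.includes("writable_schema"));
  if (needsWritableSchema) raw.unsafeMode(true);
  try {
    for (const m of pending) exec(m.sql);
  } finally {
    // Always restore defensive mode, even if a migration throws.
    if (needsWritableSchema) raw.unsafeMode(false);
  }
}
```

Since migrations ship with the plugin and are never user input, widening the unsafe window to the outer boundary is an acceptable trade-off here.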
- `runMigrations` now flips `db.raw.unsafeMode(true)` at the outer boundary if any pending migration uses `PRAGMA writable_schema` (reset in `finally`). Migration 012 (status unification) needs this to swap CHECK constraints in place; defensive mode otherwise blocked it at runtime.

Documentation
- `docs/GRANULARITY-AND-MEMORY-LAYERS.md` (~365 lines, zh-CN) — the foundational mental-model doc that should be read before any other algorithm doc.
- `docs/Reflect2Skill_算法设计核心.md` gains a reading-order note at the top.
- `docs/README.md` index updated to match.

Algorithm alignment
Per V7 §0.1, the L1 trace is the minimum learning unit and stays step-level — one tool call → one trace, one final reply → one trace. The "one round = one memory" view is purely a frontend display concern using `turn_id` as a stable group key. Reflection-weighted backprop, cross-task L2 association, error-signature retrieval, and Decision Repair all continue to operate per-step. Documented end-to-end in the new GRANULARITY doc §6.

Test plan
- `npx vitest run tests/unit/capture/step-extractor.test.ts` — turnId stamped on every sub-step, multi-tool turn shares one turnId (11/11 pass)
- `npx vitest run tests/unit/memory/l2/ tests/unit/memory/l3/ tests/unit/llm/prompts.test.ts` — prompt v2 mock keys + L2/L3 induction (74/74 pass)
- `npx vitest run tests/unit/storage/` — migration 013 applies cleanly (106/106 pass)
- `npx vitest run tests/unit/` — full unit sweep: 802/806 pass; the 4 failures are pre-existing on `main` (mock LLM behavior in reward integration + an outdated `capture.lite.done` event-list assertion), unchanged by this PR
- `bash install.sh --version ./memtensor-memos-local-plugin-2.0.0-beta.1.tgz` — gateway + viewer come up clean, `traces.turn_id` column present, migration 013 logged as applied
- Manual check: a multi-tool turn renders as one card with a "工具 · 4 步" ("Tools · 4 steps") chip; the drawer expands into 4 collapsible step sections with per-step V/α/thinking/tool I/O

Notes
Only `apps/memos-local-plugin/` is touched. No changes to other packages.