
feat: Memory system Phase 0 technical prototype (sqlite-vec + FTS5 + DashScope) #73

Merged
lishuceo merged 9 commits into main from feat/memory-system-phase0 on Feb 28, 2026
@lishuceo lishuceo commented Feb 27, 2026

Summary

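Phase 0: Technical prototype ✅

  • Added the src/memory/ module: type definitions, a standalone SQLite database (FTS5 + sqlite-vec), a two-phase-write Store, hybrid search (BM25 + vector score fusion), and a DashScope embedding provider
  • Updated the Plan 5 document with fixes for the 5 Critical and 12 Major findings from the 4-agent review
  • After a parallel 3-agent code review, fixed 3 Critical issues (FTS5 injection sanitization, vec0 cosine distance, provider race condition), 8 Warnings, and 5 Info items

Phase 1: Extraction + injection integration ✅

  • Added init.ts: a singleton initializer that creates the DB/store/search at startup, runs periodic maintenance, and shuts down gracefully
  • Added extractor.ts: after a conversation ends, calls DashScope Qwen (qwen3.5-flash) to extract memories from prompt + output; fire-and-forget, L0 confidence capped at 0.7, automatic duplicate detection + supersede
  • Added injector.ts: before a conversation starts, searches for relevant memories and formats them, grouped by type, into a system prompt fragment bounded by maxInjectTokens
  • Modified executor.ts: added a memoryContext field, inserted between knowledge and history summaries
  • Modified event-handler.ts: both the executeClaudeTask and executeDirectTask paths now run injection (before execution) and extraction (after execution)
  • Modified index.ts: startup initialization + maintenance every 30 min + graceful shutdown
  • Embedding dimension 1536 (the highest precision text-embedding-v4 offers)
  • Extraction model qwen3.5-flash (benchmark: stable 7-8s, 3-10x faster than plus, same quality, 5x cheaper)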

Quality evaluation results

Retrieval quality: hit rate 90%, Top-1 accuracy 78%, Top-3 accuracy 89%
Extraction quality: type accuracy 100%, keyword hit rate 100%, noise filtering passed

Production configuration

MEMORY_ENABLED=true
DASHSCOPE_API_KEY=sk-xxx
# All of the following have sensible defaults; no extra configuration is needed
# MEMORY_EXTRACTION_MODEL=qwen3.5-flash
# MEMORY_EMBEDDING_MODEL=text-embedding-v4
# MEMORY_EMBEDDING_DIM=1536
# MEMORY_MAX_INJECT_TOKENS=4000
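
As a minimal sketch of how the variables above could be read into a typed config with the listed defaults: the `MemoryConfig` shape, `loadMemoryConfig`, and `parsePositiveInt` are hypothetical names for illustration, not the project's actual config.ts. The environment is passed in as a plain object so the sketch does not depend on any runtime.

```typescript
// Hypothetical config shape; field names are illustrative, not the project's.
interface MemoryConfig {
  enabled: boolean;
  apiKey: string | undefined;
  extractionModel: string;
  embeddingModel: string;
  embeddingDim: number;
  maxInjectTokens: number;
}

type Env = Record<string, string | undefined>;

// Parse a positive integer, falling back to the default on missing or invalid input.
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

// Defaults mirror the commented-out values in the configuration listing.
function loadMemoryConfig(env: Env): MemoryConfig {
  return {
    enabled: env["MEMORY_ENABLED"] === "true",
    apiKey: env["DASHSCOPE_API_KEY"],
    extractionModel: env["MEMORY_EXTRACTION_MODEL"] ?? "qwen3.5-flash",
    embeddingModel: env["MEMORY_EMBEDDING_MODEL"] ?? "text-embedding-v4",
    embeddingDim: parsePositiveInt(env["MEMORY_EMBEDDING_DIM"], 1536),
    maxInjectTokens: parsePositiveInt(env["MEMORY_MAX_INJECT_TOKENS"], 4000),
  };
}

// Only the two required variables are set; everything else falls back to defaults.
const config = loadMemoryConfig({ MEMORY_ENABLED: "true", DASHSCOPE_API_KEY: "sk-xxx" });
```

Falling back to the default on an unparsable number is one possible design; failing fast at startup would be an equally reasonable choice.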

Test plan

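  • npm run typecheck passes
  • npx vitest run src/memory: all 141 tests pass (8 test files)
  • FTS5 trigger sync verified (insert→match, update→old miss/new match, delete→miss)
  • BM25-only degradation works (vectorEnabled=false when sqlite-vec is unavailable)
  • Confidence tier caps enforced (L0≤0.7, L1≤0.9, L2≤1.0)
  • FTS5 query input sanitization (special characters stripped, then wrapped as literals)
  • Agent/workspace isolation verified
  • DashScope embedding API integration test (text-embedding-v4, 1536 dimensions, hybrid search)
  • Semantic search verified ("UI library upgrade" → hits the React migration memory, vec: 0.48)
  • Cross-language retrieval verified (Chinese query "项目用什么数据库", i.e. "what database does the project use" → hits the English PostgreSQL memory, vec: 0.55)
  • Qwen extraction integration test (extracts structured memories from a conversation)
  • Injection formatting verified (grouping by type, date annotations, token-budget truncation)
  • Extraction → injection end-to-end loop (extract memory → search hit → inject into prompt)
  • Quality evaluation: 10 retrieval cases + 5 extraction cases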
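
The confidence tier caps in the test plan can be illustrated with a short sketch. The caps (0.7, 0.9, 1.0) and the name `clampConfidence` come from this PR; the exact signature, the `ConfidenceLevel` type, and the `CONFIDENCE_CAPS` table are assumptions made for illustration.

```typescript
// Assumed level type and cap table; the cap values themselves are taken from the PR.
type ConfidenceLevel = "L0" | "L1" | "L2";

const CONFIDENCE_CAPS: Record<ConfidenceLevel, number> = {
  L0: 0.7,
  L1: 0.9,
  L2: 1.0,
};

// Clamp to [0, cap] so that a caller can never exceed its tier's ceiling.
function clampConfidence(confidence: number, level: ConfidenceLevel): number {
  const cap = CONFIDENCE_CAPS[level];
  return Math.min(Math.max(confidence, 0), cap);
}
```

With this shape, an automatically extracted (L0) memory that reports 0.95 is stored at 0.7, while the same value at L2 is kept unchanged.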

🤖 Generated with Claude Code

claude Bot commented Feb 27, 2026

Claude finished @lishuceo's task in 5m 3s


Code Review: Memory System Phase 0

  • Check for previous review comments (none found)
  • Read CLAUDE.md and understand project conventions
  • Read all changed source files (6 source + 4 test files)
  • Deep review: database.ts — schema, FTS5 triggers, sanitization, prepared statements
  • Deep review: store.ts — two-phase write, confidence clamping, supersede flow
  • Deep review: search.ts — hybrid scoring, isolation filters, BM25 normalization
  • Deep review: embeddings.ts — async client init, race condition handling
  • Deep review: types.ts, index.ts, config.ts — type definitions, exports, env parsing
  • Cross-module tracing and architecture review
  • Post final review summary

Verdict: ✅ Approved

This is a well-structured Phase 0 prototype with solid defensive programming. After reading all 6 source files in full, tracing cross-module interactions, and verifying design decisions against the plan document, no critical or high-confidence bugs were found.

What I verified

| Area | Assessment |
| --- | --- |
| FTS5 injection | sanitizeFtsQuery correctly strips * " ( ) { } ^ : and wraps all tokens in double-quoted literals. FTS5 keywords (AND, OR, NOT, NEAR) become harmless literal searches when quoted. ✅ |
| SQL injection | All queries use parameterized statements. The only string interpolation (float[${dimension}] in vec0 DDL) is validated as Number.isInteger(dimension) && dimension > 0 && dimension <= 65536 immediately before use. ✅ |
| FTS5 external content sync | INSERT/UPDATE/DELETE triggers correctly maintain the FTS5 index. The UPDATE trigger properly deletes old content then inserts new content. ✅ |
| vec0 cosine distance | distance_metric=cosine produces distance in [0, 2]. The 1 - distance conversion in search.ts and the < 0.15 threshold in findConflicting are both correct for cosine distance. ✅ |
| Two-phase write | Phase 1 (sync INSERT + FTS5 trigger) ensures BM25 is immediately searchable. Phase 2 (async embed + vec0) is fire-and-forget with proper .catch() and flush() support. ✅ |
| Confidence caps | clampConfidence enforces L0≤0.7, L1≤0.9, L2≤1.0 with Math.min(Math.max(confidence, 0), cap). ✅ |
| Agent/workspace isolation | Search correctly filters by agentId (with * wildcard for globals), userId (null = cross-user), and workspaceDir (null = cross-workspace). ✅ |
| Provider race condition | DashScopeEmbeddingProvider uses clientReady: Promise pattern: embed() awaits readiness before using the client. If import('openai') fails, _available flips to false and future calls throw cleanly. All callers wrap embed in try-catch. ✅ |
| ESM & project conventions | .js extensions in imports, Pino logger, singleton pattern, proper barrel exports in index.ts. Follows existing codebase patterns. ✅ |
| Dependency strategy | sqlite-vec in optionalDependencies (graceful degradation), openai in dependencies (needed for DashScope). ✅ |
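
The FTS5 sanitization described in the first row can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the code from database.ts: it assumes stripped characters are replaced by whitespace and that an empty result means the caller skips the MATCH query.

```typescript
// Sketch of FTS5 query sanitization: strip FTS5 syntax characters, then quote
// every remaining token so that operators such as AND/OR/NOT/NEAR are treated
// as literal search terms.
function sanitizeFtsQuery(input: string): string {
  const tokens = input
    .replace(/[*"(){}^:]/g, " ") // assumption: stripped characters become spaces
    .split(/\s+/)
    .filter((token) => token.length > 0);
  // Space-separated quoted strings are an implicit AND in FTS5.
  return tokens.map((token) => `"${token}"`).join(" ");
}

const sanitized = sanitizeFtsQuery('react AND "hooks"*');
// sanitized === '"react" "AND" "hooks"'
```

Because double quotes are stripped before wrapping, a token can never close its own literal, which is what makes the quoting safe.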

Minor observations (informational, no action required)

  1. ttl naming: The field stores an absolute expiry timestamp (ISO string), but the name "TTL" typically implies a duration. Consider renaming to expiresAt in a future phase for clarity.

  2. pendingEmbeddings cleanup: Each .finally() creates a new array via filter(), making cleanup O(n²) under high concurrency. Fine for Phase 0 scale; consider a Set if this becomes hot.

  3. supersede() atomicity: create + supersedeMemory + deleteVec are not wrapped in a single transaction. If supersedeMemory somehow fails after create, both memories remain valid. In practice this is benign since the UPDATE is unlikely to fail, and both-valid is safer than data loss.

  4. vectorWeight validation: HybridSearch constructor accepts any number without checking [0, 1]. Currently not wired to config, so no risk yet. Worth adding a guard when the integration layer is built.
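
Observation 2 suggests a Set for tracking in-flight embeddings. A minimal sketch of that idea follows; `PendingTracker` is a hypothetical class for illustration, and the real store presumably logs failures rather than silently swallowing them.

```typescript
// Hypothetical tracker for fire-and-forget tasks with O(1) removal per task.
class PendingTracker {
  private readonly pending = new Set<Promise<void>>();

  track(task: Promise<unknown>): void {
    // Convert both outcomes to a resolved promise so a failed embedding
    // never surfaces as an unhandled rejection (real code would log here).
    const settled: Promise<void> = task.then(
      () => undefined,
      () => undefined,
    );
    this.pending.add(settled);
    void settled.then(() => {
      this.pending.delete(settled); // Set.delete avoids the filter() copy
    });
  }

  get size(): number {
    return this.pending.size;
  }

  // Wait until every tracked task, including ones added meanwhile, has settled.
  async flush(): Promise<void> {
    while (this.pending.size > 0) {
      await Promise.all(Array.from(this.pending));
    }
  }
}
```

A caller would `track()` each embedding promise after the synchronous insert and `await flush()` in tests or at shutdown.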

Test coverage

The 79 tests provide comprehensive coverage including: FTS5 trigger sync (insert→match, update→old miss/new match, delete→miss), BM25-only degradation, confidence caps, TTL expiry filtering, agent/workspace isolation, empty/special-char query sanitization, and two-phase write resilience when embedding fails.

@lishuceo lishuceo force-pushed the feat/memory-system-phase0 branch from b70746d to 58fdff1 on February 28, 2026 09:27

claude Bot commented Feb 28, 2026

Claude finished @lishuceo's task in 3m 25s


Code Review: Memory System Phase 0 + Phase 1

  • Check for previous review comments (none found)
  • Read CLAUDE.md and understand project conventions
  • Read all changed source files (13 source files)
  • Deep review all new modules
  • Cross-module tracing and architecture review
  • Post final review summary

Verdict: ⚠️ Issues Found

After reading all 13 source files in full, tracing cross-module interactions, and verifying correctness of the Phase 0 core (database, store, search, embeddings) and Phase 1 integration (extractor, injector, init, event-handler, executor), I found 2 critical bugs and 2 warnings.

The Phase 0 core (FTS5 sanitization, two-phase write, hybrid scoring, vec0 cosine distance, agent/workspace isolation, confidence clamping) remains solid — consistent with the previous review. The new Phase 1 issues are all in init.ts maintenance and injector.ts formatting.

Issues

| # | Severity | File | Confidence | Description |
| --- | --- | --- | --- | --- |
| 1 | 🔴 Critical | init.ts:75-83 | 92 | Maintenance queries silently broken: datetime('now') returns "YYYY-MM-DD HH:MM:SS" (space separator) but stored timestamps use new Date().toISOString() → "YYYY-MM-DDTHH:MM:SS.sssZ" (T separator). Since "T" sorts after the space character lexicographically, ttl < datetime('now') and created_at < datetime('now', '-90 days') are never true. Both maintenance ops are no-ops. |
| 2 | 🔴 Critical | init.ts:79-83 | 85 | Orphaned vec0 entries: Raw DELETE FROM memories bypasses MemoryStore.delete() which handles vec0 cleanup. Even after fixing #1, deleted memories leave orphaned memories_vec rows. |
| 3 | 🟡 Warning | database.ts:172 | 80 | Default dimension mismatch: MemoryDatabase.create() defaults to 1024, but config defaults to 1536. Safe today (init.ts passes config value), but misleading for direct callers. |
| 4 | 🟡 Warning | injector.ts:87-105 | 78 | Empty section headers on budget exhaustion: Inner loop break on token budget doesn't break outer loop, potentially rendering ### 项目事实 ("project facts") headers with no items underneath. |
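
The separator mismatch behind issue 1 can be reproduced with plain string comparison, which orders ASCII text the same way SQLite's default binary collation does. This is an illustrative sketch: `sqliteDatetime` is a hypothetical helper that mimics the output format of datetime('now'), and the two timestamps deliberately fall on the same calendar date, which is where the separator decides the comparison.

```typescript
// Format a Date the way SQLite's datetime('now') does: "YYYY-MM-DD HH:MM:SS" (UTC).
function sqliteDatetime(d: Date): string {
  return d.toISOString().slice(0, 19).replace("T", " ");
}

const nowDate = new Date("2026-02-28T12:00:00.000Z");
// A ttl that expired four hours before nowDate, stored as an ISO 8601 string.
const expiredTtl = new Date("2026-02-28T08:00:00.000Z").toISOString();

// Mismatched formats: at index 10, "T" (0x54) is compared with " " (0x20),
// so the expired ttl does not sort before "now".
const mismatched = expiredTtl < sqliteDatetime(nowDate); // false

// Same format on both sides: lexicographic order matches chronological order.
const consistent = expiredTtl < nowDate.toISOString(); // true
```

Passing an ISO string computed in JavaScript as a query parameter keeps both sides of the comparison in one format.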

What I verified (no issues found)

| Area | Assessment |
| --- | --- |
| FTS5 injection defense | sanitizeFtsQuery strips * " ( ) { } ^ : and wraps tokens in double-quoted literals. ✅ |
| SQL injection | All queries parameterized. float[${dimension}] interpolation validated by dimension bounds check. ✅ |
| Two-phase write | Phase 1 sync INSERT + FTS5 trigger, Phase 2 async embed with .catch() and flush(). ✅ |
| Confidence clamping | L0 ≤ 0.7 enforced in store.create() via clampConfidence. Extractor correctly sets confidenceLevel: 'L0'. ✅ |
| vec0 cosine distance | 1 - distance conversion and < 0.15 threshold for findConflicting are correct. ✅ |
| Provider race condition | clientReady: Promise pattern in both DashScopeEmbeddingProvider and getExtractionClient(). ✅ |
| Agent/workspace isolation | Search filters correctly enforce agentId (with * wildcard), userId, workspaceDir. ✅ |
| Event-handler integration | Both executeClaudeTask and executeDirectTask correctly inject (before) and extract (after, fire-and-forget). ✅ |
| Executor memoryContext | Injected between persona/workspace prompt and history summaries. Hash excludes memory (avoids cache invalidation). ✅ |
| Startup/shutdown lifecycle | initializeMemory() idempotent, closeMemory() in shutdown with 3s delay, maintenance in 30min interval. ✅ |
| ESM & conventions | .js extensions, Pino logger, singleton pattern, barrel exports. ✅ |
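
For the cosine distance row, a small sketch shows why distance lies in [0, 2] and how the 1 - distance conversion and the < 0.15 threshold relate. The function names below are hypothetical; in the PR the distance is computed by sqlite-vec, not in application code.

```typescript
// Cosine distance = 1 - cosine similarity, so it ranges from 0 (same direction)
// through 1 (orthogonal) to 2 (opposite direction).
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must have the same non-zero length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) throw new Error("zero vector has no direction");
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Conversion used for the vector score: higher means more similar.
const toSimilarity = (distance: number): number => 1 - distance;

// Threshold from the review: a distance below 0.15 marks a likely conflict/duplicate.
const CONFLICT_DISTANCE_THRESHOLD = 0.15;
const isLikelyConflict = (distance: number): boolean =>
  distance < CONFLICT_DISTANCE_THRESHOLD;
```

A distance of 0.15 corresponds to a similarity of about 0.85, so only near-duplicates trigger the conflict path under this threshold.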

Comment thread src/memory/init.ts Outdated
Comment thread src/memory/init.ts Outdated
Comment thread src/memory/database.ts Outdated
Comment thread src/memory/injector.ts
lishuceo and others added 9 commits March 1, 2026 03:30
Implement sqlite-vec + FTS5 + DashScope embedding prototype for the
agent memory system (Plan 5). Includes review findings documentation,
hybrid search with BM25-only fallback, confidence level caps (L0/L1/L2),
two-phase write, and workspace isolation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…, 5 info)

Critical: sanitize FTS5 query input to prevent injection, add
distance_metric=cosine to vec0 table, fix DashScope provider async
init race condition via clientReady promise.

Warning: add input validation (content length, tags count, metadata size),
log errors in catch blocks, use safe Buffer.from with byteOffset/byteLength,
check memory existence before vector insert to prevent orphans, clamp search
limit to 100, move sqlite-vec to optionalDependencies, fix updateMemory
nullish coalescing for nullable fields.

Info: replace readonly mutation hack with getter, batch updateLastAccessed
in transaction, add flush() for pending embeddings, fix false-positive
embed test, add 18 new test cases (61 → 79 total).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
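- Document status updated to Phase 0 ✅ complete
- Added the full verification results table (semantic search, cross-language, latency metrics, etc.)
- Added production configuration instructions and notes on the parts not yet integrated
- Reorganized the Phase 1 section as "Extraction + injection integration"
- Added integration.test.ts: 7 scenarios, 11 test cases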

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
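- Added init.ts: singleton initializer, creates DB/store/search at startup
- Added extractor.ts: after a conversation ends, calls Qwen (qwen3.5-plus) to extract memories,
  fire-and-forget, L0 confidence, duplicate detection + supersede
- Added injector.ts: before a conversation starts, searches memories and formats them as a system prompt fragment
- executor.ts: added a memoryContext field, inserted between persona and history
- event-handler.ts: both the executeClaudeTask and executeDirectTask paths
  now run injection (before execution) and extraction (after execution)
- index.ts: startup initialization + 30min maintenance + graceful shutdown
- config: extractionModel defaults to qwen3.5-plus, maxInjectTokens changed to 4000
- Added 27 tests (init/injector/extractor), 117 tests in total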

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
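- extractor: extractMemories() skip conditions, never throws, L0 confidence,
  supersede/updateEvidence, null tolerance
- injector: tests for the injectMemories() main function (disabled/uninitialized/no match/
  match/agent isolation), token truncation boundaries, empty-group skipping
- init: cleanup of low-confidence memories older than 90 days; recent low-confidence memories are kept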

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
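- Scenario 7: calls the real qwen3.5-plus to extract structured memories from a conversation and verifies they are written to the store
- Scenario 8: injection formatting verification (grouping by type, dates, token estimation)
- Scenario 9: extraction → injection end-to-end loop (after memories are written, search hits them and they are injected into the prompt)
- Integration tests 11 → 14, 141 tests in total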

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
text-embedding-v4 supports 512/1024/1536; 1536 gives the highest precision,
and at the current memory volume the storage and speed overhead is negligible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
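Benchmarks show that plus has extremely unstable latency (8s~623s), while flash holds a stable
7-8s in real extraction scenarios, with the same quality and 5x lower price. Also added quality evaluation tests.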

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
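- init.ts: maintenance queries now use ISO 8601 format to match stored timestamps; DELETE now goes
  through MemoryStore.delete() so that orphaned vec0 records are cleaned up
- database.ts: default dimension now matches config (1536)
- injector.ts: fixed possible empty section headers when the token budget is exhausted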

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@lishuceo lishuceo force-pushed the feat/memory-system-phase0 branch from a8cfd95 to c526fbf on February 28, 2026 19:32
@lishuceo lishuceo merged commit fd1a62a into main Feb 28, 2026
1 check passed

claude Bot commented Feb 28, 2026

Claude finished @lishuceo's task in 3m 22s


Code Review: Memory System Phase 0 + Phase 1

  • Check and resolve previous review comments (4 previous issues — all fixed)
  • Read init.ts, database.ts, injector.ts — verified all previous fixes
  • Read remaining source files (store, search, embeddings, extractor, types, index, config, event-handler, executor)
  • Deep review new/modified modules
  • Cross-module tracing and architecture review
  • Post final review summary

Verdict: ✅ Approved

All 4 previous issues fixed (2 critical, 2 warning). After re-reading all 13 source files in full and tracing cross-module interactions, no new issues found above the confidence threshold.

Previous issues resolved

| # | Severity | File | Fix |
| --- | --- | --- | --- |
| 1 | 🔴 Critical | init.ts | Datetime format mismatch → now uses JS Date.toISOString() with parameterized queries |
| 2 | 🔴 Critical | init.ts | Orphaned vec0 entries → now SELECTs ids and routes through memoryStore.delete() |
| 3 | 🟡 Warning | database.ts | Default dimension 1024 → 1536 |
| 4 | 🟡 Warning | injector.ts | Empty section headers → added budgetExhausted flag + collect-before-header pattern |
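
The fix for issue 4 (a budgetExhausted flag plus collecting items before emitting the header) might look roughly like the sketch below. `PromptSection`, `formatSections`, and the 4-characters-per-token estimate are assumptions for illustration; the actual injector.ts may estimate tokens and structure groups differently.

```typescript
// Hypothetical section shape: a header plus the memory lines grouped under it.
interface PromptSection {
  header: string;
  items: string[];
}

// Rough token estimate (assumption: about 4 characters per token).
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

function formatSections(sections: PromptSection[], maxTokens: number): string {
  const lines: string[] = [];
  let used = 0;
  let budgetExhausted = false;

  for (const section of sections) {
    if (budgetExhausted) break; // stop the outer loop once the budget is spent

    // Collect items first; the header is only emitted if at least one item fits.
    const collected: string[] = [];
    let sectionCost = estimateTokens(section.header);
    for (const item of section.items) {
      const line = `- ${item}`;
      const cost = estimateTokens(line);
      if (used + sectionCost + cost > maxTokens) {
        budgetExhausted = true;
        break;
      }
      collected.push(line);
      sectionCost += cost;
    }

    if (collected.length > 0) {
      lines.push(section.header, ...collected);
      used += sectionCost;
    }
  }
  return lines.join("\n");
}
```

Counting the header against the budget before any item is accepted means a section whose first item does not fit produces no output at all, rather than a bare header.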

What I verified (no issues found)

| Area | Assessment |
| --- | --- |
| FTS5 injection defense | sanitizeFtsQuery strips * " ( ) { } ^ : and wraps tokens in double-quoted literals. Keywords (AND, OR, NOT) become harmless when quoted. ✅ |
| SQL injection | All queries parameterized. float[${dimension}] interpolation guarded by Number.isInteger(dimension) && dimension > 0 && dimension <= 65536. ✅ |
| Two-phase write | Phase 1 sync INSERT + FTS5 trigger, Phase 2 async embed with .catch() and flush(). embedAndStore verifies memory exists before vec insert. ✅ |
| Confidence clamping | L0 ≤ 0.7 enforced in store.create(). Extractor always sets confidenceLevel: 'L0'. ✅ |
| Maintenance correctness | ISO timestamp comparison fixed. Deletions routed through memoryStore.delete() for vec0 cleanup. ✅ |
| vec0 cosine distance | 1 - distance conversion and < 0.15 threshold for findConflicting are correct. ✅ |
| Provider race condition | clientReady: Promise pattern in both DashScopeEmbeddingProvider and getExtractionClient(). ✅ |
| Agent/workspace isolation | Search filters correctly enforce agentId (with * wildcard), userId, workspaceDir. ✅ |
| Injector budget handling | budgetExhausted flag breaks outer loop. Items collected before header added. No empty sections. ✅ |
| Event-handler integration | Both executeClaudeTask and executeDirectTask inject (before) and extract (after, fire-and-forget with .catch()). ✅ |
| Executor memoryContext | Injected between knowledge/workspace prompt and history summaries. Hash excludes memory (avoids cache invalidation). ✅ |
| Startup/shutdown lifecycle | initializeMemory() idempotent, closeMemory() in shutdown, maintenance in 30min interval. ✅ |
| Hybrid scoring | BM25 normalized via sigmoid, vec score via 1 - cosine_distance. Type boost + recency decay + confidence multiplied in. MAX_SEARCH_LIMIT prevents resource exhaustion. ✅ |
| Extraction parsing | Handles raw JSON and markdown code blocks. validateMemories filters invalid entries. Dedup via vector similarity + supersede/evidence logic. ✅ |
| ESM & conventions | .js extensions, Pino logger, singleton pattern, barrel exports in index.ts. ✅ |
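
The hybrid scoring row can be made concrete with a sketch. Everything beyond the ingredients named in the review (sigmoid-normalized BM25, 1 - cosine distance, type boost, recency decay, confidence) is an assumption: the `ScoredCandidate` shape, the linear `vectorWeight` blend, the exponential half-life decay, and the 30-day default are illustrative choices, not the formula in search.ts.

```typescript
// Hypothetical candidate shape; field names are illustrative, not the project's types.
interface ScoredCandidate {
  bm25Relevance: number; // higher = better (FTS5's bm25() is lower-is-better, so negate it first)
  cosineDistance: number | null; // null when no vector is available (BM25-only fallback)
  confidence: number; // assumed to be already clamped to its tier cap
  typeBoost: number; // assumed per-type multiplier, 1.0 = neutral
  ageDays: number;
}

const sigmoid = (x: number): number => 1 / (1 + Math.exp(-x));

function hybridScore(
  c: ScoredCandidate,
  vectorWeight: number,
  halfLifeDays = 30, // assumed decay half-life
): number {
  // Guard in the spirit of the earlier vectorWeight observation; also rejects NaN.
  if (!(vectorWeight >= 0 && vectorWeight <= 1)) {
    throw new RangeError("vectorWeight must be within [0, 1]");
  }
  const textScore = sigmoid(c.bm25Relevance);
  const base =
    c.cosineDistance === null
      ? textScore
      : vectorWeight * (1 - c.cosineDistance) + (1 - vectorWeight) * textScore;
  const recency = Math.pow(0.5, c.ageDays / halfLifeDays);
  return base * c.typeBoost * recency * c.confidence;
}
```

Falling back to the text score alone when no vector exists mirrors the BM25-only degradation described in the test plan.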

Overall assessment

Solid implementation of a memory system with good defensive programming throughout. The fixes for all 4 previous review findings are clean and correct. The architecture (two-phase write, hybrid BM25+vector search, fire-and-forget extraction, budget-aware injection) is well-designed for production use with proper graceful degradation when sqlite-vec is unavailable.
