feat: babel-memory integration — multilingual FTS & prompts #14

Merged
win4r merged 2 commits into win4r:main from AliceLJY:feat/babel-memory-integration
Apr 9, 2026

Collaborator

AliceLJY commented Apr 9, 2026

Summary

Integrate babel-memory as an optional dependency to enhance UltraMemory's multilingual capabilities:

  • BM25 search pre-tokenization: CJK queries are segmented before FTS (jieba for Chinese, kuromoji for Japanese), which improves recall for non-English queries
  • Language-aware KG extraction: Uses bilingual prompts (EN/CJK) based on auto-detected text language
  • Language-aware session distillation: Session summary prompts adapt to conversation language
  • Language metadata on ingest: Detected language stored in memory metadata for downstream use
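
To illustrate why pre-tokenization matters for the first bullet: a whitespace-based FTS tokenizer sees an unsegmented CJK string as one long token, so partial matches fail. The sketch below is not the jieba/kuromoji segmentation that babel-memory provides; it swaps in plain character bigrams, a much cruder technique, only to show the shape of the step. The names `preTokenizeQuery` and `toBigrams` are hypothetical and do not come from this PR.

```typescript
// Runs of hiragana, katakana, or CJK unified ideographs (BMP only).
const CJK_RUN = /[\u3040-\u30ff\u4e00-\u9fff]+/g;

// Split a CJK run into overlapping two-character tokens.
function toBigrams(run: string): string[] {
  if (run.length < 2) return [run];
  const out: string[] = [];
  for (let i = 0; i < run.length - 1; i++) {
    out.push(run.slice(i, i + 2));
  }
  return out;
}

// Hypothetical stand-in for a query pre-tokenizer: CJK runs become
// space-separated bigrams, everything else passes through unchanged.
function preTokenizeQuery(query: string): string {
  return query
    .replace(CJK_RUN, (run) => " " + toBigrams(run).join(" ") + " ")
    .split(/\s+/)
    .filter((token) => token.length > 0)
    .join(" ");
}
```

With this sketch, an English query is returned as-is, while a query such as `机器学习` becomes three overlapping tokens that a whitespace tokenizer can index and match individually. A dictionary-based segmenter would produce fewer, more meaningful tokens.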

Design

All integration goes through a single language-hook.ts module that auto-detects babel-memory at startup. Zero breaking changes — when babel-memory is not installed, all functions gracefully degrade to current behavior (English passthrough).
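
A minimal sketch of how such a hook could degrade gracefully, assuming the optional dependency is probed once and every public function falls back to a safe default. The interfaces, the `createLanguageHook` name, and the injected `load` callback are assumptions made for illustration; the actual language-hook.ts and the real babel-memory signatures may differ.

```typescript
// Hypothetical subset of babel-memory used by the hook.
// The real package's signatures may differ.
interface LanguageBackend {
  detectLanguage(text: string): string;
  tokenizeForFts(text: string, lang: string): string;
}

interface LanguageHook {
  available: boolean;
  detectLanguage(text: string): string;
  tokenizeQuery(query: string): string;
}

// `load` stands in for whatever startup probe the real module uses
// (for example a guarded dynamic import). It returns undefined or
// throws when the optional dependency is missing.
function createLanguageHook(
  load: () => LanguageBackend | undefined
): LanguageHook {
  let backend: LanguageBackend | undefined;
  try {
    backend = load();
  } catch {
    backend = undefined; // treat a failed load as "not installed"
  }

  if (!backend) {
    // English passthrough: same behavior as before the integration.
    return {
      available: false,
      detectLanguage: () => "en",
      tokenizeQuery: (query) => query,
    };
  }

  const active = backend;
  return {
    available: true,
    detectLanguage: (text) => active.detectLanguage(text),
    tokenizeQuery: (query) =>
      active.tokenizeForFts(query, active.detectLanguage(query)),
  };
}
```

Injecting the loader keeps the fallback path testable without uninstalling the package: a test can pass a loader that throws and assert that queries come back unchanged.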

Files changed

| File | Change |
| --- | --- |
| language-hook.ts | New: core integration module with auto-detection |
| babel-memory.d.ts | New: type stub for optional dependency |
| store.ts | Pre-tokenize BM25 queries via tokenizeQuery() |
| kg-extractor.ts | Language-aware prompt selection via getLocalizedKgPrompt() |
| session-distiller.ts | Language-aware prompt selection via getLocalizedSessionPrompt() |
| ingestion-pipeline.ts | Language detection via detectLanguage(), stored in metadata |
| index.ts | Export language-hook public API |
| package.json | Add babel-memory as optional dependency |
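
The prompt selection used by kg-extractor.ts and session-distiller.ts can be pictured as a small routing function from a detected language code to one of two prompt families. This is a sketch only: `selectKgPrompt`, `KG_PROMPTS`, and the placeholder template strings are hypothetical, and the real templates come from babel-memory rather than from this code.

```typescript
type PromptFamily = "en" | "cjk";

// Placeholder templates, not the actual prompts shipped by babel-memory.
const KG_PROMPTS: Record<PromptFamily, string> = {
  en: "<English KG extraction prompt>",
  cjk: "<CJK KG extraction prompt>",
};

// Language codes routed to the CJK prompt family (an assumption).
const CJK_LANGS: readonly string[] = ["zh", "ja", "ko"];

// Pick a prompt family from a detected language code.
// Unknown or missing codes fall back to English.
function selectKgPrompt(lang: string | undefined): string {
  const family: PromptFamily =
    lang !== undefined && CJK_LANGS.indexOf(lang) !== -1 ? "cjk" : "en";
  return KG_PROMPTS[family];
}
```

Falling back to English for any unrecognized code matches the degradation rule stated in the Design section: when nothing is known about the language, behavior stays as it was before the integration.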

babel-memory capabilities

  • detectLanguage() — 8 script systems (zh/ja/ko/th/ar/hi/ru/en)
  • tokenizeForFts() — BM25 pre-tokenization for 27+ languages
  • getKgPrompt() / getSessionPrompt() — bilingual LLM prompt templates
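
For intuition about script-based detection, the sketch below maps Unicode ranges to the eight codes listed above. It is an assumption about how such a detector might work, not babel-memory's algorithm; `detectByScript` and `SCRIPT_RULES` are hypothetical names. Rules are checked in order, so kana is tested before Han ideographs to keep Japanese text that contains kanji from being labeled Chinese.

```typescript
// Ordered [language code, script pattern] pairs. First match wins.
const SCRIPT_RULES: Array<[string, RegExp]> = [
  ["ja", /[\u3040-\u30ff]/], // hiragana and katakana
  ["ko", /[\uac00-\ud7af]/], // hangul syllables
  ["zh", /[\u4e00-\u9fff]/], // CJK unified ideographs
  ["th", /[\u0e00-\u0e7f]/], // Thai
  ["ar", /[\u0600-\u06ff]/], // Arabic
  ["hi", /[\u0900-\u097f]/], // Devanagari
  ["ru", /[\u0400-\u04ff]/], // Cyrillic
];

// Return the first language whose script appears anywhere in the text,
// defaulting to English when no rule matches.
function detectByScript(text: string): string {
  for (const [lang, pattern] of SCRIPT_RULES) {
    if (pattern.test(text)) return lang;
  }
  return "en";
}
```

A heuristic like this has known blind spots: Japanese written only in kanji is reported as `zh`, and any Cyrillic text is reported as `ru` regardless of the actual language. A production detector would need frequency or dictionary signals on top of script ranges.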

Test plan

  • 18 tests covering both with/without babel-memory paths
  • TypeScript typecheck passes (no new errors)
  • Graceful degradation verified: all functions return safe defaults without babel-memory
  • CJK language detection, tokenization, and prompt routing verified with babel-memory installed

🤖 Generated with Claude Code

AliceLJY and others added 2 commits April 9, 2026 16:24
…rompts

Integrate babel-memory (optional dependency) for language-aware memory processing:

- language-hook.ts: auto-detects babel-memory at startup, graceful degradation
- store.ts: pre-tokenize BM25 queries for CJK languages (jieba/kuromoji)
- kg-extractor.ts: language-aware KG extraction prompts (EN/CJK)
- session-distiller.ts: language-aware session summary prompts (EN/CJK)
- ingestion-pipeline.ts: detect language and store in metadata
- 18 tests covering both with/without babel-memory paths

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@win4r win4r merged commit e843ead into win4r:main Apr 9, 2026
3 checks passed
2 participants