Bug Description
After importing ~30k historical messages, the Embedding Maintenance stats show an extremely misleading "missing" vector count (~620k), even though the import process correctly skipped most content and only added ~7,000 effective memories.
Observed numbers:
| Metric | Count |
| --- | --- |
| Traces in DB | 312,185 |
| Effective memories (viewer) | ~7,000 |
| Traces with vectors | ~2,100 |
| "Missing" vectors | ~620,000 |
The ~620k number comes from (312,185 - 2,100) × 2 (vec_summary + vec_action per trace). But ~227,000 of those traces are short content (tool calls, status messages, "let me check..." type text with user_text < 50 chars AND agent_text < 100 chars) that:
- The import process correctly skipped (reported as "skipped" in import stats)
- Were never queued into the embedding_retry_queue
- Will never be processed by the embedding pipeline
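The arithmetic behind the reported figure can be checked with a short sketch. The inputs are the approximate numbers from this report. The second calculation assumes that none of the ~227,000 short traces carry vectors, so it is only an upper bound on what could still need embedding, not a measured value.

```javascript
// Figures from the table above; the short-trace count is this report's estimate.
const totalTraces = 312185;
const tracesWithVectors = 2100;
const shortTraces = 227000;
const vectorsPerTrace = 2; // vec_summary + vec_action

// What the stats currently report: every trace without vectors counts twice.
const reportedMissing = (totalTraces - tracesWithVectors) * vectorsPerTrace;

// Assumption: no short trace has vectors, so all of them sit in the "missing" pool.
const missingExcludingShort =
  (totalTraces - tracesWithVectors - shortTraces) * vectorsPerTrace;

console.log(reportedMissing);       // 620170
console.log(missingExcludingShort); // 166170
```

Under that assumption, short traces alone account for roughly 454,000 of the ~620,000 "missing" vectors.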
Root Cause
computeEmbeddingMaintenanceStats() in memory-core.js counts all traces with vec_summary IS NULL as "missing", regardless of whether those traces actually need embedding. The import process stores skipped/short traces in the traces table but doesn't embed them — and correctly so. But the stats don't distinguish between "intentionally skipped" and "needs embedding."
How to Reproduce
- Import a large historical dataset (e.g. 30k+ messages from a chat export)
- Observe import stats: ~7,000 added, ~25,000+ skipped
- Check Settings → Embedding Maintenance
- See "missing" count of ~620,000 (far exceeding actual memory count)
Suggested Fix
Any of the following would help:
- Don't count traces that were intentionally skipped during import: either mark them with a flag (e.g. share_scope = 'skipped') or exclude traces below a content length threshold from the stats.
- Show a breakdown in the stats: "X traces skipped/short, Y traces pending embedding" so users understand the gap.
- The "Repair missing" button should also skip short/empty traces instead of attempting to embed all 300k+ rows.
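A minimal sketch of how the breakdown and the repair filter could look, operating on plain trace objects. The function names (isShortTrace, summarizeEmbeddingStats, selectRepairCandidates) and the object shape are hypothetical, not taken from memory-core.js. The thresholds mirror the "short content" rule described above and may not match what the importer actually uses.

```javascript
// Hypothetical thresholds, mirroring the short-content rule described in this report.
const MIN_USER_CHARS = 50;
const MIN_AGENT_CHARS = 100;

// A trace counts as "short" only when BOTH texts fall below their thresholds.
function isShortTrace(trace) {
  const userLen = (trace.user_text || '').length;
  const agentLen = (trace.agent_text || '').length;
  return userLen < MIN_USER_CHARS && agentLen < MIN_AGENT_CHARS;
}

// Split traces lacking vectors into "skipped/short" and "pending embedding".
function summarizeEmbeddingStats(traces) {
  const stats = { skippedShort: 0, pendingTraces: 0, missingVectors: 0 };
  for (const trace of traces) {
    const missing =
      (trace.vec_summary == null ? 1 : 0) + (trace.vec_action == null ? 1 : 0);
    if (missing === 0) continue; // fully embedded, nothing to report
    if (isShortTrace(trace)) {
      stats.skippedShort += 1; // intentionally skipped, not "missing"
      continue;
    }
    stats.pendingTraces += 1;
    stats.missingVectors += missing;
  }
  return stats;
}

// Rows a "Repair missing" action would process: non-short traces lacking a vector.
function selectRepairCandidates(traces) {
  return traces.filter(
    (t) => !isShortTrace(t) && (t.vec_summary == null || t.vec_action == null)
  );
}
```

The same predicate drives both the stats and the repair selection, so the number shown to the user matches the work the repair action would actually attempt.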
Environment
- memos-local-plugin version: 2.0.4
- OpenClaw version: 2026.5.12
- OS: macOS (Apple M4)
- Embedding model: qwen/qwen3-embedding-8b (via OpenRouter)