An OpenClaw plugin that captures conversations before compaction, clusters them into topics using Cerebras, and injects relevant context back into future prompts — giving your agent persistent, searchable memory across sessions.
```
Every session                        On compaction / /reset
─────────────────────────            ──────────────────────────────────────
message_received → update            before_compaction
  last_message_at                      → extract user/assistant pairs
                                       → Cerebras clusters into topics (JSON)
llm_output → increment                 → append to conversation-log.jsonl
  pair_count                           → write topic .md files
                                       → qmd update (index them)
before_prompt_build:
  Path A — gap > 20 min  → qmd search on raw prompt
  Path B — every 5 pairs → Cerebras intent → qmd search
  Path C — nothing       → skip (zero cost)
```
| Path | Trigger | Method | Cost |
|---|---|---|---|
| A | Gap > 20 min OR post-compaction | qmd BM25 on raw prompt | ~0ms (local) |
| B | Every 5th exchange | Cerebras intent extract → keywords → qmd | ~4s (Cerebras API) |
| C | Otherwise | Skip | Zero |
Path A uses a corpus gate — it won't fire until you have 5+ topic files (avoids wasteful searches on an empty corpus).
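Concretely, the three-way branch can be sketched like this. The function and state-field names here are illustrative, not the plugin's actual identifiers:

```typescript
// Illustrative sketch of the Path A/B/C decision (names are hypothetical).
interface HookState {
  lastMessageAt: number;  // epoch ms of the previous user message
  pairCount: number;      // user/assistant exchanges seen so far
  topicFileCount: number; // size of the topic-file corpus
}

type Path = "A" | "B" | "C";

const GAP_THRESHOLD_MIN = 20; // CONV_MEMORY_GAP_THRESHOLD_MIN
const MIN_CORPUS_SIZE = 5;    // CONV_MEMORY_MIN_CORPUS_SIZE (corpus gate)
const PAIR_CHECK_FREQ = 5;    // CONV_MEMORY_PAIR_CHECK_FREQ

function choosePath(state: HookState, now: number, postCompaction: boolean): Path {
  const gapMin = (now - state.lastMessageAt) / 60_000;
  // Path A: long gap or fresh compaction, but only once the corpus is useful.
  if ((gapMin > GAP_THRESHOLD_MIN || postCompaction) &&
      state.topicFileCount >= MIN_CORPUS_SIZE) {
    return "A";
  }
  // Path B: every Nth exchange, ask Cerebras for intent keywords.
  if (state.pairCount > 0 && state.pairCount % PAIR_CHECK_FREQ === 0) {
    return "B";
  }
  return "C"; // zero cost
}
```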
- OpenClaw with the extensions API
- Cerebras API key (free tier available)
- qmd — local hybrid BM25+semantic search
```bash
pip install qmd
qmd init     # creates a qmd.config.json in your workspace
qmd update   # indexes your memory/ folder
qmd embed    # generates embeddings (optional but improves Path B)
```

Copy the plugin into your extensions directory:

```bash
mkdir -p ~/.openclaw/extensions/conversation-memory
cp index.ts ~/.openclaw/extensions/conversation-memory/
```

Or, if your OpenClaw workspace is ~/clawd:

```bash
mkdir -p ~/clawd/.openclaw/extensions/conversation-memory
cp index.ts ~/clawd/.openclaw/extensions/conversation-memory/
```

Option A — Environment variable (recommended):

```bash
export CONV_MEMORY_CEREBRAS_KEY="your-api-key-here"
```

Option B — Key file (default path):

```bash
mkdir -p ~/.credentials
echo "your-api-key-here" > ~/.credentials/cerebras-api-key.txt
chmod 600 ~/.credentials/cerebras-api-key.txt
```

Option C — Custom key file path:

```bash
export CONV_MEMORY_CEREBRAS_KEY_FILE="/path/to/your/api-key.txt"
```

Restart the gateway:

```bash
openclaw gateway restart
# or
systemctl --user restart openclaw-gateway.service
```

Verify that the hooks registered:

```bash
journalctl --user -u openclaw-gateway.service -n 20 | grep conv-memory
# Expected: [conv-memory] All 5 hooks registered — Phase 4+6 active
```

All configuration is via environment variables. Defaults are shown.
```bash
# Workspace path (where OpenClaw lives)
CONV_MEMORY_WORKSPACE=~/clawd

# Cerebras API key (inline — takes priority over key file)
CONV_MEMORY_CEREBRAS_KEY=

# Cerebras API key file path
CONV_MEMORY_CEREBRAS_KEY_FILE=~/.credentials/cerebras-api-key.txt

# Cerebras model to use
CONV_MEMORY_CEREBRAS_MODEL=qwen-3-235b-a22b-instruct-2507

# Session key to monitor (your main human<>agent session)
CONV_MEMORY_SESSION=agent:main:main

# qmd collections to search (comma-separated)
CONV_MEMORY_QMD_COLLECTIONS=memory,obsidian

# Gap in minutes before Path A fires
CONV_MEMORY_GAP_THRESHOLD_MIN=20

# Minimum topic files before Path A fires (corpus gate)
CONV_MEMORY_MIN_CORPUS_SIZE=5

# Maximum characters to inject into the system prompt
CONV_MEMORY_MAX_INJECT_CHARS=800

# Pair-check frequency for Path B
CONV_MEMORY_PAIR_CHECK_FREQ=5

# Minimum confidence for Cerebras intent to trigger injection (0.0–1.0)
CONV_MEMORY_INTENT_CONFIDENCE=0.7
```

You can set these in your shell profile, or in a .env file if your OpenClaw setup loads one.
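For illustration, resolving these variables with their defaults inside the plugin might look like the sketch below; the `env` helper and `config` shape are hypothetical, not the plugin's actual code:

```typescript
// Hypothetical sketch: read each setting from the environment,
// falling back to the documented default when unset or empty.
function env(name: string, fallback: string): string {
  const v = process.env[name];
  return v !== undefined && v !== "" ? v : fallback;
}

const config = {
  workspace: env("CONV_MEMORY_WORKSPACE", "~/clawd"),
  model: env("CONV_MEMORY_CEREBRAS_MODEL", "qwen-3-235b-a22b-instruct-2507"),
  session: env("CONV_MEMORY_SESSION", "agent:main:main"),
  qmdCollections: env("CONV_MEMORY_QMD_COLLECTIONS", "memory,obsidian").split(","),
  gapThresholdMin: Number(env("CONV_MEMORY_GAP_THRESHOLD_MIN", "20")),
  minCorpusSize: Number(env("CONV_MEMORY_MIN_CORPUS_SIZE", "5")),
  maxInjectChars: Number(env("CONV_MEMORY_MAX_INJECT_CHARS", "800")),
  pairCheckFreq: Number(env("CONV_MEMORY_PAIR_CHECK_FREQ", "5")),
  intentConfidence: Number(env("CONV_MEMORY_INTENT_CONFIDENCE", "0.7")),
};
```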
```
<workspace>/memory/
  conversation-log.jsonl        # one JSON block per compaction
  hook-state.json               # pair_count, last_message_at, etc.
  conversation-topics/
    2026-02-21T08-05-07Z-plugin-architecture-0.md
    2026-02-21T08-05-07Z-qmd-integration-1.md
    ...                         # grows with each compaction
```
Each topic file is a short Markdown document — searchable via qmd, readable by humans.
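A topic file might look something like this; the contents below are purely illustrative, since the actual fields are whatever the Cerebras clustering step emits:

```markdown
# plugin-architecture

**Session:** 2026-02-21

Discussed how the before_prompt_build hook stays synchronous while
compaction processing runs fire-and-forget, and why hook-state.json
needs atomic writes.
```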
The corpus gate (MIN_CORPUS_SIZE=5) means Path A stays quiet until you've had enough compaction cycles to build a useful search corpus. After the first 5 compactions (~1-2 weeks of active use), you'll start seeing Path A fire and inject relevant context.
Milestones — the included conversation-compact.sh companion script logs a note to your daily memory file when the corpus crosses 5, 20, 50, or 100 topic files — signalling when to re-tune thresholds.
Re-tuning checklist (run at each milestone):

- What's the Path A hit rate? (check gateway logs for "qmd returned nothing")
- Are Path B injections relevant? (look at `intent=` in logs)
- Is 800 chars enough context, or too much?
- Should you adjust `PAIR_CHECK_FREQ` up or down?
Default: qwen-3-235b-a22b-instruct-2507 — a fast, high-quality model for structured JSON extraction. Cerebras's hardware delivers responses in ~1–2 seconds even at 235B parameters.
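The intent-extraction call is, at heart, an ordinary structured-output chat request. A minimal sketch against Cerebras's OpenAI-compatible chat completions endpoint follows; the prompt wording and the `keywords`/`confidence` response schema are assumptions for illustration, not the plugin's actual code:

```typescript
// Sketch: ask the model to summarize recent conversation as search keywords.
// Endpoint is Cerebras's OpenAI-compatible API; prompt and schema are illustrative.
async function extractIntent(
  apiKey: string,
  model: string,
  recentText: string
): Promise<{ keywords: string[]; confidence: number }> {
  const res = await fetch("https://api.cerebras.ai/v1/chat/completions", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      messages: [
        {
          role: "system",
          content:
            'Extract search intent. Reply with JSON only: {"keywords": ["..."], "confidence": 0.0-1.0}',
        },
        { role: "user", content: recentText },
      ],
    }),
  });
  if (!res.ok) throw new Error(`Cerebras API error: ${res.status}`);
  const data = await res.json();
  // Parse the model's JSON reply; callers compare confidence to the threshold.
  return JSON.parse(data.choices[0].message.content);
}
```

The caller would then drop any result whose `confidence` falls below `CONV_MEMORY_INTENT_CONFIDENCE` before running the qmd search.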
You can swap to any Cerebras-hosted model:
```bash
CONV_MEMORY_CEREBRAS_MODEL=llama-4-scout-17b-16e-instruct
```

- State is in-memory — `hook-state.json` is the source of truth, but the in-memory cache avoids read-modify-write races between hooks
- All disk writes are atomic — temp file + rename, unique per call
- Fire-and-forget compaction — `processCompaction` runs async after returning from the hook; it never blocks the main session
- Fallback on Cerebras failure — if the API call fails, raw pairs are written to `conversation-log.jsonl` so nothing is lost
- qmd is optional — if qmd isn't installed or returns nothing, the plugin silently skips injection (no errors)
MIT