One Python file. Zero dependencies. Zero LLM calls. 100% deterministic. Never deletes — only archives.
Your coding agent already has a memory file — Hermes' MEMORY.md, Claude
Code's memory, a notes section in AGENTS.md. That file only ever grows:
duplicates pile up, corrected facts sit next to the stale ones they
corrected, and hard character budgets (Hermes: 2,200 chars) overflow.
dream gives that file what a brain gets every night — a sleep cycle:
light sleep exact + near-duplicate detection (keeps the richest wording)
deep sleep supersession: same subject stated twice → the newer wins,
the older is archived with a written reason
REM recurring-theme report + contradiction flags (report only)
squeeze --budget N: merge related entries clause-safely, then archive
the most redundant ones until the file fits its hard limit
Unlike every other memory consolidator (Letta sleep-time compute, OpenClaw dreaming, mem0), it never calls an LLM: consolidation is pure deterministic text algebra — zero tokens, zero keys, runs offline, same input → same plan, every time — which also means you can put it in cron and forget it.
curl -O https://raw.githubusercontent.com/Da7-Tech/dream/main/dream.py
python3 dream.py MEMORY.md # dry run: journal + diff, no writes
python3 dream.py MEMORY.md --apply # consolidate (backup + archive + journal)
python3 dream.py --hermes --apply # Hermes: ~/.hermes/memories/* with the
# char budgets read from config.yamlDry run is the default. --apply always writes a timestamped backup, moves
every removed entry into <file>.dream-archive.md with the reason, and
appends a human-readable report to DREAMS.md.
Run against a real, lived-in Hermes MEMORY.md (13 entries, 5,926 chars,
mixed Arabic/English), the dry run found exactly two consolidations and
zero false positives:
- [deep] superseded — same subject stated again later in the file
archived: <a claim about a provider endpoint from Jul 1>
kept/now: "Correction (Jul 1): <the agent's own later correction
of that exact claim>"
- [deep] superseded — same subject stated again (100% subject overlap)
archived: <feature note, older wording>
kept/now: <same feature note, richer wording with the config path>
entries: 13 -> 11 | chars: 5926 -> 5066
Both actions are exactly right: the first entry had literally been
corrected by a later "Correction:" entry the agent appended — dream
noticed, kept the correction, archived the stale claim.
Auto-detected (or force with --format):
| format | used by | entries are |
|---|---|---|
sections |
Hermes (MEMORY.md, USER.md) |
blocks separated by § — size accounting matches Hermes' own char-limit math (Python len() codepoint count, character-for-character) |
bullets |
Claude Code memory, most AGENTS.md notes |
- bullets (with continuation lines); headers preserved untouched |
paragraphs |
plain notes files | blank-line separated blocks |
Bilingual: tokenization and stemming handle English and Arabic (with broken-plural handling), so mixed-language memories consolidate correctly.
- Dry run by default — you always see the diff and the reasons first.
- Nothing is ever destroyed — every removed entry goes to
<file>.dream-archive.mdwith a timestamp and the reason. - Timestamped backup of the original before every
--apply. - Uncertain cases are flagged, not acted on — low-overlap conflicts go to the journal as flags for you (or your agent) to resolve.
- Atomic, symlink-refusing writes; idempotent (a second dream on a clean memory changes nothing — this is a unit test).
--apply write changes (default: dry-run preview)
--budget N enforce a hard character budget (Hermes MEMORY.md: 2200)
--format F auto | sections | bullets | paragraphs
--max-age D also archive entries first seen more than D days ago
(age is tracked in a sidecar state file across runs)
--no-supersede dedup only
--no-merge never merge; archive-only squeeze
--quiet one summary line
- 43 unit tests, stdlib
unittest:python3 -m unittest discover -s tests - 90-day soak in CI (
bench/soak.py— real code, injected clock, daily churn + nightly dream): newest statement of every evolving subject won, nothing lost (file ∪ archive), byte-identical across full reruns. A separate budget-stress leg feeds non-dedupable novel churn against a small budget so the squeeze path actually fires (archive-for-budget) and the hard char limit is verified held on every night — the budget guarantee is also covered directly by unit tests. - Determinism and idempotence are tested, not promised
- Consolidating a 13-entry real-world memory: < 20 ms, 0 tokens
dream improves the memory your agent already has. If you want a full
brain-like memory — spreading-activation recall, Ebbinghaus forgetting,
cross-agent export — that's the sister project:
mind. Use either, or both:
mind for project memory, dream for the agent's own memory file.
- Supersession trusts file order (memory files are append logs; later =
newer). If yours isn't append-ordered, use
--no-supersede. - Merging is clause-level text algebra: it preserves every novel clause but won't rewrite prose the way an LLM would — by design (determinism > polish).
- Near-duplicate detection is bag-of-words with an order guard: two entries are only merged when they share the same tokens in the same order, so "A calls B" and "B calls A" are treated as distinct (they fall through to the conflict flag), not silently deduped.
- Theme detection is term-frequency based, a report not an oracle.
- Pairwise comparison is O(n²): instant for agent-memory files (a 13-entry real memory: ~2 ms), slow for huge files (an adversarial 18,000-entry, 1 MB file took ~3 minutes). It is a memory-file tool, not a corpus tool.
Arabic README: README.ar.md · License: MIT