A brain-inspired always-on agent memory that folds continuously arriving events into self-emerging cognitive structure — designed for the next generation of proactive assistants.
- 🎯 Highlights
- 🧠 Concepts in 60 seconds
- 🎬 Demo
- 🛠️ Installation
- 🚀 Quick Start
- ⚙️ Key Configurations
- 🔁 Benchmark Evaluation
- 📂 Project Structure
- 🔗 Citation
- 📜 License
## 🎯 Highlights

- 🔮 **Proactive Memory.** Proactivity is a property of the memory substrate, not the agent's policy — goals emerge from the topology that accumulates the conditions for them.
- 🧠 **Architecture.** A tri-layered substrate extending Complementary Learning Systems with a prefrontal Intent layer — events fold into concepts, concepts crystallize into intents, surfaced through a hierarchical context window.
- 🌱 **Conceptual Bootstrapping.** Accumulation, compression, decay, completion — four structural debts of a streaming event log, resolved as transparent graph rewrites: test-time learning without gradient updates or surface text rewriting.
- 📊 **Evaluation.** CogEval-Bench isolates proactive emergence from retrieval accuracy; seven downstream benchmarks confirm the substrate stays robust on conventional memory tasks.
## 🧠 Concepts in 60 seconds

CogniFold ingests an asynchronous event stream and folds it into a typed concept graph. Four node types — the first three mirror Complementary Learning Systems (CLS) theory:
| Node | ID prefix | Layer | Role |
|---|---|---|---|
| `event` | `e-` | Hippocampal | Episodic trace — each input committed verbatim |
| `concept` | `c-` | Neocortical | Semantic pattern abstracted from recurring events |
| `intent` | `i-` | Prefrontal | Crystallizes when a concept cluster crosses a density threshold — this is what makes memory proactive |
| `time` | `t-` | — | Temporal anchor (deadlines, scheduled times) |
Eight typed/weighted edges (GROUNDS, CAUSES, TRIGGERS, REINFORCES, PART_OF, DERIVED_FROM, DEADLINE_FOR, RELATED_TO) wire them. Two ways to read the graph:
- **Proactive Context Window** (no query asked) — read the live `immediate` / `working` / `background` bands; intents surface on their own.
- **Memory Query Agent** (explicit query) — retrieve via `bm25` / `semantic` / `hybrid` modes, optionally wrapped in an agentic multi-round loop.
Details and tunables: ⚙️ Key Configurations.
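As a mental model, the node and edge schema above fits in a few lines of plain Python. This is a toy sketch for intuition only — `Node`, `Graph`, and their methods here are illustrative stand-ins, not the CogniFold API:

```python
from dataclasses import dataclass, field

# ID prefixes from the node-type table above
PREFIX = {"event": "e-", "concept": "c-", "intent": "i-", "time": "t-"}

@dataclass
class Node:
    type: str     # "event" | "concept" | "intent" | "time"
    title: str
    id: str

@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)   # (src_id, EDGE_TYPE, dst_id, weight)
    _counts: dict = field(default_factory=dict)

    def add_node(self, type_: str, title: str) -> Node:
        n = self._counts.get(type_, 0) + 1      # per-type counter drives the ID
        self._counts[type_] = n
        node = Node(type_, title, f"{PREFIX[type_]}{n}")
        self.nodes[node.id] = node
        return node

    def add_edge(self, src: Node, edge_type: str, dst: Node, weight: float = 1.0):
        self.edges.append((src.id, edge_type, dst.id, weight))

g = Graph()
run = g.add_node("event", "Morning run in the park")    # hippocampal trace
habit = g.add_node("concept", "regular exercise")       # neocortical pattern
g.add_edge(habit, "GROUNDS", run)   # concept is grounded by the episodic event
print(run.id, habit.id)             # e-1 c-1
```

The point of the sketch: IDs encode the layer, and all structure lives in typed, weighted edges rather than in rewritten text.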
## 🎬 Demo

1. **Proactive memory in motion.** The graph folds events, crystallizes concepts, and surfaces intents.

   Demo.mp4

2. **Substrate across narratives.** *I, Robot* (top) and *Currency Wars* (bottom), two stream snapshots each.
## 🛠️ Installation

| Requirement | Notes |
|---|---|
| Python ≥ 3.11 | 3.14 tested in CI |
| `uv` (recommended) or `pip` | `uv` gives ~10× faster installs |
| LLM API key (optional) | Google `GOOGLE_API_KEY` or OpenAI `OPENAI_API_KEY` — only needed for agent / semantic retrieval / agentic mode |
```bash
# 1. Clone the repository
git clone https://github.com/MergeFold/CogniFold.git
cd CogniFold

# 2. Install (pick one)
uv sync                                     # fastest, uses uv.lock
pip install -e ".[agent,service]"           # core + agent + HTTP service
pip install -e ".[dev,agent,service,viz]"   # everything (dev tools, viz, FAISS)

# 3. Configure API keys
cp .env.example .env
# edit .env and set GOOGLE_API_KEY or OPENAI_API_KEY
```

## 🚀 Quick Start

```bash
# 1. Generate a sample timeline (a saved demo is also under data/generated/)
cognifold generate --domain personal-timeline --persona software_engineer --events 50

# 2. Build the concept graph
cognifold run data/generated/alex_chen_timeline.json --save-graph output/graph.json

# 3. Query the graph
cognifold query --graph output/graph.json --retrieval bm25 "morning routine"

# 4. Replay the graph evolution as an interactive HTML
cognifold replay logs/replay_alex_chen_timeline_*.jsonl -o output/replay.html --open
```

The same graph is readable from Python:

```python
from cognifold import NodeType
from cognifold.graph.persistence import load_graph
from cognifold.scoring.hierarchical import HierarchicalContextSelector

# Load a previously saved graph
graph = load_graph("output/graph.json")
print(f"nodes={graph.node_count} edges={graph.edge_count}")

# Read the live, always-on context window — no query asked!
context = HierarchicalContextSelector().select_context(graph)
print(f"\nimmediate ({context.immediate.node_count} nodes — top-of-mind):")
for n in context.immediate.nodes[:5]:
    print(f"  [{n.type.value}] {n.data.get('title', n.id)}")
print(f"\nworking ({context.working.node_count} nodes — active patterns)")
print(f"background ({context.background.node_count} nodes — historical)")

# Emergent intents surface here without anyone asking
intents = graph.get_nodes_by_type(NodeType.INTENT)
print(f"\n{len(intents)} intents emerged from the graph state:")
for i in intents[:5]:
    print(f"  [{i.id}] {i.data.get('title', '?')} status={i.data.get('status', '?')}")

# Example output (50-event personal timeline):
# nodes=78 edges=124
#
# immediate (8 nodes — top-of-mind):
#   [event] Met with team about Q3 plan
#   [intent] Schedule follow-up with marketing
#   [concept] product launch coordination
#   [event] Coffee with Sarah at Blue Bottle
#   [event] Reviewed candidate resume
#
# working (23 nodes — active patterns)
# background (47 nodes — historical)
#
# 3 intents emerged from the graph state:
#   [i-7] Schedule follow-up with marketing status=pending
#   [i-12] Buy birthday gift for Sarah status=pending
#   [i-15] Q3 OKR review prep status=in_progress
```

For an explicit question, use the Memory Query Agent:

```python
from cognifold.query.agent import MemoryQueryAgent
from cognifold.query.config import QueryConfig

agent = MemoryQueryAgent(graph, config=QueryConfig(retrieval_mode="hybrid"))
result = agent.query("What did I commit to about exercise?")
print(result.context_text)
```

Or run CogniFold as an HTTP service:

```bash
./scripts/start_server.sh                     # default :8000
cognifold client --url http://localhost:8000  # interactive REPL

# Or hit the API directly
curl -X POST http://localhost:8000/api/v1/sessions
curl http://localhost:8000/docs               # OpenAPI / Swagger UI
```

## ⚙️ Key Configurations

Set via `QueryConfig(retrieval_mode=...)`. The four modes select the entry point into the graph for an explicit query:
| Mode | When to use | Needs LLM key? |
|---|---|---|
| `legacy` | original keyword matching, minimal dependencies | No |
| `bm25` | TF-IDF inverted index; fast and deterministic | No |
| `semantic` | embedding-based vector search | Yes (Google / OpenAI) |
| `hybrid` (default) | BM25 + semantic via RRF fusion; best general accuracy | Yes — auto-degrades to BM25 if no embedder |
For hard multi-hop queries, wrap with `AgenticRetriever`: it runs `hybrid` first, asks an LLM whether the result is sufficient, and if not, expands the query in parallel and re-ranks via RRF.
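Reciprocal Rank Fusion itself is tiny. The sketch below is illustrative, not the library's code: each ranked list contributes `1 / (k + rank)` per document, and the summed scores decide the fused order, so items ranked well by both retrievers float to the top:

```python
def rrf_fuse(rankings, k=60):
    """Fuse several ranked id lists (best first) via Reciprocal Rank Fusion.

    k=60 is the constant commonly used in the RRF literature.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # each list votes with a reciprocal-rank score
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["e-3", "e-7", "c-2"]       # lexical ranking
semantic_hits = ["c-2", "e-3", "i-1"]   # embedding ranking
print(rrf_fuse([bm25_hits, semantic_hits]))   # ['e-3', 'c-2', 'e-7', 'i-1']
```

Note how `e-3` wins: it is near the top of both lists, while `c-2` leads only one of them.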
`HierarchicalContextSelector().select_context(graph)` returns three bands, each a different attention regime:

| Band | Default size | Score weights |
|---|---|---|
| `immediate` | 10% of window | recency 0.7 + urgency 0.3 |
| `working` | 30% of window | PageRank 0.5 + recency 0.3 + type 0.2 (favors concepts) |
| `background` | 50% of window | PageRank 0.8 + diversity 0.2 |
The window can be read at any time — no query is required. Intents that crossed the crystallization threshold appear in `immediate` automatically; concepts that are being reinforced live in `working`; durable structure sinks to `background`.
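In pure Python the band split reduces to greedy selection under per-band score functions. A toy sketch using the weights from the table above — the node dicts and this `select_context` are illustrative stand-ins, not the real `HierarchicalContextSelector`:

```python
def select_context(nodes, window=10):
    # band sizes: 10% / 30% / 50% of the window, per the table above
    sizes = {"immediate": window // 10,
             "working": window * 3 // 10,
             "background": window // 2}
    score = {
        "immediate":  lambda n: 0.7 * n["recency"] + 0.3 * n["urgency"],
        "working":    lambda n: 0.5 * n["pagerank"] + 0.3 * n["recency"] + 0.2 * n["type_w"],
        "background": lambda n: 0.8 * n["pagerank"] + 0.2 * n["diversity"],
    }
    bands, remaining = {}, list(nodes)
    for band in ("immediate", "working", "background"):
        remaining.sort(key=score[band], reverse=True)   # re-rank under this band's regime
        bands[band], remaining = remaining[:sizes[band]], remaining[sizes[band]:]
    return bands

nodes = [
    {"id": "i-1", "recency": 0.9, "urgency": 0.9, "pagerank": 0.2, "type_w": 0.5, "diversity": 0.1},
    {"id": "c-1", "recency": 0.4, "urgency": 0.1, "pagerank": 0.9, "type_w": 1.0, "diversity": 0.3},
    {"id": "e-1", "recency": 0.8, "urgency": 0.2, "pagerank": 0.1, "type_w": 0.2, "diversity": 0.2},
]
bands = select_context(nodes)
print([n["id"] for n in bands["immediate"]])   # ['i-1'] — the urgent intent is top-of-mind
```

The key design point survives the simplification: a node is never scored once globally; each band re-ranks the remainder under its own attention regime.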
## 🔁 Benchmark Evaluation

| Flag | Purpose |
|---|---|
| `--event-stream` | enable inter-session consolidation (`merge_similar_concepts` + `prune_orphan_concepts`); required for paper-grade LoCoMo |
| `--query-mode {base, rag, episodic, mergefold}` | ablation switch: `mergefold` = full CogniFold; the others are baselines |
| `--disable-concepts` | events-only baseline (skips concept formation) |
| `--model openai:gpt-4o-mini` | reader model |
| `--judge-model gpt-4o-mini` | LLM-as-judge for QA scoring (auto-derived from `--model` if omitted) |
| `--limit N` | cap the number of examples (smoke-testing) |
| `--no-llm-eval` | skip the LLM judging step (use exact-match / F1 only) |

Environment overrides accepted by `scripts/reproduce.sh`: `MODEL=...`, plus the LLM keys `OPENAI_API_KEY` / `GOOGLE_API_KEY` (from `.env`).
One wrapper for everything — sane defaults, dataset auto-downloaded on first run, paper-faithful flags applied per benchmark.
```bash
# canonical run: LoCoMo full 10-conversation Mem0 protocol (≈ 1 h on gpt-4o-mini)
bash scripts/reproduce.sh

# any single benchmark (paper order — CogEval-Bench first, LoCoMo second)
bash scripts/reproduce.sh cogeval      # CogEval-Bench (structural diagnostic; the proactive thesis)
bash scripts/reproduce.sh locomo       # LoCoMo (default; Mem0 protocol)
bash scripts/reproduce.sh musique      # multi-hop QA
bash scripts/reproduce.sh narrativeqa  # narrative comprehension
bash scripts/reproduce.sh tomi         # theory of mind
bash scripts/reproduce.sh babilong     # long-context fact extraction
bash scripts/reproduce.sh mutual       # dialogue coherence
bash scripts/reproduce.sh streamingqa  # streaming temporal QA

# all 8 paper benchmarks back-to-back (CogEval + 7 downstream; uses MODEL env to override)
bash scripts/reproduce.sh all
```

Each run writes `benchmarks/<name>/output/benchmark_results.json`. Override the reader model via env: `MODEL=openai:gpt-4o bash scripts/reproduce.sh locomo`. The `--event-stream` flag is automatically applied to LoCoMo (it gates the inter-session consolidation pass central to the always-on memory thesis); all other benchmarks discharge consolidation through the shared `base_runner` post-ingestion hook.
## 📂 Project Structure

```text
cognifold/
├── src/cognifold/            # core library (20 submodules)
│   ├── __init__.py
│   ├── __main__.py
│   ├── config.py
│   ├── logging.py
│   ├── agent/                # LangGraph agent, prompts, sections, domain configs
│   ├── cli/                  # CLI commands
│   ├── embeddings/           # Gemini / OpenAI providers, optional FAISS ANN
│   ├── executor/             # Plan execution with validation and rollback
│   ├── generator/            # Event generation (4 domains)
│   ├── graph/                # NetworkX wrapper, persistence, validation, metrics
│   ├── importers/            # Data importers (wiki)
│   ├── intent/               # Intent-to-action system: queue, executor, calibrator
│   ├── models/               # Pydantic schemas (Event, Node, Edge, UpdatePlan)
│   ├── pipeline/             # Pipeline orchestration (classic + layered)
│   ├── query/                # Query agent, strategies, assembly, LLM utilities
│   ├── replay/               # Graph evolution logging + interactive HTML
│   ├── retrieval/            # BM25, hybrid, agentic multi-round, cross-encoder
│   ├── scoring/              # PageRank, hierarchical context, node ranking
│   ├── service/              # HTTP service (FastAPI) — sessions, routes, auth, stores
│   ├── simulator/            # Timeline processing, visualization
│   ├── symbolic/             # Symbolic belief tracker, cognition / intent routers
│   ├── temporal/             # Temporal entity extraction, date parsing
│   ├── trace/                # Tracing / instrumentation
│   └── utils/                # Shared utilities (LLM metrics, budget, embeddings)
├── benchmarks/               # 8 benchmark runners + shared base-runner library
│   ├── shared/               # base_runner, baseline_runner, graph_evolution_tracker
│   ├── babilong/  cogeval/  locomo/  musique/
│   └── mutual/  narrativeqa/  streamingqa/  tomi/
├── configs/                  # per-benchmark prompt profiles (YAML)
├── examples/                 # sample timelines + replay HTML for 4 domains
├── scripts/                  # auxiliary scripts (LoCoMo audit-protocol rejudge, …)
├── docs/                     # ARCHITECTURE.md · BENCHMARK.md · PROMPTS.md
├── .github/                  # CI / CD workflows
├── cognifold                 # CLI entry-point shell launcher
├── config.example.yaml       # example application config
├── .env.example              # example environment file
├── generate_demo.py          # one-shot demo-graph generator
├── test_benchmarks.py        # smoke tests for the benchmark runners
├── pyproject.toml
├── uv.lock
├── Makefile
├── README.md
├── LICENSE
└── .gitignore
```
## 🔗 Citation

A BibTeX entry for the accompanying paper will be added here once the paper is publicly released.
## 📜 License

Apache-2.0 — see LICENSE.

