Agent knowledge layer — distills conversations into structured facts, retrieves what matters, forgets the rest.
Most agent frameworks treat memory as a log. ai-knot treats it as a knowledge base. It extracts facts from conversations, scores them by importance, and retrieves only what's relevant when building the next prompt. Pluggable storage, six LLM providers, no vendor lock-in.
Most frameworks store everything — messages, tool calls, system prompts, the whole log. That's fine until you're paying to inject six months of conversation history into every request, most of which has nothing to do with what the user asked.
The log grows to 400k tokens. The model needs maybe 300 of those for the next turn — but there's no obvious way to know which ones without reading all of them first. ai-knot solves this by keeping a distilled knowledge base instead of a raw log.
1000 messages (~400k tokens)
↓ LLM extraction + ATC verification
~12 facts (~300 tokens)
↓ BM25 retrieval
3–5 facts injected into the next prompt
Python:
pip install ai-knot
# With OpenAI for LLM extraction:
pip install "ai-knot[openai]"
# With PostgreSQL backend:
pip install "ai-knot[postgres]"
# With MCP server (Claude Desktop / Claude Code):
pip install "ai-knot[mcp]"
Node.js / TypeScript (requires Python 3.11+ in PATH):
npm install ai-knot
from ai_knot import KnowledgeBase
kb = KnowledgeBase(agent_id="my_agent")
# Add facts manually
kb.add("User is a senior backend developer at Acme Corp",
type="semantic", importance=0.95)
kb.add("User prefers Python, dislikes async code",
type="procedural", importance=0.85)
# Or extract automatically from a conversation
from ai_knot import ConversationTurn
turns = [
ConversationTurn(role="user", content="I deploy everything in Docker"),
ConversationTurn(role="assistant", content="Got it, I'll use Docker examples"),
]
kb.learn(turns, provider="openai", api_key="sk-...") # LLM extracts + stores relevant facts
# At inference time — get what matters
context = kb.recall("how should I write this deployment script?")
# -> "[procedural] User prefers Python, dislikes async code
# [semantic] User deploys everything in Docker
# [semantic] User is a senior backend developer at Acme Corp"
# Inject into your prompt
response = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": f"You are a helpful assistant.\n\n{context}"},
        {"role": "user", "content": "how should I write this deployment script?"},
    ],
)
Benchmarks run on Ubuntu (ubuntu-latest, GitHub Actions).
Full benchmark history →
Measured with pytest-benchmark, pedantic mode, 20 rounds after 3 warmups.
BM25Retriever.search() is O(n) — IDF and BM25 scores are recomputed on every call.
| Facts in memory | p50 | p95 | QPS |
|---|---|---|---|
| 100 | ~1 ms | ~3 ms | ~800 |
| 1 000 | ~8 ms | ~25 ms | ~100 |
| 10 000 | ~80 ms | ~200 ms | ~12 |
Numbers are indicative. Run `pytest tests/test_performance.py -m slow --benchmark-only` locally for hardware-accurate results.
KnowledgeBase.recall() reads storage on every call. YAML adds ~10–50 ms I/O overhead; SQLite is lower-variance at scale.
| Backend | Facts | p50 | p95 |
|---|---|---|---|
| YAML | 1 000 | ~30 ms | ~80 ms |
| SQLite | 1 000 | ~20 ms | ~50 ms |
Measured end-to-end: Python subprocess spawn is one-time; per-call overhead is JSON serialization + tool execution.
| Tool | Facts | p50 | p95 |
|---|---|---|---|
| `add` | — | ~15 ms | ~80 ms |
| `recall` | 50 | ~20 ms | ~100 ms |
| `stats` | — | ~5 ms | ~20 ms |
Context: pure MCP stdio JSON-RPC overhead is ~10 ms p95 with no tool execution (tmdevlab MCP benchmark). Anthropic recommends keeping agent memory tool latency under 100 ms (Reduce Latency docs). Use `storage="sqlite"` for lower variance at scale.
Pass learn() a conversation and it calls your LLM to figure out what's worth keeping — preferences, recurring patterns, explicit facts. Greetings, clarifications, filler — those don't make the cut.
What happened in the conversation: What ai-knot stores:
--- ---
"hey" X skipped
"thanks" X skipped
"ok got it" X skipped
"I really hate working with async" -> "User dislikes async code"
"by the way we deploy on kubernetes" -> "User deploys on Kubernetes"
"can you make it shorter please" -> "User prefers concise responses"
Signal, not noise. Importance scores, power-law retention decay, ATC verification, deduplication — built in.
learn() cross-checks new facts against everything already stored before inserting anything.
If a new fact is semantically similar to an existing one (Jaccard similarity ≥ 0.7 by default),
the existing fact is reinforced instead of a duplicate being created:
kb.add("User deploys on Docker")
kb.add("User deploys with Docker Compose") # similar enough -> reinforced, not duplicated
# Control the threshold per call:
kb.learn(turns, provider="openai", conflict_threshold=0.8)
Importance is bumped by 0.05 (capped at 1.0) each time reinforcement fires — the knowledge base naturally weights well-confirmed facts higher over time.
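For illustration, a small sketch of what reinforcement looks like from the caller's side — it assumes `add()` applies the same similarity check (as the example above suggests) and that facts expose an `importance` attribute:

```python
kb.add("User deploys on Docker", importance=0.8)
kb.add("User deploys with Docker Compose")   # treated as similar (see above) → reinforced

docker_facts = [f for f in kb.list_facts() if "Docker" in f.content]
print(len(docker_facts))                     # 1 — no duplicate was created
print(docker_facts[0].importance)            # ≈ 0.85: the 0.80 original bumped by 0.05
```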
Save and restore the complete state of a knowledge base at any point in time:
kb.add("User prefers Python")
kb.add("User deploys on Docker")
kb.snapshot("before_refactor") # save current state
kb.add("User switched to Go") # state changes
kb.forget(some_fact_id)
kb.restore("before_refactor") # atomically roll back
# List all saved snapshots:
names = kb.list_snapshots() # ["before_refactor"]
# Compare two snapshots (or current state):
diff = kb.diff("before_refactor", "current")
print(f"Added: {[f.content for f in diff.added]}")
print(f"Removed: {[f.content for f in diff.removed]}")
Both YAML and SQLite backends support snapshots. YAML stores them under `.ai_knot/{agent_id}/snapshots/`; SQLite stores them in the same database file.
ai-knot ships a native MCP server. Install it and Claude can call add, recall,
forget, and snapshot as tools — without any Python code on your end:
pip install "ai-knot[mcp]"
Add to your `claude_desktop_config.json`:
{
  "mcpServers": {
    "ai-knot": {
      "command": "ai-knot-mcp",
      "env": {
        "AI_KNOT_AGENT_ID": "myagent",
        "AI_KNOT_STORAGE": "sqlite",
        "AI_KNOT_DB_PATH": "/absolute/path/to/memory.db"
      }
    }
  }
}
Available tools: add, recall, recall_json, learn, forget, list_facts, stats, snapshot, restore, list_snapshots, health, capabilities.
TypeScript agents: always use `recall_json` — it returns a stable JSON array (`[]` when empty, never a human string).
`learn` accepts `messages` (same format as OpenAI chat) and extracts relevant facts using the configured LLM. Without `AI_KNOT_PROVIDER`/`AI_KNOT_API_KEY` it falls back to storing the last user message directly.
`health` returns `{"status":"ok","version":"..."}`. `capabilities` returns the full tool list as JSON — useful for introspection without a full `tools/list` round-trip.
Environment variables:
| Variable | Default | Description |
|---|---|---|
| `AI_KNOT_AGENT_ID` | `default` | Agent namespace |
| `AI_KNOT_STORAGE` | `sqlite` | `sqlite` (recommended) or `yaml` |
| `AI_KNOT_DATA_DIR` | `.ai_knot` | Base dir for file backends (use absolute path) |
| `AI_KNOT_DB_PATH` | — | Full path to SQLite file (overrides `DATA_DIR` for sqlite) |
Note: Claude Desktop launches processes from a non-interactive shell where `cwd` is undefined. Always set `AI_KNOT_DATA_DIR` or `AI_KNOT_DB_PATH` to an absolute path.
Which path should I use?
| Situation | Solution |
|---|---|
| OpenClaw TypeScript app (recommended) | generate_mcp_config() → paste into ~/.openclaw/openclaw.json |
| Python agent (LangChain, LangGraph, CrewAI) | OpenClawMemoryAdapter(kb) |
ai-knot works as an OpenClaw memory backend via MCP. Two steps:
pip install "ai-knot[mcp]"   # installs the ai-knot-mcp entry point
Note: `ai-knot` (without `[mcp]`) does not install `ai-knot-mcp`. The config will be generated but OpenClaw won't find the command.
Generate the config snippet:
import json
from ai_knot.integrations.openclaw import generate_mcp_config
print(json.dumps(generate_mcp_config("my_agent"), indent=2))
Paste the output into your OpenClaw config file:
- macOS / Linux: `~/.openclaw/openclaw.json`
- Windows: `%APPDATA%\OpenClaw\openclaw.json`
Your agent will have access to all ai-knot tools: add, recall, recall_json,
forget, list_facts, list_snapshots, stats, snapshot, restore.
For Python-native agents (LangChain, LangGraph, CrewAI), use the adapter class instead:
from ai_knot import KnowledgeBase
from ai_knot.integrations.openclaw import OpenClawMemoryAdapter
kb = KnowledgeBase("my_agent")
memory = OpenClawMemoryAdapter(kb)
memory.add([{"role": "user", "content": "Deploy on Fridays"}])
results = memory.search("deployment schedule")
memory.update(results[0]["id"], "Deploy on Thursdays")
memory.delete(results[0]["id"])
Multi-turn extraction: `add()` stores only the last user message and emits a warning if the list has more than one user message. For extracting multiple facts from a full conversation, use `kb.learn(turns, api_key=...)` directly.
`update()` assigns a new ID. The old fact is deleted and a new one is created. If you need to hold a stable reference, call `delete()` + `add()` yourself and record the returned ID.
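A sketch of that pattern, assuming (as above) that search results carry an `id` and that `kb.add()` returns the stored `Fact`; `my_refs` is a hypothetical mapping you maintain yourself:

```python
old_id = results[0]["id"]
memory.delete(old_id)                                  # drop the outdated fact
new_fact = kb.add("Deploy on Thursdays", type="procedural")
my_refs["deploy_schedule"] = new_fact.id               # stable handle you control
```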
No vendor lock-in. Swap backends with one line:
from ai_knot import KnowledgeBase
from ai_knot.storage import YAMLStorage, SQLiteStorage
# Development — zero infra:
kb = KnowledgeBase(agent_id="bot", storage=YAMLStorage())
# Production — single server:
kb = KnowledgeBase(agent_id="bot",
storage=SQLiteStorage(db_path="/data/agent.db"))
Same API. Same code. Different storage.
Cross-process safety (SQLite): `SQLiteStorage` implements `AtomicUpdateCapable` — `SharedMemoryPool.publish()` wraps the entire load→merge→save cycle in a `BEGIN EXCLUSIVE` SQLite transaction, preventing lost updates when multiple processes share the same `.db` file.
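A minimal sketch of two processes sharing one database file — each process runs code like this against the same path, and `publish()` serialises the merge (depending on your setup you may also need `pool.register()`, as in the multi-agent example further down):

```python
from ai_knot import KnowledgeBase
from ai_knot.knowledge import SharedMemoryPool
from ai_knot.storage import SQLiteStorage

storage = SQLiteStorage(db_path="/data/shared.db")
pool = SharedMemoryPool(storage=storage)

kb = KnowledgeBase(agent_id="worker_1", storage=storage)   # "worker_2" in the other process
kb.add("Queue backlog exceeded 10k messages", type="episodic", importance=0.6)
kb.publish(pool, utility_threshold=0.3)                    # BEGIN EXCLUSIVE guards the merge
```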
Storage is set once on KnowledgeBase. The LLM provider can be set at init (recommended
for production) or passed per learn() call. They are independent and combine freely:
from ai_knot import KnowledgeBase, ConversationTurn
from ai_knot.storage import SQLiteStorage
# Option A: configure provider once at init (recommended)
kb = KnowledgeBase(
agent_id="assistant",
storage=SQLiteStorage(db_path="./agent.db"),
provider="openai",
api_key="sk-...", # or reads OPENAI_API_KEY from env if omitted
)
kb.learn(turns) # no credentials needed per call
kb.learn(more_turns) # same provider reused
# Option B: pass provider per call (legacy, still supported)
kb = KnowledgeBase(agent_id="assistant", storage=SQLiteStorage(db_path="./agent.db"))
turns = [
ConversationTurn(role="user", content="I deploy everything in Docker"),
ConversationTurn(role="assistant", content="Got it!"),
]
kb.learn(turns, provider="openai") # reads OPENAI_API_KEY from env
kb.learn(turns, provider="openai", api_key="sk-...") # explicit key
kb.learn(turns, provider="anthropic", api_key="sk-ant-...") # Claude
kb.learn(turns, provider="openai-compat", # any compatible API
api_key="...", base_url="http://localhost:8000/v1")
# Recall never calls the LLM — no provider needed
context = kb.recall("how should I deploy this?")
Mix and match: any storage backend with any LLM provider.
recall_facts_with_scores() returns each fact together with its numeric relevance score.
The score is a hybrid value combining BM25 similarity to the query, retention
(power-law decay), and the fact's importance — higher is more relevant.
Use it when you need to filter or rank facts programmatically rather than inject them directly into a prompt:
scored = kb.recall_facts_with_scores("Docker deployment", top_k=5)
for fact, score in scored:
    print(f"[{score:.2f}] [{fact.type.value}] {fact.content}")
# [0.87] [procedural] User deploys everything in Docker
# [0.61] [episodic] Deploy failed last Tuesday at 3 PM
# Keep only highly relevant facts
relevant = [fact for fact, score in scored if score >= 0.5]
vs recall_facts() — use recall_facts() when you just need the Fact objects; use recall_facts_with_scores() when scores matter for downstream logic.
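For comparison, the plain variant returns only the `Fact` objects — a one-line sketch, assuming `top_k` works as in the scored call above:

```python
facts = kb.recall_facts("Docker deployment", top_k=5)
print([f.content for f in facts])   # same ranking, no scores attached
```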
| Type | When to use | Example |
|---|---|---|
| `semantic` | Stable facts about the user or world | "User works at Sber", "Stack is Python + FastAPI" |
| `procedural` | How the user wants things done | "Always use type hints", "Prefer pytest over unittest" |
| `episodic` | Specific past events with time context | "Deploy failed last Tuesday at 3 PM", "User approved the v2 design on Monday" |
Not sure which type to use? Default semantic covers most cases. Use procedural for
preferences and rules; episodic for dated events you might want to forget sooner.
Accumulating everything makes agents worse, not better. Irrelevant facts pollute the context window — this is called context rot.
ai-knot uses a power-law decay curve (Wixted & Ebbesen, 1997) — empirically superior to exponential decay (R²=98.9% vs 96.3%):
retention(t) = (1 + t / (9 × stability))^(-decay_exp)
stability = 336h × importance × (1 + ln(1 + access_count))
decay_exp = { semantic: 0.8, procedural: 1.0, episodic: 1.3 }
-> high importance + frequently accessed = remembered for months
-> low importance + never accessed = forgotten in weeks
-> semantic facts decay slower, episodic facts decay faster (Tulving 1972)
Power-law has a heavier tail than exponential — important facts persist realistically over months instead of vanishing after days. Decay exponent varies by memory type: core preferences (semantic) fade slower than event recollections (episodic). Facts accessed often get reinforced. Stale facts fade automatically.
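To make the curve concrete, here is a standalone sketch of the formula above (plain Python, not the library's internal implementation; time is in hours):

```python
import math

def retention(hours_since_access: float, importance: float,
              access_count: int, decay_exp: float) -> float:
    stability = 336 * importance * (1 + math.log(1 + access_count))
    return (1 + hours_since_access / (9 * stability)) ** (-decay_exp)

# High-importance semantic fact, accessed 30 times, 3 months later:
print(round(retention(24 * 90, 0.95, 30, 0.8), 2))   # ≈ 0.88 — still remembered
# Low-importance episodic fact, never accessed, 3 months later:
print(round(retention(24 * 90, 0.30, 0, 1.3), 2))    # ≈ 0.21 — effectively forgotten
```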
Do you need to call decay() manually? No — decay is applied automatically inside
every recall() call. For facts that are never recalled (e.g. background knowledge your
agent doesn't actively query), run kb.decay() in a daily cron job to keep retention
scores current.
Custom decay exponents:
kb = KnowledgeBase("agent", decay_config={
"semantic": 0.5, # even slower decay for core facts
"episodic": 2.0, # much faster decay for events
})
# Without decay_config → defaults: semantic=0.8, procedural=1.0, episodic=1.3
When an LLM provider is configured, additional capabilities activate:
# Auto-tagging: LLM generates domain tags during learn() — zero extra calls
kb = KnowledgeBase("agent", provider="openai", api_key="sk-...")
kb.learn(turns)
# → Fact(content="User prefers Python", tags=["python", "preferences"])
# Query expansion: LLM adds weighted synonyms before BM25 search (opt-in)
kb = KnowledgeBase("agent", provider="openai", api_key="sk-...",
llm_recall=True)
kb.recall("what database?")
# → expansion tokens "PostgreSQL", "SQL", "storage" added with weight 0.4
# → original query tokens keep weight 1.0 — no dilution
Without an LLM, everything works as before — tags via add(tags=[...]), raw queries via BM25.
ai-knot includes a zero-dependency Russian stemmer. No configuration needed — the tokenizer auto-detects Cyrillic script and applies Snowball-lite stemming:
kb = KnowledgeBase(agent_id="agent")
kb.add("Пользователь предпочитает Python для бэкенда")  # "User prefers Python for the backend"
kb.add("Развёртывание через Docker Compose")            # "Deployment via Docker Compose"
# Morphological variants match automatically:
# "запрещённые" and "запрещённых" (inflected forms of "forbidden") → same stem
context = kb.recall("запрещённые слова")                 # "forbidden words"
With llm_recall=True, the expansion prompt preserves the query language —
Russian queries get Russian synonyms, English queries get English synonyms.
All recall and decay methods accept a now parameter for deterministic testing
and time-travel scenarios:
from datetime import datetime, UTC
# Test how facts would rank at a specific point in time
future = datetime(2026, 12, 1, tzinfo=UTC)
context = kb.recall("deployment", now=future)
kb.decay(now=future)
RRF (Reciprocal Rank Fusion) weights are configurable:
# Default: BM25 dominates (50% influence), importance+retention 40%
kb = KnowledgeBase("agent", rrf_weights=(5.0, 2.0, 2.0, 1.0))
# Boost importance and retention signals
kb = KnowledgeBase("agent", rrf_weights=(3.0, 2.0, 2.0, 1.0))
Weights correspond to: (bm25, importance, retention, recency).
ai-knot show my_agent # list all stored facts
ai-knot recall my_agent "query" # test retrieval
ai-knot add my_agent "fact" # add a fact
ai-knot stats my_agent # counts, avg importance, retention
ai-knot decay my_agent # apply forgetting curve
ai-knot clear my_agent # wipe knowledge base
ai-knot export my_agent out.yaml # backup to file
ai-knot import my_agent in.yaml # restore from backup
MCP setup shortcut — prints a paste-ready config block for Claude Desktop or Claude Code:
ai-knot setup claude # defaults: sqlite, agent_id=default
ai-knot setup claude --agent-id bot --storage yaml --data-dir /data/mem
# .ai_knot/my_agent/knowledge.yaml — readable, editable, Git-trackable
a1b2c3:
  content: "User is a senior backend developer at Acme Corp"
  type: semantic
  importance: 0.95
  retention_score: 0.91
  access_count: 12
  created_at: '2026-03-01T10:00:00+00:00'
  last_accessed: '2026-03-27T09:00:00+00:00'
  tags: [user_profile, work]
  slot_key: "user::role" # slot-addressed (set automatically)
  canonical_surface: "senior backend developer at Acme Corp"
d4e5f6:
  content: "User prefers Python, dislikes async patterns"
  type: procedural
  importance: 0.85
  retention_score: 0.88
  access_count: 34
  topic_channel: devops # optional — route to a named channel
  visibility_scope: local # local = private to this agent (default: global)
Edit it by hand. Commit it to Git. Roll back when needed.
Insert multiple pre-extracted facts in a single storage round-trip, without any LLM call:
# Plain strings — use method-level defaults for type/importance/tags
kb.add_many(["User deploys on Fridays", "User uses Docker", "Stack: Python + FastAPI"])
# Dicts for full control per fact
kb.add_many([
{"content": "User is a senior backend engineer", "type": "semantic", "importance": 0.95},
{"content": "Always use type hints", "type": "procedural", "importance": 0.8},
{"content": "Sprint demo went well", "type": "episodic", "importance": 0.6},
])
# Mix strings and dicts — strings use method defaults
kb.add_many(
["Quick fact"],
type=MemoryType.PROCEDURAL,
importance=0.7,
)
Useful when facts come from an external source, are pre-processed by another tool, or the LLM extraction step is handled upstream.
All blocking operations have async variants that run in a thread-pool executor,
keeping the asyncio event loop free during LLM HTTP calls:
| Sync | Async |
|---|---|
| `kb.learn(turns, ...)` | `await kb.alearn(turns, ...)` |
| `kb.recall(query)` | `await kb.arecall(query)` |
| `kb.recall_facts(query)` | `await kb.arecall_facts(query)` |
import asyncio
from ai_knot import KnowledgeBase, ConversationTurn
kb = KnowledgeBase(agent_id="bot", provider="openai", api_key="sk-...")
# FastAPI handler — never blocks the event loop
async def handle_message(turns: list[ConversationTurn]) -> str:
    await kb.alearn(turns)
    return await kb.arecall("current topic")
# Concurrent extraction for multiple agents
kb_a = KnowledgeBase(agent_id="a", provider="openai", api_key="sk-...")
kb_b = KnowledgeBase(agent_id="b", provider="openai", api_key="sk-...")
results = await asyncio.gather(
kb_a.alearn(turns_a),
kb_b.alearn(turns_b),
)
# Abort slow LLM calls after 10 seconds (default: 30 s)
kb.learn(turns, provider="openai", api_key="sk-...", timeout=10.0)
# Split long conversations into chunks of 10 turns per LLM call (default: 20)
# Prevents silent fact loss when the LLM truncates a large JSON response
kb.learn(turns, provider="openai", api_key="sk-...", batch_size=10)
ai-knot ships with 6 providers for fact extraction:
| Provider | Name | Env var |
|---|---|---|
| OpenAI | `openai` | `OPENAI_API_KEY` |
| Anthropic (Claude) | `anthropic` | `ANTHROPIC_API_KEY` |
| GigaChat | `gigachat` | `GIGACHAT_API_KEY` |
| Yandex GPT | `yandex` | `YANDEX_API_KEY` |
| Qwen | `qwen` | `QWEN_API_KEY` |
| Any OpenAI-compatible | `openai-compat` | `LLM_API_KEY` |
kb.learn(turns, provider="anthropic") # uses ANTHROPIC_API_KEY from env
kb.learn(turns, provider="gigachat", api_key="...")
kb.learn(turns, provider="openai-compat", api_key="...", base_url="http://localhost:8080/v1")
from ai_knot import KnowledgeBase
from ai_knot.integrations.openai import MemoryEnabledOpenAI
kb = KnowledgeBase(agent_id="assistant")
kb.add("User prefers Python")
kb.add("User deploys on Docker")
client = MemoryEnabledOpenAI(knowledge_base=kb)
import openai
openai_client = openai.OpenAI()
messages = [{"role": "user", "content": "Write me a deployment script"}]
enriched = client.enrich_messages(messages)
response = openai_client.chat.completions.create(
model="gpt-4o",
messages=enriched,
)
ai-knot identifies what a fact is about using a slot_key in the form entity::attribute
(e.g. user::preferred_language, project::database). When learn() extracts a new fact whose
slot matches an existing one, it supersedes the old fact instead of creating a duplicate:
kb.learn([{"role":"user","content":"I use Python"}], provider="openai")
kb.learn([{"role":"user","content":"I switched to Go"}], provider="openai")
facts = kb.list_facts()
# 1 fact: "I switched to Go" (Python fact was closed, not duplicated)
Without an LLM, add() still does lexical deduplication on exact-match text.
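For example, repeating identical text with no provider configured should not create a second fact:

```python
kb.add("User deploys on Kubernetes")
kb.add("User deploys on Kubernetes")   # exact-match text → deduplicated, not stored twice
```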
Facts can carry a topic_channel (domain routing) and visibility_scope
("global" = shared across all agents in the pool, "local" = private):
from ai_knot import KnowledgeBase
from ai_knot.storage import SQLiteStorage
storage = SQLiteStorage(db_path="./team.db")  # any shared backend works
kb = KnowledgeBase(agent_id="devbot", storage=storage)
kb.add("Deploy uses Helm 3", topic_channel="devops", importance=0.9)
# recall with channel filter — only "devops" facts surface
context = kb.recall("deployment tools", topic_channel="devops")
In multi-agent scenarios pass topic_channel to SharedMemoryPool.recall() for domain-isolated retrieval.
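A sketch of that call, assuming `SharedMemoryPool.recall()` accepts the same `topic_channel` keyword (pool and storage setup mirror the shared-pool example below):

```python
from ai_knot.knowledge import SharedMemoryPool

pool = SharedMemoryPool(storage=storage)
results = pool.recall("deployment tools", "devbot", top_k=3, topic_channel="devops")
```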
publish() pushes private facts into the shared pool. Set utility_threshold to gate out low-signal facts:
pool = SharedMemoryPool(storage=storage)
# utility = state_confidence × importance — only facts above threshold are shared
kb.publish(pool, utility_threshold=0.3)
Trust is computed automatically from observed behaviour — no manual configuration:
results = pool.recall("query", "agent_b", top_k=3)
# only the 3 returned facts affect _used_count — not the wider overfetch window
trust = pool.get_trust("agent_a") # min(1, used/published) × (1 - quick_inv_rate)
An agent whose facts appear in recalled results often and are rarely superseded quickly earns trust ≈ 1.0. An agent that publishes stale or conflicting facts sees trust decay automatically.
sync_slot_deltas() exchanges only changed slots between agents instead of full fact sets:
# Pull lightweight delta records since last sync
deltas = pool.sync_slot_deltas("agent_b")
# Each SlotDelta carries: slot_key, op, version, content, prompt_surface
Delta token transfer is typically < 15% of full-sync volume.
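A sketch of consuming those deltas, using the fields listed in the comment above (the values shown in comments are illustrative):

```python
for delta in deltas:
    print(delta.slot_key, delta.op, delta.version)   # e.g. "project::database upsert 3"
    print("  →", delta.prompt_surface)               # polished text ready for the prompt
```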
Conversation Turns
|
[ Extractor ] LLM distillation → tri-surface facts (canonical/witness/prompt)
| + slot_key induction + ATC verification (Broder, 1997)
[ KnowledgeBase ] slot-based dedup + importance scoring + power-law decay
| + topic channels + publish gating + auto-trust
[ Storage Adapter ] YAML / SQLite / PostgreSQL
|
[ Retriever ] BM25 + slot-exact + char-trigram (Jaccard) + retention fusion
|
Context String injected into agent system prompt
Why BM25 + slot-exact instead of embeddings? For knowledge bases up to ~10k facts, BM25 with hybrid scoring (term relevance + retention + importance) is fast and deterministic. The char-trigram ranker closes the semantic gap for paraphrases; slot-exact matching ensures the most recent value for a given attribute always surfaces first. Semantic embeddings remain on the roadmap for large corpora.
Tri-surface retrieval: each fact carries three surfaces — canonical_surface (normalised form
for indexing), witness_surface (verbatim evidence supporting the fact), and prompt_surface
(polished recall text injected into the system prompt). Retrieval indexes the canonical surface;
output uses the prompt surface.
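A quick way to see the three surfaces on a stored fact — a sketch assuming they are exposed as attributes with these names (the YAML example above shows `canonical_surface` being persisted):

```python
fact = kb.list_facts()[0]
print(fact.canonical_surface)   # normalised form — what BM25 indexes
print(fact.witness_surface)     # verbatim evidence from the source conversation
print(fact.prompt_surface)      # polished text injected into the system prompt
```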
ATC verification: every LLM-extracted fact is verified against the source text using Asymmetric
Token Containment (ATC). Facts where fewer than 60% of tokens appear in the source are flagged as
supported=False, preventing hallucinated facts from polluting the knowledge base.
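Conceptually, ATC is one-directional token overlap: what fraction of the fact's tokens can be found in the source. A minimal sketch with naive whitespace tokenisation (the library's tokenizer may additionally lowercase, stem, and normalise):

```python
def atc(fact_text: str, source_text: str) -> float:
    fact_tokens = set(fact_text.lower().split())
    source_tokens = set(source_text.lower().split())
    return len(fact_tokens & source_tokens) / max(1, len(fact_tokens))

score = atc("deploy on kubernetes", "by the way we deploy on kubernetes")
supported = score >= 0.6    # 1.0 here — every fact token appears in the source
```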
Known limitation: extraction quality depends on the LLM. GPT-4o extracts nuanced procedural
facts reliably; smaller models (gpt-3.5-turbo, haiku) occasionally miss implicit preferences or
conflate episodic events with semantic facts. When accuracy matters, use a capable model for learn().
from ai_knot import KnowledgeBase, MemoryType
kb = KnowledgeBase(agent_id="assistant")
kb.add("User prefers Python", type=MemoryType.PROCEDURAL, importance=0.9)
kb.add("User deploys with Docker", importance=0.85)
kb.add("Deploy failed last Tuesday", type=MemoryType.EPISODIC, importance=0.4)
context = kb.recall("how to deploy?")
# -> "[procedural] User prefers Python
# [semantic] User deploys with Docker"
from ai_knot import KnowledgeBase, ConversationTurn
from ai_knot.storage import SQLiteStorage
kb = KnowledgeBase(agent_id="bot", storage=SQLiteStorage(db_path="./bot.db"))
turns = [ConversationTurn(role="user", content="I work with Python and FastAPI")]
kb.learn(turns, provider="openai") # reads OPENAI_API_KEY from env
context = kb.recall("what stack does user use?")
from ai_knot import KnowledgeBase, ConversationTurn
from ai_knot.storage import YAMLStorage
kb = KnowledgeBase(agent_id="bot", storage=YAMLStorage(base_dir=".ai_knot"))
turns = [ConversationTurn(role="user", content="Always write tests with pytest")]
kb.learn(turns, provider="anthropic", api_key="sk-ant-...")
# Facts are saved to .ai_knot/bot/knowledge.yaml — readable, Git-trackable
from ai_knot import KnowledgeBase, ConversationTurn
from ai_knot.storage import create_storage
storage = create_storage("postgres", dsn="postgresql://user:pass@db:5432/ai-knot")
kb = KnowledgeBase(agent_id="assistant", storage=storage)
turns = [ConversationTurn(role="user", content="Prefer concise answers")]
kb.learn(turns, provider="openai-compat",
api_key="...", base_url="http://localhost:8000/v1")
from ai_knot import KnowledgeBase
def handle_ticket(customer_id: str, message: str) -> str:
    kb = KnowledgeBase(agent_id=f"customer_{customer_id}")
    context = kb.recall(message)
    # Agent sees: past issues, preferences, tier — specific to this customer
    return context
from ai_knot import KnowledgeBase, MemoryType
from ai_knot.storage import YAMLStorage
kb = KnowledgeBase(agent_id="project", storage=YAMLStorage(".ai_knot"))
kb.add("Stack: FastAPI + PostgreSQL + Docker", importance=1.0)
kb.add("No unittest — use pytest only", type=MemoryType.PROCEDURAL, importance=0.9)
kb.add("All endpoints require JWT auth", importance=0.95)
# Commit .ai_knot/ to Git — new team members clone the context
from ai_knot.knowledge import KnowledgeBase, SharedMemoryPool
from ai_knot.storage import SQLiteStorage
from ai_knot.types import Fact
storage = SQLiteStorage(db_path="./team.db")
pool = SharedMemoryPool(storage=storage)
pool.register("researcher")
pool.register("writer")
researcher = KnowledgeBase(agent_id="researcher", storage=storage)
writer = KnowledgeBase(agent_id="writer", storage=storage)
# Researcher learns a fact and publishes it to the shared pool
fact = researcher.add("API rate limit is 100 req/s", importance=0.9)
pool.publish("researcher", [fact.id], kb=researcher)
# Writer queries the pool — sees researcher's fact, tagged with trust score
results = pool.recall("rate limits", "writer", top_k=3)
for fact, score in results:
    print(f"[{score:.2f}] (trust={pool.get_trust(fact.origin_agent_id):.2f}) {fact.content}")
# Delta sync: pull only changed slots since last check
deltas = pool.sync_slot_deltas("writer")
See `examples/shared_pool.py` for a complete walkthrough with slot supersession and trust evolution.
from ai_knot import KnowledgeBase
kb = KnowledgeBase(agent_id="assistant")
kb.add("User likes dark mode")
kb.add("User timezone is UTC+3")
stats = kb.stats()
print(f"Facts: {stats['total_facts']}")
print(f"Avg importance: {stats['avg_importance']:.2f}")
print(f"By type: {stats['by_type']}")
kb.decay() # apply power-law forgetting curve — stale facts lose retention score
- Core KnowledgeBase (extraction + storage + retrieval)
- Power-law forgetting curve (Wixted & Ebbesen, 1997)
- YAML + SQLite backends
- OpenAI integration
- CLI
- PostgreSQL backend
- Conflict resolution in `learn()` (cross-session deduplication)
- Snapshots (`snapshot`, `restore`, `diff`)
- MCP server (Claude Desktop / Claude Code)
- npm package (`npm install ai-knot`)
- OpenClaw integration (`OpenClawMemoryAdapter` + `generate_mcp_config`)
- Scored retrieval (`recall_facts_with_scores`)
- BM25 retrieval with p95-clip normalization (replaces TF-IDF)
- ATC fact verification guardrail (Broder, 1997)
- Offline eval framework (P@k, R@k, MRR, nDCG, bootstrap CI)
- CI quality gates (eval-smoke, eval-full)
- Type-aware forgetting (Tulving 1972 — semantic/episodic decay exponents)
- Per-agent trust matrix for multi-agent shared memory (Marsh 1994)
- Extended Porter stemmer with morphological invariants
- LLM auto-tagging during extraction (base + enhanced pattern)
- Configurable decay exponents via `decay_config`
- Opt-in LLM query expansion at recall time
- Tri-surface facts (canonical / witness / prompt surfaces)
- Slot-addressed deduplication (`slot_key` — supersede instead of duplicate)
- Char-trigram + slot-exact rankers (closes semantic gap without embeddings)
- Slot-delta sync for multi-agent (< 15% token volume vs full sync)
- Topic channels + visibility scope (`topic_channel`, `visibility_scope`)
- Publish gating (`utility_threshold` — filter low-signal facts from shared pool)
- Auto-trust from observed behaviour (replaces manual `update_trust()`)
- MCP `learn`, `health`, `capabilities` tools
- `ai-knot setup claude` CLI — paste-ready MCP config with absolute paths
- MongoDB backend
- Qdrant + Weaviate backends (benchmark backend exists; production driver planned)
- Semantic embeddings (sentence-transformers / OpenAI)
- LangChain / CrewAI integrations
- Web UI knowledge inspector
- REST API / sidecar mode
| | ai-knot | Mem0 | Zep | LangMem |
|---|---|---|---|---|
| Self-hosted | Yes | Partial | Yes | Yes |
| No cloud required | Yes | No | No | Yes |
| Pluggable storage | Yes | No | No | No |
| Human-readable store | Yes | No | No | No |
| Setup time | 30 sec | 10 min | 30 min | 5 min |
| Framework-agnostic | Yes | Partial | Partial | LangGraph only |
| Forgetting curve (type-aware power-law) | Yes | No | No | No |
| Slot-addressed dedup (no LLM needed) | Yes | No | No | No |
| Multi-agent auto-trust | Yes | No | No | No |
| Topic channels + publish gating | Yes | No | No | No |
| Fact verification (ATC) | Yes | No | No | No |
| Offline eval framework | Yes | No | No | No |
| Snapshots + diff | Yes | No | No | No |
| MCP server | Yes | No | No | No |
| Free forever | Yes (MIT) | No | No | Yes |
PRs welcome. Especially looking for: storage backend implementations, integration adapters, retrieval strategies.
MIT
Found a bug or a missing backend? Open an issue. Built something with it? We'd like to hear.