mingyk/coffloader

coffloader

External memory for AI agents — offload context to a VFS, index caller-provided summaries, retrieve on demand.

pip install coffloader              # core (BM25 search)
pip install "coffloader[embed]"     # + semantic search (sentence-transformers); quoted so shells like zsh do not expand the brackets

What it does

Agents accumulate context faster than any window allows. coffloader offloads content to storage, keeps a searchable index of summaries, and retrieves full content on demand.

write(content, summary) → store blob + index summary
search(query)           → top-k summaries + addresses
read(address)           → full content

Key constraints:

  • summary is required on write — your agent/LLM provides it, not coffloader
  • No LLM calls inside the library — pure storage and retrieval
  • Caller handles contradiction detection, dedup, and reasoning

Quick start

from coffloader import Coffloader

store = Coffloader()

# 1. Offload a conversation segment (summary comes from your agent)
store.write(
    content="[Turn 1] User: I was charged twice for order #9910...",
    summary="Customer reports duplicate charge on order #9910",
    metadata={"session_id": "ticket_8842", "segment": 1},
    path="/sessions/ticket_8842/seg_001.txt",
)

# 2. Later: search when user asks about earlier context
hits = store.search("order number", namespace="/sessions/ticket_8842/")

# 3. Load full content and inject into your LLM (search may return no hits)
if hits:
    text = store.read_text(hits[0].address)

The loop: offload cold context → search when needed → read and inject.
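The "search, then read and inject" half of the loop can be wrapped in a small helper. This is a minimal sketch, not part of coffloader: `recall` and `build_prompt` are hypothetical names, the separator and prompt layout are arbitrary, and the only thing assumed about a hit is the `.address` attribute used in the quick start.

```python
def recall(store, query, namespace=None, k=3):
    """Search summaries, then load the full text of each hit."""
    hits = store.search(query, k=k, namespace=namespace)
    # Only the .address attribute of each hit is relied on here.
    sections = [store.read_text(hit.address) for hit in hits]
    return "\n\n---\n\n".join(sections)


def build_prompt(question, recalled):
    """Prepend recalled context to the question (layout is an arbitrary choice)."""
    if not recalled:
        # Nothing was found, so the question goes to the LLM unchanged.
        return question
    return f"Earlier context:\n{recalled}\n\nQuestion: {question}"
```

Usage would look like `build_prompt(user_msg, recall(store, user_msg, namespace="/sessions/ticket_8842/"))`, with the result passed to whatever LLM client the agent uses.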


API

store = Coffloader(
    backend=None,           # default: in-memory VFS
    max_bytes=512_000,      # default: 512 KB — reject oversized payloads
    on_oversize="reject",   # "reject" or "metadata_only"
    hybrid=True,            # default: True — use BM25 + embeddings if available
    min_similarity=0.3,     # default: 0.3 — filter out weak embedding matches
                            # lower = more results, less relevant
                            # higher = fewer results, more relevant  
                            # set to 0.0 to disable filtering
)

# Store content with a caller-provided summary
result = store.write(content, summary, metadata={}, path=None)

# Search indexed summaries (returns TocEntry list, not full content)
hits = store.search(query, k=5, filters={}, namespace=None)
#                         ^^^ number of results to return

# Load full content
data = store.read(address)          # bytes
text = store.read_text(address)     # str

# Check size before writing
check = store.inspect(content)      # .acceptable, .byte_count

# Delete
store.delete(address)

Defaults are exposed as class attributes:

Coffloader.DEFAULT_MAX_BYTES       # 512_000
Coffloader.DEFAULT_MIN_SIMILARITY  # 0.3
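Since `inspect` reports `.acceptable` and `.byte_count`, a caller can check a payload before writing it. The helper below is a sketch under assumptions: `write_if_acceptable` is a hypothetical name, and skipping the write (rather than splitting or shrinking the content) is just one possible policy.

```python
def write_if_acceptable(store, content, summary, path=None, metadata=None):
    """Write only when store.inspect() reports the payload as acceptable.

    Returns the result of store.write(), or None if the payload was skipped.
    """
    check = store.inspect(content)
    if not check.acceptable:
        # Hypothetical policy: skip and let the caller split or shrink the payload.
        print(f"skipped {path!r}: {check.byte_count} bytes not acceptable")
        return None
    return store.write(content, summary, metadata=metadata or {}, path=path)
```

With `on_oversize="metadata_only"` the library records oversized content as metadata only, so a pre-check like this mainly matters when the caller wants to decide what happens to oversized content itself.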

Composite backends

Route paths to different storage:

from coffloader import Coffloader, CompositeBackend, LocalBackend, MemoryBackend

store = Coffloader(
    backend=CompositeBackend(
        default=MemoryBackend(),
        routes={"/archive/": LocalBackend(root="./data")},
    )
)
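How `CompositeBackend` picks a route is not spelled out here. One plausible reading is prefix matching on the path, with the longest matching prefix winning and `default` used otherwise. The standalone function below illustrates that idea only; `resolve_backend` is a hypothetical name and the matching rule is an assumption, not coffloader's documented behavior.

```python
def resolve_backend(path, routes, default):
    """Return the backend for `path`, preferring the longest matching prefix.

    ASSUMPTION: routes are matched by path prefix; the real rule may differ.
    """
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        return default
    return routes[max(matches, key=len)]
```

Under this reading, a write to `/archive/2024/log.txt` would land in the `LocalBackend`, while `/sessions/abc/seg_001.txt` would stay in memory.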

Patterns

Long session (segmented): Offload every ~15 turns. Search returns precise segments, not the whole transcript.

store.write(content=turns_1_15, summary="...", path="/sessions/abc/seg_001.txt")
store.write(content=turns_16_30, summary="...", path="/sessions/abc/seg_002.txt")
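Segmenting can be automated with a small generator. This is a sketch assuming the transcript is a list of strings, one per turn; `segment_turns` is a hypothetical helper, and the path layout simply mirrors the example above.

```python
def segment_turns(turns, session_id, size=15):
    """Yield (path, text) pairs, one per block of `size` consecutive turns."""
    for index, start in enumerate(range(0, len(turns), size), start=1):
        text = "\n".join(turns[start:start + size])
        yield f"/sessions/{session_id}/seg_{index:03d}.txt", text
```

Each pair can then be passed to `store.write(content=text, summary=..., path=path)`, with the summary still supplied by your agent.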

Tool output: Offload large grep/API results with a structural summary (no LLM needed).

store.write(
    content=grep_output,
    summary=f"grep error src/ → {n} matches",
    path=f"/active/{session}/tool_001.txt",
)
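A structural summary like the one above can be computed directly from the tool output. The sketch below assumes `grep -rn`-style lines of the form `path:line_number:text`; `summarize_grep` is a hypothetical helper, not a coffloader function.

```python
def summarize_grep(pattern, root, output):
    """Build a one-line structural summary from grep -rn style output."""
    lines = [line for line in output.splitlines() if line.strip()]
    # ASSUMPTION: each match line looks like "path:line_number:text".
    files = {line.split(":", 1)[0] for line in lines}
    return f"grep {pattern} {root} → {len(lines)} matches in {len(files)} files"
```

Including the file count gives the search index one more keyword-rich detail to match against, at no LLM cost.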

Multi-agent: Use namespaces for isolation (/agent/{id}/) or sharing (/shared/).


Limits

  • Max payload: 512 KB by default (configurable)
  • Oversized content is rejected or recorded as metadata-only
  • No silent truncation

Status

Pre-alpha. Core API is stable: write, search, read, inspect, delete.

Working:

  • BM25 (keyword) search via SQLite FTS5
  • Semantic search via [embed] optional extra
  • Hybrid search (BM25 + embeddings) with Reciprocal Rank Fusion
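For readers unfamiliar with Reciprocal Rank Fusion: each result list contributes 1 / (k + rank) to an item's score, and items are re-sorted by the summed score. The sketch below illustrates the general technique, not coffloader's internal code; the constant `k=60` is the value commonly used in the literature and is an assumption here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids; each list is ordered best-first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_order = ["a", "b", "c"]       # keyword ranking (made-up ids)
embedding_order = ["b", "d", "a"]  # semantic ranking (made-up ids)
print(reciprocal_rank_fusion([bm25_order, embedding_order]))
# → ['b', 'a', 'd', 'c']
```

Because only ranks are used, the BM25 and embedding scores never need to be normalized to a common scale.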

Not yet implemented:

  • Persistent index to disk
  • Sharded TOC for large corpora

Non-goals

  • LLM calls from the library
  • Automatic dedup, contradiction detection, or memory merge
  • Knowledge graphs or hierarchical rollups

License

MIT
