A self-hosted, local-first personal AI coordination system that gives every LLM agent on your machine -- Claude Code sessions, Ollama agents, scripts importing the cos package -- a shared memory, a shared work queue, and a shared awareness of every other agent. A background daemon harvests session transcripts, runs specialist agents against them, and feeds the results back into the same database that the next session will read at startup.
CoS is for developers who already use multiple AI coding agents and want them to remember decisions, share context, and coordinate work without relying on a cloud SaaS. Everything runs against your own PostgreSQL + pgvector database and your own Ollama instance. Cloud LLMs are an optional fallback, not the default.
| System | Local-first | Cross-session memory | Background daemon | Multi-agent swarm | Self-expanding | Harvests existing agent transcripts |
|---|---|---|---|---|---|---|
| CoS | Yes (Ollama) | Yes (pgvector) | Yes | Yes | Yes (TriggerAgentDesigner) | Yes (Claude Code JSONL) |
| CrewAI | Any model | Partial | No | Yes (roles) | No | No |
| AutoGen / MS Agent Framework | Azure-first | Partial | No | Yes | No | No |
| LangGraph | Any model | Tiered | No | Yes (graph) | No | No |
| mem0 | Yes | Yes (LLM facts) | No | No | No | No |
| Letta / MemGPT | Yes | Yes (tiered) | No (agent-driven) | Limited | Partial | No |
| Zep / Graphiti | Self-host | Yes (temporal KG) | No | No | No | No |
| Cognee | Yes | Yes (vector+graph) | No (library) | No | Partial | No |
| Khoj | Yes | Document index | No | No | No | No |
| Claude Memory Tool | Client-side | File-based | No | No | No | No |
The two distinguishing properties are the always-on daemon that harvests Claude Code session JSONL transcripts and feeds them to specialist agents, and the self-expanding agent taxonomy -- when the system encounters a session pattern it doesn't recognize, it can code-generate, register, and hot-load a new specialist agent at runtime. No other tool in the survey combines those with a swarm registry and an urgency-tiered persistent agenda. (See COS_COMPARISON.md for the full survey of 40+ peer systems.)
CoS acts as a persistent "Chief of Staff" layer that sits between you and your AI agents. In a typical workflow, you might have several Claude Code sessions open across different projects, an Ollama-based agent running a code review, and a background process ingesting meeting notes -- all at the same time. Without coordination, each of these agents operates in isolation: they don't know what the others are doing, they can't share context, and decisions made in one session are invisible to the rest.
CoS solves this by providing:
- Shared memory -- Every agent writes to and reads from the same semantic search database. A decision recorded in a morning Claude Code session is instantly recallable by an afternoon Ollama agent working on a different project.
- A persistent work queue -- Agenda items survive across sessions. You can add "deploy v2.1" at 9am, and whichever agent you open at 3pm will see it waiting.
- Swarm awareness -- Every active agent registers itself. Agents can see who else is running, what they're working on, and send messages to each other.
- Autonomous background processing -- A daemon continuously harvests session transcripts, extracts decisions and action items, discovers dropped ideas, and surfaces them as agenda items -- all using local LLM inference to keep costs near zero.
- Intelligent LLM routing -- Rather than always calling a cloud API, CoS routes each task to the best available model. Embedding and triage run locally on Ollama. Complex reasoning falls back to cloud only when needed.
You're juggling multiple projects with AI agents. You have Claude Code open in three terminals -- one for a backend service, one for a frontend, one for infrastructure. A decision in the backend session ("we're switching from REST to gRPC") should be visible when the frontend agent asks about API integration. CoS makes this automatic: the daemon's SessionHarvestWorker picks up the transcript, the DecisionRecorder agent extracts the decision, and it becomes searchable via cos recall "API protocol" from any session.
You lose track of what was discussed and decided. After a long coding session, it's easy to forget the three minor decisions you made, the two ideas you deferred, and the one bug you noticed but didn't fix. CoS's specialist agents -- DecisionRecorder, VisionKeeperExpert, ActionItemExtractor, BugRecorder -- run automatically against your session transcripts and surface everything as structured, searchable records.
You want AI agents to work autonomously in the background. From the web dashboard, you can spawn pre-configured agents to run code reviews, fix failing tests, run linters, perform security audits, or write documentation -- all without occupying a terminal session. The agent task queue coordinates everything.
You want to minimize cloud API costs. CoS's routing layer prefers local Ollama models for everything they can handle well (embedding, classification, summarization, simple extraction) and only falls back to cloud APIs for tasks that genuinely require it. On typical hardware (M3 Max, 64GB), this reduces cloud API spend by roughly 95% compared to routing everything through Claude or GPT.
You want a morning briefing that knows what happened yesterday. Run cos brief and get a formatted summary of your priorities, open agenda items, active projects, and what the swarm has been doing -- all pulled from the database, not regenerated from scratch.
Here's a typical day with CoS running:
- 8:00am -- You open a Claude Code session. CoS bootstraps automatically: registers the session as a boss, checks for messages from other agents, and shows you any urgent agenda items and hot items from overnight processing.
- 8:05am -- You start coding. In the background, the daemon is already processing last night's sessions. The TranscriptAnomalyExpert found a spec discussion in yesterday's evening session and queued a SpecFileExpert, which extracted 4 action items and registered a formal spec document.
- 10:30am -- You finish a feature and open a new session in a different project. CoS shows you the 4 action items from the spec as `this_week` urgency. You also see that Boss B (an Ollama agent running a security audit) finished and found 2 medium-severity issues.
- 2:00pm -- From the web dashboard, you spawn a "fix tests" agent against the morning project. It claims the task from the queue, runs the test suite, fixes 3 failures, and marks the task complete.
- 5:00pm -- You run `cos brief` before shutting down. It shows you what was accomplished, what's still open, and what the VisionKeeperExpert surfaced as a dropped idea worth revisiting.
One of CoS's most powerful features is its self-expanding trigger pipeline. When the TranscriptAnomalyExpert encounters a conversation pattern it doesn't recognize, it proposes a new trigger type. The TriggerReviewExpert evaluates whether it's genuinely new or a duplicate. If it's new, the TriggerAgentDesigner writes a new specialist agent -- including the Python code, the system prompt, and the database registration -- and hot-loads it into the running daemon. The system literally grows new capabilities by observing your work.
flowchart LR
SESSION["New session<br/>transcript"] --> TAE["TranscriptAnomaly<br/>Expert"]
TAE -->|"known trigger"| SPECIALIST["Existing specialist<br/>(SpecFileExpert,<br/>DecisionRecorder, etc.)"]
TAE -->|"unknown pattern"| PROPOSE["Propose new<br/>trigger type"]
PROPOSE --> TRE["TriggerReview<br/>Expert"]
TRE -->|"duplicate/noise"| REJECT["Rejected"]
TRE -->|"genuinely new"| TAD["TriggerAgent<br/>Designer"]
TAD --> NEW["New agent written,<br/>registered, and<br/>hot-loaded"]
style TAE fill:#4a4a6a,color:#fff
style TAD fill:#2d6a4f,color:#fff
style NEW fill:#2d6a4f,color:#fff
graph TB
subgraph Interfaces
CLI["cos CLI"]
API["Python API<br/><code>import cos</code>"]
DASH["Web Dashboard<br/>:7432"]
end
subgraph Daemon["cos-daemon (background workers)"]
SH["SessionHarvest<br/>120s"]
MS["MemorySync<br/>300s"]
SW["SwarmWorker<br/>60s"]
CU["ContextUpdate<br/>120s"]
LD["LLMDiscovery<br/>300s"]
BW["BossWakeup<br/>180s"]
end
subgraph Storage
PG["PostgreSQL + pgvector<br/><code>chief_of_staff</code>"]
end
subgraph Inference
OL["Ollama (local)"]
CL["Cloud LLMs<br/>(fallback)"]
end
CLI --> PG
API --> PG
DASH --> PG
SH --> PG
MS --> PG
SW --> PG
CU --> PG
LD --> OL
SH --> OL
BW --> PG
API --> OL
API --> CL
style PG fill:#336791,color:#fff
style OL fill:#1a1a2e,color:#fff
style CL fill:#4a4a6a,color:#fff
graph LR
B1["Boss A<br/>(Claude Code)"] -- heartbeat --> SWARM["Swarm Registry<br/><code>ops.bosses</code>"]
B2["Boss B<br/>(Ollama agent)"] -- heartbeat --> SWARM
B1 -- "spawns" --> M1["Minion"]
B1 -- "queues" --> AT["Agent Task<br/><code>ops.agent_tasks</code>"]
AT -- "claimed by" --> AG["Specialist Agent<br/>(TranscriptAnomalyExpert, etc.)"]
B1 -. "message" .-> B2
style SWARM fill:#2d6a4f,color:#fff
style AT fill:#6a2d4f,color:#fff
| Term | Description |
|---|---|
| Boss | Any LLM agent (Claude Code, Ollama, Codex) registered in the swarm. Each boss has a name, model, working directory, and heartbeat. A single developer might have 3-5 bosses active simultaneously across different projects. |
| Minion | A subagent spawned by a boss for a specific subtask. For example, a boss working on a feature might spawn a minion to run the test suite. Minions are tracked in ops.minions with their parent boss, task, and result. |
| Agent task | A queued job for a specialist agent. Unlike minions (which are ad-hoc), agent tasks are typed: TranscriptAnomalyExpert, SpecFileExpert, DecisionRecorder, etc. The daemon claims and runs them automatically. |
| Agenda item | A work item with an urgency level (now, today, this_week, soon, later). Items can be created by humans (cos add), by agents (ActionItemExtractor), or by the dashboard. They persist until explicitly resolved. |
| Memory/RAG | Semantic vector search across three sources: memories (typed notes with importance), facts (subject-predicate-object triples), and document chunks (auto-split with 50% overlap). All embeddings are 768-dimensional via nomic-embed-text. |
CoS ships with 16 baseline specialist agents, each designed for a single responsibility. New agents can be code-generated and hot-loaded at runtime by the TriggerAgentDesigner, so the live registry typically grows past the baseline. The current registry is defined in cos/agents.py (AGENT_REGISTRY).
graph TB
subgraph "Session Analysis Pipeline"
TAE["TranscriptAnomalyExpert<br/><i>Classifies sessions against<br/>trigger taxonomy</i>"]
SFE["SpecFileExpert<br/><i>Extracts specs from design<br/>discussions, creates action items</i>"]
VKE["VisionKeeperExpert<br/><i>Finds dropped ideas and<br/>deferred thoughts</i>"]
DR["DecisionRecorder<br/><i>Extracts decisions (tech choices,<br/>config, architecture)</i>"]
AIE["ActionItemExtractor<br/><i>Pulls concrete tasks<br/>and next steps</i>"]
BR["BugRecorder<br/><i>Logs bug context,<br/>creates fix items</i>"]
ADE["ArchDocExpert<br/><i>Creates architecture notes<br/>from design discussions</i>"]
end
subgraph "Self-Expansion Pipeline"
TRE["TriggerReviewExpert<br/><i>Evaluates proposed new<br/>trigger types</i>"]
TAD["TriggerAgentDesigner<br/><i>Writes and registers<br/>new agents</i>"]
end
TAE -->|"spec_session"| SFE
TAE -->|"vision_drift"| VKE
TAE -->|"decision_made"| DR
TAE -->|"action_items"| AIE
TAE -->|"bug_session"| BR
TAE -->|"architecture_discussion"| ADE
TAE -->|"unknown"| TRE
TRE -->|"genuinely new"| TAD
style TAE fill:#4a4a6a,color:#fff
style TAD fill:#2d6a4f,color:#fff
| Agent | Trigger | What it produces |
|---|---|---|
| TranscriptAnomalyExpert | Every significant session | Classification + delegation to the right specialist |
| SpecFileExpert | `spec_session` | Formal spec document in `ops.specs`, action items on the agenda |
| VisionKeeperExpert | `vision_drift` | [Vision] agenda items for dropped ideas (importance >= 6) |
| DecisionRecorder | `decision_made` | Records in `ops.decisions`, vectorized for recall |
| ActionItemExtractor | `action_items` | Agenda items with urgency, assigned to the source project |
| BugRecorder | `bug_session` | Bug summaries vectorized for search, fix items if unresolved |
| ArchDocExpert | `architecture_discussion` | Architecture notes ingested as documents |
| TriggerReviewExpert | Unknown trigger proposed | Dedup check, approves or rejects proposals |
| TriggerAgentDesigner | Approved new trigger | New agent code written, registered, and hot-loaded |
| Requirement | Version | Notes |
|---|---|---|
| Python | 3.11+ | For local install; Docker handles this otherwise |
| PostgreSQL | 14+ | With pgvector extension enabled |
| Ollama | latest | Running locally or on a reachable host |
| Docker + Compose | latest | For containerized deployment only |
CoS is designed for multi-host deployment. A typical setup has a server (Linux) that hosts the database and the canonical repo, and one or more client machines (macOS laptops, other Linux hosts) that connect to the same database. Every machine runs its own daemon to harvest local sessions and keep local status files current.
The examples below use /srv/apps/cos as the server install path -- substitute any path you prefer. The bootstrap script in ~/.claude/CLAUDE.md probes both /srv/apps/cos/.venv/bin/python (Linux server) and ~/.cos/venv/bin/python (client) by default; if you install elsewhere, edit the probe paths in your bootstrap accordingly.
graph TB
subgraph Server ["Server (Linux)"]
REPO["Git repo<br/>/srv/apps/cos"]
VENV_S[".venv"]
DAEMON_S["cos-daemon<br/>(Docker or systemd)"]
PG["PostgreSQL + pgvector"]
OL["Ollama"]
end
subgraph Laptop ["Client (macOS laptop)"]
VENV_L["~/.cos/venv<br/>(pip install from git)"]
DAEMON_L["cos-daemon<br/>(LaunchAgent)"]
CLAUDE_L["Claude Code sessions<br/>(~/.claude/)"]
end
DAEMON_S --> PG
DAEMON_L --> PG
DAEMON_S --> OL
DAEMON_L -.->|"optional"| OL_L["Local Ollama"]
CLAUDE_L --> DAEMON_L
style PG fill:#336791,color:#fff
style OL fill:#1a1a2e,color:#fff
The server hosts the git repo, the database, and optionally runs Ollama for local inference.
# Clone the repo
git clone https://github.com/YOUR-USER/cos.git /srv/apps/cos
cd /srv/apps/cos
# Create a virtualenv and install
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
# Verify installation
cos status
This installs two entry points:
| Command | Description |
|---|---|
| `cos` | The CLI (all subcommands) |
| `cos-daemon` | The background daemon |
Client machines don't need a full repo checkout -- just pip install the package from the git remote. The venv lives at ~/.cos/venv so it's colocated with the rest of the CoS state.
# Install cos from the git repo (no local clone needed)
python3 -m venv ~/.cos/venv
~/.cos/venv/bin/pip install git+https://github.com/YOUR-USER/cos.git
# Verify it can reach the database
~/.cos/venv/bin/python -c "import cos; cos.briefing()"
To upgrade later:
~/.cos/venv/bin/pip install --upgrade git+https://github.com/YOUR-USER/cos.git
A pre-built image is available from the container registry:
# Pull from registry
docker pull ghcr.io/YOUR-USER/cos:latest
# Or build locally
make build
Run via docker-compose:
# Start the daemon (background)
make up
# Run CLI commands inside the container
make cli CMD="status"
make cli CMD="recall 'some query'"
# View daemon logs
make logs
# Stop everything
make down
To build and push a new image:
docker build --provenance=false -t ghcr.io/YOUR-USER/cos:latest .
docker push ghcr.io/YOUR-USER/cos:latest
Note: The `--provenance=false` flag is required. Without it, Docker attaches a build attestation manifest that GitLab's container registry doesn't support (you'll see "invalid tag: missing manifest digest").
The docker-compose binds host paths into the container so the daemon reads ~/.claude/ sessions directly and writes status files to ~/.cos/ on the host filesystem:
volumes:
- ${HOME}/.cos:/root/.cos # status.json, hot_items.md, harvested_sessions.json
  - ${HOME}/.claude:/root/.claude:ro  # session JSONL files, memory markdown files
This is critical -- if ~/.cos is a Docker volume instead of a host bind mount, the statusline and session bootstraps won't see the daemon's output. If ~/.claude isn't mounted, the SessionHarvestWorker won't find any sessions to process.
CoS runs fully standalone by default. The myma integration (meeting-notes
sync + notebook push from SpecFileExpert/PlanExpert) is opt-in.
To enable it:
- Set `COS_ENABLE_MYMA=1` in your `.env` (and optionally `COS_MYMA_URL` / `COS_MYMA_TOKEN` if your myma instance isn't on the defaults).
- Start the stack with the `myma` profile so the daemon/CLI containers join the `myma_myma-net` network and mount the `myma_myma-db` volume: `docker compose --profile myma up -d`
This brings up `cos-daemon-myma` (and `cos-cli-myma` for one-shot CLI runs) instead of the plain `cos-daemon`. Without the profile, no myma volumes or networks are touched.
When COS_ENABLE_MYMA is unset, all myma calls in cos.integrations.myma
short-circuit to no-ops.
Create the database with pgvector, then apply migrations:
# Create database (if it doesn't exist)
createdb -h <host> -U postgres chief_of_staff
psql -h <host> -U postgres -d chief_of_staff -c 'CREATE EXTENSION IF NOT EXISTS vector;'
# Apply migrations in order
psql -h <host> -U postgres -d chief_of_staff -f migrations/001_swarm_and_pipeline.sql
psql -h <host> -U postgres -d chief_of_staff -f migrations/002_meeting_assistant_sync.sql
erDiagram
ops_bosses {
uuid id PK
text name
text model
text working_dir
timestamp last_seen
text status
}
ops_agenda_items {
uuid id PK
text title
text body
text urgency
text category
timestamp resolved_at
}
ops_agent_tasks {
uuid id PK
text agent_type
text context_ref
jsonb context_raw
int priority
text status
}
rag_memories {
uuid id PK
text mem_type
text title
text body
vector embedding
int importance
}
rag_chunks {
uuid id PK
uuid doc_id FK
text content
vector embedding
}
rag_documents {
uuid id PK
text title
text source
text doc_type
}
sessions_conversations {
uuid id PK
text project_dir
text summary
}
sessions_turns {
uuid id PK
uuid conversation_id FK
text role
text content
int turn_index
}
rag_documents ||--o{ rag_chunks : "chunked into"
sessions_conversations ||--o{ sessions_turns : "contains"
ops_bosses ||--o{ ops_agent_tasks : "spawns"
The full schema spans five Postgres schemas:
| Schema | Purpose |
|---|---|
| `principal` | User profile, preferences, sensitivities, values, habits |
| `people` | Contacts, relationships, interaction history |
| `ops` | Bosses, agenda, agent tasks, triggers, specs, LLM nodes, routing log |
| `rag` | Memories, documents, chunks, facts (all with vector embeddings) |
| `sessions` | Conversation transcripts, turns, breadcrumbs |
# Required: embedding model
ollama pull nomic-embed-text
# Recommended: local inference models
ollama pull qwen2.5:7b # triage, classification (fast)
ollama pull qwen2.5:14b # summarization
ollama pull qwen2.5-coder:14b # code generation
See COSWORK.md for the full cost/quality analysis of local vs. cloud models.
All configuration is via environment variables. Set them in your shell, a .env file (loaded by docker-compose), or export before running.
# Example .env file
COS_DB_HOST=db.example.internal
COS_DB_PORT=5432
COS_DB_NAME=chief_of_staff
COS_DB_USER=postgres
COS_DB_PASS=
OLLAMA_BASE=http://127.0.0.1:11434
COS_EMBED_MODEL=nomic-embed-text
COS_NOTIFY_BACKEND=log
COS_AUTO_SPAWN=true
COS_DASHBOARD_PORT=7432
| Variable | Default | Description |
|---|---|---|
| `COS_DB_HOST` | `db.example.internal` | PostgreSQL host (use `postgres` when running via the bundled `docker-compose.yml`, since that is the compose service name) |
| `COS_DB_PORT` | `5432` | PostgreSQL port |
| `COS_DB_NAME` | `chief_of_staff` | Database name |
| `COS_DB_USER` | `postgres` | Database user |
| `COS_DB_PASS` | (empty) | Database password |
| `OLLAMA_BASE` | `http://127.0.0.1:11434` | Ollama API base URL |
| `COS_EMBED_MODEL` | `nomic-embed-text` | Embedding model name |
| `COS_NOTIFY_BACKEND` | `macos` (Darwin) / `log` (Linux) | Notification backend(s), comma-separated |
| `COS_AUTO_SPAWN` | `true` | Auto-spawn agents from the task queue |
| `COS_DASHBOARD_PORT` | `7432` | Web dashboard port |
| `COS_MEETING_NOTES_DIR` | `/data/meeting-notes` | Meeting notes directory |
| `COS_MEETING_DB` | `/data/myma/meetings.db` | Meeting assistant SQLite DB |
| `COS_OLLAMA_NODES` | (empty) | Extra Ollama node URLs (comma-separated) |
cos Show status (default when no subcommand)
cos status Dashboard overview: bosses, agenda, tasks, LLM nodes
cos brief Full morning briefing
cos ls [--all] [--cat] List agenda items grouped by urgency
cos add <title> Add agenda item
cos done <item_id> Resolve an agenda item
cos recall <query> Semantic search across all memory
cos swarm Show all registered bosses
cos tasks [--all] Show agent task queue
cos specs [id] List or view specs
cos routing Show LLM routing stats
cos sessions List harvested sessions
cos dashboard Open web dashboard in browser
cos daemon Start the background daemon (foreground)
# Quick status check
cos status
# Add work items at different urgency levels
cos add "Review PR #42" --urgency today --category code-review
cos add "Investigate flaky test" --urgency this_week --body "test_auth_flow fails intermittently"
cos add "Upgrade dependencies" --urgency later
# Search memory semantically
cos recall "authentication middleware changes"
cos recall "deployment process" -n 10
# Mark something done
cos done 3fa8b2c1-...
# Morning briefing
cos brief
# See who's active in the swarm
cos swarm
# Check agent task queue
cos tasks
cos tasks --all # include completed
# LLM routing performance
cos routing
# Open the web dashboard
cos dashboard
# Start daemon in foreground (for debugging)
cos daemon
You sit down and want to know what happened overnight:
# What's the system state?
cos status
# Full briefing with priorities, agenda, and swarm activity
cos brief
# Anything about that deploy discussion yesterday?
cos recall "production deploy timeline"
# See what agents ran overnight
cos tasks --all
# Check if any specs were generated from yesterday's architecture chat
cos specs --project my-backend
You're midway through a session and want to note things for later:
# Quick capture -- will show up in your next briefing
cos add "Revisit caching strategy for /api/users" --urgency soon
# High-priority item that should be addressed today
cos add "Fix broken migration on staging" --urgency now --category fix
# Done with the migration fix
cos done <item_id>
make build            # docker compose build
make up # start cos-daemon (background)
make down # stop all services
make restart # restart cos-daemon
make logs # tail daemon logs
make cli CMD="status" # run any CLI subcommand in container
make status # shortcut for cos status
make agenda # shortcut for cos agenda
make routing # shortcut for cos routingThe package exports a flat public API via import cos. Every function listed below works from any Python process -- a script, a REPL, a Jupyter notebook, or inside a running LLM agent session.
CoS provides three kinds of semantic storage, all searchable through a single recall() call:
- Memories -- Typed notes (project, feedback, user, reference) with an importance score. Good for capturing context, decisions, and observations.
- Facts -- Subject-predicate-object triples with a confidence score. Good for structured knowledge ("auth-middleware is-blocked-by legal-review").
- Document chunks -- Long-form text auto-split into 400-word overlapping chunks. Good for specs, meeting notes, architecture docs.
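The 400-word / 50%-overlap splitting rule can be sketched in a few lines. This is an illustration of the rule, not the actual splitter inside `cos.ingest_document`:

```python
def chunk_words(text: str, chunk_size: int = 400, overlap: float = 0.5) -> list[str]:
    """Split text into word-based chunks where each chunk shares
    `overlap` of its words with the previous one (50% by default)."""
    words = text.split()
    step = max(1, int(chunk_size * (1 - overlap)))  # 200-word stride at 50% overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks
```

A 1000-word document yields four chunks, each sharing its first 200 words with the tail of the previous chunk.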
flowchart LR
INPUT["Text input"] --> CHUNK["Chunking<br/>(400 words, 50% overlap)"]
CHUNK --> EMBED["Embedding<br/>(nomic-embed-text)"]
EMBED --> STORE["pgvector storage"]
QUERY["Search query"] --> QEMBED["Embed query"]
QEMBED --> SIM["Cosine similarity<br/>search"]
STORE --> SIM
SIM --> RESULTS["Ranked results<br/>(memories + facts + chunks)"]
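The ranking step in the diagram above is plain cosine similarity between the query embedding and each stored vector. A minimal sketch (in production pgvector computes this inside Postgres over the 768-dimensional embeddings):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank(query_vec: list[float], stored: list[tuple[str, list[float]]]):
    """stored: list of (title, vector); most similar first."""
    return sorted(stored, key=lambda item: cosine_similarity(query_vec, item[1]),
                  reverse=True)
```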
Python:
import cos
# Store a memory (auto-embedded)
cos.remember("project", "Auth rewrite",
"Rewriting auth middleware for compliance.",
importance=8, domain="security")
# Semantic search
results = cos.recall("authentication changes", n=5)
for r in results:
print(r["title"], r["similarity"])
# Store a structured fact (subject-predicate-object triple)
cos.add_fact("auth-middleware", "blocked_by", "legal-review", confidence=0.9)
# Ingest a document (auto-chunked, auto-embedded)
cos.ingest_document("Architecture Decision Record",
open("adr-003.md").read(),
source="adr-003.md", doc_type="adr")
# Sync Claude Code local memory files (~/.claude/projects/*/memory/*.md) to DB
cos.sync_local_memories()
Shell (via python -c or interactive):
# Semantic recall from the command line
cos recall "authentication middleware changes"
# Quick recall from Python one-liner
python -c "import cos; [print(r['title'], round(r['similarity'],2)) for r in cos.recall('auth changes')]"
# Store a memory
python -c "import cos; cos.remember('project', 'Deploy v2.1', 'Deployed to prod successfully', importance=5)"
# Ingest a file
python -c "
import cos
cos.ingest_document('Meeting Notes', open('notes.md').read(), source='notes.md', doc_type='meeting')
"
You're picking up a project after a week away and don't remember the details:
# What do we know about this project's auth work?
cos recall "auth middleware refactor"
# Were any decisions made?
cos recall "decision auth"
# Any specs generated?
cos specs --project my-backend
# Or programmatically in a script
import cos
results = cos.recall("auth middleware", n=10)
for r in results:
print(f"[{r['source']}] {r['title']} (similarity: {r['similarity']:.2f})")
    print(f"  {r['body'][:120]}...")
You have a folder of architecture decision records you want to make searchable:
# Ingest all markdown files in a directory
for f in docs/adr/*.md; do
python -c "
import cos
title = '$(basename "$f" .md)'
content = open('$f').read()
cos.ingest_document(title, content, source='$f', doc_type='adr')
print(f'Ingested: {title}')
"
done
# Now they're all searchable
cos recall "database migration strategy"
The agenda is a persistent work queue that survives across sessions. Items have an urgency level that determines their sort order and visual treatment in the CLI and dashboard.
Urgency levels, lowest to highest:
graph LR
L["later"] --> S["soon"] --> TW["this_week"] --> T["today"] --> N["now"]
style L fill:#555,color:#fff
style S fill:#666,color:#fff
style TW fill:#0891b2,color:#fff
style T fill:#ca8a04,color:#fff
style N fill:#dc2626,color:#fff
Items marked now or today appear in ~/.cos/hot_items.md and are shown at the top of every new agent session. Items are automatically deduplicated -- if you add an item that's 85% similar to an existing one, the duplicate is silently dropped.
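The 85%-similarity dedup rule amounts to a single threshold check against every open item. A sketch of that decision (illustrative only; `is_duplicate` is a hypothetical helper, not part of the cos API, and vectors are shown as plain float lists):

```python
import math

def _cos_sim(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def is_duplicate(new_vec, existing_vecs, threshold: float = 0.85) -> bool:
    """Drop the new agenda item if its embedding is >= 85% similar
    to any existing open item."""
    return any(_cos_sim(new_vec, v) >= threshold for v in existing_vecs)
```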
Python:
import cos
cos.add_agenda("Deploy v2.1", body="Staging passed, ready for prod",
urgency="today", category="deploy")
items = cos.get_agenda() # unresolved, sorted by urgency
all_items = cos.get_agenda(include_all=True) # include resolved
cos.resolve_agenda(item_id)
Shell:
# Add items
cos add "Deploy v2.1" --urgency today --category deploy --body "Staging passed"
cos add "Write tests for auth module" --urgency this_week
# List items
cos ls # open items grouped by urgency
cos ls --all # include resolved
cos ls --cat deploy # filter by category
# Resolve
cos done <item_id>
# Build up the checklist
cos add "Run full test suite on staging" --urgency today --category release
cos add "Update CHANGELOG for v2.1" --urgency today --category release
cos add "Tag release v2.1" --urgency today --category release
cos add "Deploy to production" --urgency today --category release
cos add "Post-deploy smoke tests" --urgency today --category release
# View just the release items
cos ls --cat release
# Mark them off as you go
cos done <test_item_id>
cos done <changelog_item_id>
The swarm system lets multiple agents see each other and communicate. Every agent that calls register_boss() becomes visible in the swarm registry. Agents that stop heartbeating for 15 minutes are automatically marked as gone by the SwarmWorker.
This is especially useful when you have agents working on related projects -- a frontend agent can check whether the backend agent is still active before making assumptions about API contracts.
sequenceDiagram
participant B1 as Boss A (Claude Code)
participant DB as PostgreSQL
participant B2 as Boss B (Ollama)
B1->>DB: register_boss()
B2->>DB: register_boss()
B1->>DB: heartbeat() (every few min)
B2->>DB: heartbeat()
B1->>DB: log_activity("Refactoring auth")
B1->>DB: send_message("Need review", to=B2)
B2->>DB: check_messages()
DB-->>B2: [message from B1]
B1->>DB: spawn_minion("Run tests")
Python:
import cos
# Register this agent as a boss
boss_id = cos.register_boss("my-project", "claude-sonnet-4-6",
working_dir="/path/to/project")
# Heartbeat (call periodically to stay "active")
cos.heartbeat()
# Log what you're doing
cos.log_activity("Refactoring auth module", event_type="code")
# See who else is running
bosses = cos.get_active_bosses(stale_minutes=10)
# Message another boss (or broadcast to all)
cos.send_message("Need help", "Can someone review auth changes?")
cos.send_message("Direct msg", "Check the tests", to_boss_id=other_boss_id)
# Check inbox
messages = cos.check_messages(unread_only=True)
Shell:
# See active bosses
cos swarm
# Register + interact via Python one-liners
python -c "import cos; bid = cos.register_boss('cli-task', 'manual'); print(bid)"
python -c "import cos; print(cos.get_active_bosses())"
python -c "import cos; cos.send_message('Status update', 'Auth module complete')"
You have a frontend and backend agent running simultaneously. The backend agent finishes an API change:
# In the backend agent session
import cos
cos.log_activity("Completed gRPC migration for /api/users", event_type="code")
cos.send_message("API change", "Switched /api/users from REST to gRPC. Proto files in shared/proto/.")
cos.add_agenda("Update frontend API client for gRPC", urgency="today",
               category="build", responsible_boss="/path/to/frontend")
The next time the frontend agent starts, it sees the message and the agenda item automatically.
The agent task queue is how the daemon dispatches work to specialist agents. Tasks are claimed atomically (FOR UPDATE SKIP LOCKED) so multiple daemon instances won't double-process. Each task has a type (which agent handles it), a priority (1-10, higher = more urgent), and a context payload.
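The atomic claim pattern looks roughly like this in SQL. This is an illustrative sketch -- column names and status values are assumed from the schema diagram; the real query lives behind `cos.claim_agent_task()`:

```sql
-- SKIP LOCKED lets concurrent daemons each claim a different pending row
-- instead of blocking on (or double-processing) the same task.
UPDATE ops.agent_tasks
   SET status = 'running'
 WHERE id = (
       SELECT id
         FROM ops.agent_tasks
        WHERE status = 'pending'
        ORDER BY priority DESC
        LIMIT 1
        FOR UPDATE SKIP LOCKED
       )
RETURNING id, agent_type, context_ref;
```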
flowchart LR
Q["queue_agent_task()"] --> QUEUE["ops.agent_tasks<br/>(pending)"]
QUEUE --> CLAIM["claim_agent_task()<br/>(atomic pop)"]
CLAIM --> RUN["Specialist Agent<br/>runs"]
RUN --> DONE["complete_agent_task()<br/>(result stored)"]
Python:
import cos
# Queue a specialist agent
cos.queue_agent_task("TranscriptAnomalyExpert",
context_ref="session-abc-123",
context_raw={"transcript": "..."},
priority=5)
# Claim a task (used by daemon/workers)
task = cos.claim_agent_task(agent_type="TranscriptAnomalyExpert")
# Complete a task
cos.complete_agent_task(task["id"], result={"findings": [...]})
Shell:
# View the agent task queue
cos tasks
cos tasks --all # include completed/failed
# Queue a task from the command line
python -c "
import cos
cos.queue_agent_task('ActionItemExtractor',
context_ref='session-id-here',
context_raw={'source': 'manual'},
priority=8)
print('Task queued')
"
You realize an important meeting transcript wasn't fully analyzed:
# Find the session ID
cos sessions
# Manually queue the agents you want to run against it
python -c "
import cos
sid = 'abc123-session-id'
cos.queue_agent_task('DecisionRecorder', context_ref=sid, priority=8)
cos.queue_agent_task('ActionItemExtractor', context_ref=sid, priority=8)
cos.queue_agent_task('VisionKeeperExpert', context_ref=sid, priority=5)
print('Queued 3 agents')
"
# Watch them process
cos tasks
The router is CoS's inference layer. Rather than hardcoding which model to use, it maintains a registry of available Ollama nodes and their models, scores them based on historical performance (success rate, latency, quality), and picks the best match for each task type.
The quality feedback loop is key: every time an agent calls _llm_json(), the response is automatically scored (parseable JSON = 8.0, unparseable = 2.0). Over time, the router learns which models actually work well for each task and adjusts its scoring. You can also provide manual feedback via rate_response().
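The automatic scoring rule described above is simple enough to sketch directly. This uses the thresholds quoted in the text; the function name is illustrative.

```python
import json

def auto_score(response_text: str) -> float:
    """Sketch of the automatic scoring rule: a response that parses
    as JSON scores 8.0, anything else scores 2.0."""
    try:
        json.loads(response_text)
        return 8.0
    except (ValueError, TypeError):
        return 2.0

assert auto_score('{"findings": []}') == 8.0   # parseable JSON
assert auto_score("Sure! Here you go...") == 2.0  # prose, not JSON
```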
flowchart TB
REQ["Task request<br/>(e.g. summarize)"] --> ROUTER["Router"]
ROUTER --> SCORE["Score nodes by:<br/>- model match<br/>- historical success rate<br/>- avg latency<br/>- avg quality"]
SCORE --> PICK["Best (node, model)"]
PICK --> CALL["Ollama API call"]
CALL --> RESP["Response"]
RESP --> LOG["Log to ops.llm_routing_log"]
RESP -.-> FB["Quality feedback<br/>(rate_response)"]
FB -.-> SCORE
style ROUTER fill:#4a4a6a,color:#fff
Task types and their preferred model tiers:
| Task Type | Best Models | Fallback |
|---|---|---|
| `embed` | `nomic-embed-text` | -- |
| `triage` | `qwen3:8b`, `llama3`, `mistral` | `phi3` |
| `extract` | `qwen3:8b`, `qwen2.5:14b` | `llama3` |
| `summarize` | `qwen3:8b`, `qwen2.5:14b` | `mistral` |
| `analyze` | `qwq:32b`, `deepseek-r1:32b`, `qwen3:30b` | `llama3` |
| `spec` | `qwq:32b`, `qwen3:30b`, `deepseek-r1:32b` | `llama3` |
| `code` | `qwen3-coder:30b`, `devstral`, `codestral` | `llama3` |
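A sketch of how a router might combine those signals into a single score. The weights, field names, and function here are invented for illustration; the actual formula lives in router.py.

```python
def score_candidate(model, task_models, stats,
                    w_success=4.0, w_quality=1.0, latency_penalty=0.1):
    """Illustrative scoring: strongly prefer models on the task's
    preferred list, then weigh historical success rate, average
    quality, and latency. Weights are made up for this sketch."""
    score = 10.0 if model in task_models else 0.0
    score += w_success * stats.get("success_rate", 0.5)
    score += w_quality * stats.get("avg_quality", 5.0) / 10.0
    score -= latency_penalty * stats.get("avg_latency_s", 5.0)
    return score

candidates = {
    "qwen3:8b": {"success_rate": 0.95, "avg_quality": 7.8, "avg_latency_s": 4.0},
    "phi3":     {"success_rate": 0.80, "avg_quality": 6.0, "avg_latency_s": 1.5},
}
best = max(candidates, key=lambda m: score_candidate(
    m, task_models=["qwen3:8b", "llama3", "mistral"], stats=candidates[m]))
# The preferred-list bonus dominates, so qwen3:8b wins despite higher latency.
```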
Python:
import cos
# Route a task to the best available model
base_url, model = cos.route("summarize")
# Use the routed LLM directly (returns text + log_id)
text, log_id = cos.routed_llm("Summarize this document...",
task_type="summarize",
system="You are a summarizer.")
# Provide quality feedback (improves future routing)
cos.rate_response(log_id, quality_score=8)
# Refresh node discovery
cos.refresh_nodes()
nodes = cos.get_active_nodes()

Shell:
# View routing stats
cos routing
# Refresh node discovery manually
python -c "import cos; cos.refresh_nodes(); print(cos.get_active_nodes())"
# Quick routed LLM call
python -c "
import cos
text, lid = cos.routed_llm('What is 2+2?', task_type='triage')
print(text)
"

You set up Ollama on a second machine and want CoS to use it:
# Option 1: Add via environment variable (persists in .env)
echo 'COS_OLLAMA_NODES=http://192.168.1.50:11434' >> .env
# Option 2: Add to the database directly
psql -h <host> -U postgres -d chief_of_staff -c \
"INSERT INTO ops.llm_nodes (base_url, label, priority) VALUES ('http://192.168.1.50:11434', 'workstation', 5);"
# Force a refresh
python -c "import cos; cos.refresh_nodes()"
# Verify it's discovered
cos routing

Standing orders are persistent instructions stored in the database that apply to every boss regardless of model. They encode your preferences and operating rules -- things like "prefer short responses", "no emoji", or "always confirm before destructive actions". Every boss loads them at session start.
Python:
import cos
orders = cos.get_standing_orders(scope="global")
context = cos.get_principal_context() # full user profile
cos.briefing() # print formatted briefing

Shell:
# Morning briefing
cos brief
# Dump standing orders
python -c "import cos; [print(o['title']) for o in cos.get_standing_orders()]"
# View your full principal context
python -c "
import cos
ctx = cos.get_principal_context()
for section, content in ctx.items():
print(f'=== {section} ===')
print(content[:200])
print()
"

The daemon (cos-daemon) runs background worker threads that keep the system alive. Each worker runs in its own thread, on its own interval, with independent error handling -- a failure in one worker doesn't affect the others.
flowchart TB
MAIN["cos-daemon main()"] --> SH & MS & SW & CU & LD & BW
SH["SessionHarvestWorker<br/>every 120s"]
MS["MemorySyncWorker<br/>every 300s"]
SW["SwarmWorker<br/>every 60s"]
CU["ContextUpdateWorker<br/>every 120s"]
LD["LLMDiscoveryWorker<br/>every 300s"]
BW["BossWakeupWorker<br/>every 180s"]
SH -->|"scans"| JSONL["~/.claude/**/*.jsonl"]
SH -->|"archives to"| DB["PostgreSQL"]
SH -->|"skims via"| OL["Ollama"]
SH -->|"queues"| TASKS["ops.agent_tasks"]
MS -->|"syncs"| MEM["~/.claude/**/memory/*.md"]
MS -->|"embeds into"| DB
SW -->|"writes"| SS["~/.cos/swarm_status.md"]
CU -->|"writes"| HI["~/.cos/hot_items.md"]
CU -->|"writes"| SJ["~/.cos/status.json"]
LD -->|"probes"| OL
style MAIN fill:#4a4a6a,color:#fff
| Worker | Interval | What it does |
|---|---|---|
| SessionHarvestWorker | 120s | Scans Claude Code session JSONL files, archives transcripts, skims for significance via local LLM, queues specialist agents. Tracks processed files in ~/.cos/harvested_sessions.json to avoid re-work. |
| MemorySyncWorker | 300s | Syncs ~/.claude/projects/*/memory/*.md into the RAG database with embeddings. Watches file mtimes to detect changes and re-embed updated files. |
| SwarmWorker | 60s | Marks stale bosses (no heartbeat in 15min) as gone, writes ~/.cos/swarm_status.md with a human-readable summary of who's active and what they're doing. |
| ContextUpdateWorker | 120s | Writes ~/.cos/hot_items.md (urgent agenda + recent completions/failures) and ~/.cos/status.json. These files are read by new agent sessions at bootstrap. |
| LLMDiscoveryWorker | 300s | Probes Ollama nodes from DB + COS_OLLAMA_NODES, updates availability and model lists. Detects when nodes go offline or new models are pulled. |
| BossWakeupWorker | 180s | Checks for agenda items assigned to a specific boss that has gone inactive. Sends a notification to remind you to restart that agent. |
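The isolated-worker pattern described above can be sketched as follows. Names and the print-based logging are illustrative, not daemon.py's actual code.

```python
import threading
import time

def start_worker(name, interval_s, step):
    """Sketch of the daemon's worker pattern: each worker loops on its
    own interval, and an exception in one iteration is logged without
    killing the thread -- or affecting sibling workers."""
    def loop():
        while True:
            try:
                step()
            except Exception as exc:  # independent error handling per worker
                print(f"[{name}] step failed: {exc}")
            time.sleep(interval_s)
    t = threading.Thread(target=loop, name=name, daemon=True)
    t.start()
    return t

# Usage sketch: start_worker("SessionHarvestWorker", 120, harvest_sessions)
```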
The daemon is safe to run on multiple hosts simultaneously. Each instance harvests sessions and syncs memories from its own ~/.claude/ directory, but they all write to the same database. This means Claude Code sessions on your laptop get harvested by the laptop's daemon, and sessions on the server get harvested by the server's daemon.
flowchart LR
subgraph Server
DS["cos-daemon"] -->|"harvests"| SS["~/.claude/<br/>(server sessions)"]
DS -->|"writes"| DB["PostgreSQL"]
end
subgraph Laptop
DL["cos-daemon"] -->|"harvests"| SL["~/.claude/<br/>(laptop sessions)"]
DL -->|"writes"| DB
end
DS -->|"writes"| HS["~/.cos/status.json<br/>hot_items.md<br/>(server)"]
DL -->|"writes"| HL["~/.cos/status.json<br/>hot_items.md<br/>(laptop)"]
# Background via Docker (logs via `make logs`)
make up
# View logs
make logs
# Restart after code changes
make restart

Or run it in the foreground:

cos-daemon
# or
cos daemon

On macOS, set up a LaunchAgent so the daemon starts automatically and restarts on crash:
cat > ~/Library/LaunchAgents/com.cos.daemon.plist << 'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.cos.daemon</string>
<key>ProgramArguments</key>
<array>
<string>/Users/YOUR_USER/.cos/venv/bin/cos-daemon</string>
</array>
<key>EnvironmentVariables</key>
<dict>
<key>OLLAMA_BASE</key>
<string>http://127.0.0.1:11434</string>
<key>COS_DB_HOST</key>
<string>db.example.internal</string>
<key>COS_NOTIFY_BACKEND</key>
<string>macos</string>
<key>PATH</key>
<string>/Users/YOUR_USER/.cos/venv/bin:/usr/local/bin:/usr/bin:/bin</string>
</dict>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/Users/YOUR_USER/.cos/daemon.log</string>
<key>StandardErrorPath</key>
<string>/Users/YOUR_USER/.cos/daemon.err</string>
<key>ThrottleInterval</key>
<integer>30</integer>
</dict>
</plist>
EOF
# Load it (starts immediately)
launchctl load ~/Library/LaunchAgents/com.cos.daemon.plist
# Verify it's running
launchctl list | grep cos

Managing the LaunchAgent:
# Check status (PID and exit code)
launchctl list | grep cos
# Stop
launchctl unload ~/Library/LaunchAgents/com.cos.daemon.plist
# Restart (unload + load)
launchctl unload ~/Library/LaunchAgents/com.cos.daemon.plist
launchctl load ~/Library/LaunchAgents/com.cos.daemon.plist
# View logs
tail -f ~/.cos/daemon.log
tail -f ~/.cos/daemon.err

If Ollama isn't running on the client, the LLMDiscoveryWorker will log a connection error but the daemon continues running normally. If you later start Ollama on the client, the worker will automatically discover it on the next probe cycle (every 300s).
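A discovery probe can be sketched with the stdlib. `/api/tags` is Ollama's standard model-listing endpoint; the function name and return shape here are illustrative, not what router.py actually uses.

```python
import json
import urllib.error
import urllib.request

def probe_ollama(base_url, timeout=3):
    """Sketch of a node probe: ask the node for its model list.
    Returns (reachable, model_names); an unreachable node is simply
    marked down rather than raising."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            data = json.load(resp)
        return True, [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError, ValueError):
        return False, []
```

A node that refuses the connection comes back as `(False, [])`, which is why the daemon can keep running while Ollama is down.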
Something isn't being processed and you want to see what's happening:
# Run in foreground with full logs
cos daemon
# In another terminal, check what's been harvested
cat ~/.cos/harvested_sessions.json | python -m json.tool | tail -20
# Check for agent task failures
cos tasks --all
# Look at what hot_items the daemon is generating
cat ~/.cos/hot_items.md

A lightweight web UI (zero external dependencies -- pure HTML/CSS/JS served by Python's built-in HTTP server) runs on port 7432.
# Open in browser
cos dashboard
# Or access directly
# http://localhost:7432

Features:
- View and filter agenda items by urgency
- Add and resolve agenda items
- Monitor agent task queue
- Send messages to bosses
- Spawn agents from pre-configured templates
- Dark/light theme toggle
- Collapsible sections and toast notifications
The dashboard includes pre-configured agent templates for common tasks. Each template has a prompt and a budget:
| Template | What it does |
|---|---|
| `code_review` | Reviews git diff against main, flags bugs/security/missing tests |
| `fix_tests` | Runs test suite, fixes failures without changing test logic |
| `lint_fix` | Runs linter/type checker, fixes style issues |
| `security_audit` | Audits for hardcoded secrets, injection risks, insecure deps |
| `write_readme` | Writes/updates README based on actual code |
| `write_docs` | Adds docstrings to undocumented public functions |
| `changelog` | Generates CHANGELOG from git log |
| `cos_status` | Generates a CoS system health report |
| `cos_agenda` | Reviews and triages the agenda |
| `research_deps` | Researches dependency updates and security advisories |
| `perf_profile` | Profiles the codebase for performance bottlenecks |
CoS bootstraps automatically in Claude Code sessions via ~/.claude/CLAUDE.md. This means every Claude Code session is CoS-aware without any manual setup -- on any machine where the cos package is installed.
flowchart TB
START["Claude Code session starts"] --> DETECT["Detect Python<br/>/srv/apps/cos/.venv/bin/python (Linux)<br/>~/.cos/venv/bin/python (macOS)"]
DETECT --> CHECK["Exclusion check<br/>(~/.cos/exclude_dirs.txt)"]
CHECK -->|excluded| SKIP["Skip CoS"]
CHECK -->|active| REG["register_boss()"]
REG --> MSG["check_messages()"]
REG --> AGENDA["get_agenda() (urgent)"]
MSG --> LOAD["Read ~/.cos/hot_items.md<br/>Read ~/.cos/swarm_status.md"]
AGENDA --> LOAD
LOAD --> READY["Session ready<br/>(urgent items shown first)"]
The bootstrap in ~/.claude/CLAUDE.md automatically finds the right Python venv on each platform:
COS_PYTHON="$([ -x /srv/apps/cos/.venv/bin/python ] && echo /srv/apps/cos/.venv/bin/python || echo $HOME/.cos/venv/bin/python)"

This checks for the server venv at /srv/apps/cos/.venv/bin/python first (Linux). If that doesn't exist, it falls back to ~/.cos/venv/bin/python (macOS client install). Since ~/.claude/CLAUDE.md syncs across machines, this single file works on both without modification.
- Python detection -- Finds the cos venv (server path first, then client path).
- Exclusion check -- Reads ~/.cos/exclude_dirs.txt. If the current working directory matches any listed path, CoS is skipped entirely. This is useful for third-party repos or sensitive projects.
- Registration -- Calls register_boss() to announce this session to the swarm.
- Message check -- Reads any unread messages from other bosses or agents.
- Agenda check -- Pulls urgent items (now, today) to display immediately.
- Context load -- Reads ~/.cos/hot_items.md (recent completions, failures, urgent work) and ~/.cos/swarm_status.md (what other bosses are doing).
The Claude Code statusline shows CoS:<bosses> <urgent>! -- for example, CoS:2 3! means 2 active bosses and 3 urgent agenda items. This is read from ~/.cos/status.json, which the daemon's ContextUpdateWorker writes every 120 seconds.
The status.json file contains:
{
"active_bosses": 2,
"urgent_items": 3,
"approved_proposals": 0,
"updated_at": "2026-04-03T06:30:04.323141+00:00"
}

Since each machine runs its own daemon, the statusline stays current on every host. If the daemon isn't running, the statusline will show stale data from the last time it was updated.
To exclude a project directory from CoS:
# Add the absolute path to the exclusion file (one per line)
echo "/path/to/excluded/project" >> ~/.cos/exclude_dirs.txt
# You can also add comments
echo "# Third-party repos" >> ~/.cos/exclude_dirs.txt
echo "/home/user/vendor/some-lib" >> ~/.cos/exclude_dirs.txt

CoS requires three pieces of Claude Code configuration in ~/.claude/settings.json. These use platform-agnostic paths so a single settings file works on both Linux and macOS:
UserPromptSubmit hook -- Fires on every user message. Heartbeats the boss, checks for hot items changes, and injects project-scoped context (agenda items, recent sessions, relevant memories) on session start.
{
"hooks": {
"UserPromptSubmit": [
{
"hooks": [
{
"type": "command",
"command": "COS_PYTHON=\"$([ -x /srv/apps/cos/.venv/bin/python ] && echo /srv/apps/cos/.venv/bin/python || echo $HOME/.cos/venv/bin/python)\"; \"$COS_PYTHON\" $HOME/.cos/prompt_submit.py"
}
]
}
]
}
}

Statusline -- Displays CoS:<active_bosses> <urgent_items>! in the Claude Code status bar by reading ~/.cos/status.json.
{
"statusLine": {
"type": "command",
"command": "zsh $HOME/.claude/statusline-command.sh"
}
}

The statusline script (~/.claude/statusline-command.sh) reads JSON from stdin (provided by Claude Code with workspace, model, and context info) and appends the CoS segment from ~/.cos/status.json. The segment turns red when there are urgent items.
Key files:
| File | Location | Purpose |
|---|---|---|
| `statusline-command.sh` | `~/.claude/` | Statusline rendering script (zsh) |
| `prompt_submit.py` | `~/.cos/` | UserPromptSubmit hook (Python) |
| `status.json` | `~/.cos/` | Written by daemon, read by statusline |
| `hot_items.md` | `~/.cos/` | Written by daemon, injected by hook on change |
The ~/.cos/prompt_submit.py hook runs on every user message and does the following:
flowchart TB
MSG["User sends message"] --> EXCL["Check exclude_dirs.txt"]
EXCL -->|excluded| EXIT["Exit silently"]
EXCL -->|active| HB["register_boss() / heartbeat"]
HB --> CHECK["Compare hot_items.md mtime<br/>vs last seen"]
CHECK -->|changed or gap > 15min| INJECT["Print hot items<br/>to conversation"]
CHECK -->|unchanged| SKIP["Skip injection"]
INJECT --> CTX{"Session start?<br/>(gap > 15min)"}
SKIP --> CTX
CTX -->|yes| PROJECT["Inject project context:<br/>- Agenda items for this dir<br/>- Recent session summaries<br/>- Relevant memories"]
CTX -->|no| DONE["Done"]
PROJECT --> DONE
On session start (defined as >15 minutes since the last hook run), the hook also queries the database for project-specific context: open agenda items assigned to the current working directory, recent session summaries, and the top relevant memories from semantic search.
graph LR
subgraph "~/.cos/"
HI["hot_items.md<br/><i>urgent agenda</i>"]
SS["swarm_status.md<br/><i>active bosses</i>"]
SJ["status.json<br/><i>statusline data</i>"]
EX["exclude_dirs.txt<br/><i>opt-out directories</i>"]
DL["daemon.log<br/><i>daemon stdout</i>"]
DE["daemon.err<br/><i>daemon stderr</i>"]
HS["harvested_sessions.json<br/><i>processed session tracker</i>"]
VN["venv/<br/><i>client Python venv</i>"]
WT["worktrees/<br/><i>git worktrees for agents</i>"]
NT["notes/<br/><i>Marimo notebooks</i>"]
end
| Path | Purpose |
|---|---|
| `~/.cos/hot_items.md` | Urgent items injected at session start |
| `~/.cos/swarm_status.md` | Active bosses summary |
| `~/.cos/status.json` | Machine-readable status for statusline integrations |
| `~/.cos/exclude_dirs.txt` | Project paths to skip (one per line) |
| `~/.cos/daemon.log` | Daemon stdout log |
| `~/.cos/daemon.err` | Daemon stderr log (macOS LaunchAgent) |
| `~/.cos/harvested_sessions.json` | Tracks which session files have been processed |
| `~/.cos/venv/` | Client-install Python venv (macOS / remote hosts) |
| `~/.cos/prompt_submit.py` | UserPromptSubmit hook script |
| `~/.cos/worktrees/` | Git worktrees for autonomous agent workspaces |
| `~/.cos/notes/` | Marimo notebook notes |
| `~/.claude/statusline-command.sh` | Statusline rendering script |
| `~/.claude/CLAUDE.md` | Bootstrap instructions (loaded by Claude Code) |
| `~/.claude/settings.json` | Claude Code settings (hooks, statusline, env) |
Set COS_NOTIFY_BACKEND to one or more (comma-separated):
# Single backend
export COS_NOTIFY_BACKEND=desktop
# Multiple backends
export COS_NOTIFY_BACKEND=log,desktop

| Backend | Platform | Method |
|---|---|---|
| `macos` | macOS | osascript display notification |
| `desktop` | Linux | notify-send (freedesktop) |
| `log` | Any | Python logger.info |
| `none` | Any | Silent |
Notifications are sent by agents when they complete significant work -- for example, when the SpecFileExpert registers a new spec, or when the TriggerAgentDesigner creates a new agent.
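The multi-backend dispatch can be sketched as a fan-out over the comma-separated list. The return value is added here for illustration; notify.py's actual signature and internals may differ.

```python
import logging
import os
import subprocess

def notify(title, message):
    """Sketch of multi-backend dispatch: every backend listed in
    COS_NOTIFY_BACKEND receives the notification. Returns the list of
    backends handled (illustrative helper behavior, not the real API)."""
    handled = []
    for backend in os.environ.get("COS_NOTIFY_BACKEND", "log").split(","):
        backend = backend.strip()
        if backend == "log":
            logging.getLogger("cos").info("%s: %s", title, message)
        elif backend == "macos":
            subprocess.run(["osascript", "-e",
                f'display notification "{message}" with title "{title}"'])
        elif backend == "desktop":
            subprocess.run(["notify-send", title, message])
        elif backend != "none":
            continue  # unknown backend: skip it
        handled.append(backend)
    return handled
```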
# Install with test dependencies
pip install -e ".[test]"
# Run tests
pytest
# Run a specific test file
pytest tests/test_router.py
# Verbose output
pytest -v

Tests use a real PostgreSQL connection with transaction rollback -- no separate test DB needed. LLM calls are mocked via fixtures in tests/conftest.py.
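The rollback-per-test pattern is easy to demonstrate. This self-contained sketch substitutes sqlite3 for PostgreSQL to show the mechanics: writes made inside the open transaction vanish on rollback, so tests never dirty the shared database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agenda (item TEXT)")
conn.commit()  # schema persists

# A test writes a row inside the transaction...
conn.execute("INSERT INTO agenda VALUES ('temporary test row')")
assert conn.execute("SELECT count(*) FROM agenda").fetchone()[0] == 1

# ...and the fixture rolls it back afterward, leaving the DB untouched.
conn.rollback()
assert conn.execute("SELECT count(*) FROM agenda").fetchone()[0] == 0
```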
cos/
__init__.py Public API (import cos) -- 63 exported functions
__main__.py Entry point for python -m cos
client.py PostgreSQL client: memory, agenda, swarm, sessions, sync
router.py LLM routing: node discovery, task dispatch, quality tracking
daemon.py Background workers (session harvest, memory sync, swarm, etc.)
agents.py Specialist agents (transcript analysis, trigger matching, etc.)
cli.py Click-based CLI (cos status, cos recall, etc.)
dashboard.py Web UI on port 7432 (zero dependencies)
notify.py Multi-backend notifications (macOS, Linux, log)
note.py Marimo-based note system with AST-validated headers
migrations/
001_swarm_and_pipeline.sql Core schema (swarm, pipeline, sessions)
002_meeting_assistant_sync.sql Meeting assistant sync tables
tests/
conftest.py DB fixtures (rollback), LLM mocks
test_*.py Unit tests
Dockerfile Container image for daemon
docker-compose.yml Multi-service orchestration
Makefile Convenience targets (build, up, down, cli, logs)
| Document | Description |
|---|---|
| `HANDOFF.md` | Full system architecture, inference philosophy, schema reference, bootstrap mechanism |
| `COS_CHEATSHEET.md` | Quick-reference card for CLI commands, Python API, daemon workers, DB tables |
| `COSWORK.md` | Cost/quality analysis of local vs. cloud LLM routing, model recommendations, measurement plan |