grinnan/cos


CoS -- Chief of Staff

A self-hosted, local-first personal AI coordination system that gives every LLM agent on your machine -- Claude Code sessions, Ollama agents, scripts importing the cos package -- a shared memory, a shared work queue, and a shared awareness of every other agent. A background daemon harvests session transcripts, runs specialist agents against them, and feeds the results back into the same database that the next session will read at startup.

CoS is for developers who already use multiple AI coding agents and want them to remember decisions, share context, and coordinate work without relying on a cloud SaaS. Everything runs against your own PostgreSQL + pgvector database and your own Ollama instance. Cloud LLMs are an optional fallback, not the default.

How is this different from CrewAI / AutoGen / mem0 / MemGPT / Letta?

| System | Local-first | Cross-session memory | Background daemon | Multi-agent swarm | Self-expanding | Harvests existing agent transcripts |
|---|---|---|---|---|---|---|
| CoS | Yes (Ollama) | Yes (pgvector) | Yes | Yes | Yes (TriggerAgentDesigner) | Yes (Claude Code JSONL) |
| CrewAI | Any model | Partial | No | Yes (roles) | No | No |
| AutoGen / MS Agent Framework | Azure-first | Partial | No | Yes | No | No |
| LangGraph | Any model | Tiered | No | Yes (graph) | No | No |
| mem0 | Yes | Yes (LLM facts) | No | No | No | No |
| Letta / MemGPT | Yes | Yes (tiered) | No (agent-driven) | Limited | Partial | No |
| Zep / Graphiti | Self-host | Yes (temporal KG) | No | No | No | No |
| Cognee | Yes | Yes (vector+graph) | No (library) | No | Partial | No |
| Khoj | Yes | Document index | No | No | No | No |
| Claude Memory Tool | Client-side | File-based | No | No | No | No |

The two distinguishing properties are the always-on daemon that harvests Claude Code session JSONL transcripts and feeds them to specialist agents, and the self-expanding agent taxonomy -- when the system encounters a session pattern it doesn't recognize, it can code-generate, register, and hot-load a new specialist agent at runtime. No other tool in the survey combines those with a swarm registry and an urgency-tiered persistent agenda. (See COS_COMPARISON.md for the full survey of 40+ peer systems.)

Introduction

What is CoS?

CoS acts as a persistent "Chief of Staff" layer that sits between you and your AI agents. In a typical workflow, you might have several Claude Code sessions open across different projects, an Ollama-based agent running a code review, and a background process ingesting meeting notes -- all at the same time. Without coordination, each of these agents operates in isolation: they don't know what the others are doing, they can't share context, and decisions made in one session are invisible to the rest.

CoS solves this by providing:

  • Shared memory -- Every agent writes to and reads from the same semantic search database. A decision recorded in a morning Claude Code session is instantly recallable by an afternoon Ollama agent working on a different project.
  • A persistent work queue -- Agenda items survive across sessions. You can add "deploy v2.1" at 9am, and whichever agent you open at 3pm will see it waiting.
  • Swarm awareness -- Every active agent registers itself. Agents can see who else is running, what they're working on, and send messages to each other.
  • Autonomous background processing -- A daemon continuously harvests session transcripts, extracts decisions and action items, discovers dropped ideas, and surfaces them as agenda items -- all using local LLM inference to keep costs near zero.
  • Intelligent LLM routing -- Rather than always calling a cloud API, CoS routes each task to the best available model. Embedding and triage run locally on Ollama. Complex reasoning falls back to cloud only when needed.

When is CoS useful?

You're juggling multiple projects with AI agents. You have Claude Code open in three terminals -- one for a backend service, one for a frontend, one for infrastructure. A decision in the backend session ("we're switching from REST to gRPC") should be visible when the frontend agent asks about API integration. CoS makes this automatic: the daemon's SessionHarvestWorker picks up the transcript, the DecisionRecorder agent extracts the decision, and it becomes searchable via cos recall "API protocol" from any session.

You lose track of what was discussed and decided. After a long coding session, it's easy to forget the three minor decisions you made, the two ideas you deferred, and the one bug you noticed but didn't fix. CoS's specialist agents -- DecisionRecorder, VisionKeeperExpert, ActionItemExtractor, BugRecorder -- run automatically against your session transcripts and surface everything as structured, searchable records.

You want AI agents to work autonomously in the background. From the web dashboard, you can spawn pre-configured agents to run code reviews, fix failing tests, run linters, perform security audits, or write documentation -- all without occupying a terminal session. The agent task queue coordinates everything.

You want to minimize cloud API costs. CoS's routing layer prefers local Ollama models for everything they can handle well (embedding, classification, summarization, simple extraction) and only falls back to cloud APIs for tasks that genuinely require it. On typical hardware (M3 Max, 64GB), this reduces cloud API spend by roughly 95% compared to routing everything through Claude or GPT.

You want a morning briefing that knows what happened yesterday. Run cos brief and get a formatted summary of your priorities, open agenda items, active projects, and what the swarm has been doing -- all pulled from the database, not regenerated from scratch.

How does it work in practice?

Here's a typical day with CoS running:

  1. 8:00am -- You open a Claude Code session. CoS bootstraps automatically: registers the session as a boss, checks for messages from other agents, and shows you any urgent agenda items and hot items from overnight processing.

  2. 8:05am -- You start coding. In the background, the daemon is already processing last night's sessions. The TranscriptAnomalyExpert found a spec discussion in yesterday's evening session and queued a SpecFileExpert, which extracted 4 action items and registered a formal spec document.

  3. 10:30am -- You finish a feature and open a new session in a different project. CoS shows you the 4 action items from the spec as this_week urgency. You also see that Boss B (an Ollama agent running a security audit) finished and found 2 medium-severity issues.

  4. 2:00pm -- From the web dashboard, you spawn a "fix tests" agent against the morning project. It claims the task from the queue, runs the test suite, fixes 3 failures, and marks the task complete.

  5. 5:00pm -- You run cos brief before shutting down. It shows you what was accomplished, what's still open, and what the VisionKeeperExpert surfaced as a dropped idea worth revisiting.

The Trigger Pipeline

One of CoS's most powerful features is its self-expanding trigger pipeline. When the TranscriptAnomalyExpert encounters a conversation pattern it doesn't recognize, it proposes a new trigger type. The TriggerReviewExpert evaluates whether it's genuinely new or a duplicate. If it's new, the TriggerAgentDesigner writes a new specialist agent -- including the Python code, the system prompt, and the database registration -- and hot-loads it into the running daemon. The system literally grows new capabilities by observing your work.

```mermaid
flowchart LR
    SESSION["New session<br/>transcript"] --> TAE["TranscriptAnomaly<br/>Expert"]
    TAE -->|"known trigger"| SPECIALIST["Existing specialist<br/>(SpecFileExpert,<br/>DecisionRecorder, etc.)"]
    TAE -->|"unknown pattern"| PROPOSE["Propose new<br/>trigger type"]
    PROPOSE --> TRE["TriggerReview<br/>Expert"]
    TRE -->|"duplicate/noise"| REJECT["Rejected"]
    TRE -->|"genuinely new"| TAD["TriggerAgent<br/>Designer"]
    TAD --> NEW["New agent written,<br/>registered, and<br/>hot-loaded"]

    style TAE fill:#4a4a6a,color:#fff
    style TAD fill:#2d6a4f,color:#fff
    style NEW fill:#2d6a4f,color:#fff
```
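To make the hot-loading step concrete, here is a minimal sketch of how generated agent source could be compiled and registered into a live registry at runtime. This is illustrative only: the class name convention, the module naming, and the `registry` dict are assumptions, and the real TriggerAgentDesigner also persists the code and a database registration.

```python
import types

def hot_load_agent(name: str, source: str, registry: dict) -> type:
    """Compile generated agent source and register the class in a live registry.

    Sketch of the hot-load step only; assumes the generated module defines a
    class named after the agent.
    """
    module = types.ModuleType(f"cos_agents_generated.{name}")
    exec(compile(source, filename=f"<{name}>", mode="exec"), module.__dict__)
    agent_cls = getattr(module, name)   # convention: class named after the agent
    registry[name] = agent_cls          # visible to the running daemon immediately
    return agent_cls

# A trivial "generated" agent (hypothetical trigger name):
SOURCE = '''
class MeetingRecapExpert:
    trigger = "meeting_recap"
    def run(self, transcript):
        return {"summary": transcript[:40]}
'''

registry = {}
hot_load_agent("MeetingRecapExpert", SOURCE, registry)
```

Once registered, the daemon's dispatch loop can route matching sessions to the new class without a restart.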

Architecture

```mermaid
graph TB
    subgraph Interfaces
        CLI["cos CLI"]
        API["Python API<br/><code>import cos</code>"]
        DASH["Web Dashboard<br/>:7432"]
    end

    subgraph Daemon["cos-daemon (background workers)"]
        SH["SessionHarvest<br/>120s"]
        MS["MemorySync<br/>300s"]
        SW["SwarmWorker<br/>60s"]
        CU["ContextUpdate<br/>120s"]
        LD["LLMDiscovery<br/>300s"]
        BW["BossWakeup<br/>180s"]
    end

    subgraph Storage
        PG["PostgreSQL + pgvector<br/><code>chief_of_staff</code>"]
    end

    subgraph Inference
        OL["Ollama (local)"]
        CL["Cloud LLMs<br/>(fallback)"]
    end

    CLI --> PG
    API --> PG
    DASH --> PG
    SH --> PG
    MS --> PG
    SW --> PG
    CU --> PG
    LD --> OL
    SH --> OL
    BW --> PG

    API --> OL
    API --> CL

    style PG fill:#336791,color:#fff
    style OL fill:#1a1a2e,color:#fff
    style CL fill:#4a4a6a,color:#fff
```

Key Concepts

```mermaid
graph LR
    B1["Boss A<br/>(Claude Code)"] -- heartbeat --> SWARM["Swarm Registry<br/><code>ops.bosses</code>"]
    B2["Boss B<br/>(Ollama agent)"] -- heartbeat --> SWARM
    B1 -- "spawns" --> M1["Minion"]
    B1 -- "queues" --> AT["Agent Task<br/><code>ops.agent_tasks</code>"]
    AT -- "claimed by" --> AG["Specialist Agent<br/>(TranscriptAnomalyExpert, etc.)"]
    B1 -. "message" .-> B2

    style SWARM fill:#2d6a4f,color:#fff
    style AT fill:#6a2d4f,color:#fff
```
| Term | Description |
|---|---|
| Boss | Any LLM agent (Claude Code, Ollama, Codex) registered in the swarm. Each boss has a name, model, working directory, and heartbeat. A single developer might have 3-5 bosses active simultaneously across different projects. |
| Minion | A subagent spawned by a boss for a specific subtask. For example, a boss working on a feature might spawn a minion to run the test suite. Minions are tracked in ops.minions with their parent boss, task, and result. |
| Agent task | A queued job for a specialist agent. Unlike minions (which are ad-hoc), agent tasks are typed: TranscriptAnomalyExpert, SpecFileExpert, DecisionRecorder, etc. The daemon claims and runs them automatically. |
| Agenda item | A work item with an urgency level (now, today, this_week, soon, later). Items can be created by humans (cos add), by agents (ActionItemExtractor), or by the dashboard. They persist until explicitly resolved. |
| Memory/RAG | Semantic vector search across three sources: memories (typed notes with importance), facts (subject-predicate-object triples), and document chunks (auto-split with 50% overlap). All embeddings are 768-dimensional via nomic-embed-text. |

Specialist Agents

CoS ships with 16 baseline specialist agents, each designed for a single responsibility. New agents can be code-generated and hot-loaded at runtime by the TriggerAgentDesigner, so the live registry typically grows past the baseline. The current registry is defined in cos/agents.py (AGENT_REGISTRY).
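A registry keyed by trigger type also makes delegation simple: classify the session, look up the specialist, and fall through to the self-expansion path for unknown triggers. The sketch below shows that shape with stand-in agent classes; the actual structure of AGENT_REGISTRY in cos/agents.py may differ.

```python
# Stand-in specialists; real agents run LLM calls against the transcript.
class SpecFileExpert:
    def run(self, session):
        return {"agent": "SpecFileExpert"}

class DecisionRecorder:
    def run(self, session):
        return {"agent": "DecisionRecorder"}

# Hypothetical registry shape: trigger type -> agent class.
AGENT_REGISTRY = {
    "spec_session": SpecFileExpert,
    "decision_made": DecisionRecorder,
}

def dispatch(trigger: str, session: dict) -> dict:
    """Route a classified session to its specialist; unknown triggers are
    surfaced so the self-expansion pipeline can propose a new agent."""
    agent_cls = AGENT_REGISTRY.get(trigger)
    if agent_cls is None:
        return {"agent": None, "propose_new_trigger": trigger}
    return agent_cls().run(session)
```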

```mermaid
graph TB
    subgraph "Session Analysis Pipeline"
        TAE["TranscriptAnomalyExpert<br/><i>Classifies sessions against<br/>trigger taxonomy</i>"]
        SFE["SpecFileExpert<br/><i>Extracts specs from design<br/>discussions, creates action items</i>"]
        VKE["VisionKeeperExpert<br/><i>Finds dropped ideas and<br/>deferred thoughts</i>"]
        DR["DecisionRecorder<br/><i>Extracts decisions (tech choices,<br/>config, architecture)</i>"]
        AIE["ActionItemExtractor<br/><i>Pulls concrete tasks<br/>and next steps</i>"]
        BR["BugRecorder<br/><i>Logs bug context,<br/>creates fix items</i>"]
        ADE["ArchDocExpert<br/><i>Creates architecture notes<br/>from design discussions</i>"]
    end

    subgraph "Self-Expansion Pipeline"
        TRE["TriggerReviewExpert<br/><i>Evaluates proposed new<br/>trigger types</i>"]
        TAD["TriggerAgentDesigner<br/><i>Writes and registers<br/>new agents</i>"]
    end

    TAE -->|"spec_session"| SFE
    TAE -->|"vision_drift"| VKE
    TAE -->|"decision_made"| DR
    TAE -->|"action_items"| AIE
    TAE -->|"bug_session"| BR
    TAE -->|"architecture_discussion"| ADE
    TAE -->|"unknown"| TRE
    TRE -->|"genuinely new"| TAD

    style TAE fill:#4a4a6a,color:#fff
    style TAD fill:#2d6a4f,color:#fff
```
| Agent | Trigger | What it produces |
|---|---|---|
| TranscriptAnomalyExpert | Every significant session | Classification + delegation to the right specialist |
| SpecFileExpert | spec_session | Formal spec document in ops.specs, action items on the agenda |
| VisionKeeperExpert | vision_drift | [Vision] agenda items for dropped ideas (importance >= 6) |
| DecisionRecorder | decision_made | Records in ops.decisions, vectorized for recall |
| ActionItemExtractor | action_items | Agenda items with urgency, assigned to the source project |
| BugRecorder | bug_session | Bug summaries vectorized for search, fix items if unresolved |
| ArchDocExpert | architecture_discussion | Architecture notes ingested as documents |
| TriggerReviewExpert | Unknown trigger proposed | Dedup check, approves or rejects proposals |
| TriggerAgentDesigner | Approved new trigger | New agent code written, registered, and hot-loaded |

Prerequisites

| Requirement | Version | Notes |
|---|---|---|
| Python | 3.11+ | For local install; Docker handles this otherwise |
| PostgreSQL | 14+ | With pgvector extension enabled |
| Ollama | latest | Running locally or on a reachable host |
| Docker + Compose | latest | For containerized deployment only |

Installation

CoS is designed for multi-host deployment. A typical setup has a server (Linux) that hosts the database and the canonical repo, and one or more client machines (macOS laptops, other Linux hosts) that connect to the same database. Every machine runs its own daemon to harvest local sessions and keep local status files current.

The examples below use /srv/apps/cos as the server install path -- substitute any path you prefer. The bootstrap script in ~/.claude/CLAUDE.md probes both /srv/apps/cos/.venv/bin/python (Linux server) and ~/.cos/venv/bin/python (client) by default; if you install elsewhere, edit the probe paths in your bootstrap accordingly.

```mermaid
graph TB
    subgraph Server ["Server (Linux)"]
        REPO["Git repo<br/>/srv/apps/cos"]
        VENV_S[".venv"]
        DAEMON_S["cos-daemon<br/>(Docker or systemd)"]
        PG["PostgreSQL + pgvector"]
        OL["Ollama"]
    end

    subgraph Laptop ["Client (macOS laptop)"]
        VENV_L["~/.cos/venv<br/>(pip install from git)"]
        DAEMON_L["cos-daemon<br/>(LaunchAgent)"]
        CLAUDE_L["Claude Code sessions<br/>(~/.claude/)"]
    end

    DAEMON_S --> PG
    DAEMON_L --> PG
    DAEMON_S --> OL
    DAEMON_L -.->|"optional"| OL_L["Local Ollama"]
    CLAUDE_L --> DAEMON_L

    style PG fill:#336791,color:#fff
    style OL fill:#1a1a2e,color:#fff
```

Server install (Linux)

The server hosts the git repo, the database, and optionally runs Ollama for local inference.

# Clone the repo
git clone https://github.com/YOUR-USER/cos.git /srv/apps/cos
cd /srv/apps/cos

# Create a virtualenv and install
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

# Verify installation
cos status

This installs two entry points:

| Command | Description |
|---|---|
| cos | The CLI (all subcommands) |
| cos-daemon | The background daemon |

Client install (macOS / remote Linux)

Client machines don't need a full repo checkout -- just pip install the package from the git remote. The venv lives at ~/.cos/venv so it's colocated with the rest of the CoS state.

# Install cos from the git repo (no local clone needed)
python3 -m venv ~/.cos/venv
~/.cos/venv/bin/pip install git+https://github.com/YOUR-USER/cos.git

# Verify it can reach the database
~/.cos/venv/bin/python -c "import cos; cos.briefing()"

To upgrade later:

~/.cos/venv/bin/pip install --upgrade git+https://github.com/YOUR-USER/cos.git

Docker (server)

A pre-built image is available from the container registry:

# Pull from registry
docker pull ghcr.io/YOUR-USER/cos:latest

# Or build locally
make build

Run via docker-compose:

# Start the daemon (background)
make up

# Run CLI commands inside the container
make cli CMD="status"
make cli CMD="recall 'some query'"

# View daemon logs
make logs

# Stop everything
make down

To build and push a new image:

docker build --provenance=false -t ghcr.io/YOUR-USER/cos:latest .
docker push ghcr.io/YOUR-USER/cos:latest

Note: The --provenance=false flag is required. Without it, Docker attaches a build attestation manifest that some container registries don't support (you'll see "invalid tag: missing manifest digest").

Docker volume mounts

The docker-compose binds host paths into the container so the daemon reads ~/.claude/ sessions directly and writes status files to ~/.cos/ on the host filesystem:

volumes:
  - ${HOME}/.cos:/root/.cos          # status.json, hot_items.md, harvested_sessions.json
  - ${HOME}/.claude:/root/.claude:ro  # session JSONL files, memory markdown files

This is critical -- if ~/.cos is a Docker volume instead of a host bind mount, the statusline and session bootstraps won't see the daemon's output. If ~/.claude isn't mounted, the SessionHarvestWorker won't find any sessions to process.
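A quick sanity check from inside the container can confirm both binds are visible before debugging further. The helper below is hypothetical (not part of the cos package); the paths match the compose file above.

```python
from pathlib import Path

def missing_mounts(required=("/root/.cos", "/root/.claude")):
    """Return mount points the daemon expects but cannot see.

    Run inside the container; an empty list means both host binds landed.
    Adjust the defaults if you changed the compose volume targets.
    """
    return [p for p in required if not Path(p).is_dir()]
```

For example, `docker compose exec cos-daemon python -c "..."` with this check would print the missing paths.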

Optional: myma integration

CoS runs fully standalone by default. The myma integration (meeting-notes sync + notebook push from SpecFileExpert/PlanExpert) is opt-in.

To enable it:

  1. Set COS_ENABLE_MYMA=1 in your .env (and optionally COS_MYMA_URL / COS_MYMA_TOKEN if your myma instance isn't on the defaults).

  2. Start the stack with the myma profile so the daemon/CLI containers join the myma_myma-net network and mount the myma_myma-db volume:

    docker compose --profile myma up -d

    This brings up cos-daemon-myma (and cos-cli-myma for one-shot CLI runs) instead of the plain cos-daemon. Without the profile, no myma volumes or networks are touched.

When COS_ENABLE_MYMA is unset, all myma calls in cos.integrations.myma short-circuit to no-ops.
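One way such an env-gated short-circuit can be implemented is a decorator that skips the wrapped call unless the flag is set. This is a sketch of the opt-in pattern, not the actual cos.integrations.myma code; `push_notebook` is a made-up example function.

```python
import os
from functools import wraps

def myma_gated(fn):
    """Turn a myma call into a no-op unless COS_ENABLE_MYMA=1 is set."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        if os.environ.get("COS_ENABLE_MYMA") != "1":
            return None                 # integration disabled: do nothing
        return fn(*args, **kwargs)
    return wrapper

@myma_gated
def push_notebook(spec_id: str) -> str:
    # Hypothetical integration call; the real one would talk to myma's API.
    return f"pushed {spec_id}"
```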

Database Setup

Create the database with pgvector, then apply migrations:

# Create database (if it doesn't exist)
createdb -h <host> -U postgres chief_of_staff
psql -h <host> -U postgres -d chief_of_staff -c 'CREATE EXTENSION IF NOT EXISTS vector;'

# Apply migrations in order
psql -h <host> -U postgres -d chief_of_staff -f migrations/001_swarm_and_pipeline.sql
psql -h <host> -U postgres -d chief_of_staff -f migrations/002_meeting_assistant_sync.sql

Database Schema

```mermaid
erDiagram
    ops_bosses {
        uuid id PK
        text name
        text model
        text working_dir
        timestamp last_seen
        text status
    }
    ops_agenda_items {
        uuid id PK
        text title
        text body
        text urgency
        text category
        timestamp resolved_at
    }
    ops_agent_tasks {
        uuid id PK
        text agent_type
        text context_ref
        jsonb context_raw
        int priority
        text status
    }
    rag_memories {
        uuid id PK
        text mem_type
        text title
        text body
        vector embedding
        int importance
    }
    rag_chunks {
        uuid id PK
        uuid doc_id FK
        text content
        vector embedding
    }
    rag_documents {
        uuid id PK
        text title
        text source
        text doc_type
    }
    sessions_conversations {
        uuid id PK
        text project_dir
        text summary
    }
    sessions_turns {
        uuid id PK
        uuid conversation_id FK
        text role
        text content
        int turn_index
    }

    rag_documents ||--o{ rag_chunks : "chunked into"
    sessions_conversations ||--o{ sessions_turns : "contains"
    ops_bosses ||--o{ ops_agent_tasks : "spawns"
```

The full schema spans five Postgres schemas:

| Schema | Purpose |
|---|---|
| principal | User profile, preferences, sensitivities, values, habits |
| people | Contacts, relationships, interaction history |
| ops | Bosses, agenda, agent tasks, triggers, specs, LLM nodes, routing log |
| rag | Memories, documents, chunks, facts (all with vector embeddings) |
| sessions | Conversation transcripts, turns, breadcrumbs |

Pull Required Ollama Models

# Required: embedding model
ollama pull nomic-embed-text

# Recommended: local inference models
ollama pull qwen2.5:7b         # triage, classification (fast)
ollama pull qwen2.5:14b        # summarization
ollama pull qwen2.5-coder:14b  # code generation

See COSWORK.md for the full cost/quality analysis of local vs. cloud models.
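To verify the required models are actually installed, you can compare the output of Ollama's GET /api/tags endpoint against the list above. The helper below is a sketch (not part of the cos package); the tag-normalization is deliberately simple.

```python
import json
import urllib.request

REQUIRED = {"nomic-embed-text", "qwen2.5:7b", "qwen2.5:14b", "qwen2.5-coder:14b"}

def missing_models(tags: dict, required=REQUIRED) -> set:
    """Given the JSON body of Ollama's GET /api/tags, return the required
    models that are not installed. Treats 'name:latest' as equal to 'name'."""
    installed = {m["name"].removesuffix(":latest") for m in tags.get("models", [])}
    return {r for r in required if r not in installed}

def check_ollama(base="http://127.0.0.1:11434") -> set:
    # Requires a running Ollama instance at `base`.
    with urllib.request.urlopen(f"{base}/api/tags") as resp:
        return missing_models(json.load(resp))
```

An empty set from `check_ollama()` means the daemon's local-inference paths should all work.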

Configuration

All configuration is via environment variables. Set them in your shell, a .env file (loaded by docker-compose), or export before running.

# Example .env file
COS_DB_HOST=db.example.internal
COS_DB_PORT=5432
COS_DB_NAME=chief_of_staff
COS_DB_USER=postgres
COS_DB_PASS=
OLLAMA_BASE=http://127.0.0.1:11434
COS_EMBED_MODEL=nomic-embed-text
COS_NOTIFY_BACKEND=log
COS_AUTO_SPAWN=true
COS_DASHBOARD_PORT=7432
| Variable | Default | Description |
|---|---|---|
| COS_DB_HOST | db.example.internal | PostgreSQL host (use `postgres` when running via the bundled docker-compose.yml, since that is the compose service name) |
| COS_DB_PORT | 5432 | PostgreSQL port |
| COS_DB_NAME | chief_of_staff | Database name |
| COS_DB_USER | postgres | Database user |
| COS_DB_PASS | (empty) | Database password |
| OLLAMA_BASE | http://127.0.0.1:11434 | Ollama API base URL |
| COS_EMBED_MODEL | nomic-embed-text | Embedding model name |
| COS_NOTIFY_BACKEND | macos (Darwin) / log (Linux) | Notification backend(s), comma-separated |
| COS_AUTO_SPAWN | true | Auto-spawn agents from the task queue |
| COS_DASHBOARD_PORT | 7432 | Web dashboard port |
| COS_MEETING_NOTES_DIR | /data/meeting-notes | Meeting notes directory |
| COS_MEETING_DB | /data/myma/meetings.db | Meeting assistant SQLite DB |
| COS_OLLAMA_NODES | (empty) | Extra Ollama node URLs (comma-separated) |
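The resolution order (environment variable, else the documented default) can be expressed as a small settings object. This mirrors a few rows of the table above for illustration; it is not the actual cos configuration module.

```python
import os
from dataclasses import dataclass, field

@dataclass
class CosConfig:
    """Environment-driven settings, defaults as documented above (sketch)."""
    db_host: str = field(default_factory=lambda: os.environ.get("COS_DB_HOST", "db.example.internal"))
    db_port: int = field(default_factory=lambda: int(os.environ.get("COS_DB_PORT", "5432")))
    ollama_base: str = field(default_factory=lambda: os.environ.get("OLLAMA_BASE", "http://127.0.0.1:11434"))
    auto_spawn: bool = field(default_factory=lambda: os.environ.get("COS_AUTO_SPAWN", "true").lower() == "true")
```

Because the values are read at construction time, each process picks up whatever its shell or .env file exported.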

CLI Reference

cos                     Show status (default when no subcommand)
cos status              Dashboard overview: bosses, agenda, tasks, LLM nodes
cos brief               Full morning briefing
cos ls [--all] [--cat]  List agenda items grouped by urgency
cos add <title>         Add agenda item
cos done <item_id>      Resolve an agenda item
cos recall <query>      Semantic search across all memory
cos swarm               Show all registered bosses
cos tasks [--all]       Show agent task queue
cos specs [id]          List or view specs
cos routing             Show LLM routing stats
cos sessions            List harvested sessions
cos dashboard           Open web dashboard in browser
cos daemon              Start the background daemon (foreground)

CLI Examples

# Quick status check
cos status

# Add work items at different urgency levels
cos add "Review PR #42" --urgency today --category code-review
cos add "Investigate flaky test" --urgency this_week --body "test_auth_flow fails intermittently"
cos add "Upgrade dependencies" --urgency later

# Search memory semantically
cos recall "authentication middleware changes"
cos recall "deployment process" -n 10

# Mark something done
cos done 3fa8b2c1-...

# Morning briefing
cos brief

# See who's active in the swarm
cos swarm

# Check agent task queue
cos tasks
cos tasks --all    # include completed

# LLM routing performance
cos routing

# Open the web dashboard
cos dashboard

# Start daemon in foreground (for debugging)
cos daemon

Scenario: Triaging your morning

You sit down and want to know what happened overnight:

# What's the system state?
cos status

# Full briefing with priorities, agenda, and swarm activity
cos brief

# Anything about that deploy discussion yesterday?
cos recall "production deploy timeline"

# See what agents ran overnight
cos tasks --all

# Check if any specs were generated from yesterday's architecture chat
cos specs --project my-backend

Scenario: Capturing work during a session

You're midway through a session and want to note things for later:

# Quick capture -- will show up in your next briefing
cos add "Revisit caching strategy for /api/users" --urgency soon

# High-priority item that should be addressed today
cos add "Fix broken migration on staging" --urgency now --category fix

# Done with the migration fix
cos done <item_id>

Makefile Shortcuts (Docker)

make build            # docker compose build
make up               # start cos-daemon (background)
make down             # stop all services
make restart          # restart cos-daemon
make logs             # tail daemon logs
make cli CMD="status" # run any CLI subcommand in container
make status           # shortcut for cos status
make agenda           # shortcut for cos agenda
make routing          # shortcut for cos routing

Python API

The package exports a flat public API via import cos. Every function listed below works from any Python process -- a script, a REPL, a Jupyter notebook, or inside a running LLM agent session.

Memory & RAG

CoS provides three kinds of semantic storage, all searchable through a single recall() call:

  • Memories -- Typed notes (project, feedback, user, reference) with an importance score. Good for capturing context, decisions, and observations.
  • Facts -- Subject-predicate-object triples with a confidence score. Good for structured knowledge ("auth-middleware is-blocked-by legal-review").
  • Document chunks -- Long-form text auto-split into 400-word overlapping chunks. Good for specs, meeting notes, architecture docs.

```mermaid
flowchart LR
    INPUT["Text input"] --> CHUNK["Chunking<br/>(400 words, 50% overlap)"]
    CHUNK --> EMBED["Embedding<br/>(nomic-embed-text)"]
    EMBED --> STORE["pgvector storage"]
    QUERY["Search query"] --> QEMBED["Embed query"]
    QEMBED --> SIM["Cosine similarity<br/>search"]
    STORE --> SIM
    SIM --> RESULTS["Ranked results<br/>(memories + facts + chunks)"]
```
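The chunking step above (400-word windows, 50% overlap) can be sketched as a plain word-window splitter. The parameters come from the text; the actual implementation inside the package may differ in details such as boundary handling.

```python
def chunk_words(text: str, size: int = 400, overlap: float = 0.5) -> list[str]:
    """Split text into windows of `size` words with fractional overlap.

    With size=400 and overlap=0.5 each chunk shares 200 words with its
    neighbor, so content near a boundary appears in two chunks and stays
    retrievable by similarity search.
    """
    words = text.split()
    step = max(1, int(size * (1 - overlap)))
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):   # final window reached the end
            break
    return chunks
```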

Python:

import cos

# Store a memory (auto-embedded)
cos.remember("project", "Auth rewrite",
             "Rewriting auth middleware for compliance.",
             importance=8, domain="security")

# Semantic search
results = cos.recall("authentication changes", n=5)
for r in results:
    print(r["title"], r["similarity"])

# Store a structured fact (subject-predicate-object triple)
cos.add_fact("auth-middleware", "blocked_by", "legal-review", confidence=0.9)

# Ingest a document (auto-chunked, auto-embedded)
cos.ingest_document("Architecture Decision Record",
                    open("adr-003.md").read(),
                    source="adr-003.md", doc_type="adr")

# Sync Claude Code local memory files (~/.claude/projects/*/memory/*.md) to DB
cos.sync_local_memories()

Shell (via python -c or interactive):

# Semantic recall from the command line
cos recall "authentication middleware changes"

# Quick recall from Python one-liner
python -c "import cos; [print(r['title'], round(r['similarity'],2)) for r in cos.recall('auth changes')]"

# Store a memory
python -c "import cos; cos.remember('project', 'Deploy v2.1', 'Deployed to prod successfully', importance=5)"

# Ingest a file
python -c "
import cos
cos.ingest_document('Meeting Notes', open('notes.md').read(), source='notes.md', doc_type='meeting')
"

Scenario: Recovering context from last week

You're picking up a project after a week away and don't remember the details:

# What do we know about this project's auth work?
cos recall "auth middleware refactor"

# Were any decisions made?
cos recall "decision auth"

# Any specs generated?
cos specs --project my-backend
# Or programmatically in a script
import cos
results = cos.recall("auth middleware", n=10)
for r in results:
    print(f"[{r['source']}] {r['title']} (similarity: {r['similarity']:.2f})")
    print(f"  {r['body'][:120]}...")

Scenario: Building a knowledge base from documents

You have a folder of architecture decision records you want to make searchable:

# Ingest all markdown files in a directory
for f in docs/adr/*.md; do
  python -c "
import cos
title = '$(basename "$f" .md)'
content = open('$f').read()
cos.ingest_document(title, content, source='$f', doc_type='adr')
print(f'Ingested: {title}')
"
done

# Now they're all searchable
cos recall "database migration strategy"

Agenda

The agenda is a persistent work queue that survives across sessions. Items have an urgency level that determines their sort order and visual treatment in the CLI and dashboard.

Urgency levels, lowest to highest:

```mermaid
graph LR
    L["later"] --> S["soon"] --> TW["this_week"] --> T["today"] --> N["now"]
    style L fill:#555,color:#fff
    style S fill:#666,color:#fff
    style TW fill:#0891b2,color:#fff
    style T fill:#ca8a04,color:#fff
    style N fill:#dc2626,color:#fff
```

Items marked now or today appear in ~/.cos/hot_items.md and are shown at the top of every new agent session. Items are automatically deduplicated -- if you add an item that's 85% similar to an existing one, the duplicate is silently dropped.
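The 85% threshold presumably compares embedding vectors; a minimal cosine-similarity version of the dedup rule looks like this. It is illustrative only — in CoS the embeddings live in pgvector and the comparison happens server-side.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def is_duplicate(new_vec, existing_vecs, threshold=0.85):
    """Mirror of the agenda dedup rule: drop the new item if any existing
    embedding is at least `threshold` similar."""
    return any(cosine_similarity(new_vec, v) >= threshold for v in existing_vecs)
```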

Python:

import cos

cos.add_agenda("Deploy v2.1", body="Staging passed, ready for prod",
               urgency="today", category="deploy")

items = cos.get_agenda()                      # unresolved, sorted by urgency
all_items = cos.get_agenda(include_all=True)   # include resolved

cos.resolve_agenda(item_id)

Shell:

# Add items
cos add "Deploy v2.1" --urgency today --category deploy --body "Staging passed"
cos add "Write tests for auth module" --urgency this_week

# List items
cos ls                # open items grouped by urgency
cos ls --all          # include resolved
cos ls --cat deploy   # filter by category

# Resolve
cos done <item_id>

Scenario: Managing a release checklist

# Build up the checklist
cos add "Run full test suite on staging" --urgency today --category release
cos add "Update CHANGELOG for v2.1" --urgency today --category release
cos add "Tag release v2.1" --urgency today --category release
cos add "Deploy to production" --urgency today --category release
cos add "Post-deploy smoke tests" --urgency today --category release

# View just the release items
cos ls --cat release

# Mark them off as you go
cos done <test_item_id>
cos done <changelog_item_id>

Swarm Coordination

The swarm system lets multiple agents see each other and communicate. Every agent that calls register_boss() becomes visible in the swarm registry. Agents that stop heartbeating for 15 minutes are automatically marked as gone by the SwarmWorker.

This is especially useful when you have agents working on related projects -- a frontend agent can check whether the backend agent is still active before making assumptions about API contracts.

```mermaid
sequenceDiagram
    participant B1 as Boss A (Claude Code)
    participant DB as PostgreSQL
    participant B2 as Boss B (Ollama)

    B1->>DB: register_boss()
    B2->>DB: register_boss()
    B1->>DB: heartbeat() (every few min)
    B2->>DB: heartbeat()
    B1->>DB: log_activity("Refactoring auth")
    B1->>DB: send_message("Need review", to=B2)
    B2->>DB: check_messages()
    DB-->>B2: [message from B1]
    B1->>DB: spawn_minion("Run tests")
```

Python:

import cos

# Register this agent as a boss
boss_id = cos.register_boss("my-project", "claude-sonnet-4-6",
                             working_dir="/path/to/project")

# Heartbeat (call periodically to stay "active")
cos.heartbeat()

# Log what you're doing
cos.log_activity("Refactoring auth module", event_type="code")

# See who else is running
bosses = cos.get_active_bosses(stale_minutes=10)

# Message another boss (or broadcast to all)
cos.send_message("Need help", "Can someone review auth changes?")
cos.send_message("Direct msg", "Check the tests", to_boss_id=other_boss_id)

# Check inbox
messages = cos.check_messages(unread_only=True)
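Since an agent stays "active" only while it keeps heartbeating, long-running bosses typically call heartbeat() on a timer. A generic loop like the one below can run on a background thread; in practice you would pass cos.heartbeat as `beat` (the loop itself is a sketch, not part of the package).

```python
import threading

def run_heartbeat(beat, interval: float, stop: threading.Event) -> None:
    """Call `beat()` every `interval` seconds until `stop` is set.

    `stop.wait(interval)` sleeps between beats but wakes immediately when
    the event is set, so shutdown is prompt.
    """
    while not stop.is_set():
        beat()
        stop.wait(interval)

# Usage sketch (every 2 minutes, well inside the 15-minute staleness window):
# stop = threading.Event()
# threading.Thread(target=run_heartbeat, args=(cos.heartbeat, 120, stop),
#                  daemon=True).start()
```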

Shell:

# See active bosses
cos swarm

# Register + interact via Python one-liners
python -c "import cos; bid = cos.register_boss('cli-task', 'manual'); print(bid)"
python -c "import cos; print(cos.get_active_bosses())"
python -c "import cos; cos.send_message('Status update', 'Auth module complete')"

Scenario: Coordinating agents across projects

You have a frontend and backend agent running simultaneously. The backend agent finishes an API change:

# In the backend agent session
import cos
cos.log_activity("Completed gRPC migration for /api/users", event_type="code")
cos.send_message("API change", "Switched /api/users from REST to gRPC. Proto files in shared/proto/.")
cos.add_agenda("Update frontend API client for gRPC", urgency="today",
               category="build", responsible_boss="/path/to/frontend")

The next time the frontend agent starts, it sees the message and the agenda item automatically.

Agent Tasks

The agent task queue is how the daemon dispatches work to specialist agents. Tasks are claimed atomically (FOR UPDATE SKIP LOCKED) so multiple daemon instances won't double-process. Each task has a type (which agent handles it), a priority (1-10, higher = more urgent), and a context payload.
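
The claim step boils down to a single UPDATE that pops one pending row. The query below is a sketch of that pattern only -- the table and column names are assumptions, not the actual schema; the point is the FOR UPDATE SKIP LOCKED shape, which guarantees concurrent workers never grab the same row:

```python
# Sketch of the atomic-claim pattern described above. Column names are
# assumptions; the locking shape is what matters.
CLAIM_SQL = """
UPDATE ops.agent_tasks
   SET status = 'running', claimed_at = now()
 WHERE id = (
       SELECT id
         FROM ops.agent_tasks
        WHERE status = 'pending'
          AND agent_type = %(agent_type)s
        ORDER BY priority DESC, created_at
          FOR UPDATE SKIP LOCKED
        LIMIT 1)
RETURNING id, agent_type, context_ref, context_raw;
"""

def claim_query() -> str:
    """Return the parameterized claim statement (run inside a transaction)."""
    return CLAIM_SQL
```

A second daemon running the same statement concurrently simply skips the locked row and claims the next pending task.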

flowchart LR
    Q["queue_agent_task()"] --> QUEUE["ops.agent_tasks<br/>(pending)"]
    QUEUE --> CLAIM["claim_agent_task()<br/>(atomic pop)"]
    CLAIM --> RUN["Specialist Agent<br/>runs"]
    RUN --> DONE["complete_agent_task()<br/>(result stored)"]

Python:

import cos

# Queue a specialist agent
cos.queue_agent_task("TranscriptAnomalyExpert",
                     context_ref="session-abc-123",
                     context_raw={"transcript": "..."},
                     priority=5)

# Claim a task (used by daemon/workers)
task = cos.claim_agent_task(agent_type="TranscriptAnomalyExpert")

# Complete a task
cos.complete_agent_task(task["id"], result={"findings": [...]})

Shell:

# View the agent task queue
cos tasks
cos tasks --all   # include completed/failed

# Queue a task from the command line
python -c "
import cos
cos.queue_agent_task('ActionItemExtractor',
    context_ref='session-id-here',
    context_raw={'source': 'manual'},
    priority=8)
print('Task queued')
"

Scenario: Re-processing a past session

You realize an important meeting transcript wasn't fully analyzed:

# Find the session ID
cos sessions

# Manually queue the agents you want to run against it
python -c "
import cos
sid = 'abc123-session-id'
cos.queue_agent_task('DecisionRecorder', context_ref=sid, priority=8)
cos.queue_agent_task('ActionItemExtractor', context_ref=sid, priority=8)
cos.queue_agent_task('VisionKeeperExpert', context_ref=sid, priority=5)
print('Queued 3 agents')
"

# Watch them process
cos tasks

LLM Routing

The router is CoS's inference layer. Rather than hardcoding which model to use, it maintains a registry of available Ollama nodes and their models, scores them based on historical performance (success rate, latency, quality), and picks the best match for each task type.
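
As an illustration of that selection step, a scoring blend over those dimensions might look like the following. The weights and formula here are assumptions for illustration, not the router's actual math:

```python
def score_candidate(model_match: bool, success_rate: float,
                    avg_latency_s: float, avg_quality: float) -> float:
    """Illustrative blend of the routing signals described above.
    All weights are made up for this sketch."""
    score = 0.0
    score += 3.0 if model_match else 0.0       # preferred model for the task type
    score += 4.0 * success_rate                # fraction of past calls that succeeded
    score += 2.0 * (avg_quality / 10.0)        # mean quality score (0-10 scale)
    score -= min(avg_latency_s / 10.0, 1.0)    # mild penalty for slow nodes
    return score

# A well-matched, reliable, fast node outscores a poor match with spotty history.
fast = score_candidate(True, 1.0, 0.0, 10.0)
slow = score_candidate(False, 0.5, 30.0, 2.0)
```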

The quality feedback loop is key: every time an agent calls _llm_json(), the response is automatically scored (parseable JSON = 8.0, unparseable = 2.0). Over time, the router learns which models actually work well for each task and adjusts its scoring. You can also provide manual feedback via rate_response().
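
The automatic scoring rule is simple enough to sketch in a few lines (a restatement of the rule above, not the actual implementation inside _llm_json()):

```python
import json

def auto_score(response_text: str) -> float:
    """Score an LLM response per the rule above:
    parseable JSON earns 8.0, anything else 2.0."""
    try:
        json.loads(response_text)
        return 8.0
    except (json.JSONDecodeError, TypeError):
        return 2.0
```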

flowchart TB
    REQ["Task request<br/>(e.g. summarize)"] --> ROUTER["Router"]
    ROUTER --> SCORE["Score nodes by:<br/>- model match<br/>- historical success rate<br/>- avg latency<br/>- avg quality"]
    SCORE --> PICK["Best (node, model)"]
    PICK --> CALL["Ollama API call"]
    CALL --> RESP["Response"]
    RESP --> LOG["Log to ops.llm_routing_log"]
    RESP -.-> FB["Quality feedback<br/>(rate_response)"]
    FB -.-> SCORE

    style ROUTER fill:#4a4a6a,color:#fff

Task types and their preferred model tiers:

| Task Type | Best Models | Fallback |
| --- | --- | --- |
| embed | nomic-embed-text | -- |
| triage | qwen3:8b, llama3, mistral | phi3 |
| extract | qwen3:8b, qwen2.5:14b | llama3 |
| summarize | qwen3:8b, qwen2.5:14b | mistral |
| analyze | qwq:32b, deepseek-r1:32b, qwen3:30b | llama3 |
| spec | qwq:32b, qwen3:30b, deepseek-r1:32b | llama3 |
| code | qwen3-coder:30b, devstral, codestral | llama3 |

Python:

import cos

# Route a task to the best available model
base_url, model = cos.route("summarize")

# Use the routed LLM directly (returns text + log_id)
text, log_id = cos.routed_llm("Summarize this document...",
                               task_type="summarize",
                               system="You are a summarizer.")

# Provide quality feedback (improves future routing)
cos.rate_response(log_id, quality_score=8)

# Refresh node discovery
cos.refresh_nodes()
nodes = cos.get_active_nodes()

Shell:

# View routing stats
cos routing

# Refresh node discovery manually
python -c "import cos; cos.refresh_nodes(); print(cos.get_active_nodes())"

# Quick routed LLM call
python -c "
import cos
text, lid = cos.routed_llm('What is 2+2?', task_type='triage')
print(text)
"

Scenario: Adding a new Ollama node

You set up Ollama on a second machine and want CoS to use it:

# Option 1: Add via environment variable (persists in .env)
echo 'COS_OLLAMA_NODES=http://192.168.1.50:11434' >> .env

# Option 2: Add to the database directly
psql -h <host> -U postgres -d chief_of_staff -c \
  "INSERT INTO ops.llm_nodes (base_url, label, priority) VALUES ('http://192.168.1.50:11434', 'workstation', 5);"

# Force a refresh
python -c "import cos; cos.refresh_nodes()"

# Verify it's discovered
cos routing

Standing Orders & Context

Standing orders are persistent instructions stored in the database that apply to every boss regardless of model. They encode your preferences and operating rules -- things like "prefer short responses", "no emoji", or "always confirm before destructive actions". Every boss loads them at session start.

Python:

import cos

orders = cos.get_standing_orders(scope="global")
context = cos.get_principal_context()   # full user profile
cos.briefing()                          # print formatted briefing

Shell:

# Morning briefing
cos brief

# Dump standing orders
python -c "import cos; [print(o['title']) for o in cos.get_standing_orders()]"

# View your full principal context
python -c "
import cos
ctx = cos.get_principal_context()
for section, content in ctx.items():
    print(f'=== {section} ===')
    print(content[:200])
    print()
"

Daemon

The daemon (cos-daemon) runs background worker threads that keep the system alive. Each worker runs in its own thread, on its own interval, with independent error handling -- a failure in one worker doesn't affect the others.
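
That isolation can be sketched as a loop that catches its own exceptions and sleeps on a shared stop event. This is a sketch of the pattern, not the daemon's actual code:

```python
import logging
import threading
import time

def run_worker(name, interval, tick, stop):
    """Generic worker loop: run tick() every `interval` seconds, log (but
    never propagate) exceptions, and exit when `stop` is set."""
    while not stop.is_set():
        try:
            tick()
        except Exception:
            logging.exception("%s tick failed; continuing", name)
        stop.wait(interval)

# Demo: a worker whose ticks alternate between failing and succeeding
# keeps running either way.
ticks = []
def flaky_tick():
    ticks.append(len(ticks))
    if len(ticks) % 2:
        raise RuntimeError("simulated failure")

stop = threading.Event()
t = threading.Thread(target=run_worker, args=("demo", 0.01, flaky_tick, stop),
                     daemon=True)
t.start()
deadline = time.time() + 2.0
while len(ticks) < 4 and time.time() < deadline:
    time.sleep(0.01)
stop.set()
t.join()
```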

flowchart TB
    MAIN["cos-daemon main()"] --> SH & MS & SW & CU & LD & BW

    SH["SessionHarvestWorker<br/>every 120s"]
    MS["MemorySyncWorker<br/>every 300s"]
    SW["SwarmWorker<br/>every 60s"]
    CU["ContextUpdateWorker<br/>every 120s"]
    LD["LLMDiscoveryWorker<br/>every 300s"]
    BW["BossWakeupWorker<br/>every 180s"]

    SH -->|"scans"| JSONL["~/.claude/**/*.jsonl"]
    SH -->|"archives to"| DB["PostgreSQL"]
    SH -->|"skims via"| OL["Ollama"]
    SH -->|"queues"| TASKS["ops.agent_tasks"]

    MS -->|"syncs"| MEM["~/.claude/**/memory/*.md"]
    MS -->|"embeds into"| DB

    SW -->|"writes"| SS["~/.cos/swarm_status.md"]
    CU -->|"writes"| HI["~/.cos/hot_items.md"]
    CU -->|"writes"| SJ["~/.cos/status.json"]
    LD -->|"probes"| OL

    style MAIN fill:#4a4a6a,color:#fff

| Worker | Interval | What it does |
| --- | --- | --- |
| SessionHarvestWorker | 120s | Scans Claude Code session JSONL files, archives transcripts, skims for significance via local LLM, queues specialist agents. Tracks processed files in ~/.cos/harvested_sessions.json to avoid re-work. |
| MemorySyncWorker | 300s | Syncs ~/.claude/projects/*/memory/*.md into the RAG database with embeddings. Watches file mtimes to detect changes and re-embed updated files. |
| SwarmWorker | 60s | Marks stale bosses (no heartbeat in 15 min) as gone, writes ~/.cos/swarm_status.md with a human-readable summary of who's active and what they're doing. |
| ContextUpdateWorker | 120s | Writes ~/.cos/hot_items.md (urgent agenda + recent completions/failures) and ~/.cos/status.json. These files are read by new agent sessions at bootstrap. |
| LLMDiscoveryWorker | 300s | Probes Ollama nodes from DB + COS_OLLAMA_NODES, updates availability and model lists. Detects when nodes go offline or new models are pulled. |
| BossWakeupWorker | 180s | Checks for agenda items assigned to a specific boss that has gone inactive. Sends a notification to remind you to restart that agent. |

Running the Daemon on Multiple Hosts

The daemon is safe to run on multiple hosts simultaneously. Each instance harvests sessions and syncs memories from its own ~/.claude/ directory, but they all write to the same database. This means Claude Code sessions on your laptop get harvested by the laptop's daemon, and sessions on the server get harvested by the server's daemon.

flowchart LR
    subgraph Server
        DS["cos-daemon"] -->|"harvests"| SS["~/.claude/<br/>(server sessions)"]
        DS -->|"writes"| DB["PostgreSQL"]
    end
    subgraph Laptop
        DL["cos-daemon"] -->|"harvests"| SL["~/.claude/<br/>(laptop sessions)"]
        DL -->|"writes"| DB
    end
    DS -->|"writes"| HS["~/.cos/status.json<br/>hot_items.md<br/>(server)"]
    DL -->|"writes"| HL["~/.cos/status.json<br/>hot_items.md<br/>(laptop)"]

Starting the Daemon

Linux server (Docker)

# Background via Docker (logs via `make logs`)
make up

# View logs
make logs

# Restart after code changes
make restart

Linux server (foreground, for debugging)

cos-daemon
# or
cos daemon

macOS client (LaunchAgent)

On macOS, set up a LaunchAgent so the daemon starts automatically and restarts on crash:

cat > ~/Library/LaunchAgents/com.cos.daemon.plist << 'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.cos.daemon</string>

    <key>ProgramArguments</key>
    <array>
        <string>/Users/YOUR_USER/.cos/venv/bin/cos-daemon</string>
    </array>

    <key>EnvironmentVariables</key>
    <dict>
        <key>OLLAMA_BASE</key>
        <string>http://127.0.0.1:11434</string>
        <key>COS_DB_HOST</key>
        <string>db.example.internal</string>
        <key>COS_NOTIFY_BACKEND</key>
        <string>macos</string>
        <key>PATH</key>
        <string>/Users/YOUR_USER/.cos/venv/bin:/usr/local/bin:/usr/bin:/bin</string>
    </dict>

    <key>RunAtLoad</key>
    <true/>

    <key>KeepAlive</key>
    <true/>

    <key>StandardOutPath</key>
    <string>/Users/YOUR_USER/.cos/daemon.log</string>

    <key>StandardErrorPath</key>
    <string>/Users/YOUR_USER/.cos/daemon.err</string>

    <key>ThrottleInterval</key>
    <integer>30</integer>
</dict>
</plist>
EOF

# Load it (starts immediately)
launchctl load ~/Library/LaunchAgents/com.cos.daemon.plist

# Verify it's running
launchctl list | grep cos

Managing the LaunchAgent:

# Check status (PID and exit code)
launchctl list | grep cos

# Stop
launchctl unload ~/Library/LaunchAgents/com.cos.daemon.plist

# Restart (unload + load)
launchctl unload ~/Library/LaunchAgents/com.cos.daemon.plist
launchctl load ~/Library/LaunchAgents/com.cos.daemon.plist

# View logs
tail -f ~/.cos/daemon.log
tail -f ~/.cos/daemon.err

If Ollama isn't running on the client, the LLMDiscoveryWorker will log a connection error but the daemon continues running normally. If you later start Ollama on the client, the worker will automatically discover it on the next probe cycle (every 300s).

Scenario: Debugging the daemon

Something isn't being processed and you want to see what's happening:

# Run in foreground with full logs
cos daemon

# In another terminal, check what's been harvested
cat ~/.cos/harvested_sessions.json | python -m json.tool | tail -20

# Check for agent task failures
cos tasks --all

# Look at what hot_items the daemon is generating
cat ~/.cos/hot_items.md

Web Dashboard

The dashboard is a lightweight web UI on port 7432 -- pure HTML/CSS/JS served by Python's built-in HTTP server, with zero external dependencies.

# Open in browser
cos dashboard

# Or access directly
# http://localhost:7432

Features:

  • View and filter agenda items by urgency
  • Add and resolve agenda items
  • Monitor agent task queue
  • Send messages to bosses
  • Spawn agents from pre-configured templates
  • Dark/light theme toggle
  • Collapsible sections and toast notifications

Spawn Templates

The dashboard includes pre-configured agent templates for common tasks. Each template has a prompt and a budget:

| Template | What it does |
| --- | --- |
| code_review | Reviews git diff against main, flags bugs/security/missing tests |
| fix_tests | Runs test suite, fixes failures without changing test logic |
| lint_fix | Runs linter/type checker, fixes style issues |
| security_audit | Audits for hardcoded secrets, injection risks, insecure deps |
| write_readme | Writes/updates README based on actual code |
| write_docs | Adds docstrings to undocumented public functions |
| changelog | Generates CHANGELOG from git log |
| cos_status | Generates a CoS system health report |
| cos_agenda | Reviews and triages the agenda |
| research_deps | Researches dependency updates and security advisories |
| perf_profile | Profiles the codebase for performance bottlenecks |

Claude Code Integration

CoS bootstraps automatically in Claude Code sessions via ~/.claude/CLAUDE.md. This means every Claude Code session is CoS-aware without any manual setup -- on any machine where the cos package is installed.

flowchart TB
    START["Claude Code session starts"] --> DETECT["Detect Python<br/>/srv/apps/cos/.venv/bin/python (Linux)<br/>~/.cos/venv/bin/python (macOS)"]
    DETECT --> CHECK["Exclusion check<br/>(~/.cos/exclude_dirs.txt)"]
    CHECK -->|excluded| SKIP["Skip CoS"]
    CHECK -->|active| REG["register_boss()"]
    REG --> MSG["check_messages()"]
    REG --> AGENDA["get_agenda() (urgent)"]
    MSG --> LOAD["Read ~/.cos/hot_items.md<br/>Read ~/.cos/swarm_status.md"]
    AGENDA --> LOAD
    LOAD --> READY["Session ready<br/>(urgent items shown first)"]

Multi-platform Python detection

The bootstrap in ~/.claude/CLAUDE.md automatically finds the right Python venv on each platform:

COS_PYTHON="$([ -x /srv/apps/cos/.venv/bin/python ] && echo /srv/apps/cos/.venv/bin/python || echo $HOME/.cos/venv/bin/python)"

This checks for the server venv at /srv/apps/cos/.venv/bin/python first (Linux). If that doesn't exist, it falls back to ~/.cos/venv/bin/python (macOS client install). Since ~/.claude/CLAUDE.md syncs across machines, this single file works on both without modification.

Bootstrap sequence

  1. Python detection -- Finds the cos venv (server path first, then client path).
  2. Exclusion check -- Reads ~/.cos/exclude_dirs.txt. If the current working directory matches any listed path, CoS is skipped entirely. This is useful for third-party repos or sensitive projects.
  3. Registration -- Calls register_boss() to announce this session to the swarm.
  4. Message check -- Reads any unread messages from other bosses or agents.
  5. Agenda check -- Pulls urgent items (now, today) to display immediately.
  6. Context load -- Reads ~/.cos/hot_items.md (recent completions, failures, urgent work) and ~/.cos/swarm_status.md (what other bosses are doing).

Statusline

The Claude Code statusline shows CoS:<bosses> <urgent>! -- for example, CoS:2 3! means 2 active bosses and 3 urgent agenda items. This is read from ~/.cos/status.json, which the daemon's ContextUpdateWorker writes every 120 seconds.

The status.json file contains:

{
  "active_bosses": 2,
  "urgent_items": 3,
  "approved_proposals": 0,
  "updated_at": "2026-04-03T06:30:04.323141+00:00"
}

Since each machine runs its own daemon, the statusline stays current on every host. If the daemon isn't running, the statusline will show stale data from the last time it was updated.
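
Rendering the segment from that file is only a few lines. The sketch below mirrors the example format above; the actual statusline script's logic may differ (e.g. it also colors the segment red when urgent items exist):

```python
import json

def render_cos_segment(status_json: str) -> str:
    """Build the `CoS:<bosses> <urgent>!` segment from status.json contents.
    Field names match the documented status.json example."""
    status = json.loads(status_json)
    segment = f"CoS:{status['active_bosses']}"
    if status.get("urgent_items"):
        segment += f" {status['urgent_items']}!"
    return segment

print(render_cos_segment('{"active_bosses": 2, "urgent_items": 3}'))  # CoS:2 3!
```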

Excluding directories

To exclude a project directory from CoS:

# Add the absolute path to the exclusion file (one per line)
echo "/path/to/excluded/project" >> ~/.cos/exclude_dirs.txt

# You can also add comments
echo "# Third-party repos" >> ~/.cos/exclude_dirs.txt
echo "/home/user/vendor/some-lib" >> ~/.cos/exclude_dirs.txt

Claude Code Settings

CoS requires the following Claude Code configuration in ~/.claude/settings.json. These entries use platform-agnostic paths so a single settings file works on both Linux and macOS:

UserPromptSubmit hook -- Fires on every user message. Heartbeats the boss, checks for hot items changes, and injects project-scoped context (agenda items, recent sessions, relevant memories) on session start.

{
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "COS_PYTHON=\"$([ -x /srv/apps/cos/.venv/bin/python ] && echo /srv/apps/cos/.venv/bin/python || echo $HOME/.cos/venv/bin/python)\"; \"$COS_PYTHON\" $HOME/.cos/prompt_submit.py"
          }
        ]
      }
    ]
  }
}

Statusline -- Displays CoS:<active_bosses> <urgent_items>! in the Claude Code status bar by reading ~/.cos/status.json.

{
  "statusLine": {
    "type": "command",
    "command": "zsh $HOME/.claude/statusline-command.sh"
  }
}

The statusline script (~/.claude/statusline-command.sh) reads JSON from stdin (provided by Claude Code with workspace, model, and context info) and appends the CoS segment from ~/.cos/status.json. The segment turns red when there are urgent items.

Key files:

| File | Location | Purpose |
| --- | --- | --- |
| statusline-command.sh | ~/.claude/ | Statusline rendering script (zsh) |
| prompt_submit.py | ~/.cos/ | UserPromptSubmit hook (Python) |
| status.json | ~/.cos/ | Written by daemon, read by statusline |
| hot_items.md | ~/.cos/ | Written by daemon, injected by hook on change |

Prompt Submit Hook Details

The ~/.cos/prompt_submit.py hook runs on every user message and does the following:

flowchart TB
    MSG["User sends message"] --> EXCL["Check exclude_dirs.txt"]
    EXCL -->|excluded| EXIT["Exit silently"]
    EXCL -->|active| HB["register_boss() / heartbeat"]
    HB --> CHECK["Compare hot_items.md mtime<br/>vs last seen"]
    CHECK -->|changed or gap > 15min| INJECT["Print hot items<br/>to conversation"]
    CHECK -->|unchanged| SKIP["Skip injection"]
    INJECT --> CTX{"Session start?<br/>(gap > 15min)"}
    SKIP --> CTX
    CTX -->|yes| PROJECT["Inject project context:<br/>- Agenda items for this dir<br/>- Recent session summaries<br/>- Relevant memories"]
    CTX -->|no| DONE["Done"]
    PROJECT --> DONE

On session start (defined as >15 minutes since the last hook run), the hook also queries the database for project-specific context: open agenda items assigned to the current working directory, recent session summaries, and the top relevant memories from semantic search.
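
The session-start test reduces to a simple gap check. A sketch (where the hook persists its last-run timestamp is an assumption; presumably a state file under ~/.cos/):

```python
import time

SESSION_GAP_SECONDS = 15 * 60  # ">15 minutes since the last hook run"

def is_session_start(last_run_epoch, now=None):
    """True when the gap since the previous hook run exceeds 15 minutes,
    matching the session-start definition above."""
    now = time.time() if now is None else now
    return last_run_epoch is None or (now - last_run_epoch) > SESSION_GAP_SECONDS
```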

Key File Paths

graph LR
    subgraph "~/.cos/"
        HI["hot_items.md<br/><i>urgent agenda</i>"]
        SS["swarm_status.md<br/><i>active bosses</i>"]
        SJ["status.json<br/><i>statusline data</i>"]
        EX["exclude_dirs.txt<br/><i>opt-out directories</i>"]
        DL["daemon.log<br/><i>daemon stdout</i>"]
        DE["daemon.err<br/><i>daemon stderr</i>"]
        HS["harvested_sessions.json<br/><i>processed session tracker</i>"]
        VN["venv/<br/><i>client Python venv</i>"]
        WT["worktrees/<br/><i>git worktrees for agents</i>"]
        NT["notes/<br/><i>Marimo notebooks</i>"]
    end

| Path | Purpose |
| --- | --- |
| ~/.cos/hot_items.md | Urgent items injected at session start |
| ~/.cos/swarm_status.md | Active bosses summary |
| ~/.cos/status.json | Machine-readable status for statusline integrations |
| ~/.cos/exclude_dirs.txt | Project paths to skip (one per line) |
| ~/.cos/daemon.log | Daemon stdout log |
| ~/.cos/daemon.err | Daemon stderr log (macOS LaunchAgent) |
| ~/.cos/harvested_sessions.json | Tracks which session files have been processed |
| ~/.cos/venv/ | Client-install Python venv (macOS / remote hosts) |
| ~/.cos/prompt_submit.py | UserPromptSubmit hook script |
| ~/.cos/worktrees/ | Git worktrees for autonomous agent workspaces |
| ~/.cos/notes/ | Marimo notebook notes |
| ~/.claude/statusline-command.sh | Statusline rendering script |
| ~/.claude/CLAUDE.md | Bootstrap instructions (loaded by Claude Code) |
| ~/.claude/settings.json | Claude Code settings (hooks, statusline, env) |

Notification Backends

Set COS_NOTIFY_BACKEND to one or more (comma-separated):

# Single backend
export COS_NOTIFY_BACKEND=desktop

# Multiple backends
export COS_NOTIFY_BACKEND=log,desktop

| Backend | Platform | Method |
| --- | --- | --- |
| macos | macOS | osascript display notification |
| desktop | Linux | notify-send (freedesktop) |
| log | Any | Python logger.info |
| none | Any | Silent |

Notifications are sent by agents when they complete significant work -- for example, when the SpecFileExpert registers a new spec, or when the TriggerAgentDesigner creates a new agent.
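
A dispatcher honoring that comma-separated contract could look like this. It is a sketch only: the command invocations match the table above, but the real notify.py may be structured differently:

```python
import logging
import os
import shutil
import subprocess

def notify(title: str, body: str, backends=None):
    """Send a notification through each configured backend; returns the
    list of backends actually dispatched to. Sketch of the documented
    COS_NOTIFY_BACKEND behavior, not the real implementation."""
    backends = backends or os.environ.get("COS_NOTIFY_BACKEND", "log").split(",")
    dispatched = []
    for backend in (b.strip() for b in backends):
        if backend == "log":
            logging.getLogger("cos").info("%s: %s", title, body)
            dispatched.append("log")
        elif backend == "desktop" and shutil.which("notify-send"):
            subprocess.run(["notify-send", title, body], check=False)
            dispatched.append("desktop")
        elif backend == "macos" and shutil.which("osascript"):
            script = f'display notification "{body}" with title "{title}"'
            subprocess.run(["osascript", "-e", script], check=False)
            dispatched.append("macos")
        # "none" and unknown backends fall through silently
    return dispatched
```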

Testing

# Install with test dependencies
pip install -e ".[test]"

# Run tests
pytest

# Run a specific test file
pytest tests/test_router.py

# Verbose output
pytest -v

Tests use a real PostgreSQL connection with transaction rollback -- no separate test DB needed. LLM calls are mocked via fixtures in tests/conftest.py.
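
The rollback pattern is easy to sketch as a context manager (illustrative only; the actual fixtures in tests/conftest.py may be structured differently):

```python
import contextlib

@contextlib.contextmanager
def rollback_connection(connect):
    """Yield a DB connection and roll back everything on exit, so a test
    can INSERT/UPDATE freely without mutating the real database."""
    conn = connect()
    try:
        yield conn
    finally:
        conn.rollback()   # undo everything the test did
        conn.close()

# Demo with a stand-in connection object (no real database needed here).
class FakeConn:
    rolled_back = closed = False
    def rollback(self): self.rolled_back = True
    def close(self): self.closed = True

fake = FakeConn()
with rollback_connection(lambda: fake):
    pass  # a test body would exercise the schema here
```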

Project Structure

cos/
  __init__.py       Public API (import cos) -- 63 exported functions
  __main__.py       Entry point for python -m cos
  client.py         PostgreSQL client: memory, agenda, swarm, sessions, sync
  router.py         LLM routing: node discovery, task dispatch, quality tracking
  daemon.py         Background workers (session harvest, memory sync, swarm, etc.)
  agents.py         Specialist agents (transcript analysis, trigger matching, etc.)
  cli.py            Click-based CLI (cos status, cos recall, etc.)
  dashboard.py      Web UI on port 7432 (zero dependencies)
  notify.py         Multi-backend notifications (macOS, Linux, log)
  note.py           Marimo-based note system with AST-validated headers
migrations/
  001_swarm_and_pipeline.sql     Core schema (swarm, pipeline, sessions)
  002_meeting_assistant_sync.sql Meeting assistant sync tables
tests/
  conftest.py       DB fixtures (rollback), LLM mocks
  test_*.py         Unit tests
Dockerfile          Container image for daemon
docker-compose.yml  Multi-service orchestration
Makefile            Convenience targets (build, up, down, cli, logs)

Further Reading

| Document | Description |
| --- | --- |
| HANDOFF.md | Full system architecture, inference philosophy, schema reference, bootstrap mechanism |
| COS_CHEATSHEET.md | Quick-reference card for CLI commands, Python API, daemon workers, DB tables |
| COSWORK.md | Cost/quality analysis of local vs. cloud LLM routing, model recommendations, measurement plan |
