A self-extending, file-based agentic operating system built on Claude Code.
You clone this repo, run ./install.sh, then claude. You now have:
- A router (`/os <intent>`) that dispatches every action: `/os write change`, `/os pr review`, `/os status report`, etc.
- A lifecycle for engineering work: scaffold a change → plan it → peer-review the plan → execute it → open a PR → review the PR → publish the review → close on merge. Each step is a skill; the orchestrator can drive them automatically.
- A dashboard (`/os dashboard`) with live views over changes, projects, PR reviews, runs, costs, notifications, and audit findings.
- A vault (`vault/wiki/`) that accumulates structured knowledge as you work: every change, decision, review, and research report is a markdown file with frontmatter the OS understands.
- GitHub + Slack integration via MCPs (set up in 2 minutes after install with your PAT/webhook).
- Cost telemetry + event tracking for every skill run, queryable + visible per-project.
Teams of 2-10 engineers fork this repo, customize for their stack, and each engineer runs their own instance. See vault/wiki/_seed/meta/decision/decision-distribution-v1-architecture.md for the team-install model and CONTRIBUTING.md for how to extend.
A workspace where Claude acts as the kernel for end-to-end workflow automation:
- Domains organize knowledge and skills by area (development, research, meta)
- Skills are invokable actions; `/os <intent>` dispatches to the right one
- Apps are optional visual UIs over domain state (the dashboard is the first)
- MCPs are structured tool surfaces: bridges to external services (GitHub, Slack…) or internal subsystems, exposed via the Model Context Protocol
- Vault is structured persistent memory: `raw/` → `wiki/` → `output/`
- The OS extends itself: new domains, skills, apps, and MCPs are scaffolded through the OS
Run `./install.sh`. It verifies prerequisites (Node version pinned in `.nvmrc`, currently v26.1.0, plus the `claude` CLI), installs root tooling + dashboard deps, stamps the install marker, and scaffolds `.env` files for each MCP (`mcps/<id>/.env`) and each app server (`domains/<domain>/app/.env`) from their committed `.env.example` siblings. The scaffolds ship with empty secrets; fill them in before the corresponding feature is exercised:
- `mcps/github/.env`: `GITHUB_TOKEN` for PR open/read/list (used by `dev-open-pr`, `dev-pr-review`)
- `domains/meta/app/.env`: `SLACK_BOT_TOKEN` or `SLACK_WEBHOOK_URL` for notification delivery (see Notifications); optional `GITHUB_TOKEN` for server-side GitHub calls separate from the MCP
The .env files are gitignored per standard-env-config; the loader (server/load-env.ts for apps, the MCP's own loadEnv() for MCPs) populates process.env at process start with shell-exported values winning.
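The precedence rule described above (file values fill gaps, shell-exported values win) can be sketched as follows. This is a minimal illustration under stated assumptions: `parseEnvFile` and `mergeEnv` are hypothetical names, not the API of `server/load-env.ts` or any MCP's `loadEnv()`, and the sketch works on a plain map rather than `process.env`.

```typescript
// Sketch of the ".env fills gaps, shell-exported values win" rule.
// parseEnvFile and mergeEnv are illustrative names, not the real loader's API.
type EnvMap = { [key: string]: string | undefined };

// Parse KEY=value lines, skipping blank lines and # comments.
function parseEnvFile(text: string): { [key: string]: string } {
  const out: { [key: string]: string } = {};
  const lines = text.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no "=" or empty key
    out[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return out;
}

// Copy file values into a fresh map only where the shell left the key unset.
function mergeEnv(shell: EnvMap, fileValues: { [key: string]: string }): EnvMap {
  const merged: EnvMap = {};
  const shellKeys = Object.keys(shell);
  for (let i = 0; i < shellKeys.length; i++) merged[shellKeys[i]] = shell[shellKeys[i]];
  const fileKeys = Object.keys(fileValues);
  for (let i = 0; i < fileKeys.length; i++) {
    if (merged[fileKeys[i]] === undefined) merged[fileKeys[i]] = fileValues[fileKeys[i]];
  }
  return merged;
}

// Placeholder values for illustration only.
const fileText = "# secrets\nGITHUB_TOKEN=from-file\nSLACK_WEBHOOK_URL=https://hooks.slack.com/example\n";
const effectiveEnv = mergeEnv({ GITHUB_TOKEN: "from-shell" }, parseEnvFile(fileText));
```

Here `GITHUB_TOKEN` keeps the shell value, while `SLACK_WEBHOOK_URL` is taken from the file because the shell did not define it.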
Open Claude Code in this directory by running `claude`, then dispatch through the router (`/os <intent>`). See Commands below for the full list.
A walkthrough for your first session. Each step takes a minute or two.
- Launch the dashboard. From your `claude` session, run `/os dashboard`. It opens a browser at localhost. The dashboard is the visual surface for everything below: changes, projects, PR reviews, runs, audit, notifications.
- Ingest your team's repo. Tell the OS about the code you'll be working on with `/os ingest repo https://github.com/your-org/your-app`. This clones into `repos/your-app/` (gitignored, local-only), analyzes the stack, and writes an entity wiki entry the downstream skills consume.
- (Optional) Set up GitHub. If you want PR review / open-PR / publish flows, configure the github MCP: run `cp mcps/github/.env.example mcps/github/.env`, then edit `mcps/github/.env` and paste a token (see the comments in the file for required scopes). Restart `claude` so the MCP picks up the new env. Confirm with `/mcp`.
- (Optional) Set up Slack. For notification delivery to Slack channels, run `cp domains/meta/app/.env.example domains/meta/app/.env`, then add `SLACK_BOT_TOKEN` or `SLACK_WEBHOOK_URL`.
- Scaffold your first change. Try a small one: `/os add change` (interactive; provide title + repo). The change lands in `vault/wiki/development/change/<slug>.md`. The dashboard's Changes view shows it in the planning state.
- Plan + review + execute. From the dashboard's change detail page, click Write plan. Once the plan is generated, click Review plan. Once approved, click Execute. Each step writes structured artifacts (`vault/output/development/changes/`) and updates the change's lifecycle state.
- Open the PR. Once execution is done, click Open PR, which pushes the branch and creates the PR via the github MCP.
- Review the PR. Click Review PR, which runs `dev-pr-review` against the open PR and produces a structured review with categorized comments.
- Look at the runs drawer. Top-right of the dashboard. Every skill dispatch lives here with cost + duration + output. The Insights view rolls these up into per-skill and per-project totals.
- Check the audit panel. Bottom of the Overview page. Surfaces drift (dangling wikilinks, stale repo caches, missing skill registrations, etc.) with one-click Accept buttons for the common ones.
After this loop, you'll have a clear feel for the canonical workflow. Everything else in this README is reference material for specific features.
All actions dispatch through the /os router skill. The router reads OS.md's intent vocabulary and routes to the matching meta-* or domain skill. Direct invocation (e.g. /meta-brief) is the power-user escape hatch.
These ten cover ~80% of week-one usage. The full vocabulary is below.
| command | what |
|---|---|
| `/os brief` | Session brief: pending work, in-flight changes, recent activity |
| `/os dashboard` | Launch the OS dashboard (Vite + Fastify on localhost; opens browser) |
| `/os ingest repo` | Clone + analyze a GitHub repo so the OS knows about it |
| `/os add project` | Scaffold a project (scope + lifecycle + repos) |
| `/os add change` | Scaffold a code change against an ingested repo |
| `/os write change` | PLAN or EXECUTE based on the change's state (state-machine driven) |
| `/os review change` | Peer-review a plan before it executes |
| `/os review pr` | Review a pull request: categorized comments + structured verdict |
| `/os audit` | Compliance check across skills, wiki, domains, archetypes, router, logs |
| `/os status report` | Generate a project status report from commits + decisions + scheduler runs |
The canonical, always-current intent → skill mapping lives in OS.md (the ### Intent vocabulary table — ~50 rows covering scaffolding, lifecycle, research, PR review, notifications, automation, and meta-evolution).
For live discovery, open the dashboard's Skills view — every skill is listed with its frontmatter, inputs, and current invocation rate. Add new ones via + New Skill (calls meta-add-skill).
How dispatch + extension works:
- Intents are matched against the OS.md vocabulary table; misses get logged to `vault/raw/router-log.jsonl` so the vocabulary grows with use
- New skills auto-register into `OS.md` + the owning domain's playbook via `meta-add-skill-to-router-vocab` + `meta-add-skill-to-playbook` (or run them standalone to fix audit findings on existing skills)
- The Action Items panel on the Overview surfaces missing-from-vocab / missing-from-playbook warnings with one-click resolutions
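The match-or-log behavior in the first bullet could look roughly like the sketch below. The row shape, the exact-match rule after whitespace and case normalization, and the fields of the miss record are all assumptions made for illustration; the real router reads its vocabulary from `OS.md`, and the two sample rows are illustrative mappings only.

```typescript
// Sketch of vocabulary-driven dispatch with miss logging.
// VocabRow, RouteResult, and the log-line fields are hypothetical.
interface VocabRow { intent: string; skill: string; }
type RouteResult =
  | { matched: true; skill: string }
  | { matched: false; logLine: string };

// Collapse whitespace and case so "  Review  PR " equals "review pr".
function normalizeIntent(intent: string): string {
  return intent.trim().toLowerCase().replace(/\s+/g, " ");
}

function routeIntent(intent: string, vocab: VocabRow[], now: Date): RouteResult {
  const wanted = normalizeIntent(intent);
  for (let i = 0; i < vocab.length; i++) {
    if (normalizeIntent(vocab[i].intent) === wanted) {
      return { matched: true, skill: vocab[i].skill };
    }
  }
  // A miss becomes one JSON line, ready to append to vault/raw/router-log.jsonl.
  const logLine = JSON.stringify({ ts: now.toISOString(), intent: wanted, outcome: "miss" });
  return { matched: false, logLine: logLine };
}

const sampleVocab: VocabRow[] = [
  { intent: "brief", skill: "meta-brief" },
  { intent: "review pr", skill: "dev-pr-review" },
];
const routeHit = routeIntent("  Review  PR ", sampleVocab, new Date(0));
const routeMiss = routeIntent("deploy prod", sampleVocab, new Date(0));
```

Returning the log line instead of writing it keeps the sketch free of file I/O; a caller would append it to the JSONL file.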
For destructive operations (/os rename, /os delete, /os evolve), the skill always prompts for confirmation before mutating cross-references. See the relevant SKILL.md.
The OS fires actions on a schedule without a human prompt. A scheduled job is a runbook wiki entry with two optional frontmatter fields:

    schedule: "0 9 * * *"   # standard 5-field cron, machine local time
    prompt: "/os brief"     # intent fed to `claude -p` when due

`scripts/scheduler-tick.mjs` runs every 60s under a launchd LaunchAgent, finds due jobs, and fires each via `claude -p`. Runs append to `vault/raw/scheduled-runs.jsonl`.
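To make the "is this job due right now" check concrete, here is a small sketch of matching a 5-field cron expression against machine-local time. It is not the logic of `scheduler-tick.mjs`; `fieldMatches` and `isDue` are hypothetical names, only `*`, `*/n`, `a-b`, comma lists, and plain numbers are handled, and day-of-month and day-of-week are combined with AND (classic cron ORs them when both are restricted).

```typescript
// Sketch of a 5-field cron check in machine-local time (illustrative only).
function fieldMatches(field: string, value: number, min: number, max: number): boolean {
  const parts = field.split(",");
  for (let i = 0; i < parts.length; i++) {
    const pieces = parts[i].split("/");
    const step = pieces.length > 1 ? parseInt(pieces[1], 10) : 1;
    let lo = min;
    let hi = max;
    if (pieces[0] !== "*") {
      const bounds = pieces[0].split("-");
      lo = parseInt(bounds[0], 10);
      hi = bounds.length > 1 ? parseInt(bounds[1], 10) : lo;
    }
    if (value >= lo && value <= hi && (value - lo) % step === 0) return true;
  }
  return false;
}

// Fields: minute hour day-of-month month day-of-week (0 or 7 = Sunday).
function isDue(expr: string, at: Date): boolean {
  const f = expr.trim().split(/\s+/);
  if (f.length !== 5) throw new Error("expected 5 cron fields: " + expr);
  const dow = at.getDay(); // 0 = Sunday
  return (
    fieldMatches(f[0], at.getMinutes(), 0, 59) &&
    fieldMatches(f[1], at.getHours(), 0, 23) &&
    fieldMatches(f[2], at.getDate(), 1, 31) &&
    fieldMatches(f[3], at.getMonth() + 1, 1, 12) &&
    (fieldMatches(f[4], dow, 0, 7) || (dow === 0 && fieldMatches(f[4], 7, 0, 7)))
  );
}

// 1 June 2026 is a Monday; 09:00 local matches the daily 9am schedule.
const dueAtNine = isDue("0 9 * * *", new Date(2026, 5, 1, 9, 0));
```

A tick that runs every 60 seconds would call `isDue` once per job with the current time; deduplicating a job within the same minute is a separate concern not shown here.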

    ./scripts/install-scheduler.sh             # install the LaunchAgent (macOS)
    node scripts/scheduler-tick.mjs --list     # see all schedules + next-run times
    node scripts/scheduler-tick.mjs --dry-run  # show what would fire right now

From the dashboard, the Schedules view lists every scheduled runbook with its cron, next run, last run + exit code, and a Run now button for manual firing. Scaffold new ones with + New Schedule or `/os add-schedule`.
Three schedules ship by default:
- `runbook-morning-brief`: fires `/os brief` daily at 9am
- `runbook-weekly-curation-check`: Sunday 8am scan for stale `vault/raw/` items
- `runbook-weekly-health-check`: Sunday 8:30am, runs `/os audit` and writes a dated summary to `vault/output/meta/health-checks/<date>.md` (proactive drift surfacing; the audit is otherwise pull-based)
Edit or delete the seed entries in vault/wiki/_seed/meta/runbook/.
Full standard: vault/wiki/_seed/meta/reference/standard-scheduled-jobs.md.
MCPs (Model Context Protocol servers) are how the OS gives Claude structured access to things outside the markdown — external services like GitHub or Slack, and internal subsystems like the vault or scheduler. Each MCP exposes a typed tool surface (create_pull_request, search_wiki, …) that any skill or session can call.
Two kinds, with different homes:
| kind | lives at | when to use |
|---|---|---|
| OS-built | `mcps/<id>/` (custom Node code) | Tight tool surfaces, OS-specific composites (call + log + write-back), no vendor MCP exists |
| Third-party | `.mcp.json` row only (no folder) | A vendor offers a maintained MCP that covers what you need; pass through unchanged |
The OS ships with two OS-built MCPs wired up:
- `github` (`mcps/github/`): PR open/read/list + check status. PAT-based auth via `mcps/github/.env`. Used by `dev-open-pr` and the planned PR-review backend.
- `vault` (`mcps/vault/`): wiki search + entry read + archetype listing. No auth required (local filesystem read).
After cloning and running ./install.sh, drop a GitHub PAT into the github MCP's env:

    cp mcps/github/.env.example mcps/github/.env
    # Edit mcps/github/.env, paste a token (see the comments inside for required scopes)

Then start Claude Code from this directory (`claude`). The MCPs are auto-spawned via stdio on session start; confirm with `/mcp`.
Hosted MCPs (vendor-run OAuth endpoints, e.g. for Linear / Notion / Slack) are also fully supported via /os add-mcp --kind hosted. There's one gotcha: Claude Code uses OAuth Dynamic Client Registration (RFC 7591), and some vendors (notably GitHub's hosted MCP) don't support DCR — you'll see "SDK auth failed: Incompatible auth server" in /mcp. When that happens, fall back to a custom OS-built MCP with PAT auth (which is exactly the path the github MCP uses). The MCP architecture standard documents this contract.
Scaffold a new MCP with /os add-mcp (custom or hosted mode). scripts/sync-mcp-config.mjs regenerates .mcp.json after each scaffold, preserving any third-party rows.
Full standard: vault/wiki/_seed/meta/reference/standard-mcp-architecture.md.
Before the OS can write code, it needs to know about the repos you work on. /os ingest repo accepts a GitHub URL, GitHub shorthand (owner/name), or a local path:

    /os ingest repo https://github.com/me/my-app
    /os ingest repo /Users/me/code/my-thing

The skill clones the repo (GitHub only) to `repos/<slug>/`, walks it, and produces a `kind: repo` entity wiki entry at `vault/wiki/<domain>/entity/<slug>.md`. The entry captures:
- Stack (language, framework, build/test commands)
- Top-level structure
- Entry points
- Style configs + CI
- Conventions inferred from CONTRIBUTING / `.github/`
Downstream skills (dev-pr-review, dev-write-change, future PR-writers) read the entity entry to know how to operate on the repo without re-discovering metadata. repos/<slug>/ is gitignored — it's a local working copy, not OS state.
Re-ingest with overwrite: true to refresh after upstream changes. Multi-repo work is handled by projects (see below), not by ingesting differently.
Full standard: vault/wiki/_seed/meta/reference/standard-repo-ingestion.md.
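The three accepted input forms for `/os ingest repo` (GitHub URL, `owner/name` shorthand, local path) can be told apart with a few pattern checks. The sketch below is an assumption about how such a classifier might look, not the skill's actual parsing; `RepoTarget` and `classifyRepoTarget` are hypothetical names, and the regexes are deliberately conservative.

```typescript
// Sketch: classify an ingest target as a GitHub repo or a local path.
// The patterns and the RepoTarget shape are assumptions for illustration.
type RepoTarget =
  | { kind: "github"; owner: string; name: string }
  | { kind: "local"; path: string };

function classifyRepoTarget(input: string): RepoTarget {
  const text = input.trim();
  // Full URL, with an optional ".git" suffix or trailing slash.
  const url = /^https:\/\/github\.com\/([\w.-]+)\/([\w.-]+?)(?:\.git)?\/?$/.exec(text);
  if (url) return { kind: "github", owner: url[1], name: url[2] };
  // Absolute, home-relative, or dot-relative paths are treated as local.
  if (/^(\/|~\/|\.\.?\/)/.test(text)) return { kind: "local", path: text };
  // Remaining "owner/name" strings are read as GitHub shorthand.
  const shorthand = /^([\w.-]+)\/([\w.-]+)$/.exec(text);
  if (shorthand) return { kind: "github", owner: shorthand[1], name: shorthand[2] };
  throw new Error("unrecognized repo target: " + input);
}

const fromUrl = classifyRepoTarget("https://github.com/me/my-app");
const fromPath = classifyRepoTarget("/Users/me/code/my-thing");
```

Checking for a path prefix before the shorthand pattern avoids misreading `./code/app` style inputs as `owner/name`.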
A project is the workflow scope between a single change and "an ongoing area." Projects own a deadline, milestones, a reporting cadence, and (optionally) one or more ingested repos. They coordinate work that crosses repos and accumulates decisions over time.

    /os add-project
    # prompts for: name, title, domain, repos (comma-separated entity ids),
    # deadline, reporting cadence + target

Lifecycle: planning → active → review → shipped → archived. Status (active / paused / completed / cancelled) gates project-scoped scheduled runbooks: pausing a project pauses its weekly status reports automatically.
Two ways entries relate to a project:
- Owned: `project: <project-id>` in the entry's frontmatter. The Projects dashboard view groups owned decisions, notes, and changes under the project as its accumulated work product.
- Referenced: `[[<project-id>]]` in the entry's body. Mentions / cross-references / context.
Cross-repo features compose as one project + N changes (one per repo) — each change's project: field auto-aggregates it under the project.
A seed project ships with the OS: build-agentic-os-v1 (vault/wiki/_seed/meta/project/build-agentic-os-v1.md) — the OS dogfooding itself. The existing seed decisions are owned by this project, so the Projects view's drill-down shows real content out of the box.
/os status report (or the dashboard's "Generate status report" button) walks recent commits + backlinked decisions + scheduler runs + milestone changes, then writes markdown to vault/output/<domain>/status-reports/<id>-<YYYY-MM-DDTHHMMSS-TZ>.md. v1 ships target: clipboard only — copy the markdown into Notion / Linear / Slack manually. Native integrations are deferred.
A few patterns are worth knowing:
- Multiple reports per day, no clobber. Filenames include `HHMMSS` and a TZ suffix, so back-to-back generations for the same project produce distinct files instead of overwriting. The project's `reporting.last_sent` is the canonical "most recent" pointer; the file tree is the audit trail.
- Local time in filenames and bodies. Timestamps render in the user's local timezone (e.g. `2026-06-01T143022-PDT`) rather than UTC. The Status app's report list groups by local day. Skills authored before this convention may still emit UTC; those will be migrated as they surface.
- Continuous change-lifecycle tracking. Every status report includes a `### Changes` section that lists every non-terminal change owned by the project and its current step (planning / in-progress / in-review / merged / abandoned), derived from the change's `status` + `review_status` + `pr_review_status`. The same change can appear in consecutive reports as it walks the lifecycle: the report is a snapshot, not a delta.
- Slack template. A status-report-specific Slack template (`vault/wiki/_seed/meta/template/notification-dashboard-status-report.md`) renders the report into a Slack message when a `dashboard.status-report.generated` notification rule is configured. The skill stuffs `title`, `tldr`, `progress_summary`, `blockers`, `next`, `report_path`, `period_local` into the event's `args` payload; the template references those as flat vars (see Template syntax below).
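The `<id>-<YYYY-MM-DDTHHMMSS-TZ>.md` naming rule can be illustrated with a small helper. This is a sketch, not the skill's code: `reportFileName` and `pad2` are hypothetical names, and the timezone abbreviation is passed in as a parameter because how the real skill derives it is not described here.

```typescript
// Sketch of the no-clobber, local-time filename convention.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Builds "<id>-<YYYY-MM-DD>T<HHMMSS>-<TZ>.md" from local-time components.
function reportFileName(projectId: string, at: Date, tz: string): string {
  const day = at.getFullYear() + "-" + pad2(at.getMonth() + 1) + "-" + pad2(at.getDate());
  const time = pad2(at.getHours()) + pad2(at.getMinutes()) + pad2(at.getSeconds());
  return projectId + "-" + day + "T" + time + "-" + tz + ".md";
}

// Two generations one second apart yield two distinct files.
const firstReport = reportFileName("build-agentic-os-v1", new Date(2026, 5, 1, 14, 30, 22), "PDT");
const secondReport = reportFileName("build-agentic-os-v1", new Date(2026, 5, 1, 14, 30, 23), "PDT");
```

Because the seconds are part of the name, only two generations within the same second could collide.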
Full standard: vault/wiki/_seed/meta/reference/standard-project-workflow.md.
A change is the atomic unit of code work — single repo, single branch, single PR. Smaller than a project; larger than an ad-hoc edit. Composes into projects when work spans repos.

    /os add-change     # scaffolds entry; auto-drafts Why/Approach/Done-when when context allows
    /os write-change   # state-machine driven (PLAN or EXECUTE depending on review_status)
    /os review-change  # peer review the plan

Status lifecycle: planning → in-progress → in-review → merged | abandoned. Required fields: `title`, `repo` (must reference an ingested-repo entity), `status`, `branch`.
dev-add-change accepts an optional description input. When provided (or when the title is specific enough to derive intent from), the scaffolder drafts a first-pass ## Why / ## Approach / ## Done when from: title + description + repo entity context. Each section ships with a > **DRAFT** — review and refine before invoking dev-write-change. blockquote.
The human's job is to review and accept the draft — not write it from scratch. Accept happens during /os write-change PLAN: the gate prints the drafted sections, then AskUserQuestion offers Accept as-is (strips the DRAFT lines in place) or Stop & edit first. No separate manual-edit step required for clear-intent changes.
dev-write-change reads the change entry's review_status field and picks the right phase:
| `review_status` | what happens |
|---|---|
| `pending` (no plan yet) | PLAN phase: agent walks repo, composes structured plan (files modified/created/NOT-touched, tests, risks). Writes to `vault/output/<domain>/changes/<slug>-plan.md`. |
| `pending` (plan exists) | "Run `/os review-change <id>`": the writer won't execute without review |
| `approved` | EXECUTE phase: creates branch (per [[standard-git-hygiene]]), follows plan exactly, runs tests, commits with conventional-commit format, sets `status: in-progress` |
| `request-changes` | Surface concerns. User picks: re-plan / override (`review_status: overridden`) / abandon |
| `rejected` | Surface verdict; suggests `status: abandoned` |
| `not-required` | Skip review; go straight to EXECUTE. Set at scaffolding via `review_required: false` for trivial changes (dep bumps, typo fixes) |
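The phase selection in the table above reduces to a small pure function. The sketch below is illustrative only: `pickPhase` is a hypothetical name, the labels other than PLAN and EXECUTE are invented here to name the remaining rows, and the `overridden` status is left out because its handling is not spelled out in the table.

```typescript
// Sketch of the review_status → phase decision from the table.
type ReviewStatus = "pending" | "approved" | "request-changes" | "rejected" | "not-required";
type WritePhase = "PLAN" | "AWAIT_REVIEW" | "EXECUTE" | "SURFACE_CONCERNS" | "SURFACE_VERDICT";

function pickPhase(reviewStatus: ReviewStatus, planExists: boolean): WritePhase {
  switch (reviewStatus) {
    case "pending":
      // No plan yet: write one. Plan exists: refuse to execute until reviewed.
      return planExists ? "AWAIT_REVIEW" : "PLAN";
    case "approved":
    case "not-required":
      return "EXECUTE";
    case "request-changes":
      return "SURFACE_CONCERNS";
    case "rejected":
      return "SURFACE_VERDICT";
  }
}

const freshChange = pickPhase("pending", false);
const trivialChange = pickPhase("not-required", false);
```

Keeping the decision separate from the side effects (walking the repo, creating branches) makes the state machine easy to test in isolation.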
dev-review-change is read-only: walks the plan + repo + conventions, runs a 6-category checklist (scope discipline / convention alignment / risk / test coverage / existing code respect / git hygiene), writes a structured verdict to vault/output/<domain>/changes/<slug>-review.md, updates review_status. Cannot edit code, create branches, or run tests — the separation is the safety property.
Both dev-write-change PLAN/EXECUTE and dev-review-change read these standards at start of procedure. Repo-specific overrides in the entity entry's ## Conventions section take precedence where they conflict — but the standards are the floor.
- `standard-code-quality`: idiomatic code, dependency hygiene, backwards compat, security, tests, comments, repo-convention adherence
- `standard-git-hygiene`: pre-branch state (clean tree + ff-only pull), branch naming (`<type>/<slug>` or `<type>/<issue>/<slug>` using semantic-release types `feat|fix|docs|style|refactor|test|chore`), conventional commits with the Angular format, PR structure
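The branch naming rule (`<type>/<slug>` or `<type>/<issue>/<slug>`) lends itself to a one-regex check. In the sketch below, `parseBranch` is a hypothetical helper, and two details are assumptions not stated in the standard: slugs are lowercase kebab-case and issue ids are numeric.

```typescript
// Sketch of a branch-name validator for the naming convention.
const BRANCH_TYPES = ["feat", "fix", "docs", "style", "refactor", "test", "chore"];

// Assumption: slug is lowercase kebab-case, issue id is digits only.
const BRANCH_RE = new RegExp(
  "^(" + BRANCH_TYPES.join("|") + ")/(?:(\\d+)/)?([a-z0-9]+(?:-[a-z0-9]+)*)$"
);

interface BranchParts { type: string; issue: string | null; slug: string; }

// Returns the parsed parts, or null when the name breaks the convention.
function parseBranch(branch: string): BranchParts | null {
  const m = BRANCH_RE.exec(branch);
  if (!m) return null;
  return { type: m[1], issue: m[2] ? m[2] : null, slug: m[3] };
}

const plainBranch = parseBranch("feat/add-login");
const issueBranch = parseBranch("fix/123/null-deref");
const badBranch = parseBranch("feature/add-login");
```

A check like this could run before branch creation so that a non-conforming name fails early rather than at PR time.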
Three layers enforce that the change body is human-reviewed before the writer plans:
- Skill gate: `dev-write-change` PLAN phase refuses on either template placeholders OR un-accepted `**DRAFT**` markers (the latter triggers an interactive accept prompt rather than a hard reject)
- Audit check: `change-body-template-placeholder` warns on planning-state changes with either symptom, surfaces in `/os audit` and the Health view
- Dashboard hint: yellow state-hint card in the Changes view when the body needs editing/accepting
Full standard: vault/wiki/_seed/meta/reference/standard-change-workflow.md.
A research-report is the formal spec output of a structured investigation against a project's open questions. Lifecycle mirrors the change workflow's review gate: draft → review → revise → approve → scaffold recommendations into changes.

    /os research write    # research-write: investigates the question, drafts the report
    /os research review   # peer-review the draft (approve / request-changes / reject)
    /os research revise   # fold review findings back into a new revision
    /os research update   # incorporate new materials after approval

The skill outputs land at `vault/wiki/research/research-report/<project>-<topic>.md` with `recommended_changes[]` in frontmatter. Once approved, `research-scaffold-recommendations` fans each recommendation out to a `dev-add-change` invocation, populating `derived_from_report` + `recommendation_index` on each new change entry so the audit trail traces back to the source report.
Three input channels feed the research skills:
- Materials (`vault/raw/project-research/<project>/<topic>/`): files dropped into a per-report directory. URLs/wikilinks/file uploads can be seeded before dispatch via the Add-report modal's drag-drop zone (writes to the materials dir; `research-write` picks them up on first walk).
- Notes log (`notes_log:` frontmatter array): mid-lifecycle guidance the user adds after a report has been drafted. Each note carries `severity` (info/warn/blocker) + a hybrid persistent `considered_by` chain: skills (`research-review`/`revise`/`update`) read unconsidered notes, fold them in, and append their run id. The UI surfaces an "unconsidered" badge so it's obvious which guidance is still pending action.
- Mark approved (UI escape hatch): overrides the reviewer's verdict when the user disagrees. Flips `review_status: request-changes → approved` via vault-only endpoint. Gated to that specific transition; not a way to bypass review on a fresh report.
When a project has research_paths populated (via /os research write), the project's Plan tab renders the report inline — the legacy /plan/research flow stays as the fallback for projects authored before the research-report lifecycle existed. Both paths produce the same downstream artifact (recommended_changes[] → dev-add-change fan-out).
Full archetype: vault/wiki/_seed/meta/reference/archetype-research-report.md. Decision behind the inline rendering: vault/wiki/meta/decision/decision-research-report-vs-project-plan.md.
The OS dispatches structured notifications to Slack / email / desktop based on per-(event_type, channel) rules. Every event the system records (project complete, change merged, research approved, …) can trigger one. Rules live as notification-config wiki entries; the dispatch engine runs inside the dashboard server fed by two parallel paths so it catches events from any source.

    event lands in events.db
      │
      │   Path A: in-process afterInsert hook fires immediately for events
      │           recorded inside the dashboard server (test-sends, internal flows)
      │   Path B: server poller reads events.db every 10s for id > lastSeen,
      │           catches inserts from EXTERNAL processes (skills via
      │           record-dashboard-action.mjs, the canonical event source)
      ▼
    dispatcher matches rules   ← reads vault/wiki/<domain>/notification-config/*.md
      │  rate-limit check (global 100/day, per-rule override)
      │  render via template   ← reads vault/wiki/_seed/meta/template/
      │                          notification-<event-type>.md (per-event
      │                          override) → notification-default.md (fallback)
      ▼
    channel adapter routes by rule.channel: slack | email | desktop
      │
      ▼
    events.db   ← kind='notification', action='sent'|'failed'|'suppressed-rate-limit'

Why two paths. The afterInsert hook is a per-process module variable — it only fires when recordEvent runs inside the dashboard server. Skills invoke record-dashboard-action.mjs as a separate Node process that imports events-db fresh, has no hook registered, writes the row, and exits. Without the poller, every skill-driven event would silently bypass the dispatcher.
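Path B's cursor logic (read rows with `id > lastSeen`, then advance the cursor) can be sketched without a database. In this illustration `EventRow` and `pollOnce` are hypothetical names, and an in-memory array stands in for `events.db`; the real poller's query and row shape may differ.

```typescript
// Sketch of a cursor-based poll over rows with increasing ids.
interface EventRow { id: number; kind: string; action: string; }

// Returns rows newer than lastSeen plus the advanced cursor.
function pollOnce(rows: EventRow[], lastSeen: number): { fresh: EventRow[]; lastSeen: number } {
  const fresh: EventRow[] = [];
  let cursor = lastSeen;
  for (let i = 0; i < rows.length; i++) {
    if (rows[i].id > lastSeen) {
      fresh.push(rows[i]);
      if (rows[i].id > cursor) cursor = rows[i].id;
    }
  }
  return { fresh: fresh, lastSeen: cursor };
}

const storedRows: EventRow[] = [
  { id: 1, kind: "skill", action: "run" },
  { id: 2, kind: "skill", action: "run" },
  { id: 3, kind: "edit", action: "save" },
];
const firstTick = pollOnce(storedRows, 1);                   // picks up ids 2 and 3
const secondTick = pollOnce(storedRows, firstTick.lastSeen); // nothing new
```

Because the cursor only moves forward, a row written by an external process between ticks is picked up on the next tick and never dispatched twice by the poller itself.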
Templates use Mustache-style {{var}} substitution with one important convention:
- `event.raw.args` is flattened into top-level template vars. Skills stuff per-event metadata into the event's `args` payload (e.g. `meta-status-report` writes `title`, `tldr`, `progress_summary`, `blockers`, `next`, `report_path`, `period_local`). Templates reference these as flat keys: `{{title}}`, `{{tldr}}`, not `{{args.title}}`. The renderer's regex (`\w+`) doesn't support dotted paths.
- Reserved keys win. Args fields named `project`, `kind`, `change_id`, `domain`, `skill`, `description`, `status`, `event_type`, `rule_id`, `ts` can't shadow the canonical event fields. Skills should pick non-conflicting names.
- `{{delivery_tags}}`: space-joined string of the rule's `delivery.tags` array (e.g. `@nico @sarah`). For Slack CC lines.
- No sections / loops. The renderer doesn't support `{{#X}}...{{/X}}`. Skills that need iteration should pre-join into strings.
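The template rules above (flattened args, reserved keys winning, `{{delivery_tags}}`, no dotted paths) fit in a few lines. This is a sketch of the described behavior, not the dashboard's renderer: `buildVars` and `renderTemplate` are hypothetical names, and rendering an unknown variable as an empty string is an assumption.

```typescript
// Sketch of flat-variable template rendering with reserved keys.
const RESERVED_KEYS = ["project", "kind", "change_id", "domain", "skill",
  "description", "status", "event_type", "rule_id", "ts"];

type TemplateVars = { [key: string]: string };

function buildVars(eventFields: TemplateVars, args: TemplateVars, deliveryTags: string[]): TemplateVars {
  const vars: TemplateVars = {};
  const argKeys = Object.keys(args);
  for (let i = 0; i < argKeys.length; i++) {
    // Args named like a canonical field are dropped so they cannot shadow it.
    if (RESERVED_KEYS.indexOf(argKeys[i]) === -1) vars[argKeys[i]] = args[argKeys[i]];
  }
  const fieldKeys = Object.keys(eventFields);
  for (let i = 0; i < fieldKeys.length; i++) vars[fieldKeys[i]] = eventFields[fieldKeys[i]];
  vars["delivery_tags"] = deliveryTags.join(" ");
  return vars;
}

function renderTemplate(template: string, vars: TemplateVars): string {
  // \w+ cannot match a dot, so {{args.title}} is left untouched.
  return template.replace(/\{\{(\w+)\}\}/g, function (_match: string, key: string): string {
    // Assumption: unknown variables render as an empty string.
    return Object.prototype.hasOwnProperty.call(vars, key) ? vars[key] : "";
  });
}

const exampleVars = buildVars(
  { project: "os" },
  { title: "Weekly status", project: "shadowed" },
  ["@nico", "@sarah"]
);
const renderedLine = renderTemplate("{{title}} [{{project}}] cc {{delivery_tags}}", exampleVars);
```

The `project` value supplied in `args` is ignored in favor of the canonical event field, which is the "reserved keys win" rule in action.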
- Slack (most common channel): copy `domains/meta/app/.env.example` → `domains/meta/app/.env` and set ONE of:
  - `SLACK_BOT_TOKEN=xoxb-…` (preferred): per-rule channel routing via `chat.postMessage`. Requires a Slack app with `chat:write` scope (+ `chat:write.public` to skip per-channel bot invites). Create at `api.slack.com/apps`.
  - `SLACK_WEBHOOK_URL=https://hooks.slack.com/…` (fallback): single bound channel chosen at webhook install time; per-rule `delivery.slack_channel` is ignored. Zero-config but fixed-channel.

  Both set? Bot-token wins. The Rule Editor's `slack_channel` field is editable in bot-token mode, disabled with a clear hint in webhook mode.
- Email: same `.env`, set `SMTP_*` (deferred; the email channel adapter was abandoned in v1 in favor of Slack-first).
- Desktop: no env needed; uses the browser's Notification API from any open dashboard tab.
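The Slack transport precedence described above (bot token wins, webhook ignores the per-rule channel) is simple enough to show as a decision function. `pickSlackTransport` and the `SlackTransport` shape are hypothetical; the sketch takes a plain map rather than reading `process.env`.

```typescript
// Sketch of choosing the Slack transport from env values.
type SlackTransport =
  | { mode: "bot-token"; channel: string }
  | { mode: "webhook" }
  | { mode: "none" };

function pickSlackTransport(
  env: { [key: string]: string | undefined },
  ruleChannel: string
): SlackTransport {
  // Bot token wins when both are set, and honors the per-rule channel.
  if (env["SLACK_BOT_TOKEN"]) return { mode: "bot-token", channel: ruleChannel };
  // Webhook is bound to one channel at install time; ruleChannel is ignored.
  if (env["SLACK_WEBHOOK_URL"]) return { mode: "webhook" };
  return { mode: "none" };
}

// Placeholder secrets for illustration only.
const bothSet = pickSlackTransport(
  { SLACK_BOT_TOKEN: "xoxb-placeholder", SLACK_WEBHOOK_URL: "https://hooks.slack.com/placeholder" },
  "#eng-alerts"
);
const webhookOnly = pickSlackTransport({ SLACK_WEBHOOK_URL: "https://hooks.slack.com/placeholder" }, "#eng-alerts");
```

The same result could drive the Rule Editor: the `slack_channel` field is editable only when the mode is `bot-token`.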
- Notifications sidebar item: defaults to the Activity log (table of every dispatch: rule, event, channel, outcome, error). Rules tab shows the per-`(event, channel)` matrix; click any cell to add or edit a rule.
- Rule Editor: per-rule channel + filters (project / domain / severity) + delivery shape + optional rate-limit override + Test send button. Severity-tinted alert hint reflects the active Slack transport.
- Per-lifecycle-step bells: project / change / research lifecycle steppers render a small bell next to each step. Click → Rule Editor pre-filled with that step's `event_type` + `filter.project`. Maps the right event to the step contextually (per the event catalog).
- Project page → Notifications tab: lists rules where `filter.project === <this project>`. Per-project subscription view; `+ Add` button pre-fills the filter.
vault/wiki/_seed/meta/reference/event-catalog.md is the curated registry of every user-facing lifecycle event worth subscribing to. ~35 events organized by entity (project / change / research-report). Both the rule editor's event-type picker and the bell affordances read from it. To add a new subscribable event: add a row to the catalog + (optionally) ship a notification-<event-type>.md template for richer message rendering.
Three audit checks instrument the notification pipeline:
- `notification-rule-orphan` (warn): flags events tagged to `rule:<id>` where the rule no longer exists
- `notification-rate-limit-exceeded` (info): surfaces suppression events in the trailing 24h
- `notification-delivery-failed` (warn): surfaces permanent send failures with the adapter's error verbatim
Full standards:
- `vault/wiki/_seed/meta/reference/archetype-notification-config.md`: rule schema
- `vault/wiki/_seed/meta/reference/event-catalog.md`: subscribable event registry
- `vault/wiki/_seed/development/reference/standard-env-config.md`: per-surface `.env` pattern
- `vault/wiki/_seed/development/reference/standard-shared-types.md`: sibling `.types.ts` pattern (used by notifications routes)
Projects can opt into running the change lifecycle (write → open PR → review → merge) without a human in the driver's seat. Automation is per-project, configured under the project's automation: frontmatter block; the orchestrator lives in the dashboard server and ticks state forward as each step lands.

    project.automation: { enabled: true, mode: 'sequential-changes', pause_on: [...] }
      │
      ▼
    orchestrator picks next change with status: planning (oldest-first by `created`)
      │
      ▼
    state machine ──▶ WRITE    ← dev-write-change (PLAN → REVIEW → EXECUTE)
                      OPEN_PR  ← dev-open-pr
                      REVIEW   ← dev-pr-review (writes pass entries)
                      MERGE    ← (parks here; merge watcher closes the loop)
                        ↓
                      on merge: advance to next planning change, repeat
                        ↓
                      on failure / review-not-approved: PAUSE with reason

Pause gates. The orchestrator pauses (rather than aborting) on two conditions, listed in the project's pause_on: array:
- `skill-failure`: any orchestrated skill exits non-zero. The failure is captured in the events log; the user resumes after addressing it.
- `review-not-approved`: `dev-pr-review` records a pass with `result: request-changes`. The orchestrator stops and waits for the user to revise the change (re-running `dev-write-change` in REVISE mode) before resuming.
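Two orchestrator decisions described above, picking the next change (oldest `planning` change by `created`) and deciding whether to pause, can be sketched as pure functions. `ChangeEntry`, `StepOutcome`, `pickNextChange`, and `pauseReason` are hypothetical names, and `created` is assumed to be an ISO-8601 string so that string comparison orders by time.

```typescript
// Sketch of next-change selection and pause gating.
interface ChangeEntry { id: string; status: string; created: string; } // created: ISO date (assumed)
interface StepOutcome { exitCode: number; reviewResult?: string; }

// Oldest change still in "planning", or null when none remain.
function pickNextChange(changes: ChangeEntry[]): ChangeEntry | null {
  let next: ChangeEntry | null = null;
  for (let i = 0; i < changes.length; i++) {
    const candidate = changes[i];
    if (candidate.status !== "planning") continue;
    if (next === null || candidate.created < next.created) next = candidate;
  }
  return next;
}

// Returns the gate that fired, or null to keep going.
// A gate only applies when it is listed in the project's pause_on array.
function pauseReason(outcome: StepOutcome, pauseOn: string[]): string | null {
  if (outcome.exitCode !== 0 && pauseOn.indexOf("skill-failure") !== -1) return "skill-failure";
  if (outcome.reviewResult === "request-changes" && pauseOn.indexOf("review-not-approved") !== -1) {
    return "review-not-approved";
  }
  return null;
}

const queue: ChangeEntry[] = [
  { id: "a", status: "planning", created: "2026-02-01" },
  { id: "b", status: "planning", created: "2026-01-15" },
  { id: "c", status: "merged", created: "2026-01-01" },
];
const nextUp = pickNextChange(queue); // the change with id "b"
```

Separating the two decisions mirrors the diagram: selection happens once per change, while the pause check runs after every step.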
Merge watcher. Once a PR is opened, the orchestrator's WRITE → OPEN_PR → REVIEW steps are complete but the change isn't merged yet. A server-side poller (60s interval) calls gh pr view against any open PR for changes currently in MERGE state; when GitHub reports it merged, the watcher updates the change's frontmatter (status: merged, merged_at: <ts>) and ticks the orchestrator forward to the next change.
Auto-tick on step completion. When a skill subprocess wrapped by record-dashboard-action.mjs exits with exit: 0, an onAutomationStepComplete() hook runs inside the dashboard server (called from routes/runs.ts). It re-evaluates the project's automation state and dispatches the next step immediately — no polling delay for in-process advancement, only the merge watcher polls externally.
Surfaces.
- Project page → Automation tab: enable/disable, edit `pause_on`, see current phase + which change is in flight, Start / Pause / Resume / Stop buttons.
- Project page → Overview tab: the `ChangesLifecycleStepper` shows distribution across `planning / in-progress / in-review / merged / abandoned`, updating live as the orchestrator advances.
- Audit hooks: `automation-paused` (info), `automation-skill-failure` (warn), `automation-stalled` (warn, when in MERGE state for >24h without GitHub reporting merged).
Why no global automation. Automation is project-scoped on purpose: different projects have different review tolerances, different cost budgets, different stakeholder expectations. A single "auto-merge everything" toggle would lose that nuance. The per-project block makes the consent explicit and the scope obvious.
Every action the OS executes — router dispatches, dashboard AI bridge calls, vault edits, scheduler fires — writes a structured row to a pure-Node SQLite database at .claude/state/events.db. Captured per row: timestamp, kind, action, skill, project, model, tokens (in/out/cache), cost, duration, exit status, files touched.
This is telemetry, not knowledge. The vault holds what you know; events.db holds what happened. The two layers stay separate:
| Vault (`vault/`) | events.db (`.claude/state/events.db`) |
|---|---|
| Curated, semantic, archetype | Automatic, mechanical, instrumented |
| One entry per concept | One row per action |
| Git-tracked (in `_seed/`) | Gitignored, machine-local |
| Markdown + YAML frontmatter | Indexed SQL columns |
Bootstrap + seed from existing JSONL logs:

    node scripts/events-db-init.mjs        # idempotent schema
    node scripts/events-db-backfill.mjs    # seed from vault/raw/*.jsonl
    node scripts/import-session-usage.mjs  # import per-turn token usage from
                                           # Claude Code session JSONL (in-session
                                           # cost attribution for slash commands
                                           # + interactive turns)

| event source | captured? | model/tokens/cost? |
|---|---|---|
| Scheduler tick (subprocess) | ✓ via `record-router-event` / dual-write | full metrics |
| Dashboard AI bridge (subprocess) | ✓ via `routes/action.ts` stream-json parsing | full metrics |
| Dashboard edits | ✓ via `routes/edit.ts` dual-write | metric-less (no LLM) |
| Router CLI dispatches (`/os …`) | ✓ via `record-router-event` wrapper | audit-only (subprocess outside) |
| Skill-body audit logs | ✓ via `record-dashboard-action` wrapper | audit-only |
| In-session turns (slash + interactive) | ✓ via `import-session-usage.mjs` | full metrics (parsed from Claude Code's session JSONL) |
Audit-only rows show a small audit-only pill in the Insights table — they record that the action happened but lack model/tokens/cost because no subprocess wrapped the LLM call.
Query from CLI:

    sqlite3 .claude/state/events.db "SELECT kind, count(*) FROM events GROUP BY kind"
    # Order by the numeric sum; ordering by the printf() column would sort the costs as text
    sqlite3 .claude/state/events.db "SELECT skill, count(*) AS n, printf('%.4f', sum(cost_usd)) AS cost FROM events WHERE skill IS NOT NULL GROUP BY skill ORDER BY sum(cost_usd) DESC"

Or from the dashboard's Insights view: counts by kind/skill/model, total cost, slowest events, recent events table with click-to-expand + column resize. JSONL audit files in `vault/raw/` continue to be appended for backward compatibility; future cleanup may retire them once events.db proves itself.
Full standard: vault/wiki/_seed/meta/reference/standard-event-store.md.

    .claude/skills/     Invokable actions (one directory per skill: <name>/SKILL.md)
    .claude/hooks/      Lifecycle hooks (curation, index rebuild, session brief)
    .claude/state/      Internal state (install marker, schedule dedupe, launchd logs, events.db)
    _templates/         Scaffolder templates for new domains/skills/apps/archetypes
    scripts/            Out-of-band runners (scheduler tick + audit + macOS installer)
    repos/              Ingested external repositories (gitignored; one clone per slug)
    domains/            Domain playbooks + optional apps + sub-domains
      meta/             The OS itself as a domain (includes the dashboard app)
      development/
      research/
    vault/              3-stage memory lifecycle
      raw/              Unstructured ingest + JSONL audit logs (gitignored)
      wiki/             Structured memory: <domain>/<archetype>/<slug>.md (only _seed/ committed)
      output/           Generated artifacts (gitignored): briefs, status reports,
                        change plans, change reviews, health checks
      .index/           Derived manifest, rebuilt by hook (gitignored)

- `CLAUDE.md`: workspace instructions auto-loaded by Claude Code on every session
- `OS.md`: entry-point map, intent vocabulary, domain index
- `CONTRIBUTING.md`: how to extend the OS (add skills, domains, MCPs, archetypes)
- `TROUBLESHOOTING.md`: common failure modes + fixes for install, MCP setup, skill failures, vault state, commit/CI
- `domains/meta/playbook.md`: full OS standards and evolution protocol
- `vault/wiki/_seed/meta/decision/decision-distribution-v1-architecture.md`: why the OS is shaped this way
- `vault/wiki/_seed/meta/reference/standard-team-customization.md`: extension model for team forks
- `vault/wiki/_seed/meta/reference/`: detailed reference entries for each standard
- Plain files, no databases. Wiki entries are markdown with frontmatter; logs are JSONL; the index is JSON. Greppable, git-friendly, portable.
- Self-extending. The dashboard's `add domain`, `add skill`, `add app` workflows are how the OS grows. No bespoke kernel changes needed for new capabilities.
- Two-layer memory. Claude Code's built-in memory holds user profile + feedback; the OS's vault holds structured domain knowledge.
- Router-first dispatch. All actions flow through `/os`. Misses are logged to evolve the vocabulary.
- Apps are optional UI over the same files. Reading vault from React is just `fs.readFile`. No new auth, no new storage.
- Layered defense. For every load-bearing constraint we want to enforce (filled-in change descriptions, valid YAML, project-scoped scheduler firing, …), the enforcement lives in three places: the skill (fail-fast at the point of harm), the audit (passive scan, surfaces drift), and the dashboard (visual nudge). One layer alone fails silently; three layers catch what each misses.
- Backlinks are the inverse query. Manifest collects `[[wikilinks]]` from every entry. "What belongs to project X" / "what mentions decision Y" / "what changes touched repo Z" are all manifest reads, not stored lists. No dual-write, no list maintenance.
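The "backlinks are the inverse query" principle can be illustrated by building the inverse index from entry bodies. This is a sketch, not the manifest builder: `WikiEntry`, `extractWikilinks`, and `buildBacklinks` are hypothetical names, and only the plain `[[target]]` form is handled.

```typescript
// Sketch: derive backlinks from [[wikilinks]] instead of storing lists.
interface WikiEntry { id: string; body: string; }

// Unique link targets in order of first appearance.
function extractWikilinks(body: string): string[] {
  const links: string[] = [];
  const re = /\[\[([^\]]+)\]\]/g;
  let m: RegExpExecArray | null = re.exec(body);
  while (m !== null) {
    const target = m[1].trim();
    if (links.indexOf(target) === -1) links.push(target);
    m = re.exec(body);
  }
  return links;
}

// Maps each target to the ids of the entries that mention it.
function buildBacklinks(entries: WikiEntry[]): { [target: string]: string[] } {
  const index: { [target: string]: string[] } = {};
  for (let i = 0; i < entries.length; i++) {
    const targets = extractWikilinks(entries[i].body);
    for (let j = 0; j < targets.length; j++) {
      if (!Object.prototype.hasOwnProperty.call(index, targets[j])) index[targets[j]] = [];
      index[targets[j]].push(entries[i].id);
    }
  }
  return index;
}

const backlinks = buildBacklinks([
  { id: "change-a", body: "Part of [[build-agentic-os-v1]], follows [[standard-git-hygiene]]." },
  { id: "decision-b", body: "Decided under [[build-agentic-os-v1]]." },
]);
// backlinks["build-agentic-os-v1"] lists both entries.
```

"What mentions project X" then becomes a single lookup in the derived index, with nothing to keep in sync by hand.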
v1 build, distribution-ready for small-team installs. The architecture is locked (see vault/wiki/_seed/meta/decision/decision-distribution-v1-architecture.md), the standards are documented in vault/wiki/_seed/meta/reference/, and the OS scaffolds itself for everything beyond the initial bootstrap. End-to-end automation has been validated on real changes through the full lifecycle (research → plan → review → execute → PR → review → publish → close).
Deferred to v2+: bot-account separation for true PR APPROVE events (currently auto-downgrades to COMMENT when the PAT-holder is also the PR author), team-shared metrics aggregation across engineers, skill marketplace / upstream-tracking model.