rangelak/ColonyOS

ColonyOS

The fully autonomous AI engineering team that builds itself.

CI · PyPI version · MIT License · Python 3.11 | 3.12

Installation · Quickstart · How It Works · CLI · Config · Slack · Dashboard · VM Deploy · Architecture


ColonyOS is an autonomous software engineering pipeline. Give it a feature description — or let its built-in CEO agent decide what to build — and it writes a PRD, implements the code with tests, runs parallel multi-persona code reviews, fixes issues, and opens a pull request. No human in the loop.

Under the hood it orchestrates Claude agent sessions via the Claude Agent SDK with full codebase awareness. Point it at any repo and let it work.

ColonyOS builds itself. Every feature, fix, and review in this repo was proposed, implemented, and shipped by ColonyOS agents running on their own codebase.


Installation

Prerequisites

| Dependency | Why | Check |
| --- | --- | --- |
| Python 3.11+ | Runtime | `python3 --version` |
| Claude Code CLI | Agent execution engine | `claude --version` |
| Git | Branch/commit operations | `git --version` |
| GitHub CLI | PR creation, issue fetching | `gh auth status` |

Don't have Claude Code CLI yet? See Setting up Claude Code below.

Install ColonyOS

macOS (Homebrew):

brew install rangelak/colonyos/colonyos

Cross-platform (curl installer):

# Handles everything — installs pipx if needed, then colonyos
curl -sSL https://raw.githubusercontent.com/rangelak/ColonyOS/main/install.sh | sh

pip (if you already have it):

pip install colonyos

Optional extras

# Slack integration (Socket Mode listener)
pip install "colonyos[slack]"

# Web dashboard (local FastAPI + React UI)
pip install "colonyos[ui]"

# Development (tests, pre-commit hooks, dashboard)
pip install "colonyos[dev]"

Slack-enabled deployments work best on Python 3.11-3.13. The core package supports newer Python versions, but third-party Slack dependencies may lag behind new interpreter releases.

Verify your environment

colonyos doctor

This checks Python version, Claude Code CLI authentication, Git, GitHub CLI, and optional dependencies. Fix anything it flags before continuing.
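
As a rough illustration of what a prerequisite check of this kind involves, the sketch below looks up each CLI from the prerequisites table on PATH. It is not the actual colonyos doctor implementation: the helper name find_missing_tools and its injectable which parameter are hypothetical, and the real command also checks authentication and optional dependencies.

```python
import shutil
from typing import Callable, Iterable, Optional

# CLI names taken from the "Check" column of the prerequisites table.
REQUIRED_TOOLS = ("python3", "claude", "git", "gh")


def find_missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools that cannot be found on PATH (hypothetical helper)."""
    return [tool for tool in tools if which(tool) is None]
```

Calling find_missing_tools() with no arguments inspects the real PATH; passing a custom which makes the check easy to exercise in tests.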


Quickstart

1. Initialize a project

cd your-project/
colonyos init

By default, an AI assistant reads your repo, detects your tech stack, and proposes a complete configuration for you to confirm with a single "y". For the classic interactive wizard:

colonyos init --manual

For zero-prompt setup:

colonyos init --quick --name "MyApp" --description "B2B analytics" --stack "Python/FastAPI"

2. Run the pipeline

Directed mode — you choose what to build:

colonyos run "Add a health check endpoint"

From a GitHub issue:

colonyos run --issue 42

Autonomous mode — the CEO agent decides what to build:

colonyos auto

Long-running autonomous loop — walk away for the day:

colonyos auto --loop 50 --max-hours 24 --max-budget 500 --no-confirm

That's it. ColonyOS generates a PRD, implements the code with tests, runs multi-persona code review, and opens a pull request.


How It Works

ColonyOS runs a multi-phase pipeline where each phase is an isolated Claude agent session with its own instruction template and budget cap.

flowchart LR
    CEO["🧠 CEO\nPropose"] --> Plan["📋 Plan\nPRD + Tasks"]
    Plan --> Impl["⚡ Implement\nCode + Tests"]
    Impl --> Verify["✅ Verify\nRun Tests"]
    Verify --> Review["🔍 Review / Fix\nLoop"]
    Review --> Decision{"🎯 Decision\nGate"}
    Decision -->|GO| Deliver["🚀 Deliver\nPush + PR"]
    Decision -->|NO-GO| Stop["⛔ Stopped"]

| Phase | What happens |
| --- | --- |
| CEO (auto mode only) | Reviews the strategic directions landscape doc (if present), analyzes the project and its history (CHANGELOG.md), then proposes the single highest-impact feature. Optionally refreshes directions after each proposal. |
| Plan | Explores the codebase, generates a PRD with clarifying Q&A from your defined personas (running as parallel subagents), and produces a task breakdown. |
| Implement | Creates a feature branch, writes tests first, then implements each task. Commits incrementally. |
| Verify (optional) | Runs your test command (e.g. pytest, npm test). Failed tests trigger implement retries with failure context before the expensive review phase. |
| Review / Fix Loop | Reviewer personas run independent, parallel, read-only reviews. If any request changes, a Staff+ fix agent addresses findings, then reviewers re-run. Repeats up to max_fix_iterations. |
| Decision Gate | Reads all review artifacts and makes a GO / NO-GO verdict. NO-GO halts the pipeline. |
| Deliver | Pushes the branch, opens a PR linking back to the PRD, and updates CHANGELOG.md. |
| Learn | Extracts patterns from review artifacts and persists them to .colonyos/learnings.md for future runs. |

Review / Fix loop detail

flowchart TD
    Start["Start Review Round"] --> Parallel["Parallel Persona Reviews\n(read-only, concurrent)"]
    Parallel --> Check{"All\napprove?"}
    Check -->|Yes| Gate["Decision Gate"]
    Check -->|No| Fix["Fix Agent\nAddress Findings"]
    Fix --> Budget{"Budget &\niterations\nremaining?"}
    Budget -->|Yes| Parallel
    Budget -->|No| Gate
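
The control flow in this diagram can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the orchestrator's code: reviewers are modeled as plain callables returning "approve" or "request-changes", threads stand in for parallel agent sessions, the budget check is reduced to an iteration count, and review_fix_loop is a hypothetical name. It assumes max_fix_iterations is at least 1.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

Reviewer = Callable[[], str]  # returns "approve" or "request-changes" (assumed labels)


def review_fix_loop(
    reviewers: Sequence[Reviewer],
    fix: Callable[[list[str]], None],
    max_fix_iterations: int,
) -> dict:
    """Run review rounds until all approve or the fix iterations are used up."""
    fixes_done = 0
    while True:
        # Parallel, independent reviews (threads stand in for agent sessions).
        with ThreadPoolExecutor() as pool:
            verdicts = list(pool.map(lambda reviewer: reviewer(), reviewers))
        if all(verdict == "approve" for verdict in verdicts):
            return {"fixes": fixes_done, "all_approved": True}
        # At least one reviewer requested changes: run the fix agent once.
        fix(verdicts)
        fixes_done += 1
        # As in the diagram, the remaining-iterations check comes after the fix;
        # when nothing remains, control passes to the decision gate.
        if fixes_done >= max_fix_iterations:
            return {"fixes": fixes_done, "all_approved": False}
```

Either return value corresponds to handing over to the Decision Gate; the gate itself is out of scope for this sketch.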

CLI Reference

Getting started

| Command | Description |
| --- | --- |
| `colonyos doctor` | Check all prerequisites and environment health |
| `colonyos init` | AI-assisted setup (default): reads repo, proposes config |
| `colonyos init --manual` | Classic interactive wizard |
| `colonyos init --quick` | Zero-prompt setup with defaults |
| `colonyos init --personas` | Re-run just the persona workshop |
| `colonyos status` | Show recent runs, loop summaries, and cost breakdown |
| `colonyos directions` | View CEO strategic directions |
| `colonyos directions --regenerate` | Regenerate directions from scratch |
| `colonyos directions --static` | Lock directions (CEO reads but never rewrites) |
| `colonyos directions --auto-update` | Unlock directions to evolve each CEO iteration |

Running the pipeline

| Command | Description |
| --- | --- |
| `colonyos run "prompt"` | Directed mode: plan, implement, review, deliver |
| `colonyos run "prompt" --plan-only` | Stop after PRD + tasks (no code) |
| `colonyos run --from-prd PATH` | Skip planning, implement an existing PRD |
| `colonyos run --issue NUMBER` | Use a GitHub issue as the feature prompt |
| `colonyos run --resume RUN_ID` | Resume a failed run from the last successful phase |
| `colonyos run --offline` | Skip remote git checks (preflight) |
| `colonyos run --force` | Bypass preflight warnings |
| `colonyos run "prompt"` | Run the pipeline; interactive terminals default to the Textual TUI |
| `colonyos run "prompt" --no-tui` | Force plain streaming output instead of the TUI |

Interactive TUI

| Command | Description |
| --- | --- |
| `colonyos tui` | Deprecated alias for the interactive terminal UI (requires `pip install colonyos[tui]`) |
| `colonyos tui "prompt"` | Deprecated alias that launches the TUI and immediately runs a prompt |

Autonomous mode

| Command | Description |
| --- | --- |
| `colonyos auto` | CEO proposes a feature and the pipeline builds it |
| `colonyos auto --loop N` | Run N autonomous cycles back-to-back |
| `colonyos auto --max-hours H` | Stop loop after H wall-clock hours |
| `colonyos auto --max-budget USD` | Stop loop after USD aggregate spend |
| `colonyos auto --no-confirm` | Skip human approval of CEO proposals |
| `colonyos auto --propose-only` | CEO proposes but does not execute |
| `colonyos auto --resume-loop` | Resume the most recent interrupted loop |

Codebase sweep

| Command | Description |
| --- | --- |
| `colonyos sweep` | Analyze entire codebase for quality issues (dry-run report) |
| `colonyos sweep PATH` | Scope analysis to a specific file or directory |
| `colonyos sweep --execute` | Fix findings via the implement→review pipeline |
| `colonyos sweep --execute --plan-only` | Generate task file but stop before implementation |
| `colonyos sweep --max-tasks N` | Cap the number of findings (default: 5) |

Code review

| Command | Description |
| --- | --- |
| `colonyos review BRANCH` | Standalone multi-persona review on any branch |
| `colonyos review --base BRANCH` | Base branch to diff against (default: main) |
| `colonyos review --no-fix` | Review only, skip the fix loop |
| `colonyos review --decide` | Run the decision gate after reviews |

Execution queue

| Command | Description |
| --- | --- |
| `colonyos queue add "p1" "p2" --issue 42` | Enqueue prompts and/or GitHub issues |
| `colonyos queue start` | Process pending items sequentially |
| `colonyos queue start --max-cost N` | Aggregate USD cap for the queue |
| `colonyos queue start --max-hours N` | Wall-clock cap for the queue |
| `colonyos queue status` | Show queue state |
| `colonyos queue clear` | Remove all pending items |
| `colonyos queue unpause` | Unpause the queue after a circuit breaker trip |

Memory

| Command | Description |
| --- | --- |
| `colonyos memory list` | Show recent memory entries |
| `colonyos memory list --category codebase` | Filter by category |
| `colonyos memory search QUERY` | Search memories by keyword |
| `colonyos memory delete ID` | Delete a memory entry by ID |
| `colonyos memory clear --yes` | Delete all memory entries |
| `colonyos memory stats` | Show memory store statistics |

CI fix

| Command | Description |
| --- | --- |
| `colonyos ci-fix PR_NUMBER` | Fetch CI failure logs and auto-fix the code |
| `colonyos ci-fix PR --wait` | Fix, then wait for CI re-run to pass |
| `colonyos ci-fix PR --max-retries N` | Retry the fix-push-wait cycle up to N times |

PR review auto-fix

| Command | Description |
| --- | --- |
| `colonyos pr-review PR_NUMBER` | Process inline review comments and auto-fix actionable feedback |
| `colonyos pr-review PR --watch` | Continuously poll for new review comments |
| `colonyos pr-review PR --poll-interval N` | Poll interval in seconds (default: 60) |
| `colonyos pr-review PR --max-cost USD` | Override per-PR budget cap (default: $5) |

PR outcome tracking

| Command | Description |
| --- | --- |
| `colonyos outcomes` | Show tracked PR outcomes (merge/close status) |
| `colonyos outcomes poll` | Manually poll GitHub for open PR status updates |

Analytics & inspection

| Command | Description |
| --- | --- |
| `colonyos stats` | Aggregate analytics dashboard (cost, duration, failures) |
| `colonyos stats --last N` | Limit to the N most recent runs |
| `colonyos stats --phase NAME` | Drill into a specific phase |
| `colonyos show RUN_ID` | Detailed single-run inspection |
| `colonyos show RUN_ID --json` | Machine-readable JSON output |
| `colonyos show RUN_ID --phase NAME` | Inspect a specific phase within a run |
| `colonyos map` | Print a structural summary (repo map) of the codebase |
| `colonyos map --max-tokens N` | Limit repo map output to N tokens |
| `colonyos map --prompt "text"` | Rank files by relevance to the given prompt |

Codebase maintenance

| Command | Description |
| --- | --- |
| `colonyos cleanup branches` | List merged colonyos/ branches (dry-run) |
| `colonyos cleanup branches --execute` | Delete merged branches |
| `colonyos cleanup branches --include-remote` | Also prune from origin |
| `colonyos cleanup artifacts` | List stale run artifacts beyond retention |
| `colonyos cleanup artifacts --execute` | Delete stale artifacts |
| `colonyos cleanup artifacts --retention-days N` | Override retention period (default: 30) |
| `colonyos cleanup scan` | Find large files, long functions, dead code |
| `colonyos cleanup scan --ai` | AI-powered qualitative analysis |

Slack & Dashboard

| Command | Description |
| --- | --- |
| `colonyos daemon` | Start fully autonomous daemon (Slack + GitHub + CEO + cleanup) |
| `colonyos daemon --max-budget N` | Daily budget cap in USD |
| `colonyos daemon --max-hours N` | Maximum wall-clock hours before exit |
| `colonyos daemon --dry-run` | Log what would run without executing |
| `colonyos watch-slack` | Watch Slack channels and trigger runs from messages |
| `colonyos watch-slack --dry-run` | Log triggers without executing |
| `colonyos watch-slack --max-hours N` | Wall-clock limit for the watcher |
| `colonyos watch-slack --max-budget N` | Aggregate USD spend limit |
| `colonyos watch` | Deprecated alias for watch-slack |
| `colonyos ui` | Launch the local web dashboard |
| `colonyos ui --port N` | Custom port (default: 7400) |
| `colonyos ui --no-open` | Don't auto-open the browser |

Repo runtime exclusivity: ColonyOS allows only one repo-bound runtime per checkout at a time. If colonyos daemon is already active for a repository, a separate watch-slack, queue start, auto, or other guarded runtime started against that same repo will fail fast instead of racing on branches, queue state, or Slack intake. The guard writes transient local state to .colonyos/runtime.lock and .colonyos/runtime_processes.json; both files are generated at runtime and are gitignored.
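
To make the fail-fast behavior concrete, here is a minimal sketch of an exclusive lock file. Only the .colonyos/runtime.lock path comes from the description above; the atomic-create mechanism, the RuntimeBusyError name, and both helper functions are assumptions for illustration, and the real guard also maintains .colonyos/runtime_processes.json, which is not modeled here.

```python
import os
from pathlib import Path


class RuntimeBusyError(RuntimeError):
    """Raised when another repo-bound runtime already holds the lock (hypothetical)."""


def acquire_runtime_lock(repo_root: Path, runtime_name: str) -> Path:
    """Create .colonyos/runtime.lock atomically, or fail fast if it already exists."""
    lock_path = repo_root / ".colonyos" / "runtime.lock"
    lock_path.parent.mkdir(parents=True, exist_ok=True)
    try:
        # O_CREAT | O_EXCL makes creation fail if the file is already present.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeBusyError(f"{lock_path} is held by another runtime") from None
    with os.fdopen(fd, "w") as handle:
        handle.write(f"{runtime_name} pid={os.getpid()}\n")
    return lock_path


def release_runtime_lock(lock_path: Path) -> None:
    """Remove the lock file so another runtime can start."""
    lock_path.unlink(missing_ok=True)
```

A production guard would also need to detect stale locks left behind by crashed processes, which this sketch deliberately omits.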

For dedicated daemon checkouts, daemon.auto_recover_dirty_worktree: true enables an opt-in preserve-and-reset recovery path when queue execution hits a dirty-worktree preflight failure. Leave it off for shared human/dev checkouts, because it may stash/reset local edits so queued work can continue.

Global flags

| Flag | Applies to | Description |
| --- | --- | --- |
| `-v` / `--verbose` | run, auto | Stream agent text alongside tool activity |
| `-q` / `--quiet` | run, auto | Suppress the streaming UI |

Configuration Reference

Configuration lives at .colonyos/config.yaml, created by colonyos init.

Project & personas

project:
  name: "MyApp"
  description: "B2B analytics platform"
  stack: "Python/FastAPI, React, PostgreSQL"

personas:
  - role: "Senior Backend Engineer"
    expertise: "API design, database modeling, performance"
    perspective: "Thinks about scalability and data integrity"
    reviewer: true        # participates in code reviews (default: false)
  - role: "Product Lead"
    expertise: "User research, prioritization"
    perspective: "Thinks about user value and shipping incrementally"
    # reviewer defaults to false — plan-phase only

Personas are the expert panel that reviews your code and asks clarifying questions during planning. Mark reviewer: true on personas you want in the review loop.

Model selection

model: opus                  # global default: opus | sonnet | haiku

# Per-phase overrides — route mechanical phases to cheaper models
phase_models:
  plan: sonnet
  implement: opus
  review: opus
  fix: sonnet
  deliver: haiku
  ceo: opus
  verify: haiku
  learn: haiku

Using per-phase overrides can reduce costs by 50–70% while keeping opus for deep reasoning tasks.
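
The lookup this implies is a simple fallback: use the phase-specific entry when one exists, otherwise the global model. The sketch below shows that rule on a plain dict shaped like the YAML above; resolve_model is a hypothetical name, and the real loader in config.py may validate or resolve values differently.

```python
def resolve_model(phase: str, config: dict) -> str:
    """Pick the model for a phase: per-phase override first, then the global default."""
    overrides = config.get("phase_models") or {}
    return overrides.get(phase, config["model"])


# Dict mirroring part of the YAML example above.
example_config = {
    "model": "opus",
    "phase_models": {"plan": "sonnet", "fix": "sonnet", "deliver": "haiku"},
}

plan_model = resolve_model("plan", example_config)          # override applies
decision_model = resolve_model("decision", example_config)  # falls back to global
```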

Budget & safety caps

budget:
  per_phase: 5.00            # USD per Claude agent session
  per_run: 15.00             # USD total cap for a full pipeline run
  max_duration_hours: 8.0    # wall-clock cap for autonomous loops
  max_total_usd: 500.0       # aggregate spend cap for autonomous loops

Pipeline phases

phases:
  plan: true
  implement: true
  review: true               # parallel per-persona reviews + fix loop
  deliver: true              # set false to skip PR creation

max_fix_iterations: 2        # review/fix cycles before decision gate
auto_approve: true           # skip human confirmation in autonomous mode

Strategic directions

directions_auto_update: true   # false = directions stay read-only between iterations

The CEO agent reads .colonyos/directions.md before every proposal for landscape context, inspiration from similar projects, and the user's north star goals. Generate it with colonyos directions --regenerate.

When directions_auto_update is true (default), a lightweight agent refreshes the document after each CEO proposal. Set to false if you prefer to hand-curate your directions and keep them static.

Verification gate

verification:
  verify_command: "pytest --tb=short -q"   # auto-detected by `colonyos init`
  max_verify_retries: 2
  verify_timeout: 120                       # seconds

Runs your test suite between implement and review. Failed tests trigger implement retries with failure context — catching bugs before the expensive review phase.
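
A minimal sketch of such a gate, assuming the three settings above map directly onto a subprocess call with a timeout and a bounded retry loop. The run_verify function and its retry_implement callback are hypothetical stand-ins for the real implement retry, which runs an agent session with the failure output as context.

```python
import shlex
import subprocess
from typing import Callable


def run_verify(
    verify_command: str,
    verify_timeout: int,
    max_verify_retries: int,
    retry_implement: Callable[[str], None],
) -> bool:
    """Run the test command; on failure, hand the output to a retry callback."""
    for attempt in range(max_verify_retries + 1):
        try:
            proc = subprocess.run(
                shlex.split(verify_command),
                capture_output=True,
                text=True,
                timeout=verify_timeout,
            )
            passed = proc.returncode == 0
            failure_context = proc.stdout + proc.stderr
        except subprocess.TimeoutExpired:
            passed = False
            failure_context = f"verify command timed out after {verify_timeout}s"
        if passed:
            return True
        if attempt < max_verify_retries:
            # Stand-in for re-running the implement phase with failure context.
            retry_implement(failure_context)
    return False
```

With max_verify_retries: 2 the command runs at most three times: the initial attempt plus two retries.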

CI fix

ci_fix:
  enabled: false             # set true to auto-fix CI failures in deliver phase
  max_retries: 2             # retry fix-push-wait cycle
  wait_timeout: 600          # seconds to wait for CI re-run
  log_char_cap: 12000        # truncate CI logs sent to the fix agent

PR review auto-fix

pr_review:
  budget_per_pr: 5.0                  # max USD spend per PR (default: $5)
  max_fix_rounds_per_pr: 3            # max fix attempts per PR
  poll_interval_seconds: 60           # watch mode poll interval
  circuit_breaker_threshold: 3        # pause after N consecutive failures
  circuit_breaker_cooldown_minutes: 15

PR Sync

Keeps open colonyos/ PRs up-to-date with main by automatically merging the latest main into stale branches. The daemon detects PRs that are behind main (via mergeStateStatus from the GitHub API) and performs the merge in an isolated ephemeral worktree — the main working tree is never touched.

How to enable:

  1. Set COLONYOS_WRITE_ENABLED=1 (or dashboard_write_enabled: true in config) — PR sync pushes to remote branches, so write access is required.
  2. Enable PR sync in your config:

pr_sync:
  enabled: true                # default: false (opt-in)
  interval_minutes: 60         # how often to check for stale PRs
  max_sync_failures: 3         # per-PR retry cap before giving up

Behavior:

  • Runs as concern #7 in the daemon tick loop on its own timer (separate from outcome polling).
  • Processes at most 1 PR per tick, consistent with the daemon's sequential model.
  • Only syncs branches matching the configured branch_prefix (default: colonyos/).
  • Skips PRs whose branch has a RUNNING queue item to avoid conflicts with active pipelines.
  • Uses git merge origin/main --no-edit (never rebase or force-push) to preserve review state.

On merge conflicts:

  • The merge is aborted cleanly (git merge --abort) and the branch is left untouched.
  • A Slack notification is sent with the list of conflicting files.
  • A PR comment is posted detailing the conflict.
  • The sync_failures counter is incremented. After max_sync_failures consecutive failures, the PR is skipped until manually resolved.
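
The merge-or-abort behavior can be sketched as follows. The two git commands are the ones named above; everything else is an assumption for illustration: the sync_with_main name, the injectable git runner, resetting the counter after a successful merge, and treating any non-zero merge exit as a conflict. Slack notifications and PR comments are left out.

```python
import subprocess
from typing import Callable, Sequence, Tuple


def run_git(args: Sequence[str], cwd: str) -> int:
    """Run a git subcommand in the given worktree and return its exit code."""
    return subprocess.run(["git", *args], cwd=cwd).returncode


def sync_with_main(
    worktree: str,
    sync_failures: int,
    max_sync_failures: int = 3,
    git: Callable[[Sequence[str], str], int] = run_git,
) -> Tuple[str, int]:
    """Return (status, updated_failure_count) for one sync attempt."""
    if sync_failures >= max_sync_failures:
        # Too many consecutive failures: leave the PR for manual resolution.
        return "skipped", sync_failures
    if git(["merge", "origin/main", "--no-edit"], worktree) == 0:
        return "merged", 0  # assumption: a success resets the consecutive count
    # Abort cleanly so the branch is left untouched.
    git(["merge", "--abort"], worktree)
    return "conflict", sync_failures + 1
```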

Retry on transient API errors

retry:
  max_attempts: 3              # total attempts per phase (1 = no retry)
  base_delay_seconds: 10.0     # base delay for exponential backoff
  max_delay_seconds: 120.0     # ceiling for backoff delay
  fallback_model: null          # optional: "sonnet" to retry with a lighter model

When the Anthropic API returns a transient error (HTTP 529 overloaded, 503), phases automatically retry with exponential backoff and jitter. Set max_attempts: 1 to disable retry. The optional fallback_model activates only after all retries are exhausted and is hard-blocked on safety-critical phases (review, decision, fix).
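
For intuition about how these settings interact, the sketch below computes a capped exponential delay with jitter. The exact formula ColonyOS uses is not documented here, so the doubling schedule and the "equal jitter" variant (half fixed, half random) are assumptions, as is the backoff_delay name.

```python
import random
from typing import Callable


def backoff_delay(
    attempt: int,
    base_delay_seconds: float = 10.0,
    max_delay_seconds: float = 120.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Delay before retry number `attempt` (1-based), capped and jittered."""
    capped = min(max_delay_seconds, base_delay_seconds * 2 ** (attempt - 1))
    # Equal jitter (assumed): keep half the delay fixed, randomize the other half.
    return capped * (0.5 + 0.5 * rng())
```

With the defaults above, the upper bound of the delay grows 10 s, 20 s, 40 s, 80 s and then stays at the 120 s ceiling.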

Cross-run learnings

learnings:
  enabled: true
  max_entries: 100

After each run, the pipeline extracts patterns from review artifacts and persists them to .colonyos/learnings.md. Future implement and fix phases see these learnings as context, enabling the pipeline to self-improve across iterations.

Codebase cleanup

cleanup:
  branch_retention_days: 0       # 0 = prune all merged colonyos/ branches
  artifact_retention_days: 30    # run logs older than this are stale
  scan_max_lines: 500            # flag files exceeding this line count
  scan_max_functions: 20         # flag files with more functions than this

Parallel implement mode

parallel_implement:
  enabled: true                  # enable parallel task execution
  max_parallel_agents: 3         # max concurrent agent sessions
  conflict_strategy: auto        # auto | fail | manual
  merge_timeout_seconds: 60      # timeout for merge lock acquisition
  worktree_cleanup: true         # remove worktrees after completion

When enabled, the implement phase parses the task file for dependency annotations (depends_on: [...]) and executes independent tasks in parallel using isolated git worktrees. This can significantly speed up multi-task implementations.
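
One way to picture the scheduling is as a topological sort over the depends_on annotations, where every task whose dependencies are finished can run in the same batch. The sketch below uses Python's standard graphlib for that; the task IDs, the dict input format, and the parallel_batches helper are hypothetical, and the real parser and worktree handling are not shown.

```python
from graphlib import TopologicalSorter


def parallel_batches(
    depends_on: dict[str, list[str]], max_parallel_agents: int
) -> list[list[str]]:
    """Group tasks into batches that can run concurrently, respecting dependencies."""
    sorter = TopologicalSorter(depends_on)  # maps task -> tasks it depends on
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches: list[list[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        # Never schedule more tasks at once than there are agent slots.
        for start in range(0, len(ready), max_parallel_agents):
            batches.append(ready[start:start + max_parallel_agents])
        sorter.done(*ready)
    return batches


# Hypothetical task graph: 3.0 needs 1.0 and 2.0; 4.0 needs 3.0.
example_tasks = {"1.0": [], "2.0": [], "3.0": ["1.0", "2.0"], "4.0": ["3.0"]}
example_batches = parallel_batches(example_tasks, max_parallel_agents=3)
```

Here the first batch holds the two independent tasks, and the dependent tasks follow one at a time.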

Conflict strategies:

  • auto: Spawn a conflict resolution agent when merge conflicts occur
  • fail: Abort on any merge conflict (useful for strict CI environments)
  • manual: Leave conflicts in place for human intervention

Budget allocation: Each parallel agent receives budget.per_phase / max_parallel_agents to contain blast radius from runaway agents.

Graceful degradation: Falls back to sequential mode if:

  • Only 1 task detected
  • Repository is a shallow clone
  • Git version < 2.5.0 (worktree support)
  • Disk space below threshold

Branch & directory naming

branch_prefix: "colonyos/"
prds_dir: "cOS_prds"
tasks_dir: "cOS_tasks"
reviews_dir: "cOS_reviews"
proposals_dir: "cOS_proposals"

Slack Integration

ColonyOS can listen to Slack channels and trigger pipeline runs from messages — your team can request features or report bugs directly in Slack without context-switching.

1. Create a Slack app

The fastest way is with the included manifest — it pre-configures all scopes, events, and Socket Mode in one step:

  1. Go to api.slack.com/apps and click Create New App → From an app manifest.
  2. Select your workspace.
  3. Choose YAML and paste the contents of slack-app-manifest.yaml from this repo.
  4. Click Create.

Manual setup (without manifest)

  1. Go to api.slack.com/apps → Create New App → From scratch.
  2. OAuth & Permissions — add these Bot Token Scopes:
    • app_mentions:read — detect @ColonyOS mentions
    • channels:history — read channel messages
    • chat:write — post progress updates
    • reactions:read — detect emoji triggers (if using reaction mode)
    • reactions:write — acknowledge messages with reactions
  3. Socket Mode — enable Socket Mode and generate an App-Level Token with the connections:write scope.
  4. Event Subscriptions — subscribe to bot events: app_mention, message.channels, reaction_added.

2. Install the app and grab tokens

  1. In your Slack app settings, go to Settings → Socket Mode → generate an App-Level Token with connections:write scope. Copy the xapp-... token.
  2. Go to Install App → Install to Workspace and authorize. Copy the Bot User OAuth Token (xoxb-...).

3. Set environment variables

export COLONYOS_SLACK_BOT_TOKEN="xoxb-your-bot-token"
export COLONYOS_SLACK_APP_TOKEN="xapp-your-app-level-token"

4. Configure ColonyOS

Add a slack section to .colonyos/config.yaml:

slack:
  enabled: true
  channels:
    - bugs                   # channel names (with or without #) or IDs
    - feature-requests
  trigger_mode: mention          # mention | reaction | slash_command
  auto_approve: false            # require human approval before executing
  max_runs_per_hour: 3           # rate limit
  allowed_user_ids: []           # empty = allow all workspace members
  daily_budget_usd: 50.0         # optional daily spend cap
  max_queue_depth: 20            # max pending items from Slack
  max_consecutive_failures: 3    # pause watcher after N consecutive failures

5. Start the watcher

# Install the Slack extra if you haven't
pip install "colonyos[slack]"

# Start watching
colonyos watch-slack

If Slack imports fail on startup, run colonyos doctor, reinstall the Slack extra, and prefer Python 3.11-3.13 for the watcher or daemon process.

The watcher runs as a long-lived process using Slack Bolt SDK with Socket Mode (no public URL required). When someone mentions @ColonyOS fix the login bug in a configured channel, the watcher sanitizes the input, triggers a pipeline run, and posts threaded progress updates back to the Slack thread.
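
As a small illustration of the first step in that flow, the sketch below strips the bot mention from a message and applies the allowed_user_ids rule from the config (empty list means everyone is allowed). It relies on Slack's `<@USERID>` mention encoding in message text; the function names are hypothetical, and this is not the sanitization that sanitize.py performs, which is not described in this README.

```python
import re

# Slack encodes user mentions in message text as <@U...> (optionally <@U...|name>).
MENTION_PATTERN = re.compile(r"<@[UW][A-Z0-9]+(?:\|[^>]+)?>")


def extract_prompt(message_text: str) -> str:
    """Drop mention tokens and surrounding whitespace, leaving the request text."""
    return MENTION_PATTERN.sub("", message_text).strip()


def is_user_allowed(user_id: str, allowed_user_ids: list[str]) -> bool:
    """An empty allow-list means every workspace member may trigger runs."""
    return not allowed_user_ids or user_id in allowed_user_ids
```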

watch-slack is mutually exclusive with other repo-bound runtimes for the same checkout. If the daemon is already running and watching Slack, do not start a second standalone watcher for that repo; ColonyOS will reject it with a runtime-busy error instead of letting both processes mutate the same branch or queue.

Trigger modes:

| Mode | How it works |
| --- | --- |
| mention | `@ColonyOS <prompt>` in any configured channel |
| reaction | Add a specific emoji reaction to any message |
| slash_command | `/colonyos <prompt>` (requires slash command setup in Slack app config) |

Web Dashboard

ColonyOS includes a local web dashboard for monitoring runs, viewing costs, and managing configuration.

pip install "colonyos[ui]"
colonyos ui

This starts a FastAPI server at http://localhost:7400 serving a React + Tailwind SPA. The dashboard provides:

  • Run history with phase-by-phase timelines
  • Cost trend charts across runs
  • Inline configuration editing (config, personas)
  • Run launching from the browser
  • Artifact previews (PRDs, reviews, proposals)

The dashboard is localhost-only and supports bearer token authentication for write operations.


Output Structure

ColonyOS creates cOS_-prefixed directories in your repo as a timestamped audit trail:

your-repo/
├── CHANGELOG.md                                    # auto-updated by deliver phase
├── cOS_prds/
│   └── 20260317_172530_prd_stripe_billing.md       # generated PRDs
├── cOS_tasks/
│   └── 20260317_172530_tasks_stripe_billing.md     # task breakdowns
├── cOS_reviews/
│   ├── decisions/
│   │   └── 20260317_decision_stripe_billing.md     # GO/NO-GO verdicts
│   └── reviews/
│       ├── linus_torvalds/
│       │   └── 20260317_review_stripe_billing.md
│       └── staff_security_engineer/
│           └── 20260317_review_stripe_billing.md
├── cOS_proposals/
│   └── 20260317_proposal_ceo_proposal.md           # CEO proposals
└── .colonyos/
    ├── config.yaml                                 # project configuration
    ├── directions.md                               # CEO landscape & inspiration doc
    ├── learnings.md                                # cross-run learnings
    └── runs/                                       # run logs (gitignored)
        └── run-20260317-abc123.json

The CEO reads CHANGELOG.md and directions.md before proposing new features — the changelog avoids duplicating past work, while directions provide landscape context and inspiration from similar projects. Run logs (costs, durations, session IDs) and loop state go to .colonyos/runs/, which is gitignored.
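
The filenames in the tree follow a timestamp, kind, slug pattern. A minimal sketch of generating the full-timestamp form (as in the PRD and task entries) is shown below; the slug rules and the artifact_filename helper are assumptions rather than the logic in naming.py, and the date-only prefix seen on review and decision files is not covered.

```python
import re
from datetime import datetime


def slugify(title: str) -> str:
    """Lowercase the title and collapse anything non-alphanumeric into underscores."""
    return re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")


def artifact_filename(kind: str, title: str, when: datetime) -> str:
    """Build a name like 20260317_172530_prd_stripe_billing.md."""
    return f"{when:%Y%m%d_%H%M%S}_{kind}_{slugify(title)}.md"


example_name = artifact_filename("prd", "Stripe billing", datetime(2026, 3, 17, 17, 25, 30))
```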


Architecture

src/colonyos/
├── cli.py              # Click CLI — all commands, REPL mode, welcome banner
├── init.py             # Interactive setup wizard, persona packs, --quick mode
├── orchestrator.py     # Phase chaining: CEO → Plan → Implement → Verify → Review → Deliver → Learn
├── agent.py            # Claude Agent SDK wrapper, parallel execution support
├── config.py           # .colonyos/config.yaml loader + validation
├── directions.py       # CEO strategic directions — generation, updates, display
├── models.py           # Persona, PhaseResult, RunLog, LoopState, QueueItem
├── naming.py           # Deterministic timestamped filenames, slug generation
├── persona_packs.py    # Prebuilt persona packs (startup, backend, fullstack, opensource)
├── ui.py               # Rich streaming terminal UI, phase progress display
├── stats.py            # Aggregate analytics computation + Rich rendering
├── show.py             # Single-run inspector (data layer + render layer)
├── doctor.py           # Prerequisite validation (Python, Claude, Git, GitHub CLI)
├── github.py           # GitHub issue fetching, PR helpers
├── ci.py               # CI failure detection, log retrieval, fix agent
├── cleanup.py          # Branch pruning, artifact cleanup, structural scan
├── slack.py            # Slack Bolt listener, dedup, threaded replies
├── server.py           # FastAPI API for the web dashboard
├── learnings.py        # Cross-run learning extraction + persistence
├── sanitize.py         # Input sanitization for untrusted content (Slack, issues)
├── instructions/       # Markdown templates passed as system prompts
│   ├── ceo.md          # Autonomous feature proposal
│   ├── directions_gen.md  # Landscape doc generation prompt
│   ├── plan.md         # PRD + task generation with persona Q&A
│   ├── implement.md    # Test-first implementation
│   ├── review.md       # Per-persona structured review with VERDICT output
│   ├── fix.md          # Staff+ engineer fix agent
│   ├── decision.md     # GO/NO-GO decision gate
│   ├── deliver.md      # Branch push + PR creation
│   ├── verify_fix.md   # Test failure fix instructions
│   ├── ci_fix.md       # CI failure fix instructions
│   ├── learn.md        # Learning extraction
│   ├── cleanup_scan.md # Structural analysis
│   ├── review_standalone.md
│   ├── fix_standalone.md
│   └── decision_standalone.md
└── web_dist/           # Pre-built Vite SPA (React + Tailwind)

Instruction templates are shipped with the package. Override any of them by placing a file with the same name in .colonyos/instructions/ in your repo.
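
The override rule amounts to a two-step lookup: prefer a same-named file under .colonyos/instructions/, otherwise use the packaged template. The sketch below shows that precedence with pathlib; resolve_template and the packaged_dir argument are hypothetical, and the package may locate its bundled templates differently.

```python
from pathlib import Path


def resolve_template(name: str, repo_root: Path, packaged_dir: Path) -> Path:
    """Return the repo-level override if present, else the packaged template path."""
    override = repo_root / ".colonyos" / "instructions" / name
    return override if override.is_file() else packaged_dir / name
```

For example, placing a review.md in .colonyos/instructions/ would make this lookup return the repo copy for "review.md" while every other template still resolves to the packaged version.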


Security Model

ColonyOS runs Claude Code sessions with bypassPermissions enabled — the agent has full read/write/execute access within your repository. This is by design: the agent needs to create branches, write code, run tests, and push to GitHub.

What this means for you:

  • Only run ColonyOS on repos where you trust the agent to modify files.
  • Use budget caps (per_run, max_total_usd) to limit blast radius.
  • Review generated PRs before merging, just as you would for any contributor.
  • Slack integration sanitizes all incoming messages to mitigate prompt injection, but treat it as defense-in-depth, not a guarantee.
  • Long-running loops with auto_approve: true amplify the scope of autonomous action — set conservative budget and time caps.

Development

git clone https://github.com/rangelak/ColonyOS.git
cd ColonyOS
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest

The [dev] extra installs pytest, pre-commit, and the dashboard dependencies. The pre-commit hook runs a targeted pytest selection based on staged files so local commits stay fast, while CI still runs the full suite.

Releasing

ColonyOS uses tag-based automated releases. The version is derived from git tags via setuptools-scm — there is no hardcoded version string.

git tag v0.2.0
git push origin v0.2.0
# CI automatically: runs tests → builds → publishes to PyPI → creates GitHub Release

VM Deployment

Deploy ColonyOS as an always-on daemon on a fresh Ubuntu 22.04+ VM:

git clone https://github.com/rangelak/ColonyOS.git /tmp/colonyos-setup
cd /tmp/colonyos-setup
sudo bash deploy/provision.sh

The provisioning script installs all dependencies (Python 3.11+, Node.js, GitHub CLI, pipx), creates a colonyos system user, configures the systemd service, and runs colonyos doctor to verify the setup. See deploy/README.md for full options including --dry-run, --slack, and non-interactive mode.


Setting up Claude Code

ColonyOS requires the Claude Code CLI as its execution engine. If you don't have it yet:

1. Install Node.js (required by Claude Code):

# macOS
brew install node

# Linux (Debian/Ubuntu)
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs

2. Install Claude Code CLI:

npm install -g @anthropic-ai/claude-code

3. Authenticate:

claude
# Follow the prompts to connect your Anthropic account

4. Install GitHub CLI (needed for PR creation and issue fetching):

# macOS
brew install gh

# Linux (Debian/Ubuntu)
sudo apt install gh

5. Authenticate GitHub CLI:

gh auth login

6. Verify everything:

claude --version
gh auth status
colonyos doctor

License

MIT — Rangel Milushev
