The first self-expanding autonomous coding agent with hard safety rails.
Fully autonomous • Self-expanding • Browser automation • Cross-platform • Hard safety rails
AI coding agents today are either autonomous but unsafe (Devin, Cursor — sandbox-only safety) or safe but constantly asking for permission (vanilla Claude Code). Autopilot is the third option: fully autonomous, with a deterministic safety layer the AI cannot override, and it teaches itself new services as it encounters them.
See it in action
claude --agent autopilot
> Set up Supabase for this project with user auth, deploy to Vercel, and configure Razorpay payments.
Autopilot: Here's the plan:
1. Install Supabase CLI
2. Sign up & create project
3. Run migrations (users, api_keys, usage_logs)
4. Install Vercel CLI & deploy to preview
5. Set environment variables from Supabase
6. Get Razorpay API keys (needs 2FA)
7. Configure webhook
Proceed? yes
[1/7] Supabase CLI installed via brew
[2/7] Signed up at supabase.com .................. ACCOUNT CREATED
[3/7] Migrations complete — 3 tables created
[4/7] Deploying to preview ....................... https://myapp-abc123.vercel.app
[5/7] Environment variables set from Supabase
[6/7] Opening Razorpay dashboard... "Enter the OTP sent to your phone."
> [user enters OTP]
[7/7] Razorpay keys stored, webhook configured
Done. Preview: https://myapp-abc123.vercel.app
All actions logged to .autopilot/log.md
Autopilot is a general-purpose autonomous agent — not limited to any specific service or workflow:
| Category | Examples |
|---|---|
| Deploy code | Vercel, Netlify, Railway, Cloudflare Workers, Fly.io |
| Set up databases | Supabase, PostgreSQL, Firebase, Redis |
| Manage infrastructure | R2 buckets, KV stores, DNS, SSL, CDN |
| Configure services | Stripe, Razorpay, SendGrid, Resend, Sentry, Auth0, Clerk |
| Handle git | Commits, branches, PRs, issues, Actions, releases |
| Browse the web | Login to dashboards, fill forms, get API tokens |
| Install tools | CLIs, MCP servers, dependencies |
| Teach itself | Unknown service? Researches docs, creates registry, installs CLI, keeps going |
The 5 pre-built service files are just a head start. Autopilot self-expands when it encounters anything new.
        You give a task
               |
               v
+--------------------------+
|     Autopilot Agent      |   Reads: decision framework,
|     Plan > Confirm >     |   service registry, MCP whitelist
|       Execute All        |
+--------------------------+
               |
    +----------+----------+-----------+
    |          |          |           |
    v          v          v           v
+--------+ +--------+ +--------+ +---------+
|  MCP   | |  CLI   | |  API   | | Browser |
| Tools  | | Tools  | | (curl) | |  (CDP)  |
+--------+ +--------+ +--------+ +---------+
    |          |          |           |
    +----------+----------+-----------+
               |                      |
               v                      v
+--------------------------+ +--------------------------+
|     Credential Store     | |      Guardian Hook       |
|  (OS-native encrypted)   | |    (blocks dangerous     |
|                          | |   commands before they   |
|  macOS Keychain          | |         execute)         |
|  Linux: libsecret        | |                          |
|  Windows: Cred Manager   | |    55 tested patterns    |
+--------------------------+ +--------------------------+
Priority: MCP > CLI > API > Browser > Ask user
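The priority order can be read as a simple fallback chain. The sketch below is only an illustration of that idea, not the agent's actual logic; the channel labels and the `pick_channel` helper are names made up for this example.

```python
from typing import Iterable

# Order taken from the priority line above; the labels are chosen for this sketch.
PRIORITY = ["mcp", "cli", "api", "browser"]


def pick_channel(available: Iterable[str]) -> str:
    """Return the highest-priority channel that is available, else 'ask_user'."""
    usable = set(available)
    for channel in PRIORITY:
        if channel in usable:
            return channel
    # Nothing automated is available, so fall through to the last resort.
    return "ask_user"
```

For a service that offers only a CLI and a web dashboard, `pick_channel(["browser", "cli"])` picks the CLI and leaves the browser untouched.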
Autopilot acts first, asks only when the decision framework says to. It deploys code, configures databases, manages infrastructure, and obtains credentials — all without you leaving the terminal.
For complex tasks: numbered plan, single "proceed", then runs every step without pausing. Simple tasks execute immediately. Never stops to ask "what next?"
Every action is logged to {project}/.autopilot/log.md with a timestamp, decision level, and result. Account creations, logins, and token acquisitions are tracked with special markers.
Set your primary email and password once. Autopilot uses them for all new service signups. Stored in your OS credential store (Keychain / libsecret / Credential Manager).
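As a rough illustration of what a cross-platform credential lookup involves, the sketch below builds the read command for macOS and Linux. It is an assumption about how a wrapper such as keychain.sh might work, not a description of it; the attribute names `service` and `account` and the `lookup_command` helper are hypothetical.

```python
import platform
from typing import List, Optional

# Backend names come from the text above.
BACKENDS = {
    "Darwin": "macOS Keychain",
    "Linux": "libsecret",
    "Windows": "Windows Credential Manager",
}


def lookup_command(service: str, account: str, system: Optional[str] = None) -> List[str]:
    """Build the argv that prints a stored secret on the given platform."""
    system = system or platform.system()
    if system == "Darwin":
        # `security` ships with macOS; -w prints only the password.
        return ["security", "find-generic-password", "-s", service, "-a", account, "-w"]
    if system == "Linux":
        # `secret-tool` matches on attribute pairs; "service" and "account"
        # are attribute names chosen for this sketch.
        return ["secret-tool", "lookup", "service", service, "account", account]
    # Windows is left out: reading a secret back goes through the Win32
    # credential API rather than a single command.
    raise NotImplementedError(f"no lookup command sketched for {system}")
```

Returning an argv list rather than a shell string keeps the service and account names from being interpreted by a shell.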
Unknown service? Researches docs, creates registry file, installs CLI, adds safety rules, keeps going — all inline, no stopping.
Command Entered
|
v
[Guardian Hook] exit code 2 = HARD BLOCK
| (overrides all permissions)
|-- rm -rf / ? BLOCKED
|-- bash -c "evil" ? BLOCKED
|-- npm publish ? BLOCKED
|-- git push --force ? BLOCKED
|-- vercel --prod ? BLOCKED
|-- DROP DATABASE ? BLOCKED
|-- base64 | bash ? BLOCKED
|-- npm install ? ALLOWED
v
[Permission Allowlist] auto-approve safe commands
|
v
Command Executes (no prompt)
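To make the flow above concrete, here is a minimal sketch of a pre-execution hook written in Python. It assumes the Claude Code hook contract, where the hook receives a JSON event on stdin with the command under `tool_input.command` and an exit code of 2 blocks the call. The patterns are a small illustrative subset based on the diagram; the real guardian.sh and its 55 patterns are not reproduced here, and `is_blocked` and `run_hook` are names invented for this example.

```python
import json
import re
import sys

# A few illustrative patterns drawn from the diagram above.
BLOCKED = [
    r"\brm\s+-rf\s+/(\s|$)",
    r"\bbash\s+-c\b",
    r"\bnpm\s+publish\b",
    r"\bgit\s+push\b.*--force\b",
    r"\bvercel\b.*--prod\b",
    r"\bDROP\s+DATABASE\b",
    r"\bbase64\b.*\|\s*(ba)?sh\b",
]


def is_blocked(command: str) -> bool:
    """True if the command matches any blocklist pattern."""
    return any(re.search(p, command, re.IGNORECASE) for p in BLOCKED)


def run_hook(raw_event: str) -> int:
    """Return the exit code for one hook event: 2 blocks, 0 lets it through."""
    event = json.loads(raw_event)
    command = event.get("tool_input", {}).get("command", "")
    if is_blocked(command):
        print(f"Guardian: blocked command: {command}", file=sys.stderr)
        return 2
    return 0

# Entry point when installed as a hook (assumed wiring):
#   sys.exit(run_hook(sys.stdin.read()))
```

Because the decision is a plain regular-expression match made outside the model, the same command always gets the same verdict.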
Chrome (background) <-- CDP --> Playwright MCP <--> Claude Code
| |
Always alive Dies with session
Sessions persist Reconnects on start
Three layers: (1) persistent Chrome via CDP, (2) auto-retry on tab crashes, (3) smart browser avoidance.
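The second layer, retrying after a tab crash, might look roughly like the sketch below. The `TabCrashed` exception and the `with_retry` helper are hypothetical names; how Autopilot actually detects and recovers from crashes is not described here.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class TabCrashed(Exception):
    """Hypothetical error raised when the browser tab dies mid-action."""


def with_retry(action: Callable[[], T], attempts: int = 3, delay: float = 0.0) -> T:
    """Run a browser action, retrying when the tab crashes."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[TabCrashed] = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except TabCrashed as err:
            last_error = err
            if attempt < attempts:
                time.sleep(delay)  # give the persistent Chrome time to recover
    assert last_error is not None
    raise last_error
```

Only `TabCrashed` is retried; any other exception propagates immediately so real failures are not hidden.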
| Level | Action | Examples |
|---|---|---|
| 1 — Just do it | Brief note | npm install, git push, read files |
| 2 — Do it, notify | Brief note | Preview deploys, create branches |
| 3 — Ask first | Wait for approval | Prod deploys, destructive DB ops |
| 4 — Must ask | Show exact command | Spending money, publishing |
| 5 — Escalate | Cannot proceed | 2FA codes, CAPTCHAs |
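One way to picture the five levels is as an ordered enum with a single threshold for when the user is involved. This is a sketch under that assumption; the `Level` names, the `needs_user` helper, and the example classifications are illustrative and not taken from decision-framework.md.

```python
from enum import IntEnum


class Level(IntEnum):
    JUST_DO_IT = 1
    DO_AND_NOTIFY = 2
    ASK_FIRST = 3
    MUST_ASK = 4
    ESCALATE = 5


def needs_user(level: Level) -> bool:
    """Levels 3 and above wait on the user; levels 1 and 2 run on their own."""
    return level >= Level.ASK_FIRST


# Example classifications based on the table above (labels are hypothetical).
EXAMPLES = {
    "npm install": Level.JUST_DO_IT,
    "preview deploy": Level.DO_AND_NOTIFY,
    "prod deploy": Level.ASK_FIRST,
    "publish package": Level.MUST_ASK,
    "2FA code": Level.ESCALATE,
}
```

With this shape, `needs_user(EXAMPLES["preview deploy"])` is false while `needs_user(EXAMPLES["prod deploy"])` is true.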
# One command
curl -fsSL https://raw.githubusercontent.com/rish-e/autopilot/main/install.sh | bash
# Or clone and install
git clone https://github.com/rish-e/autopilot.git
cd autopilot && ./install.sh
Requirements
- macOS, Linux, or Windows (Git Bash / WSL)
- Claude Code installed
- Node.js (installer handles it)
- Google Chrome (for browser automation via CDP)
- Credential store: macOS Keychain (auto) / secret-tool on Linux (installer installs it) / Windows Credential Manager (built-in)
- Package manager: Homebrew (macOS), apt/dnf/pacman (Linux), choco/winget/scoop (Windows)
Use from any Claude Code session — no separate terminal needed:
/autopilot deploy this to Vercel with environment variables from Supabase
/autopilot set up Supabase with user auth tables and API keys
/autopilot configure Stripe payments with webhooks
/autopilot create a Cloudflare R2 bucket for image storage
For big multi-service orchestrations:
claude --agent autopilot --dangerously-skip-permissions
> I need this running in production with a Postgres database, Stripe payments, and Sentry monitoring
| Situation | Use |
|---|---|
| Quick deploy, get an API key | /autopilot |
| Full project setup from scratch | Agent mode |
| Mid-coding infrastructure task | /autopilot |
| Multi-service orchestration (5+ services) | Agent mode |
Every action is tracked in your project:
your-project/.autopilot/log.md
## Session: 2026-03-25 14:05 — Set up Supabase and deploy to Vercel
| # | Time | Action | Level | Service | Result |
|---|-------|--------------------------------------------|-------|----------|---------------------|
| 1 | 14:05 | Installed Supabase CLI via brew | L1 | supabase | done |
| 2 | 14:06 | Signed up at supabase.com (primary email) | L2 | supabase | ACCOUNT CREATED |
| 3 | 14:06 | Stored Supabase API token in keychain | L1 | supabase | TOKEN STORED |
| 4 | 14:07 | Created project (ref: abc123) | L2 | supabase | done |
| 5 | 14:08 | Ran migration: create users table | L2 | supabase | done |
| 6 | 14:09 | Logged in to vercel.com (primary email) | L2 | vercel | LOGGED IN |
| 7 | 14:10 | Deployed to preview | L2 | vercel | done, https://... |
View the execution log from the terminal with audit.sh:
audit.sh # Latest session
audit.sh all # All sessions
audit.sh search supabase # Search logs
audit.sh accounts # Account activity (signups, logins, tokens)
audit.sh failures # Failed actions only
audit.sh summary # One-line-per-session overview
audit.sh --path ~/myproject # Specify project path
Color-coded output: green = done, red = FAILED, yellow = ACCOUNT CREATED, blue = LOGGED IN, cyan = TOKEN STORED.
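Because the log is a plain markdown table, filters like the ones above are easy to reproduce. The sketch below is not audit.sh itself; it assumes the six-column layout shown in the sample log, and `parse_rows`, `failures`, and `account_activity` are names invented for this example.

```python
from typing import Dict, List

COLUMNS = ["#", "Time", "Action", "Level", "Service", "Result"]
ACCOUNT_MARKERS = {"ACCOUNT CREATED", "LOGGED IN", "TOKEN STORED"}


def parse_rows(markdown: str) -> List[Dict[str, str]]:
    """Turn the log table into dicts, skipping header and separator rows."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Data rows have six cells and start with a step number.
        if len(cells) != len(COLUMNS) or not cells[0].isdigit():
            continue
        rows.append(dict(zip(COLUMNS, cells)))
    return rows


def failures(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Rows whose result starts with the FAILED marker."""
    return [row for row in rows if row["Result"].startswith("FAILED")]


def account_activity(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Rows recording signups, logins, or stored tokens."""
    return [row for row in rows if row["Result"] in ACCOUNT_MARKERS]
```

An action description containing a literal `|` would break this simple split, so a real parser would need escaping rules.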
Before executing a plan, Autopilot snapshots the current state using git stash. If something goes wrong, roll back instantly.
snapshot.sh create pre-deploy # Create a named snapshot
snapshot.sh list # List all autopilot snapshots
snapshot.sh rollback # Rollback to latest snapshot
snapshot.sh rollback pre-deploy # Rollback to a specific snapshot
snapshot.sh diff # Show what changed since snapshot
snapshot.sh clean # Remove all autopilot snapshots
Snapshots are automatic during complex tasks (Flow B). The agent creates one before executing any plan and mentions rollback availability in the completion report. Metadata is stored in .autopilot/snapshots.json.
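Named snapshots on top of git stash come down to tagging each stash with a recognisable message and finding it again later. The sketch below shows only that lookup step, assuming a hypothetical `autopilot/<name>` message prefix; how snapshot.sh actually names and restores stashes is not documented here.

```python
from typing import Optional

PREFIX = "autopilot/"  # hypothetical naming scheme for this sketch

# The surrounding git commands would be along the lines of:
#   git stash push --include-untracked -m autopilot/pre-deploy
#   git stash list
#   git stash apply stash@{N}
# Note that `git stash push` removes the changes from the working tree, so a
# snapshot tool would have to re-apply them right away to keep working.


def find_snapshot(stash_list: str, name: Optional[str] = None) -> Optional[str]:
    """Return the stash ref for a named snapshot, or the latest one if name is None.

    `stash_list` is the text printed by `git stash list`, newest entry first,
    with lines shaped like 'stash@{0}: On main: autopilot/pre-deploy'.
    """
    for line in stash_list.splitlines():
        parts = line.split(": ", 2)
        if len(parts) != 3 or not parts[2].startswith(PREFIX):
            continue
        if name is None or parts[2] == PREFIX + name:
            return parts[0]
    return None
```

Matching on the message instead of the index matters because stash indices shift every time a new stash is pushed.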
Work survives rate limits and crashes. Autopilot saves progress after each step so it can resume where it left off.
session.sh save "Deploy to Vercel" # Save session state
session.sh status # Check if a saved session exists
session.sh resume # Show full saved session for pickup
session.sh update '{"current_step": 3, "notes": "Step 2 done"}' # Update progress
session.sh clear # Remove saved session
On startup (Flow B), the agent checks for a saved session and offers to resume. Session data is stored in .autopilot/session.json and includes the task, plan, completed steps, services used, and notes.
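The save, update, and resume cycle can be sketched as three small functions over a JSON file. The field names below follow the description above plus the `current_step` key from the update example, but the exact schema of session.json is an assumption, as are the function names.

```python
import json
from pathlib import Path
from typing import Any, Dict, List, Optional


def save_session(path: Path, task: str, plan: List[str]) -> Dict[str, Any]:
    """Write a fresh session file for a task and its plan."""
    state: Dict[str, Any] = {
        "task": task,
        "plan": plan,
        "completed_steps": [],
        "current_step": 0,
        "services": [],
        "notes": "",
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(state, indent=2))
    return state


def update_session(path: Path, changes: Dict[str, Any]) -> Dict[str, Any]:
    """Merge progress into the saved state after a step finishes."""
    state = json.loads(path.read_text())
    state.update(changes)
    path.write_text(json.dumps(state, indent=2))
    return state


def resume_session(path: Path) -> Optional[Dict[str, Any]]:
    """Return the saved state, or None when there is nothing to resume."""
    if not path.exists():
        return None
    return json.loads(path.read_text())
```

Writing after every step is what lets a crashed or rate-limited run pick up at `current_step` rather than starting over.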
~/MCPs/autopilot/
bin/
keychain.sh Cross-platform credential store
guardian.sh PreToolUse safety hook (55 tested patterns)
chrome-debug.sh Persistent Chrome manager (CDP)
setup-clis.sh CLI installer (gh, vercel, supabase, etc.)
test-guardian.sh Guardian test suite
audit.sh Execution log viewer (terminal dashboard)
snapshot.sh Snapshot & rollback (git stash wrapper)
session.sh Session persistence (save/resume state)
config/
decision-framework.md When to act vs. ask (5 levels)
guardian-custom-rules.txt Append-only blocklist
trusted-mcps.yaml MCP whitelist (20+ pre-vetted)
playwright-config.json CDP endpoint config
browser-profile/ Persistent browser sessions
services/ Service registry (5 built-in + template)
commands/ /autopilot slash command
agent/ Full agent definition
# Per-project (created automatically):
your-project/.autopilot/
log.md Execution log (audit trail)
snapshots.json Snapshot metadata
session.json Saved session state (if interrupted)
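The system only gets more restrictive: config/guardian-custom-rules.txt is an append-only blocklist, so safety rules can be added but not removed.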
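Append-only behaviour for the blocklist in config/guardian-custom-rules.txt could look roughly like the sketch below. It assumes one pattern per line, which is a guess at the file format, and `append_rule` is a hypothetical helper rather than part of the project.

```python
from pathlib import Path


def append_rule(path: Path, pattern: str) -> bool:
    """Add a pattern to the blocklist if it is new; never edit or drop old ones.

    Returns True when the rule was appended, False when it was already present.
    """
    text = path.read_text() if path.exists() else ""
    if pattern in text.splitlines():
        return False
    # Opening in append mode means existing rules are never rewritten.
    with path.open("a") as handle:
        if text and not text.endswith("\n"):
            handle.write("\n")
        handle.write(pattern + "\n")
    return True
```

The function has no code path that truncates or rewrites the file, so repeated calls can only grow the rule set.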
Unknown service detected
|
v
Check MCP whitelist ──> Install silently if whitelisted
|
v
WebSearch CLI + API docs
|
v
Create service registry file
|
v
Append Guardian safety rules
|
v
Install CLI tool
|
v
Acquire credentials (browser or primary email)
|
v
Continue with original task
No interruption. Only pauses for 2FA codes and first-time primary credentials.
| Capability | Autopilot | Devin | Cursor Agents | Claude Code (vanilla) |
|---|---|---|---|---|
| Autonomous deployment | Yes (CLI + browser) | Yes (sandbox) | Yes (VM) | Needs permission |
| Browser credential acquisition | Yes (Playwright CDP) | Partial | Partial | No |
| Hard safety rails | Yes (Guardian) | Sandbox only | Sandbox only | Permission prompts |
| Self-expanding knowledge | Yes | No | No | No |
| MCP auto-discovery | Yes (whitelist) | No | Partial | No |
| Credential vault | Yes (OS-native) | Session-scoped | VM-scoped | No |
| Append-only safety | Yes | No | No | No |
| Cross-platform | macOS, Linux, Windows | Cloud only | Cloud only | Yes |
| Open source | Yes (MIT) | No | No | CLI only |
Hard blockers (cannot automate):
- 2FA/MFA codes — sent to your phone
- CAPTCHAs — can't solve image challenges
- Email verification — requires inbox access
- PCI-compliant payment forms — resist automation
Technical:
- Chrome CDP needs to be running (chrome-debug.sh start; the installer does this automatically)
- Browser UIs change: Playwright steps can break when dashboards redesign
- New MCPs need a session restart to take effect
| Contribution | How |
|---|---|
| Add a service | Copy services/_template.md, fill in CLI commands, auth flow |
| Expand Guardian | Add patterns to bin/guardian.sh or config/guardian-custom-rules.txt |
| Add trusted MCPs | Add to whitelisted section in config/trusted-mcps.yaml |
| Improve tests | Add test cases to bin/test-guardian.sh |
MIT License • Built for developers who'd rather code than configure