They work while you sleep.
Elves is an open-source Agent Skill for autonomous, multi-batch development. It gives AI coding agents (Claude Code, Codex, or any agent that supports the Agent Skills standard) the ability to execute large development plans unattended (with testing, review, and documentation) while surviving context compaction across long runs.
You write the plan and do the final merge. The agent does everything in between.
This is v0. The system I use in production at Aigora is more elaborate than what you see here. It includes custom review tools, proprietary verification infrastructure, and integration with our internal deployment pipeline. I've extracted the key ideas and patterns into something that works with standard tools (git, GitHub PRs, CI) so it's useful to anyone, not just people with my exact setup. I'll be using this open-source version myself going forward (with my additional tooling bolted on), so it will continue to improve from real production use. But this is scaffolding, not a finished product. It may not work for you out of the box. Your model, your stack, your test infrastructure, and your review setup will all be different from mine. I'm relying on community feedback to make this skill more generalizable. If something doesn't work, open an issue. Your experience makes this better for everyone.
In the old fairy tale, a tired shoemaker goes to bed with work undone and wakes to find it finished. That story is the premise of this skill.
Throughout economic history, wealth creation has followed a consistent pattern: a resource sits idle until someone builds a tool that makes it useful. Coal sat in the ground until the steam engine. Cars sat in driveways until Uber. Spare bedrooms sat empty until Airbnb. The resource already existed. What was missing was the mechanism.
Every knowledge worker has 12 to 14 hours each day when they are not working: evenings, nights, weekends. For most of history, that time was genuinely unproductive. AI agents change that. A well-configured agent can execute code, run tests, conduct reviews, and document decisions while its owner is asleep. The sleeping hours are now a resource. They weren't before.
The question is no longer "what can I have my AI do today?" It's "what will my AI be doing at 2am on Saturday?"
Elves is the mechanism. It converts idle hours into shipped code.
The core pattern is the Ralph Loop: try, check, feed back, repeat. An AI doesn't return correct or incorrect answers. It returns drafts. Judging AI on its first attempt is like judging a tree by its first day of growth. The people who get extraordinary results aren't writing better prompts. They are running better loops.
Elves is the harness that lets the Ralph Loop run for hours without supervision, with a Survival Guide so the agent knows what it's doing, an Execution Log so it can recover after a restart, and test gates so it knows whether its work is actually correct before it moves on.
Part of a series by John Ennis: The Shoemaker's Elves (the 14-hour resource), The Survival Guide (keeping agents on track), and Water the Tree (the Ralph Loop).
Plan → Batch → Implement → Validate → Review → Document → Continue
Elves runs a tight loop. For each batch of planned work, the agent implements the changes, runs validation gates, reads PR review comments, fixes any blocking findings, updates the documentation, and pushes a checkpoint, then immediately starts the next batch. No waiting, no prompting, no drift.
AI agents are stateless. Context compaction erases working memory. Elves solves this with three persistent documents that act as the agent's memory across compactions, restarts, and long multi-hour runs:
| Document | Purpose |
|---|---|
| Plan | What needs to be built (the authoritative scope) |
| Survival Guide | Standing brief: mission, rules, tool config, current phase, next batch |
| Execution Log | Running record of every batch completed, every decision made, every commit pushed |
After any compaction or restart, the agent reads these three files in order and resumes without losing its place. The survival guide is marked # READ THIS FILE FIRST AFTER ANY COMPACTION OR RESTART so the agent can't miss it.
The shape of productive work is changing. The human operates on both ends: specifying problems and reviewing output, while the agent runs loops in the middle.
- Front end (human): Decide what's worth working on. Write the plan. Specify the problem fully. 30 minutes to an hour.
- Middle (agent): Open a branch, commit the plans, open a PR, then run the loop: implement, validate, review, fix, iterate. For each batch, the agent builds the code, runs the tests, reads the PR review comments (from bots or humans), fixes what the reviews found, pushes, and iterates until the batch is tight. Then it moves to the next batch. This runs for hours or days while you sleep.
- Back end (human): Review the output. By the time you look at the PR, every batch has already been through multiple rounds of implement-test-review-fix. Your review is a final pass on work that's already tight, not a first look at raw output. 30 minutes to an hour.
The agent never merges. That gate stays with you.
The elves won't do the job perfectly. That isn't the goal. The goal is leverage. AI returns drafts, not finished products. But the drafts are refined through dozens of Ralph Loop iterations, and by the time you review the work, it's far closer to done than anything you could have produced in the same wall-clock time. See Water the Tree for the full philosophy.
The math is striking. You spend 30 minutes writing a plan. The agent runs for 10-20 hours. You spend 30-60 minutes reviewing the PR. In that 1-2 hours of your time, you may get weeks or months of equivalent human output. The exact multiplier depends on your project, your plan quality, and your test infrastructure, but ratios of 100:1 to 500:1 (agent hours per human hour) are real. In practice, users have reported getting 6-9 months of equivalent work done in a total of 3-4 hours of human time across planning, monitoring, and review.
This is the leverage that makes the setup cost worth it. A half hour of planning unlocks days of autonomous execution.
You don't have to leave. You can watch the agent work, check in, give it additional context, or adjust priorities on the fly. But there is one rule: say "do not stop" in every message. Be explicit and repetitive. This isn't overkill. It makes a measurable difference in agent behavior. Without it, the agent may interpret your message as a request to pause and discuss, which kills the momentum.
- Good: "The payment tests are expected to fail. Ignore them. Do not stop. Keep going."
- Good: "Quick question: did you update the migration? Do not stop. Answer my question and keep going, but do not stop."
- Bad: "What do you think we should do about the database schema?"
- Bad: "Looks good so far." (no instruction to continue, so the agent may pause waiting for more)
1. Install the skill
See Installation below for full details. The short version:
- Claude Code: copy the `elves/` directory into `.claude/skills/elves/` in your repo
- Codex: copy `AGENTS.md` into `.agents/skills/elves/AGENTS.md`
- Claude.ai: zip the `elves/` directory and upload via Settings > Features > Skills
2. Write a plan
Use references/plan-template.md as your starting point. The plan describes what needs to be built, broken into logical batches. Commit it to your repo (e.g., docs/plans/my-feature.md).
3. Start the session
Use references/kickoff-prompt-template.md to start the agent. It tells the agent where your plan, survival guide, and execution log live, and what branch to work on. If the survival guide and execution log don't exist yet, the agent generates them from the templates.
4. Walk away
Elves runs preflight checks first: git access, test gates, sleep prevention, notifications. Once preflight passes, the agent starts executing batches and won't stop until the plan is complete or time runs out.
- Multi-batch execution with configurable batch sizing (default: 4 developers × 2-week sprint)
- Context compaction survival via the three-document system: reads survival guide, plan, and execution log after every compaction
- Auto-discovered validation gates for Node.js, Python, Go, Rust, and Makefile projects. No configuration required.
- Pluggable review: GitHub PR comments by default (zero config), custom review API opt-in, additional custom checks
- Subagent delegation for long runs (Claude Code): coordinator manages the loop, subagents do the deep work
- Rollback safety: `git tag elves/pre-batch-N` before every batch, so any batch can be cleanly unwound
- Scout mode: after all planned work is done, the agent looks for adjacent improvements, test gaps, and documentation holes
- Two run modes: finite (deadline-based, default) or open-ended (continue until explicitly stopped). Open-ended mode disables Final Completion and treats every checkpoint as a relaunch point.
- Time-aware pacing: tracks how long each batch takes and uses that to decide whether to start another batch or wrap up cleanly (finite mode)
- Slack notifications (or any custom command): know when your run finishes without watching the terminal
- Structured session data in `.elves-session.json` for tooling, dashboards, and analytics
- Comprehensive preflight checks: git remote, push access, GitHub CLI auth, test gates, sleep prevention, Slack webhook, stale branch detection
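The rollback tags can be exercised with ordinary git commands. A sketch, using batch 3 as an example (tag names follow the `elves/pre-batch-N` convention; `git revert` is used here so history is never rewritten):

```shell
# List the rollback points the agent has created
git tag --list 'elves/pre-batch-*'

# See everything that changed since batch 3 started
git diff --stat elves/pre-batch-3..HEAD

# Unwind batch 3 onward without rewriting history (revert, don't reset)
git revert --no-commit elves/pre-batch-3..HEAD
git commit -m "Unwind batches from elves/pre-batch-3 onward"
```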
This is the most common failure mode for overnight runs. If your machine sleeps, the session stops. Handle this before you walk away.
```shell
# Prevent display, idle, and system sleep for the duration of your terminal session
caffeinate -dims &
```

Or wrap your agent command: `caffeinate -dims <your-agent-command>`
Elves preflight will warn you if caffeinate isn't running and if you are on battery power.
```shell
systemd-inhibit --what=idle <your-agent-command>
```

On Windows, open Power Options → Change plan settings → set "Put the computer to sleep" to Never for the duration of the run. Restore it afterward.
Running on a cloud VM, GitHub Codespaces, or a remote server eliminates the sleep problem entirely. The session runs independently of your local machine. This is the most reliable option for very long runs.
If you're running over SSH, your session dies when the connection drops. Always use a terminal multiplexer:
```shell
# Start a new tmux session
tmux new -s elves

# Run your agent inside tmux, then detach with Ctrl+B, D
# Reconnect later with:
tmux attach -t elves
```

`screen` works the same way: `screen -S elves`, detach with Ctrl+A, D, reattach with `screen -r elves`.
Some coding tools show survey popups, feedback requests, or update prompts during sessions. These will stall an unattended run. Configure your tools before starting:
- Claude Code: add to your CLAUDE.md: "Do not show surveys, popups, or update prompts during this session."
- Codex: add to your AGENTS.md: "Never pause for surveys, feedback requests, or update prompts."
- Cursor / other tools: check settings for telemetry, notifications, and update checks. Disable anything interactive.
- Agent has the permissions it needs (file access, git push, `gh` auth, any tool approvals). If your platform requires you to approve actions (file writes, terminal commands, etc.), grant those permissions before you walk away. A permission prompt at 3am with nobody to click "allow" will stall the entire run. You're granting these permissions at your own risk. See Disclaimer.
- Machine is plugged in (not on battery)
- Sleep / display sleep is disabled or caffeinate running
- Terminal is in tmux/screen (if SSH) or won't be closed
- Surveys and popups disabled in your coding tool's settings
- Notifications are configured so you know when the run finishes
- Preflight passed (Elves will verify the above automatically)
You don't need to watch the terminal. Here's how to check in from elsewhere.
GitKraken is the recommended way to monitor visually. Open it on the working branch and watch:
- Commit graph: steady commit cadence means the agent is making progress. A long gap may mean a slow test suite, a stuck review cycle, or an unexpected blocker.
- Branch activity: new commits appear as the agent completes each batch and pushes a checkpoint.
- PR status: review comments arriving on the PR means the review step is working.
Slack notifications deliver a completion message when the session ends (or when a batch completes, if you configure that). You can check your phone without opening a terminal.
The execution log is the most detailed view. Each batch entry records what changed, what commands ran, what the test results were, how long each phase took, and what decisions were made autonomously. Read it when you return to understand exactly what happened.
- Go to api.slack.com/apps and create a new app (from scratch).
- Under Features, select Incoming Webhooks and enable it.
- Click Add New Webhook to Workspace and select the channel where you want notifications.
- Copy the webhook URL (it looks like `https://hooks.slack.com/services/T.../B.../...`).
- Set the environment variable before starting your session:

```shell
export ELVES_SLACK_WEBHOOK="https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
```

Elves preflight will send a test message to confirm the webhook works before you walk away.
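You can also verify the webhook by hand before starting. Slack incoming webhooks accept a JSON body with a `text` field; a sketch (preflight does the equivalent automatically):

```shell
# Post a hand-rolled test message to the configured webhook
if [ -n "${ELVES_SLACK_WEBHOOK:-}" ]; then
  curl -s -X POST -H 'Content-type: application/json' \
       --data '{"text":"Elves: manual webhook test"}' "$ELVES_SLACK_WEBHOOK"
else
  echo "ELVES_SLACK_WEBHOOK is not set"
fi
```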
Set ELVES_NOTIFY_CMD to any shell command you want run at session completion:
```shell
# Example: send a push notification via ntfy
export ELVES_NOTIFY_CMD='curl -d "Elves done" ntfy.sh/your-topic'

# Example: send an email via sendmail
export ELVES_NOTIFY_CMD='echo "Elves session complete" | sendmail you@example.com'
```

If neither ELVES_SLACK_WEBHOOK nor ELVES_NOTIFY_CMD is set, Elves falls back to leaving a comment on the PR.
Tool-specific configuration lives in the survival guide under ## Tool Configuration. This keeps the agent's instructions with the session rather than scattered across environment variables.
See references/tool-config-examples.md for full examples covering Node.js, Python, Go, Rust, monorepos, and custom review APIs.
Minimal Node.js example (add to survival guide):
```
## Tool Configuration

### Validation Gates
- lint: `npm run lint`
- typecheck: `npm run typecheck`
- build: `npm run build`
- test: `npm test`

### Review
- method: github-pr-comments
```

If you don't configure validation gates, Elves auto-discovers them from your project files (package.json, Makefile, pyproject.toml, Cargo.toml, go.mod).
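The auto-discovery logic amounts to mapping well-known project files to default commands. A simplified sketch of the idea (the skill's actual mapping may differ; see references/validation-guide.md):

```shell
# Map well-known project files to a default test gate (first match wins)
discover_test_gate() {
  if   [ -f package.json ];   then echo "npm test"
  elif [ -f pyproject.toml ]; then echo "pytest"
  elif [ -f Cargo.toml ];     then echo "cargo test"
  elif [ -f go.mod ];         then echo "go test ./..."
  elif [ -f Makefile ];       then echo "make test"
  else echo "no test gate discovered" >&2; return 1
  fi
}

gate=$(discover_test_gate) && echo "test gate: $gate" \
  || echo "configure gates manually in the survival guide"
```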
The default batch size is what a team of 4 developers would accomplish in a 2-week sprint: roughly 40 person-days of effort. This limits blast radius and makes compaction recovery tractable.
Override in your plan or survival guide:
```
## Batch Sizing
- team-size: 2
- sprint-length: 1 week
```

Each batch must be independently shippable: code, tests, docs, and passing review before moving on.
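The person-day arithmetic behind these settings, assuming 5 working days per week:

```shell
# person-days per batch = team size x sprint length (weeks) x 5 working days
team_size=4; sprint_weeks=2
echo "default batch:  $(( team_size * sprint_weeks * 5 )) person-days"   # 40

team_size=2; sprint_weeks=1
echo "override above: $(( team_size * sprint_weeks * 5 )) person-days"   # 10
```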
| Tier | Method | Configuration |
|---|---|---|
| Tier 1 | GitHub PR comments + built-in review subagent | Default (zero config). Agent spawns a review subagent that reads all PR comments, the diff, and the plan, then produces a structured assessment. Agent fixes blockers and iterates until the batch is clean. |
| Tier 2 | Custom review API | Set method: custom-api and review-api-url in survival guide. |
| Tier 3 | Additional checks | Smoke tests, screenshot diffs, doc checks, or any custom script returning 0/non-zero. |
The agent uses the highest tier you have configured. Non-blocking findings are logged; persistent false positives (3+ cycles) are assessed and dismissed with a written explanation in the execution log.
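A Tier 3 check is just a script that exits 0 on pass and non-zero on a blocking finding. A hypothetical example (the `TODO(elves)` marker convention and this particular check are illustrative, not part of the skill):

```shell
#!/bin/sh
# Hypothetical Tier 3 check: block if the working tree still contains TODO(elves) markers.
# Exit 0 = pass, non-zero = blocking finding (the contract Tier 3 checks follow).
if grep -rn 'TODO(elves)' --include='*.md' --include='*.js' --include='*.py' . 2>/dev/null; then
  echo "Blocking: unresolved TODO(elves) markers remain" >&2
  exit 1
fi
echo "Custom check passed"
```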
```
elves/
├── SKILL.md                         # Claude Code skill (main instructions)
├── AGENTS.md                        # Codex variant
├── config.json.example              # Persistent preferences template
├── references/
│   ├── survival-guide-template.md   # Bootstrap template for new projects
│   ├── execution-log-template.md    # Log entry template
│   ├── plan-template.md             # How to write a good plan
│   ├── kickoff-prompt-template.md   # Copy-paste prompts for starting a run
│   ├── tool-config-examples.md      # Configs for Node, Python, Go, Rust, etc.
│   ├── validation-guide.md          # Detailed validation gates and auto-discovery
│   ├── autonomy-guide.md            # Non-interactive operation and mid-run protocols
│   ├── review-subagent.md           # Built-in review protocol and adversarial review
│   ├── verification-patterns.md     # Headless browser, video recording, state assertions
│   └── open-ended-guide.md          # Open-ended mode patterns, QA/audit expansion rules
├── scripts/
│   ├── preflight.sh                 # Pre-run checklist
│   └── notify.sh                    # Notification helper
├── README.md
└── LICENSE
```
| Platform | File | Subagents | Notes |
|---|---|---|---|
| Claude Code | SKILL.md | Yes | Full feature set |
| Codex | AGENTS.md | No | All work done directly |
| Claude.ai | SKILL.md (zip upload) | No | Upload as skill |
| Any Agent Skills compatible | SKILL.md | Varies | Open standard |
- The human sandwich. The human operates on both ends: specifying problems and reviewing output. The agent runs the loop in the middle. Your working hours become morning for reviewing last night's output, afternoon for setting up the next run.
- The Ralph Loop. Try, check, feed back, repeat. AI returns drafts, not answers. A dumb, stubborn loop beats over-engineered sophistication because AI is non-deterministic. Any single attempt might fail. But if you keep trying, checking, and feeding back, the process converges.
- The 14-hour resource. Every knowledge worker has 12-14 hours per day when they're not working. Elves converts those hours into shipped code. A two-hour planning session on Friday can produce a week's worth of output before you touch your keyboard on Monday.
- Three documents are the agent's memory. Without them, long runs drift and repeat work. With them, a restarted agent picks up exactly where it left off. These aren't overhead: they're the minimum viable infrastructure for the loop to run unsupervised.
- Tests are the watch. An agent working overnight has no one watching. The tests are the watch. Without them, you wake up to code that compiles, passes lint, and does the wrong thing.
- Never merge. The PR is for review, not for merging. That gate stays with the human.
- Document every decision. Anything the agent decides without user input goes in the execution log under Decisions made. The human reviews these choices when they return.
- Fail safely, not silently. If the agent is genuinely blocked, it stops and says so. If a test gate fails, it fixes the issue before continuing. It doesn't skip gates or paper over failures.
- Rollback before every batch. `elves/pre-batch-N` tags mean any batch can be cleanly unwound without touching other work.
- Agent infrastructure is real engineering. Developers who treat agent infrastructure as a real engineering concern (tight code review systems, organized work trees, failure handling) end up with something that functions like a tireless junior team working every hour they're away from their desk.
Overnight agent runs fail in predictable ways. Knowing the failure modes makes them preventable.
| Failure | What happens | Mitigation |
|---|---|---|
| Machine sleeps | Session stops silently. You wake up to 45 minutes of work instead of 8 hours. | caffeinate (macOS), systemd-inhibit (Linux), or run in cloud. Elves preflight warns you. |
| Agent runs destructive git commands | `git reset --hard` wipes hours of uncommitted work. This has happened to real users. | Elves explicitly forbids `git reset --hard`, `git checkout .`, `git push --force`, and `git clean -fd`. The survival guide template includes these as non-negotiables. |
| Agent disables or weakens tests | Agent comments out failing tests, weakens assertions, or shortens timeouts to make the gate pass. You wake up to code that "passes" but is broken. | Elves has a Test Integrity rule: never modify a test to make it pass. Fix the code, not the test. If the agent thinks a test is wrong, it logs the issue and moves on without changing it. |
| Context compaction loses instructions | Long sessions hit memory limits. The agent's conversation gets summarized, and safety instructions disappear. | Elves stores all instructions in files on disk (survival guide, plan, execution log), not in conversation memory. The agent re-reads the survival guide after every push. Compaction can't erase files. |
| Interactive prompt stalls the session | A tool asks for confirmation, a survey pops up, or `npm install` wants input. Nobody is there to click yes. | Elves surfaces the recommended non-interactive env vars during preflight, and the skill requires `--yes` flags plus tool-level survey suppression before unattended runs. |
| Flaky tests block progress | A test passes locally but fails intermittently. The agent loops trying to fix a non-bug. | The agent logs flaky tests in the execution log and moves on after 3 failed attempts on the same non-deterministic failure. |
| Terminal closes (SSH disconnect) | The SSH connection drops and the session dies. | Use tmux or screen. Elves mentions this in the pre-run checklist. |
| Agent drifts from the plan | After many batches, the agent starts making changes that weren't in the plan. | The agent re-reads the survival guide after every push and checks the plan hash to detect modifications. The three-document system anchors every decision. |
Most of these are prevented by the preflight checks. Run preflight, fix the warnings, and most overnight failures never happen.
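What "non-interactive env vars" means in practice varies by toolchain, but a few widely honored settings look like this (these specific variables are common conventions, not Elves requirements — verify each against your own tools):

```shell
export CI=true                         # many CLIs suppress prompts and spinners when CI is set
export DEBIAN_FRONTEND=noninteractive  # apt-get stops asking configuration questions
export GIT_TERMINAL_PROMPT=0           # git fails fast instead of prompting for credentials
# Prefer explicit flags where they exist, e.g. apt-get -y, npx --yes
```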
For Claude Code users, you can make compaction recovery fully automatic by adding a SessionStart hook that loads the survival guide at the beginning of every session.
Add this to your .claude/settings.json:
```json
{
  "hooks": {
    "SessionStart": [
      {
        "type": "command",
        "command": "echo '=== ELVES CONTEXT ===' && cat docs/plans/*-survival-guide.md 2>/dev/null && echo '' && echo '=== GIT STATUS ===' && git status --short && echo '' && echo '=== RECENT COMMITS ===' && git log --oneline -5"
      }
    ]
  }
}
```

This injects the survival guide, current git status, and recent commits into Claude's context at session start, even after a compaction or restart. The agent gets its bearings immediately without needing to be told to read the files.
Adjust the cat path to match where your survival guide lives.
Elves tells the agent not to run destructive git commands, but instructions can be forgotten after context compaction. For bulletproof enforcement, add a PreToolUse hook that blocks them deterministically:
```json
{
  "hooks": {
    "PreToolUse": [
      {
        "type": "command",
        "command": "case \"$TOOL_INPUT\" in *'git reset --hard'*|*'git checkout .'*|*'git clean -fd'*|*'git push --force'*|*'git push -f '*|*'rm -rf /'*) echo 'BLOCKED: Forbidden command detected. Elves does not allow destructive git operations.' >&2; exit 1;; esac",
        "matcher": "Bash"
      }
    ]
  }
}
```

This runs before every Bash command and blocks the operation if it matches a forbidden pattern. Unlike instructions (which can be compacted away), hooks are deterministic. The agent can't forget them and can't override them.
This pattern comes from Anthropic's internal practices. Their /careful hook uses the same approach to block destructive operations in production environments.
Block time at the end of your workday (even 30 minutes) to brief your agents. Load them with enough well-defined work to keep them running through the night. Before you go offline, everything needs to be provisioned and pointed in the right direction.
Friday afternoons deserve more deliberate treatment. The weekend is roughly 60 hours of potential agent runtime. A two-hour planning session on Friday, setting up plans, configuring the survival guide, and queuing batch work, can produce a week's worth of output before Monday morning.
The people who start treating their idle hours as the asset they've suddenly become will have a real advantage.
Elves can be installed globally (applies to all your projects) or per-project (lives in the repo).
Global installation means the skill is always available, no matter which project you're in. Install it once, use it everywhere, and customize it as you learn.
Claude Code:
```shell
# Create the global skills directory if it doesn't exist
mkdir -p ~/.claude/skills/elves

# Clone and copy
git clone https://github.com/aigorahub/elves.git /tmp/elves
cp -r /tmp/elves/SKILL.md /tmp/elves/references /tmp/elves/scripts ~/.claude/skills/elves/
rm -rf /tmp/elves
```

Codex:
```shell
mkdir -p ~/.codex/skills/elves
git clone https://github.com/aigorahub/elves.git /tmp/elves
cp /tmp/elves/AGENTS.md ~/.codex/skills/elves/
cp -r /tmp/elves/references /tmp/elves/scripts ~/.codex/skills/elves/
rm -rf /tmp/elves
```

Per-project installation puts the skill in your repo so it's versioned with your code and visible to collaborators.
Claude Code:
```shell
# From your project root
mkdir -p .claude/skills
git clone https://github.com/aigorahub/elves.git .claude/skills/elves
rm -rf .claude/skills/elves/.git  # remove the nested git repo
```

Codex:
```shell
mkdir -p .agents/skills
git clone https://github.com/aigorahub/elves.git .agents/skills/elves
rm -rf .agents/skills/elves/.git
```

Claude.ai:

- Download or clone this repo
- Zip the `elves/` directory
- Go to Settings > Features > Skills > Upload
- Upload the zip file
```shell
pip install -q skills-ref
agentskills validate ~/.claude/skills/elves/  # or wherever you installed it
```

You should see: `Valid skill: ...`
Star the repo to bookmark it and show support:
```shell
gh repo star aigorahub/elves
```

Watch for releases to get notified when the skill is updated:

```shell
gh api repos/aigorahub/elves/subscription --method PUT --field subscribed=true
```

Elves is scaffolding, not a finished product. It gives you the framework: the loop, the documents, the gates. But every project is different. You'll need to customize it for your own purposes, and you'll learn your own lessons along the way.
The survival guide template is where most customization happens. When you generate a survival guide for your project, you'll fill in:
- Your specific test commands (not every project uses `npm run lint`)
- Your non-negotiables (what must never happen in your codebase)
- Your review method (PR comments, a custom API, manual checks)
- Your notification preference (Slack, email, PR comment)
- Your batch sizing (maybe your team is 2 people, not 4)
The validation gates will be different for every project. A Python data pipeline has different gates than a React web app. Edit the survival guide's ## Tool Configuration section to match your stack. See references/tool-config-examples.md for examples across Node, Python, Go, Rust, and monorepos.
The plan template is a starting point. Some teams want more structure (acceptance criteria per batch, risk statements). Others want less (just a task list). Make the plan format work for how you think, not how the template thinks.
The first time you run Elves overnight, you'll discover things no template can predict:
- Which of your test suites is flaky and needs to be fixed before agents can rely on it
- Which commands in your toolchain prompt for input and need `--yes` flags
- How long your batches actually take (probably longer than you estimate)
- Where your plan was vague and the agent had to guess
- What non-negotiables you forgot to list
This is normal. After each run, read the execution log (especially the Decisions made sections) and update your survival guide template with what you learned. The skill gets better every time you use it because you get better at writing plans and configuring the harness.
If you installed globally, your customized skill lives at ~/.claude/skills/elves/SKILL.md (Claude Code) or ~/.codex/skills/elves/AGENTS.md (Codex). Edit these files directly. Add your own defaults, remove sections that don't apply to your work, add project-type-specific guidance. This is your copy. Make it yours.
When you want to update from upstream (new features, fixes), pull the latest and merge manually:
```shell
git clone https://github.com/aigorahub/elves.git /tmp/elves-update
diff ~/.claude/skills/elves/SKILL.md /tmp/elves-update/SKILL.md
# Review the diff, merge what you want, skip what you don't
```

If you have a global installation but one project needs different behavior, put a project-level copy in `.claude/skills/elves/` inside that repo. The project-level skill takes precedence over the global one.
This is useful when:
- One project uses Python while your default is Node
- A project has specific non-negotiables ("never touch the billing module")
- You want to experiment with a modified workflow without affecting other projects
Issues and pull requests are welcome. If you find a bug, have a feature idea, or want to add support for a new platform or tool, open an issue to discuss it first.
When submitting a PR:
- Keep changes focused: one concern per PR.
- Update the relevant template or reference file if your change affects agent behavior.
- Test your change with at least one real overnight run if possible.
This software is provided "as is", without warranty of any kind, express or implied. Neither Aigora nor John Ennis is liable for any claims, damages, or other liability arising from using this software. That includes code changes, data loss, security incidents, infrastructure costs, or anything else that happens. The MIT license already says this, but we want to be clear about it here too.
Elves expects you to grant your AI agent the permissions it needs to run autonomously. That might mean file system access, git push, GitHub CLI auth, shell command execution, or other tool approvals depending on your platform. If the agent has to pause and wait for permission during an unattended run, it'll stall. So the skill works best when you pre-approve what the agent will need. You're granting those permissions at your own risk. Know what you're allowing before you walk away.
There's nothing uniquely dangerous about Elves. It uses standard tools (git, GitHub, your existing test suite) and it has safety measures (forbidden commands, test integrity rules, rollback tags). But no software is foolproof, and an agent running for hours with broad permissions can make mistakes. Always review the PR before merging.
MIT, see LICENSE.
Copyright (c) 2026 Aigora.
