
buda-ai/bunny-agent


Bunny Agent

A calm, powerful coding agent — runs anywhere, ships everywhere.

Daily driver CLI  ·  AI SDK UI native  ·  Remote sandbox in one command  ·  Build your own Agent product


Quick Start  ·  Features  ·  Remote Sandbox  ·  Build Your Own Agent  ·  Docs


What is Bunny Agent?

Bunny Agent is a coding agent built on Pi Coding Agent — multi-model, harness-ready, and designed from the ground up for three jobs at once:

| Mode | What it means |
| --- | --- |
| 🖥️ Daily CLI agent | Install and use it like a local coding assistant, today |
| ☁️ Remote sandbox agent | bunny remote my-project: spin up a cloud machine for $5/mo |
| 🏗️ Your own Agent product | Next.js SaaS · Desktop app · Build your own OpenClaw alternative |

It outputs a native AI SDK UI stream — meaning you can wire it directly into any useChat() frontend with zero glue code.


✨ Features

🧠 Multi-Model, One CLI

Switch between Claude, Gemini, OpenAI, or any provider — no code changes required.

bunny run --runner pi --model google:gemini-2.5-pro   -- "refactor this module"
bunny run --runner claude --model claude-opus-4        -- "review my PR"
bunny run --runner codex                               -- "fix the failing tests"

Powered by Pi Coding Agent — think of it as the oh-my-zsh of coding agents: pre-wired for every major provider, battle-tested on real engineering tasks.
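
The --model values above appear to follow a provider:model convention (google:gemini-2.5-pro), with a bare model name also accepted (claude-opus-4). The sketch below shows one way such a value could be split. parseModelSpec and ModelSpec are hypothetical names for illustration, not part of the Bunny CLI or SDK.

```typescript
// Hypothetical helper: splits a --model value into provider and model name.
// The "provider:model" convention is inferred from the examples in this README.
interface ModelSpec {
  provider?: string; // e.g. "google"; undefined when no prefix is given
  model: string;     // e.g. "gemini-2.5-pro"
}

function parseModelSpec(spec: string): ModelSpec {
  const trimmed = spec.trim();
  if (trimmed === "") {
    throw new Error("model spec must not be empty");
  }

  const idx = trimmed.indexOf(":");
  if (idx === -1) {
    // Bare model name, as in "--model claude-opus-4".
    return { model: trimmed };
  }

  const provider = trimmed.slice(0, idx);
  const model = trimmed.slice(idx + 1);
  if (provider === "" || model === "") {
    throw new Error(`malformed model spec: ${spec}`);
  }
  return { provider, model };
}
```

A bare name leaves provider undefined, so the runner would be free to pick its own default in that case.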


🔧 Harness-Ready — Tools Included

No config needed. Bunny ships with a pre-built tool harness:

| Tool | What it does |
| --- | --- |
| 🔍 Web Search | Brave / Tavily, auto-detected from env keys |
| 🌐 Web Fetch | Full page content extraction |
| 🖼️ Image Generation | AI image creation from prompts |
| 🔨 Bash Execute | Run shell commands in the sandbox |
| 📁 File Ops | Read / write files in the workspace |

Add your own tools by dropping a skill file — the harness discovers them automatically.


🐰 Built with a Conscience

Every Bunny Agent ships with a core directive baked into its system prompt:

"Protect Human. Push Humanity Forward."

It's not decoration — it's the guiding principle behind every tool call, every decision, and every line of code Bunny writes.


📡 AI SDK UI Native — Zero Glue

Bunny's stdout is an AI SDK UI stream. Pipe it to your server, pass it to your client, done.

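// Next.js API route: this is the entire backend
// Import paths follow the SDK example later in this README;
// the BunnyAgent export name is an assumption.
import { BunnyAgent } from "@bunny-agent/sdk";
import { SandockSandbox } from "@bunny-agent/sandbox-sandock";

export async function POST(req: Request) {
  const { messages, sessionId } = await req.json();

  // One agent per session id, so a returning client resumes the same sandbox
  const agent = new BunnyAgent({
    id: sessionId,
    sandbox: new SandockSandbox(),
    runner: { kind: "pi", model: "google:gemini-2.5-pro" },
  });

  return agent.stream({ messages }); // returns a Response with AI SDK UI stream
}

// React client: useChat just works
import { useChat } from "@ai-sdk/react";

const { messages, input, handleSubmit } = useChat({ api: "/api/agent" });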

No protocol translation. No buffering. Pure passthrough.
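
To make "pure passthrough" concrete, here is a minimal sketch that hands a byte stream straight to a standard Response without reading or re-encoding it. passthroughResponse and fakeAgentStdout are hypothetical names, and the text/event-stream content type is an assumption; use whatever your AI SDK version expects.

```typescript
// Sketch: forward a byte stream to the client without buffering or re-encoding.
// The content type is an assumption, not taken from the Bunny Agent source.
function passthroughResponse(agentStdout: ReadableStream<Uint8Array>): Response {
  return new Response(agentStdout, {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
    },
  });
}

// Stand-in for the agent's stdout, used only to exercise the helper.
function fakeAgentStdout(chunks: string[]): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    start(controller) {
      for (const chunk of chunks) {
        controller.enqueue(encoder.encode(chunk));
      }
      controller.close();
    },
  });
}
```

Because the stream is passed to Response as-is, chunks reach the client in the order and shape the agent emitted them.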


☁️ One-Command Remote Sandbox

Stop worrying about your laptop's specs. Launch a cloud machine instantly:

bunny remote my-project

That's it. You're now in a remote machine backed by Sandock:

  • NVMe SSD — fast local I/O, not sluggish network storage
  • 🗂️ POSIX-compliant filesystem — full compatibility, no quirks for coding agents
  • 🔒 Isolated container, persistent volume across sessions
  • 💰 Starting at $5 / month — production-grade at hobby prices
  • ♾️ Launch as many sandboxes as you need — no local resource constraints

Sandock is purpose-built for coding agents: SSD-backed, POSIX-native, and optimised for the read/write patterns that agents generate. It is designed to pair strong sandbox performance with low cost.


💾 Persistent Sessions

Every agent run is tied to an id. Resume exactly where you left off — same filesystem, same context.

bunny run --resume my-project -- "continue where we left off"

🚀 Quick Start

Install

npm install -g @bunny-agent/runner-cli

Set your API key

export ANTHROPIC_API_KEY=sk-ant-...
# or for Gemini:
export GEMINI_API_KEY=...

Run your first task

# Local — uses your current directory
bunny run -- "explain this codebase and suggest improvements"

# Remote — cloud sandbox via Sandock
bunny remote my-project

# Choose a model
bunny run --runner pi --model google:gemini-2.5-pro -- "write unit tests for src/auth.ts"

From source

git clone https://github.com/vikadata/sandagent.git bunny-agent
cd bunny-agent
pnpm install && pnpm build

cd apps/runner-cli
npx bunny-agent run -- "your task here"

☁️ One-Command Remote Sandbox

bunny remote <project-name>

Under the hood this:

  1. Provisions a Sandock container with NVMe storage
  2. Mounts a persistent volume for <project-name>
  3. Drops you into an interactive agent session on that machine

Run multiple sandboxes in parallel:

bunny remote frontend-work    # machine 1
bunny remote backend-api      # machine 2
bunny remote data-pipeline    # machine 3

Each runs in full isolation. No config drift, no "works on my machine".

Get a Sandock API key at sandock.ai — plans start at $5/month.


🏗️ Build Your Own Agent Product

Use the SDK to embed Bunny Agent in any product — a Next.js SaaS, an Electron desktop app, or your own OpenClaw alternative. The architecture is the same either way: your UI talks to an AI SDK stream, Bunny handles the rest.

Architecture

Your Next.js App
    │
    ├── useChat() ─────────────────────────  React client (AI SDK)
    │
    └── POST /api/agent ──────────────────   your API route
            │
            └── BunnyAgent.stream() ─────────  Bunny Agent SDK
                    │
                    ├── runner: pi / claude / codex / gemini
                    │
                    └── sandbox: Sandock / E2B / Daytona / Local

Sandbox options

| Sandbox | Best for | Setup |
| --- | --- | --- |
| Sandock ⭐ | NVMe SSD · POSIX filesystem · coding-agent optimised · from $5/mo | API key from sandock.ai |
| E2B | Managed cloud sandboxes | API key from e2b.dev |
| Daytona | Enterprise / self-hosted | API key from daytona.io |
| Local | Development, no cloud needed | No key required |

Switch with one import — the rest of your code stays unchanged.

import { createBunnyAgent } from "@bunny-agent/sdk";
import { SandockSandbox } from "@bunny-agent/sandbox-sandock";

const agent = createBunnyAgent({
  sandbox: new SandockSandbox(),
  runner: { kind: "pi", model: "anthropic:claude-sonnet-4" },
});

// Returns LanguageModelV3 — compatible with Vercel AI SDK
const model = await agent.getModel();
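
Swapping sandboxes with one import works when application code depends only on a shared adapter contract. The sketch below illustrates that pattern with stand-in classes. The Sandbox interface, FakeLocalSandbox, FakeRemoteSandbox, and runTask are all hypothetical; the real adapter contract in the @bunny-agent packages may differ.

```typescript
// Hypothetical adapter contract, for illustration only.
interface Sandbox {
  readonly name: string;
  exec(command: string): Promise<string>;
}

// Stand-in adapters; real ones would talk to Sandock, E2B, Daytona, or the local machine.
class FakeLocalSandbox implements Sandbox {
  readonly name = "local";
  async exec(command: string): Promise<string> {
    return `[local] ${command}`;
  }
}

class FakeRemoteSandbox implements Sandbox {
  readonly name = "remote";
  async exec(command: string): Promise<string> {
    return `[remote] ${command}`;
  }
}

// Application code depends only on the interface, so swapping adapters
// changes one constructor call and nothing else.
async function runTask(sandbox: Sandbox, command: string): Promise<string> {
  return sandbox.exec(command);
}
```

Replacing new FakeLocalSandbox() with new FakeRemoteSandbox() at the call site is the only change needed, which mirrors the one-import switch described above.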

🔧 CLI Reference

bunny run [options] -- "<task>"

Options:
  -r, --runner <name>    Runner: pi | claude | gemini | codex | opencode  (default: claude)
  -m, --model  <model>   Model override  (e.g. google:gemini-2.5-pro)
  -c, --cwd    <path>    Working directory  (default: current dir)
  -s, --system-prompt    Custom system prompt
  -t, --max-turns <n>    Maximum turns
      --resume <session> Resume a previous session
      --yolo             Skip confirmation prompts
  -h, --help             Show help
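
If you drive the CLI from a script, the options above can be assembled into an argument list programmatically. This is a sketch using only the flags listed in the help text; buildRunArgs and RunOptions are hypothetical names, and it assumes --system-prompt takes the prompt text as its value.

```typescript
// Options mirror the flags listed in the help text above.
interface RunOptions {
  runner?: "pi" | "claude" | "gemini" | "codex" | "opencode";
  model?: string;
  cwd?: string;
  systemPrompt?: string; // assumption: --system-prompt takes the prompt text as its value
  maxTurns?: number;
  resume?: string;
  yolo?: boolean;
}

// Builds the argument list for "bunny run", ending with "--" and the task.
function buildRunArgs(task: string, opts: RunOptions = {}): string[] {
  const args: string[] = ["run"];
  if (opts.runner) args.push("--runner", opts.runner);
  if (opts.model) args.push("--model", opts.model);
  if (opts.cwd) args.push("--cwd", opts.cwd);
  if (opts.systemPrompt) args.push("--system-prompt", opts.systemPrompt);
  if (opts.maxTurns !== undefined) args.push("--max-turns", String(opts.maxTurns));
  if (opts.resume) args.push("--resume", opts.resume);
  if (opts.yolo) args.push("--yolo");
  args.push("--", task); // everything after "--" is the task text
  return args;
}
```

The resulting array can be passed to spawn("bunny", args) from node:child_process, which avoids shell-quoting problems with task text that contains spaces or quotes.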

Environment Variables

| Variable | Description |
| --- | --- |
| ANTHROPIC_API_KEY | Claude / Anthropic models |
| GEMINI_API_KEY | Google Gemini models |
| OPENAI_API_KEY | OpenAI models |
| SANDOCK_API_KEY | Sandock remote sandbox |
| E2B_API_KEY | E2B cloud sandbox |
| BRAVE_API_KEY | Brave web search |
| TAVILY_API_KEY | Tavily web search (fallback) |
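
The table lists Tavily as the fallback search provider, which suggests a detection order of Brave first, then Tavily. A minimal sketch of that selection logic follows; pickSearchProvider is a hypothetical name and the harness's actual detection code may differ.

```typescript
type SearchProvider = "brave" | "tavily";

// Sketch of the detection order implied by the table: Brave first, Tavily as fallback.
function pickSearchProvider(
  env: Record<string, string | undefined>,
): SearchProvider | null {
  if (env.BRAVE_API_KEY) return "brave";
  if (env.TAVILY_API_KEY) return "tavily";
  return null; // no search key set
}
```

In Node you would call it as pickSearchProvider(process.env). Empty-string values count as unset here.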

📦 Packages

| Package | Description |
| --- | --- |
| @bunny-agent/sdk | Embed Bunny in your app |
| @bunny-agent/runner-harness | Pre-built tool harness (search, bash, files, image gen) |
| @bunny-agent/runner-pi | Pi coding agent runner (multi-model) |
| @bunny-agent/runner-claude | Claude Agent SDK runner |
| @bunny-agent/runner-codex | OpenAI Codex runner |
| @bunny-agent/runner-gemini | Gemini CLI runner |
| @bunny-agent/sandbox-sandock | Sandock sandbox adapter |
| @bunny-agent/sandbox-e2b | E2B sandbox adapter |
| @bunny-agent/sandbox-daytona | Daytona sandbox adapter |
| @bunny-agent/sandbox-local | Local sandbox adapter |

📚 Documentation


📊 Benchmark Results

Bunny Agent is evaluated on the GAIA benchmark — a challenging real-world task benchmark designed for general AI assistants.

Model: Gemini 3.1 Pro (via OpenAI-compatible API)

| Level | Tasks | Score | Pass Rate |
| --- | --- | --- | --- |
| L1 (simple reasoning) | 42 | 34/42 | 81% |
| L2 (multi-step) | 66 | 55/66 | 83% |
| L3 (complex reasoning) | 19 | 13/19 | 68% |

Results are legitimate zero-shot runs (no answer-revealing hints). Scores significantly exceed typical zero-shot baselines (~50–60% L2, ~11–30% L3).
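
The pass rates can be reproduced from the raw scores. The sketch below recomputes each level's rate and also derives an overall figure. The overall number is plain arithmetic on the three rows above, not a separately reported result.

```typescript
interface LevelResult {
  level: string;
  passed: number;
  total: number;
}

// Scores copied from the table above.
const results: LevelResult[] = [
  { level: "L1", passed: 34, total: 42 },
  { level: "L2", passed: 55, total: 66 },
  { level: "L3", passed: 13, total: 19 },
];

// Pass rate as a whole-number percentage.
function passRate(passed: number, total: number): number {
  return Math.round((passed / total) * 100);
}

const perLevel = results.map((r) => `${r.level}: ${passRate(r.passed, r.total)}%`);
const totalPassed = results.reduce((sum, r) => sum + r.passed, 0);
const totalTasks = results.reduce((sum, r) => sum + r.total, 0);
const overall = passRate(totalPassed, totalTasks);

console.log(perLevel.join(", ")); // L1: 81%, L2: 83%, L3: 68%
console.log(`overall: ${totalPassed}/${totalTasks} = ${overall}%`); // overall: 102/127 = 80%
```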

Benchmarks are run using apps/bunny-bench — the integrated evaluation harness that ships with this repo. Wrong-answer tracking lets you iterate on failures without re-running solved tasks.


🤝 Contributing

PRs welcome. See CONTRIBUTING.md for guidelines.

pnpm install    # install all workspace dependencies
pnpm build      # build all packages
pnpm test       # run tests
pnpm typecheck  # type-check everything

License

Apache 2.0 — see LICENSE.


Built with 🐰 calm energy  ·  Powered by Pi Coding Agent
