
pai-agent

An AI agent that runs as a Cloudflare Durable Object. It connects to an LLM, executes tools (shell, files, search), and streams everything over WebSocket.

1,700 lines of TypeScript. 6 source files. Zero frameworks on the frontend.

Browser ─── WebSocket ───▶ PaiAgent DO ───▶ LLM (Claude / GPT-4o)
                                │
                                └──▶ ShellSession DO (exec, read, write)

Quickstart

git clone https://github.com/acoyfellow/pai-agent.git
cd pai-agent
npm install

# Add your API key (any one of these)
echo 'OPENAI_API_KEY=sk-...' > .dev.vars
# or: ANTHROPIC_API_KEY=sk-ant-...
# or: OPENROUTER_API_KEY=sk-or-...

npx wrangler dev
# Open http://localhost:8787

That's it. Type a message. Watch the agent think, call tools, and respond.

What happens when you send a message

  1. Your message goes over WebSocket to the PaiAgent Durable Object
  2. The DO calls the LLM with your message + tool definitions
  3. If the LLM wants to use a tool → the DO executes it via ShellSession
  4. Tool result goes back to the LLM → it can call more tools or respond
  5. This loops up to 25 times until the LLM gives a final answer
  6. Every step streams to the browser in real-time
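The loop above can be sketched in a few lines. This is an illustrative reduction, not the actual code in src/agent.ts; `callLLM` and `runTool` are hypothetical names standing in for the real LLM and tool plumbing:

```typescript
// Sketch of the agent loop described above. Names (callLLM, runTool) are
// illustrative stand-ins, not exports of agent.ts.
type LLMReply =
  | { kind: "tool_call"; name: string; args: unknown }
  | { kind: "final"; content: string };

async function agentLoop(
  callLLM: (history: string[]) => Promise<LLMReply>,
  runTool: (name: string, args: unknown) => Promise<string>,
  userMessage: string,
  maxSteps = 25, // step 5: the loop is bounded
): Promise<string> {
  const history = [userMessage];
  for (let step = 0; step < maxSteps; step++) {
    const reply = await callLLM(history); // step 2: message + tools to the LLM
    if (reply.kind === "final") return reply.content; // final answer, done
    // steps 3-4: execute the tool, feed the result back to the LLM
    const result = await runTool(reply.name, reply.args);
    history.push(`tool:${reply.name} -> ${result}`);
  }
  return "Step limit reached without a final answer.";
}
```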

Project structure

src/
  index.ts      → HTTP router (Hono) + WebSocket routing (agents-sdk)
  agent.ts      → PaiAgent Durable Object — the core agent loop
  llm.ts        → LLM provider abstraction (Anthropic, OpenAI, OpenRouter)
  tools.ts      → Tool definitions: shell_exec, read_file, write_file, search_files, list_directory, think
  shell.ts      → ShellSession Durable Object — sandboxed file/command execution
  types.ts      → All TypeScript interfaces
public/
  index.html    → Chat UI (vanilla HTML/CSS/JS, no build step)
migrations/
  0001_init.sql → D1 schema (sessions + messages)

How to deploy

# Create the D1 database
npx wrangler d1 create pai-agent-db
# Copy the database_id into wrangler.jsonc

# Run migrations
npx wrangler d1 migrations apply pai-agent-db --remote

# Set your LLM API key
npx wrangler secret put OPENAI_API_KEY

# Deploy
npx wrangler deploy

How to add a new tool

Open src/tools.ts. Each tool is an object:

{
  name: "my_tool",
  description: "What this tool does (the LLM reads this)",
  parameters: {
    type: "object",
    properties: {
      input: { type: "string", description: "..." },
    },
    required: ["input"],
  },
  execute: async (args, ctx) => {
    // ctx.shellExec(), ctx.readFile(), ctx.writeFile(), ctx.broadcast()
    return "result string shown to the LLM";
  },
}

Add it to the TOOLS array. The agent picks it up automatically.
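Because `execute` only depends on its `args` and `ctx`, a tool can be sanity-checked in isolation with a stub context. The tool below (`word_count`) and the narrow `ctx` type are hypothetical examples, not code from the repo; the real context type lives in src/types.ts:

```typescript
// Hypothetical tool, exercised with a stubbed ctx outside the Durable Object.
const myTool = {
  name: "word_count",
  description: "Count the words in a file",
  parameters: {
    type: "object",
    properties: { path: { type: "string", description: "File to count" } },
    required: ["path"],
  },
  execute: async (
    args: { path: string },
    ctx: { readFile: (path: string) => Promise<string> }, // subset of the real ctx
  ) => {
    const text = await ctx.readFile(args.path);
    return String(text.trim().split(/\s+/).length);
  },
};

// Stub ctx: no DO, no sandbox, just a canned file.
const stubCtx = { readFile: async () => "three word file" };
```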

How to swap LLM providers

The agent auto-detects which key you've set:

| Key | Provider | Default model |
| --- | --- | --- |
| ANTHROPIC_API_KEY | Anthropic (direct) | claude-sonnet-4-20250514 |
| OPENAI_API_KEY | OpenAI | gpt-4o |
| OPENROUTER_API_KEY | OpenRouter | anthropic/claude-sonnet-4-20250514 |

Priority: Anthropic → OpenAI → OpenRouter. Set any one of them in .dev.vars for local development, or with wrangler secret put for production.
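The priority rule is a simple cascade. A minimal sketch, assuming a `detectProvider` helper that is illustrative rather than the actual export of src/llm.ts:

```typescript
// Sketch of the key-detection priority: Anthropic → OpenAI → OpenRouter.
interface Env {
  ANTHROPIC_API_KEY?: string;
  OPENAI_API_KEY?: string;
  OPENROUTER_API_KEY?: string;
}

function detectProvider(env: Env): { provider: string; model: string } {
  if (env.ANTHROPIC_API_KEY)
    return { provider: "anthropic", model: "claude-sonnet-4-20250514" };
  if (env.OPENAI_API_KEY)
    return { provider: "openai", model: "gpt-4o" };
  if (env.OPENROUTER_API_KEY)
    return { provider: "openrouter", model: "anthropic/claude-sonnet-4-20250514" };
  throw new Error("No LLM API key set");
}
```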

WebSocket protocol

Send (client → server):

{"type": "message", "content": "your question"}
{"type": "configure", "model": "gpt-4o"}
{"type": "cancel"}

Receive (server → client):

{"type": "status", "status": "thinking"}
{"type": "message", "id": "...", "role": "assistant", "content": "...", "timestamp": 123}
{"type": "tool_call", "messageId": "...", "tool": {"name": "shell_exec", "arguments": {"command": "ls"}}}
{"type": "tool_result", "messageId": "...", "result": {"content": "...", "isError": false}}
{"type": "done"}
{"type": "error", "message": "..."}

HTTP API

All responses follow { ok, command, result, error, fix, next_actions }.

| Method | Path | Description |
| --- | --- | --- |
| GET | /api | Health + endpoint discovery |
| GET | /api/sessions | List sessions |
| POST | /api/sessions | Create session → returns wsUrl |
| DELETE | /api/sessions/:id | Delete session |
| WS | /agents/pai-agent/:id | WebSocket connection |
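A client can lean on the uniform envelope to handle every endpoint the same way. `ApiEnvelope` mirrors the `{ ok, command, result, error, fix, next_actions }` shape above; `unwrap` is a hypothetical helper, not part of the repo:

```typescript
// Consume the uniform response envelope described above.
interface ApiEnvelope<T> {
  ok: boolean;
  command: string;
  result?: T;
  error?: string;
  fix?: string;
  next_actions?: string[];
}

function unwrap<T>(envelope: ApiEnvelope<T>): T {
  if (!envelope.ok || envelope.result === undefined) {
    throw new Error(envelope.error ?? `'${envelope.command}' failed`);
  }
  return envelope.result;
}

// Typical use: create a session, then open the returned wsUrl.
// const res = await fetch("http://localhost:8787/api/sessions", { method: "POST" });
// const { wsUrl } = unwrap<{ wsUrl: string }>(await res.json());
```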

Architecture

Why Durable Objects? Each agent session is a stateful, long-lived object. The DO holds the conversation in memory, maintains a WebSocket connection to the browser, and orchestrates the LLM ↔ tool loop without any external state management. When the session is idle, Cloudflare hibernates it. When a message arrives, it wakes up with full state intact.

Why a separate ShellSession DO? Isolation. Each agent gets its own sandboxed execution environment. In production, this maps to a Cloudflare Container. In the prototype, it simulates a filesystem in memory.

Why not use the Vercel AI SDK / LangChain / etc.? This is 236 lines of LLM glue (src/llm.ts). It calls fetch(). It parses JSON. Adding a framework would triple the dependency tree to save maybe 50 lines.
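To make the "fetch + JSON" point concrete: the body of an OpenAI-style chat-completions call with tools is just a small object. Field names below follow the public OpenAI API; `buildChatBody` is a hypothetical helper for illustration, not code from src/llm.ts:

```typescript
// The request body an OpenAI-style chat-completions call needs for tool use.
interface ToolDef {
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema for the arguments
}

function buildChatBody(
  model: string,
  messages: { role: string; content: string }[],
  tools: ToolDef[],
) {
  return {
    model,
    messages,
    // OpenAI wraps each tool definition in { type: "function", function: ... }
    tools: tools.map((t) => ({ type: "function", function: t })),
  };
}

// The actual call is then a single fetch:
// await fetch("https://api.openai.com/v1/chat/completions", {
//   method: "POST",
//   headers: { authorization: `Bearer ${key}`, "content-type": "application/json" },
//   body: JSON.stringify(buildChatBody("gpt-4o", messages, TOOLS)),
// });
```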

Stack

| What | Why |
| --- | --- |
| Cloudflare Workers | Edge runtime, zero cold start |
| Durable Objects | Stateful WebSocket + agent state |
| agents SDK | WebSocket routing, DO lifecycle |
| Hono | HTTP routing (7kb) |
| D1 | SQLite at the edge |
| Anthropic / OpenAI | LLM providers |
| Vanilla HTML/JS | No build step for the frontend |

License

MIT
