GitHub - OpenBMB/PilotDeck: Task-oriented AI Agent productivity platform

Task-oriented AI Agent productivity platform — redefining operational boundaries and memory evolution, one WorkSpace at a time.

English | 简体中文
 Website · Live Demo · Tutorial · Quick Start · Highlights · Use Cases · Community

News 🔥

[2026.05.28] PilotDeck is now open source! Visit our official website at pilotdeck.openbmb.cn. We welcome contributions, feedback, and stars from the community.

💡 About PilotDeck

PilotDeck is an open-source agent operating system designed around the concept of "WorkSpace". It is jointly developed and open-sourced by Tsinghua University THUNLP, ModelBest, OpenBMB, and AI9Stars. Targeting general-purpose, multi-task scenarios, PilotDeck is built to be a true productivity tool for the Agent era.

A wave of excellent AI Agent harnesses has emerged in recent years, each with its own focus: Claude Code / Cursor / Trae Solo brought model reasoning deep into the programming IDE; Claude Cowork introduced the notion of project-level isolation to desktop-side knowledge work; WorkBuddy connected agents to IM ecosystems such as WeCom and Feishu so AI is one message away.

When we shift the lens from "one-shot programming" or "immediate Q&A" to long-running, multi-project productivity work, however, several questions remain open:

When many projects run in parallel, can memory be white-box and traceable? When the AI gets something wrong, can you pinpoint which memory entry caused it and edit it directly — without starting a new chat from scratch?
Can token cost be tracked per task, so that running agents in the background actually becomes economically viable?
Can tasks of different difficulty automatically be matched to different models, instead of burning the flagship model on trivial calls?
When you step away from the keyboard, can the work keep moving? Can the agent proactively discover what's worth doing, report progress, and land results as files on disk?

PilotDeck is an incremental exploration around exactly these questions. It uses the WorkSpace as the fundamental unit — completely isolating files, memory and skills per project — and pairs it with three pillar capabilities: White-box Memory, Smart Routing and Always-on. The entire system natively supports the Model Context Protocol (MCP) and behaves consistently across front-ends (Web / CLI / IM).

✨ Key Highlights

WorkSpace-Level Isolation & Accretion

Every project gets its own file system, memory store and skill set. Parallel work no longer interferes with itself, retrieval has a bounded scope, and skills accrete naturally as each task grows — no more global context pollution.

Traceable White-box Memory

Memory generation, extraction, storage and retrieval are visible end-to-end. When the AI mis-remembers, you can pinpoint and fix the offending entry. Built-in Dream Mode consolidates memory in idle windows, and supports one-click rollback.

Smart Routing & Cost Optimization

Task difficulty is auto-detected; complex calls go to flagship models (e.g. Claude 3.5 Sonnet / GPT-4o), simple ones drop to lighter models. Through on-device / cloud co-orchestration and precise matching, token spend shrinks dramatically without sacrificing quality.

Always-on Background Execution

PilotDeck breaks the "you ask, it answers" loop: after you sign off, the agent keeps discovering candidate tasks, running long-horizon monitors, and finally lands deliverables as local files with a summary report waiting for you.

📊 Real-world Numbers

The three pillar capabilities have shown clear advantages in production-grade workflows:

1. Smart Routing — ~70% cost savings on social-media workloads

In Xiaohongshu-style social-media operations, enabling Smart Routing automatically demotes simple polishing / layout tasks to a sub-agent (e.g. Sonnet 4.5) and only invokes Opus 4.5 at planning checkpoints:

Setup	Model configuration	Cost	Multiplier
Smart Routing ON	Opus 4.5 (main) + Sonnet 4.5 (sub)	$2.83	1.1×
Smart Routing OFF	All Opus 4.5 (main + sub)	$12.58	5.0×
Monolithic	Single Opus 4.5 long-react (estimated)	$12.20	4.8×

2. Smart Routing — 1/6 the cost while beating frontier models on hard tasks

The research team benchmarked 7 complex tasks (multilingual podcast push, multi-source data reports, domain-specific literature review, codebase architecture docs, etc.). The "strong main + light sub" routing setup matches or beats the frontier single-model setup at a fraction of the cost:

Setting	Score	Cost
MiniMax-M2.7 single-agent	37.1	$1.90
Claude Sonnet 4.6 single-agent	69.1	$18.36
Sonnet 4.6 (main) + MiniMax-M2.7 (sub)	70.6	$3.15

3. White-box Memory — layout & tone never bleed across projects

In black-box agents, mixing tasks in a shared context pool inevitably pollutes memory. PilotDeck's WorkSpace-scoped white-box memory addresses this end-to-end:

Dimension	Current AI Agents (black-box)	PilotDeck (white-box)
Visibility	You can't see what the AI remembers, only what it outputs	View every memory entry: what was stored, when, and which WorkSpace
Control	Once written, memory can't be edited or removed	Edit / delete entries, pin critical decisions so they don't drift
Traceability	When it goes wrong, you can't find the root cause	Generation → extraction → storage → retrieval, all auditable
Isolation	One shared pool — projects bleed into each other	Scoped per WorkSpace; A's memory never reaches B
Reversible	After compression, the original is gone	Dream-mode supports one-click rollback to the prior state

🖥️ UI & Demo

PilotDeck ships an out-of-the-box Web UI with full WorkSpace management, white-box memory editing, and visualization of multi-agent collaboration.

Use Cases

All demos below are generated entirely by edge-side models via PilotDeck's Smart Routing — no cloud-side frontier model required.

Work Document Generation

"Survey the Chinese LLM application market and turn it into a formal HTML white paper."

Process	Result

Mini-Game Development

"Walk me through building an iOS AR mini-game Ball Finder in Vibe Coding mode."

Process	Result

AI Engineering Platform Development

"Build a low-code embedding fine-tuning platform from scratch."

Process	Result

Audio-Video Editing & Social Media Operations

"Push this English podcast to a global audience in Chinese / Japanese / French / Korean / Spanish / Arabic."

Process	Result (with audio)
	podcast_result.mp4

📦 Installation & Quick Start

We provide a one-line installer for macOS / Linux, plus a source-based workflow for developers.

Option A: One-line install (recommended, macOS / Linux)

curl -fsSL https://raw.githubusercontent.com/OpenBMB/PilotDeck/main/install.sh | bash

The script auto-installs Node.js 22, clones the repo, installs dependencies, and builds the frontend. Once it finishes:

pilotdeck            # starts the server at http://localhost:3001
pilotdeck status     # check runtime status

Option B: From source (for developers)

1. Clone and install dependencies

This repo uses Git LFS for large media assets. Make sure git lfs is installed before cloning. If you don't need the demo videos/GIFs, add GIT_LFS_SKIP_SMUDGE=1 before git clone to skip downloading them.

git clone https://github.com/OpenBMB/PilotDeck.git
cd PilotDeck

npm install              # root deps (Gateway runtime)
cd ui && npm install     # UI deps
cd ..

2. Configure a model provider

PilotDeck reads ~/.pilotdeck/pilotdeck.yaml. You can create it manually, let the bootstrap script generate one, or just open the Web UI and configure providers visually in the settings panel. Supported protocols include OpenAI, Anthropic, DeepSeek, Qwen, Kimi, MiniMax and other OpenAI-compatible endpoints.

schemaVersion: 1
agent:
  model: deepseek/deepseek-v4-pro
model:
  providers:
    deepseek:
      protocol: openai
      url: https://api.deepseek.com/v1
      apiKey: sk-your-api-key

3. Start the services

cd ui && npm run dev     # dev mode (HMR), visit http://localhost:5173
# or
cd ui && npm run start   # production mode, visit http://localhost:3001

Option C: Docker Compose

If Docker is installed, you can start PilotDeck with:

docker compose up -d

🍎 Desktop App (Apple Silicon)

For macOS users we ship a signed, Apple-notarized DMG — double-click to run, no command-line setup required.

🛠️ Extension Protocol

PilotDeck has an open plugin architecture with a strict boundary between the open-source core and plugin customization. Extending the system is a plugin.json away:

MCP Servers — first-class integration with any Model Context Protocol server.
Tools & Skills — register custom tools, or pull community skills via ClawHub.
Lifecycle Hooks — intercept PreToolUse, UserPromptSubmit, and other critical lifecycle events.
Custom Memory — plug in your own memory store provider.

🤝 Contributing

Thanks to everyone who has contributed code, feedback, and ideas. New contributors are warmly welcome — let's build the next-gen agent OS together.

Workflow: Fork → feature branch → PR. Please make sure the unit tests and linters pass before opening a PR:

npm test
cd ui && npx vitest run

💬 Community

For bugs and feature requests, please open a GitHub Issue.
Join our community channels:

WeChat Community	Feishu Community	Discord Community

🙏 Acknowledgements

PilotDeck builds upon the following outstanding open-source projects:

ClawXRouter — Intelligent model routing
ClawXMemory — Agent memory system
Claude Code UI — Web UI reference
Claude Code Router — Model routing reference
UltraRAG — RAG framework
Anthropic Skills — Agent skill framework and built-in skills (skill-creator, frontend-slides)
MiniMax-AI Skills — minimax-pdf skill
Karpathy Guidelines — LLM coding behavioral guidelines
Vite — Frontend build tool
React — UI framework
Tailwind CSS — Utility-first CSS framework

🏢 Joint Development

PilotDeck is jointly developed by Tsinghua University THUNLP, ModelBest, OpenBMB and AI9Stars.

⭐ Support Us

If PilotDeck has been helpful in your work or research, please consider giving us a Star on GitHub!

📝 Citation

@misc{pilotdeck2026,
  author       = {PilotDeck Team},
  title        = {PilotDeck: A WorkSpace-Centric Open-Source Agent Operating System},
  howpublished = {\url{https://github.com/OpenBMB/PilotDeck}},
  year         = {2026},
  note         = {Accessed: 2026-05-29}
}

📄 License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0). The products/** directory contains customer-specific customizations and is not part of the open-source release scope.

Name		Name	Last commit message	Last commit date
Latest commit History 596 Commits
.github/workflows		.github/workflows
apps/desktop		apps/desktop
assets		assets
products/_example		products/_example
scripts		scripts
skills		skills
src		src
tests		tests
ui		ui
.dockerignore		.dockerignore
.gitattributes		.gitattributes
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
README.zh.md		README.zh.md
README_DOCKER.md		README_DOCKER.md
docker-compose.yml		docker-compose.yml
docker-entrypoint.sh		docker-entrypoint.sh
install.sh		install.sh
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
pnpm-workspace.yaml		pnpm-workspace.yaml
tsconfig.json		tsconfig.json
vitest.config.js		vitest.config.js
vitest.setup.ts		vitest.setup.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

💡 About PilotDeck

✨ Key Highlights

📊 Real-world Numbers

1. Smart Routing — ~70% cost savings on social-media workloads

2. Smart Routing — 1/6 the cost while beating frontier models on hard tasks

3. White-box Memory — layout & tone never bleed across projects

🖥️ UI & Demo

Use Cases

Work Document Generation

Mini-Game Development

AI Engineering Platform Development

Audio-Video Editing & Social Media Operations

📦 Installation & Quick Start

Option A: One-line install (recommended, macOS / Linux)

Option B: From source (for developers)

Option C: Docker Compose

🍎 Desktop App (Apple Silicon)

🛠️ Extension Protocol

🤝 Contributing

💬 Community

🙏 Acknowledgements

🏢 Joint Development

⭐ Support Us

📝 Citation

📄 License

About

Uh oh!

Releases 2

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

💡 About PilotDeck

✨ Key Highlights

📊 Real-world Numbers

1. Smart Routing — ~70% cost savings on social-media workloads

2. Smart Routing — 1/6 the cost while beating frontier models on hard tasks

3. White-box Memory — layout & tone never bleed across projects

🖥️ UI & Demo

Use Cases

Work Document Generation

Mini-Game Development

AI Engineering Platform Development

Audio-Video Editing & Social Media Operations

📦 Installation & Quick Start

Option A: One-line install (recommended, macOS / Linux)

Option B: From source (for developers)

Option C: Docker Compose

🍎 Desktop App (Apple Silicon)

🛠️ Extension Protocol

🤝 Contributing

💬 Community

🙏 Acknowledgements

🏢 Joint Development

⭐ Support Us

📝 Citation

📄 License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages