Open-Source Multi-Agent Orchestration Platform
One control plane to deploy, govern, and scale your AI agent fleet.
Running in production at RunDiffusion since February 2026.
Why This Exists · Control Plane · Quick Start · Architecture · Use Cases · Showcase · Docs · Contact
- Multi-agent orchestration
- YAML control plane
- Per-tenant isolation
- Agent-managed operations
- Self-hosted & cost-aware
- Production-tested
Every team is experimenting with AI agents. Very few can actually operationalize them. The gap between "we have a chatbot" and "we have a governed agent fleet driving real output" is enormous, and that gap is where your competitors are moving right now.
Without a control plane, you get tool sprawl, leaked secrets, shadow AI, and agents that nobody owns. With one, you get centralized governance, controlled rollout, and 10x more output without the chaos.
Your competitors are standing up governed AI infrastructure today. Yes, it's bleeding-edge. But the teams that operationalize AI agents first will compound that advantage every single week. The cost of waiting is falling behind.
This repo is proof that RunDiffusion knows how to deploy governed AI infrastructure in the real world. It is the exact system running in production across our organization: not a demo, not a reference architecture, not a blog post. Real agents, real governance, real output.
Spin it up. Get a win. Then scale it to your team.
Every tenant in your fleet is governed by a single YAML file. Pin versions, enforce model policy, rotate secrets, and toggle routes, all without touching individual containers or hand-editing env files.
tenants:
  dholbrook-marketing-agent:      # This becomes the agent URL prefix
    openclawVersion: 2026.3.24    # Pin the exact OpenClaw version per tenant
    secrets:                      # Inject API keys from the host, not baked into images
      GEMINI_API_KEY: ""
      GEMINI_CLI_API_KEY: ""
      HERMES_OPENAI_API_KEY: ""
      CODEX_OPENAI_API_KEY: ""
      CLAUDE_ANTHROPIC_API_KEY: ""
      OPENROUTER_API_KEY: ""
    models:
      allowed:                    # Allowlist which models this tenant can use
        - openai/gpt-5.4
      primary: openai/gpt-5.4     # Set the default model
      fallbacks: []               # Optional fallback chain
    agents:
      main:
        model: openai/gpt-5.4     # Bind the operator agent to a specific model
    providers:
      google:
        hydrateAuth: false        # Control provider-level auth behavior
    routes:
      gemini:
        enabled: false            # Feature-flag any tool on or off per tenant

What you control from one file:
- Version pins: lock each tenant to a specific OpenClaw release, roll forward on your schedule
- Secret injection: API keys managed at the host level, never committed to git
- Model governance: allowlists, primary model selection, and fallback chains
- Agent-to-model binding: decide which model powers each tenant's operator
- Provider policy: toggle auth hydration and provider-level behavior
- Route-level feature flags: enable or disable Gemini, Claude, Codex, or any route per tenant
This is the difference between "we have AI tools" and "we have AI governance." One file. Full fleet control.
See the Configuration Guide for the complete override surface across all four config layers.
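The model policy in the example implies a simple invariant: a tenant's primary model (and any fallback) should appear in its allowlist. Below is a minimal sketch of that check in plain shell. The function name `model_allowed` is hypothetical, and the invariant itself is an assumption drawn from the comments in the YAML; the repo's deploy scripts may validate this differently, or not at all.

```shell
# Hypothetical helper: succeeds when the first argument appears
# among the remaining arguments (the allowlist).
model_allowed() {
  candidate=$1
  shift
  for allowed in "$@"; do
    [ "$candidate" = "$allowed" ] && return 0
  done
  return 1
}

# Values copied by hand from the example tenant; a real check would read them from the YAML.
primary="openai/gpt-5.4"
if model_allowed "$primary" "openai/gpt-5.4"; then
  echo "primary model is in the allowlist"
else
  echo "primary model is NOT in the allowlist" >&2
fi
```

Running the same check over each entry in `fallbacks` would catch a fallback chain that quietly escapes the allowlist.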
Point an agent at this repo, have it install the matching skill, and it handles the entire deployment (configuration, secrets, tenant creation, and verification), then hands you back the URLs and credentials.
Step 1. Tell your agent which setup you want:
| I want... | Have your agent install this skill |
|---|---|
| Multiple agents on a shared host with Traefik routing (recommended) | $rundiffusion-host-agent-manager |
| One agent on this machine or one remote host | $rundiffusion-standalone-agent-manager |
Step 2. Give your agent a prompt:
Install the skill from /skills/rundiffusion-host-agent-manager, then use it to
create me a RunDiffusion multi-tenant cluster locally with one tenant called
"Chad Smith" using slug "csmith-1234". Configure the first tenant, deploy it,
and return the tenant URLs and credentials.
More example prompts
Install the skill from /skills/rundiffusion-standalone-agent-manager, then use it to create me a RunDiffusion Agent locally. Use the keys in services/rundiffusion-agents/.env. Name the deployment "My Agent". Return the /openclaw/ URL, the /dashboard/ URL, the operator credentials, and the OpenClaw gateway token.

Install the skill from /skills/rundiffusion-standalone-agent-manager, then use it to create me a RunDiffusion Agent locally. Use the keys in <path-to-env-file>. Name the deployment "My Agent". Keep native OpenClaw auth enabled and return the actual reachable URLs plus the generated credentials.

Install the skill from /skills/rundiffusion-standalone-agent-manager, then use it to create me a remote single-tenant RunDiffusion Agent. Use HTTPS or Cloudflare Tunnel as needed for native /openclaw, tell me which values you can infer from the repo, and tell me exactly what host or DNS inputs you still need from me.

Install the skill from /skills/rundiffusion-host-agent-manager, then use it to create me a RunDiffusion multi-tenant cluster with one tenant called "Chad Smith" using slug "csmith-1234", and help me get Cloudflare Tunnel set up. Use the repo root host stack, configure the tenant, deploy it, and tell me which Cloudflare values or DNS steps still need my input.
Step 3. Keep using the same agent + skill for updates, redeploys, tenant changes, health checks, and troubleshooting.
Both skills inspect first, ask second: they check the working directory and repo state, infer your intent when possible, and only ask clarifying questions when the state is ambiguous.
Single-tenant only requires Docker with Compose support.
Multi-tenant also requires: bash, curl, jq, yq, openssl, and optionally python3.
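A quick way to confirm the prerequisites before going further is to probe `PATH` for each tool. This is a minimal sketch, not a script shipped in the repo; `missing_tools` is a hypothetical helper, and it only checks that the commands exist, not their versions or whether Docker has Compose support.

```shell
# Hypothetical helper: print every command from the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Multi-tenant tools listed above, plus docker (python3 is optional, so it is left out).
missing=$(missing_tools docker bash curl jq yq openssl)
if [ -n "$missing" ]; then
  echo "Missing prerequisites:"
  echo "$missing"
else
  echo "All multi-tenant prerequisites found."
fi
```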
We highly recommend getting a Gemini API key (or using a Codex account) to give your agents some "gas" to get moving immediately.
macOS (Homebrew)
brew install jq yq openssl
brew install --cask docker
Or run the all-in-one helper:
./scripts/bootstrap-mac-mini.sh
Ubuntu / Debian
sudo apt-get update
sudo apt-get install -y docker.io docker-compose-v2 curl jq yq openssl python3
Fedora
sudo dnf install -y docker-cli docker-compose curl jq yq openssl python3
Arch Linux
sudo pacman -S --needed docker docker-compose curl jq yq openssl python
Windows (WSL2)
- Install Docker Desktop on Windows.
- Enable WSL2 integration for your Linux distro in Docker Desktop.
- Inside the WSL distro, install the Linux prerequisites above for your distro.
Use this when you want the full shared-host architecture: Traefik ingress, per-tenant isolation, and the ability to scale to many agents.
1. Configure the environment
   cp .env.example .env
   Edit the root .env file with your host paths, ingress mode, and shared settings.
2. Create the local tenant registry
   cp deploy/tenants/tenants.example.yml deploy/tenants/tenants.yml
   If you skip this, the deploy scripts create deploy/tenants/tenants.yml automatically from the example.
3. Create your first tenant
   ./scripts/create-tenant.sh tenant-a "Tenant A"
   This generates the tenant env file for you at ${TENANT_ENV_ROOT}/tenant-a.env. Start from that generated file rather than inventing a new contract.
4. Edit the tenant env file outside git
   Set the tenant hostname, auth, and provider keys. Keep OPENCLAW_CONTROL_UI_ALLOWED_ORIGINS equal to the exact browser origin. For vanilla native /openclaw, that browser origin should be HTTPS unless you are on localhost.
5. Deploy the host stack
   ./scripts/deploy.sh
   This brings up shared ingress and deploys all enabled tenants.
6. Verify health
   ./scripts/status.sh
   ./scripts/smoke-test.sh --all
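The tenant env step above says the browser origin for native /openclaw should be HTTPS unless you are on localhost. Here is a minimal sketch of that rule as a pre-deploy check. `origin_ok` is a hypothetical helper, not part of the repo; treating 127.0.0.1 as localhost is an assumption based on the standalone example values, and `http://agents.lan` is a made-up LAN hostname used only for illustration.

```shell
# Hypothetical helper: accept https:// origins, or http:// only for localhost.
origin_ok() {
  case "$1" in
    https://*) return 0 ;;
    http://localhost|http://localhost:*) return 0 ;;
    http://127.0.0.1|http://127.0.0.1:*) return 0 ;;
    *) return 1 ;;
  esac
}

for origin in "https://agent.example.com" "http://localhost:8080" "http://agents.lan"; do
  if origin_ok "$origin"; then
    echo "ok:     $origin"
  else
    echo "reject: $origin (native /openclaw needs HTTPS or localhost)"
  fi
done
```

The check only looks at the scheme and host prefix. It does not verify that the origin matches what the browser actually sends, which is the part the guide asks you to keep exact.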
Multi-Tenant LAN & Remote Notes
- INGRESS_MODE=local is for local multi-tenant and LAN/private-network installs.
- INGRESS_MODE=direct is for remote multi-tenant hosts where you already have DNS and HTTPS handled outside the repo.
- INGRESS_MODE=cloudflare is for remote multi-tenant hosts published through Cloudflare Tunnel.
- On plain HTTP LAN hostnames, /dashboard, /terminal, /filebrowser, /hermes, /codex, /claude, and /gemini work cleanly, but vanilla native /openclaw still needs HTTPS or localhost.
TLS automation for private-hostname LAN installs is still outside the scope of this release.
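The three ingress modes can be summarized as a small lookup that also rejects typos before a deploy. This is a sketch only: `describe_ingress_mode` is a hypothetical helper, and the repo's own scripts may validate INGRESS_MODE differently.

```shell
# Hypothetical helper: map an INGRESS_MODE value to the description given in the notes.
describe_ingress_mode() {
  case "$1" in
    local)      echo "local or LAN/private-network install" ;;
    direct)     echo "remote host, DNS and HTTPS handled outside the repo" ;;
    cloudflare) echo "remote host published through Cloudflare Tunnel" ;;
    *)          echo "unknown INGRESS_MODE: $1" >&2; return 1 ;;
  esac
}

# Demo call with a fixed value; in practice you would pass the value from the root .env.
describe_ingress_mode local
```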
Use services/rundiffusion-agents when you want one agent package without Traefik.
1. Open the standalone package
   cd services/rundiffusion-agents
2. Copy the standalone env template
   cp .env.example .env
3. Edit the standalone env
   Set at least:
   - OPENCLAW_ACCESS_MODE=native
   - OPENCLAW_GATEWAY_TOKEN=<long-random-secret>
   - TERMINAL_BASIC_AUTH_USERNAME=<username>
   - TERMINAL_BASIC_AUTH_PASSWORD=<strong-password>
   - OPENCLAW_CONTROL_UI_ALLOWED_ORIGINS=http://127.0.0.1:8080,http://localhost:8080 for localhost
   - OPENCLAW_CONTROL_UI_ALLOWED_ORIGINS=https://agent.example.com for a remote HTTPS hostname
4. Start the package
   docker compose up -d --build
5. Open the service
   - Dashboard: http://127.0.0.1:8080/dashboard for localhost
   - OpenClaw: http://127.0.0.1:8080/openclaw for localhost
For remote single-tenant installs, see docs/standalone-host-quickstart.md for the full localhost, remote DNS, and remote Cloudflare flows.
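The standalone env asks for a long random OPENCLAW_GATEWAY_TOKEN. One way to produce such a value is sketched below. The 32-byte length is an assumption, since the guide only says "long", and `gen_secret` is a hypothetical helper rather than something the repo provides.

```shell
# Hypothetical helper: emit 32 random bytes as 64 hex characters.
# Prefers openssl; falls back to /dev/urandom when openssl is not installed.
gen_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \t\n'
  fi
}

gateway_token=$(gen_secret)

# Print the line to paste into .env; avoid running this in shared terminals or logged sessions.
printf 'OPENCLAW_GATEWAY_TOKEN=%s\n' "$gateway_token"
```

The same helper could generate TERMINAL_BASIC_AUTH_PASSWORD, though a password manager is usually the better home for credentials a human has to type.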
This is running in production. Here is what it does for us:
- Content Creation: Our content team has their agents hooked up to our blogging platform. A vast majority of our articles on RunDiffusion Image and RunDiffusion Video are produced agentically. This includes dynamic image generation and intelligent cross-site linking. Switching to our agent farm has saved our content team over 8 hours of work per week, allowing them to focus on strategy rather than formatting.
- Development & Code Review: Our devs use the agents to check code at night. The agents scan commits, run static analysis, flag potential bugs, and even draft pull requests by the time the team wakes up. It's like having an indefatigable senior engineer reviewing your work 24/7.
- Sales & Marketing: Our teams use agents to help draft highly-contextualized replies to prospects. We never send automated responses to clients (the human touch is crucial), but we use agents to draft the perfect reply, ensuring everyone gets answered promptly and accurately.
- Automated QA: Our QA team uses agents hooked into Playwright to autonomously write and execute tests across our platforms. The agents interact with the DOM, test user flows, and report back on breaking changes before they hit production.
- Project Management: We have our agents hooked into Monday.com to monitor the health of tasks and keep projects moving seamlessly. One agent's specific job is to scan the board for any tasks that have been stale for 3 days and gently ping the assignees for updates, completely eliminating the need for manual "just checking in" messages.
- Design Review: We imported the rules of Refactoring UI into an agent's context to have it automatically check design decisions and mockups submitted by the team. It acts as an automated design linter, ensuring our UI consistency stays top-notch.
Hardware: All of this runs on a single 2024 Mac Mini M4 with 16 GB of RAM, which handles about 6 to 8 agents comfortably.
A strong operator agent sits above the deployment and manages the rest of the fleet. This is what makes RunDiffusion Agents an agent orchestration platform, not just another AI tool.
┌─────────────────────────────────────────────────────┐
│                   Operator Agent                    │
│          (GPT-5.4 / Claude Opus / Gemini)           │
└──────────────────────┬──────────────────────────────┘
                       │ reads / writes
                       ▼
       ┌───────────────────────────────┐
       │       control-plane.yml       │
       │  versions · models · secrets  │
       │   routes · agents · policy    │
       └───────────────┬───────────────┘
                       │ applied at deploy
                       ▼
             ┌───────────────────┐
             │      Traefik      │
             │   (edge router)   │
             └───┬───┬───┬───┬───┘
                 │   │   │   │
          ┌────┐ ┌────┐ ┌────┐ ┌────┐
          │ T1 │ │ T2 │ │ T3 │ │ T4 │  ← isolated Docker containers
          └────┘ └────┘ └────┘ └────┘
    each: OpenClaw · Codex · Claude · Gemini · Hermes
          Terminal · FileBrowser · Dashboard
At a glance:
- Traefik routes traffic to the right tenant and tool at the edge
- Each tenant runs in its own isolated Docker container
- A host-side control-plane YAML centrally manages version pins, route flags, model policy, and secrets
- Provisioning scripts handle create, deploy, update, rollback, backup, restore, smoke tests, and health checks
- Cloudflare Tunnel support for secure remote access without exposing the host
- Cloudflare Access can be added for additional authentication and gateway security
- Each agent gets its own stateful workspace: dashboard, terminal, and file browser in one place
See the deployment matrix for all four deployment shapes and the configuration guide for the control-plane override surface.
Instead of manually editing containers and env files, point a capable agent at this repo and give it high-level tasks:
Create a new agent:
> Create me a new agent for David Smith.
Done. I created tenant "dsmith-7821", generated the env file, provisioned data
directories, and assigned the default OpenClaw version. Here are the URLs,
operator credentials, and gateway token. Want me to deploy and run smoke tests?
Upgrade the whole cluster:
> Update all agents to the latest OpenClaw and run the tests.
Done. Updated the host default version, redeployed each tenant in order, and ran
smoke tests after each rollout. All tenants are aligned. Full report attached.
More agent operation examples
Copy skills between agents:
> Copy the skills from Dave's Marketing agent to Tyler's Marketing agent.
Done. Synced the missing marketing skills from Dave's agent into Tyler's,
preserved Tyler's local customizations, and verified file paths. Tyler now has
the same baseline skill pack as Dave.
Audit secrets across the fleet:
> Check all deployments for secrets and tell me what should move to the control plane.
Done. Scanned all tenant env files and host config. Grouped findings into:
shared secrets that belong in the control plane, tenant-specific credentials,
duplicated values, and keys that should be rotated before centralization.
Ready to generate a migration plan when you are.
| Model | Best For | Notes |
|---|---|---|
| OpenAI GPT-5.4 | Top-end agentic operations, coding, long-running infra tasks | Highest capability vs cost |
| Claude Opus 4.6 | Most capable Claude-class operator | Strong reasoning |
| Claude Sonnet 4.5 | Fast, highly capable day-to-day operations | Best speed/capability ratio |
| Gemini 3 Flash (preview) | Budget-conscious operations | Our pick for capability per dollar |
Each tool runs on its own dedicated route, so you can pop any app out of the dashboard for a full-screen experience.
Run your favorite models side-by-side in fully featured, stateful environments.
Hermes β Delegated Tasks
Delegate complex, asynchronous tasks to sub-agents and let Hermes manage the execution.
Integrated Terminal
Separate, fully integrated terminal sessions for each application with multiple modes for scrolling, copying, and pasting.
Quantum Filebrowser
A secure file browser for managing documents and adding secrets securely. (Proxy-authenticated FileBrowser users are provisioned with full operator permissions automatically, including upload, edit, delete, sharing, API, and realtime access.)
Utilities & Dashboard
Your entire agent farm is secured by Basic Auth, with robust device pairing for OpenClaw. The utilities section features custom scripts like approve-device and gateway recovery helpers to keep your farm healthy.
- Terminal Quirks: The integrated terminal has multiple modes for scrolling, copying, and pasting. Read the help modal inside the terminal for full details.
- Filebrowser Permissions: Proxy-authenticated FileBrowser users receive full operator permissions automatically when they first sign in, and existing proxy users are reconciled on startup so no manual settings changes are required.
- Hardware: For an optimal experience balancing performance and cost, a Mac Mini M4 (16 GB RAM) runs about 6 to 8 agents comfortably.
| Guide | Description |
|---|---|
| Deployment Matrix | Choose the right deployment shape for your environment |
| Standalone Host Quickstart | Single-tenant from zero to running |
| Multi-Tenant Deployment | Shared host stack with Traefik & ingress modes |
| Linux Host Quickstart | Cloud VM and Linux server deployment |
| Configuration Guide | All four config layers and precedence rules |
| Tenant Operations Runbook | Day-to-day operational tasks |
| Operator Runbook | OpenClaw gateway operator responsibilities |
| Release Checklist | Release process and verification |
RunDiffusion has been in business since 2022. Our team of DevOps engineers brings over 30 combined years of experience in software systems architecture. This codebase is free from malicious code, malware, or any devious intent. You are welcome to audit every line.
Join our Discord with 35k+ members.
Come build with us, get help, and talk with like-minded AI enthusiasts.
Find us: LinkedIn · X · GitHub · Discord · rundiffusion.com
Designate an Agent-Wise Internal Champion. To succeed, your team needs someone to run and maintain the farm: checking agent health daily, monitoring secrets inside containers, reviewing errors and usage, and managing the entire deployment using the bundled multi-tenant host manager skill or standalone manager skill.
Use At Your Own Risk. This is bleeding-edge software. You are responsible for security, secrets, data protection, access control, compliance, third-party API usage, costs, and any impact in your environment. See DISCLAIMER.md.
RunDiffusion Agents is licensed under Apache-2.0. See LICENSE, NOTICE, TRADEMARKS.md, DISCLAIMER.md, and the engineering inventory in docs/license-audit.md.
This repository can bundle or launch third-party tools such as OpenClaw, Codex, Claude Code, Gemini CLI, Hermes, Traefik, and FileBrowser. The Apache-2.0 license applies to RunDiffusion's code in this repo only. It does not relicense those third-party tools or grant rights to the separate APIs and hosted services they may use.
RunDiffusion Agents: Open-Source Multi-Agent Orchestration Platform
If you are a team or enterprise and want help deploying governed AI agents at scale, contact us.
RunDiffusion Agents is an open-source agent platform for multi-agent orchestration, AI agent orchestration, self-hosted AI agents, agentic AI, and agent fleet management. It is a complete agent orchestration platform and multi-agent platform for teams and enterprises.










