rundiffusion/RunDiffusion-Agents

RunDiffusion Agents

Open-Source Multi-Agent Orchestration Platform

License: Apache-2.0 · Docker Ready · Discord · LinkedIn · Production Status · Platform: macOS | Linux | WSL2

One control plane to deploy, govern, and scale your AI agent fleet.
Running in production at RunDiffusion since February 2026.

Why This Exists · Control Plane · Quick Start · Architecture · Use Cases · Showcase · Docs · Contact


Multi-agent orchestration
Run OpenClaw, Codex, Claude, Gemini & Hermes side by side, with a file browser and a full terminal

YAML control plane
Govern versions, models, secrets & routes from one file

Per-tenant isolation
Docker containers with Traefik routing per agent operator

Agent-managed operations
Agents create, repair, upgrade & audit other agents

Self-hosted & cost-aware
Bare metal, LAN, cloud, or Cloudflare Tunnel: you own it

Production-tested
Runs our entire agent fleet on a single $600 Mac Mini M4

[Screenshot: OpenClaw running inside RunDiffusion Agents]


🔥 Why This Exists

Every team is experimenting with AI agents. Very few can actually operationalize them. The gap between "we have a chatbot" and "we have a governed agent fleet driving real output" is enormous, and that gap is where your competitors are moving right now.

Without a control plane, you get tool sprawl, leaked secrets, shadow AI, and agents that nobody owns. With one, you get centralized governance, controlled rollout, and 10x more output without the chaos.

Your competitors are standing up governed AI infrastructure today. Yes, it's bleeding-edge. But the teams that operationalize AI agents first will compound that advantage every single week. The cost of waiting is falling behind.

This repo is proof that RunDiffusion knows how to deploy governed AI infrastructure in the real world. It is the exact system running in production across our organization: not a demo, not a reference architecture, not a blog post. Real agents, real governance, real output.

Spin it up. Get a win. Then scale it to your team.


πŸŽ›οΈ The YAML Control Plane

Every tenant in your fleet is governed by a single YAML file. Pin versions, enforce model policy, rotate secrets, and toggle routes β€” all without touching individual containers or hand-editing env files.

tenants:
  dholbrook-marketing-agent: # This becomes the agent URL prefix
    openclawVersion: 2026.3.24 # Pin the exact OpenClaw version per tenant

    secrets: # Inject API keys from the host, not baked into images
      GEMINI_API_KEY: ""
      GEMINI_CLI_API_KEY: ""
      HERMES_OPENAI_API_KEY: ""
      CODEX_OPENAI_API_KEY: ""
      CLAUDE_ANTHROPIC_API_KEY: ""
      OPENROUTER_API_KEY: ""

    models:
      allowed: # Allowlist which models this tenant can use
        - openai/gpt-5.4
      primary: openai/gpt-5.4 # Set the default model
      fallbacks: [] # Optional fallback chain

    agents:
      main:
        model: openai/gpt-5.4 # Bind the operator agent to a specific model

    providers:
      google:
        hydrateAuth: false # Control provider-level auth behavior

    routes:
      gemini:
        enabled: false # Feature-flag any tool on or off per tenant

What you control from one file:

  • Version pins: lock each tenant to a specific OpenClaw release, roll forward on your schedule
  • Secret injection: API keys managed at the host level, never committed to git
  • Model governance: allowlists, primary model selection, and fallback chains
  • Agent-to-model binding: decide which model powers each tenant's operator
  • Provider policy: toggle auth hydration and provider-level behavior
  • Route-level feature flags: enable or disable Gemini, Claude, Codex, or any route per tenant

This is the difference between "we have AI tools" and "we have AI governance." One file. Full fleet control.

See the Configuration Guide for the complete override surface across all four config layers.
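
The allowlist rule can be spot-checked by hand before a deploy. The sketch below is a minimal illustration, not the repo's own validation logic: `model_allowed` is a hypothetical helper, and it assumes (the README does not state this outright) that a tenant's `primary` model should itself appear under `models.allowed`. Model names are passed as plain arguments so the example runs without a YAML parser.

```shell
# Hypothetical helper, not part of the repo.
# Usage: model_allowed <candidate> <allowed-model>...
# Succeeds only when <candidate> exactly matches one of the allowed models.
model_allowed() {
  candidate=$1
  shift
  for allowed in "$@"; do
    if [ "$candidate" = "$allowed" ]; then
      return 0
    fi
  done
  return 1
}

# Values copied from the example tenant in the YAML above.
primary="openai/gpt-5.4"
if model_allowed "$primary" "openai/gpt-5.4"; then
  echo "primary model is on the allowlist"
else
  echo "primary model is NOT on the allowlist"
fi
```

The same check could be repeated for each entry in `fallbacks` if your policy requires fallbacks to stay inside the allowlist as well.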


⚡ Quick Start

The Fastest Way: Let an Agent Do It

[Image: Agent-assisted install for RunDiffusion multi-agent platform]

Point an agent at this repo, have it install the matching skill, and it handles the entire deployment (configuration, secrets, tenant creation, and verification), then hands you back the URLs and credentials.

Step 1. Tell your agent which setup you want:

| I want... | Have your agent install this skill |
| --- | --- |
| Multiple agents on a shared host with Traefik routing (recommended) | $rundiffusion-host-agent-manager |
| One agent on this machine or one remote host | $rundiffusion-standalone-agent-manager |

Step 2. Give your agent a prompt:

Install the skill from /skills/rundiffusion-host-agent-manager, then use it to
create me a RunDiffusion multi-tenant cluster locally with one tenant called
"Chad Smith" using slug "csmith-1234". Configure the first tenant, deploy it,
and return the tenant URLs and credentials.

More example prompts:
  • Install the skill from /skills/rundiffusion-standalone-agent-manager, then use it to create me a RunDiffusion Agent locally. Use the keys in services/rundiffusion-agents/.env. Name the deployment "My Agent". Return the /openclaw/ URL, the /dashboard/ URL, the operator credentials, and the OpenClaw gateway token.
  • Install the skill from /skills/rundiffusion-standalone-agent-manager, then use it to create me a RunDiffusion Agent locally. Use the keys in <path-to-env-file>. Name the deployment "My Agent". Keep native OpenClaw auth enabled and return the actual reachable URLs plus the generated credentials.
  • Install the skill from /skills/rundiffusion-standalone-agent-manager, then use it to create me a remote single-tenant RunDiffusion Agent. Use HTTPS or Cloudflare Tunnel as needed for native /openclaw, tell me which values you can infer from the repo, and tell me exactly what host or DNS inputs you still need from me.
  • Install the skill from /skills/rundiffusion-host-agent-manager, then use it to create me a RunDiffusion multi-tenant cluster with one tenant called "Chad Smith" using slug "csmith-1234", and help me get Cloudflare Tunnel set up. Use the repo root host stack, configure the tenant, deploy it, and tell me which Cloudflare values or DNS steps still need my input.

Step 3. Keep using the same agent + skill for updates, redeploys, tenant changes, health checks, and troubleshooting.

Both skills inspect first, ask second: they check the working directory and repo state, infer your intent when possible, and only ask clarifying questions when the state is ambiguous.


Manual Setup

Prerequisites

Single-tenant only requires Docker with Compose support.

Multi-tenant also requires: bash, curl, jq, yq, openssl, and optionally python3.
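
Before installing anything, it can help to see which of these tools are already on your PATH. The following is a minimal sketch, not a script shipped with the repo: `missing_tools` is a hypothetical helper name, and the check only confirms that each command exists, not that its version is suitable or that Docker has Compose support.

```shell
# Hypothetical helper, not part of the repo.
# Prints each named tool that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Required tools for the multi-tenant path; python3 is optional, so it is left out.
missing=$(missing_tools docker bash curl jq yq openssl)
if [ -n "$missing" ]; then
  printf 'Missing prerequisites:\n%s\n' "$missing"
else
  echo "All required tools found."
fi
```

If anything is reported missing, the per-platform install commands below cover it.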

We highly recommend getting a Gemini API key (or using a Codex account) to give your agents some "gas" to get moving immediately.

macOS (Homebrew)

brew install jq yq openssl
brew install --cask docker

Or run the all-in-one helper:

./scripts/bootstrap-mac-mini.sh

Ubuntu / Debian

sudo apt-get update
sudo apt-get install -y docker.io docker-compose-v2 curl jq yq openssl python3

Fedora

sudo dnf install -y docker-cli docker-compose curl jq yq openssl python3

Arch Linux

sudo pacman -S --needed docker docker-compose curl jq yq openssl python

Windows (WSL2)
  1. Install Docker Desktop on Windows.
  2. Enable WSL2 integration for your Linux distro in Docker Desktop.
  3. Inside the WSL distro, install the Linux prerequisites above for your distro.

Multi-Tenant (Recommended)

Use this when you want the full shared-host architecture: Traefik ingress, per-tenant isolation, and the ability to scale to many agents.

  1. Configure the Environment

    cp .env.example .env

    Edit the root .env file with your host paths, ingress mode, and shared settings.

  2. Create the local tenant registry

    cp deploy/tenants/tenants.example.yml deploy/tenants/tenants.yml

    If you skip this, the deploy scripts create deploy/tenants/tenants.yml automatically from the example.

  3. Create your first tenant

    ./scripts/create-tenant.sh tenant-a "Tenant A"

    This generates the tenant env file for you at ${TENANT_ENV_ROOT}/tenant-a.env. Start from that generated file rather than inventing a new contract.

  4. Edit the tenant env file outside git

    Set the tenant hostname, auth, and provider keys. Keep OPENCLAW_CONTROL_UI_ALLOWED_ORIGINS equal to the exact browser origin. For vanilla native /openclaw, that browser origin should be HTTPS unless you are on localhost.

  5. Deploy the host stack

    ./scripts/deploy.sh

    This brings up shared ingress and deploys all enabled tenants.

  6. Verify health

    ./scripts/status.sh
    ./scripts/smoke-test.sh --all

Multi-Tenant LAN & Remote Notes
  • INGRESS_MODE=local is for local multi-tenant and LAN/private-network installs.
  • INGRESS_MODE=direct is for remote multi-tenant hosts where you already have DNS and HTTPS handled outside the repo.
  • INGRESS_MODE=cloudflare is for remote multi-tenant hosts published through Cloudflare Tunnel.
  • On plain HTTP LAN hostnames, /dashboard, /terminal, /filebrowser, /hermes, /codex, /claude, and /gemini work cleanly, but vanilla native /openclaw still needs HTTPS or localhost.

TLS automation for private-hostname LAN installs is still outside the scope of this release.
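
The HTTPS-or-localhost rule for native /openclaw can be expressed as a small check to run against the value you plan to put in OPENCLAW_CONTROL_UI_ALLOWED_ORIGINS. This is a sketch under stated assumptions: `origin_ok_for_native_openclaw` is a hypothetical helper, "localhost" is taken to mean the `localhost` and `127.0.0.1` hosts, and `http://agents.lan` is a made-up LAN hostname used only for illustration.

```shell
# Hypothetical helper, not part of the repo.
# Succeeds when the origin is HTTPS, or plain HTTP on localhost / 127.0.0.1.
origin_ok_for_native_openclaw() {
  case $1 in
    https://*) return 0 ;;
    http://localhost|http://localhost:*) return 0 ;;
    http://127.0.0.1|http://127.0.0.1:*) return 0 ;;
    *) return 1 ;;
  esac
}

for origin in https://agent.example.com http://localhost:8080 http://agents.lan; do
  if origin_ok_for_native_openclaw "$origin"; then
    echo "$origin: fine for native /openclaw"
  else
    echo "$origin: native /openclaw needs HTTPS here"
  fi
done
```

An origin that fails this check can still serve the other routes over plain HTTP, as noted in the list above.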


Single-Tenant

Use services/rundiffusion-agents when you want one agent package without Traefik.

  1. Open the standalone package

    cd services/rundiffusion-agents
  2. Copy the standalone env template

    cp .env.example .env
  3. Edit the standalone env

    Set at least:

    • OPENCLAW_ACCESS_MODE=native
    • OPENCLAW_GATEWAY_TOKEN=<long-random-secret>
    • TERMINAL_BASIC_AUTH_USERNAME=<username>
    • TERMINAL_BASIC_AUTH_PASSWORD=<strong-password>
    • OPENCLAW_CONTROL_UI_ALLOWED_ORIGINS=http://127.0.0.1:8080,http://localhost:8080 for localhost
    • OPENCLAW_CONTROL_UI_ALLOWED_ORIGINS=https://agent.example.com for a remote HTTPS hostname
  4. Start the package

    docker compose up -d --build
  5. Open the service

    • Dashboard: http://127.0.0.1:8080/dashboard for localhost
    • OpenClaw: http://127.0.0.1:8080/openclaw for localhost

For remote single-tenant installs, see docs/standalone-host-quickstart.md for the full localhost, remote DNS, and remote Cloudflare flows.
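
The `<long-random-secret>` and `<strong-password>` placeholders in step 3 can be filled with any sufficiently random values. One hedged option is sketched below: `gen_secret` is a hypothetical helper, the 32-byte length is an assumption rather than a documented requirement, and the fallback to `/dev/urandom` is only there for machines without openssl.

```shell
# Hypothetical helper, not part of the repo.
# Prints 32 random bytes as 64 lowercase hex characters.
gen_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback: read 32 bytes from the kernel RNG and hex-encode them.
    od -An -N32 -tx1 /dev/urandom | tr -d ' \n'
    echo
  fi
}

# Print lines that can be pasted into the standalone .env file.
printf 'OPENCLAW_GATEWAY_TOKEN=%s\n' "$(gen_secret)"
printf 'TERMINAL_BASIC_AUTH_PASSWORD=%s\n' "$(gen_secret)"
```

Generate the values on the host and keep the resulting .env file out of git, in line with the secret-handling guidance earlier in this README.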


🚀 How We Use It at RunDiffusion

This is running in production. Here is what it does for us:

  • Content Creation: Our content team has their agents hooked up to our blogging platform. The vast majority of our articles on RunDiffusion Image and RunDiffusion Video are produced agentically. This includes dynamic image generation and intelligent cross-site linking. Switching to our agent farm has saved our content team over 8 hours of work per week, allowing them to focus on strategy rather than formatting.

  • Development & Code Review: Our devs use the agents to check code at night. The agents scan commits, run static analysis, flag potential bugs, and even draft pull requests by the time the team wakes up. It's like having an indefatigable senior engineer reviewing your work 24/7.

  • Sales & Marketing: Our teams use agents to help draft highly contextualized replies to prospects. We never send automated responses to clients, because the human touch is crucial, but we use agents to draft the reply so that everyone gets answered promptly and accurately.

  • Automated QA: Our QA team uses agents hooked into Playwright to autonomously write and execute tests across our platforms. The agents interact with the DOM, test user flows, and report back on breaking changes before they hit production.

  • Project Management: We have our agents hooked into Monday.com to monitor the health of tasks and keep projects moving seamlessly. One agent's specific job is to scan the board for any tasks that have been stale for 3 days and gently ping the assignees for updates, completely eliminating the need for manual "just checking in" messages.

  • Design Review: We imported the rules of Refactoring UI into an agent's context to have it automatically check design decisions and mockups submitted by the team. It acts as an automated design linter, ensuring our UI consistency stays top-notch.

Hardware: All of this runs on a single 2024 Mac Mini M4 with 16 GB of RAM, which handles about 6 to 8 agents comfortably.


πŸ—οΈ Architecture

A strong operator agent sits above the deployment and manages the rest of the fleet. This is what makes RunDiffusion Agents an agent orchestration platform, not just another AI tool.

┌─────────────────────────────────────────────────────┐
│                  Operator Agent                     │
│          (GPT-5.4 / Claude Opus / Gemini)           │
└──────────────────────┬──────────────────────────────┘
                       │  reads / writes
                       ▼
         ┌───────────────────────────────┐
         │    control-plane.yml          │
         │  versions · models · secrets  │
         │  routes · agents · policy     │
         └──────────────┬────────────────┘
                        │  applied at deploy
                        ▼
              ┌───────────────────┐
              │     Traefik       │
              │   (edge router)   │
              └───┬───┬───┬───┬───┘
                  │   │   │   │
         ┌────┐ ┌────┐ ┌────┐ ┌────┐
         │ T1 │ │ T2 │ │ T3 │ │ T4 │   ← isolated Docker containers
         └────┘ └────┘ └────┘ └────┘
          each: OpenClaw · Codex · Claude · Gemini · Hermes
                Terminal · FileBrowser · Dashboard

At a glance:

  • Traefik routes traffic to the right tenant and tool at the edge
  • Each tenant runs in its own isolated Docker container
  • A host-side control-plane YAML centrally manages version pins, route flags, model policy, and secrets
  • Provisioning scripts handle create, deploy, update, rollback, backup, restore, smoke tests, and health checks
  • Cloudflare Tunnel support for secure remote access without exposing the host
  • Cloudflare Access can be added for additional authentication and gateway security
  • Each agent gets its own stateful workspace: dashboard, terminal, and file browser in one place

See the deployment matrix for all four deployment shapes and the configuration guide for the control-plane override surface.

Agent-Managed Operations

Instead of manually editing containers and env files, point a capable agent at this repo and give it high-level tasks:

Create a new agent:

> Create me a new agent for David Smith.

Done. I created tenant "dsmith-7821", generated the env file, provisioned data
directories, and assigned the default OpenClaw version. Here are the URLs,
operator credentials, and gateway token. Want me to deploy and run smoke tests?

Upgrade the whole cluster:

> Update all agents to the latest OpenClaw and run the tests.

Done. Updated the host default version, redeployed each tenant in order, and ran
smoke tests after each rollout. All tenants are aligned. Full report attached.

More agent operation examples:

Copy skills between agents:

> Copy the skills from Dave's Marketing agent to Tyler's Marketing agent.

Done. Synced the missing marketing skills from Dave's agent into Tyler's,
preserved Tyler's local customizations, and verified file paths. Tyler now has
the same baseline skill pack as Dave.

Audit secrets across the fleet:

> Check all deployments for secrets and tell me what should move to the control plane.

Done. Scanned all tenant env files and host config. Grouped findings into:
shared secrets that belong in the control plane, tenant-specific credentials,
duplicated values, and keys that should be rotated before centralization.
Ready to generate a migration plan when you are.
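
For a rough manual analogue of the "duplicated values" part of that audit, the sketch below lists variables that carry the same value in more than one env file. It is an illustration only, not what the agent or the bundled skills actually run: `duplicated_keys` is a hypothetical helper, the demo file names and contents are invented, and the function prints key names only so that secret values never reach the terminal.

```shell
# Hypothetical helper, not part of the repo.
# Prints the names (never the values) of variables whose identical,
# non-empty KEY=value line appears in more than one of the given env files.
duplicated_keys() {
  for env_file in "$@"; do
    # De-duplicate within each file so only cross-file repeats are counted.
    grep -E '^[A-Za-z_][A-Za-z0-9_]*=.+' "$env_file" | sort -u
  done | sort | uniq -d | cut -d= -f1 | sort -u
}

# Demo with two throwaway env files (contents are made up).
demo_dir=$(mktemp -d)
printf 'SHARED_KEY=abc123\nTENANT_ONLY=one\n' > "$demo_dir/tenant-a.env"
printf 'SHARED_KEY=abc123\nTENANT_ONLY=two\n' > "$demo_dir/tenant-b.env"
duplicated_keys "$demo_dir"/*.env   # prints SHARED_KEY
rm -rf "$demo_dir"
```

Against a real host you would pass the generated tenant env files instead, for example `duplicated_keys "${TENANT_ENV_ROOT}"/*.env`. Note that a quoted empty value such as `KEY=""` still counts as a value in this sketch.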

Recommended Operator Models

| Model | Best For | Notes |
| --- | --- | --- |
| OpenAI GPT-5.4 | Top-end agentic operations, coding, long-running infra tasks | Highest capability vs cost |
| Claude Opus 4.6 | Most capable Claude-class operator | Strong reasoning |
| Claude Sonnet 4.5 | Fast, highly capable day-to-day operations | Best speed/capability ratio |
| Gemini 3 Flash (preview) | Budget-conscious operations | Our pick for capability per dollar |

📸 Showcase

Each tool runs on its own dedicated route, so you can pop any app out of the dashboard for a full-screen experience.

OpenClaw, Codex, Claude & Gemini

Run your favorite models side-by-side in fully featured, stateful environments.

[Screenshots: OpenClaw (AI agent orchestration interface) · Codex (OpenAI agent terminal) · Claude Code (Anthropic agent terminal) · Gemini CLI (Google agent terminal)]

Hermes: Delegated Tasks

Delegate complex, asynchronous tasks to sub-agents and let Hermes manage the execution.

[Screenshot: Hermes, delegated task execution for multi-agent platform]

Integrated Terminal

Separate, fully integrated terminal sessions for each application with multiple modes for scrolling, copying, and pasting.

[Screenshots: Integrated terminal for self-hosted AI agents · Terminal help modal]

Quantum Filebrowser

A file browser for managing documents and adding secrets securely. (Proxy-authenticated FileBrowser users are provisioned with full operator permissions automatically, including upload, edit, delete, sharing, API, and realtime access.)

[Screenshot: Filebrowser, secure file management for agent fleet]

Utilities & Dashboard

Your entire agent farm is secured by Basic Auth, with robust device pairing for OpenClaw. The utilities section features custom scripts like approve-device and gateway recovery helpers to keep your farm healthy.

[Screenshot: Utilities panel, agent fleet management dashboard]


💡 Good to Know

  • Terminal Quirks: The integrated terminal has multiple modes for scrolling, copying, and pasting. Read the help modal inside the terminal for full details.
  • Filebrowser Permissions: Proxy-authenticated FileBrowser users receive full operator permissions automatically when they first sign in, and existing proxy users are reconciled on startup so no manual settings changes are required.
  • Hardware: For an optimal experience balancing performance and cost, a Mac Mini M4 (16 GB RAM) runs about 6 to 8 agents comfortably.

📚 Documentation

| Guide | Description |
| --- | --- |
| Deployment Matrix | Choose the right deployment shape for your environment |
| Standalone Host Quickstart | Single-tenant from zero to running |
| Multi-Tenant Deployment | Shared host stack with Traefik & ingress modes |
| Linux Host Quickstart | Cloud VM and Linux server deployment |
| Configuration Guide | All four config layers and precedence rules |
| Tenant Operations Runbook | Day-to-day operational tasks |
| Operator Runbook | OpenClaw gateway operator responsibilities |
| Release Checklist | Release process and verification |

🤝 Trust & Community

RunDiffusion has been in business since 2022. Our team of DevOps engineers brings over 30 combined years of experience in software systems architecture. This codebase is free from malicious code, malware, or any devious intent, and you are welcome to audit every line.

Join our Discord with 35k+ members. Come build with us, get help, and talk with like-minded AI enthusiasts.

Find us: LinkedIn · X · GitHub · Discord · rundiffusion.com

Designate an Agent-Wise Internal Champion. To succeed, your team needs someone to run and maintain the farm: checking agent health daily, monitoring secrets inside containers, reviewing errors and usage, and managing the entire deployment using the bundled multi-tenant host manager skill or standalone manager skill.

Use At Your Own Risk. This is bleeding-edge software. You are responsible for security, secrets, data protection, access control, compliance, third-party API usage, costs, and any impact in your environment. See DISCLAIMER.md.


License

RunDiffusion Agents is licensed under Apache-2.0. See LICENSE, NOTICE, TRADEMARKS.md, DISCLAIMER.md, and the engineering inventory in docs/license-audit.md.

This repository can bundle or launch third-party tools such as OpenClaw, Codex, Claude Code, Gemini CLI, Hermes, Traefik, and FileBrowser. The Apache-2.0 license applies to RunDiffusion's code in this repo only. It does not relicense those third-party tools or grant rights to the separate APIs and hosted services they may use.



If you are a team or enterprise and want help deploying governed AI agents at scale, contact us.

RunDiffusion Agents is an open-source agent platform for multi-agent orchestration, AI agent orchestration, self-hosted AI agents, agentic AI, and agent fleet management: a complete agent orchestration platform and multi-agent platform for teams and enterprises.

About

Your Agent Farm control plane for OpenClaw, Codex, Claude, Gemini, Hermes, and the recovery tools that keep them healthy.
