"Can human craft rival the work of Nature herself?" — King Mu of Zhou, on witnessing Yan Shi's automaton
Vyane (偃) is named after Yan Shi (偃师), the legendary artisan who crafted the first recorded automaton circa 1000 BCE. Just as Yan Shi orchestrated mechanical parts into a living whole, Vyane orchestrates multiple AI models into a unified, intelligent system.
Cross-platform multi-model AI collaboration server. Dispatch, broadcast, and orchestrate tasks across Codex CLI, Gemini CLI, Claude Code CLI, Ollama, DashScope (Qwen/Kimi/MiniMax), and external A2A agents through a unified MCP + CLI interface — with smart routing v4, exponential retry, cost tracking, and multi-agent collaboration.
Different AI models have different strengths:
| Model | Strengths |
|---|---|
| Codex (GPT) | Code generation, algorithms, debugging |
| Gemini | Frontend/UI, multimodal, broad knowledge |
| Claude | Architecture, reasoning, code review |
| Ollama | Free local inference (DeepSeek, Llama, Qwen, etc.) |
| DashScope | Chinese models (Qwen, Kimi, MiniMax, GLM) |
Vyane lets any MCP-compatible platform orchestrate tasks across all of them — getting the best of each, with automatic failover, cost tracking, and true multi-agent collaboration.
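The smart-routing idea can be sketched as first-match keyword rules with caller exclusion. This is a simplified illustration, not Vyane's actual implementation (the real scorer also weighs history, benchmarks, and user feedback):

```python
from typing import Optional

# Hypothetical rule table; Vyane loads real rules from profiles.toml.
RULES = [
    {"provider": "gemini", "keywords": ["frontend", "react", "css"]},
    {"provider": "claude", "keywords": ["security", "architecture"]},
]
DEFAULT = "codex"

def route(task: str, exclude: Optional[str] = None) -> str:
    """Pick a provider by keyword match, skipping the calling CLI."""
    text = task.lower()
    for rule in RULES:
        if rule["provider"] != exclude and any(k in text for k in rule["keywords"]):
            return rule["provider"]
    return DEFAULT if DEFAULT != exclude else "gemini"

print(route("Audit the security of this module"))          # claude
print(route("Build a React frontend", exclude="gemini"))   # codex
```

The `exclude` parameter mirrors the auto-exclude-caller behavior: when Claude Code is the MCP client, routing avoids dispatching back to Claude.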
MCP Client (Claude Code / Codex CLI / Gemini CLI / IDE)
│
└── Vyane (MCP server, stdio)
├── mux_dispatch → single provider (auto-route, failover, retry)
├── mux_broadcast → parallel multi-provider + comparison
├── mux_collaborate → iterative multi-agent collaboration (A2A)
├── mux_workflow → multi-step pipeline chains
├── mux_feedback → user quality ratings (drives routing)
├── mux_history → analytics, cost tracking
└── mux_check → availability & config status
│
├── CodexAdapter → codex exec --json
├── GeminiAdapter → gemini -p -o stream-json
├── ClaudeAdapter → claude -p
├── OllamaAdapter → ollama run <model>
├── DashScopeAdapter → OpenAI-compatible API
├── A2ARemoteAdapter → external A2A agents (httpx)
└── Custom Adapters → user-defined plugins
CLI (`vyane` dispatch / broadcast)
└── Same adapters + smart routing, JSON output for scripts & CI
A2A HTTP Server (`vyane` a2a-server)
├── GET /.well-known/agent.json → Agent Card
├── POST / (JSON-RPC 2.0)
│ ├── tasks/send → synchronous task
│ ├── tasks/get → query task state
│ ├── tasks/cancel → cancel running task
│ └── tasks/sendSubscribe → SSE streaming
└── TaskStore (in-memory + JSONL persistence)
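The TaskStore's "in-memory + JSONL persistence" design might look roughly like this sketch (class and field names are assumptions, not Vyane's actual code):

```python
import json
import os
import tempfile

class TaskStore:
    """Keeps task state in memory and appends every update to a JSONL log."""

    def __init__(self, path: str):
        self.path = path
        self.tasks: dict = {}

    def put(self, task_id: str, state: str, payload=None) -> None:
        record = {"id": task_id, "state": state, "payload": payload}
        self.tasks[task_id] = record          # in-memory view: latest state wins
        with open(self.path, "a") as f:       # append-only log survives restarts
            f.write(json.dumps(record) + "\n")

    def get(self, task_id: str):
        return self.tasks.get(task_id)

path = os.path.join(tempfile.mkdtemp(), "tasks.jsonl")
store = TaskStore(path)
store.put("t1", "working")
store.put("t1", "completed", {"output": "done"})
print(store.get("t1")["state"])  # completed
```

An append-only log keeps writes cheap; on startup, replaying the JSONL file rebuilds the in-memory map.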
- Python 3.10+
- uv package manager
- At least one model CLI installed:
- `codex` — `npm i -g @openai/codex`
- `gemini` — `npm i -g @google/gemini-cli`
- `claude` — Claude Code
- `ollama` — Ollama
# Recommended: install as MCP server for Claude Code
claude mcp add vyane -s user -- uvx vyane
# Or install for all platforms
git clone https://github.com/pure-maple/vyane.git
cd vyane && ./install.sh --all
# Quick availability check
vyane check

Manual installation for other platforms
# Codex CLI (~/.codex/config.toml)
[mcp_servers.vyane]
command = "uvx"
args = ["vyane"]
# Gemini CLI (~/.gemini/settings.json)
{"mcpServers": {"vyane": {"command": "uvx", "args": ["vyane"]}}}# Smart routing (auto-excludes caller, picks best model for the task)
mux_dispatch(provider="auto", task="Implement a binary search tree")
# Explicit provider
mux_dispatch(provider="codex", task="Fix the memory leak in pool.py",
workdir="/path/to/project", sandbox="write")
# Specific model + multi-turn session
r1 = mux_dispatch(provider="codex", model="gpt-5.4", task="Analyze this codebase")
r2 = mux_dispatch(provider="codex", task="Fix the bug you found",
session_id=r1.session_id)
# Local model via Ollama
mux_dispatch(provider="ollama", model="deepseek-r1", task="Explain this algorithm")# Send to all available providers simultaneously
mux_broadcast(task="Review this API design for security issues")
# Specific providers with structured comparison
mux_broadcast(
task="Suggest the best data structure for this use case",
providers=["codex", "gemini", "claude"],
compare=True # adds similarity scores, speed ranking
)

# Review loop: implement → review → revise until approved
mux_collaborate(
task="Implement a rate limiter with sliding window",
pattern="review" # codex builds, claude reviews, iterate
)
# Consensus: parallel analysis + synthesis
mux_collaborate(task="Evaluate our migration strategy", pattern="consensus")
# Debate: advocate vs critic + arbiter verdict
mux_collaborate(task="Should we use microservices?", pattern="debate")# List available workflows
mux_workflow(workflow="", task="", list_workflows=True)
# Run a built-in or custom workflow
mux_workflow(workflow="review", task="Optimize the database queries")# Recent dispatches
mux_history(limit=20)
# Statistics with cost breakdown
mux_history(stats_only=True, costs=True)
# Filter by provider and time range
mux_history(provider="codex", hours=24, costs=True)Token usage is automatically extracted from Codex and Gemini responses. Cost estimation uses configurable per-model pricing.
mux_check()
# Returns: provider availability, caller detection, active profile,
# policy summary, audit stats, active dispatches

Expose Vyane as an A2A protocol agent over HTTP:
# Start with default settings
vyane a2a-server
# Custom port + authentication
vyane a2a-server --port 8080 --token my-secret --sandbox write

Other A2A-compatible agents can discover and interact with Vyane via:
- `GET /.well-known/agent.json` — Agent Card (capabilities, skills)
- `POST /` — JSON-RPC 2.0 (`tasks/send`, `tasks/get`, `tasks/cancel`, `tasks/sendSubscribe`)
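A `tasks/send` request built by hand looks roughly like this; the `params` shape shown is an assumption, so consult the A2A specification for the exact schema:

```python
import uuid

def make_send_request(text: str) -> dict:
    """Build a JSON-RPC 2.0 envelope for an A2A tasks/send call."""
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),          # request id
        "method": "tasks/send",
        "params": {
            "id": str(uuid.uuid4()),      # task id, reused for tasks/get polling
            "message": {
                "role": "user",
                "parts": [{"type": "text", "text": text}],
            },
        },
    }

req = make_send_request("Review this PR")
print(req["method"])  # tasks/send
```

An agent would POST this body to Vyane's `/` endpoint (e.g. with httpx) and then poll `tasks/get` with the same task id, or use `tasks/sendSubscribe` for SSE streaming instead.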
Register external agents in your config to use them as providers:
# ~/.config/vyane/profiles.toml
[a2a_agents.my-agent]
url = "http://localhost:8080"
token = "secret"
pattern = "code-review"Then dispatch to them like any other provider:
mux_dispatch(provider="my-agent", task="Review this PR")# Server modes
vyane # Start MCP server (stdio)
vyane a2a-server # Start A2A HTTP server
vyane dashboard # Web monitoring dashboard (http://127.0.0.1:41521)
# Direct task execution (JSON output, for scripts & CI)
vyane dispatch "Review this code" # auto-route
vyane dispatch -p codex -m gpt-5.4 "Fix the bug" # explicit provider
vyane dispatch -p gemini --max-retries 3 "Analyze" # with retry
vyane dispatch --failover "Fix this bug" # auto-failover
cat diff.txt | vyane dispatch -p auto # pipe from stdin
vyane broadcast "Review this API" --providers codex gemini # parallel
vyane broadcast --compare "Best data structure?" # with analysis
# Feedback & management
vyane feedback --run-id abc --provider codex --rating 5 # rate results
vyane feedback --list # view ratings
vyane check # Check CLI availability
vyane check --json # JSON output for CI
vyane status -w # Live dispatch monitor
vyane history --stats --costs # Statistics with cost breakdown
vyane history --source cli-dispatch # filter by source
vyane benchmark # Run provider benchmark suite
vyane export --format csv # Export to CSV/JSON/Markdown
# Setup
vyane init # Interactive configuration wizard
vyane config # TUI configuration panel (requires vyane[tui])
vyane version          # Show version

Create ~/.config/vyane/profiles.toml (user) or .vyane/profiles.toml (project):
# Custom routing rules
[routing]
default_provider = "codex"
[[routing.rules]]
provider = "gemini"
[routing.rules.match]
keywords = ["frontend", "react", "css"]
[[routing.rules]]
provider = "claude"
[routing.rules.match]
keywords = ["security", "architecture"]
# Caller detection
caller_override = ""
auto_exclude_caller = true
# Named profiles
[profiles.budget]
description = "Use cheaper models"
[profiles.budget.providers.codex]
model = "gpt-4.1-mini"
[profiles.budget.providers.gemini]
model = "gemini-2.5-flash"Create ~/.config/vyane/policy.json:
{
"allowed_providers": [],
"blocked_providers": ["gemini"],
"blocked_sandboxes": ["full"],
"max_timeout": 600,
"max_calls_per_hour": 30,
"max_calls_per_day": 200
}

All results follow the canonical schema:
{
"run_id": "a1b2c3d4",
"provider": "codex",
"status": "success",
"summary": "First 200 chars...",
"output": "Full model response",
"session_id": "uuid-for-multi-turn",
"duration_seconds": 12.5,
"token_usage": {
"input_tokens": 1200,
"output_tokens": 340,
"total_tokens": 1540
},
"routed_from": "auto",
"caller_excluded": "claude"
}

token_usage is included when the provider returns token data (Codex, Gemini). Cost estimation is available via mux_history(costs=True).
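Consumers can map results into a small typed wrapper; a sketch using the field names from the schema above (the class itself is illustrative, not part of Vyane):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DispatchResult:
    run_id: str
    provider: str
    status: str
    output: str
    session_id: Optional[str] = None
    total_tokens: int = 0

    @classmethod
    def from_dict(cls, d: dict) -> "DispatchResult":
        usage = d.get("token_usage") or {}   # absent for providers without token data
        return cls(
            run_id=d["run_id"],
            provider=d["provider"],
            status=d["status"],
            output=d.get("output", ""),
            session_id=d.get("session_id"),
            total_tokens=usage.get("total_tokens", 0),
        )

raw = {"run_id": "a1b2c3d4", "provider": "codex", "status": "success",
       "output": "Full model response",
       "token_usage": {"input_tokens": 1200, "output_tokens": 340,
                       "total_tokens": 1540}}
r = DispatchResult.from_dict(raw)
print(r.total_tokens)  # 1540
```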
| Feature | Description |
|---|---|
| Smart Routing v4 | Keyword + history + benchmark + user feedback scoring |
| Failover + Retry | Exponential backoff retry, then auto-failover to next provider |
| CLI Dispatch | vyane dispatch / broadcast for scripts, CI, and pipelines |
| Profiles | Named configs for model/API overrides (budget, china, etc.) |
| Multi-turn | Session continuity via native CLI session IDs |
| Broadcast | Parallel dispatch to multiple providers with comparison |
| Collaboration | Iterative multi-agent patterns (review, consensus, debate) |
| Workflows | Multi-step pipeline chains with variable substitution |
| Cost Tracking | Token usage extraction + per-model cost estimation |
| A2A Protocol | HTTP server + client for agent-to-agent interop |
| User Feedback | Rate results 1-5 to improve routing quality over time |
| Web Dashboard | Real-time monitoring with charts and feedback panel |
| Policy Engine | Rate limits, provider/sandbox blocking |
| Custom Plugins | User-defined adapters + A2A remote agents via config |
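The failover + retry behavior described in the table can be shown in miniature; the backoff base and provider ordering here are assumptions for illustration:

```python
import time

def dispatch_with_failover(task, providers, call, retries=3, base_delay=0.01):
    """Try each provider with exponential-backoff retries; fail over on exhaustion."""
    for provider in providers:
        for attempt in range(retries):
            try:
                return provider, call(provider, task)
            except RuntimeError:
                time.sleep(base_delay * (2 ** attempt))  # 0.01s, 0.02s, 0.04s
    raise RuntimeError("all providers failed")

# Demo with a fake caller: codex always fails, gemini succeeds.
def fake_call(provider, task):
    if provider == "codex":
        raise RuntimeError("codex unavailable")
    return f"{provider} handled: {task}"

winner, out = dispatch_with_failover("Fix this bug", ["codex", "gemini"], fake_call)
print(winner)  # gemini
```

Exhausting retries on one provider before moving to the next keeps transient failures (rate limits, timeouts) from triggering an unnecessary model switch.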
Architecture jointly designed through multi-model consultation:
- Claude Opus 4.6 — original proposal and synthesis
- GPT-5.3-Codex — recommended unified hub, OPA policy engine
- Gemini-3.1-Pro-Preview — recommended A2A protocol backbone
Key consensus: one unified MCP hub (not 3 bridges), MCP-first (no Bash permissions needed), canonical output schema, session continuity, code sovereignty.
See consultation records for full design discussion transcripts.
MIT