Model comparison pipeline for nexus-agents. Evaluates how different AI models and voting strategies handle the same task.
delegate_to_model → create_expert → execute_expert → consensus_vote (×5 strategies)
- Delegate a task to find the optimal model
- Create a specialized expert agent
- Execute the expert on the task
- Vote on the result using all 5 consensus strategies:
  - `simple_majority`: 50%+ agreement
  - `supermajority`: 67%+ agreement
  - `unanimous`: 100% agreement
  - `proof_of_learning`: evidence-based consensus
  - `higher_order`: Bayesian-optimal aggregation
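The three threshold-based strategies can be illustrated with a minimal sketch. This is not the `consensus_vote` implementation, which is not shown here. The names `ThresholdStrategy`, `agreementRatio`, and `passes` are hypothetical, and treating each percentage as an inclusive lower bound (agreement at or above the threshold passes) is an assumption. `proof_of_learning` and `higher_order` are not modeled because their rules are not described in this README.

```typescript
// Hypothetical sketch of the threshold-based strategies only.
// Assumption: a vote passes when the agreement ratio is >= the threshold.
type ThresholdStrategy = 'simple_majority' | 'supermajority' | 'unanimous';

const THRESHOLDS: Record<ThresholdStrategy, number> = {
  simple_majority: 0.5, // 50%+ agreement
  supermajority: 0.67, // 67%+ agreement
  unanimous: 1.0, // 100% agreement
};

// Fraction of voters that approved; an empty vote list counts as 0 agreement.
function agreementRatio(votes: boolean[]): number {
  if (votes.length === 0) return 0;
  const approvals = votes.filter((v) => v).length;
  return approvals / votes.length;
}

// True when the observed agreement meets the strategy's threshold.
function passes(strategy: ThresholdStrategy, votes: boolean[]): boolean {
  return agreementRatio(votes) >= THRESHOLDS[strategy];
}

// Three of four voters approve, so agreement is 0.75.
const votes = [true, true, true, false];
```

With these votes, `simple_majority` and `supermajority` pass while `unanimous` does not, which is the kind of divergence the showdown is meant to surface.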
Install, test, and build:

```bash
pnpm install
pnpm test       # Run 54 unit tests
pnpm typecheck  # TypeScript strict mode
pnpm build      # Compile to dist/
```

Programmatic usage:

```ts
import { runShowdown, generateReport } from 'model-showdown';
import type { ToolCaller } from 'model-showdown';

// mcpClient is your own MCP client instance (not defined in this snippet).
const caller: ToolCaller = {
  call: async (tool, args) => await mcpClient.callTool(tool, args),
};

const result = await runShowdown(caller, {
  task: 'Implement a rate limiter with sliding window',
});
console.log(generateReport(result, 'markdown'));
```

Live mode:

```bash
# 1. Create src/live-bridge.ts with your MCP client
# 2. Run:
NEXUS_LIVE=true npx tsx src/run-live.ts
# Custom task:
NEXUS_LIVE=true SHOWDOWN_TASK="Build a REST API" npx tsx src/run-live.ts
```

| Tool | Purpose | Safety |
|---|---|---|
| `delegate_to_model` | Route task to optimal model | Read-only routing |
| `list_experts` | List available expert types | Read-only discovery |
| `create_expert` | Create specialized agent | Stateful but ephemeral |
| `execute_expert` | Run expert on task | Bounded by timeout |
| `consensus_vote` | Multi-model consensus | 5 strategies tested |
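
The order in which the pipeline calls these tools can be sketched with a stub caller that records tool names instead of contacting a server. Everything here is illustrative: the local `ToolCaller` interface only mirrors the shape used in the usage snippet, and `makeRecordingCaller`, `sketchPipeline`, and the argument objects (`{ task }`, `{ strategy }`) are hypothetical, not the package's actual request payloads.

```typescript
// Local stand-in for the package's ToolCaller type (assumed shape).
interface ToolCaller {
  call: (tool: string, args: Record<string, unknown>) => Promise<unknown>;
}

const STRATEGIES = [
  'simple_majority',
  'supermajority',
  'unanimous',
  'proof_of_learning',
  'higher_order',
] as const;

// Hypothetical stub: appends each tool name to `log` and echoes the request.
function makeRecordingCaller(log: string[]): ToolCaller {
  return {
    call: async (tool, args) => {
      log.push(tool);
      return { tool, args };
    },
  };
}

// Sketch of the call order: delegate, create, execute, then one vote per strategy.
// The argument objects are placeholders, not the real tool schemas.
async function sketchPipeline(caller: ToolCaller, task: string): Promise<unknown[]> {
  await caller.call('delegate_to_model', { task });
  await caller.call('create_expert', { task });
  await caller.call('execute_expert', { task });
  const votes: unknown[] = [];
  for (const strategy of STRATEGIES) {
    votes.push(await caller.call('consensus_vote', { strategy }));
  }
  return votes;
}
```

A stub like this keeps a run deterministic and offline; a live run would pass a `ToolCaller` backed by a real MCP client instead.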
- markdown — Strategy comparison table with agreement scores
- json — Full showdown result as structured JSON
- text — Compact terminal output with per-strategy results
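
To make the difference between the three formats concrete, here is a small formatter sketch. It is not the package's `generateReport`, whose output layout is not shown in this README. The `StrategyResult` shape and the `renderReport` function are hypothetical and exist only for illustration.

```typescript
type ReportFormat = 'markdown' | 'json' | 'text';

// Hypothetical per-strategy result; the real showdown result shape may differ.
interface StrategyResult {
  strategy: string;
  agreement: number; // fraction between 0 and 1
  passed: boolean;
}

// Format a fraction such as 0.75 as a whole-number percentage string.
function percent(fraction: number): string {
  return `${(fraction * 100).toFixed(0)}%`;
}

function renderReport(results: StrategyResult[], format: ReportFormat): string {
  switch (format) {
    case 'json':
      // Structured output for downstream tooling.
      return JSON.stringify(results, null, 2);
    case 'markdown': {
      // One table row per strategy.
      const header = '| Strategy | Agreement | Passed |\n|---|---|---|';
      const rows = results.map(
        (r) => `| ${r.strategy} | ${percent(r.agreement)} | ${r.passed ? 'yes' : 'no'} |`,
      );
      return [header, ...rows].join('\n');
    }
    case 'text':
      // One compact line per strategy for terminal output.
      return results
        .map((r) => `${r.strategy}: ${percent(r.agreement)} ${r.passed ? 'PASS' : 'FAIL'}`)
        .join('\n');
  }
}

const sample: StrategyResult[] = [
  { strategy: 'simple_majority', agreement: 0.75, passed: true },
  { strategy: 'unanimous', agreement: 0.75, passed: false },
];
```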
```
src/
  types.ts              # Zod schemas matching live MCP responses
  fixtures/
    mock-responses.ts   # Mock data for all tools + 5 strategy variants
  pipeline.ts           # Showdown pipeline (delegate → create → execute → vote×5)
  reporter.ts           # Report formatter with strategy comparison
  live-caller.ts        # Live mode ToolCaller bridge
  run-live.ts           # CLI entry point for live integration testing
  index.ts              # Public API exports
```
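
The live-mode commands set two environment variables, `NEXUS_LIVE` and `SHOWDOWN_TASK`. A CLI entry point could read them roughly as follows. This is a sketch, not the contents of `run-live.ts`: the `readLiveConfig` helper, the `LiveConfig` shape, and the fallback task (borrowed from the usage example) are all assumptions.

```typescript
// Hypothetical config shape for a live run.
interface LiveConfig {
  live: boolean;
  task: string;
}

// Assumed fallback; the real default task, if any, is not documented here.
const DEFAULT_TASK = 'Implement a rate limiter with sliding window';

// Takes the environment as a parameter so it can be exercised without touching process.env.
function readLiveConfig(env: Record<string, string | undefined>): LiveConfig {
  return {
    // Assumption: only the exact string "true" enables live mode.
    live: env.NEXUS_LIVE === 'true',
    task: env.SHOWDOWN_TASK ?? DEFAULT_TASK,
  };
}
```

In a Node entry point this would be called as `readLiveConfig(process.env)`.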
License: MIT