model-showdown

Model comparison pipeline for nexus-agents. Evaluates how different AI models and voting strategies handle the same task.

Pipeline

delegate_to_model → create_expert → execute_expert → consensus_vote (×5 strategies)

1. Delegate a task to find the optimal model
2. Create a specialized expert agent
3. Execute the expert on the task
4. Vote on the result using all 5 consensus strategies:
- simple_majority — 50%+ agreement
- supermajority — 67%+ agreement
- unanimous — 100% agreement
- proof_of_learning — evidence-based consensus
- higher_order — Bayesian-optimal aggregation
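
The three threshold-based strategies above can be sketched as a single check. This is an illustration only, not the nexus-agents implementation: `passes` and `ThresholdStrategy` are hypothetical names, and reading "50%+" as a strict majority and "67%+" as at least 67% is an assumption. `proof_of_learning` and `higher_order` are left out because their rules are not described here.

```typescript
// Hypothetical names; only the strategy labels and percentages come from the list above.
type ThresholdStrategy = 'simple_majority' | 'supermajority' | 'unanimous';

function passes(strategy: ThresholdStrategy, approvals: number, total: number): boolean {
  if (total <= 0) return false; // no votes cast, so no consensus
  const ratio = approvals / total;
  switch (strategy) {
    case 'simple_majority':
      return ratio > 0.5; // assumption: strictly more than half
    case 'supermajority':
      return ratio >= 0.67; // assumption: at least 67%
    case 'unanimous':
      return approvals === total;
  }
}

// 3 of 4 voters approve (75%).
const outcomes = (['simple_majority', 'supermajority', 'unanimous'] as const).map(
  (s) => `${s}: ${passes(s, 3, 4)}`,
);
console.log(outcomes.join(', '));
// prints "simple_majority: true, supermajority: true, unanimous: false"
```

Under these assumptions a 2-of-3 vote (about 66.7%) clears `simple_majority` but not `supermajority`.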

Quick start

pnpm install
pnpm test        # Run 54 unit tests
pnpm typecheck   # TypeScript strict mode
pnpm build       # Compile to dist/

Usage as a library

import { runShowdown, generateReport } from 'model-showdown';
import type { ToolCaller } from 'model-showdown';

// mcpClient stands for your own MCP client instance.
const caller: ToolCaller = {
  call: async (tool, args) => await mcpClient.callTool(tool, args),
};

const result = await runShowdown(caller, {
  task: 'Implement a rate limiter with sliding window',
});

console.log(generateReport(result, 'markdown'));
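
To try the library without a live MCP server, a stub caller with canned responses may be enough. The sketch below assumes `ToolCaller` is an object with a single async `call(tool, args)` method, as the snippet above suggests; the local `ToolCaller` interface, `createStubCaller`, and the payloads are hypothetical stand-ins, and the real type and response shapes may differ.

```typescript
// Local stand-in for the package's ToolCaller type (assumption: the real one may differ).
interface ToolCaller {
  call: (tool: string, args: Record<string, unknown>) => Promise<unknown>;
}

interface RecordedCall {
  tool: string;
  args: Record<string, unknown>;
}

// Builds a caller that returns canned responses and remembers every call it receives.
function createStubCaller(responses: Record<string, unknown>): {
  caller: ToolCaller;
  calls: RecordedCall[];
} {
  const calls: RecordedCall[] = [];
  const caller: ToolCaller = {
    call: async (tool, args) => {
      calls.push({ tool, args });
      if (!(tool in responses)) {
        throw new Error(`No canned response for tool: ${tool}`);
      }
      return responses[tool];
    },
  };
  return { caller, calls };
}

// Usage: the payload is a placeholder, not a real MCP response.
const { caller: stubCaller, calls: recorded } = createStubCaller({
  delegate_to_model: { model: 'example-model' },
});
stubCaller.call('delegate_to_model', { task: 'demo' }).then((res) => {
  console.log(res, recorded.length);
});
```

Recording the calls makes it easy to check that tools are invoked in the expected order.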

Live integration mode

# 1. Create src/live-bridge.ts with your MCP client
# 2. Run:
NEXUS_LIVE=true npx tsx src/run-live.ts

# Custom task:
NEXUS_LIVE=true SHOWDOWN_TASK="Build a REST API" npx tsx src/run-live.ts
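
For orientation, here is one way an entry point could interpret the two environment variables used above. This is a guess at the shape, not the contents of `src/run-live.ts`: `readLiveConfig`, `LiveConfig`, and `FALLBACK_TASK` are hypothetical, and the real default task is not documented here.

```typescript
interface LiveConfig {
  live: boolean;
  task: string;
}

// Hypothetical placeholder; the actual default task is unknown.
const FALLBACK_TASK = 'Describe the task to evaluate';

// Takes the environment as a plain object so it can be exercised without Node globals.
function readLiveConfig(env: Record<string, string | undefined>): LiveConfig {
  const task = env['SHOWDOWN_TASK']?.trim();
  return {
    live: env['NEXUS_LIVE'] === 'true', // assumption: only the literal string "true" enables live mode
    task: task ? task : FALLBACK_TASK,
  };
}

console.log(readLiveConfig({ NEXUS_LIVE: 'true', SHOWDOWN_TASK: 'Build a REST API' }));
```

In a Node script the argument would typically be `process.env`.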

MCP tools covered

| Tool | Purpose | Safety |
| --- | --- | --- |
| `delegate_to_model` | Route task to optimal model | Read-only routing |
| `list_experts` | List available expert types | Read-only discovery |
| `create_expert` | Create specialized agent | Stateful but ephemeral |
| `execute_expert` | Run expert on task | Bounded by timeout |
| `consensus_vote` | Multi-model consensus | 5 strategies tested |
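
The table notes that `execute_expert` is bounded by a timeout but does not say how. One common client-side approach is to race the call against a timer, sketched below. `withTimeout` is a hypothetical helper, not part of this package, and it only stops waiting; it does not cancel the underlying work.

```typescript
// Rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => {
        clearTimeout(timer); // settled in time, so cancel the pending rejection
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Usage: a 50 ms task with a 10 ms budget is reported as timed out.
const slowTask = new Promise<string>((resolve) => setTimeout(() => resolve('done'), 50));
withTimeout(slowTask, 10, 'execute_expert').catch((e: Error) => console.log(e.message));
```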

Report formats

markdown — Strategy comparison table with agreement scores
json — Full showdown result as structured JSON
text — Compact terminal output with per-strategy results
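
As a rough picture of how the three formats differ, the sketch below renders the same per-strategy rows three ways. It is not the code in `reporter.ts`: `StrategyRow`, `renderRows`, and the column names are hypothetical, and the real `generateReport` output may look different.

```typescript
type ReportFormat = 'markdown' | 'json' | 'text';

// Hypothetical per-strategy row; not the package's real result type.
interface StrategyRow {
  strategy: string;
  approved: boolean;
  agreement: number; // fraction between 0 and 1
}

function renderRows(rows: StrategyRow[], format: ReportFormat): string {
  const pct = (x: number): string => `${Math.round(x * 100)}%`;
  switch (format) {
    case 'json':
      return JSON.stringify(rows, null, 2);
    case 'text':
      return rows
        .map((r) => `${r.strategy}: ${r.approved ? 'pass' : 'fail'} (${pct(r.agreement)})`)
        .join('\n');
    case 'markdown':
      return [
        '| Strategy | Approved | Agreement |',
        '| --- | --- | --- |',
        ...rows.map(
          (r) => `| ${r.strategy} | ${r.approved ? 'yes' : 'no'} | ${pct(r.agreement)} |`,
        ),
      ].join('\n');
  }
}

const sampleRows: StrategyRow[] = [
  { strategy: 'simple_majority', approved: true, agreement: 0.75 },
  { strategy: 'unanimous', approved: false, agreement: 0.75 },
];
console.log(renderRows(sampleRows, 'text'));
```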

Project structure

src/
  types.ts              # Zod schemas matching live MCP responses
  fixtures/
    mock-responses.ts   # Mock data for all tools + 5 strategy variants
  pipeline.ts           # Showdown pipeline (delegate → create → execute → vote×5)
  reporter.ts           # Report formatter with strategy comparison
  live-caller.ts        # Live mode ToolCaller bridge
  run-live.ts           # CLI entry point for live integration testing
  index.ts              # Public API exports

License

MIT
