A donation-funded benchmark platform for comparing AI models on Bitcoin-related tasks. Users donate BSV to fund benchmark runs, and results are published transparently.
The project consists of two main parts:
- bench: CLI tool for running benchmarks against 40+ AI models
- visualizer: Next.js web app for viewing results and donating to fund benchmarks
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   FUNDING   │ ──▶ │   PENDING   │ ──▶ │   RUNNING   │ ──▶ │  COMPLETED  │
│             │     │             │     │             │     │             │
│ Users donate│     │ Goal reached│     │ Admin runs  │     │ Results     │
│ BSV to suite│     │ awaiting run│     │ benchmarks  │     │ published   │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
- Fund: Users donate BSV to test suite addresses via the visualizer
- Pending: When funding goal is reached, suite enters pending state
- Run: Admin checks funding status and runs benchmarks locally
- Publish: Results are committed to repo and deployed to visualizer
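
The four stages above can be read as a small state machine. The following is a minimal sketch of that reading, not the project's actual code: the names `SuiteStatus`, `SuiteState`, `SuiteEvent`, and `nextStatus` are hypothetical, and the rule that a suite becomes pending as soon as its balance covers the goal is an assumption based on the description above.

```typescript
// Hypothetical names for illustration; the real types live in the repo.
type SuiteStatus = "funding" | "pending" | "running" | "completed";
type SuiteEvent = "donation" | "run_started" | "results_published";

interface SuiteState {
  status: SuiteStatus;
  balanceUsd: number; // donations received so far, in USD
  goalUsd: number; // funding goal for one benchmark run, in USD
}

// Return the status that follows `event`. Events that do not apply in the
// current state leave the status unchanged.
function nextStatus(suite: SuiteState, event: SuiteEvent): SuiteStatus {
  switch (suite.status) {
    case "funding":
      // Assumption: the suite leaves funding once donations cover the goal.
      return event === "donation" && suite.balanceUsd >= suite.goalUsd
        ? "pending"
        : "funding";
    case "pending":
      return event === "run_started" ? "running" : "pending";
    case "running":
      return event === "results_published" ? "completed" : "running";
    case "completed":
      return "completed";
  }
}
```

Keeping the transition logic in one pure function makes it easy to check that no stage can be skipped, for example that a suite cannot jump from funding straight to running.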
Each test suite has a unique donation address derived using Type 42 key derivation:
- Master WIF → Suite ID → Deterministic BSV address
- Same master key always produces the same addresses
- The suite ID, not the suite name, is used as the invoice number, so addresses survive renames
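
To illustrate why keying the derivation on the suite ID keeps addresses stable, here is a minimal sketch. It deliberately does not implement Type 42 itself: the derivation is passed in as a function, and `DeriveAddress`, `Suite`, `donationAddress`, and `placeholderDerive` are hypothetical names. The placeholder returns a fake string, not a BSV address; the real derivation is the one in `visualizer/lib/addresses.ts`.

```typescript
// Hypothetical signature standing in for the real Type 42 derivation.
type DeriveAddress = (masterWif: string, invoiceNumber: string) => string;

interface Suite {
  id: string; // stable identifier, used as the invoice number
  name: string; // display name, free to change
}

// The address depends only on the master key and the suite ID.
function donationAddress(
  suite: Suite,
  masterWif: string,
  derive: DeriveAddress
): string {
  return derive(masterWif, suite.id);
}

// Placeholder for demonstration ONLY: not Type 42, not a real address.
const placeholderDerive: DeriveAddress = (_wif, invoice) =>
  `address-for:${invoice}`;

const original: Suite = { id: "suite-001", name: "Bitcoin Parsing" };
const renamed: Suite = { ...original, name: "Transaction Parsing" };

// Renaming the suite does not change the derived address.
console.log(
  donationAddress(original, "example-wif", placeholderDerive) ===
    donationAddress(renamed, "example-wif", placeholderDerive)
); // logs true
```

Had the name been used as the invoice number, renaming a suite would move it to a new address and strand any earlier donations.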
| Suite | Description | Est. Cost |
|---|---|---|
| Bitcoin SPV & Data Protocols | OP_RETURN, Ordinals, Runes, BRC-20 | ~$30 |
| Bitcoin Script & Transactions | Bitcoin Script, SegWit, Taproot | ~$30 |
| Bitcoin Libraries | @bsv/sdk, bitcoinjs-lib, etc. | ~$30 |
| Bitcoin Parsing | Transaction and block parsing | ~$30 |
| Protocol Parsing | MAP, AIP, B protocol parsing | ~$35 |
| sCrypt Smart Contracts | sCrypt language and tooling | ~$30 |
| Stratum Mining Protocol | Mining pool protocol | ~$30 |
| Type 42 Key Derivation | BRC-42 key derivation | ~$30 |
- Bun runtime
- OpenRouter API key (all models routed through OpenRouter)
- Master WIF for donation address derivation
cd bench
bun install
cp .env.example .env # Add your API keys
# Interactive CLI with funding checks
bun run cli
# Quick funding status check
bun run funding
# Force run (bypass funding check)
bun run cli --force

The CLI provides:
- Main Menu: Choose between viewing funding status or running benchmarks
- Funding Status Table: See all suites with their funding progress
- Unfunded Protection: Warning when selecting unfunded suites
- Real-time Progress: Live benchmark progress with model stats
- Force Flag: --force bypasses funding requirements
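
The interaction between unfunded protection and the force flag can be sketched as a small decision function. This is an illustration under assumptions, not the CLI's actual code: `hasForceFlag` and `decideRun` are hypothetical names, and the only behavior taken from this README is that unfunded suites trigger a warning and that --force bypasses the funding check.

```typescript
type RunDecision = "run" | "warn";

// True when --force appears among the command-line arguments.
function hasForceFlag(argv: string[]): boolean {
  return argv.includes("--force");
}

// Funded suites run directly. Unfunded suites run only with --force;
// otherwise the caller should show a warning and ask for confirmation.
function decideRun(funded: boolean, force: boolean): RunDecision {
  return funded || force ? "run" : "warn";
}

// Example: an unfunded suite selected without --force.
console.log(decideRun(false, hasForceFlag(["cli"]))); // logs "warn"
```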
cd visualizer
bun install
bun dev

Visualizer environment variables:
KV_REST_API_URL= # Upstash Redis REST URL
KV_REST_API_TOKEN= # Upstash Redis REST token
MASTER_WIF= # Master private key for address derivation

Bench environment variables:
OPENROUTER_API_KEY= # All models routed through OpenRouter
MASTER_WIF= # Same master key as visualizer for address matching
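
Since both parts fail in confusing ways when a variable is missing, a startup check can help. The following is a minimal sketch, assuming the variable lists above are complete; `missingVars`, `VISUALIZER_VARS`, and `BENCH_VARS` are hypothetical names and not part of the repo.

```typescript
type Env = Record<string, string | undefined>;

// Variable names taken from the lists above.
const VISUALIZER_VARS = ["KV_REST_API_URL", "KV_REST_API_TOKEN", "MASTER_WIF"];
const BENCH_VARS = ["OPENROUTER_API_KEY", "MASTER_WIF"];

// Return the names of required variables that are unset or blank.
function missingVars(env: Env, required: string[]): string[] {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}

// Example with an incomplete bench environment.
const exampleEnv: Env = { OPENROUTER_API_KEY: "sk-example" };
console.log(missingVars(exampleEnv, BENCH_VARS)); // logs [ "MASTER_WIF" ]
```

In a real entry point this would presumably be called with `process.env` and abort with a clear message when the returned list is non-empty.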
- Wallet Integration: Yours Wallet Provider for BSV donations
- Address Derivation: Type 42 using @bsv/sdk (one address per suite)
- Data Storage: Upstash Redis for suites, donations, and run metadata
- Balance Checking: WhatsOnChain API
- Benchmark Results: Static JSON committed to repo
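
As an illustration of the balance-checking step, the sketch below builds a WhatsOnChain balance URL, validates the response, and converts satoshis to USD. Several things here are assumptions to verify against the WhatsOnChain documentation and the repo: the endpoint path, the `{ confirmed, unconfirmed }` response shape (in satoshis), and the source of the exchange rate, which is simply passed in as `usdPerBsv`. The function names are hypothetical.

```typescript
// Assumed response shape of the balance endpoint, values in satoshis.
interface WocBalance {
  confirmed: number;
  unconfirmed: number;
}

const SATS_PER_BSV = 100000000;

// Assumed endpoint path for a mainnet address balance.
function balanceUrl(address: string): string {
  const encoded = encodeURIComponent(address);
  return `https://api.whatsonchain.com/v1/bsv/main/address/${encoded}/balance`;
}

// Validate an untyped JSON body before trusting its fields.
function parseBalance(body: unknown): WocBalance {
  const b = body as Partial<WocBalance> | null;
  if (!b || typeof b.confirmed !== "number" || typeof b.unconfirmed !== "number") {
    throw new Error("Unexpected balance response");
  }
  return { confirmed: b.confirmed, unconfirmed: b.unconfirmed };
}

// Convert satoshis to USD using a caller-supplied exchange rate.
function satsToUsd(sats: number, usdPerBsv: number): number {
  return (sats / SATS_PER_BSV) * usdPerBsv;
}
```

Under Bun, a caller could fetch `balanceUrl(address)`, pass the parsed JSON to `parseBalance`, and compare `satsToUsd(...)` against the suite's funding goal. Whether unconfirmed funds should count toward the goal is a policy choice this sketch leaves open.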
bitbench/
├── bench/
│ ├── cli.tsx # Interactive CLI with Ink
│ ├── index.ts # Benchmark engine
│ ├── funding.ts # Funding status module
│ ├── check-funding.ts # Quick funding check script
│ ├── constants.ts # Model definitions
│ └── tests/ # Test suite JSON files
├── visualizer/
│ ├── app/ # Next.js pages
│ ├── components/ # React components
│ ├── lib/ # Business logic
│ │ ├── addresses.ts # Type 42 derivation
│ │ ├── suites.ts # Suite management
│ │ ├── kv.ts # Redis helpers
│ │ └── types.ts # TypeScript types
│ └── data/
│ └── benchmark-results.json
└── README.md
cd bench
bun run funding

This shows a table of all suites with their:
- Donation address
- Current balance (USD)
- Funding goal
- Progress bar
- Funded status (✓/✗)
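
The progress bar and funded status columns can be sketched as follows. This is an illustrative rendering, not the output format of the actual script: `fundingCell`, the bar characters, and the default width are all assumptions.

```typescript
// Render a text progress bar, a percentage, and a funded mark for one suite.
function fundingCell(balanceUsd: number, goalUsd: number, width = 20): string {
  // Clamp progress to the range [0, 1]; a zero goal counts as fully funded.
  const raw = goalUsd > 0 ? balanceUsd / goalUsd : 1;
  const ratio = Math.max(0, Math.min(raw, 1));
  const filled = Math.round(ratio * width);
  const bar = "█".repeat(filled) + "░".repeat(width - filled);
  const mark = balanceUsd >= goalUsd ? "✓" : "✗";
  return `${bar} ${Math.round(ratio * 100)}% ${mark}`;
}

console.log(fundingCell(15, 30, 10)); // logs "█████░░░░░ 50% ✗"
console.log(fundingCell(30, 30, 10)); // logs "██████████ 100% ✓"
```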
bun run cli

- Select "🚀 Run Benchmark"
- Choose a funded suite (marked with ✓)
- Enter version label (defaults to YYYY-MM-DD)
- Watch live progress
If you select an unfunded suite, you'll see a warning with the option to proceed anyway.
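
The version label default mentioned in the steps above could be produced as in this minimal sketch. The function names are hypothetical, and the use of the UTC date is an assumption; the actual CLI may use local time.

```typescript
// Format a date as YYYY-MM-DD using its UTC components.
function defaultVersionLabel(now: Date = new Date()): string {
  return now.toISOString().slice(0, 10);
}

// Use the entered label when present, otherwise fall back to the date.
function resolveVersionLabel(input: string, now: Date = new Date()): string {
  const trimmed = input.trim();
  return trimmed === "" ? defaultVersionLabel(now) : trimmed;
}

const fixed = new Date("2024-03-05T12:00:00Z");
console.log(resolveVersionLabel("", fixed)); // logs "2024-03-05"
console.log(resolveVersionLabel("v2-rerun", fixed)); // logs "v2-rerun"
```

Date-based labels sort chronologically as plain strings, which keeps result directories such as bench/results/[suite-id]/[version]/ in order.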
After a benchmark run:
# Copy results to visualizer
cp bench/results/[suite-id]/[version]/results.json visualizer/data/benchmark-results.json
# Commit and push
git add .
git commit -m "Add [suite-name] benchmark results [version]"
git push

Results are automatically deployed to the visualizer on push.
Contributions welcome. Please open an issue first to discuss proposed changes.
MIT