Five elite LLMs. One ruthless trading engine. Zero emotional decisions.
╔══════════════════════════════════════════════════════════╗
║ Claude 4.5 · GPT-5.4 · Gemini 3.1 · Grok · DS ║
║ → Ensemble Consensus → Kelly Sizing → ║
║ Risk-Managed Execution → Profit ║
╚══════════════════════════════════════════════════════════╝
Most "AI trading bots" are a single LLM prompted to gamble. This one is a trading firm in a terminal.
It runs five specialized AI agents in parallel — a forecaster, a news analyst, a bull researcher, a bear researcher, and a risk manager — then lets them debate, aggregates their probabilities with confidence-weighted consensus, and only pulls the trigger when disagreement is low and the edge is real. Every decision is logged, every model's calibration is tracked, and every dollar is sized with the Kelly Criterion inside hard position and daily-loss limits.
If that sounds like overkill for a prediction market bot — that's the point.
| Category | What You Get |
|---|---|
| Multi-Model Ensemble | Claude Sonnet 4.5 · GPT-5.4 · Gemini 3.1 Pro · DeepSeek V3.2 · Grok 4.1 — all orchestrated via OpenRouter with per-model health tracking and automatic failover |
| Agent Debate | Bull vs. Bear researchers argue the thesis; a Risk Manager has veto power. Disagreement above threshold automatically penalizes confidence. |
| Kelly Sizing | Fractional Kelly (default 25%) position sizing with hard caps on single-position, daily loss, and total open positions |
| Safe Compounder | NO-side edge compounding strategy for asymmetric, high-probability trades — runs in dry-run by default |
| Market Making | Optional spread-capture mode with inventory risk limits and automatic order refresh |
| Quick-Flip Scalping | Short-horizon opportunistic strategy for high-liquidity markets |
| Category Scoring | Continuously learns which market categories (sports, economics, politics, etc.) your bot is actually good at — and leans in |
| News & Sentiment | RSS aggregation from Reuters, NYT, BBC + LLM-scored sentiment & relevance feeding every decision |
| Real-Time Data | WebSocket streaming from Kalshi keeps market prices fresh without hammering the REST API |
| Hard Cost Guardrails | Daily AI spend cap (default $10/day) enforced at the router level — the bot literally refuses to call an LLM when the budget is out |
| Paper Trading First | Full simulation mode with its own tracker and dashboard so you can battle-test strategies risk-free |
| Rich CLI | run, dashboard, status, scores, history, safe-compounder, health — everything you need, nothing you don't |
| Type-Safe Core | Strict TypeScript + Zod validation + Vitest tests on the parts that actually matter (ensemble, portfolio optimization, JSON repair, DB, category scoring) |
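The daily AI spend cap from the table can be pictured as a small guard that sits in front of every model call. What follows is a minimal sketch, not the project's actual `ModelRouter`: the class name `DailyBudgetGuard`, its methods, and the UTC-day rollover are all assumptions made for illustration.

```typescript
// Hypothetical sketch of a daily LLM spend cap. Names and rollover policy
// are illustrative assumptions, not the project's real ModelRouter API.
class DailyBudgetGuard {
  private readonly limitUsd: number;
  private readonly now: () => Date;
  private spentUsd = 0;
  private day: string;

  constructor(limitUsd: number, now: () => Date = () => new Date()) {
    this.limitUsd = limitUsd;
    this.now = now;
    this.day = this.today();
  }

  // UTC calendar day, e.g. "2025-01-31".
  private today(): string {
    return this.now().toISOString().slice(0, 10);
  }

  // Reset the running total when the UTC day changes.
  private rollOver(): void {
    const today = this.today();
    if (today !== this.day) {
      this.day = today;
      this.spentUsd = 0;
    }
  }

  /** True if a call with this estimated cost still fits in today's budget. */
  canSpend(estimatedUsd: number): boolean {
    this.rollOver();
    return this.spentUsd + estimatedUsd <= this.limitUsd;
  }

  /** Record what a completed call actually cost. */
  record(actualUsd: number): void {
    this.rollOver();
    this.spentUsd += actualUsd;
  }

  /** Budget left for the current UTC day, never negative. */
  remaining(): number {
    this.rollOver();
    return Math.max(0, this.limitUsd - this.spentUsd);
  }
}
```

A caller would check `canSpend(estimate)` before each request and skip the market when it returns `false`, which matches the "refuse rather than fly blind" behavior described above.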
┌────────────────────────────────┐
│ KALSHI MARKET + NEWS │
│ (REST + WebSocket + RSS) │
└──────────────┬─────────────────┘
│
┌────────────────┼────────────────┐
▼ ▼ ▼
┌────────────┐ ┌────────────┐ ┌────────────┐
│ Forecaster │ │News Analyst│ │Risk Manager│
│ (0.30 w) │ │ (0.20 w) │ │ (0.15 w) │
└─────┬──────┘ └─────┬──────┘ └─────┬──────┘
│ ┌───────────┼───────────┐ │
│ ▼ │ ▼ │
│ ┌──────────┐ │ ┌──────────┐ │
│ │ Bull │◄─┼─►│ Bear │ │
│ │Researcher│ │ │Researcher│ │
│ │ (0.20 w) │ │ │ (0.15 w) │ │
│ └────┬─────┘ │ └────┬─────┘ │
│ └─── DEBATE ─────┘ │
└───────────────┬───────────────┘
▼
┌──────────────────────────┐
│ ENSEMBLE CONSENSUS │
│ weighted · calibrated │
│ disagreement-penalized │
└────────────┬─────────────┘
▼
┌──────────────────────────────────────┐
│ Kelly Sizing → Position Limits │
│ Stop-Loss → Daily Loss Cap │
└────────────────────┬─────────────────┘
▼
┌────────────────┐
│ Paper or Live │
│ Execution │
└────────────────┘
Every iteration: ingest → analyze → decide → execute → track → evaluate. Every model call is metered. Every trade is logged to SQLite. Every position has a dynamic exit.
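To make the consensus step concrete, here is a rough sketch of confidence-weighted aggregation with a disagreement penalty. The role weights come from the diagram above; everything else is an assumption for illustration, including the use of a weighted standard deviation as the disagreement measure, the `0.15` threshold, and the linear penalty slope. The real ensemble runner may work differently.

```typescript
type AgentRole = "forecaster" | "newsAnalyst" | "bull" | "bear" | "riskManager";

interface AgentOpinion {
  role: AgentRole;
  probability: number; // estimated P(YES), in [0, 1]
  confidence: number;  // self-reported confidence, in [0, 1]
}

interface Consensus {
  probability: number;  // weighted mean of agent probabilities
  disagreement: number; // weighted std dev of agent probabilities
  confidence: number;   // base confidence minus disagreement penalty
}

// Role weights taken from the architecture diagram (they sum to 1.0).
const ROLE_WEIGHTS: Record<AgentRole, number> = {
  forecaster: 0.3,
  newsAnalyst: 0.2,
  bull: 0.2,
  bear: 0.15,
  riskManager: 0.15,
};

// Assumed defaults: threshold and slope are illustrative, not project values.
function ensembleConsensus(
  opinions: AgentOpinion[],
  disagreementThreshold = 0.15,
  penaltySlope = 2,
): Consensus {
  if (opinions.length === 0) throw new Error("need at least one opinion");

  // Effective weight = role weight x confidence. If every agent reports
  // zero confidence, fall back to role weights so the mean stays defined.
  const allZero = opinions.every((o) => o.confidence === 0);
  const weightOf = (o: AgentOpinion): number =>
    ROLE_WEIGHTS[o.role] * (allZero ? 1 : o.confidence);

  const total = opinions.reduce((s, o) => s + weightOf(o), 0);
  const probability =
    opinions.reduce((s, o) => s + weightOf(o) * o.probability, 0) / total;

  const variance =
    opinions.reduce(
      (s, o) => s + weightOf(o) * (o.probability - probability) ** 2,
      0,
    ) / total;
  const disagreement = Math.sqrt(variance);

  // Base confidence: role-weighted mean of the agents' own confidences.
  const roleTotal = opinions.reduce((s, o) => s + ROLE_WEIGHTS[o.role], 0);
  const baseConfidence =
    opinions.reduce((s, o) => s + ROLE_WEIGHTS[o.role] * o.confidence, 0) /
    roleTotal;

  // Only disagreement above the threshold is penalized.
  const penalty = Math.max(0, disagreement - disagreementThreshold) * penaltySlope;
  return {
    probability,
    disagreement,
    confidence: Math.max(0, baseConfidence - penalty),
  };
}
```

When all five agents agree, the output confidence equals their weighted average confidence. When the bull and bear camps split sharply, the standard deviation rises past the threshold and the confidence drops, which is what keeps contested markets from reaching the sizing stage.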
- Node.js 22.5+ (uses native `--experimental-sqlite`)
- A Kalshi API key + RSA private key
- An OpenRouter API key (one key, five models)
clone the repo
cd kalshi-ai-trading-bot
npm install

Copy `env.template` → `.env` and fill in:
KALSHI_API_KEY=your_kalshi_api_key_here
KALSHI_PRIVATE_KEY_PATH=./kalshi_private_key.pem
OPENROUTER_API_KEY=your_openrouter_api_key_here
LIVE_TRADING_ENABLED=false # START HERE. Paper first, always.
DAILY_AI_COST_LIMIT=10 # hard cap in USD
LOG_LEVEL=info

npm run dev -- health

You'll see your Kalshi balance, API key status, and daily budget. If anything is red, fix it before moving on.
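For a sense of what a startup check on those `.env` values might look like, here is a dependency-free sketch. The project lists Zod for validation, so its real config loader presumably looks different; the `BotEnv` shape and the `parseEnv` function are hypothetical names used only for this example.

```typescript
// Hypothetical typed view of the .env keys listed above.
interface BotEnv {
  kalshiApiKey: string;
  kalshiPrivateKeyPath: string;
  openRouterApiKey: string;
  liveTradingEnabled: boolean;
  dailyAiCostLimit: number;
  logLevel: string;
}

// Parse and validate raw environment values; throws on the first problem.
function parseEnv(env: Record<string, string | undefined>): BotEnv {
  const required = (name: string): string => {
    const value = env[name]?.trim();
    if (!value) throw new Error(`Missing required env var: ${name}`);
    return value;
  };

  // Defaults mirror the template: $10/day cap, paper trading, info logging.
  const limit = Number(env["DAILY_AI_COST_LIMIT"] ?? "10");
  if (!Number.isFinite(limit) || limit < 0) {
    throw new Error("DAILY_AI_COST_LIMIT must be a non-negative number");
  }

  return {
    kalshiApiKey: required("KALSHI_API_KEY"),
    kalshiPrivateKeyPath: required("KALSHI_PRIVATE_KEY_PATH"),
    openRouterApiKey: required("OPENROUTER_API_KEY"),
    // Anything other than the literal string "true" keeps the bot in paper mode.
    liveTradingEnabled:
      (env["LIVE_TRADING_ENABLED"] ?? "false").trim().toLowerCase() === "true",
    dailyAiCostLimit: limit,
    logLevel: env["LOG_LEVEL"]?.trim() || "info",
  };
}
```

Treating every value except `"true"` as paper mode is a deliberate choice in this sketch: a typo in the flag fails safe instead of enabling live orders.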
npm run dev -- run --iterations 10

Watch the five agents do their thing. When you're ready for real money:

npm run dev -- run --live --daily-limit 5

Heads up: `--live` places real orders with real money. Review the safety section below first.
kalshi-bot <command> [options]
run Run the Beast Mode trading loop
--live Enable live trading (default: paper)
--daily-limit <n> Daily AI cost limit USD (default: 10)
--iterations <n> Max iterations (default: infinite)
dashboard Print paper trading dashboard
status Print current portfolio + open positions
scores Print learned category performance scores
history [--limit n] Print recent closed trades with PnL
safe-compounder Run NO-side edge compounder (dry-run by default)
--live Place real orders
health Print health diagnostics
help Show the full menu
This bot can lose you money. It's designed to minimize that — but no model is perfect. The project ships with layers of defense:
- Paper trading by default. `LIVE_TRADING_ENABLED=false` is the default. `--live` is opt-in, per invocation.
- Hard daily AI spend cap. The `ModelRouter` physically cannot exceed `DAILY_AI_COST_LIMIT`. When it's out, the bot skips trading rather than flying blind.
- Position & loss limits. Max 3% of balance per position, max 10% daily loss, max 10 open positions, all configurable in `src/config/settings.ts`.
- Ensemble consensus requirement. At least 3 models must agree before a trade is considered, and high disagreement penalizes confidence automatically.
- Minimum confidence threshold (default `0.45`) prevents coin-flip trades from ever hitting the wire.
- Minimum volume & max-expiry filters keep the bot out of illiquid or stale markets.
- Stop-loss & dynamic exits on every position, with a max-hold-time sanity timer.
- Full audit trail. Every decision, every model output, every trade: all persisted in SQLite (`trading.db`) and JSONL logs.
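The sizing and limit rules above can be sketched in a few lines. For a binary contract bought at price `c` (in dollars, paying $1 on a win) with estimated win probability `p`, the full Kelly fraction is `(p - c) / (1 - c)`. The sketch below applies the 25% fractional Kelly default and the 3% position cap from this README; the function names, input shape, and rounding policy are assumptions for illustration, not the project's actual Kelly utility.

```typescript
interface SizingInput {
  winProbability: number;       // estimated probability the chosen side wins
  price: number;                // contract price in dollars, strictly between 0 and 1
  balanceUsd: number;           // current account balance
  kellyFraction?: number;       // fraction of full Kelly to bet (README default: 0.25)
  maxPositionFraction?: number; // cap per position (README default: 0.03)
}

interface SizingResult {
  fraction: number;  // share of balance to stake after scaling and capping
  stakeUsd: number;
  contracts: number; // whole contracts affordable with the stake
}

function kellySize(input: SizingInput): SizingResult {
  const { winProbability: p, price: c, balanceUsd } = input;
  const kellyFraction = input.kellyFraction ?? 0.25;
  const maxPositionFraction = input.maxPositionFraction ?? 0.03;
  if (!(c > 0 && c < 1)) throw new RangeError("price must be in (0, 1)");
  if (!(p >= 0 && p <= 1)) throw new RangeError("winProbability must be in [0, 1]");

  // Full Kelly for a $1-payout binary contract: f* = (p - c) / (1 - c).
  const fullKelly = (p - c) / (1 - c);
  if (fullKelly <= 0 || balanceUsd <= 0) {
    return { fraction: 0, stakeUsd: 0, contracts: 0 }; // no edge, no trade
  }

  const fraction = Math.min(fullKelly * kellyFraction, maxPositionFraction);
  const stakeUsd = balanceUsd * fraction;
  // Small epsilon guards against floating-point error just below a whole number.
  const contracts = Math.floor(stakeUsd / c + 1e-9);
  return { fraction, stakeUsd, contracts };
}

// Portfolio-level gate using the README defaults: max 10 open positions and
// max 10% daily loss. Measuring loss against start-of-day balance is an assumption.
function withinRiskLimits(
  openPositions: number,
  dailyPnlUsd: number,
  startOfDayBalanceUsd: number,
  maxOpenPositions = 10,
  maxDailyLossFraction = 0.1,
): boolean {
  const lossUsd = Math.max(0, -dailyPnlUsd);
  return (
    openPositions < maxOpenPositions &&
    lossUsd < startOfDayBalanceUsd * maxDailyLossFraction
  );
}
```

As a worked case: with `p = 0.6`, `c = 0.50`, and a $1,000 balance, full Kelly is 0.20, quarter Kelly is 0.05, and the 3% cap trims the stake to $30, or 60 contracts. The cap binds before Kelly does here, which is typical when the estimated edge is large.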
The project is provided as-is for educational and research purposes. Trade at your own risk. Past paper-trading performance is not indicative of anything.
- Runtime: Node.js 22.5+ with native SQLite
- Language: TypeScript 5.6 (strict)
- LLM Gateway: OpenRouter — one key, five frontier models
- Exchange: Kalshi REST + WebSocket
- Validation: Zod
- Logging: Pino (+ pretty in dev)
- Testing: Vitest
- News: `rss-parser` against Reuters, NYT, BBC (configurable)
src/
├── agents/ Five specialized AI agents + ensemble runner + debate
├── clients/ KalshiClient, KalshiWS, OpenRouterClient, ModelRouter
├── config/ All tunables, with typed Settings and validation
├── data/ News aggregation + sentiment analysis
├── events/ In-process event bus
├── jobs/ ingest · decide · trade · track · evaluate · execute
├── paper/ Paper-trading tracker + terminal dashboard
├── strategies/ Safe Compounder · Market Making · Quick-Flip · Portfolio
│ · Category Scorer · Portfolio Enforcer · Unified System
├── utils/ DB · logger · Kelly · limits · stop-loss · JSON repair
├── beastModeBot.ts Main orchestration loop
└── cli.ts Command-line interface
tests/ Vitest suites for the critical paths
npm test # run all suites
npm run test:watch # TDD mode
npm run typecheck # strict TS, no emit
npm run lint # eslint

- Cross-market arbitrage (structural wiring already in place)
- Options-style strategies on composite markets
- Fully algorithmic VWAP / TWAP execution
- Web dashboard (the terminal one is lovely, but...)
- Post-hoc calibration re-weighting of ensemble models
- Public benchmark + anonymized paper-trading leaderboard
PRs welcome.
- Fork it.
- `npm install && npm test`: make sure the suite is green on your machine.
- Open an issue first for anything non-trivial so we can align.
- Write tests for new strategies and agents. The ensemble and portfolio code has real test coverage; let's keep it that way.
Is this guaranteed to make money?
No. Nothing is. It's a disciplined, multi-model framework that executes a strategy you configure. Markets change. Models drift. Trade small, review logs, start in paper.
Why OpenRouter instead of direct provider keys?
One key, five frontier models, built-in failover. The ModelRouter tracks per-model health and gracefully demotes flaky models until they recover.
Can I run just one model?
Yes. Disable the ensemble in src/config/settings.ts (ensemble.enabled = false) and the bot falls back to primaryModel with automatic fallback to fallbackModel.
How much does it cost to run?
You control it. The default cap is $10/day in LLM spend. At that rate the bot will happily analyze dozens to hundreds of markets per day depending on depth.
Does it support live trading out of the box?
Yes — but LIVE_TRADING_ENABLED=false by default, and you must explicitly pass --live every time. This is intentional.
MIT — do what you want, just don't blame us for the drawdowns.
Built for traders who think like engineers, and engineers who trade like traders.
Paper-trade first. Size with Kelly. Listen to the ensemble. Ship.