Polymarket Trading Agent
An autonomous multi-agent system that trades prediction markets using a 7-phase pipeline of 13 specialized AI agents.
WARNING: This project is for educational and research purposes only. Trading prediction markets involves real financial risk. This software is provided as-is with no guarantees of profitability. Do not trade with money you cannot afford to lose.
The agent scans ~5,000 live Polymarket prediction markets, filters them down to a handful of mispriced opportunities, researches each one through web search and quantitative data feeds, estimates true probabilities using an ensemble of three independent estimators, validates every trade through 11 risk management checks, and executes via the Polymarket CLOB API with quarter-Kelly position sizing.
Every decision is logged. Every trade has a full audit trail. Every loss triggers a post-mortem.
The core of the system is a 7-phase sequential pipeline that runs as a single trading cycle.
flowchart TB
subgraph phase0["Phase 0 -- Resolve"]
R[Resolve Settled Trades]
R --> R1[Mark outcomes WIN/LOSS]
R --> R2[Compute realized P&L]
R --> R3[Calculate Brier scores]
end
subgraph phase1["Phase 1 -- Scan"]
S[Market Scanner]
S --> F1["Volume > $1K"]
S --> F2["Spread < 5%"]
S --> F3["Resolves within 10 days"]
F1 & F2 & F3 --> C["5,000 markets --> 4 candidates"]
end
subgraph phase2["Phase 2 -- Research"]
direction LR
RCC1["ResearchAgentCC\n(Market A)"]
RCC2["ResearchAgentCC\n(Market B)"]
RCC3["ResearchAgentCC\n(Market C)"]
RCC4["ResearchAgentCC\n(Market D)"]
end
subgraph phase3["Phase 3 -- Estimate"]
direction LR
EB["EstimatorB\n30% weight\nResearch-informed"]
EC["EstimatorC\n40% weight\nQuantitative only"]
ED["EstimatorD\n30% weight\nAdversarial bear case"]
end
subgraph phase4["Phase 4 -- Aggregate"]
AGG["Weighted Average"]
AGG --> PLATT["Platt Calibration\n10% shrinkage toward 50%"]
PLATT --> DIS{"Disagreement\n> 35%?"}
DIS -- Yes --> REJECT[Reject Market]
DIS -- No --> PASS[Forward to Risk]
end
subgraph phase5["Phase 5 -- Risk Management"]
RM["11 Pre-Trade Checks"]
RM --> KELLY["Quarter-Kelly Sizing"]
end
subgraph phase6["Phase 6 -- CRO Approval"]
CRO["Chief Research Officer"]
CRO --> DA["Devil's Advocate Protocol"]
DA --> APPROVE{Approve?}
end
subgraph phase7["Phase 7 -- Execute"]
EXEC["Place Order via CLOB API"]
EXEC --> LOG["Full Audit Trail to SQLite"]
end
phase0 --> phase1 --> phase2 --> phase3 --> phase4
phase4 --> phase5 --> phase6
APPROVE -- Yes --> phase7
APPROVE -- No --> KILL[No Trade]
style phase0 fill:#1a1a2e,stroke:#444,color:#e0e0e0
style phase1 fill:#1a1a2e,stroke:#444,color:#e0e0e0
style phase2 fill:#1a1a2e,stroke:#444,color:#e0e0e0
style phase3 fill:#1a1a2e,stroke:#444,color:#e0e0e0
style phase4 fill:#1a1a2e,stroke:#444,color:#e0e0e0
style phase5 fill:#1a1a2e,stroke:#444,color:#e0e0e0
style phase6 fill:#1a1a2e,stroke:#444,color:#e0e0e0
style phase7 fill:#1a1a2e,stroke:#444,color:#e0e0e0
style REJECT fill:#8b0000,stroke:#444,color:#e0e0e0
style KILL fill:#8b0000,stroke:#444,color:#e0e0e0
style PASS fill:#006400,stroke:#444,color:#e0e0e0
style APPROVE fill:#1a1a2e,stroke:#444,color:#e0e0e0
- Resolve -- The orchestrator checks all open positions against Polymarket's resolution API. Settled trades are marked WIN or LOSS, realized P&L is computed, and Brier scores are logged for calibration tracking.
- Scan -- The MarketScanner pulls the full market list from Polymarket's CLOB API and applies hard filters: minimum $1K volume, spread under 5%, resolution date within 10 days. This typically cuts ~5,000 markets down to 4 candidates worth investigating.
- Research -- Each candidate market is assigned its own ResearchAgentCC instance, and the instances run in parallel. Each agent performs deep web search via Claude Code CLI, gathering news articles, data sources, and expert analysis relevant to the market's question.
- Estimate -- Three independent estimators run simultaneously on each market:
  - EstimatorB (30% weight) ingests the research and produces a probability estimate
  - EstimatorC (40% weight) ignores research entirely and uses only quantitative data feeds (sportsbook odds, crypto prices, weather ensemble forecasts)
  - EstimatorD (30% weight) argues the adversarial bear case -- its job is to find trap risks and scenarios where the obvious answer is wrong
- Aggregate -- The AggregatorAgent computes a weighted average of the three estimates, applies Platt calibration with 10% shrinkage toward 50% (correcting for overconfidence), and runs a disagreement check. If the spread between estimators exceeds 35%, the market is rejected -- high disagreement means low conviction.
- Risk -- Every surviving trade passes through 11 independent risk checks (detailed below). Position size is calculated using the quarter-Kelly criterion. Any single check failure kills the trade.
- CRO + Execution -- The Chief Research Officer agent reviews the full analysis with a devil's advocate protocol, actively looking for reasons NOT to trade. If approved, the TraderAgent places the order on Polymarket's CLOB with the calculated stake.
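The Aggregate step can be sketched roughly as follows. This is a minimal illustration, not the project's AggregatorAgent: the function name `aggregate`, the dictionary keys, the linear form of the shrinkage (simpler than a fitted Platt scaling), and measuring disagreement as highest minus lowest estimate are all assumptions. The weights, the 10% shrinkage, and the 35% threshold come from the step descriptions above.

```python
from typing import Optional

# Weights from the estimator descriptions; the keys are illustrative.
WEIGHTS = {"B": 0.30, "C": 0.40, "D": 0.30}
SHRINKAGE = 0.10          # move 10% of the way toward 0.5
MAX_DISAGREEMENT = 0.35   # reject when estimators differ by more than this


def aggregate(estimates: dict[str, float]) -> Optional[float]:
    """Return a shrunk probability, or None if the market is rejected."""
    weighted = sum(WEIGHTS[name] * p for name, p in estimates.items())
    # Assumption: shrinkage is a linear blend between the estimate and 0.5.
    calibrated = (1 - SHRINKAGE) * weighted + SHRINKAGE * 0.5
    # Assumption: disagreement is the gap between the highest and lowest estimate.
    spread = max(estimates.values()) - min(estimates.values())
    if spread > MAX_DISAGREEMENT:
        return None
    return calibrated
```

With estimates of 0.70, 0.60, and 0.50 the weighted average is 0.60, which shrinks to 0.59. Estimates of 0.90, 0.50, and 0.40 span 0.50 and would be rejected.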
| Agent | Role | Model | Phase |
|---|---|---|---|
| OrchestratorV2 | Pipeline coordinator, state machine | Sonnet | All |
| MarketScanner | Filters 5,000 markets to 4 candidates | Sonnet | 1 |
| ResearchAgentCC | Deep web search per market | Claude Code CLI | 2 |
| EstimatorB | Research-informed probability estimation | Sonnet | 3 |
| EstimatorC | Quantitative/data-driven estimation | Sonnet | 3 |
| EstimatorD | Adversarial bear case estimation | Sonnet | 3 |
| AggregatorAgent | Weighted combination + Platt calibration | Sonnet | 4 |
| RiskManager | 11 pre-trade validation checks | Sonnet | 5 |
| TraderAgent | Executes trades on Polymarket CLOB | Sonnet | 7 |
| SentimentAgent | Market sentiment analysis | Haiku | 2 |
| PositionMonitor | Portfolio tracking and exposure | Sonnet | 0 |
| ArbScanner | Cross-exchange arbitrage detection | Sonnet | 1 |
| HindsightAgent | Post-mortem analysis on losses | Sonnet | Post |
Every trade must pass all 11 checks. A single failure vetoes the trade.
| # | Check | Rule |
|---|---|---|
| 1 | Kill switch | Abort if STOP file exists in project root |
| 2 | Max drawdown | Graduated response: GREEN / YELLOW / ORANGE / RED |
| 3 | Position size | Capped at 15% of total capital |
| 4 | Daily loss limit | Halt trading if the daily loss exceeds 25% |
| 5 | Balance check | Verify sufficient USDC before order |
| 6 | Concentration | Max 2 open positions per market category |
| 7 | Correlation | Reject bets correlated with existing positions |
| 8 | Portfolio exposure | Single category capped at 40% of portfolio |
| 9 | Entry price bounds | Only trade in the 30c -- 70c range |
| 10 | Minimum edge | Category-specific threshold (see below) |
| 11 | Spread / liquidity | Reject illiquid markets |
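The "single failure vetoes the trade" rule can be pictured as a chain of small check functions. The sketch below is only an illustration of that pattern with three of the eleven checks; the `ProposedTrade` container, its field names, and the function names are hypothetical and are not taken from `risk_manager.py`. The thresholds (STOP file, 15% cap, 30c to 70c) come from the table.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List, Optional


@dataclass
class ProposedTrade:  # hypothetical container; field names are illustrative
    price: float      # entry price in dollars (0.30 means 30c)
    stake: float      # proposed stake in USDC
    capital: float    # total capital in USDC


# A check returns a veto reason, or None when the trade passes.
Check = Callable[[ProposedTrade], Optional[str]]


def kill_switch(trade: ProposedTrade, root: Path = Path(".")) -> Optional[str]:
    # Check 1: abort if a file named STOP exists in the project root.
    return "STOP file present" if (root / "STOP").exists() else None


def position_size(trade: ProposedTrade) -> Optional[str]:
    # Check 3: stake may not exceed 15% of total capital.
    return "stake above 15% of capital" if trade.stake > 0.15 * trade.capital else None


def entry_price(trade: ProposedTrade) -> Optional[str]:
    # Check 9: only trade when the entry price is between 30c and 70c.
    return None if 0.30 <= trade.price <= 0.70 else "price outside 30c-70c"


def first_veto(trade: ProposedTrade, checks: List[Check]) -> Optional[str]:
    """Run checks in order and return the first veto reason, if any."""
    for check in checks:
        reason = check(trade)
        if reason is not None:
            return reason
    return None
```

A call such as `first_veto(trade, [kill_switch, position_size, entry_price])` returns `None` only when every check passes, which mirrors the all-or-nothing rule.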
All positions use quarter-Kelly (0.25x) for conservative sizing:
YES side: kelly = edge / (1 - price) * 0.25
NO side: kelly = edge / price * 0.25
Position capped at 15% of capital regardless of Kelly output.
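As a worked illustration of the two formulas and the cap, here is a minimal sketch. The function name and signature are hypothetical, and it assumes that `price` is the market price of the YES share, that `prob` is the estimated probability of YES, and that edge is `prob - price` on the YES side and `price - prob` on the NO side. The project's `strategy.py` may define these differently.

```python
KELLY_FRACTION = 0.25   # quarter-Kelly
MAX_POSITION = 0.15     # hard cap as a fraction of capital


def quarter_kelly_stake(prob: float, price: float, capital: float, side: str = "YES") -> float:
    """Return the stake in USDC for a binary market (sketch under stated assumptions)."""
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    if side == "YES":
        edge = prob - price                      # assumed definition of edge
        fraction = edge / (1 - price) * KELLY_FRACTION
    elif side == "NO":
        edge = price - prob                      # assumed definition of edge
        fraction = edge / price * KELLY_FRACTION
    else:
        raise ValueError("side must be 'YES' or 'NO'")
    # Never bet on a negative edge, and never exceed the 15% cap.
    fraction = max(0.0, min(fraction, MAX_POSITION))
    return fraction * capital
```

For example, with an estimated probability of 0.60, a YES price of 0.50, and 1,000 USDC of capital, full Kelly is 0.10 / 0.50 = 20% of capital, quarter-Kelly is 5%, and the stake is about 50 USDC. A much larger edge would hit the 15% cap instead.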
| Category | Min Edge | Stake Multiplier | Data Source |
|---|---|---|---|
| Weather | 8% | 1.0x | Open-Meteo GFS 31-member ensemble |
| Crypto | 5% | 0.75x | Binance real-time price feeds |
| Politics | 7% | 0.5x | Polls, expert analysis |
| Economics | 5% | 0.75x | Nowcast data |
| Geopolitics | 10% | 0.5x | GDELT sentiment index |
| Sports | -- | BLOCKED | Disabled in V3 (insufficient edge) |
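One way to read the table is as a lookup applied after sizing. The sketch below assumes the stake multiplier scales a stake that has already been computed, and that any category missing from the lookup (including Sports) is blocked. The names `CATEGORY_RULES` and `adjusted_stake` are hypothetical; the real thresholds live in `config.py`.

```python
# (min_edge, stake_multiplier) per category, copied from the table above.
CATEGORY_RULES = {
    "weather": (0.08, 1.0),
    "crypto": (0.05, 0.75),
    "politics": (0.07, 0.5),
    "economics": (0.05, 0.75),
    "geopolitics": (0.10, 0.5),
    # "sports" is deliberately absent: the category is blocked.
}


def adjusted_stake(category: str, edge: float, base_stake: float) -> float:
    """Return the stake after category rules, or 0.0 if the trade is not allowed."""
    rule = CATEGORY_RULES.get(category.lower())
    if rule is None:
        return 0.0                    # blocked or unknown category (assumption)
    min_edge, multiplier = rule
    if edge < min_edge:
        return 0.0                    # edge below the category threshold
    return base_stake * multiplier
```

Under these assumptions a crypto trade with a 6% edge and a 100 USDC base stake becomes 75 USDC, while a weather trade with a 5% edge is dropped because it misses the 8% threshold.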
# Clone and set up environment
git clone https://github.com/sahnia3/polymarket-agent.git
cd polymarket-agent
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
# Configure credentials
cp .env.example .env # Fill in your API keys (see below)
# Run a single trading cycle
python scripts/run_cycle_v2.py
# Dry run (no real trades executed)
python scripts/run_cycle_v2.py --dry-run

| Variable | Description |
|---|---|
| POLYGON_WALLET_PRIVATE_KEY | Polygon wallet private key for on-chain transactions |
| ANTHROPIC_API_KEY | Anthropic API key for Claude Sonnet + Haiku |
| POLYMARKET_API_KEY | Polymarket CLOB API key |
| POLYMARKET_SECRET | Polymarket API secret |
| POLYMARKET_PASSPHRASE | Polymarket API passphrase |
| TRADE_WEBHOOK_URL | Webhook URL for trade notifications (optional) |
| APIFY_API_KEY | Apify key for web scraping (optional) |
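Before a first run it can help to confirm that the required variables are actually set. The snippet below is a small standalone sketch, not part of the repository: the helper name `missing_credentials` is hypothetical, and it only inspects the process environment, so it assumes the values from `.env` have already been exported or loaded.

```python
import os
from typing import List, Mapping

# Variables the table does not mark as optional.
REQUIRED = [
    "POLYGON_WALLET_PRIVATE_KEY",
    "ANTHROPIC_API_KEY",
    "POLYMARKET_API_KEY",
    "POLYMARKET_SECRET",
    "POLYMARKET_PASSPHRASE",
]


def missing_credentials(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing required variables:", ", ".join(missing))
    else:
        print("All required variables are set.")
```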
The agent stores a full audit trail in SQLite across 6 tables:
- trades -- Every trade placed, with entry/exit prices and P&L
- analyses -- Full analysis output per market per cycle
- portfolio_snapshots -- Point-in-time portfolio state
- agent_events -- Structured logs from every agent invocation
- agent_runs -- Metadata for each pipeline run
- agent_outputs -- Raw LLM outputs for debugging and review
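Because the audit trail is plain SQLite, it can be inspected with Python's standard `sqlite3` module. The sketch below runs against an in-memory database with made-up rows. Only the table name `trades` comes from the list above; the column names (`market`, `entry_price`, `exit_price`, `pnl`, `outcome`) are hypothetical and will likely differ from the schema in `db/models.py`.

```python
import sqlite3


def realized_pnl(conn: sqlite3.Connection) -> float:
    """Sum P&L over settled trades (rows whose outcome is set)."""
    row = conn.execute(
        "SELECT COALESCE(SUM(pnl), 0.0) FROM trades WHERE outcome IS NOT NULL"
    ).fetchone()
    return float(row[0])


# Illustrative in-memory database with a hypothetical schema and synthetic rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trades ("
    "id INTEGER PRIMARY KEY, market TEXT, entry_price REAL, "
    "exit_price REAL, pnl REAL, outcome TEXT)"
)
conn.executemany(
    "INSERT INTO trades (market, entry_price, exit_price, pnl, outcome) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("market-a", 0.40, 1.00, 60.0, "WIN"),
        ("market-b", 0.55, 0.00, -55.0, "LOSS"),
        ("market-c", 0.45, None, None, None),   # still open, ignored by the query
    ],
)
print(realized_pnl(conn))
```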
Project Structure
polymarket-agent/
├── agents/ # Multi-agent orchestration (~2,400 LOC)
│ ├── orchestrator_v2.py # Main pipeline (914 lines)
│ ├── market_scanner.py # Market filtering (348 lines)
│ ├── researcher_cc.py # Web research (382 lines)
│ ├── estimators.py # 3 probability estimators (707 lines)
│ ├── aggregator.py # Calibration + weighting (677 lines)
│ ├── risk_manager.py # 11 risk checks (282 lines)
│ ├── trader.py # Trade execution (266 lines)
│ └── ... # Sentiment, position monitor, arb scanner
├── core/ # Trading logic (~1,700 LOC)
│ ├── risk.py # Pre-trade validations (270 lines)
│ ├── strategy.py # Kelly criterion sizing (113 lines)
│ ├── market_client.py # Polymarket API wrapper (237 lines)
│ ├── data_feeds.py # GFS / Binance / GDELT integrations (625 lines)
│ └── ... # Categories, microstructure, arb calculator
├── db/ # Database & CRUD (~600 LOC)
│ ├── models.py # 6-table schema
│ └── store.py # CRUD operations
├── scripts/ # Automation (~3,800 LOC)
│ ├── run_cycle_v2.py # Entry point
│ ├── cro_report.py # Daily CRO PDF report
│ ├── hindsight_agent.py # Post-mortem analysis
│ └── ... # Portfolio, risk checks, backtesting
├── config.py # All risk parameters & thresholds
└── requirements.txt # Dependencies
Data Feed Details
Open-Meteo supplies 31-member GFS ensemble forecasts for weather markets. Each ensemble member is run from slightly different initial conditions, so together the members give a natural probability distribution over outcomes. The agent converts "will temperature exceed X" into a probability by counting the ensemble members that satisfy the condition.
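The member-counting step is simple enough to show directly. This sketch takes the per-member forecast values as a plain list and does not call any weather API; the function name is hypothetical and the sample values are synthetic.

```python
from typing import Sequence


def exceedance_probability(member_values: Sequence[float], threshold: float) -> float:
    """Fraction of ensemble members whose forecast is strictly above the threshold."""
    if not member_values:
        raise ValueError("need at least one ensemble member")
    hits = sum(1 for value in member_values if value > threshold)
    return hits / len(member_values)


# Synthetic 31-member ensemble: 20.0, 20.5, ..., 35.0 degrees.
members = [20.0 + 0.5 * i for i in range(31)]
probability = exceedance_probability(members, threshold=30.0)
```

Here 10 of the 31 synthetic members exceed 30 degrees, so the estimate is 10/31, or roughly 0.32. With only 31 members the estimate moves in steps of about 3 percentage points.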
Binance supplies real-time price feeds for crypto markets. EstimatorC uses them to anchor probability estimates to current prices and recent volatility, rather than relying on narrative-driven research.
The GDELT Project monitors news media worldwide. The agent pulls sentiment indices for geopolitical markets to gauge media tone and event intensity, providing a quantitative signal for otherwise qualitative questions.
Why three estimators instead of one? A single LLM probability estimate tends to be poorly calibrated and overconfident. By forcing three independent estimates -- one research-driven, one purely quantitative, one adversarial -- and then calibrating the aggregate, the system produces meaningfully better probabilities. The adversarial estimator (EstimatorD) is particularly important: it exists solely to find reasons the trade will fail.
Why quarter-Kelly? Full Kelly sizing maximizes long-run growth rate but produces enormous variance. Quarter-Kelly sacrifices ~50% of theoretical growth for a ~75% reduction in drawdown. For a system making bets on AI-estimated probabilities, conservative sizing is the only sane choice.
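The trade-off can be made concrete under the usual continuous-time approximation. That model is an assumption here, since the text does not say where its figures come from. With expected excess return $\mu$ and variance $\sigma^2$ per unit bet, betting a fraction $f$ of capital gives:

```latex
\begin{aligned}
g(f) &= \mu f - \tfrac{1}{2}\sigma^2 f^2, \qquad f^* = \frac{\mu}{\sigma^2}, \qquad g(f^*) = \frac{\mu^2}{2\sigma^2}, \\
g(c f^*) &= \left(2c - c^2\right) g(f^*), \\
g\!\left(\tfrac{1}{4} f^*\right) &= \tfrac{7}{16}\, g(f^*) \approx 0.44\, g(f^*), \qquad
\text{volatility} = c\,\sigma f^* = \tfrac{1}{4}\,\sigma f^*.
\end{aligned}
```

In this model quarter-Kelly keeps about 44% of the full-Kelly growth rate while cutting the volatility of returns by 75%, which is in line with the rough figures quoted above.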
Why block sports? V2 traded sports markets. Win rate was below breakeven after accounting for spread. Sports markets on Polymarket are efficiently priced by specialists with better models. The honest move was to stop.
This software is provided for educational and research purposes only.
- Prediction market trading involves substantial risk of loss
- Past performance does not guarantee future results
- The authors assume no liability for financial losses incurred through use of this software
- This is not financial advice
- You are solely responsible for any trades executed by this software
- Ensure compliance with all applicable laws and regulations in your jurisdiction before use