Polymarket Trading Agent

An autonomous multi-agent system that trades prediction markets using a 7-phase pipeline of 13 specialized AI agents.

Python 3.12 | Claude API | Polymarket | MIT License


WARNING: This project is for educational and research purposes only. Trading prediction markets involves real financial risk. This software is provided as-is with no guarantees of profitability. Do not trade with money you cannot afford to lose.


What This Does

The agent scans ~5,000 live Polymarket prediction markets and filters them down to a handful of potentially mispriced opportunities. It researches each one through web search and quantitative data feeds, estimates true probabilities using an ensemble of three independent estimators, validates every trade through 11 risk management checks, and executes via the Polymarket CLOB API with quarter-Kelly position sizing.

Every decision is logged. Every trade has a full audit trail. Every loss triggers a post-mortem.


The Pipeline

This is the core of the system -- a 7-phase sequential pipeline that runs as a single trading cycle.

flowchart TB
    subgraph phase0["Phase 0 -- Resolve"]
        R[Resolve Settled Trades]
        R --> R1[Mark outcomes WIN/LOSS]
        R --> R2[Compute realized P&L]
        R --> R3[Calculate Brier scores]
    end

    subgraph phase1["Phase 1 -- Scan"]
        S[Market Scanner]
        S --> F1["Volume > $1K"]
        S --> F2["Spread < 5%"]
        S --> F3["Resolves within 10 days"]
        F1 & F2 & F3 --> C["5,000 markets --> 4 candidates"]
    end

    subgraph phase2["Phase 2 -- Research"]
        direction LR
        RCC1["ResearchAgentCC\n(Market A)"]
        RCC2["ResearchAgentCC\n(Market B)"]
        RCC3["ResearchAgentCC\n(Market C)"]
        RCC4["ResearchAgentCC\n(Market D)"]
    end

    subgraph phase3["Phase 3 -- Estimate"]
        direction LR
        EB["EstimatorB\n30% weight\nResearch-informed"]
        EC["EstimatorC\n40% weight\nQuantitative only"]
        ED["EstimatorD\n30% weight\nAdversarial bear case"]
    end

    subgraph phase4["Phase 4 -- Aggregate"]
        AGG["Weighted Average"]
        AGG --> PLATT["Platt Calibration\n10% shrinkage toward 50%"]
        PLATT --> DIS{"Disagreement\n> 35%?"}
        DIS -- Yes --> REJECT[Reject Market]
        DIS -- No --> PASS[Forward to Risk]
    end

    subgraph phase5["Phase 5 -- Risk Management"]
        RM["11 Pre-Trade Checks"]
        RM --> KELLY["Quarter-Kelly Sizing"]
    end

    subgraph phase6["Phase 6 -- CRO Approval"]
        CRO["Chief Research Officer"]
        CRO --> DA["Devil's Advocate Protocol"]
        DA --> APPROVE{Approve?}
    end

    subgraph phase7["Phase 7 -- Execute"]
        EXEC["Place Order via CLOB API"]
        EXEC --> LOG["Full Audit Trail to SQLite"]
    end

    phase0 --> phase1 --> phase2 --> phase3 --> phase4
    phase4 --> phase5 --> phase6
    APPROVE -- Yes --> phase7
    APPROVE -- No --> KILL[No Trade]

    style phase0 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style phase1 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style phase2 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style phase3 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style phase4 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style phase5 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style phase6 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style phase7 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style REJECT fill:#8b0000,stroke:#444,color:#e0e0e0
    style KILL fill:#8b0000,stroke:#444,color:#e0e0e0
    style PASS fill:#006400,stroke:#444,color:#e0e0e0
    style APPROVE fill:#1a1a2e,stroke:#444,color:#e0e0e0

How a Single Cycle Works

  1. Resolve -- The orchestrator checks all open positions against Polymarket's resolution API. Settled trades are marked WIN or LOSS, realized P&L is computed, and Brier scores are logged for calibration tracking.

  2. Scan -- The MarketScanner pulls the full market list from Polymarket's CLOB API and applies hard filters: minimum $1K volume, spread under 5%, resolution date within 10 days. This typically cuts ~5,000 markets down to 4 candidates worth investigating.

  3. Research -- Each candidate market is assigned its own ResearchAgentCC instance, and the instances run in parallel. Each one performs deep web search via Claude Code CLI, gathering news articles, data sources, and expert analysis relevant to the market's question.

  4. Estimate -- Three independent estimators run simultaneously on each market:

    • EstimatorB (30% weight) ingests the research and produces a probability estimate
    • EstimatorC (40% weight) ignores research entirely and uses only quantitative data feeds (sportsbook odds, crypto prices, weather ensemble forecasts)
    • EstimatorD (30% weight) argues the adversarial bear case -- its job is to find trap risks and scenarios where the obvious answer is wrong

  5. Aggregate -- The AggregatorAgent computes a weighted average of the three estimates, applies Platt calibration with 10% shrinkage toward 50% (correcting for overconfidence), and runs a disagreement check. If the spread between estimators exceeds 35%, the market is rejected -- high disagreement means low conviction.

  6. Risk -- Every surviving trade passes through 11 independent risk checks (detailed below). Position size is calculated using quarter-Kelly criterion. Any single check failure kills the trade.

  7. CRO + Execution -- The Chief Research Officer agent reviews the full analysis with a devil's advocate protocol, actively looking for reasons NOT to trade. If approved, the TraderAgent places the order on Polymarket's CLOB with the calculated stake.
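The aggregation step (step 5) can be sketched as follows. This is a minimal illustration, not the repository's AggregatorAgent: it assumes the calibration reduces to a linear pull of 10% toward 0.5 and that "spread between estimators" means the highest estimate minus the lowest. The names `aggregate`, `Aggregate`, and the constants are hypothetical.

```python
from dataclasses import dataclass

# Weights as stated in the pipeline description.
WEIGHTS = {"B": 0.30, "C": 0.40, "D": 0.30}
SHRINKAGE = 0.10         # assumed: move 10% of the way toward 0.5
MAX_DISAGREEMENT = 0.35  # assumed: measured as max estimate minus min estimate


@dataclass
class Aggregate:
    probability: float   # calibrated ensemble probability
    disagreement: float  # spread between the estimators
    accepted: bool       # False means the market is rejected


def aggregate(estimates: dict[str, float]) -> Aggregate:
    """Combine probabilities in [0, 1] keyed by estimator name 'B', 'C', 'D'."""
    weighted = sum(WEIGHTS[name] * estimates[name] for name in WEIGHTS)
    # Linear shrinkage toward 0.5, standing in for the calibration step.
    calibrated = 0.5 + (1.0 - SHRINKAGE) * (weighted - 0.5)
    disagreement = max(estimates.values()) - min(estimates.values())
    return Aggregate(calibrated, disagreement, disagreement <= MAX_DISAGREEMENT)


result = aggregate({"B": 0.70, "C": 0.60, "D": 0.50})
print(result)
```

With these inputs the weighted average is 0.60, which shrinks to about 0.59, and the 0.20 spread is below the rejection threshold.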


Agent Architecture

| Agent | Role | Model | Phase |
|---|---|---|---|
| OrchestratorV2 | Pipeline coordinator, state machine | Sonnet | All |
| MarketScanner | Filters 5,000 markets to 4 candidates | Sonnet | 1 |
| ResearchAgentCC | Deep web search per market | Claude Code CLI | 2 |
| EstimatorB | Research-informed probability estimation | Sonnet | 3 |
| EstimatorC | Quantitative/data-driven estimation | Sonnet | 3 |
| EstimatorD | Adversarial bear case estimation | Sonnet | 3 |
| AggregatorAgent | Weighted combination + Platt calibration | Sonnet | 4 |
| RiskManager | 11 pre-trade validation checks | Sonnet | 5 |
| TraderAgent | Executes trades on Polymarket CLOB | Sonnet | 7 |
| SentimentAgent | Market sentiment analysis | Haiku | 2 |
| PositionMonitor | Portfolio tracking and exposure | Sonnet | 0 |
| ArbScanner | Cross-exchange arbitrage detection | Sonnet | 1 |
| HindsightAgent | Post-mortem analysis on losses | Sonnet | Post |

Risk Management

Every trade must pass all 11 checks. A single failure vetoes the trade.

| # | Check | Rule |
|---|---|---|
| 1 | Kill switch | Abort if STOP file exists in project root |
| 2 | Max drawdown | Graduated response: GREEN / YELLOW / ORANGE / RED |
| 3 | Position size | Capped at 15% of total capital |
| 4 | Daily loss limit | Halt trading if daily loss exceeds 25% |
| 5 | Balance check | Verify sufficient USDC before order |
| 6 | Concentration | Max 2 open positions per market category |
| 7 | Correlation | Reject bets correlated with existing positions |
| 8 | Portfolio exposure | Single category capped at 40% of portfolio |
| 9 | Entry price bounds | Only trade in the 30c to 70c range |
| 10 | Minimum edge | Category-specific threshold (see Category Strategy) |
| 11 | Spread / liquidity | Reject illiquid markets |
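The all-must-pass rule can be illustrated with three of the simpler checks (kill switch, position size, entry price bounds). This is a sketch under stated assumptions rather than the project's RiskManager: `ProposedTrade` and `failed_checks` are hypothetical names, and the price bounds are assumed to be inclusive.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ProposedTrade:
    price: float    # entry price in dollars, so 0.30 means 30c
    stake: float    # proposed stake in USDC
    capital: float  # total capital in USDC


def failed_checks(trade: ProposedTrade, project_root: Path) -> list[str]:
    """Return the names of failed checks; an empty list means no veto."""
    checks = {
        # Check 1: a STOP file in the project root aborts trading.
        "kill_switch": not (project_root / "STOP").exists(),
        # Check 3: stake capped at 15% of total capital.
        "position_size": trade.stake <= 0.15 * trade.capital,
        # Check 9: only trade between 30c and 70c (inclusive bounds assumed).
        "entry_price_bounds": 0.30 <= trade.price <= 0.70,
    }
    return [name for name, passed in checks.items() if not passed]


trade = ProposedTrade(price=0.45, stake=100.0, capital=1000.0)
failures = failed_checks(trade, Path("."))
print("vetoed by:" if failures else "passed", failures)
```

Collecting every failure instead of stopping at the first one keeps the audit trail complete, while any non-empty result still vetoes the trade.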

Kelly Criterion Sizing

All positions use quarter-Kelly (0.25x) for conservative sizing:

YES side:  kelly = edge / (1 - price) * 0.25
NO  side:  kelly = edge / price * 0.25

Position capped at 15% of capital regardless of Kelly output.
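As a worked sketch of the formulas above: the function below assumes `price` is the YES price, that edge means the estimated probability minus the price on the YES side (and the reverse on the NO side), and that a negative edge yields no position. The function name and these definitions are illustrative assumptions, not taken from core/strategy.py.

```python
KELLY_FRACTION = 0.25  # quarter-Kelly
MAX_POSITION = 0.15    # cap as a fraction of capital


def position_fraction(p_true: float, price: float, side: str) -> float:
    """Fraction of capital to stake.

    p_true: estimated probability that YES resolves true.
    price:  current YES price, strictly between 0 and 1 (assumed convention).
    """
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    if side == "YES":
        edge = p_true - price
        kelly = edge / (1.0 - price)
    elif side == "NO":
        edge = price - p_true
        kelly = edge / price
    else:
        raise ValueError("side must be 'YES' or 'NO'")
    # Scale to quarter-Kelly, never go negative, never exceed the cap.
    return min(max(kelly * KELLY_FRACTION, 0.0), MAX_POSITION)


# A 60% estimate against a 50c YES price: full Kelly is 0.20, quarter-Kelly 0.05.
print(position_fraction(0.60, 0.50, "YES"))
```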

Category Strategy

| Category | Min Edge | Stake Multiplier | Data Source |
|---|---|---|---|
| Weather | 8% | 1.0x | Open-Meteo GFS 31-member ensemble |
| Crypto | 5% | 0.75x | Binance real-time price feeds |
| Politics | 7% | 0.5x | Polls, expert analysis |
| Economics | 5% | 0.75x | Nowcast data |
| Geopolitics | 10% | 0.5x | GDELT sentiment index |
| Sports | -- | BLOCKED | Disabled in V3 (insufficient edge) |
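One way to encode this table is a simple lookup. The sketch below is hypothetical (the real thresholds live in config.py, whose layout is not shown here); it assumes the multiplier scales a base stake and that blocked or unknown categories are simply not traded.

```python
from typing import NamedTuple


class CategoryRule(NamedTuple):
    min_edge: float          # minimum edge required to trade
    stake_multiplier: float  # scales the base stake


# Values copied from the Category Strategy table.
RULES = {
    "weather": CategoryRule(0.08, 1.0),
    "crypto": CategoryRule(0.05, 0.75),
    "politics": CategoryRule(0.07, 0.5),
    "economics": CategoryRule(0.05, 0.75),
    "geopolitics": CategoryRule(0.10, 0.5),
}
BLOCKED = {"sports"}


def adjusted_stake(category: str, edge: float, base_stake: float) -> float:
    """Return the category-adjusted stake, or 0.0 if the trade is not allowed."""
    key = category.lower()
    if key in BLOCKED or key not in RULES:
        return 0.0  # assumption: unknown categories are treated like blocked ones
    rule = RULES[key]
    if edge < rule.min_edge:
        return 0.0
    return base_stake * rule.stake_multiplier


print(adjusted_stake("Crypto", 0.06, 100.0))  # 75.0
print(adjusted_stake("Sports", 0.50, 100.0))  # 0.0
```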

Getting Started

# Clone and set up environment
git clone https://github.com/sahnia3/polymarket-agent.git
cd polymarket-agent
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# Configure credentials
cp .env.example .env  # Fill in your API keys (see below)

# Run a single trading cycle
python scripts/run_cycle_v2.py

# Dry run (no real trades executed)
python scripts/run_cycle_v2.py --dry-run

Required Environment Variables

| Variable | Description |
|---|---|
| POLYGON_WALLET_PRIVATE_KEY | Polygon wallet private key for on-chain transactions |
| ANTHROPIC_API_KEY | Anthropic API key for Claude Sonnet + Haiku |
| POLYMARKET_API_KEY | Polymarket CLOB API key |
| POLYMARKET_SECRET | Polymarket API secret |
| POLYMARKET_PASSPHRASE | Polymarket API passphrase |
| TRADE_WEBHOOK_URL | Webhook URL for trade notifications (optional) |
| APIFY_API_KEY | Apify key for web scraping (optional) |
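Before running a cycle it can help to confirm that the required variables are set. The helper below is a hypothetical pre-flight check, not part of the repository, and it assumes the values from .env have already been loaded into the process environment.

```python
import os
from typing import Mapping, Optional

REQUIRED = (
    "POLYGON_WALLET_PRIVATE_KEY",
    "ANTHROPIC_API_KEY",
    "POLYMARKET_API_KEY",
    "POLYMARKET_SECRET",
    "POLYMARKET_PASSPHRASE",
)


def missing_required(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return required variable names that are unset or empty."""
    source = os.environ if env is None else env
    return [name for name in REQUIRED if not source.get(name)]


missing = missing_required()
if missing:
    print("Missing required variables:", ", ".join(missing))
```

The function only reports names, never values, so its output is safe to log.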

Database Schema

SQLite with full audit trail across 6 tables:

trades              -- Every trade placed, with entry/exit prices and P&L
analyses            -- Full analysis output per market per cycle
portfolio_snapshots -- Point-in-time portfolio state
agent_events        -- Structured logs from every agent invocation
agent_runs          -- Metadata for each pipeline run
agent_outputs       -- Raw LLM outputs for debugging and review
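To make the audit-trail idea concrete, here is a minimal sketch of a `trades` table using Python's built-in sqlite3 module. Only the table name and the presence of entry/exit prices and P&L come from the list above; every column name, the `record_trade` helper, and the in-memory database are assumptions for illustration and may not match db/models.py.

```python
import sqlite3

# Hypothetical column layout; the real schema is defined in db/models.py.
SCHEMA = """
CREATE TABLE IF NOT EXISTS trades (
    id          INTEGER PRIMARY KEY,
    market_id   TEXT NOT NULL,
    side        TEXT NOT NULL CHECK (side IN ('YES', 'NO')),
    entry_price REAL NOT NULL,
    exit_price  REAL,  -- NULL until the market resolves
    pnl         REAL   -- NULL until the market resolves
);
"""


def record_trade(conn: sqlite3.Connection, market_id: str, side: str,
                 entry_price: float) -> int:
    """Insert an open trade and return its row id."""
    cur = conn.execute(
        "INSERT INTO trades (market_id, side, entry_price) VALUES (?, ?, ?)",
        (market_id, side, entry_price),
    )
    conn.commit()
    return cur.lastrowid


conn = sqlite3.connect(":memory:")  # in-memory database for the example
conn.executescript(SCHEMA)
trade_id = record_trade(conn, "example-market", "YES", 0.42)
```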

Project Structure

polymarket-agent/
├── agents/                  # Multi-agent orchestration (~2,400 LOC)
│   ├── orchestrator_v2.py   # Main pipeline (914 lines)
│   ├── market_scanner.py    # Market filtering (348 lines)
│   ├── researcher_cc.py     # Web research (382 lines)
│   ├── estimators.py        # 3 probability estimators (707 lines)
│   ├── aggregator.py        # Calibration + weighting (677 lines)
│   ├── risk_manager.py      # 11 risk checks (282 lines)
│   ├── trader.py            # Trade execution (266 lines)
│   └── ...                  # Sentiment, position monitor, arb scanner
├── core/                    # Trading logic (~1,700 LOC)
│   ├── risk.py              # Pre-trade validations (270 lines)
│   ├── strategy.py          # Kelly criterion sizing (113 lines)
│   ├── market_client.py     # Polymarket API wrapper (237 lines)
│   ├── data_feeds.py        # GFS / Binance / GDELT integrations (625 lines)
│   └── ...                  # Categories, microstructure, arb calculator
├── db/                      # Database & CRUD (~600 LOC)
│   ├── models.py            # 6-table schema
│   └── store.py             # CRUD operations
├── scripts/                 # Automation (~3,800 LOC)
│   ├── run_cycle_v2.py      # Entry point
│   ├── cro_report.py        # Daily CRO PDF report
│   ├── hindsight_agent.py   # Post-mortem analysis
│   └── ...                  # Portfolio, risk checks, backtesting
├── config.py                # All risk parameters & thresholds
└── requirements.txt         # Dependencies

Data Feed Details

Open-Meteo GFS Ensemble (Weather)

Pulls 31-member GFS ensemble forecasts for weather markets. Each ensemble member represents a slightly different initial condition, giving a natural probability distribution over outcomes. The agent converts "will temperature exceed X" into a probability by counting ensemble members that satisfy the condition.
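The member-counting step can be sketched in a few lines. The function below takes the per-member forecast values as a plain list. Fetching them from Open-Meteo is left out, and the function name is hypothetical.

```python
def exceedance_probability(member_values: list[float], threshold: float) -> float:
    """Share of ensemble members whose forecast value exceeds the threshold."""
    if not member_values:
        raise ValueError("need at least one ensemble member")
    hits = sum(1 for value in member_values if value > threshold)
    return hits / len(member_values)


# Synthetic example: 31 members forecasting 0, 1, ..., 30 degrees.
members = [float(v) for v in range(31)]
print(exceedance_probability(members, 20.0))  # 10 of 31 members exceed 20
```

With 31 members the estimate moves in steps of 1/31, so probabilities near 0 or 1 are coarse.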

Binance (Crypto)

Real-time price feeds for crypto markets. Used by EstimatorC to anchor probability estimates to current prices and recent volatility, rather than relying on narrative-driven research.

GDELT (Geopolitics)

The GDELT Project monitors news media worldwide. The agent pulls sentiment indices for geopolitical markets to gauge media tone and event intensity, providing a quantitative signal for otherwise qualitative questions.


Architecture Decisions

Why three estimators instead of one? A single LLM probability estimate tends to be poorly calibrated and overconfident. By forcing three independent estimates -- one research-driven, one purely quantitative, one adversarial -- and then calibrating the aggregate, the system aims to produce better-calibrated probabilities than any one estimator alone. The adversarial estimator (EstimatorD) is particularly important: it exists solely to find reasons the trade will fail.

Why quarter-Kelly? Full Kelly sizing maximizes long-run growth rate but produces enormous variance. Under the standard continuous approximation, quarter-Kelly gives up roughly half of the theoretical growth rate (about 56%) in exchange for a ~75% reduction in bet volatility, which in turn makes deep drawdowns far less likely. For a system making bets on AI-estimated probabilities, conservative sizing is the only sane choice.
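The trade-off can be quantified with the textbook continuous-time approximation, in which betting a fraction c of full Kelly yields c(2 - c) of the maximum growth rate and c times the volatility. This is a general approximation, not a measurement from this system, and the function names are illustrative.

```python
def relative_growth(c: float) -> float:
    """Growth rate at fraction c of full Kelly, relative to full Kelly."""
    return c * (2.0 - c)


def relative_volatility(c: float) -> float:
    """Volatility at fraction c of full Kelly, relative to full Kelly."""
    return c


for c in (1.0, 0.5, 0.25):
    print(f"c={c}: growth={relative_growth(c):.4f}, "
          f"volatility={relative_volatility(c):.2f}")
```

At c = 0.25 this gives 43.75% of the full-Kelly growth rate at 25% of the volatility.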

Why block sports? V2 traded sports markets. Win rate was below breakeven after accounting for spread. Sports markets on Polymarket are efficiently priced by specialists with better models. The honest move was to stop.


Disclaimer

This software is provided for educational and research purposes only.

  • Prediction market trading involves substantial risk of loss
  • Past performance does not guarantee future results
  • The authors assume no liability for financial losses incurred through use of this software
  • This is not financial advice
  • You are solely responsible for any trades executed by this software
  • Ensure compliance with all applicable laws and regulations in your jurisdiction before use

License

MIT
