GitHub - sahnia3/polymarket-agent: Autonomous multi-agent trading system for prediction markets — 13 AI agents, 7-phase pipeline, real-time execution

Polymarket Trading Agent

An autonomous multi-agent system that trades prediction markets using a 7-phase pipeline of 13 specialized AI agents.

WARNING: This project is for educational and research purposes only. Trading prediction markets involves real financial risk. This software is provided as-is with no guarantees of profitability. Do not trade with money you cannot afford to lose.

What This Does

The agent scans ~5,000 live Polymarket prediction markets and filters them down to a handful of potentially mispriced opportunities. It researches each one through web search and quantitative data feeds, estimates true probabilities using an ensemble of three independent estimators, validates every trade against 11 risk management checks, and executes via the Polymarket CLOB API with quarter-Kelly position sizing.

Every decision is logged. Every trade has a full audit trail. Every loss triggers a post-mortem.

The Pipeline

This is the core of the system -- a 7-phase sequential pipeline, preceded by a Phase 0 that resolves settled trades, which runs as a single trading cycle.

flowchart TB
    subgraph phase0["Phase 0 -- Resolve"]
        R[Resolve Settled Trades]
        R --> R1[Mark outcomes WIN/LOSS]
        R --> R2[Compute realized P&L]
        R --> R3[Calculate Brier scores]
    end

    subgraph phase1["Phase 1 -- Scan"]
        S[Market Scanner]
        S --> F1["Volume > $1K"]
        S --> F2["Spread < 5%"]
        S --> F3["Resolves within 10 days"]
        F1 & F2 & F3 --> C["5,000 markets --> 4 candidates"]
    end

    subgraph phase2["Phase 2 -- Research"]
        direction LR
        RCC1["ResearchAgentCC\n(Market A)"]
        RCC2["ResearchAgentCC\n(Market B)"]
        RCC3["ResearchAgentCC\n(Market C)"]
        RCC4["ResearchAgentCC\n(Market D)"]
    end

    subgraph phase3["Phase 3 -- Estimate"]
        direction LR
        EB["EstimatorB\n30% weight\nResearch-informed"]
        EC["EstimatorC\n40% weight\nQuantitative only"]
        ED["EstimatorD\n30% weight\nAdversarial bear case"]
    end

    subgraph phase4["Phase 4 -- Aggregate"]
        AGG["Weighted Average"]
        AGG --> PLATT["Platt Calibration\n10% shrinkage toward 50%"]
        PLATT --> DIS{"Disagreement\n> 35%?"}
        DIS -- Yes --> REJECT[Reject Market]
        DIS -- No --> PASS[Forward to Risk]
    end

    subgraph phase5["Phase 5 -- Risk Management"]
        RM["11 Pre-Trade Checks"]
        RM --> KELLY["Quarter-Kelly Sizing"]
    end

    subgraph phase6["Phase 6 -- CRO Approval"]
        CRO["Chief Research Officer"]
        CRO --> DA["Devil's Advocate Protocol"]
        DA --> APPROVE{Approve?}
    end

    subgraph phase7["Phase 7 -- Execute"]
        EXEC["Place Order via CLOB API"]
        EXEC --> LOG["Full Audit Trail to SQLite"]
    end

    phase0 --> phase1 --> phase2 --> phase3 --> phase4
    phase4 --> phase5 --> phase6
    APPROVE -- Yes --> phase7
    APPROVE -- No --> KILL[No Trade]

    style phase0 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style phase1 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style phase2 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style phase3 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style phase4 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style phase5 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style phase6 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style phase7 fill:#1a1a2e,stroke:#444,color:#e0e0e0
    style REJECT fill:#8b0000,stroke:#444,color:#e0e0e0
    style KILL fill:#8b0000,stroke:#444,color:#e0e0e0
    style PASS fill:#006400,stroke:#444,color:#e0e0e0
    style APPROVE fill:#1a1a2e,stroke:#444,color:#e0e0e0

How a Single Cycle Works

Resolve -- The orchestrator checks all open positions against Polymarket's resolution API. Settled trades are marked WIN or LOSS, realized P&L is computed, and Brier scores are logged for calibration tracking.
Scan -- The MarketScanner pulls the full market list from Polymarket's CLOB API and applies hard filters: minimum $1K volume, spread under 5%, resolution date within 10 days. This typically cuts ~5,000 markets down to 4 candidates worth investigating.
Research -- Each candidate market is assigned its own ResearchAgentCC instance, which runs in parallel. The agent performs deep web search via Claude Code CLI, gathering news articles, data sources, and expert analysis relevant to the market's question.
Estimate -- Three independent estimators run simultaneously on each market:
- EstimatorB (30% weight) ingests the research and produces a probability estimate
- EstimatorC (40% weight) ignores research entirely and uses only quantitative data feeds (sportsbook odds, crypto prices, weather ensemble forecasts)
- EstimatorD (30% weight) argues the adversarial bear case -- its job is to find trap risks and scenarios where the obvious answer is wrong
Aggregate -- The AggregatorAgent computes a weighted average of the three estimates, applies Platt calibration with 10% shrinkage toward 50% (correcting for overconfidence), and runs a disagreement check. If the spread between the highest and lowest estimate exceeds 35 percentage points, the market is rejected -- high disagreement means low conviction.
Risk -- Every surviving trade passes through 11 independent risk checks (detailed below). Position size is calculated using the quarter-Kelly criterion. Any single check failure kills the trade.
CRO + Execution -- The Chief Research Officer agent reviews the full analysis with a devil's advocate protocol, actively looking for reasons NOT to trade. If approved, the TraderAgent places the order on Polymarket's CLOB with the calculated stake.
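
The hard filters in the Scan step can be sketched as follows. This is a minimal illustration, not the repository's MarketScanner: the `Market` dataclass, its field names, and the `scan` function are hypothetical, and the spread is assumed to be expressed as a fraction of price (0.05 for 5%).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

# Thresholds taken from the README; the names are illustrative.
MIN_VOLUME_USD = 1_000.0
MAX_SPREAD = 0.05
MAX_DAYS_TO_RESOLUTION = 10


@dataclass
class Market:
    """Hypothetical market record; the real API payload will differ."""
    question: str
    volume_usd: float
    spread: float          # assumed: best ask minus best bid, as a fraction
    resolves_at: datetime  # timezone-aware resolution time


def passes_hard_filters(market: Market, now: datetime) -> bool:
    """Return True only if the market clears all three hard filters."""
    deadline = now + timedelta(days=MAX_DAYS_TO_RESOLUTION)
    return (
        market.volume_usd > MIN_VOLUME_USD
        and market.spread < MAX_SPREAD
        and now < market.resolves_at <= deadline
    )


def scan(markets: List[Market], now: Optional[datetime] = None) -> List[Market]:
    """Keep only the markets worth researching."""
    now = now or datetime.now(timezone.utc)
    return [m for m in markets if passes_hard_filters(m, now)]
```

Passing `now` explicitly keeps the filter deterministic, which makes it easy to replay a past cycle against stored market snapshots.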

Agent Architecture

Agent	Role	Model	Phase
`OrchestratorV2`	Pipeline coordinator, state machine	Sonnet	All
`MarketScanner`	Filters 5,000 markets to 4 candidates	Sonnet	1
`ResearchAgentCC`	Deep web search per market	Claude Code CLI	2
`EstimatorB`	Research-informed probability estimation	Sonnet	3
`EstimatorC`	Quantitative/data-driven estimation	Sonnet	3
`EstimatorD`	Adversarial bear case estimation	Sonnet	3
`AggregatorAgent`	Weighted combination + Platt calibration	Sonnet	4
`RiskManager`	11 pre-trade validation checks	Sonnet	5
`TraderAgent`	Executes trades on Polymarket CLOB	Sonnet	7
`SentimentAgent`	Market sentiment analysis	Haiku	2
`PositionMonitor`	Portfolio tracking and exposure	Sonnet	0
`ArbScanner`	Cross-exchange arbitrage detection	Sonnet	1
`HindsightAgent`	Post-mortem analysis on losses	Sonnet	Post

Risk Management

Every trade must pass all 11 checks. A single failure vetoes the trade.

#	Check	Rule
1	Kill switch	Abort if `STOP` file exists in project root
2	Max drawdown	Graduated response: GREEN / YELLOW / ORANGE / RED
3	Position size	Capped at 15% of total capital
4	Daily loss limit	Halt trading if the daily loss exceeds 25%
5	Balance check	Verify sufficient USDC before order
6	Concentration	Max 2 open positions per market category
7	Correlation	Reject bets correlated with existing positions
8	Portfolio exposure	Single category capped at 40% of portfolio
9	Entry price bounds	Only trade in the 30c -- 70c range
10	Minimum edge	Category-specific threshold (see below)
11	Spread / liquidity	Reject illiquid markets
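
The veto logic can be illustrated with three of the checks from the table (kill switch, position size, entry price bounds). This is a sketch under assumptions: `ProposedTrade`, its fields, and `failed_checks` are hypothetical names, and the remaining eight checks are omitted because they depend on portfolio state not described here.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List

MAX_POSITION_FRACTION = 0.15   # check 3: cap at 15% of total capital
MIN_ENTRY_PRICE = 0.30         # check 9: only trade in the 30c to 70c range
MAX_ENTRY_PRICE = 0.70


@dataclass
class ProposedTrade:
    """Hypothetical container for the values the checks need."""
    stake_usd: float
    capital_usd: float
    entry_price: float


def failed_checks(trade: ProposedTrade, project_root: Path) -> List[str]:
    """Return the names of every failed check; an empty list means approved."""
    results = {
        "kill_switch": not (project_root / "STOP").exists(),
        "position_size": trade.stake_usd <= MAX_POSITION_FRACTION * trade.capital_usd,
        "entry_price_bounds": MIN_ENTRY_PRICE <= trade.entry_price <= MAX_ENTRY_PRICE,
    }
    return [name for name, passed in results.items() if not passed]
```

Collecting every failure, rather than stopping at the first one, keeps the audit trail complete while still vetoing the trade whenever the list is non-empty.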

Kelly Criterion Sizing

All positions use quarter-Kelly (0.25x) for conservative sizing:

YES side:  kelly = edge / (1 - price) * 0.25
NO  side:  kelly = edge / price * 0.25

Position capped at 15% of capital regardless of Kelly output.
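
A minimal sketch of the two formulas above with the 15% cap applied. It assumes that `price` is the YES price in both formulas, that edge is the gap between the estimated probability and that price in the direction of the bet, and that a negative edge means no position; the function name is hypothetical.

```python
KELLY_MULTIPLIER = 0.25        # quarter-Kelly
MAX_POSITION_FRACTION = 0.15   # hard cap from the risk checks


def position_fraction(side: str, estimated_prob: float, yes_price: float) -> float:
    """Fraction of capital to stake, after quarter-Kelly scaling and the cap."""
    if not 0.0 < yes_price < 1.0:
        raise ValueError("yes_price must be strictly between 0 and 1")
    if side == "YES":
        edge = estimated_prob - yes_price
        kelly = edge / (1.0 - yes_price) * KELLY_MULTIPLIER
    elif side == "NO":
        edge = yes_price - estimated_prob
        kelly = edge / yes_price * KELLY_MULTIPLIER
    else:
        raise ValueError("side must be 'YES' or 'NO'")
    # A negative edge would produce a negative stake; treat it as no trade.
    return min(max(kelly, 0.0), MAX_POSITION_FRACTION)
```

For example, with an estimated probability of 0.60 and a YES price of 0.50, the edge is 0.10, full Kelly is 0.10 / 0.50 = 0.20, and quarter-Kelly stakes 5% of capital.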

Category Strategy

Category	Min Edge	Stake Multiplier	Data Source
Weather	8%	1.0x	Open-Meteo GFS 31-member ensemble
Crypto	5%	0.75x	Binance real-time price feeds
Politics	7%	0.5x	Polls, expert analysis
Economics	5%	0.75x	Nowcast data
Geopolitics	10%	0.5x	GDELT sentiment index
Sports	--	BLOCKED	Disabled in V3 (insufficient edge)
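
One way to apply the table is a lookup keyed by category. The structure below is an assumption for illustration (the actual thresholds live in `config.py`, whose layout is not shown here), and `adjusted_stake` is a hypothetical helper.

```python
from typing import Dict, Optional, Tuple

# (minimum edge, stake multiplier) per category, copied from the table.
# None marks a blocked category.
CATEGORY_RULES: Dict[str, Optional[Tuple[float, float]]] = {
    "weather": (0.08, 1.0),
    "crypto": (0.05, 0.75),
    "politics": (0.07, 0.5),
    "economics": (0.05, 0.75),
    "geopolitics": (0.10, 0.5),
    "sports": None,
}


def adjusted_stake(category: str, edge: float, base_stake: float) -> float:
    """Scale the stake by category, or return 0.0 if the trade is not allowed."""
    rule = CATEGORY_RULES.get(category.lower())
    if rule is None:
        return 0.0  # blocked or unknown category
    min_edge, multiplier = rule
    if edge < min_edge:
        return 0.0  # edge below the category threshold
    return base_stake * multiplier
```

Treating unknown categories the same as blocked ones is a conservative default chosen for this sketch.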

Getting Started

# Clone and set up environment
git clone https://github.com/sahnia3/polymarket-agent.git
cd polymarket-agent
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

# Configure credentials
cp .env.example .env  # Fill in your API keys (see below)

# Run a single trading cycle
python scripts/run_cycle_v2.py

# Dry run (no real trades executed)
python scripts/run_cycle_v2.py --dry-run

Required Environment Variables

Variable	Description
`POLYGON_WALLET_PRIVATE_KEY`	Polygon wallet private key for on-chain transactions
`ANTHROPIC_API_KEY`	Anthropic API key for Claude Sonnet + Haiku
`POLYMARKET_API_KEY`	Polymarket CLOB API key
`POLYMARKET_SECRET`	Polymarket API secret
`POLYMARKET_PASSPHRASE`	Polymarket API passphrase
`TRADE_WEBHOOK_URL`	Webhook URL for trade notifications (optional)
`APIFY_API_KEY`	Apify key for web scraping (optional)
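
Before a live run it can help to confirm that the required variables are set. The helper below is a hypothetical sketch, not part of the repository; it treats every variable not marked optional in the table as required.

```python
import os
from typing import List, Mapping, Optional

REQUIRED_VARS = [
    "POLYGON_WALLET_PRIVATE_KEY",
    "ANTHROPIC_API_KEY",
    "POLYMARKET_API_KEY",
    "POLYMARKET_SECRET",
    "POLYMARKET_PASSPHRASE",
]


def missing_env_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
```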

Database Schema

SQLite with full audit trail across 6 tables:

trades              -- Every trade placed, with entry/exit prices and P&L
analyses            -- Full analysis output per market per cycle
portfolio_snapshots -- Point-in-time portfolio state
agent_events        -- Structured logs from every agent invocation
agent_runs          -- Metadata for each pipeline run
agent_outputs       -- Raw LLM outputs for debugging and review

Project Structure

polymarket-agent/
├── agents/                  # Multi-agent orchestration (~2,400 LOC)
│   ├── orchestrator_v2.py   # Main pipeline (914 lines)
│   ├── market_scanner.py    # Market filtering (348 lines)
│   ├── researcher_cc.py     # Web research (382 lines)
│   ├── estimators.py        # 3 probability estimators (707 lines)
│   ├── aggregator.py        # Calibration + weighting (677 lines)
│   ├── risk_manager.py      # 11 risk checks (282 lines)
│   ├── trader.py            # Trade execution (266 lines)
│   └── ...                  # Sentiment, position monitor, arb scanner
├── core/                    # Trading logic (~1,700 LOC)
│   ├── risk.py              # Pre-trade validations (270 lines)
│   ├── strategy.py          # Kelly criterion sizing (113 lines)
│   ├── market_client.py     # Polymarket API wrapper (237 lines)
│   ├── data_feeds.py        # GFS / Binance / GDELT integrations (625 lines)
│   └── ...                  # Categories, microstructure, arb calculator
├── db/                      # Database & CRUD (~600 LOC)
│   ├── models.py            # 6-table schema
│   └── store.py             # CRUD operations
├── scripts/                 # Automation (~3,800 LOC)
│   ├── run_cycle_v2.py      # Entry point
│   ├── cro_report.py        # Daily CRO PDF report
│   ├── hindsight_agent.py   # Post-mortem analysis
│   └── ...                  # Portfolio, risk checks, backtesting
├── config.py                # All risk parameters & thresholds
└── requirements.txt         # Dependencies

Data Feed Details

Open-Meteo GFS Ensemble (Weather)

Pulls 31-member GFS ensemble forecasts for weather markets. Each ensemble member represents a slightly different initial condition, giving a natural probability distribution over outcomes. The agent converts "will temperature exceed X" into a probability by counting ensemble members that satisfy the condition.
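
The member-counting step can be sketched in a few lines. The function name and the plain list input are assumptions; fetching the forecasts from Open-Meteo is out of scope here.

```python
from typing import Sequence


def exceedance_probability(member_values: Sequence[float], threshold: float) -> float:
    """Share of ensemble members whose forecast value exceeds the threshold."""
    if not member_values:
        raise ValueError("at least one ensemble member is required")
    hits = sum(1 for value in member_values if value > threshold)
    return hits / len(member_values)
```

With 31 members, 10 of which forecast a temperature above the threshold, the estimate is 10 / 31, or roughly 0.32.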

Binance (Crypto)

Real-time price feeds for crypto markets. Used by EstimatorC to anchor probability estimates to current prices and recent volatility, rather than relying on narrative-driven research.

GDELT (Geopolitics)

The GDELT Project monitors news media worldwide. The agent pulls sentiment indices for geopolitical markets to gauge media tone and event intensity, providing a quantitative signal for otherwise qualitative questions.

Architecture Decisions

Why three estimators instead of one? A single LLM probability estimate is poorly calibrated and overconfident. By forcing three independent estimates -- one research-driven, one purely quantitative, one adversarial -- and then calibrating the aggregate, the system produces meaningfully better probabilities. The adversarial estimator (EstimatorD) is particularly important: it exists solely to find reasons the trade will fail.
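
A rough sketch of the aggregation step, using the weights (30% / 40% / 30%), the 10% shrinkage, and the 35-point disagreement threshold stated in this README. The shrinkage here is applied linearly in probability space, which is an assumption: a full Platt calibration fits a logistic curve to past outcomes, and the repository may do something different. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional

WEIGHTS = {"B": 0.30, "C": 0.40, "D": 0.30}
SHRINKAGE = 0.10          # pull the estimate 10% of the way toward 0.5
MAX_DISAGREEMENT = 0.35   # reject if max and min estimates differ by more


@dataclass
class AggregateResult:
    probability: Optional[float]  # None when the market is rejected
    spread: float
    rejected: bool


def aggregate(estimates: Dict[str, float]) -> AggregateResult:
    """Combine the three estimates, shrink toward 0.5, and check disagreement."""
    weighted = sum(WEIGHTS[name] * estimates[name] for name in WEIGHTS)
    calibrated = 0.5 + (1.0 - SHRINKAGE) * (weighted - 0.5)
    spread = max(estimates.values()) - min(estimates.values())
    if spread > MAX_DISAGREEMENT:
        return AggregateResult(None, spread, True)
    return AggregateResult(calibrated, spread, False)
```

For estimates of 0.70, 0.60, and 0.50 the weighted average is 0.60, and shrinkage moves it to 0.59; the spread of 0.20 is within the threshold, so the market passes.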

Why quarter-Kelly? Full Kelly sizing maximizes long-run growth rate but produces enormous variance. Quarter-Kelly sacrifices ~50% of theoretical growth for a ~75% reduction in drawdown. For a system making bets on AI-estimated probabilities, conservative sizing is the only sane choice.
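
As a rough check on these figures, consider the standard continuous-time approximation, which is an assumption here and not something the repository states: returns with drift $\mu$ and volatility $\sigma$ per unit of exposure, and a bet of fraction $f$ of capital.

```latex
\begin{align}
g(f) &= \mu f - \tfrac{1}{2}\sigma^{2} f^{2},
  & f^{*} &= \frac{\mu}{\sigma^{2}}, \\
\frac{g(c f^{*})}{g(f^{*})} &= c\,(2 - c),
  & c = \tfrac{1}{4} &\;\Rightarrow\; \tfrac{7}{16} \approx 0.44 .
\end{align}
```

Under this model, quarter-Kelly keeps about 44% of the full-Kelly growth rate, while portfolio volatility scales linearly with $c$ and is therefore 75% lower. That is in the same ballpark as the trade-off described above, although realized drawdowns depend on the actual return distribution.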

Why block sports? V2 traded sports markets. Win rate was below breakeven after accounting for spread. Sports markets on Polymarket are efficiently priced by specialists with better models. The honest move was to stop.

Disclaimer

This software is provided for educational and research purposes only.

- Prediction market trading involves substantial risk of loss.
- Past performance does not guarantee future results.
- The authors assume no liability for financial losses incurred through use of this software.
- This is not financial advice.
- You are solely responsible for any trades executed by this software.
- Ensure compliance with all applicable laws and regulations in your jurisdiction before use.

License

MIT
