A data pipeline and analysis framework for quantitative trading on Polymarket. Captures all market, trade, wallet, and price data into a local SQLite database, then provides analysis modules to find high win-rate patterns.
- Market Collector — Syncs all active + resolved markets from the Gamma API
- Trade Collector — Ingests per-market trades, with wallet addresses, from the on-chain Orders subgraph
- Price Collector — Backfills historical price timeseries from the CLOB API
- Wallet Collector — Scrapes the leaderboard, tracks top wallets' activity and positions
- Live Stream — Captures real-time trades and price changes via WebSocket
- Whale Tracker — Ranks wallets by PnL, win rate, and bot-likeness score
- Insider Detector — Flags wallets that consistently buy before big price moves
- Pattern Scanner — Finds volume spike patterns, late-stage convergence, category edges, time-of-day effects
- Copy Trade — Monitors top wallets for new trades and generates copy-trade signals
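As an illustration of the Insider Detector idea (buying shortly before a big price move), here is a minimal sketch. It is not the project's implementation: the function name `pre_move_hit_rate`, its inputs, and the sample timestamps are hypothetical. Only the 24-hour lookback is taken from the `INSIDER_TIME_WINDOW_HOURS` default listed under the config tunables.

```python
from datetime import datetime, timedelta


def pre_move_hit_rate(buy_times, move_times, window_hours=24):
    """Fraction of buys placed at most `window_hours` before some price move.

    Hypothetical helper: `buy_times` and `move_times` are lists of datetimes.
    """
    if not buy_times:
        return 0.0
    window = timedelta(hours=window_hours)
    hits = sum(
        1
        for buy in buy_times
        # A buy counts only if a move follows it within the window.
        if any(timedelta(0) <= move - buy <= window for move in move_times)
    )
    return hits / len(buy_times)


# Illustrative data: one large move and four buys by the same wallet.
move = datetime(2024, 1, 10, 12, 0)
buys = [
    datetime(2024, 1, 10, 0, 0),   # 12h before the move: inside the window
    datetime(2024, 1, 9, 12, 0),   # exactly 24h before: inside the window
    datetime(2024, 1, 8, 0, 0),    # 60h before: outside the window
    datetime(2024, 1, 10, 13, 0),  # after the move: not counted
]
rate = pre_move_hit_rate(buys, [move])
print(f"pre-move hit rate: {rate:.2f}")
```

A wallet whose hit rate stays high across many markets would be a candidate for flagging. A real detector would also need a minimum trade count to avoid flagging lucky one-off trades.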
cd polymarket
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Sync market metadata (active + resolved, recent ~2000 resolved)
python main.py collect-markets --include-resolved
# Backfill historical resolved markets by date (for full coverage)
python main.py collect-markets --include-resolved --backfill-by-month
# Or specify date range: --backfill-start 2023-01 --backfill-end 2025-02
# Check your market coverage (counts, date range, per-month breakdown)
python main.py market-status
# Ingest trades for all markets (from subgraph on-chain data)
python main.py collect-trades
# Backfill price history
python main.py collect-prices
# Scrape leaderboard + track top wallets
python main.py collect-wallets
# Or run everything at once:
python main.py collect-all

# Stream real-time trades/price changes into the database
python main.py stream --verbose

# Find and rank profitable wallets
python main.py analyze-wallets --top 50
# Detect insider-like trading patterns
python main.py detect-insiders
# Scan for high win-rate patterns
python main.py scan-patterns
# Check for copy trade signals
python main.py copy-signals
# Continuously monitor for copy trade signals
python main.py copy-monitor

# Check database statistics
python main.py status
# Check market coverage (date range, per-month counts)
python main.py market-status

Run daily (e.g. via cron):
# 1. Sync markets (active + recent resolved)
python main.py collect-markets --include-resolved
# 2. Optionally backfill historical gaps (run weekly or once)
python main.py collect-markets --include-resolved --backfill-by-month

Use `python main.py market-status` to verify:
- Total active vs resolved counts
- Date range of resolved markets
- Per-month breakdown to spot gaps
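The per-month breakdown can also be reproduced directly against the SQLite database. The sketch below assumes a `markets` table with hypothetical columns `resolved` (0/1) and `end_date` (ISO 8601 text); the real schema in `db/database.py` may differ. It runs against an in-memory database so it is self-contained.

```python
import sqlite3


def monthly_resolved_counts(conn):
    """Return [(YYYY-MM, count)] for resolved markets, oldest month first."""
    rows = conn.execute(
        """
        SELECT substr(end_date, 1, 7) AS month, COUNT(*) AS n
        FROM markets
        WHERE resolved = 1 AND end_date IS NOT NULL
        GROUP BY month
        ORDER BY month
        """
    ).fetchall()
    return [(month, n) for month, n in rows]


def missing_months(counts):
    """List the YYYY-MM keys between the first and last month with no rows."""
    if not counts:
        return []
    present = {month for month, _ in counts}
    year, month = (int(part) for part in counts[0][0].split("-"))
    last = counts[-1][0]
    gaps = []
    while True:
        key = f"{year:04d}-{month:02d}"
        if key > last:  # zero-padded keys compare correctly as strings
            break
        if key not in present:
            gaps.append(key)
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return gaps


# Illustrative data only; column names are assumptions, not the real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE markets (id TEXT, resolved INTEGER, end_date TEXT)")
conn.executemany(
    "INSERT INTO markets VALUES (?, ?, ?)",
    [
        ("a", 1, "2024-01-05T00:00:00Z"),
        ("b", 1, "2024-01-20T00:00:00Z"),
        ("c", 0, "2024-02-11T00:00:00Z"),  # still active, so not counted
        ("d", 1, "2024-03-02T00:00:00Z"),
    ],
)
counts = monthly_resolved_counts(conn)
gaps = missing_months(counts)
print(counts, gaps)
```

Any month returned by `missing_months` is a candidate for a targeted backfill with `--backfill-start` and `--backfill-end`.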
| Command | Description |
|---|---|
| `collect-markets` | Sync markets from Gamma API. Use `--include-resolved` for resolved markets, `--backfill-by-month` for date-range backfill |
| `market-status` | Show market coverage: counts, date range, per-month breakdown |
| `collect-trades` | Ingest trades from subgraph (on-chain data). Use `--resolved-only` for analysis |
| `collect-prices` | Backfill price history. Use `--interval max --fidelity 60` |
| `collect-wallets` | Scrape leaderboard and collect wallet data. Use `--leaderboard-only` to skip activity |
| `stream` | WebSocket live data capture. Use `-v` for verbose output |
| `collect-all` | Run all collectors sequentially |
| `analyze-wallets` | Rank wallets by PnL, win rate, bot score |
| `detect-insiders` | Flag wallets with pre-move trading patterns |
| `scan-patterns` | Volume spikes, convergence, category edge, time-of-day |
| `copy-signals` | One-time copy trade signal check |
| `copy-monitor` | Continuous copy trade monitoring |
| `status` | Database row counts |
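For orientation, a command-line interface with these subcommands could be wired up with `argparse` roughly as follows. This is a sketch, not the actual `main.py`: the command and flag names come from this README, while the argument types, the absence of defaults, and the `build_parser` helper are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the commands documented in this README."""
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)

    markets = sub.add_parser("collect-markets")
    markets.add_argument("--include-resolved", action="store_true")
    markets.add_argument("--backfill-by-month", action="store_true")
    markets.add_argument("--backfill-start", metavar="YYYY-MM")
    markets.add_argument("--backfill-end", metavar="YYYY-MM")

    trades = sub.add_parser("collect-trades")
    trades.add_argument("--resolved-only", action="store_true")

    prices = sub.add_parser("collect-prices")
    prices.add_argument("--interval")
    prices.add_argument("--fidelity", type=int)  # integer type is an assumption

    wallets = sub.add_parser("collect-wallets")
    wallets.add_argument("--leaderboard-only", action="store_true")

    stream = sub.add_parser("stream")
    stream.add_argument("-v", "--verbose", action="store_true")

    analyze = sub.add_parser("analyze-wallets")
    analyze.add_argument("--top", type=int)

    # Commands documented without flags.
    for name in ("market-status", "collect-all", "detect-insiders",
                 "scan-patterns", "copy-signals", "copy-monitor", "status"):
        sub.add_parser(name)
    return parser


args = build_parser().parse_args(
    ["collect-markets", "--include-resolved", "--backfill-start", "2023-01"]
)
print(args.command, args.include_resolved, args.backfill_start)
```

Each parsed `args.command` value would then be dispatched to the matching collector or analysis module.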
polymarket/
├── config.py                  # API URLs, DB path, tunable params
├── main.py                    # CLI orchestrator
├── db/
│   └── database.py            # SQLite schema (8 tables) + helpers
├── api/
│   ├── gamma.py               # Gamma API (markets, events)
│   ├── clob.py                # CLOB API (prices, orderbooks, history)
│   ├── data_api.py            # Data API (leaderboard, positions, activity)
│   └── ws_client.py           # WebSocket (real-time stream)
├── collectors/
│   ├── market_collector.py    # Market metadata sync
│   ├── trade_collector.py     # Trade ingestion
│   ├── price_collector.py     # Price history backfill
│   ├── wallet_collector.py    # Leaderboard + wallet tracking
│   └── live_stream.py         # WebSocket capture
└── analysis/
    ├── whale_tracker.py       # Profitable wallet ranking
    ├── insider_detector.py    # Pre-move detection
    ├── pattern_scanner.py     # Pattern analysis
    └── copy_trade.py          # Copy trade signals
| Table | Description |
|---|---|
| `markets` | Market metadata (question, outcomes, volume, resolution) |
| `trades` | Per-market trades with wallet addresses |
| `price_history` | Token-level price timeseries |
| `wallets` | Leaderboard wallets (PnL, rank, tracked status) |
| `wallet_activity` | Trade-by-trade history for tracked wallets |
| `wallet_positions` | Current positions for tracked wallets |
| `live_trades` | Real-time trades captured via WebSocket |
| `live_price_changes` | Real-time price changes via WebSocket |
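To show how these tables might be combined, the sketch below computes a per-wallet win rate by joining `trades` to `markets`. The table names come from the list above, but every column name (`winning_outcome`, `side`, `outcome`, and so on) is an assumption made for illustration, and "win" is defined here simply as a buy of the outcome that later resolved as the winner.

```python
import sqlite3

# Hypothetical minimal schema; the real one lives in db/database.py.
SCHEMA = """
CREATE TABLE markets (id TEXT PRIMARY KEY, winning_outcome TEXT);
CREATE TABLE trades (wallet TEXT, market_id TEXT, side TEXT, outcome TEXT);
"""


def wallet_win_rates(conn, min_trades=1):
    """Return [(wallet, n_trades, win_rate)] over buys in resolved markets."""
    query = """
        SELECT t.wallet,
               COUNT(*) AS n_trades,
               AVG(CASE WHEN t.outcome = m.winning_outcome
                        THEN 1.0 ELSE 0.0 END) AS win_rate
        FROM trades t
        JOIN markets m ON m.id = t.market_id
        WHERE t.side = 'BUY' AND m.winning_outcome IS NOT NULL
        GROUP BY t.wallet
        HAVING COUNT(*) >= ?
        ORDER BY win_rate DESC, n_trades DESC
    """
    return conn.execute(query, (min_trades,)).fetchall()


# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO markets VALUES (?, ?)",
    [("m1", "Yes"), ("m2", "No"), ("m3", None)],  # m3 is unresolved
)
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?, ?)",
    [
        ("0xaaa", "m1", "BUY", "Yes"),   # win
        ("0xaaa", "m2", "BUY", "No"),    # win
        ("0xaaa", "m3", "BUY", "Yes"),   # ignored: market unresolved
        ("0xbbb", "m1", "BUY", "No"),    # loss
        ("0xbbb", "m2", "BUY", "No"),    # win
        ("0xbbb", "m1", "SELL", "Yes"),  # ignored: not a buy
    ],
)
rates = wallet_win_rates(conn)
print(rates)
```

Raising `min_trades` filters out wallets whose win rate rests on only a handful of trades.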
All settings are in config.py. Key tunables:
- `TOP_WALLETS_TO_TRACK` — How many top wallets to auto-track (default: 100)
- `INSIDER_PRICE_MOVE_THRESHOLD` — Price move % to flag as significant (default: 15%)
- `INSIDER_TIME_WINDOW_HOURS` — Pre-move lookback window (default: 24h)
- `VOLUME_SPIKE_MULTIPLIER` — Volume multiple to flag as spike (default: 3x)
- `COPY_TRADE_MIN_WALLET_SCORE` — Minimum score to generate signal (default: 0.6)
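The following sketch shows how such tunables might be declared and applied. The constant names and default values are the ones listed above; everything else is an assumption: storing 15% as the fraction `0.15`, treating the price move as a relative change, using the median of recent volume as the spike baseline, and the three helper functions themselves.

```python
from statistics import median

# Defaults as documented; the numeric representation is an assumption.
TOP_WALLETS_TO_TRACK = 100
INSIDER_PRICE_MOVE_THRESHOLD = 0.15   # 15%, stored as a fraction
INSIDER_TIME_WINDOW_HOURS = 24
VOLUME_SPIKE_MULTIPLIER = 3.0
COPY_TRADE_MIN_WALLET_SCORE = 0.6


def is_volume_spike(history, latest, multiplier=VOLUME_SPIKE_MULTIPLIER):
    """True if `latest` volume is at least `multiplier` times the median of `history`."""
    if not history:
        return False
    baseline = median(history)
    return baseline > 0 and latest >= multiplier * baseline


def is_significant_move(start_price, end_price,
                        threshold=INSIDER_PRICE_MOVE_THRESHOLD):
    """True if the relative price change meets the threshold."""
    if start_price <= 0:
        return False
    return abs(end_price - start_price) / start_price >= threshold


def passes_copy_filter(wallet_score, minimum=COPY_TRADE_MIN_WALLET_SCORE):
    """True if a wallet's score is high enough to emit a copy-trade signal."""
    return wallet_score >= minimum


# Illustrative values only.
recent_volume = [100, 120, 90, 110]
print(is_volume_spike(recent_volume, 400))
print(is_significant_move(0.40, 0.47))
print(passes_copy_filter(0.6))
```

Keeping the thresholds as module-level constants with matching keyword defaults lets the analysis functions be tested with alternative values without editing `config.py`.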
| API | Base URL | Purpose |
|---|---|---|
| Gamma API | https://gamma-api.polymarket.com | Markets, events, metadata |
| CLOB API | https://clob.polymarket.com | Prices, orderbooks, price history |
| Data API | https://data-api.polymarket.com | Leaderboard, positions, activity |
| Orders Subgraph | api.goldsky.com/.../orderbook-subgraph | Trades (on-chain, complete) |
| WebSocket | wss://ws-subscriptions-clob.polymarket.com | Real-time trade stream |
No authentication is required for public data endpoints.
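Because no API key is needed, a public endpoint can be queried with the standard library alone. The sketch below builds a request URL from the Gamma base URL in the table above. The `/markets` path and the `limit`, `offset`, and `closed` query parameters are assumptions to verify against the Gamma API documentation, and `build_url` and `fetch_json` are hypothetical helpers, not functions from `api/gamma.py`.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

GAMMA_API = "https://gamma-api.polymarket.com"


def build_url(base, path, **params):
    """Join a base URL and path, appending query parameters if any."""
    url = f"{base.rstrip('/')}/{path.lstrip('/')}"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


def fetch_json(url, timeout=10):
    """GET a URL without credentials and decode the JSON response body."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Path and parameter names are assumptions; no request is sent here.
markets_url = build_url(GAMMA_API, "/markets", limit=100, offset=0, closed="true")
print(markets_url)
```

Calling `fetch_json(markets_url)` would perform the actual request. Paging would then repeat the call with an increasing `offset` until a page comes back empty, assuming the endpoint paginates that way.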