Prediction markets are supposed to be perfectly calibrated probability machines. They're not.
CalibShi analyzes 8,494 settled weather markets from Kalshi's KXHIGHNY series to quantify systematic miscalibration in market-implied probabilities. It also provides a recalibration model that reduces expected calibration error (ECE) by a factor of 14.8.
- Markets Analyzed: 8,494 settled markets
- Series: KXHIGHNY (NYC daily high temperature)
- Raw Expected Calibration Error (ECE): 0.01624
- Recalibrated ECE: 0.00109
- Improvement Factor: 14.8x
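For readers unfamiliar with the metric, here is a minimal sketch of how a binned ECE can be computed. The function name `expected_calibration_error`, the choice of 10 equal-width bins, and the toy inputs are assumptions made for illustration; the project's actual binning scheme may differ, so this is not expected to reproduce the figures above.

```python
import numpy as np

def expected_calibration_error(probs, outcomes, n_bins=10):
    """Binned ECE: per-bin |mean predicted prob - observed frequency|,
    weighted by the share of samples in each bin.
    (Hypothetical helper; equal-width bins are an assumption.)"""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map each probability to a bin index in 0..n_bins-1.
    bin_ids = np.digitize(probs, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        gap = abs(probs[mask].mean() - outcomes[mask].mean())
        ece += mask.mean() * gap  # mask.mean() is the bin's sample share
    return ece

# Toy check: four markets priced at 0.8, three of which resolved YES.
ece = expected_calibration_error([0.8, 0.8, 0.8, 0.8], [1, 1, 1, 0])
print(round(ece, 4))  # 0.05
```

A lower value means the market-implied probabilities sit closer to the observed outcome frequencies.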
Markets systematically misprice extreme events:
- Overconfident at probabilities above 0.7 (for example, an event priced at 80% that actually happens 65% of the time)
- Underconfident at probabilities around 0.3–0.5
- Miscalibrated to some degree across the whole probability range
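Bin-level patterns like the ones listed above can be inspected with a reliability table. The sketch below is illustrative only: `reliability_table`, its column names, and the toy data are hypothetical and not taken from the project. A positive `gap` means the price was higher than the realized frequency (overconfident); a negative `gap` means the opposite.

```python
import numpy as np
import pandas as pd

def reliability_table(prob, outcome, n_bins=10):
    """Per-bin mean market probability vs. observed hit rate.
    (Hypothetical helper; equal-width bins are an assumption.)"""
    df = pd.DataFrame({"prob": prob, "outcome": outcome})
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    df["bin"] = pd.cut(df["prob"], bins=edges, include_lowest=True)
    table = df.groupby("bin", observed=True).agg(
        n=("outcome", "size"),
        mean_prob=("prob", "mean"),
        hit_rate=("outcome", "mean"),
    )
    # gap > 0: overconfident; gap < 0: underconfident
    table["gap"] = table["mean_prob"] - table["hit_rate"]
    return table

# Toy data, not real Kalshi markets.
toy_prob = [0.85, 0.85, 0.85, 0.85, 0.45, 0.45]
toy_outcome = [1, 1, 0, 0, 1, 0]
table = reliability_table(toy_prob, toy_outcome)
print(table)
```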
Isotonic Regression maps raw market probabilities onto observed outcome frequencies. The model learns the monotonic empirical relationship between market-implied probabilities and actual outcome rates, then applies that mapping to correct future market prices.
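A minimal sketch of this recalibration step with scikit-learn follows. The data is synthetic and the distortion applied to it is invented purely for illustration; the variable names (`market_prob`, `outcome`, `recalibrated`) and the train/test split are assumptions, not the project's actual pipeline.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
n = 5000

# Synthetic stand-in for market-implied probabilities (not Kalshi data).
market_prob = rng.uniform(0.01, 0.99, size=n)
# Invented distortion: true outcome rates are pulled toward 0.5.
true_prob = 0.5 + 0.7 * (market_prob - 0.5)
outcome = (rng.uniform(size=n) < true_prob).astype(int)

# Fit on earlier markets, apply to later ones (assumed split).
train, test = slice(0, 4000), slice(4000, None)

iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
iso.fit(market_prob[train], outcome[train])

# Map held-out market prices to recalibrated probabilities.
recalibrated = iso.predict(market_prob[test])
```

Setting `out_of_bounds="clip"` keeps predictions defined for prices outside the range seen during fitting. Evaluating on held-out markets, rather than on the markets used for fitting, avoids overstating the improvement.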
Full analysis, code, and visualizations: 👉 CalibShi on Zerve Gallery
Click "See in Zerve" to:
- Run blocks live (data is fetched from the Kalshi API in real time)
- Inspect calibration curves
- View model performance metrics
- Re-train the Isotonic Regression model
- Platform: Zerve AI
- Language: Python
- Data: pandas, numpy
- ML: scikit-learn (Isotonic Regression, Platt Scaling, Beta Calibration)
- Viz: matplotlib
- Data Source: Kalshi Public API (no auth required)
MIT