Iqisa is a library for handling and comparing forecasting datasets from different platforms.
The eventual success of my archives reinforced my view that public permission-less datasets are often a bottleneck to research: you cannot guarantee that people will use your dataset, but you can guarantee that they won’t use it.
—Gwern Branwen, “2019 News”, 2019
Iqisa is a collection of forecasting datasets and a simple library for handling those datasets. Code and data available here.
So far it contains data from:
- The Good Judgment Project: ~790k market trades + ~3.14m survey predictions ≈ 3.9m forecasts
- The Metaculus public API: ~210k forecasts
- PredictionBook: ~64k forecasts
for a total of ~4.2m forecasts. It also includes code for handling private Metaculus data (available to researchers on request to Metaculus), and I plan to add data from various other sources.
The documentation can be found here. As a simple example of using the library, we can check whether traders with more than 100 trades have a better Brier score than traders in general:
import numpy as np
from iqisa import gjp
import iqisa.iqisa as iqs

# Load the Good Judgment Project market trades
market_fcasts=gjp.load_markets()

# Brier score: mean squared error between forecasts and outcomes (lower is better)
def brier_score(probabilities, outcomes):
    return np.mean((probabilities-outcomes)**2)

# Variant that works on a forecast DataFrame directly (not used below)
def brier_score_user(user_forecasts):
    user_right=(user_forecasts['outcome']==user_forecasts['answer_option'])
    probabilities=user_forecasts['probability']
    return np.mean((probabilities-user_right)**2)

# Score every trader, then only traders with more than 100 trades
trader_scores=iqs.score(market_fcasts, brier_score, on=['user_id'])
filtered_trader_scores=iqs.score(market_fcasts.groupby(['user_id']).filter(lambda x: len(x)>100), brier_score, on=['user_id'])
And we can see:
>>> np.mean(trader_scores)
score 0.159194
dtype: float64
>>> np.mean(filtered_trader_scores)
score 0.159018
dtype: float64
So traders with more than 100 trades are only very slightly better at trading: their mean Brier score is 0.159018, against 0.159194 for traders in general (lower is better).
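To make the computation explicit without depending on Iqisa, here is a minimal sketch of the same comparison in plain pandas. The DataFrame, its column names (`user_id`, `probability`, `outcome`) and the threshold of 2 forecasts are made up for illustration. They are not the actual GJP schema, and `iqs.score` may well work differently internally.

```python
import numpy as np
import pandas as pd

# Toy forecasts; the columns and values are assumptions for illustration only.
forecasts = pd.DataFrame({
    "user_id":     [1, 1, 1, 2, 2, 3],
    "probability": [0.9, 0.8, 0.3, 0.5, 0.5, 0.1],
    "outcome":     [1, 1, 0, 1, 0, 1],
})

def score_per_user(df):
    """Brier score (mean squared error) per user, indexed by user_id."""
    squared_error = (df["probability"] - df["outcome"]) ** 2
    return squared_error.groupby(df["user_id"]).mean()

# Scores for every user
all_scores = score_per_user(forecasts)

# Scores for users with more than 2 forecasts (the text above uses 100)
experienced = forecasts.groupby("user_id").filter(lambda g: len(g) > 2)
experienced_scores = score_per_user(experienced)

# Average the per-user scores, as np.mean does in the example above
print(np.mean(all_scores), np.mean(experienced_scores))
```

Note that averaging per-user scores weights every user equally, regardless of how many forecasts they made.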
- Take questions from different platforms that are close to each other on sentence2vec, and check which platform made the better predictions on that question.
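One way to set up such a comparison is to pair each question on one platform with its most similar question on another. The sketch below uses a bag-of-words cosine similarity from the standard library as a crude stand-in for sentence embeddings. The question texts, the function names and the 0.5 threshold are all invented for illustration; a real analysis would swap `similarity` for an embedding-based measure.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Lowercased word counts for a question title."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def similarity(a, b):
    """Cosine similarity between the word-count vectors of two texts."""
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def closest_pairs(questions_a, questions_b, threshold=0.5):
    """For each question in A, keep its best match in B if it is similar enough."""
    pairs = []
    for qa in questions_a:
        best = max(questions_b, key=lambda qb: similarity(qa, qb))
        score = similarity(qa, best)
        if score >= threshold:
            pairs.append((qa, best, score))
    return pairs

# Hypothetical question titles from two platforms
platform_a = ["Will candidate A win the 2030 election?",
              "Will it snow in Rome next winter?"]
platform_b = ["Will candidate A win the election in 2030?",
              "Will the price of gold exceed 3000 USD?"]

pairs = closest_pairs(platform_a, platform_b)
for qa, qb, score in pairs:
    print(f"{score:.2f}  {qa}  <->  {qb}")
```

Once questions are paired, the forecasts on each side can be scored with the same scoring rule and compared.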
Since this is a project I'm now doing in my free time, it might not be as polished as it should be. Sorry :-/
If you decide to work with this library, feel free to contact me.
- Issues with the time fields
  - The native pandas datetime format is too restricted for some time ranges in these datasets; those values might be set to NaT.
  - Not all time-related fields have timezone information attached to them.
- Some predictions in the dataset have occurred after question resolution. There should be a way to filter those out programmatically.
- The columns of the datasets are not sorted the same way for question DataFrames and forecast DataFrames.
- I fear that despite my best efforts, not all of the GJP data has been transferred.
- The default fields in the Metaculus & PredictionBook data should be NA more often than they are right now.
- The documentation is still slightly spotty, and tests are mostly nonexistent.
- Some variables shouldn't be exposed, but are.
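Until the time-field issues above are fixed in the library, they can be worked around by hand. The following is a minimal sketch, assuming hypothetical column names `timestamp` and `resolve_time` (the real Iqisa DataFrames may name these differently): it makes all timestamps timezone-aware, tolerates missing (NaT) values, and drops forecasts made at or after question resolution.

```python
import pandas as pd

# Toy data; column names and values are assumptions for illustration only.
forecasts = pd.DataFrame({
    "question_id":  [1, 1, 2, 2],
    "timestamp":    ["2020-01-01 12:00", "2020-03-01 12:00",
                     "2020-02-01 12:00", None],
    "resolve_time": ["2020-02-01 00:00", "2020-02-01 00:00",
                     "2020-06-01 00:00", "2020-06-01 00:00"],
})

# errors="coerce" turns missing or unparseable values into NaT instead of raising;
# utc=True makes every value timezone-aware (naive inputs are treated as UTC).
for col in ["timestamp", "resolve_time"]:
    forecasts[col] = pd.to_datetime(forecasts[col], errors="coerce", utc=True)

# Keep only forecasts with known times that were made before resolution.
known = forecasts["timestamp"].notna() & forecasts["resolve_time"].notna()
before_resolution = forecasts["timestamp"] < forecasts["resolve_time"]
valid = forecasts[known & before_resolution]

print(valid)
```

Treating naive timestamps as UTC is itself an assumption; if a platform reports local times, they would need to be localized first.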
- Create a pip package
- Add data from more platforms
  - Metaforecast API
  - Foretell (CSET)
  - Good Judgment Open
  - Hypermind
  - Augur
  - Foretold
  - Omen
  - GiveWell
  - Open Philanthropy Project
  - PredictIt
  - Elicit
  - PolyMarket
  - Iowa Electronic Markets
  - INFER
  - Manifold
  - Smarkets
  - The Odds API
Credits go to Arb Research for funding the first 80% of this work, and Misha Yagudin in particular for guidance and mentorship.