smkwray/geoluck

geoluck  

How much of relative country prosperity can be predicted from geography, natural endowments, resource development, and social structure — and who beats their geography?

Geoluck is an open-source research project that builds a country-decade panel (1900–2020) and trains machine learning models to predict country-level income, wellbeing, inequality, wealth, and gender outcomes from tiered feature sets. The results are published as an interactive static site.

This is explicitly about predictive association, not causal effect.

View the live site →


What the site shows

The static site models seven outcome metrics, each converted to within-decade percentile ranks:

| Outcome | Definition | Source |
| --- | --- | --- |
| Income | Log GDP per capita rank | Maddison Project Database 2023 |
| Wealth | Produced capital per capita rank | World Bank Changing Wealth of Nations |
| Life expectancy | Life expectancy at birth rank | World Bank WDI / UN Population Division |
| Inequality | Disposable-income Gini rank (higher = more unequal) | SWIID |
| Gender inequality | UNDP Gender Inequality Index rank (higher = more unequal) | UNDP HDR 2025 |
| Female LFPR | Female labor-force participation rate rank | World Bank WDI / ILO |
| Women, Business and the Law | Women, Business and the Law index rank | World Bank |
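
As a rough illustration of what "within-decade percentile rank" means, the sketch below ranks each country against the other countries observed in the same decade. It is not the pipeline's code: the function name `percentile_ranks`, the `(country, decade, value)` record layout, the 0 to 1 scaling, and the average-rank treatment of ties are all assumptions made for the example.

```python
from collections import defaultdict


def percentile_ranks(rows):
    """Return {(country, decade): rank in (0, 1]} computed within each decade.

    rows: iterable of (country, decade, value) tuples (hypothetical layout).
    Tied values share the average of the ranks they span (an assumption).
    """
    by_decade = defaultdict(list)
    for country, decade, value in rows:
        by_decade[decade].append((value, country))

    ranks = {}
    for decade, items in by_decade.items():
        items.sort()
        n = len(items)
        i = 0
        while i < n:
            # Find the run of tied values starting at position i.
            j = i
            while j + 1 < n and items[j + 1][0] == items[i][0]:
                j += 1
            avg_rank = (i + j) / 2 + 1  # 1-based average rank of the run
            for _, country in items[i : j + 1]:
                ranks[(country, decade)] = avg_rank / n
            i = j + 1
    return ranks
```

Because ranks are computed separately per decade, a value of 1.0 always means "highest in that decade", regardless of how the raw metric drifts over time.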

Predictor features are organized into three independently toggleable tiers:

  • Nature — Pure geography: latitude, climate normals, terrain, soil, malaria ecology, seismic activity, wind/solar potential, ocean productivity, cyclone exposure.
  • Infrastructure — Resource development: dams, irrigation, oil/gas/coal/mineral extraction, agricultural land use, energy assets.
  • Society — Social and institutional structure: governance, democracy, trade openness, colonial history, ethnic/religious fractionalization, gender inequality, demographics.

All seven non-empty tier combinations are modeled independently for each of the seven outcomes (49 model bundles). The site supports interactive choropleth maps, model comparison, country-level SHAP feature contributions, country-vs-country comparison, feature exploration by data source, full sortable rankings with CSV export, and shareable deep links.
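
Three toggleable tiers give 2^3 - 1 = 7 non-empty combinations. A minimal sketch of enumerating them follows; the lowercase tier keys and the function name `tier_combinations` are illustrative assumptions, not identifiers taken from the repository.

```python
from itertools import combinations

# Tier names come from the list above; the lowercase keys are an assumption.
TIERS = ("nature", "infrastructure", "society")


def tier_combinations(tiers=TIERS):
    """Return every non-empty subset of tiers, smallest subsets first."""
    return [
        combo
        for size in range(1, len(tiers) + 1)
        for combo in combinations(tiers, size)
    ]


for combo in tier_combinations():
    print("+".join(combo))
```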


Repository structure

src/               Python pipeline: ETL, feature building, modeling, export
web/               Static frontend: TypeScript, Vite, Leaflet, Chart.js
docs/              Methodology and payload documentation
web/public/data/   Precomputed JSON payloads consumed by the frontend

Data policy

Raw and intermediate research data are not stored in the public repository. Only compact, precomputed JSON payloads required by the static site are committed under web/public/data/. These are generated by the Python pipeline's export commands.


Modeling notes

  • Models are evaluated out of sample using cross-validated R², RMSE, MAE, and Spearman rank correlation.
  • User-facing predictions and residuals use cross-validated exports, not in-sample fits.
  • Feature contributions use SHAP values from fold-trained estimators.
  • Results should be interpreted as predictive structure, not causal effects. A high R² for Nature-only features means geography is a strong statistical predictor — likely because it correlates with deeper causal channels — not that geography causes prosperity.
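
The evaluation idea above can be sketched with the standard library alone: every row is predicted by a model fitted without that row's fold, and the four metrics are computed on those out-of-fold predictions. Everything here is an assumption for illustration. The one-feature least-squares line stands in for whatever estimator the pipeline actually uses, the contiguous fold split is arbitrary (the README does not say how folds are formed), and all function names are hypothetical.

```python
import math


def kfold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    for f in range(k):
        test = list(range(f * n // k, (f + 1) * n // k))
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        yield train, test


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (stand-in model)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return my - b * mx, b


def out_of_fold_predictions(xs, ys, k=5):
    """Predict each row with a model that never saw that row."""
    preds = [0.0] * len(xs)
    for train, test in kfold_indices(len(xs), k):
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            preds[i] = a + b * xs[i]
    return preds


def average_ranks(values):
    """1-based ranks, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for pos in order[i : j + 1]:
            ranks[pos] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; undefined (division by zero) for constant input."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def evaluate(y_true, y_pred):
    """R², RMSE, MAE, and Spearman correlation for out-of-fold predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "spearman": pearson(average_ranks(y_true), average_ranks(y_pred)),
    }
```

Calling `evaluate(ys, out_of_fold_predictions(xs, ys))` scores the model only on rows it did not train on. The per-row difference `y_true - y_pred` from the same predictions is presumably what the site reports as a residual, though the exact sign convention is not stated here.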

GitHub Pages deployment

The site is deployed through GitHub Actions, not "Deploy from a branch."

In repository Settings → Pages, set the source to GitHub Actions. The workflow builds the frontend from web/ and publishes the contents of web/dist/.


Local development

# Python pipeline
make sync        # Install/sync Python dependencies
make test        # Run tests

# Frontend
make web-build   # Build the static site (output: web/dist/)

The frontend expects JSON data under web/public/data/. These payloads are committed to the repository and are generated by:

uv run geoluck export-web-data
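
Before building the frontend it can be useful to confirm that the exported payloads are present and parse as JSON. The helper below is a hypothetical sketch, not part of the geoluck CLI. It assumes only the web/public/data/ location stated above, and makes no assumption about file names or payload schema.

```python
import json
from pathlib import Path


def check_payloads(data_dir="web/public/data"):
    """Parse every *.json file under data_dir; return {relative_path: byte_size}.

    Raises FileNotFoundError if the directory is missing or holds no JSON
    files, and json.JSONDecodeError if a payload is malformed.
    """
    root = Path(data_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} does not exist; run the export first")
    sizes = {}
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as fh:
            json.load(fh)  # only checks that the file is valid JSON
        sizes[str(path.relative_to(root))] = path.stat().st_size
    if not sizes:
        raise FileNotFoundError(f"no JSON payloads found under {root}")
    return sizes
```

Run from the repository root, `check_payloads()` returns a mapping of payload paths to sizes, which also gives a quick view of how compact the committed data is.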

For frontend development with hot reload:

cd web && npm run dev

Documentation

Methodology and payload documentation are in docs/.

License

MIT
