An interactive dark map of US data centers — operational, under construction, and planned — with editorial judgment calls on each facility (purpose-built vs speculative, AI vs general compute, operator type).
Live data is pulled from OpenStreetMap and from a curated layer of headline AI megacampus announcements, classified by Claude, and published as a static JSON file the site reads at runtime. The whole pipeline is refreshed daily.
OSM Overpass API ──fetch.py──▶ data/facilities.raw.json ─┐
news + trade RSS ──discover.py─▶ candidates ─(Claude)─┐ │
data/curated.json (curated + discovered campuses) ────┼──┤
data/classifications.json (editorial cache) ──────────┼──┼─build.py─▶ site/public/data/data-centers.json
pipeline/operators.json (operator lookup) ────────────┘ ┘ + build-meta.json
+ data/unclassified.json (worklist)
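The merge step in the diagram can be pictured roughly as follows. This is a minimal sketch, not the actual build.py: the function name `merge_facilities`, the `id` field, and the shape of the classification cache are assumptions made for illustration. It also assumes that facilities without a cached classification are held back and written to the worklist, which the README does not confirm.

```python
from typing import Any

Record = dict[str, Any]


def merge_facilities(
    raw: list[Record],
    curated: list[Record],
    classifications: dict[str, Record],
) -> tuple[list[Record], list[Record]]:
    """Return (published, unclassified) for one build.

    Hypothetical shapes: every record has an "id"; `classifications`
    maps a facility id to its cached editorial fields.
    """
    published: list[Record] = []
    unclassified: list[Record] = []
    for facility in raw:
        cached = classifications.get(facility["id"])
        if cached is None:
            # Not seen before: queue for classification (assumed behaviour).
            unclassified.append(facility)
            continue
        # Editorial fields override raw OSM fields on key collisions.
        published.append({**facility, **cached})
    # Curated campuses are assumed to carry their editorial fields already.
    published.extend(curated)
    return published, unclassified
```

Keeping the classification cache as a separate input means a daily rebuild only needs new judgment calls for facilities that have never been seen.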
Data sources for planned facilities (the hardest part — OSM barely maps them):
- pipeline/discover.py harvests Google News search feeds + trade press (DataCenterDynamics, DataCenterKnowledge, Bisnow, Data Center POST), filters to likely new US announcements, and writes a candidate worklist.
- During the daily refresh, Claude reads the candidates, keeps the genuine new projects, geocodes them (pipeline/geocode.py, OSM Nominatim), and adds them to data/curated.json, typically with status: planned, purpose: speculative.
- Feed URLs live in pipeline/sources.json (brittle by nature; re-verify if a feed goes quiet).
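The "filters to likely new US announcements" step described above could look something like the sketch below. It is not taken from discover.py: the keyword tuples, the function name `candidate_items`, and the output fields are hypothetical, and it only inspects item titles in an RSS 2.0 document that has already been downloaded.

```python
import xml.etree.ElementTree as ET

# Hypothetical keyword lists; the real filter rules are not shown in this README.
TOPIC_TERMS = ("data center", "data centre", "datacenter")
ANNOUNCE_TERMS = ("announce", "plans", "proposed", "breaks ground", "campus")


def candidate_items(rss_xml: str) -> list[dict[str, str]]:
    """Return RSS items whose title looks like a new data-center announcement."""
    root = ET.fromstring(rss_xml)
    candidates: list[dict[str, str]] = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        text = title.lower()
        on_topic = any(term in text for term in TOPIC_TERMS)
        is_announcement = any(term in text for term in ANNOUNCE_TERMS)
        if on_topic and is_announcement:
            candidates.append({"title": title, "link": link})
    return candidates
```

A keyword pass like this is deliberately loose: it only needs to shrink the feed to a worklist, since the keep-or-drop decision is made later in the refresh.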
Power-generation layer (EIA Form 860M): a toggleable second layer of ~6,200 US
power plants (all planned/under-construction + operating ≥25 MW, ~96% of capacity),
colored by fuel and sized by megawatts, to show the grid behind the buildout.
pipeline/fetch_power.py ingests it → site/public/data/power-plants.json. This is
a separate monthly step (EIA updates monthly; the workbook is ~14 MB) and is the
only part that needs a dependency (openpyxl) — the daily refresh stays stdlib-only.
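The selection rule stated above (every planned or under-construction plant, plus operating plants of at least 25 MW) can be written as a small predicate. This is a sketch under assumed field names (`status`, `capacity_mw`) and assumed status strings; the actual columns in the EIA workbook and in fetch_power.py may differ.

```python
from typing import Any

MIN_OPERATING_MW = 25.0  # threshold stated in the README


def keep_plant(plant: dict[str, Any]) -> bool:
    """Decide whether a plant record belongs in the map layer.

    Assumed record shape: {"status": str, "capacity_mw": float}.
    """
    status = plant["status"]
    if status in ("planned", "under_construction"):
        # Future capacity is always kept, regardless of size.
        return True
    return status == "operating" and plant["capacity_mw"] >= MIN_OPERATING_MW
```

For example, `[p for p in plants if keep_plant(p)]` would drop a 5 MW operating plant but keep a 5 MW planned one.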
Insights & geographic aggregation: an "Insights" dashboard ranks states by
data-center count vs. generation GW (supply vs. demand), with type/fuel/workload
breakdowns, and an optional state choropleth shades the map by either metric.
build.py backfills each facility's state via point-in-polygon against
pipeline/us-states.geojson (OSM only populates ~half) and emits
site/public/data/us-states.geojson for the choropleth.
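A point-in-polygon backfill of this kind can be done with the standard library alone. The sketch below uses the classic ray-casting test against GeoJSON Polygon and MultiPolygon geometries. It is an illustration, not the code in build.py: the function names and the `name` property key are assumptions, and points exactly on a border are not handled specially.

```python
from typing import Any, Optional

Ring = list[list[float]]  # GeoJSON linear ring: [[lon, lat], ...]


def point_in_ring(lon: float, lat: float, ring: Ring) -> bool:
    """Ray casting: count edge crossings of a ray going east from the point."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i][0], ring[i][1]
        xj, yj = ring[j][0], ring[j][1]
        if (yi > lat) != (yj > lat):  # edge straddles the point's latitude
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def point_in_polygon(lon: float, lat: float, polygon: list[Ring]) -> bool:
    """First ring is the exterior; any further rings are holes."""
    if not polygon or not point_in_ring(lon, lat, polygon[0]):
        return False
    return not any(point_in_ring(lon, lat, hole) for hole in polygon[1:])


def state_for_point(lon: float, lat: float, states: dict[str, Any]) -> Optional[str]:
    """Return the name of the first state feature containing the point."""
    for feature in states["features"]:
        geom = feature["geometry"]
        if geom["type"] == "Polygon":
            polygons = [geom["coordinates"]]
        elif geom["type"] == "MultiPolygon":
            polygons = geom["coordinates"]
        else:
            continue  # other geometry types are ignored in this sketch
        if any(point_in_polygon(lon, lat, p) for p in polygons):
            return feature["properties"]["name"]  # "name" key is assumed
    return None
```

A linear scan over roughly fifty state features per facility is cheap enough for a daily batch build, so the sketch skips any spatial index.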
- No backend, no database. The "data store" is site/public/data/data-centers.json, committed to the repo.
- Front-end: Vite + TypeScript + MapLibre GL JS, OpenFreeMap dark basemap (no API key).
- Hosting: GitHub Pages, served from the gh-pages branch. scripts/deploy.sh builds the site and publishes it there; the process is fully self-contained and needs no CI. (docs/deploy-actions.yml.example is an alternative GitHub Actions workflow if you prefer CI deploys; it needs a token with workflow scope.)
# 1. Pull + build the data (Python stdlib only, no pip install)
python3 pipeline/fetch.py # pull OSM Overpass -> data/facilities.raw.json
python3 pipeline/build.py # merge -> site/public/data/data-centers.json
python3 pipeline/validate.py # sanity-check the output
# 2. (Optional, monthly) refresh the power-generation layer
pip install -r pipeline/requirements.txt
python3 pipeline/fetch_power.py # EIA-860M -> site/public/data/power-plants.json
# 3. Run the site
cd site
npm install
npm run dev # http://localhost:5173

The refresh runs through Claude Code (see scripts/refresh.md for the exact checklist).
It pulls fresh OSM data, classifies any newly-seen facilities, re-checks the curated
megacampuses, rebuilds the JSON, commits the data, and republishes the site.
A launchd job (scripts/refresh-cron.sh) runs this daily on the local machine:
./scripts/refresh-cron.sh # one full refresh + deploy cycle

Manual deploy any time:
./scripts/deploy.sh # build + publish to gh-pages

- Facility locations: © OpenStreetMap contributors, ODbL.
- Basemap: OpenFreeMap © OpenMapTiles.
- Status, capacity, and classifications are editorial estimates generated from public sources; treat them as informed approximations, not authoritative records.