Coal2Core - A Futuristic View on Saving our Planet

Ranking retired U.S. coal plant sites for Small Modular Reactor conversion using machine learning

Next.js Python Mapbox Tailwind


The Problem

Global warming is accelerating. Coal plants across the U.S. are retiring. At the same time, AI-driven electricity demand is exploding, and we're still heavily reliant on fossil fuels.

The U.S. Department of Energy has found that ~80% of screened coal sites are physically suitable for advanced nuclear. Massachusetts Governor Maura Healey recently launched an initiative to build 10 GW of new energy resources, including nuclear power. The opportunity is right in front of us.

Coal2Core is a data-driven platform that scores 374 U.S. coal plant sites for Small Modular Reactor (SMR) conversion viability, combining symbolic regression, SVR machine learning, Monte Carlo stress-testing, and financial modeling into an interactive map.


What It Does

  • ML scoring: Evaluates every coal site on grid infrastructure, cooling water access, terrain, population buffer, seismic risk, and state regulatory policy
  • Interactive map: Visualizes all 374 plants with color-coded suitability tiers on a dark Mapbox GL canvas
  • Financial modeling: Calculates CapEx, LCOE, and 40-year NPV per site across three scenarios (Optimistic / Base / Pessimistic)
  • Monte Carlo robustness: 1,000-iteration simulation validates top sites under operational and economic uncertainty
  • Environmental impact: Estimates annual CO₂ reduction and potential AI data center capacity per plant
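As a rough illustration of the financial modeling bullet above, the sketch below computes a 40-year NPV and an LCOE using the standard discounted-cash-flow definitions. It is not the project's actual model: the function names, the constant annual cash flows, and the per-scenario discount rates are all assumptions made for the example.

```python
# Minimal sketch of NPV and LCOE under constant annual figures.
# All names and numbers here are hypothetical, not taken from Coal2Core.

def npv(capex: float, annual_net_cash_flow: float,
        discount_rate: float, years: int = 40) -> float:
    """Net present value: discounted yearly cash flows minus upfront CapEx."""
    discounted = sum(
        annual_net_cash_flow / (1 + discount_rate) ** t
        for t in range(1, years + 1)
    )
    return -capex + discounted


def lcoe(capex: float, annual_opex: float, annual_mwh: float,
         discount_rate: float, years: int = 40) -> float:
    """Levelized cost of energy: discounted lifetime cost / discounted energy."""
    discount_sum = sum(1 / (1 + discount_rate) ** t for t in range(1, years + 1))
    return (capex + annual_opex * discount_sum) / (annual_mwh * discount_sum)


# Illustrative discount rates per scenario (placeholders, not project values).
SCENARIOS = {"Optimistic": 0.05, "Base": 0.07, "Pessimistic": 0.10}

for name, rate in SCENARIOS.items():
    # Placeholder inputs in arbitrary currency units.
    print(name, round(npv(capex=1000.0, annual_net_cash_flow=120.0, discount_rate=rate), 1))
```

A higher discount rate shrinks the present value of later cash flows, which is why the pessimistic scenario yields the lowest NPV for the same inputs.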

The Model

We used PySR (symbolic regression) to derive an interpretable scoring equation from DOE/OR-SAGE rubric criteria, then trained a Support Vector Regression (SVR) model that outperformed ElasticNet, BayesianRidge, and PolynomialLasso in 5-fold nested cross-validation.

R² = 0.965

The model evaluates 10 continuous features, including capacity, distance to water, distance to transmission infrastructure, population density, seismic risk, and floodplain risk. Missing values are filled by median imputation across the 374 validated coal plants.
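The setup described above (median imputation, an RBF-kernel SVR, 5-fold nested cross-validation) might look roughly like the following in scikit-learn. This is a sketch on synthetic data, not the notebook code: the random features, the target, and the hyperparameter grid are assumptions chosen only to make the example runnable.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the real dataset: 374 rows, 10 continuous features.
rng = np.random.default_rng(0)
X = rng.normal(size=(374, 10))
y = X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.normal(size=374)
X[rng.random(X.shape) < 0.05] = np.nan  # knock out ~5% of values

# Median imputation -> scaling -> RBF SVR, all inside one pipeline so that
# imputation and scaling statistics are learned on training folds only.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    StandardScaler(),
    SVR(kernel="rbf"),
)

# Hypothetical hyperparameter grid; the project's actual grid is not stated.
param_grid = {"svr__C": [1.0, 10.0], "svr__gamma": ["scale", 0.01]}

inner_cv = KFold(n_splits=5, shuffle=True, random_state=0)  # tunes hyperparameters
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)  # estimates generalization

search = GridSearchCV(model, param_grid, cv=inner_cv, scoring="r2")
outer_scores = cross_val_score(search, X, y, cv=outer_cv, scoring="r2")
print("outer-fold R^2:", np.round(outer_scores, 3))
```

Nesting the grid search inside the outer folds keeps hyperparameter tuning from leaking into the reported score, which is the usual reason for choosing nested cross-validation on a dataset this small.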

Ground-truth scores weight the following factors:

Factor                              Weight
Population proximity                45%
Dedicated cooling infrastructure    20%
Nameplate capacity                  15%
Light Water Reactor heritage        15%
High-voltage grid (≥230 kV)         5%
Ash impoundment penalty             -10%
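To make the weighting concrete, here is a minimal sketch of how such a rubric score could be computed. The weights come from the table above; everything else is an assumption for illustration, in particular that each factor is normalized to the range 0 to 1 and that the ash impoundment penalty is a flat subtraction. The helper name ground_truth_score is hypothetical.

```python
# Weights copied from the table; factor keys are hypothetical names.
WEIGHTS = {
    "population_proximity": 0.45,
    "cooling_infrastructure": 0.20,
    "nameplate_capacity": 0.15,
    "lwr_heritage": 0.15,
    "high_voltage_grid": 0.05,
}
ASH_PENALTY = 0.10  # assumed to be a flat deduction when an impoundment exists


def ground_truth_score(factors: dict[str, float], has_ash_impoundment: bool) -> float:
    """Weighted sum of factors (each assumed in [0, 1]) minus the ash penalty."""
    for name in WEIGHTS:
        if not 0.0 <= factors[name] <= 1.0:
            raise ValueError(f"{name} must be normalized to [0, 1]")
    score = sum(WEIGHTS[name] * factors[name] for name in WEIGHTS)
    if has_ash_impoundment:
        score -= ASH_PENALTY
    return score


# A site that maxes out every factor but has an ash impoundment.
perfect = {name: 1.0 for name in WEIGHTS}
print(round(ground_truth_score(perfect, has_ash_impoundment=True), 2))
```

Under these assumptions the five positive weights sum to 100%, so the penalty caps a site with an ash impoundment at 90% of the maximum score.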

Tech Stack

Frontend

  • Next.js 16 + React 19 (App Router, Turbopack)
  • Mapbox GL JS 3 - interactive plant map
  • Tailwind CSS 4 - dark-theme UI
  • KaTeX - formula rendering on methodology pages
  • TypeScript - end-to-end type safety

ML & Data Pipeline

  • Python - pandas, scikit-learn, numpy
  • PySR - symbolic regression (Julia backend)
  • SVR (RBF kernel) - final predictive model
  • Monte Carlo simulation - 1,000-iteration robustness testing
  • Jupyter Notebooks - exploratory analysis and pipeline documentation
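One common way to run the kind of Monte Carlo robustness test listed above is to perturb each site's inputs, rescore, and count how often a site stays in the top k. The sketch below shows that idea using only the standard library. The function top_k_frequency, the multiplicative Gaussian noise, and the toy scoring function are all assumptions, and the project's simulation may perturb different quantities.

```python
import random
from typing import Callable


def top_k_frequency(
    sites: dict[str, list[float]],
    score_fn: Callable[[list[float]], float],
    k: int,
    iterations: int = 1000,
    noise: float = 0.1,
    seed: int = 0,
) -> dict[str, float]:
    """Fraction of iterations in which each site ranks in the top k
    after its features are scaled by independent Gaussian noise."""
    rng = random.Random(seed)  # seeded for reproducibility
    counts = {name: 0 for name in sites}
    for _ in range(iterations):
        noisy_scores = {
            name: score_fn([x * (1 + rng.gauss(0, noise)) for x in features])
            for name, features in sites.items()
        }
        ranked = sorted(noisy_scores, key=noisy_scores.get, reverse=True)
        for name in ranked[:k]:
            counts[name] += 1
    return {name: count / iterations for name, count in counts.items()}


# Toy example: three made-up sites scored by the sum of two features.
toy_sites = {"A": [10.0, 10.0], "B": [1.0, 1.0], "C": [0.5, 0.5]}
freq = top_k_frequency(toy_sites, score_fn=sum, k=1, noise=0.05)
print(freq)
```

A site whose top-k frequency stays close to 1.0 is robust to the assumed noise level, while a site that drops out of the top k in many iterations owes its rank to a narrow set of input values.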

Project Structure

Coal2Core/
├── src/
│   ├── app/          # Next.js pages (map, methodologies, etc.)
│   ├── components/   # React UI components
│   ├── lib/          # Data and TypeScript types
│   └── public/       # Images and static assets
├── ml_and_data_pipeline/
│   ├── training_pipeline.ipynb
│   ├── testing_pipeline.ipynb
│   ├── figures/      # Generated charts
│   └── *.csv         # Site datasets and model outputs
├── next.config.ts
└── package.json

Getting Started

Prerequisites

  • Node.js 20.9+ (the minimum supported by Next.js 16)
  • A Mapbox account (free tier works)

Setup

git clone https://github.com/HarryJ12/Coal2Core.git
cd Coal2Core
npm install

Create a .env.local file:

NEXT_PUBLIC_MAPBOX_TOKEN=your_mapbox_token_here

Then start the development server:

npm run dev

Open http://localhost:3000.

Running the ML Pipeline

Open the notebooks in order:

  1. ml_and_data_pipeline/testing_pipeline.ipynb - feature engineering & symbolic regression
  2. ml_and_data_pipeline/training_pipeline.ipynb - SVR training, validation, Monte Carlo, financial output

Requires: pandas, scikit-learn, numpy, pysr, matplotlib


Key Results

  • 374 U.S. coal plant sites scored
  • Top 20 Monte Carlo-validated sites identified as robust under uncertainty
  • R² = 0.965 on held-out validation set
  • Individual sites can cut 1M–19M tons of CO₂ per year when converted
  • Each site can support 5–16 AI data center campuses (at 150 MW/campus)
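The campus figure above is simple arithmetic once a per-campus load is fixed. The sketch below assumes the count is the available power divided by 150 MW, rounded down. The helper name and the choice to round down are assumptions, and the project may derive available power differently.

```python
import math

CAMPUS_MW = 150  # per-campus load stated in the results above


def campuses_supported(available_mw: float, campus_mw: float = CAMPUS_MW) -> int:
    """Number of whole campuses a given power output can supply (rounded down)."""
    if available_mw < 0 or campus_mw <= 0:
        raise ValueError("available_mw must be >= 0 and campus_mw must be > 0")
    return math.floor(available_mw / campus_mw)


# Under this assumption, the 5-16 campus range corresponds to
# roughly 750 MW up to 2,400 MW of available power.
print(campuses_supported(750), campuses_supported(2400))
```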

Data Sources


License

Copyright © 2026 Coal2Core. All rights reserved.


Built by the Coal2Core Team at Tufts Data Science Club - NSDC Datathon 2026
