docs: comprehensive data catalog and gap analysis (#2)
896-line catalog covering:
- 5 currently available datasets with full schema docs
- 14 categories of planned datasets (40+ sources) with URLs
- Gap analysis prioritized by impact (high/medium/lower)
- Quick reference URL table for all 30+ data sources

Key sources identified: EIA-860/861/923/930, HIFLD, EPA eGRID/CEMS, NREL (URDB, NSRDB, ATB), DOE AFDC, LBNL trackers, gridstatus.io, ISO queue portals, and PUDL (Catalyst Cooperative) as a meta-source.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cab765037c
| **File Count** | ~3,000 files |
| **Naming Convention** | `{eiaId}.json` (utilities), `iso-{shortName}.json` (ISOs), `cca-{slug}.json` (CCAs), `ba-{slug}.json` (BAs) |
| **Source** | HIFLD ArcGIS (service territories + control areas) + CEC ArcGIS (CCAs) |
| **Format** | GeoJSON (compact, in `public/data/territories/`) |
Point territory file path to actual dataset directory
This entry documents territory GeoJSON as living in public/data/territories/, but in this repository the tracked dataset and contributor workflows use data/territories/ (see README.md and docs/CONTRIBUTING.md, plus the committed files under data/territories). Leaving this path incorrect will send contributors to a non-existent location and cause avoidable failures when they try to inspect or update boundary files.
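For illustration, the naming conventions in the table above can be captured in a small helper. This is a hypothetical sketch (the function name is invented, not part of the repo), using the repository's actual `data/territories/` directory rather than the incorrect documented path:

```typescript
type TerritoryKind = "utility" | "iso" | "cca" | "ba";

// Hypothetical helper mirroring the documented file-naming convention.
// The directory is data/territories/, per the tracked dataset in this repo.
function territoryFile(kind: TerritoryKind, id: string): string {
  const dir = "data/territories";
  switch (kind) {
    case "utility":
      return `${dir}/${id}.json`; // {eiaId}.json
    case "iso":
      return `${dir}/iso-${id}.json`; // iso-{shortName}.json
    case "cca":
      return `${dir}/cca-${id}.json`; // cca-{slug}.json
    case "ba":
      return `${dir}/ba-${id}.json`; // ba-{slug}.json
  }
}
```

For example, `territoryFile("iso", "caiso")` resolves to `data/territories/iso-caiso.json`.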
First PR in the substations 9th-entry-point sequence. Adds the substations
table with EIA/OSM source tracking, voltage ranges, substation type/status,
and soft-delete columns mirroring power_plants + ev_stations patterns.
Columns:
- id, slug (unique), name
- owner_name, owner_utility_id (FK → utilities.id, ON DELETE SET NULL)
- state, county, latitude, longitude
- geography(Point, 4326), geometry(Point, 4326) — PostGIS
- min_voltage_kv, max_voltage_kv
- substation_type ('transmission' | 'distribution' | 'hybrid' | 'unknown')
- status ('in_service' | 'out_of_service' | 'planned' | 'retired' | 'unknown')
- source ('eia' | 'osm' | 'manual' | 'hybrid') — lineage for ODbL attribution
- source_url, eia_id, osm_id, hifld_legacy_id
- search_vector (tsvector), locked_status
- submitted_by, reviewed_at, reviewed_by
- created_at, updated_at, deleted_at, version (soft-delete audit block)
Indexes: 8 btree (slug, owner_utility_id, state, substation_type, status,
source, eia_id, osm_id), 3 spatial (GIST/SPGIST on geography + GIST on
geometry), 2 FTS (GIN on search_vector + GIN trigram on name).
Migration applied to production Neon: substations table created with 0 rows.
No data sync yet — that comes in PR #2 (meridian/substations-sync).
Part of: Substations rollout (PR 1/9)
Research: memory/specs/ninth-entry-point-research.md
Co-authored-by: texture-coding-agent <coding-agent@texturehq.com>
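As a reader aid, the column list above roughly implies the following row shape. This is a sketch inferred from the description; the camelCase names, nullability choices, and the guard function are assumptions, not the actual migration:

```typescript
type SubstationType = "transmission" | "distribution" | "hybrid" | "unknown";
type SubstationStatus =
  | "in_service" | "out_of_service" | "planned" | "retired" | "unknown";

const SUBSTATION_SOURCES = ["eia", "osm", "manual", "hybrid"] as const;
type SubstationSource = (typeof SUBSTATION_SOURCES)[number];

// Sketch of a substations row; nullability is guessed where the text is silent.
// PostGIS geography/geometry, search_vector, and locked_status are omitted
// since they are database-side columns.
interface SubstationRow {
  id: number;
  slug: string; // unique
  name: string;
  ownerName: string | null;
  ownerUtilityId: number | null; // FK → utilities.id, ON DELETE SET NULL
  state: string | null;
  county: string | null;
  latitude: number | null;
  longitude: number | null;
  minVoltageKv: number | null;
  maxVoltageKv: number | null;
  substationType: SubstationType;
  status: SubstationStatus;
  source: SubstationSource; // lineage for ODbL attribution
  sourceUrl: string | null;
  eiaId: string | null;
  osmId: string | null;
  hifldLegacyId: string | null;
  submittedBy: string | null;
  reviewedAt: Date | null;
  reviewedBy: string | null;
  createdAt: Date;
  updatedAt: Date;
  deletedAt: Date | null; // soft delete
  version: number;
}

// Narrow an untrusted string to the source enum, e.g. when ingesting rows.
function isSubstationSource(v: string): v is SubstationSource {
  return (SUBSTATION_SOURCES as readonly string[]).includes(v);
}
```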
…(ALL-733)
Problem: The /utilities list endpoint and /utilities/{slug} detail endpoint
used different code paths for sparse-fieldset projection. The list route had
a local selectFields helper; the detail route had no sparse-fieldset support
at all. This meant:
- Detail endpoint couldn't honor ?fields= (returned the full shape).
- List endpoint's sparse projection wasn't reused anywhere else.
- No invariant that list and detail produce the same per-record shape
when given the same inputs.
Morgan caught the end-user impact in the Relay recon (2026-05-06, bug #2):
for a 3,133-utility sync at the Registered 5k/hr tier, having to fall back
to list-then-detail was ~38 min instead of ~2 sec.
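The ~38 min figure checks out with simple rate-limit arithmetic (a sketch assuming one detail request per utility and ignoring the handful of list-page requests):

```typescript
// With no ?fields= support on the detail route, a full sync must fetch each
// utility individually, throttled by the 5,000 req/hr Registered tier.
const utilities = 3133;
const requestsPerHour = 5000;
const fallbackMinutes = (utilities / requestsPerHour) * 60;
// ≈ 37.6 minutes, i.e. the "~38 min" above; a projected list response
// covers the same data in seconds.
```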
Fix:
- Hoist selectFields + parseFieldsParam into lib/api/public-response.ts so
every public route uses the same serializer pipeline.
- Extend publicJsonResponse and publicPaginatedResponse with an optional
{ fields } option (accepts raw ?fields= string or pre-parsed string[]).
- Enforce order: stripInternal → selectFields. Internal fields can never be
resurrected via ?fields=searchVector etc.
- Wire ?fields= into /utilities/{slug}.
- Swap the ad-hoc selectFields in /utilities route.ts for the shared helpers.
Regression tests (lib/api/__tests__/public-response.test.ts):
- parseFieldsParam: null/empty/whitespace handling, de-duping w/ preserved
order.
- selectFields: existing-keys-only, null/0 preserved, non-object passthrough.
- publicJsonResponse + publicPaginatedResponse with ?fields=.
- Internal-field resurrection guard.
- List/detail shape parity: same keys, same numeric values, same ?fields=
projection across both envelopes.
Fixes ALL-733
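A behavioral sketch of the shared helpers this change describes. Signatures and the INTERNAL_FIELDS contents are assumptions for illustration, not the actual lib/api/public-response.ts code:

```typescript
// Parse a raw ?fields= value: null/empty/whitespace means no projection;
// otherwise a de-duplicated list with first-occurrence order preserved.
function parseFieldsParam(raw: string | null): string[] | null {
  if (raw === null) return null;
  const fields = raw
    .split(",")
    .map((f) => f.trim())
    .filter((f) => f.length > 0);
  return fields.length === 0 ? null : [...new Set(fields)];
}

// Project a record onto the requested keys. Only keys that exist survive;
// present-but-falsy values (null, 0, "") are kept; non-objects pass through.
function selectFields(record: unknown, fields: string[] | null): unknown {
  if (fields === null) return record;
  if (typeof record !== "object" || record === null) return record;
  const src = record as Record<string, unknown>;
  const out: Record<string, unknown> = {};
  for (const key of fields) {
    if (key in src) out[key] = src[key];
  }
  return out;
}

// Assumed internal-column list for the sketch.
const INTERNAL_FIELDS = new Set(["searchVector", "lockedStatus"]);

function stripInternal(
  record: Record<string, unknown>
): Record<string, unknown> {
  return Object.fromEntries(
    Object.entries(record).filter(([k]) => !INTERNAL_FIELDS.has(k))
  );
}

// Order matters: stripInternal runs first, so ?fields=searchVector cannot
// resurrect an internal field; the key is gone before projection happens.
function serialize(record: Record<string, unknown>, rawFields: string | null) {
  return selectFields(stripInternal(record), parseFieldsParam(rawFields));
}
```

Running stripInternal before selectFields is what makes `?fields=searchVector` yield an empty projection rather than leaking the internal column, and reusing one pipeline in both envelopes is what gives list/detail shape parity for free.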
…(ALL-733) (#208)
Co-authored-by: texture-coding-agent <coding-agent@texturehq.com>
Data Catalog
Adds `docs/DATA_CATALOG.md` — an 896-line comprehensive catalog of open grid data sources.

What's in it
Currently Available (5 datasets):
Planned / In Progress (14 categories, 40+ sources):
Gap Analysis:
Key insight: PUDL (Catalyst Cooperative) already processes most federal datasets into analysis-ready SQLite/Parquet — worth using as an upstream source.
Quick reference
All 30+ data source URLs compiled in one table at the bottom.