
AtlasStack

AtlasStack is a modular ingestion and validation stack for UK energy and infrastructure datasets.

It converts unstable external APIs into deterministic, schema-controlled, test-validated inputs for analytics, forecasting, and ML systems.



Why AtlasStack Exists

Public energy datasets (NESO, ESO, weather APIs, interconnector feeds) are predominantly:

  • poorly versioned
  • weakly typed
  • prone to silent schema drift
  • inconsistent in cadence
  • rarely testable

AtlasStack treats ingestion as engineering.

It enforces structure, typing, cadence, and validation before data is allowed into analytics or ML layers.

If the foundation is unreliable, every forecast built on top of it is suspect.


System Architecture

Core Data Lineage

```mermaid
flowchart LR
  EXT[External APIs] --> BR[Bronze<br/>Raw JSONL<br/>dt partitions]
  BR --> SL[Silver<br/>Typed Parquet<br/>schema enforced]
  SL --> STG[dbt staging models]
  STG --> DIM[dim_date]
  STG --> FCT[fct_observations]
  FCT --> MART[mart_daily_summary]
```

Bronze

  • Raw API responses
  • Append-only
  • Partitioned by dt=YYYY-MM-DD
  • Never mutated / no transformation logic
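The append-only, dt-partitioned bronze layout can be sketched as a small writer. This is a minimal illustration, not the repository's actual implementation; the `write_bronze` name, the `records.jsonl` filename, and the per-source subdirectory are assumptions.

```python
import json
from datetime import date
from pathlib import Path

def write_bronze(records: list[dict], source: str, dt: date,
                 root: Path = Path("data/bronze")) -> Path:
    """Append raw API records to a dt-partitioned JSONL file.

    Files are only ever appended to, never rewritten, so the bronze
    layer stays immutable; no transformation is applied here.
    """
    partition = root / source / f"dt={dt.isoformat()}"
    partition.mkdir(parents=True, exist_ok=True)
    path = partition / "records.jsonl"
    with path.open("a", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
    return path
```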

Silver

  • Strict typing
  • Normalized timestamps
  • Explicit schema enforcement
  • Ready for dbt consumption
  • Deterministic partitioning
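Schema enforcement at the silver boundary can be sketched with pandas. The column names (`settlement_ts`, `demand_mw`) and the `to_silver` helper are illustrative assumptions; the real contract lives in the repository.

```python
import pandas as pd

# Assumed column names, for illustration only.
REQUIRED = ("settlement_ts", "demand_mw")

def to_silver(raw: pd.DataFrame) -> pd.DataFrame:
    """Enforce the silver contract: required columns, strict types,
    UTC timestamps, deterministic ordering."""
    missing = set(REQUIRED) - set(raw.columns)
    if missing:
        # Fail loudly on schema drift instead of passing bad data downstream.
        raise ValueError(f"schema drift: missing columns {sorted(missing)}")
    out = raw.loc[:, list(REQUIRED)].copy()
    out["settlement_ts"] = pd.to_datetime(out["settlement_ts"], utc=True)
    out["demand_mw"] = out["demand_mw"].astype("float64")
    return out.sort_values("settlement_ts").reset_index(drop=True)
```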

Warehouse (Current: DuckDB)

  • Local analytical engine
  • dbt transformations
  • Fact and dimension modelling
  • Explicit data tests

Architecture Diagram

```mermaid
flowchart TB

  %% Sources
  subgraph Sources
    NESO[NESO Demand API]
    WEATHER[Open-Meteo Weather API]
  end

  %% Orchestration
  subgraph Orchestration
    PREFECT[Prefect Flow]
  end

  %% Storage
  subgraph Storage
    BRONZE[Bronze JSONL<br/>data/bronze]
    SILVER[Silver Parquet<br/>data/silver]
  end

  %% Transform
  subgraph Transform
    DUCK[(DuckDB Warehouse)]
    DBT[dbt Models]
    MARTS[Marts]
  end

  %% Quality
  subgraph Validation
    PYTEST[Unit Tests]
    DBTTEST[dbt Data Tests]
    CI[GitHub Actions]
  end

  %% Future
  subgraph Optional_Cloud_Extension
    S3[S3 Object Storage]
    SNOW[Snowflake Warehouse]
  end

  NESO --> PREFECT
  WEATHER --> PREFECT

  PREFECT --> BRONZE
  BRONZE --> SILVER

  SILVER --> DUCK
  DUCK --> DBT
  DBT --> MARTS

  PYTEST --> CI
  DBTTEST --> CI

  BRONZE -. storage swap .-> S3
  SILVER -. storage swap .-> S3
  DUCK -. warehouse swap .-> SNOW
```

CLI Usage

Run the pipeline for the last N days:

```bash
atlasstack run --days 3
```

This will:

  • Extract NESO demand data to the bronze layer
  • Extract Open-Meteo weather data to the bronze layer
  • Build silver layers
  • Execute dbt build
  • Produce validated data marts

Check CLI options:

```bash
atlasstack --help
atlasstack run --help
```
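The `run --days N` surface above could be wired up with stdlib `argparse`. This is a hypothetical sketch of the interface only; the package's actual CLI implementation may differ.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Sketch of the CLI surface shown above (names are illustrative)."""
    parser = argparse.ArgumentParser(prog="atlasstack",
                                     description="Run the AtlasStack pipeline")
    sub = parser.add_subparsers(dest="command", required=True)
    run = sub.add_parser("run", help="Extract, build silver, and run dbt")
    run.add_argument("--days", type=int, default=1,
                     help="Number of trailing days to ingest")
    return parser

args = build_parser().parse_args(["run", "--days", "3"])
```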

Design Principles

AtlasStack is governed by the following engineering constraints:

  1. Determinism over convenience: The same date range always produces identical outputs.

  2. Immutability: Bronze data is append-only and never mutated. Corrections happen in downstream layers.

  3. Explicit schema contracts: All external data is normalised and typed before consumption. Cadence and null thresholds are enforced.

  4. Loud failure: CI fails on schema drift, cadence breaks, or coverage degradation.

  5. Layered testing: Unit tests cover extractors, data tests cover marts, and CI validates the full stack.

  6. Infrastructure focus: Analytics are secondary to foundational reliability.
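The determinism constraint is checkable: hash every file two runs produce for the same date range and compare digests. The `tree_digest` helper below is an illustrative sketch, not part of the project's test suite.

```python
import hashlib
from pathlib import Path

def tree_digest(root: Path) -> str:
    """Digest every file under `root` in sorted order, so two pipeline
    runs over the same date range can be compared byte-for-byte."""
    h = hashlib.sha256()
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Include the relative path so renames change the digest too.
            h.update(path.relative_to(root).as_posix().encode())
            h.update(path.read_bytes())
    return h.hexdigest()
```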


Development Workflow

Install Locally:

```bash
pip install -e ".[dev]"
```

Run Lint and Tests

```bash
ruff check .
pytest
```

Run the CI bootstrap locally:

```bash
python scripts/ci_bootstrap.py
```

Run dbt Manually:

```bash
cd dbt/atlasstack_dbt
dbt build --no-partial-parse --profiles-dir .
```

What a Successful Run Looks Like

A successful pipeline run produces:

  • Partitioned bronze JSONL files
  • Partitioned silver Parquet files
  • Passing dbt tests
  • A valid fct_observations table with:
    • Half-hour cadence
    • Enforced weather coverage thresholds
    • Unique settlement timestamps

If dbt and pytest are green, the ingestion layer is behaving as expected for the tested range.
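The half-hour cadence and uniqueness checks on `fct_observations` can be expressed as a small validator. This sketch mirrors the dbt data tests in plain Python; the function name and error messages are illustrative.

```python
from datetime import datetime, timedelta

def check_half_hour_cadence(timestamps: list[datetime]) -> None:
    """Fail loudly unless timestamps are unique, half-hour-spaced
    settlement observations."""
    if len(set(timestamps)) != len(timestamps):
        raise ValueError("duplicate settlement timestamps")
    ordered = sorted(timestamps)
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier != timedelta(minutes=30):
            raise ValueError(f"cadence break between {earlier} and {later}")
```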


Storage Backends

AtlasStack abstracts its storage layer so backends can be swapped without touching pipeline logic.

Default (Local)

  • Bronze: data/bronze/
  • Silver: data/silver/
  • Warehouse: DuckDB

Optional (Cloud-ready)

  • Bronze/Silver → S3
  • Warehouse → Snowflake

The cloud infrastructure is scaffolded but not required to run the project.

The entire stack runs locally without cloud billing dependencies.
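One way to read the "storage swap" arrows in the diagram is as a small interface: local disk and S3 are interchangeable implementations of the same read/write contract. The `BlobStore` protocol and `LocalStore` class below are a hypothetical sketch, not the project's actual abstraction.

```python
from pathlib import Path
from typing import Protocol

class BlobStore(Protocol):
    """Minimal storage contract; an S3-backed class would implement
    the same two methods to enable the storage swap."""
    def write(self, key: str, data: bytes) -> None: ...
    def read(self, key: str) -> bytes: ...

class LocalStore:
    """Default backend: keys map to files under a local root."""
    def __init__(self, root: Path) -> None:
        self.root = root

    def write(self, key: str, data: bytes) -> None:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def read(self, key: str) -> bytes:
        return (self.root / key).read_bytes()
```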


Roadmap

Short-Term

  • Prefect deployment to managed orchestration (schedules and retries)
  • Run metadata: structured run reports, row counts, and freshness markers

Mid-Term

  • Move bronze/silver to S3 (partitioned object storage)
  • Switch warehouse target from DuckDB to Snowflake (raw, staging, and marts)
  • CI runs dbt against a temporary Snowflake schema (PR validation)

Long-Term

  • Minimal Terraform: S3 bucket and Snowflake roles/permissions
  • Dataset contracts and schema drift alerts (contract breaks fail CI)
