Evaluate alternative ERA5-Land hourly data sources #35

@NewGraphEnvironment

Question

Is there a better source than CDS for ERA5-Land hourly 1950-2025 over BC?

Why this data and resolution matter

ERA5-Land is the gold-standard reanalysis dataset for land-surface climate. It fits our needs in a way no alternative does, for five reasons:

1. Hourly resolution at 9 km grid spacing, 1950-present

  • No other public dataset combines that temporal depth (76 years) with that spatial fidelity over land
  • Daily-only products lose the diurnal cycle — cannot compute true daily max/min, evapotranspiration, or fire weather indices without hourly data
  • Most "long records" are station-based with huge spatial gaps; ERA5-Land fills them in a physically consistent way
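To make the diurnal-cycle point concrete, here is a minimal sketch of deriving daily min/max/mean from an hourly series. It assumes timestamps arrive in UTC (as ERA5-Land's do) and uses a fixed UTC-8 offset as a stand-in for BC local time. The function name `daily_extremes`, the offset constant, and the temperature values are illustrative, not part of any existing package.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-8 offset stands in for BC local time.
# A real pipeline would need an explicit time-zone policy.
LOCAL_OFFSET = timezone(timedelta(hours=-8))


def daily_extremes(hourly):
    """Group (utc_datetime, temp_c) pairs by local calendar day.

    Returns {local_date: (tmin, tmax, tmean)}.
    """
    by_day = defaultdict(list)
    for stamp_utc, temp_c in hourly:
        local_day = stamp_utc.astimezone(LOCAL_OFFSET).date()
        by_day[local_day].append(temp_c)
    return {
        day: (min(vals), max(vals), sum(vals) / len(vals))
        for day, vals in by_day.items()
    }


# Synthetic triangular diurnal curve peaking at 15:00 local (made-up values).
start = datetime(2020, 7, 1, 8, tzinfo=timezone.utc)  # 00:00 local at UTC-8
hourly = [(start + timedelta(hours=h), 20 - abs(h - 15)) for h in range(24)]
result = daily_extremes(hourly)
```

With only the daily mean (13.5 in this synthetic case), the 5 to 20 range is unrecoverable, which is why a daily-only product cannot stand in for the hourly one.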

2. Reanalysis, not interpolation

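  • ERA5-Land replays the land-surface component of ECMWF's ERA5 reanalysis at finer resolution, driven by ERA5 atmospheric fields. ERA5 itself runs a frozen state-of-the-art weather model and assimilates billions of observations (stations, satellites, radiosondes, ships, aircraft) into a coherent global state with hourly output
  • Observations therefore constrain ERA5-Land indirectly, through that forcing, rather than being assimilated into it directly. Where stations do not exist (most of BC's wilderness), it is still physically realistic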
  • Climate departure analysis needs a baseline that is gridded and complete — station data has gaps and biases that fake "trends" when stations open/close

3. Why tmax/tmin specifically

  • Daily max/min is what ecosystems actually feel — fish thermal stress, vapor pressure deficit, snowmelt timing, fire weather, frost dates
  • Mean temperature hides the extremes that drive ecological response
  • Trend in tmin (overnight lows warming faster than daytime highs) is one of the clearest climate-change fingerprints — losing it would gut the analysis
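As a rough illustration of the tmin-versus-tmax trend comparison, the sketch below fits an ordinary least-squares slope to two annual series. The series are synthetic and exactly linear, chosen only to show the mechanics; they are not ERA5-Land results. The helper name `ols_slope` is hypothetical.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


years = list(range(1950, 2026))

# Synthetic annual means (made-up numbers, NOT observed trends).
tmin = [-2.0 + 0.03 * (y - 1950) for y in years]
tmax = [10.0 + 0.02 * (y - 1950) for y in years]

# Slopes are per year; multiply by 10 for degrees C per decade.
tmin_trend = ols_slope(years, tmin) * 10
tmax_trend = ols_slope(years, tmax) * 10
```

Averaging tmin and tmax into a single mean series before fitting would return one blended slope and erase the difference between the two.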

4. Full-territory backfill enables the cd package to ship

  • Consumer-side baseline/anomaly/trend computation requires the raw monthly time series to be published via STAC
  • Without 1950-2025 coverage, users cannot pick custom reference periods (e.g., 1961-1990 WMO baseline vs. 1981-2010)
  • Once on S3 as COGs, anyone running cd_extract() for any AOI in BC gets full historical departure analysis in seconds — no CDS account, no batch downloads
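The custom-reference-period point can be sketched as follows. This is a minimal illustration of baseline and anomaly computation on a monthly series keyed by `(year, month)`. It is not the cd package's implementation; the function names and the synthetic July series are assumptions made for the example.

```python
def monthly_climatology(series, ref_start, ref_end):
    """Mean per calendar month over the years ref_start..ref_end inclusive."""
    sums, counts = {}, {}
    for (year, month), value in series.items():
        if ref_start <= year <= ref_end:
            sums[month] = sums.get(month, 0.0) + value
            counts[month] = counts.get(month, 0) + 1
    return {month: sums[month] / counts[month] for month in sums}


def anomalies(series, ref_start, ref_end):
    """Departure of every value from its calendar-month reference mean."""
    clim = monthly_climatology(series, ref_start, ref_end)
    return {
        key: value - clim[key[1]]
        for key, value in series.items()
        if key[1] in clim
    }


# Synthetic July series with a steady made-up warming of 0.02 per year.
series = {(y, 7): 15.0 + 0.02 * (y - 1950) for y in range(1950, 2026)}

anom_wmo = anomalies(series, 1961, 1990)     # 1961-1990 baseline
anom_recent = anomalies(series, 1981, 2010)  # 1981-2010 baseline
```

The same 2025 value yields a larger departure against the earlier baseline than against the later one. A user can only make that choice if the full 1950-2025 series is available.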

5. Grid resolution matters for our work

  • 9 km is fine enough to resolve valley-bottom vs. ridgetop climate in places like the Bulkley, Skeena, Parsnip
  • For fish habitat, riparian restoration planning, and species range work, anything coarser smears out exactly the gradients that matter
  • Most "downscaled" products are interpolations of coarser data and add false confidence; ERA5-Land comes from a land-surface model actually run on the 9 km grid

Bottom line

76 years × 12 months × hourly × full BC bbox at 9 km = the substrate for every climate-context section in every report we will write for the next decade. Worth getting right.
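For a sense of scale, the arithmetic behind that line can be worked out directly. The sketch assumes the backfill is requested in one-month chunks, which is an assumption about the download layout and not something stated above.

```python
from datetime import date

FIRST_YEAR, LAST_YEAR = 1950, 2025

n_years = LAST_YEAR - FIRST_YEAR + 1                 # 76
n_monthly_chunks = n_years * 12                      # 912 (assumes one chunk per month)

# Let the calendar handle leap years instead of counting them by hand.
n_days = (date(LAST_YEAR + 1, 1, 1) - date(FIRST_YEAR, 1, 1)).days  # 27759
n_hourly_steps = n_days * 24                         # 666216 time steps per grid cell
```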

Evaluation criteria (must-haves)

  • ERA5-Land specifically (NOT ERA5), at its 9 km native resolution. The parent ERA5 product is ~31 km and would be a downgrade.
  • Hourly temporal resolution (so we can compute true daily max/min)
  • Coverage: 1950-present
  • Programmatic access (no manual download)
  • Reasonable quota for one-time backfill (~700 files left as of 2026-04-11)
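One way to keep the evaluation consistent across candidates is to encode the must-haves as a small check. The sketch below is illustrative only: the `Candidate` fields, the function name, and both example records are hypothetical and do not describe any real source. The quota criterion is left out because it needs a judgment call, not a threshold.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    product: str        # e.g. "ERA5-Land" or "ERA5"
    grid_km: float      # native grid spacing
    step_hours: int     # temporal resolution
    first_year: int     # earliest year available
    programmatic: bool  # API or scripted access, no manual download


def failed_must_haves(c):
    """Return the list of must-have criteria the candidate fails."""
    failures = []
    if c.product != "ERA5-Land":
        failures.append("not ERA5-Land")
    if c.grid_km > 9:
        failures.append("coarser than 9 km")
    if c.step_hours != 1:
        failures.append("not hourly")
    if c.first_year > 1950:
        failures.append("starts after 1950")
    if not c.programmatic:
        failures.append("no programmatic access")
    return failures


# Hypothetical records, for illustration only.
example_a = Candidate("example-a", "ERA5-Land", 9, 1, 1950, True)
example_b = Candidate("example-b", "ERA5", 31, 1, 1950, True)

passes = failed_must_haves(example_a)
fails = failed_must_haves(example_b)
```

An empty list means the candidate clears every encoded must-have. Findings for real candidates would replace the example records as they are confirmed.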

Candidates to evaluate

  • DestinE Earth Data Hub — confirm whether it carries ERA5-Land at 9 km native, or only ERA5 at 31 km
  • Google Earth Engine — has ERA5-Land hourly. Check quota/license for non-academic commercial use
  • AWS Open Data — ERA5 mirror exists. Check ERA5-Land hourly availability
  • WeatherBench / ARCO-ERA5 — research mirrors; verify product and resolution
  • Second CDS account — boring fallback, but works
  • CDS support email — explain rate-limit history, request quota bump given legitimate scientific use case

Findings

Appended as we research.


Relates to #33 (operational backfill saga via CDS)
Relates to NewGraphEnvironment/sred-2025-2026#23
