feat: add reproject_to_instance_crs transform to zarr build pipeline #93

Merged
turban merged 1 commit into restore/temporal-resampling from feat/reproject-transform
May 9, 2026

turban commented May 9, 2026

Summary

  • Introduces climate_api/transforms/reproject.py with a reproject_to_instance_crs transform that reprojects source datasets to the instance CRS using rioxarray during the zarr build pipeline
  • The transform is a no-op when source_crs == instance CRS, so WGS84 instances incur zero overhead
  • Wires the transform into chirps3, era5_land (temperature and precipitation), and worldpop dataset YAML templates
  • Adds rioxarray>=0.17 as an explicit dependency (was previously a transitive dep)
  • Adds 4 unit tests using a mocked .rio accessor to avoid PROJ database conflicts in CI
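
The no-op check and the reprojection step described in the list above can be sketched as follows. This is a minimal illustration, not the code in climate_api/transforms/reproject.py: the function name `reproject_if_needed` and its signature are hypothetical, and the instance CRS is passed in directly instead of being read from the API config. It assumes `ds` exposes the `.rio` accessor that rioxarray registers on xarray objects once `import rioxarray` has run, and it relies on that accessor's `write_crs` and `reproject` methods.

```python
from typing import Any


def reproject_if_needed(ds: Any, source_crs: str, target_crs: str) -> Any:
    """Return ds reprojected to target_crs, or ds itself if no work is needed.

    Hypothetical sketch: `ds` is expected to be an xarray Dataset or DataArray
    with the rioxarray `.rio` accessor available (that is, `import rioxarray`
    has been executed somewhere before this call).
    """
    # No-op path: the source already matches the instance CRS, so the
    # original object is returned untouched.
    if source_crs == target_crs:
        return ds

    # Tag the data with its source CRS, then reproject to the target CRS.
    return ds.rio.write_crs(source_crs).rio.reproject(target_crs)
```

With a real dataset this might be called as `reproject_if_needed(ds, "EPSG:4326", "EPSG:25833")`. When both CRS arguments are equal, the same object comes back and the accessor is never touched.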

Motivation

Previously, when an instance was configured with a non-WGS84 CRS (e.g. EPSG:25833 for Norway), ingested datasets remained in their source CRS (typically WGS84). This PR reprojects each ingested zarr to the instance CRS at build time, consistent with the crs setting introduced in #92.

Test plan

  • pytest tests/test_transforms_reproject.py — all 4 tests pass
  • make lint — clean (ruff, mypy, pyright)
  • WGS84 instance: ingest any dataset and confirm no reprojection occurs (no-op path)
  • UTM33 instance (crs: EPSG:25833): ingest CHIRPS3 and confirm zarr coordinate values are in metres
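
For the last manual check, one rough way to tell projected metres from geographic degrees is to look at the coordinate ranges: longitude and latitude stay within [-180, 180] and [-90, 90], while UTM eastings carry a 500 000 m false easting and so always fall outside the degree range. The helper below is a hypothetical sketch of that heuristic (the name `looks_like_degrees` is not part of this PR). It takes plain sequences of x and y values, which one might read from the coordinate arrays of the built zarr.

```python
from typing import Iterable


def looks_like_degrees(xs: Iterable[float], ys: Iterable[float]) -> bool:
    """Heuristic: True if every coordinate fits inside lon/lat degree bounds."""
    x_vals, y_vals = list(xs), list(ys)
    if not x_vals or not y_vals:
        raise ValueError("need at least one x and one y coordinate")
    x_ok = all(-180.0 <= x <= 180.0 for x in x_vals)
    y_ok = all(-90.0 <= y <= 90.0 for y in y_vals)
    return x_ok and y_ok


# Illustrative values only: a lon/lat grid versus UTM-style metre coordinates.
print(looks_like_degrees([4.0, 31.0], [58.0, 71.0]))  # True
print(looks_like_degrees([250_000.0, 750_000.0], [6_500_000.0, 7_900_000.0]))  # False
```

This is only a sanity check on magnitudes. It does not confirm which projected CRS the data is in, so the CRS metadata written to the zarr is still worth inspecting.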

Introduces a rioxarray-backed reprojection transform that converts
source datasets to the instance CRS during ingestion. The transform is
a no-op when the source CRS already matches the configured instance CRS,
so WGS84 instances incur no overhead.

- Add climate_api/transforms/reproject.py with reproject_to_instance_crs
- Wire the transform into chirps3, era5_land (both variables), and worldpop dataset YAMLs
- Add rioxarray>=0.17 as an explicit dependency
- Add tests using a mocked .rio accessor to avoid local PROJ database conflicts
turban requested a review from Copilot on May 9, 2026 at 18:10
turban merged commit 771a053 into restore/temporal-resampling on May 9, 2026
1 of 2 checks passed
Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a new build-pipeline transform to reproject ingested rasters into the instance CRS (when configured), and wires it into several dataset templates with explicit rioxarray dependency and unit tests.

Changes:

  • Introduces reproject_to_instance_crs transform using rioxarray to reproject datasets during zarr build.
  • Adds the transform to chirps3, era5_land (temp/precip), and worldpop dataset YAMLs.
  • Adds rioxarray as a direct dependency and adds unit tests (with mocked .rio).

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/test_transforms_reproject.py | Adds unit tests for no-op and reprojection behavior using a mocked rioxarray accessor. |
| pyproject.toml | Adds rioxarray>=0.17 as an explicit dependency. |
| climate_api/transforms/reproject.py | Implements the reproject_to_instance_crs transform and logging. |
| climate_api/transforms/__init__.py | Re-exports the new transform from the package. |
| climate_api/data/datasets/worldpop.yaml | Wires the new transform into the worldpop template. |
| climate_api/data/datasets/era5_land.yaml | Wires the new transform into ERA5-Land temperature + precipitation templates. |
| climate_api/data/datasets/chirps3.yaml | Wires the new transform into the CHIRPS3 template. |

Comment on lines +29 to +31
target_crs = api_config.get_crs()
if target_crs == source_crs:
    return ds
Comment thread pyproject.toml
"zarr>=3.1.6,<4",
"geozarr-toolkit==0.1.*",
"topozarr==0.0.*",
"rioxarray>=0.17",
monkeypatch.setenv("CLIMATE_API_CONFIG", str(config_file))

ds = _make_wgs84_dataset()
result = reproject_to_instance_crs(ds, _DATASET, source_crs="EPSG:25833")
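
The PR's tests replace the `.rio` accessor with a mock so that no PROJ database is needed. The general pattern can be sketched with `unittest.mock.MagicMock`. Everything below is a stand-in for illustration: `transform` is a hypothetical simplification, not the signature of `reproject_to_instance_crs`, and the sketch assumes the real transform chains `.rio.write_crs(...)` and `.rio.reproject(...)`, which this page does not show.

```python
from unittest.mock import MagicMock


def transform(ds, source_crs: str, target_crs: str):
    """Hypothetical stand-in for a reprojection transform under test."""
    if source_crs == target_crs:
        return ds
    return ds.rio.write_crs(source_crs).rio.reproject(target_crs)


def test_noop_when_crs_matches() -> None:
    ds = MagicMock()
    assert transform(ds, "EPSG:4326", "EPSG:4326") is ds
    # The accessor must not be used on the no-op path.
    ds.rio.write_crs.assert_not_called()


def test_reprojects_when_crs_differs() -> None:
    ds = MagicMock()
    result = transform(ds, "EPSG:4326", "EPSG:25833")
    ds.rio.write_crs.assert_called_once_with("EPSG:4326")
    # write_crs() returns another mock; reproject is called on its accessor.
    tagged = ds.rio.write_crs.return_value
    tagged.rio.reproject.assert_called_once_with("EPSG:25833")
    assert result is tagged.rio.reproject.return_value
```

Because `MagicMock` records every call, the tests can assert on the arguments passed to the accessor without rioxarray or PROJ doing any actual work.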