feat: add reproject_to_instance_crs transform to zarr build pipeline #93
Merged
turban merged 1 commit into restore/temporal-resampling on May 9, 2026
Introduces a rioxarray-backed reprojection transform that converts source datasets to the instance CRS during ingestion. The transform is a no-op when the source CRS already matches the configured instance CRS, so WGS84 instances incur no overhead.

- Add `climate_api/transforms/reproject.py` with `reproject_to_instance_crs`
- Wire the transform into `chirps3`, `era5_land` (both variables), and `worldpop` dataset YAMLs
- Add `rioxarray>=0.17` as an explicit dependency
- Add tests using a mocked `.rio` accessor to avoid local PROJ database conflicts
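The no-op rule described above can be sketched as follows. This is a minimal illustration and not the code merged in this PR: the function name `reproject_sketch` and its signature (with `target_crs` passed in explicitly instead of being read from the instance config) are assumptions made for the example. `rio.write_crs` and `rio.reproject` are the rioxarray accessor methods that a transform like this would presumably call.

```python
def reproject_sketch(ds, source_crs: str, target_crs: str):
    """Return ds untouched when the CRSs match, otherwise reproject it.

    Hypothetical stand-in for reproject_to_instance_crs; the real transform
    reads the target CRS from the instance configuration.
    """
    if target_crs == source_crs:
        # No-op path: the very same object is returned, nothing is copied.
        return ds
    # Tag the data with its source CRS, then warp it to the target CRS.
    # Both calls go through the rioxarray `.rio` accessor.
    return ds.rio.write_crs(source_crs).rio.reproject(target_crs)


# The no-op path never touches `.rio`, so any object passes through unchanged.
sentinel = object()
assert reproject_sketch(sentinel, "EPSG:4326", "EPSG:4326") is sentinel
```

Because the check happens before any accessor call, the matching-CRS case costs one string comparison. Note that plain string equality treats `"EPSG:4326"` and `"epsg:4326"` as different CRSs.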
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a new build-pipeline transform that reprojects ingested rasters into the instance CRS (when configured), wires it into several dataset templates, adds rioxarray as an explicit dependency, and adds unit tests.
Changes:
- Introduces the `reproject_to_instance_crs` transform, which uses `rioxarray` to reproject datasets during zarr build.
- Adds the transform to the `chirps3`, `era5_land` (temp/precip), and `worldpop` dataset YAMLs.
- Adds `rioxarray` as a direct dependency and adds unit tests (with a mocked `.rio`).
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/test_transforms_reproject.py | Adds unit tests for no-op and reprojection behavior using a mocked rioxarray accessor. |
| pyproject.toml | Adds rioxarray>=0.17 as an explicit dependency. |
| climate_api/transforms/reproject.py | Implements the reproject_to_instance_crs transform and logging. |
| `climate_api/transforms/__init__.py` | Re-exports the new transform from the package. |
| climate_api/data/datasets/worldpop.yaml | Wires the new transform into the worldpop template. |
| climate_api/data/datasets/era5_land.yaml | Wires the new transform into ERA5-Land temperature + precipitation templates. |
| climate_api/data/datasets/chirps3.yaml | Wires the new transform into the CHIRPS3 template. |
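The mocked-accessor approach mentioned for `tests/test_transforms_reproject.py` can be sketched roughly as below. The actual tests are not shown on this page, so everything here is an assumption: `reproject_sketch` is a hypothetical stand-in for the transform, and the datasets are plain `MagicMock` objects, so neither rioxarray nor a PROJ database is needed to run it.

```python
from unittest.mock import MagicMock


def reproject_sketch(ds, source_crs, target_crs):
    # Hypothetical stand-in for the transform under test.
    if target_crs == source_crs:
        return ds
    return ds.rio.write_crs(source_crs).rio.reproject(target_crs)


def test_noop_when_crs_matches():
    ds = MagicMock()
    result = reproject_sketch(ds, "EPSG:25833", "EPSG:25833")
    # Same object back, and the mocked accessor was never invoked.
    assert result is ds
    ds.rio.write_crs.assert_not_called()


def test_reprojects_when_crs_differs():
    ds = MagicMock()
    reprojected = MagicMock(name="reprojected")
    # write_crs returns a dataset whose .rio.reproject yields the final result.
    ds.rio.write_crs.return_value.rio.reproject.return_value = reprojected
    result = reproject_sketch(ds, "EPSG:4326", "EPSG:25833")
    assert result is reprojected
    ds.rio.write_crs.assert_called_once_with("EPSG:4326")
    ds.rio.write_crs.return_value.rio.reproject.assert_called_once_with("EPSG:25833")


test_noop_when_crs_matches()
test_reprojects_when_crs_differs()
```

Mocking at the accessor level checks only that the right calls are made with the right CRS strings; it says nothing about whether the resulting coordinates are numerically correct.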
Comment on lines +29 to +31

    target_crs = api_config.get_crs()
    if target_crs == source_crs:
        return ds

    "zarr>=3.1.6,<4",
    "geozarr-toolkit==0.1.*",
    "topozarr==0.0.*",
    "rioxarray>=0.17",

    monkeypatch.setenv("CLIMATE_API_CONFIG", str(config_file))

    ds = _make_wgs84_dataset()
    result = reproject_to_instance_crs(ds, _DATASET, source_crs="EPSG:25833")
Summary

- Adds `climate_api/transforms/reproject.py` with a `reproject_to_instance_crs` transform that reprojects source datasets to the instance CRS using rioxarray during the zarr build pipeline
- No-op when `source_crs == instance CRS`, so WGS84 instances incur zero overhead
- Wired into the `chirps3`, `era5_land` (temperature and precipitation), and `worldpop` dataset YAML templates
- Adds `rioxarray>=0.17` as an explicit dependency (was previously a transitive dep)
- Tests use a mocked `.rio` accessor to avoid PROJ database conflicts in CI

Motivation

When an instance is configured with a non-WGS84 CRS (e.g. `EPSG:25833` for Norway), previously ingested datasets would remain in their source CRS (typically WGS84). This PR ensures each ingested zarr is reprojected to the instance CRS at build time, consistent with the `crs` setting introduced in #92.

Test plan

- `pytest tests/test_transforms_reproject.py`: all 4 tests pass
- `make lint`: clean (ruff, mypy, pyright)
- On an instance with a non-WGS84 CRS (`crs: EPSG:25833`): ingest CHIRPS3 and confirm zarr coordinate values are in metres
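For the last item, one rough way to tell degrees from metres is to check whether the coordinate values still fit inside longitude/latitude bounds. The helper below is a hypothetical heuristic, not part of this PR, and the sample values are illustrative only (not taken from a real CHIRPS3 build). It would misclassify a projected grid whose extent happens to sit within a few hundred metres of the origin.

```python
def looks_like_degrees(xs, ys):
    """Heuristic: True when every coordinate fits inside lon/lat bounds."""
    return all(-180.0 <= x <= 180.0 for x in xs) and all(
        -90.0 <= y <= 90.0 for y in ys
    )


# Illustrative values only: lon/lat around southern Norway versus
# rough UTM zone 33N eastings/northings in metres for the same area.
wgs84_x, wgs84_y = [10.0, 10.5, 11.0], [59.5, 60.0, 60.5]
utm_x, utm_y = [250_000.0, 260_000.0], [6_640_000.0, 6_650_000.0]

assert looks_like_degrees(wgs84_x, wgs84_y)
assert not looks_like_degrees(utm_x, utm_y)
```

In practice the `xs` and `ys` lists would come from the zarr store's coordinate arrays, whose names depend on the dataset.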