feat: phase 2 — S1 GRD RTC GeoTIFF ingestion pipeline #146
emmanuelmathot merged 5 commits into s1-rtc
Productionise the GeoTIFF → Zarr V3 ingestion pipeline for S1Tiling gamma-naught RTC outputs.

New module: src/eopf_geozarr/conversion/s1_ingest.py (~500 lines)
- extract_geotiff_metadata(): rasterio-based metadata extraction with tag validation and S1Tiling datetime normalisation
- create_s1_store(): Zarr V3 store creation with full GeoZarr conventions (multiscales, proj:, spatial:), sharded arrays, 1D spatial coordinate arrays at every resolution level
- ingest_s1tiling_acquisition(): main API. Creates or opens the store, reads GeoTIFFs, generates overviews (average/nearest), writes all levels, and appends coordinate variables. CRS/shape consistency validation on append.
- consolidate_s1_store(): post-batch metadata consolidation
- discover_s1tiling_acquisitions(): file discovery with grouping

Tests: tests/test_s1_rtc_ingest.py (27 tests)
- Metadata extraction, store creation, ingestion (create + append), data integrity, xarray roundtrip, CRS/shape mismatch rejection, consolidation, file discovery

Uses zarr_cm CMO dicts for convention metadata (not hardcoded UUIDs).
Reuses calculate_aligned_chunk_size() from utils.py.
Exports wired in conversion/__init__.py.

Refs: #139
Force-pushed from b368033 to 37769c3.
…ords
- Remove private _downsample_2d from s1_ingest.py; use shared downsample_2d_array from utils instead.
- Enhance downsample_2d_array: add method='nearest' parameter; use ceil-based block sizes with edge-padding for non-divisible source/target ratios (no longer truncates edge pixels).
- Fix spatial coordinate arrays to use pixel-center convention (half-pixel offset from edge origin) per CF/GIS standards.
- Update test expectations for improved block averaging behavior.
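The ceil-based block sizing with edge-padding described above can be illustrated with a minimal NumPy sketch. This is not the project's `downsample_2d_array`; the function name `block_downsample` and its signature are hypothetical, and the only behaviour taken from the commit notes is the ceil-based block size, the edge padding, and the `average`/`nearest` methods.

```python
import math

import numpy as np


def block_downsample(src: np.ndarray, target_shape: tuple[int, int],
                     method: str = "average") -> np.ndarray:
    """Hypothetical sketch of padded block downsampling (not the real utility)."""
    if method not in ("average", "nearest"):
        raise ValueError(f"Unsupported method: {method!r}")

    src_h, src_w = src.shape
    tgt_h, tgt_w = target_shape

    # Ceil-based block size: tgt * block >= src, so no edge pixel is dropped.
    block_h = math.ceil(src_h / tgt_h)
    block_w = math.ceil(src_w / tgt_w)

    # Edge-pad so the array divides evenly into (tgt_h x tgt_w) blocks.
    pad_h = tgt_h * block_h - src_h
    pad_w = tgt_w * block_w - src_w
    padded = np.pad(src, ((0, pad_h), (0, pad_w)), mode="edge")

    # View the padded array as a grid of blocks.
    blocks = padded.reshape(tgt_h, block_h, tgt_w, block_w)

    if method == "nearest":
        # Take the top-left pixel of each block.
        return blocks[:, 0, :, 0]
    # Mean over the two within-block axes.
    return blocks.mean(axis=(1, 3))


src = np.arange(25, dtype=float).reshape(5, 5)
avg = block_downsample(src, (2, 2))
near = block_downsample(src, (2, 2), method="nearest")
```

With a 5×5 source and a 2×2 target, the block size is 3 and the array is padded to 6×6, so the last row and column contribute to the edge blocks instead of being truncated. Note that edge replication weights the border pixels more heavily in partial blocks; whether the real utility does the same is not stated in the PR.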
Pull request overview
Implements Phase 2 of the Sentinel-1 GRD RTC (γ0T) GeoTIFF → GeoZarr v3 ingestion pipeline, including store creation, acquisition appends, consolidation, and discovery, with accompanying tests.
Changes:
- Added a production ingestion module for S1Tiling RTC GeoTIFFs into sharded Zarr v3 with multiscale overviews and conventions metadata.
- Enhanced `downsample_2d_array` to support padded block-averaging for non-divisible shapes and a `"nearest"` method.
- Added comprehensive ingestion tests and updated existing conversion tests to match the new downsampling behavior.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| `src/eopf_geozarr/conversion/s1_ingest.py` | New S1 RTC ingestion pipeline (metadata extraction, store creation, append logic, consolidation, discovery). |
| `src/eopf_geozarr/conversion/utils.py` | Updates downsampling behavior and adds a resampling method option used by overviews. |
| `src/eopf_geozarr/conversion/__init__.py` | Exposes the new S1 ingestion functions as part of the conversion public API. |
| `tests/test_s1_rtc_ingest.py` | New test suite covering metadata extraction, store creation, ingestion/appends, consolidation, and discovery. |
| `tests/test_conversion.py` | Updates downsampling test expectations for the new padded block-averaging behavior. |
…ze orbit_dir
- Extract _create_orbit_group helper to deduplicate orbit group creation
- Add VH / border_mask alignment validation (CRS, transform, shape) before write
- Add spatial_transform comparison on append (not just CRS + shape)
- Normalize orbit_dir ASC→ascending / DES→descending in discover_s1tiling_acquisitions
- Validate method param in downsample_2d_array, fix block-averaging condition (and→or)
- Remove unused compute_multiscales_layout import in tests
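The orbit direction normalisation listed above might look roughly like the following sketch. Only the ASC→ascending / DES→descending mapping comes from the commit notes; the helper name `normalize_orbit_dir`, the case-insensitive matching, and the acceptance of already-normalised values are assumptions made for illustration.

```python
# Hypothetical lookup table; only the ASC/DES mapping comes from the PR notes.
_ORBIT_DIR_ALIASES = {"ASC": "ascending", "DES": "descending"}


def normalize_orbit_dir(token: str) -> str:
    """Map a short orbit direction token to its long form (illustrative only)."""
    cleaned = token.strip()

    # Short tokens are matched case-insensitively (an assumption).
    long_form = _ORBIT_DIR_ALIASES.get(cleaned.upper())
    if long_form is not None:
        return long_form

    # Values that are already in long form pass through unchanged.
    if cleaned.lower() in _ORBIT_DIR_ALIASES.values():
        return cleaned.lower()

    raise ValueError(f"Unrecognised orbit direction: {token!r}")
```

Rejecting unknown tokens, rather than passing them through, keeps a mistyped direction from silently creating an unexpected group name in the store.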
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
… cross-orbit validation
- Add x/y spatial coordinate arrays to S1RtcOverviewResolutionMembers and S1RtcNativeResolutionMembers (schema + fixture + test)
- Use bounded chunk-aligned shard dimensions via calculate_shard_dimension (extracted from geozarr.py to utils.py as shared utility)
- Validate CRS/shape/transform against existing orbit groups before creating a new orbit direction group in an existing store
- Add schema validation test: ingested store validates against S1RtcRoot
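The CRS/shape/transform check against existing orbit groups could be sketched as below. This is an assumption-laden illustration using only the standard library: the `GridSpec` dataclass, the `check_same_grid` function, and the absolute tolerance are all hypothetical and are not taken from `s1_ingest.py`.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GridSpec:
    """Hypothetical container for the grid properties being compared."""
    crs: str                      # e.g. "EPSG:32633"
    shape: tuple[int, int]        # (height, width)
    transform: tuple[float, ...]  # six affine coefficients (a, b, c, d, e, f)


def check_same_grid(existing: GridSpec, incoming: GridSpec,
                    tol: float = 1e-9) -> None:
    """Raise ValueError listing every mismatch between two grids."""
    problems = []
    if existing.crs != incoming.crs:
        problems.append(f"CRS {incoming.crs} != {existing.crs}")
    if existing.shape != incoming.shape:
        problems.append(f"shape {incoming.shape} != {existing.shape}")
    # Compare coefficients with a tolerance (the tolerance value is an assumption).
    same_transform = len(existing.transform) == len(incoming.transform) and all(
        math.isclose(a, b, rel_tol=0.0, abs_tol=tol)
        for a, b in zip(existing.transform, incoming.transform)
    )
    if not same_transform:
        problems.append("spatial transform differs")
    if problems:
        raise ValueError("Grid mismatch: " + "; ".join(problems))


base = GridSpec("EPSG:32633", (100, 200), (10.0, 0.0, 499980.0, 0.0, -10.0, 200040.0))
shifted = GridSpec("EPSG:32633", (100, 200), (10.0, 0.0, 499990.0, 0.0, -10.0, 200040.0))
check_same_grid(base, base)  # identical grids pass silently
```

Collecting all mismatches before raising gives one error message that covers CRS, shape, and transform together, which is convenient when the same check is reused for appends and for new orbit direction groups.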
Force-pushed from c0bc6d0 to d7147a5.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.
* feat: Phase 2 — S1 GRD RTC GeoTIFF ingestion pipeline

  Productionise the GeoTIFF → Zarr V3 ingestion pipeline for S1Tiling gamma-naught RTC outputs.

  New module: src/eopf_geozarr/conversion/s1_ingest.py (~500 lines)
  - extract_geotiff_metadata(): rasterio-based metadata extraction with tag validation and S1Tiling datetime normalisation
  - create_s1_store(): Zarr V3 store creation with full GeoZarr conventions (multiscales, proj:, spatial:), sharded arrays, 1D spatial coordinate arrays at every resolution level
  - ingest_s1tiling_acquisition(): main API. Creates or opens the store, reads GeoTIFFs, generates overviews (average/nearest), writes all levels, and appends coordinate variables. CRS/shape consistency validation on append.
  - consolidate_s1_store(): post-batch metadata consolidation
  - discover_s1tiling_acquisitions(): file discovery with grouping

  Tests: tests/test_s1_rtc_ingest.py (27 tests)
  - Metadata extraction, store creation, ingestion (create + append), data integrity, xarray roundtrip, CRS/shape mismatch rejection, consolidation, file discovery

  Uses zarr_cm CMO dicts for convention metadata (not hardcoded UUIDs).
  Reuses calculate_aligned_chunk_size() from utils.py.
  Exports wired in conversion/__init__.py.

  Refs: #139

* remove plan

* refactor: address PR review — consolidate downsample, pixel-center coords
  - Remove private _downsample_2d from s1_ingest.py; use shared downsample_2d_array from utils instead.
  - Enhance downsample_2d_array: add method='nearest' parameter; use ceil-based block sizes with edge-padding for non-divisible source/target ratios (no longer truncates edge pixels).
  - Fix spatial coordinate arrays to use pixel-center convention (half-pixel offset from edge origin) per CF/GIS standards.
  - Update test expectations for improved block averaging behavior.

* refactor: address PR review — extract helper, add validation, normalize orbit_dir
  - Extract _create_orbit_group helper to deduplicate orbit group creation
  - Add VH / border_mask alignment validation (CRS, transform, shape) before write
  - Add spatial_transform comparison on append (not just CRS + shape)
  - Normalize orbit_dir ASC→ascending / DES→descending in discover_s1tiling_acquisitions
  - Validate method param in downsample_2d_array, fix block-averaging condition (and→or)
  - Remove unused compute_multiscales_layout import in tests

* refactor: address review round 2 — schema x/y coords, bounded shards, cross-orbit validation
  - Add x/y spatial coordinate arrays to S1RtcOverviewResolutionMembers and S1RtcNativeResolutionMembers (schema + fixture + test)
  - Use bounded chunk-aligned shard dimensions via calculate_shard_dimension (extracted from geozarr.py to utils.py as shared utility)
  - Validate CRS/shape/transform against existing orbit groups before creating a new orbit direction group in an existing store
  - Add schema validation test: ingested store validates against S1RtcRoot
Summary
Phase 2 of the S1 GRD RTC implementation: production GeoTIFF → Zarr V3 ingestion pipeline for S1Tiling γ0T RTC outputs.
Delivered files
- `src/eopf_geozarr/conversion/s1_ingest.py` (~500 lines)
- `tests/test_s1_rtc_ingest.py` (27 tests)
Public API
- `ingest_s1tiling_acquisition(vv, vh, mask, store, orbit)`: create-or-append one acquisition
- `consolidate_s1_store(store, orbit)`: post-batch consolidation
- `discover_s1tiling_acquisitions(input_dir)`: file discovery with grouping
- `extract_geotiff_metadata(path)`: rasterio-based metadata extraction
Key design decisions
- `zarr_cm` CMO dicts (UUID, schema_url, spec_url)
- `_downsample_2d` helper for overview generation (mean downsampling)
- `structlog` for structured logging throughout
- `np.linspace`
- `["y", "x"]` / `["time", "y", "x"]` required by titiler-eopf
Test results
27/27 passed
Related
Closes phase 2 of #139
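As a companion to the `np.linspace` entry under Key design decisions and the pixel-center fix in the review commits, here is a minimal sketch of building 1D coordinate arrays at pixel centers from an edge-origin geotransform. The function name `pixel_center_coords` and its parameters are hypothetical; the half-pixel offset from the edge origin is the only detail taken from the PR.

```python
import numpy as np


def pixel_center_coords(origin_x: float, origin_y: float,
                        pixel_w: float, pixel_h: float,
                        width: int, height: int) -> tuple[np.ndarray, np.ndarray]:
    """Return (x, y) pixel-center coordinates (illustrative only).

    origin_x/origin_y are the outer corner of the top-left pixel;
    pixel_h is negative for north-up rasters.
    """
    # First center is half a pixel in from the edge origin; the last is
    # half a pixel short of the opposite edge.
    x = np.linspace(origin_x + pixel_w / 2,
                    origin_x + pixel_w * (width - 0.5), width)
    y = np.linspace(origin_y + pixel_h / 2,
                    origin_y + pixel_h * (height - 0.5), height)
    return x, y


x, y = pixel_center_coords(500000.0, 200000.0, 10.0, -10.0, width=4, height=3)
```

For a 10 m grid this places the first x coordinate 5 m inside the raster's left edge, rather than on the edge itself.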