feat: Pydantic-zarr V3 model for Sentinel-1 GRD γ0T RTC stores #138
emmanuelmathot merged 9 commits into s1-rtc
bc5fc0c to 8cc7df1
Addressed reviewer feedback:
All 11 S1 RTC tests pass. Pre-commit clean (only pre-existing mypy issues in |
    model_config = {"extra": "allow", "populate_by_name": True, "serialize_by_alias": True}

    @model_validator(mode="after")
    def validate_shape(self) -> Self:
if you annotate spatial_shape as tuple[int, int] instead of list[int], then pydantic will check the length automatically
        return self

    @model_validator(mode="after")
    def validate_transform(self) -> Self:
same as above -- you can use the type annotation tuple[float, float, float, float, float, float] to declare that it must have length 6
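A minimal sketch of the two suggestions above, assuming Pydantic v2. The class name `SpatialAttrsSketch` and the helper `is_valid` are hypothetical stand-ins, not the PR's actual model; only the field names and tuple annotations come from the review comments.

```python
from pydantic import BaseModel, ValidationError


class SpatialAttrsSketch(BaseModel):
    """Hypothetical stand-in for the attributes model under review."""

    # Fixed-length tuple: exactly two ints, no validator needed.
    spatial_shape: tuple[int, int]
    # Fixed-length tuple: exactly six floats (affine coefficients).
    spatial_transform: tuple[float, float, float, float, float, float]


def is_valid(shape: list[int], transform: list[float]) -> bool:
    """Return True when both fields pass Pydantic's type and length checks."""
    try:
        SpatialAttrsSketch(spatial_shape=shape, spatial_transform=transform)
    except ValidationError:
        return False
    return True
```

With illustrative values, `is_valid([10980, 10980], [10.0, 0.0, 300000.0, 0.0, -10.0, 4900020.0])` passes, while a one-element shape or a three-element transform is rejected by the annotation alone.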
    """One orbit direction (ascending or descending) with multiscale layout."""

    @model_validator(mode="after")
    def validate_r10m_present(self) -> Self:
if r10m is required but the others are not, you can use the NotRequired type annotation for the non-required fields of S1RtcOrbitGroupMembers and remove total=False from the TypedDict definition
if you did that, this validator would not be necessary
    def conditions(self) -> S1RtcConditionsGroup | None:
        return self.members.get("conditions")

    def get_resolution(self, level: ResolutionLevel) -> GroupSpec[Any, Any] | None:
you probably also want a method for listing available resolution levels
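One way such a listing method could look, sketched as a free function over a plain mapping. This is an assumption about the shape of the idea, not the `resolution_levels()` implementation that was later added to `S1RtcOrbitGroup`; the `r<N>m` naming pattern is inferred from the level names mentioned in this PR (`r10m`, `r20m`, `r720m`).

```python
import re

# Matches resolution-level group names such as "r10m" or "r720m".
_LEVEL_RE = re.compile(r"^r(\d+)m$")


def resolution_levels(members: dict[str, object]) -> list[str]:
    """Return the resolution-level keys present, finest resolution first."""
    found = []
    for name in members:
        match = _LEVEL_RE.match(name)
        if match:
            # Keep the numeric resolution so sorting is numeric, not lexical.
            found.append((int(match.group(1)), name))
    return [name for _, name in sorted(found)]
```

Sorting on the parsed integer keeps `r20m` ahead of `r120m`, which a plain string sort would not, and non-level members such as `conditions` are skipped.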
    @model_validator(mode="after")
    def validate_spatial_dimensions(self) -> Self:
        if self.spatial_dimensions != ["y", "x"]:
            raise ValueError(
                f"spatial:dimensions must be ['y', 'x'], got {self.spatial_dimensions}"
            )
        return self
you can remove this validator if you make spatial_dimensions: Literal[("y", "x")]
nvm, Literal[("a", "b")] simplifies to Literal["a", "b"] which is not what we want here. Instead we want tuple[Literal["y"], Literal["x"]]
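The difference between the two annotations can be checked with the standard `typing` module alone. The variable names below are illustrative only.

```python
from typing import Literal, get_args

# Subscripting Literal with a tuple is the same as passing two arguments,
# so this means "either 'y' or 'x'", not "the pair ('y', 'x')".
flattened = Literal[("y", "x")]
flattened_args = get_args(flattened)

# A two-element tuple type whose positions are each pinned to one value.
precise = tuple[Literal["y"], Literal["x"]]
precise_args = get_args(precise)

# True: the tuple form collapses into a plain two-value Literal.
same_as_union = flattened == Literal["y", "x"]
```

Here `flattened_args` holds the two strings themselves, whereas `precise_args` holds one `Literal` type per position, which is what lets a validator enforce both the order and the length.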
- Add src/eopf_geozarr/data_api/s1_rtc.py: Zarr V3 Pydantic models for S1 GRD γ0T RTC GeoZarr stores, using pyz.v3 GroupSpec/ArraySpec with TypedDict members (same pattern as s2.py uses pyz.v2)
- Models: S1RtcRoot, S1RtcOrbitGroup, S1RtcNativeResolutionDataset, S1RtcOverviewResolutionDataset, S1RtcConditionsGroup
- Validation: convention UUIDs, spatial:dimensions, multiscales layout, required data arrays (vv/vh/border_mask), gamma_area presence
- Add tests/_test_data/s1_rtc_examples/s1-grd-rtc-31TCH.json: realistic fixture with 3 timesteps, 6 overview levels, 3 gamma_area conditions
- Add tests/test_data_api/test_s1_rtc.py: 11 tests covering round-trip, structure validation, negative cases (missing orbit, r10m, UUIDs, etc.)
- Add conftest fixture s1_rtc_json_example parametrized over all fixtures

- Replace dict[str, Any] multiscales field with zcm.Multiscales import
- Remove inline MultiscalesTransform/ScaleLevel/Multiscales classes
- Update test assertions for Pydantic model attribute access

- Add [tool.ruff.lint.flake8-type-checking] runtime-evaluated-base-classes for pydantic.BaseModel so Pydantic field type imports aren't flagged
- Remove 4 stale noqa comments auto-fixed by ruff

- spatial_dimensions: tuple[Literal['y'], Literal['x']] (removes validator)
- spatial_bbox: tuple[float, float, float, float] (removes validator)
- spatial_shape: tuple[int, int] (removes validator)
- spatial_transform: tuple[float, ...] x6 (removes validator)
- S1RtcOrbitGroupMembers: r10m required, others NotRequired (removes validator)
- Add resolution_levels() method to S1RtcOrbitGroup
- Apply same tuple types to S1RtcConditionsAttrs
f466006 to a5bd1e1
Comment out pre-commit job in CI workflow
* phase 1: S1 RTC Pydantic models aligned with S2 pattern
  - Add src/eopf_geozarr/data_api/s1_rtc.py: Zarr V3 Pydantic models for S1 GRD γ0T RTC GeoZarr stores, using pyz.v3 GroupSpec/ArraySpec with TypedDict members (same pattern as s2.py uses pyz.v2)
  - Models: S1RtcRoot, S1RtcOrbitGroup, S1RtcNativeResolutionDataset, S1RtcOverviewResolutionDataset, S1RtcConditionsGroup
  - Validation: convention UUIDs, spatial:dimensions, multiscales layout, required data arrays (vv/vh/border_mask), gamma_area presence
  - Add tests/_test_data/s1_rtc_examples/s1-grd-rtc-31TCH.json: realistic fixture with 3 timesteps, 6 overview levels, 3 gamma_area conditions
  - Add tests/test_data_api/test_s1_rtc.py: 11 tests covering round-trip, structure validation, negative cases (missing orbit, r10m, UUIDs, etc.)
  - Add conftest fixture s1_rtc_json_example parametrized over all fixtures
* refactor: improve Pydantic model definitions and streamline imports in S1 RTC module
* fix: standardize spatial dimensions to lowercase in S1 RTC models and test cases
* refactor: use zcm.Multiscales typed model per reviewer feedback
  - Replace dict[str, Any] multiscales field with zcm.Multiscales import
  - Remove inline MultiscalesTransform/ScaleLevel/Multiscales classes
  - Update test assertions for Pydantic model attribute access
* fix: configure ruff TC001 for Pydantic runtime-evaluated base classes
  - Add [tool.ruff.lint.flake8-type-checking] runtime-evaluated-base-classes for pydantic.BaseModel so Pydantic field type imports aren't flagged
  - Remove 4 stale noqa comments auto-fixed by ruff
* refactor: replace validators with precise type annotations per review
  - spatial_dimensions: tuple[Literal['y'], Literal['x']] (removes validator)
  - spatial_bbox: tuple[float, float, float, float] (removes validator)
  - spatial_shape: tuple[int, int] (removes validator)
  - spatial_transform: tuple[float, ...] x6 (removes validator)
  - S1RtcOrbitGroupMembers: r10m required, others NotRequired (removes validator)
  - Add resolution_levels() method to S1RtcOrbitGroup
  - Apply same tuple types to S1RtcConditionsAttrs
* ci: temporarily disable pre-commit job in ci.yml
  - Comment out pre-commit job in CI workflow
* ci: enable pre-commit checks in CI workflow
* ci: transitive actions/cache@v4 dependency

Co-authored-by: Loïc Houpert <10154151+lhoupert@users.noreply.github.com>
What
Pydantic-zarr V3 schema for Sentinel-1 GRD γ0T RTC time-series stores on the MGRS grid. This follows the exact same pattern as the existing S2 model.
Why
We're building an S1 GRD RTC pipeline that ingests S1Tiling GeoTIFFs into GeoZarr V3 stores. This model defines the expected store structure so we can validate outputs at write time and in CI.
Store hierarchy
Key design choices (feedback welcome)
- pyz.v3 GroupSpec/ArraySpec: mirrors the pyz.v2 pattern used by S2 but wraps pydantic_zarr.v3. TypedDict members with closed=True, total=False enforce allowed keys while keeping optional groups flexible.
- zarr_conventions UUIDs: orbit-direction groups carry multiscales, geo_proj, and spatial convention UUIDs via zarr_cm. Validated with a model_validator.
- Sharding codecs: native arrays use sharding_indexed with inner chunks of 366 (≈ 1 year of acquisitions along time axis). The model validates codec structure is present but doesn't constrain inner chunk sizes.
- Conditions as a sub-group: conditions/gamma_area_{orbit} arrays are (Y, X) float32 at native resolution, one per relative orbit. These are static geometric metadata, not time-varying.
- Overview levels: r20m through r720m carry only the data arrays (vv, vh, border_mask), not coordinate arrays.
Files
- src/eopf_geozarr/data_api/s1_rtc.py
- tests/_test_data/s1_rtc_examples/s1-grd-rtc-31TCH.json
- tests/test_data_api/test_s1_rtc.py
- tests/conftest.py
How to review
Start with s1_rtc.py: the docstring at the top shows the full hierarchy. The JSON fixture is machine-generated but representative of real S1Tiling output over MGRS tile 31TCH.
All existing S2 tests still pass.