-
Notifications
You must be signed in to change notification settings - Fork 3
feat/geozarr model #10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…to feat/geozarr-model
…to feat/geozarr-model
…to feat/geozarr-model
…to feat/geozarr-model
|
@emmanuelmathot this is ready for review. The models right now are pretty narrow -- I define a A The If this looks good, maybe we merge this and then I can start working on integrating these classes with the rest of the codebase? |
emmanuelmathot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the initial pydantic model @d-v-b ! I probably have more questions as it is fully clear how this work to me. Let's have a review meeting.
| ) | ||
|
|
||
|
|
||
| CFStandardName = Annotated[str, AfterValidator(check_standard_name)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we miss the grid_mapping attribute verification defaulted to spatial_ref scalar with the EPSG code. https://zarr.dev/geozarr-spec/documents/standard/template/geozarr-spec.html#_e15d59bd-f2ec-28e8-8016-4e541c95e10f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can add these, and if they are required we should make that more clear in the spec. right now the spec says
CF Conventions – Including attributes such as standard_name, units, axis, and grid_mapping to express spatiotemporal semantics and coordinate system properties.
but it isn't clear which CF attributes are required, optional, etc
| CF_STANDARD_NAMES = get_cf_standard_names(url=CF_STANDARD_NAME_URL) | ||
|
|
||
|
|
||
| def check_standard_name(name: str) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please bear with my limited knowledge of pydantic but how is made the link with the actual standard_name field name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pydantic does most of its validation routines based on type annotations. When we annotate an attribute on a pydantic model with this type: https://github.com/d-v-b/data-model/blob/3d11af412e460993f8e603dcff0555c5342c4e8f/src/eopf_geozarr/data_api/geozarr/common.py#L70, then pydantic will run the check_standard_name function after checking that the input is a string.
| return model | ||
|
|
||
|
|
||
| class Dataset(GroupSpec[DatasetAttrs, GroupSpec[Any, Any] | DataArray]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we call this GeoZarrDataset?
Then, how does pydantic-zarr discriminate the "normal" groups from a geoZarr Dataset group?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- we could definitely call it a
GeoZarrDataset
Then, how does pydantic-zarr discriminate the "normal" groups from a geoZarr Dataset group?
see https://docs.pydantic.dev/latest/concepts/unions/. Basically there are 3 options for resolving a union: order-based, best-match-based, and discriminant-based.
Ideally a geozarr dataset group would have some specific requirements or traits that distinguish it from a regular zarr group, or a zarr group that happens to have cf metadata, but I don't think the spec as written expresses these requirements, so this might require some work
|
@d-v-b What is the status of this PR? |
|
still in progress, I was previously blocked on the semantics of the |
|
@emmanuelmathot I am getting an interesting test failure that would be helpful for you to check. Within a dataset, for each data variable that declares coordinates ( |
|
probably bugs from the EOPF CPM as we simply copy this dataset. I will give a look and report if any. I'll trace that here. |
|
@d-v-b As a more general question. Is this |
|
2 related issues created following the |
…to feat/geozarr-model
|
I added a new |
initial working pydantic models for geozarr