
use declarative representation of the spec #258

Open
d-v-b opened this issue Mar 7, 2023 · 14 comments

@d-v-b

d-v-b commented Mar 7, 2023

I ran into issues viewing my ome-ngff-formatted zarr data via the napari plugin. Napari was exiting with an error emitted from this line of this package, which is a try...except block that catches ANY exception and doesn't handle the content of the exception at all.

Unfortunately, swallowing exceptions without a useful error message made it very hard to debug any problems with my data. I suspect this style of coding is a side-effect of taking a procedural approach to the OME-NGFF metadata (i.e., "valid metadata" means "a bunch of code ran without errors").
I think a much better approach would be to define a python class that represents the structure of the metadata and handles parsing / validation of the input. Such a declarative approach (i.e., "valid metadata" means "the program created an object structurally equivalent to the metadata") can produce useful error messages when a single field fails to validate (in my case, the problem was with the version string, but I had to figure that out manually).

For examples of declarative implementations of the OME-NGFF spec, see iohub and pydantic-ome-ngff. As the author of the second project, I would love to see this functionality under the umbrella of the official OME organization.
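To illustrate the declarative idea, here is a minimal sketch with pydantic. The `Dataset` and `Multiscale` classes below are hypothetical, stripped-down stand-ins for the full models in pydantic-ome-ngff / iohub; the point is only that a bad field produces a targeted error instead of a swallowed exception.

```python
from typing import List
from pydantic import BaseModel, ValidationError

# Hypothetical, simplified models for illustration -- the real classes
# in pydantic-ome-ngff / iohub cover the full OME-NGFF spec.
class Dataset(BaseModel):
    path: str

class Multiscale(BaseModel):
    version: str
    name: str
    datasets: List[Dataset]

good = {"version": "0.4", "name": "image", "datasets": [{"path": "0"}]}
bad = {"version": "0.4", "name": "image", "datasets": "not-a-list"}

Multiscale(**good)  # validates silently

try:
    Multiscale(**bad)
except ValidationError as e:
    # The error names the exact offending field ('datasets') instead of
    # being discarded by a bare try/except.
    print(e)
```

With this style, "valid metadata" literally means "the model object was constructed", and every failure carries the field path that caused it.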

@joshmoore
Member

joshmoore commented Mar 9, 2023

Hi @d-v-b. Thanks for opening this! I'll reiterate the 👍 from my gitter conversation with you as well as the 👍❤️ from the Zarr community meeting last night, and additionally a 💯 to everyone in the community that we find ways to work together on fewer libraries rather than everyone needing to roll their own. (Maybe we need a community meeting ... per language? ... just on that topic.)

I haven't gone through the code in detail but I know we need it. The Java crew that has met at several hackathons similarly started pushing to https://github.com/ome/ome-ngff-java, so without putting too much thought into the completeness of it, here's a list of TODOs that I'd suggest:

@imagesc-bot

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/napari-napari-ome-zarr-plugin-is-not-registered/78482/3

@ziw-liu

ziw-liu commented Mar 20, 2023

Late to the discussion, but happy to move this code upstream for v0.4 too!

@will-moore
Member

Just starting to look at this again, thinking how ome-zarr-py might use pydantic-ome-ngff...

Not being familiar with pydantic yet... it looks like the easiest way to validate when reading an NGFF image is to parse the json?

```python
from ome_zarr.io import parse_url
from ome_zarr.reader import Reader

from pydantic_ome_ngff.v04.multiscales import Multiscale
from pydantic import ValidationError

reader = Reader(parse_url("6001240.zarr"))
nodes = list(reader())
# first node will be the image pixel data
image_node = nodes[0]
ngff_json = image_node.zarr.root_attrs
try:
    Multiscale.parse_obj(ngff_json["multiscales"][0])
except ValidationError as e:
    print("Error found:", e.json())
```

This approach keeps all the pydantic models separate from the ome-zarr-py classes.

But I'm wondering whether we should be thinking about combining them. E.g.

`class Multiscales(Spec):`
and https://github.com/JaneliaSciComp/pydantic-ome-ngff/blob/176e4521acb5b0e4e9d19f3e15e72fedf87789e4/src/pydantic_ome_ngff/v04/multiscales.py#L67 seem like they're kinda equivalent, but I've not looked too closely at how that might work...?

@ziw-liu

ziw-liu commented Apr 21, 2023

> This approach keeps all the pydantic models separate from the ome-zarr-py classes.
>
> But I'm wondering whether we should be thinking about combining them. E.g.

If combining them means that the class for array I/O is a child of a pydantic model class, I think it would generally make things cumbersome, since pydantic models are not designed to be stateful.


@giovp
Contributor

giovp commented Apr 27, 2023

+1 for consolidating models and schema for ome-ngff with pydantic, both https://github.com/JaneliaSciComp/pydantic-ome-ngff and https://github.com/czbiohub/iohub/blob/main/iohub/ngff_meta.py are really great efforts!

@clbarnes

clbarnes commented Jul 9, 2023

> If combining them means that the class for array I/O is a child of a pydantic model class, I think it would generally make things cumbersome, since pydantic models are not designed to be stateful.

The reader/writer class could compose over the pydantic class (e.g. at a .metadata instance variable), or pydantic could be used for validating IO and then the user-facing class could be constructed from pydantic models under the hood.
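The composition pattern could look roughly like this. `MultiscaleMeta` and `ImageReader` are hypothetical names for illustration, not part of ome-zarr-py or pydantic-ome-ngff; the sketch only shows a stateful I/O class holding a validated model at a `.metadata` attribute rather than inheriting from it.

```python
from typing import Any, Dict
from pydantic import BaseModel

# Hypothetical minimal metadata model; real models would cover the spec.
class MultiscaleMeta(BaseModel):
    version: str
    name: str = ""

class ImageReader:
    """Stateful I/O class that composes over a pydantic model
    instead of subclassing it."""
    def __init__(self, attrs: Dict[str, Any]):
        # Validate once at the boundary; runtime state lives alongside,
        # outside the immutable-ish model.
        self.metadata = MultiscaleMeta(**attrs["multiscales"][0])
        self._chunk_cache: Dict[str, Any] = {}

reader = ImageReader({"multiscales": [{"version": "0.4", "name": "demo"}]})
print(reader.metadata.name)
```

This keeps validation at the I/O boundary while leaving the reader free to carry caches, handles, and other state that a pydantic model is not designed to hold.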

@yarikoptic
Contributor

cool kidz are also drinking some https://linkml.io Kool-Aid these days, as an even higher-level semantic description of the model than pydantic, and then producing pydantic and whatnot serializations/exports.

@d-v-b
Author

d-v-b commented Feb 21, 2024

@yarikoptic can you link to a repo of said cool kidz doing this? Because while I have heard a lot about linkml from a "this looks cool" perspective, I have seen very little code, and the code I have seen looked like it was doing quite a bit more than we need.

@joshmoore
Member

@d-v-b, that's one use of linkml as opposed to anything core. Calling out some names that I know of:

I think we've discussed it elsewhere (Zürich?) but in general I don't mind the starting point of our pipeline being pydantic, but I very much think we need to be careful not to overfit to it because we will need to transform/translate out and into other representations. LinkML does that quite well. And for one of the representations I'm personally interested in (JSON-LD), it's likely one of the best, though I admit that's less critical for the core NGFF types than for additional metadata that I would like to record with/in the NGFF.

@d-v-b
Author

d-v-b commented Feb 21, 2024

> I think we've discussed it elsewhere (Zürich?) but in general I don't mind the starting point of our pipeline being pydantic, but I very much think we need to be careful not to overfit to it because we will need to transform/translate out and into other representations.

To be clear, the spec is JSON, so modeling the spec can be done by any python library that makes it easy to define JSON-serializable classes. With that constraint in mind, I think there's no risk that modeling the spec with pydantic, or dataclasses, or attrs, or marshmallow, would lead to any overfitting. On the other hand, not modeling the spec with one of the above libraries (or something functionally equivalent) in the reference python implementation of the spec generates friction for users and developers.
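To make the "any library works" point concrete, here is the same minimal slice of metadata modeled with stdlib dataclasses alone. The classes are illustrative stand-ins, not spec-complete; the point is that round-tripping to JSON requires nothing beyond the standard library.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List

# Illustrative only: a tiny fragment of a multiscales entry modeled with
# stdlib dataclasses, to show the modeling library is interchangeable.
@dataclass
class Dataset:
    path: str

@dataclass
class Multiscale:
    version: str
    datasets: List[Dataset] = field(default_factory=list)

m = Multiscale(version="0.4", datasets=[Dataset(path="0")])
print(json.dumps(asdict(m)))  # round-trips cleanly to JSON
```

Pydantic or attrs add validation and error reporting on top, but the JSON-serializable-class shape is the same in all of them.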

@joshmoore
Member

Sorry, I likely have mixed a few discussions. 👍 for a declarative representation at the core, I think we're clearly agreeing there, and then we can see how things shake out with pydantic from there.

@satra
Contributor

satra commented Feb 21, 2024

@joshmoore - just a note there was more work done during the workshop which brings us closer to defining the entire spec as linkml (https://github.com/linkml/linkml-arrays), which can in turn generate json, jsonld, etc.

@melonora

> @joshmoore - just a note there was more work done during the workshop which brings us closer to defining the entire spec as linkml (https://github.com/linkml/linkml-arrays), which can in turn generate json, jsonld, etc.

and work is ongoing on named arrays :)
