Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow '.' dimension separator #161

Closed
wants to merge 1 commit into from
Closed

Allow '.' dimension separator #161

wants to merge 1 commit into from

Conversation

normanrz
Copy link
Contributor

I am proposing to allow the . dimension separator alongside /. In .zarray there is already the relevant metadata available, which OME-NGFF can defer to.

@github-actions
Copy link
Contributor

Automated Review URLs

@joshmoore
Copy link
Member

👍:

I think there are two remainingissues:

  • The primary one is what version of NGFF will this match. (Proposals needed.)
  • The other is that there are almost certainly datasets in the wild which use the non-default separator and which do NOT set the dimension_separator property. We might should warn as to this situation and perhaps (if possibly) validate it. cc: @will-moore

@normanrz
Copy link
Contributor Author

  • The primary one is what version of NGFF will this match. (Proposals needed.)

This could be a 0.4.2?

  • The other is that there are almost certainly datasets in the wild which use the non-default separator and which do NOT set the dimension_separator property. We might should warn as to this situation and perhaps (if possibly) validate it.

We could specify / as default in the OME-NGFF spec. Is there a default on the Zarr side?

@joshmoore
Copy link
Member

Is there a default on the Zarr side?

If there is no metadata in .zarray, then . is used. The problem stems from the fact that before dimension_separator was introduced, NestedDirectoryStore and friends wrote file sets with / without setting any metadata (which should be considered invalid/broken but requires a heuristic to detect.)

@normanrz
Copy link
Contributor Author

Is there a default on the Zarr side?

If there is no metadata in .zarray, then . is used. The problem stems from the fact that before dimension_separator was introduced, NestedDirectoryStore and friends wrote file sets with / without setting any metadata (which should be considered invalid/broken but requires a heuristic to detect.)

Do you think there is any v0.4 data out there that matches this description? That could be an argument for making this a v0.5 change.

@will-moore
Copy link
Member

The other is that there are almost certainly datasets in the wild which use the non-default separator and which do NOT set the dimension_separator property

If there are such files, these would already be considered invalid zarr files?
Or not valid v0.4 files anyway?

So, if we were to state "v0.4+ files with missing dimension_separator are invalid" that wouldn't be a breaking change?

@joshmoore
Copy link
Member

If there are such files, these would already be considered invalid zarr files?
Or not valid v0.4 files anyway?

Not valid NGFF before this PR, exactly. They are valid V2 files, but essentially the addition of dimension_separator was like a subtle v2.1 to fix the fact that it wasn't possible to detect them.

@will-moore
Copy link
Member

@normanrz
Copy link
Contributor Author

We already check that dimension separator is '/' for v0.2 -> v0.4: https://github.com/ome/ome-ngff-validator/pull/16/files#diff-b06a22b09da1dc9eba4472e0983b0c06000bb32a69693e8d8b86275a3034dd85R58

If I read the code correctly, it will yield an error if the dimension_separator field is not present.

@will-moore
Copy link
Member

@normanrz Yes. I didn't test it but I think it should do that (which is what we want, right)?
Also, it only checks for v0.2 - v0.4, so if we allow . in v0.5 it won't need updating.

@normanrz
Copy link
Contributor Author

normanrz commented Nov 11, 2022

I don't have a strong opinion about whether this should go into v0.4.2 or v0.5. However, i'm not sure if this small change is worth to trigger a v0.5 and/or to wait for bigger changes (that would definitely be a minor bump) to reach consensus.

@will-moore
Copy link
Member

I'm not really sure what a v0.4.2 means for schema validation etc.
Would it mean that previously generated 'v0.4' that use . dimension separator are now valid?
E.g. this is currently reported as invalid, but would become valid (except for the axes order):
https://ome.github.io/ome-ngff-validator/?source=https://data-humerus.webknossos.org/data/zarr/scalable_minds/skin/color

Screenshot 2022-11-11 at 13 07 00

It would also need the fix at ome/ome-zarr-py#235

@normanrz
Copy link
Contributor Author

I'm not really sure what a v0.4.2 means for schema validation etc.
Would it mean that previously generated 'v0.4' that use . dimension separator are now valid?

Yes, they would become valid.

It would also need the fix at ome/ome-zarr-py#235

Yep. Not sure what other tools would need adapting. My guess is, most already support . because of the underlying zarr implementation.

@joshmoore
Copy link
Member

@will-moore: after our recent discussion, thoughts on getting this rolled into 0.5?

cc: @kevinyamauchi

@joshmoore
Copy link
Member

@normanrz, are you still interested in pursuing this or would you prefer to focus on the move to V3?

@normanrz
Copy link
Contributor Author

@normanrz, are you still interested in pursuing this or would you prefer to focus on the move to V3?

No, closing this in favor of Zarr v3 support in #227.

@normanrz normanrz closed this Feb 21, 2024
@normanrz normanrz deleted the dim-sep branch February 21, 2024 17:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants