Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Labels section rewrite #165

Merged
merged 5 commits into from
Mar 1, 2023
Merged

Conversation

virginiascarlett
Copy link
Contributor

Hello all,

@bogovicj and I have co-written a new “labels” section to the OME-NGFF spec that combines the "labels" metadata and "image-label" metadata sections and hopefully improves their readability. We did our best to keep the actual spec the same, and just improve the language explaining it.

The new “labels” JSON example is simpler than the one discussed in issue #108. Hopefully the ambiguity of the previous example (it was unclear whether it was pointing to an image or a scale level) is cleared up. We wrote: “All images in the "labels" group SHOULD be listed here.” This was our interpretation of the existing statement that “Unlisted groups MAY be labels.” (BTW, any reason this is a MAY/SHOULD and not a MUST?)

Welcoming your feedback.

P.S. – For those who don’t know, I recently joined HHMI Janelia and I am new to the bioimaging field. This is my first PR ever!

@github-actions
Copy link
Contributor

github-actions bot commented Feb 9, 2023

Automated Review URLs

@joshmoore
Copy link
Member

Hi @virginiascarlett. Thanks for this! (To @bogovicj as well). It looks like the edits were done on a copy of 0.4/index.bs file but then committed as latest/index.bs which gives us a substantial number of, I think unintended, changes: https://github.com/ome/ngff/pull/165/files Any chance of getting a more minimal set?

@virginiascarlett
Copy link
Contributor Author

Oops! Yes, I had somehow copy-pasted text from the 0.4/ index into the latest/ index, causing quite a mess. I think it should be fixed now? Thanks for helping me learn.

@joshmoore
Copy link
Member

It does looked fixed, thanks! And no worries at all.

Copy link
Member

@will-moore will-moore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks.
Just one comment added...

]
}
```

Unlisted groups MAY be labels.
The `.zattrs` file within an image within a "labels" group MUST contain a `multiscales` JSON object. Within the `multiscales` object, the JSON array
associated with the `datasets` key MUST have the same number of entries (scale levels) as the original unlabeled image.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This MUST seems a bit more strict than necessary? E.g. I may have scale levels down to ~100 pixels for thumbnails of my original images, but there may be no use-case for generating any downsampling of labels or vice versa.
E.g. https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0062A/6001240.zarr has different numbers of scale levels for original image (3) and labels (4).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was our interpretation of the line in the current spec that reads, "image-label groups MUST also contain multiscales metadata and the two "datasets" series MUST have the same number of entries."

Your suggestion sounds fine to me. Maybe chage the new line to, "Within the multiscales object, the JSON array associated with the datasets key need not have the same number of entries (scale levels) as the original unlabeled image."?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My apologies - I didn't remember that restriction from the current spec (and didn't check it before I commented).
I still feel that we shouldn't enforce the same number of datasets, and your suggestion sounds good, but I'd want to be sure that others are OK with the removal of that restriction (in case there's a reason that it was added that I'm not aware of)? cc @joshmoore?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seconding @virginiascarlett's comment, this is definitely a section where the current specification is explicit about the requirement. Personally, I am not opposed to loosening this aspect of the specification but I think the details matter. While the example above brings an interesting perspective about the requirements for lower resolution levels, would we not at least recommend that the dimensions of the largest resolution should match the original image?

As this particular point differs the rest of this proposal which aims to clarifying the existing specification without changing it, I would suggest we migrate this conversation in a follow-up PR rather than burying a spec change discussion in the middle of a larger set of changes.

Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the new section reads well to me. Apart from the spec change which I would propose to move elsewhere, I put inline comments for the two main points which I believe deserve clarification/consensus:

  • the rules for individual sections within this specification
  • the requirement level for the image-label key

@@ -1,25 +1,27 @@
{
"image-label": {
"version": "0.5-dev",
"version": "0.4",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is causing the tests to fail. Reverting to 0.5-dev should suffice


"image-label" metadata {#label-md}
----------------------------------
In addition to the `multiscales` key, the JSON object in this image-level `.zattrs` file SHOULD contain another key, `image-label`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The image-label requirement was previously not explicitly specified. This raises the question of defining the minimal requirements for a label image. Trying to combine the various points above:

  • the label image MUST be stored within the labels group
  • the label image SHOULD be registered within the labels metadata
  • the label image MUST implement the multiscales specification (see above for a discussion about the resolution requirements)
  • the pixel type MUST be integer data types

For the image-label specification, my personal opinion would be to enforce it at a MUST level. Doing so would have the advantage of making it unambiguous and potentially reducing the number of graph operations.

@@ -444,59 +444,65 @@ for more information.
"labels" metadata {#labels-md}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of readability, I like the proposition of merging the two sections as both need to be implemented jointly in practice and it helps with the flow. Possibly we want to add another {#label-md} anchor here so that existing bookmarks still link to the relevant place in the specification.

My primary concern is that this creates 3 very different strategies across the document

  • using sections to define multiple related metadata concepts e.g. label/image-label as proposed here
  • using sections to define top-level metadata concepts e.g. well, plate, multiscales
  • using sections to define any complex metadata concept e.g. axes, coordinateTransformations

A lot of this discrepancy predates this proposal but my concern is that this amplifies the existing divergence and creates confusion. At least we should be able to discuss the options

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this warrants a Github Discussion?

]
}
```

Unlisted groups MAY be labels.
The `.zattrs` file within an image within a "labels" group MUST contain a `multiscales` JSON object. Within the `multiscales` object, the JSON array
associated with the `datasets` key MUST have the same number of entries (scale levels) as the original unlabeled image.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seconding @virginiascarlett's comment, this is definitely a section where the current specification is explicit about the requirement. Personally, I am not opposed to loosening this aspect of the specification but I think the details matter. While the example above brings an interesting perspective about the requirements for lower resolution levels, would we not at least recommend that the dimensions of the largest resolution should match the original image?

As this particular point differs the rest of this proposal which aims to clarifying the existing specification without changing it, I would suggest we migrate this conversation in a follow-up PR rather than burying a spec change discussion in the middle of a larger set of changes.

@virginiascarlett
Copy link
Contributor Author

Thanks for your feedback! Here's how I propose we move forward.

First, I will make a new commit with the following changes:

  • Correct version for the color_properties file (Should be 0.5-dev. My silly mistake.).
  • Clearer language emphasizing three of Sebastien's points:
  • the label image SHOULD be registered within the labels metadata
  • the label image MUST implement the multiscales specification
  • the pixel type MUST be integer data types

Then, I will create a NEW PR with the following changes:

  • The label image MAY have the same number of scale levels as the original image. Therefore, it is acceptable for the label image to have fewer scale levels than the original image.
  • The label image MUST be stored within the labels group.
  • The label image .zattrs file MUST contain the image-label key.

In this way, I believe we can separate superficial changes to the language from substantive changes to the specification itself.

Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Barring the discussion on the section merging/restructuring, the proposed changes make sense to me. Thanks for the contribution

latest/index.bs Outdated Show resolved Hide resolved
Copy link
Member

@will-moore will-moore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the / fix I think

Copy link
Member

@joshmoore joshmoore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, everyone. Merging.

A few last minute thoughts:

  • @virginiascarlett: we haven't had huge success with the GH discussion functionality yet. (Neither here nor on the Zarr repo.) You are more than welcome to give it a try, but if it doesn't get the buy-in you want, I'd suggest an issue or image.sc.
  • For the follow-on issue, do note there are several other label issues out there. If we start to get into larger changes (e.g. that would change the spec version) we might want to take a step back and consider other options. Another possibility would be to update the schema to capture the current state and then let 0.5 progress as planned.

@joshmoore joshmoore merged commit 6ef5739 into ome:main Mar 1, 2023
github-actions bot added a commit that referenced this pull request Mar 1, 2023
…rewrite

Labels section rewrite

SHA: 6ef5739
Reason: push, by @joshmoore

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
github-actions bot added a commit to d-v-b/ngff that referenced this pull request Mar 1, 2023
…ls_rewrite

Labels section rewrite

SHA: 6ef5739
Reason: push, by @d-v-b

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@virginiascarlett virginiascarlett deleted the labels_rewrite branch March 7, 2023 18:06
@sbesson
Copy link
Member

sbesson commented Jun 8, 2023

Incidentally coming back to this, I realise this change has been added to the Version History under 0.4.1 but is only published on https://ngff.openmicroscopy.org/latest/index.html#labels-md.

Should it also be ported to https://ngff.openmicroscopy.org/0.4/index.html#labels-md ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants