Labels section rewrite #165

Merged
merged 5 commits into from
Mar 1, 2023
18 changes: 10 additions & 8 deletions latest/examples/label_strict/colors_properties.json
@@ -3,23 +3,25 @@
 "version": "0.5-dev",
 "colors": [
 {
-"label-value": 1,
-"rgba": [255, 255, 255, 255]
+"label-value": 0,
+"rgba": [0, 0, 128, 128]
 },
 {
-"label-value": 4,
-"rgba": [0, 255, 255, 128]
+"label-value": 1,
+"rgba": [0, 128, 0, 128]
 }
 ],
 "properties": [
 {
-"label-value": 1,
+"label-value": 0,
 "area (pixels)": 1200,
-"class": "foo"
+"class": "intercellular space"
 },
 {
-"label-value": 4,
-"area (pixels)": 1650
+"label-value": 1,
+"area (pixels)": 1650,
+"class": "cell",
+"cell type": "neuron"
 }
 ],
 "source": {
77 changes: 44 additions & 33 deletions latest/index.bs
@@ -444,59 +444,65 @@ for more information.
"labels" metadata {#labels-md}
Member

In terms of readability, I like the proposition of merging the two sections as both need to be implemented jointly in practice and it helps with the flow. Possibly we want to add another {#label-md} anchor here so that existing bookmarks still link to the relevant place in the specification.

My primary concern is that this creates three very different strategies across the document:

  • using sections to define multiple related metadata concepts e.g. label/image-label as proposed here
  • using sections to define top-level metadata concepts e.g. well, plate, multiscales
  • using sections to define any complex metadata concept e.g. axes, coordinateTransformations

A lot of this discrepancy predates this proposal, but my concern is that this amplifies the existing divergence and creates confusion. At the very least, we should be able to discuss the options.

Contributor Author

Perhaps this warrants a GitHub Discussion?

------------------------------

The special group "labels" found under an image Zarr contains the key `labels` containing
the paths to label objects which can be found underneath the group:
In OME-Zarr, Zarr arrays representing pixel-annotation data are stored in a group called "labels". Some applications--notably image segmentation--produce
a new image that is in the same coordinate system as a corresponding multiscale image (usually having the same dimensions and coordinate transformations).
This new image is composed of integer values corresponding to certain labels with custom meanings. For example, pixels take the value 1 or 0
if the corresponding pixel in the original image represents cellular space or intercellular space, respectively.
Such an image is referred to in this specification as a 'label image'.

The "labels" group is nested within an image group, at the same level of the Zarr hierarchy as the resolution levels for the original image.
The "labels" group is not itself an image; it contains images. The pixels of the label images MUST be integer data types, i.e. one of
[`uint8`, `int8`, `uint16`, `int16`, `uint32`, `int32`, `uint64`, `int64`]. Intermediate groups between "labels" and the images within it are allowed,
but these MUST NOT contain metadata. Names of the images in the "labels" group are arbitrary.

The `.zattrs` file associated with the "labels" group MUST contain a JSON object with the key `labels`, whose value is a JSON array of paths to the
labeled multiscale image(s). All label images SHOULD be listed within this metadata file. For example:

```json
{
"labels": [
"orphaned/0"
"cell_space_segmentation"
]
}
```
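As a rough illustration of how a reader might consume this metadata, the sketch below parses the text of a "labels" group `.zattrs` file and returns the listed paths. The function name `list_label_paths` is hypothetical and not part of the specification or of any library; only the standard `json` module is used.

```python
import json


def list_label_paths(labels_zattrs_text):
    """Return the label image paths listed under the `labels` key.

    `labels_zattrs_text` is the raw JSON text of the "labels" group's
    .zattrs file. This helper is a hypothetical illustration only.
    """
    attrs = json.loads(labels_zattrs_text)
    paths = attrs.get("labels")
    # The value of `labels` is expected to be a JSON array of path strings.
    if not isinstance(paths, list) or not all(isinstance(p, str) for p in paths):
        raise ValueError('"labels" must be a JSON array of path strings')
    return paths


example = '{"labels": ["cell_space_segmentation"]}'
print(list_label_paths(example))  # ['cell_space_segmentation']
```

Because listing is only a SHOULD, a reader that wants every label image may still need to walk the "labels" group rather than rely on this array alone.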

Unlisted groups MAY be labels.
The `.zattrs` file for the label image MUST implement the multiscales specification. Within the `multiscales` object, the JSON array
associated with the `datasets` key MUST have the same number of entries (scale levels) as the original unlabeled image.
Member

This MUST seems a bit more strict than necessary? E.g. I may have scale levels down to ~100 pixels for thumbnails of my original images, but there may be no use-case for generating any downsampling of labels or vice versa.
E.g. https://ome.github.io/ome-ngff-validator/?source=https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0062A/6001240.zarr has different numbers of scale levels for original image (3) and labels (4).

Contributor Author

This was our interpretation of the line in the current spec that reads, "image-label groups MUST also contain multiscales metadata and the two "datasets" series MUST have the same number of entries."

Your suggestion sounds fine to me. Maybe change the new line to, "Within the multiscales object, the JSON array associated with the datasets key need not have the same number of entries (scale levels) as the original unlabeled image."?

Member

My apologies - I didn't remember that restriction from the current spec (and didn't check it before I commented).
I still feel that we shouldn't enforce the same number of datasets, and your suggestion sounds good, but I'd want to be sure that others are OK with the removal of that restriction (in case there's a reason that it was added that I'm not aware of)? cc @joshmoore?

Member

Seconding @virginiascarlett's comment, this is definitely a section where the current specification is explicit about the requirement. Personally, I am not opposed to loosening this aspect of the specification but I think the details matter. While the example above brings an interesting perspective about the requirements for lower resolution levels, would we not at least recommend that the dimensions of the largest resolution should match the original image?

As this particular point differs from the rest of this proposal, which aims to clarify the existing specification without changing it, I would suggest we move this conversation to a follow-up PR rather than burying a spec change discussion in the middle of a larger set of changes.


"image-label" metadata {#label-md}
----------------------------------
In addition to the `multiscales` key, the JSON object in this image-level `.zattrs` file SHOULD contain another key, `image-label`,
Member

The image-label requirement was previously not explicitly specified. This raises the question of defining the minimal requirements for a label image. Trying to combine the various points above:

  • the label image MUST be stored within the labels group
  • the label image SHOULD be registered within the labels metadata
  • the label image MUST implement the multiscales specification (see above for a discussion about the resolution requirements)
  • the pixel type MUST be integer data types

For the image-label specification, my personal opinion would be to enforce it at a MUST level. Doing so would have the advantage of making it unambiguous and potentially reducing the number of graph operations.
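To make the requirement levels listed above concrete, here is a minimal sketch of a checker that reports MUST violations as errors and SHOULD violations as warnings. The function `check_label_image` and its arguments are hypothetical; treating a missing `image-label` key as a warning reflects the SHOULD wording in this PR, not the MUST level suggested in this comment.

```python
# Integer pixel types allowed for label images, as enumerated in the proposed text.
INTEGER_DTYPES = {
    "uint8", "int8", "uint16", "int16",
    "uint32", "int32", "uint64", "int64",
}


def check_label_image(image_attrs, dtype_name, listed_in_labels):
    """Return a list of problems found for one label image.

    image_attrs      -- parsed .zattrs of the label image (a dict)
    dtype_name       -- name of the array's pixel type, e.g. "uint16"
    listed_in_labels -- True if the image path appears in the `labels` array
    All three parameters are assumptions made for this illustration.
    """
    problems = []
    if dtype_name not in INTEGER_DTYPES:
        problems.append(f"error: pixel type {dtype_name} is not an integer type")
    if "multiscales" not in image_attrs:
        problems.append("error: missing multiscales metadata")
    if not listed_in_labels:
        problems.append("warning: not listed in the labels metadata")
    if "image-label" not in image_attrs:
        problems.append("warning: missing image-label metadata")
    return problems


print(check_label_image({"multiscales": []}, "float32", False))
```

Raising `image-label` to a MUST would amount to turning the last warning into an error.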

whose value is also a JSON object. The `image-label` object stores information about the display colors, source image, and optionally,
further arbitrary properties of the label image. That `image-label` object SHOULD contain the following keys: first, a `colors` key,
whose value MUST be a JSON array describing color information for the unique label values. Second, a `version` key, whose value MUST be a
string specifying the version of the OME-NGFF `image-label` schema.

Groups containing the `image-label` dictionary represent an image segmentation
in which each unique pixel value represents a separate segmented object.
`image-label` groups MUST also contain `multiscales` metadata and the two
"datasets" series MUST have the same number of entries.
Conforming readers SHOULD display labels using the colors specified by the `colors` JSON array, as follows. This array contains one
JSON object for each unique custom label. Each of these objects MUST contain the `label-value` key, whose value MUST be the integer
corresponding to a particular label. In addition to the `label-value` key, the objects in this array MAY contain an `rgba` key whose
value MUST be an array of four integers between 0 and 255, inclusive. These integers represent the `uint8` values of red, green, and
blue that comprise the final color to be displayed at the pixels with this label. The fourth integer in the `rgba` array represents alpha,
or the opacity of the color. Additional keys under `colors` are allowed.

The `image-label` dictionary SHOULD contain a `colors` key whose value MUST be a
list of JSON objects describing the unique label values. Each color object MUST
contain the `label-value` key whose value MUST be an integer specifying the
pixel value for that label. It MAY contain an `rgba` key whose value MUST be an array
of four integers between 0 and 255 `[uint8, uint8, uint8, uint8]` specifying the label
color as RGBA. All the values under the `label-value` key MUST be unique. Clients
who choose to not throw an error SHOULD ignore all except the _last_ entry.
Next, the `image-label` object MAY contain the following keys: a `properties` key, and a `source` key.

Some implementations MAY represent overlapping labels by using a specially assigned
value, for example the highest integer available in the pixel range.
Like the `colors` key, the value of the `properties` key MUST be an array of JSON objects describing the set of unique possible pixel values.
Each object in the `properties` array MUST contain the `label-value` key, whose value again MUST be an integer specifying the pixel value for that label.
Additionally, an arbitrary number of key-value pairs MAY be present for each label value, denoting arbitrary metadata associated with that label.
Label-value objects within the `properties` array do not need to have the same keys.

The `image-label` dictionary MAY contain a `properties` key whose value MUST be a
list of JSON objects which also describes the unique label values. Each property object
MUST contain the `label-value` key whose value MUST be an integer specifying the pixel
value for that label. Additionally, an arbitrary number of key-value pairs
MAY be present for each label value denoting associated metadata. Not all label
values must share the same key-value pairs within the properties list.
The value of the `source` key MUST be a JSON object containing information about the original image from which the label image derives.
This object MAY include a key `image`, whose value MUST be a string specifying the relative path to a Zarr image group.
The default value is `../../` since most labeled images are stored in a "labels" group that is nested within the original image group.

The `image-label` dictionary MAY contain a `source` key whose value MUST be a JSON
object containing information on the image the label is associated with. If included,
it MAY include a key `image` whose value MUST be a string specifying the relative
path to a Zarr image group. The default value is "../../" since most labels are stored
under a subgroup named "labels/" (see above).
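The default of `../../` can be illustrated with a small path calculation. This is only a sketch: `resolve_source_image` is a hypothetical helper, the paths are made up, and it assumes the relative path is resolved against the label image's own group using POSIX-style separators.

```python
import posixpath


def resolve_source_image(label_image_path, image_label):
    """Resolve the source image group for a label image.

    label_image_path -- path of the label image group (hypothetical layout)
    image_label      -- the parsed `image-label` object (a dict)
    """
    source = image_label.get("source", {})
    # Fall back to the default given in the text when `image` is absent.
    relative = source.get("image", "../../")
    return posixpath.normpath(posixpath.join(label_image_path, relative))


# A label image stored directly inside the "labels" group of image.zarr:
print(resolve_source_image("image.zarr/labels/cells", {}))  # image.zarr
```

If intermediate groups sit between "labels" and the label image, the default no longer points at the original image, so an explicit `image` value would be needed in that layout.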

The `image-label` dictionary SHOULD contain a `version` key whose value MUST be a string
specifying the version of the image-label specification.
Here is an example of a simple `image-label` object for a label image in which 0s and 1s represent intercellular and cellular space, respectively:

<pre class=include-code>
path: examples/label_strict/colors_properties.json
highlight: json
</pre>

In this case, the pixels consisting of a 0 in the Zarr array will be displayed as 50% blue and 50% opacity. Pixels with a 1 in the Zarr array,
which correspond to cellular space, will be displayed as 50% green and 50% opacity.
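A reader could turn the `colors` array into a lookup table along the following lines. This is a minimal sketch, not a reference implementation: `build_color_table` is a hypothetical name, and keeping the last entry for a repeated `label-value` follows the earlier wording of this section rather than a rule stated in the new text.

```python
def build_color_table(image_label):
    """Map each label value to an (r, g, b, a) tuple.

    Entries without an `rgba` key are skipped, since `rgba` is optional.
    If a label value is repeated, the last entry wins (an assumption
    taken from the earlier wording of the specification).
    """
    table = {}
    for entry in image_label.get("colors", []):
        value = entry["label-value"]
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError("label-value must be an integer")
        rgba = entry.get("rgba")
        if rgba is None:
            continue
        valid = len(rgba) == 4 and all(
            isinstance(c, int) and not isinstance(c, bool) and 0 <= c <= 255
            for c in rgba
        )
        if not valid:
            raise ValueError("rgba must be four integers between 0 and 255")
        table[value] = tuple(rgba)
    return table


# The two color entries from the example file above.
image_label = {
    "colors": [
        {"label-value": 0, "rgba": [0, 0, 128, 128]},
        {"label-value": 1, "rgba": [0, 128, 0, 128]},
    ]
}
print(build_color_table(image_label))
# {0: (0, 0, 128, 128), 1: (0, 128, 0, 128)}
```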

"plate" metadata {#plate-md}
----------------------------

@@ -671,6 +677,11 @@ Version History {#history}
<td>Description</td>
</tr>
</thead>
<tr>
<td>0.4.1</td>
<td>2023-02-09</td>
<td>expand on "labels" description</td>
</tr>
<tr>
<td>0.4.1</td>
<td>2022-09-26</td>