issues with loading labels data with this plugin #99

Open
3 tasks
jni opened this issue Jan 24, 2024 · 10 comments

@jni
Contributor

jni commented Jan 24, 2024

I'm working on saving a dataset that contains multiple segmentations per image (manual segmentations by different people). Trying stuff out, I'm having three related issues:

  • the labels layer is invisible by default. This isn't super great for the intended use of the dataset. I understand that there may be performance reasons why that wouldn't work, but it might be worth considering the alternatives (e.g. only one labels layer is visible by default). Overall this would be minor if not for:
  • If I have more than one labels group in the same ome-zarr dataset, only the first group is loaded. (1) And,
  • If I only have labels in the ome-zarr dataset, the plugin fails to read at all. (2)

I test reading in with napari --plugin napari-ome-zarr path/to/example.zarr. I've got two scripts to reproduce the issue, both based on the write_image script from the ome-zarr-py docs.

Any advice here is appreciated! I don't know how much is intentional design vs how much is simply historical.

(1) Writing an ome-zarr image with two labels groups (only the first group is loaded)
import numpy as np
import zarr
import os

from skimage.data import binary_blobs
from ome_zarr.io import parse_url
from ome_zarr.writer import write_image

path = "test_ngff_image.zarr"
os.mkdir(path)

mean_val=10
size_xy = 128
size_z = 10
rng = np.random.default_rng(0)
data = rng.poisson(mean_val, size=(size_z, size_xy, size_xy)).astype(np.uint8)

# write the image data
store = parse_url(path, mode="w").store
root = zarr.group(store=store)
write_image(
        image=data, group=root, axes="zyx",
        scaler=None,  # don't create multiscales
        storage_options=dict(chunks=(1, size_xy, size_xy))
        )
# optional rendering settings
root.attrs["omero"] = {
    "channels": [{
        "color": "00FFFF",
        "window": {"start": 0, "end": 20, "min": 0, "max": 255},
        "label": "random",
        "active": True,
    }]
}


# add labels...
blobs = binary_blobs(length=size_xy, volume_fraction=0.1, n_dim=3).astype('int8')
blobs2 = binary_blobs(length=size_xy, volume_fraction=0.1, n_dim=3).astype('int8')
# blobs will contain values of 1, 2 and 0 (background)
blobs += 2 * blobs2

# label.shape is (size_xy, size_xy, size_xy), Slice to match the data
label = blobs[:size_z, :, :]

# write the labels to /labels
labels_grp = root.create_group("labels")
# the 'labels' .zattrs lists the named labels data
label_name = "blobs"
labels_grp.attrs["labels"] = [label_name]
label_grp = labels_grp.create_group(label_name)
# need 'image-label' attr to be recognized as label
label_grp.attrs["image-label"] = {}

write_image(label, label_grp, axes="zyx", scaler=None)


# label.shape is (size_xy, size_xy, size_xy), Slice to match the data
label2 = blobs[-size_z:, :, :]

# write the second labels group to /labels2
labels_grp2 = root.create_group("labels2")
# the 'labels' .zattrs lists the named labels data
label_name2 = "blobs2"
labels_grp2.attrs["labels"] = [label_name2]
# use label_name2 so the group name matches the name listed in .zattrs
label_grp2 = labels_grp2.create_group(label_name2)
# need 'image-label' attr to be recognized as label
label_grp2.attrs["image-label"] = {}

write_image(label2, label_grp2, axes="zyx", scaler=None)

(2) Writing an ome-zarr file with no image but with a labels group (reading fails altogether)

import numpy as np
import zarr
import os

from skimage.data import binary_blobs
from ome_zarr.io import parse_url
from ome_zarr.writer import write_image

path = "test_ngff_image_labels_only.zarr"
os.mkdir(path)

mean_val=10
size_xy = 128
size_z = 10
rng = np.random.default_rng(0)
data = rng.poisson(mean_val, size=(size_z, size_xy, size_xy)).astype(np.uint8)

# open the store (no image data is written in this example)
store = parse_url(path, mode="w").store
root = zarr.group(store=store)

# add labels...
blobs = binary_blobs(length=size_xy, volume_fraction=0.1, n_dim=3).astype('int8')
blobs2 = binary_blobs(length=size_xy, volume_fraction=0.1, n_dim=3).astype('int8')
# blobs will contain values of 1, 2 and 0 (background)
blobs += 2 * blobs2

# label.shape is (size_xy, size_xy, size_xy), Slice to match the data
label = blobs[:size_z, :, :]

# write the labels to /labels
labels_grp = root.create_group("labels")
# the 'labels' .zattrs lists the named labels data
label_name = "blobs"
labels_grp.attrs["labels"] = [label_name]
label_grp = labels_grp.create_group(label_name)
# need 'image-label' attr to be recognized as label
label_grp.attrs["image-label"] = {}

write_image(label, label_grp, axes="zyx", scaler=None)
@jni
Contributor Author

jni commented Jan 24, 2024

btw — I'm not really sure from reading the spec:

  • are labels without images allowed by the spec?
  • are multiple label groups allowed within a single file?

ie, are my issues above spec issues or implementation issues?

🙏

@jni
Contributor Author

jni commented Jan 24, 2024

(but the fact that it talks about "the special 'labels' group" suggests that only one labels group is allowed...)

@FIrgolitsch

FIrgolitsch commented Jan 26, 2024

> (but the fact that it talks about "the special 'labels' group" suggests that only one labels group is allowed...)

To add to this:
https://ngff.openmicroscopy.org/latest/#labels-md

Multiple labels are allowed but not multiple labels groups. One labels group can be created in which one would nest all the labels the user wants. To quote the spec:

The "labels" group is nested within an image group, at the same level of the Zarr hierarchy as the resolution levels for the original image. The "labels" group is not itself an image; it contains images. The pixels of the label images MUST be integer data types, i.e. one of [uint8, int8, uint16, int16, uint32, int32, uint64, int64]. Intermediate groups between "labels" and the images within it are allowed, but these MUST NOT contain metadata. Names of the images in the "labels" group are arbitrary.

The labels group is essentially only a metadata group, not an image group. For example, I have multiple Zarr datasets where segmentation is stored in /labels/mask, not directly in /labels

This thus also means that labels without images don't make sense in the spec as the labels group directly relates to the main image in the OME-Zarr file. Conceptually this also makes sense to me, as labels without a main image are just a different image.
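
The layout described in this comment (a single "labels" group whose attributes list the nested label images) can be sketched with the standard library alone. This assumes a Zarr v2 directory store, where group attributes live in a `.zattrs` JSON file. `list_label_images` and `make_group` are hypothetical helpers written for illustration; they are not part of ome-zarr-py or the plugin, and the mock store holds metadata only.

```python
import json
import tempfile
from pathlib import Path


def list_label_images(image_dir):
    """Return the label names listed in <image_dir>/labels/.zattrs.

    Assumes a Zarr v2 directory store. Returns an empty list when there is
    no "labels" group or when it does not list any labels.
    """
    zattrs = Path(image_dir) / "labels" / ".zattrs"
    if not zattrs.is_file():
        return []
    attrs = json.loads(zattrs.read_text())
    return list(attrs.get("labels", []))


def make_group(path, attrs=None):
    """Create a bare Zarr v2 group directory (metadata only, no arrays)."""
    path.mkdir(parents=True)
    (path / ".zgroup").write_text(json.dumps({"zarr_format": 2}))
    if attrs is not None:
        (path / ".zattrs").write_text(json.dumps(attrs))


# Metadata-only mock of the hierarchy; real arrays would come from write_image.
root = Path(tempfile.mkdtemp()) / "example.zarr"
make_group(root)
make_group(root / "labels", {"labels": ["small_blobs", "big_blobs"]})
make_group(root / "labels" / "small_blobs", {"image-label": {}})
make_group(root / "labels" / "big_blobs", {"image-label": {}})
# A sibling group such as "labels2" is not the special "labels" group,
# so this helper never looks at it.
make_group(root / "labels2", {"labels": ["blobs2"]})

found = list_label_images(root)
print(found)
```

Only the names nested under the one "labels" group are reported; anything stored in a sibling group is invisible to a reader that follows this convention.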

@imagesc-bot

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/creating-multiple-labels-for-a-single-ome-ngff-ome-zarr-image/92343/3

@will-moore
Member

I think we decided not to load labels initially because we are in the OMERO mindset, and that is typical behaviour of OMERO image viewers. I would be happy to change this as the performance argument is not really valid with multiscale images (and masks are generally smaller dtypes than the image itself).

You need to nest multiple labels under a single labels group, instead of creating a sibling labels2 which will be ignored.
See this example, which allows me to load multiple labels in napari...

Updated script

# https://github.com/ome/napari-ome-zarr/issues/99

import numpy as np
import zarr
import os

from skimage.data import binary_blobs
from ome_zarr.io import parse_url
from ome_zarr.writer import write_image

path = "test_ngff_labels.zarr"
os.mkdir(path)

mean_val=10
size_xy = 128
size_z = 10
rng = np.random.default_rng(0)
data = rng.poisson(mean_val, size=(size_z, size_xy, size_xy)).astype(np.uint8)

# write the image data
store = parse_url(path, mode="w").store
root = zarr.group(store=store)
write_image(image=data, group=root, axes="zyx", storage_options=dict(chunks=(1, size_xy, size_xy)))
# optional rendering settings
root.attrs["omero"] = {
    "channels": [{
        "color": "00FFFF",
        "window": {"start": 0, "end": 20, "min": 0, "max": 255},
        "label": "random",
        "active": True,
    }]
}


def create_blobs(fraction):
    # add labels...
    blobs = binary_blobs(length=size_xy, volume_fraction=fraction, n_dim=3).astype('int8')
    blobs2 = binary_blobs(length=size_xy, volume_fraction=fraction, n_dim=3).astype('int8')
    # blobs will contain values of 1, 2 and 0 (background)
    blobs += 2 * blobs2
    # label.shape is (size_xy, size_xy, size_xy), Slice to match the data
    label = blobs[:size_z, :, :]
    return label


# write the labels to /labels
labels_grp = root.create_group("labels")
# the 'labels' .zattrs lists the named labels data

label_names = ["small_blobs", "big_blobs"]
fractions = [0.1, 0.4]

labels_grp.attrs["labels"] = label_names

for label_name, fraction, blue in zip(label_names, fractions, [0, 255]):

    label_grp = labels_grp.create_group(label_name)
    # need 'image-label' attr to be recognized as label
    label_grp.attrs["image-label"] = {
        "colors": [
            {"label-value": 1, "rgba": [255, 0, blue, 255]},
            {"label-value": 2, "rgba": [0, 255, blue, 255]},
            {"label-value": 3, "rgba": [255, 255, blue, 255]}
        ]
    }

    label_data = create_blobs(fraction)
    write_image(label_data, label_grp, axes="zyx")

[Screenshot attached: 2024-02-19 at 12:53:02]

@jni
Contributor Author

jni commented Feb 20, 2024

> Conceptually this also makes sense to me, as labels without a main image are just a different image.

No, labels have very specific display and interpretation implications — they should ~never be displayed with a continuous colormap, for example. Essentially, they are categorical data, and there needs to be an indication of this, so that they are not interpreted as numerical.

Come to think of it, coining the term "categorical image" might not be a terrible idea... 🤔
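
To make the categorical point concrete, here is a minimal sketch contrasting a discrete lookup with a continuous mapping. The `colors` list follows the shape of the "image-label" metadata in the updated script earlier in the thread. `build_lookup`, `colorize`, and `gray_ramp` are hypothetical helper names, the transparent background for label 0 is an assumption, and plain nested lists stand in for arrays.

```python
# "colors" entries in the shape stored under the "image-label" attribute.
colors = [
    {"label-value": 1, "rgba": [255, 0, 0, 255]},
    {"label-value": 2, "rgba": [0, 255, 0, 255]},
    {"label-value": 3, "rgba": [255, 255, 0, 255]},
]

BACKGROUND = (0, 0, 0, 0)  # assumption: label 0 is transparent background


def build_lookup(colors):
    """Map each label value to its RGBA tuple."""
    return {c["label-value"]: tuple(c["rgba"]) for c in colors}


def colorize(labels, lookup):
    """Categorical display: each label value gets its own unrelated color."""
    return [[lookup.get(v, BACKGROUND) for v in row] for row in labels]


def gray_ramp(labels, max_value):
    """Continuous display, shown only for contrast: implies 3 is 'more' than 1."""
    return [[round(255 * v / max_value) for v in row] for row in labels]


labels = [[0, 1], [2, 3]]
categorical = colorize(labels, build_lookup(colors))
continuous = gray_ramp(labels, max_value=3)
print(categorical)
print(continuous)
```

The gray ramp orders the labels by intensity, which suggests a numerical relationship that categorical data does not have; the lookup keeps each label distinct without implying any ordering.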

> See this example, which allows me to load multiple labels in napari...

Awesome, thanks @will-moore! Three questions:

  • Would you like me to submit this as an added docs example to ome-zarr-py? I think that would be handy in the future.
  • What about a standalone label image? I was able to get the plugin to read standalone label images by adding an "image-label": {} key to the root group. I don't mind doing that, and it would be good to document, but I don't know whether I'm abusing the implementation and/or the spec. Personally I think it's important to be able to write out label images without their corresponding image data.
  • In such situations, I might want to link out to an external path (though of course that link could be broken if it's a filesystem link). Is this supported/planned? (I am agnostic as to whether this is a good idea. 😅)

@will-moore
Member

  • Yes, please add to docs 👍
  • I was a bit confused by your "get the plugin to read standalone label images". What I understand now is "get the plugin to treat the [standalone] images as labels" by adding "image-label": {} alongside the "multiscales" in the root group. I don't think this is abusing the spec but yes, it could certainly do with better documentation.
  • External paths/links have been discussed a fair bit, e.g. Remote links ngff#13. I know it's discussed elsewhere too, but I can't find it now.
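
As a rough illustration of the second point, the root attributes of a standalone label image would carry both keys side by side. This is a sketch with plain dictionaries: the "multiscales" entry is an abbreviated placeholder, `looks_like_label_image` is a hypothetical helper, and with a zarr group the corresponding step would presumably be `root.attrs["image-label"] = {}` after calling `write_image`.

```python
import json

# Abbreviated root attributes for a standalone label image. The "multiscales"
# content is a placeholder; write_image would normally generate it.
root_attrs = {
    "multiscales": [
        {
            "axes": [
                {"name": "z", "type": "space"},
                {"name": "y", "type": "space"},
                {"name": "x", "type": "space"},
            ],
            "datasets": [{"path": "0"}],
        }
    ],
    # In this thread, an empty mapping was enough for the plugin to treat
    # the image as labels.
    "image-label": {},
}


def looks_like_label_image(attrs):
    """True if the attributes describe an image that is also marked as labels."""
    return "multiscales" in attrs and "image-label" in attrs


print(json.dumps(root_attrs, indent=2))
print(looks_like_label_image(root_attrs))
```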

In looking, I also found this (standalone labels) ome/ngff#179 which is relevant to this discussion.

@jni
Contributor Author

jni commented Feb 21, 2024

Fantastic, thanks for the links and resources. I'll try to get things moving on this. 😊

@imagesc-bot

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/save-a-single-labels-dataset-into-an-ome-zarr/93505/5

@d-v-b

d-v-b commented Mar 14, 2024

> No, labels have very specific display and interpretation implications — they should ~never be displayed with a continuous colormap, for example. Essentially, they are categorical data, and there needs to be an indication of this, so that they are not interpreted as numerical.

@jni, over in ome/ngff#203 I proposed (a) adding some metadata to describe the units of the values of an image, and (b) using that metadata to convey the "categoricalness" of an image. I'd be interested in your feedback on that idea.
