# Save data as OME-ZARR
(tutorial:save_ome_zarr)=

OME-ZARR is an emerging standard for storing bioimaging data in a cloud-friendly format.
In this tutorial, we will learn how to save microscopy images as OME-ZARR files using Python.

We will explore two of the many libraries that support writing OME-ZARR files:

- [ome-zarr-py](https://github.com/ome/ome-zarr-py)
- [ngff-zarr](https://github.com/fideus-labs/ngff-zarr)

In [1]:
import ngff_zarr as nz
from ome_zarr.writer import write_image
from ome_zarr.io import parse_url
from ome_zarr.format import FormatV05
from ome_zarr.scale import Scaler
from dask import array as da
import zarr
import shutil
from bioio import BioImage

from skimage import data


## A simple example: ngff-zarr

Before we go big, let's create and interact with OME-ZARR data using a few common Python libraries.
We use the [cells3d dataset from skimage](https://scikit-image.org/docs/0.25.x/api/skimage.data.html#skimage.data.cells3d) as an example.
The data comes in ZCYX order:


In [2]:
image = data.cells3d()
image.shape

(60, 2, 256, 256)

We first lazily convert the image to an `NgffImage` like this:

In [3]:
ngff_image = nz.to_ngff_image(
    data=image,
    dims=['z', 'c', 'y', 'x'],
    name='cells3d')
ngff_image

NgffImage(data=dask.array<array, shape=(60, 2, 256, 256), dtype=uint16, chunksize=(60, 2, 256, 256), chunktype=numpy.ndarray>, dims=['z', 'c', 'y', 'x'], scale={'z': 1.0, 'y': 1.0, 'x': 1.0}, translation={'z': 0.0, 'y': 0.0, 'x': 0.0}, name='cells3d', axes_units=None, axes_orientations=None, computed_callbacks=[])

We can also inspect what ngff_zarr intends to do with the data when saving it as OME-ZARR:

In [4]:
ngff_image.data

| | Array | Chunk |
|---|---|---|
| Bytes | 15.00 MiB | 15.00 MiB |
| Shape | (60, 2, 256, 256) | (60, 2, 256, 256) |
| Dask graph | 1 chunks in 1 graph layer | |
| Data type | uint16 numpy.ndarray | |


### Chunking

That's a bit boring. The real strength of OME-ZARR is its chunked nature,
which splits large datasets into smaller pieces (chunks) that can be accessed independently.
Let's specify chunk sizes when converting to an `NgffImage`.
For this, we turn the image into a *lazy* dask array first, and then pass it on to `to_ngff_image`.

```{hint}
Choosing the right chunk size is an important decision that depends on the application.
For instance, it is usually a good idea to chunk along an axis that will be accessed frequently.

For multi-channel images, it is a good idea to have channels in separate chunks, so that when visualizing a single channel,
only the relevant chunks need to be loaded.

```

In [5]:
lazy_array = da.from_array(image, chunks=(10, 1, 64, 64))  # z, c, y, x
reordered_array = lazy_array.transpose(1, 0, 2, 3)  # c, z, y, x

`ZCYX` is a somewhat uncommon dimension order, so we reorder the array to `CZYX` and specify the dimension names when converting to an `NgffImage`.
After the transpose, the chunk size is `(1, 10, 64, 64)`, meaning that each chunk contains 1 channel, 10 z-slices, and a 64x64 pixel area in y and x.
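
To get a feeling for what a chunk shape implies, the following back-of-the-envelope sketch computes the number of chunks, the uncompressed size of one chunk, and how many chunks must be read to load a single channel. It uses only the standard library; the helper names (`chunk_grid`, `chunk_nbytes`, `chunks_touched`) are made up for this illustration and are not part of dask or ngff-zarr.

```python
import math


def chunk_grid(shape, chunks):
    """Number of chunks along each axis."""
    return tuple(math.ceil(s / c) for s, c in zip(shape, chunks))


def chunk_nbytes(chunks, itemsize):
    """Uncompressed size of one full chunk in bytes."""
    return math.prod(chunks) * itemsize


def chunks_touched(region, chunks):
    """Number of chunks overlapping a region given as (start, stop) per axis."""
    return math.prod(
        math.ceil(stop / c) - start // c
        for (start, stop), c in zip(region, chunks)
    )


shape = (2, 60, 256, 256)  # c, z, y, x
chunks = (1, 10, 64, 64)
itemsize = 2  # bytes per uint16 value

grid = chunk_grid(shape, chunks)
n_chunks = math.prod(grid)
kib_per_chunk = chunk_nbytes(chunks, itemsize) / 1024

# Region covering channel 0 only, but all of z, y and x
channel0 = [(0, 1), (0, 60), (0, 256), (0, 256)]
needed = chunks_touched(channel0, chunks)

print(grid, n_chunks, kib_per_chunk, needed)
```

With a single chunk covering the whole array, the same read would touch only one chunk, but that chunk holds both channels, so twice as much data as needed would be loaded.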

Let's *assume* that we know the pixel scaling for the image data.
We can specify this as well when converting to an NgffImage:

In [6]:
ngff_image = nz.to_ngff_image(
    data=reordered_array,
    dims=['c', 'z', 'y', 'x'],
    name='cells3d',
    scale={'z': 0.5, 'y': 0.5, 'x': 0.5}
    )
ngff_image.data

| | Array | Chunk |
|---|---|---|
| Bytes | 15.00 MiB | 80.00 kiB |
| Shape | (2, 60, 256, 256) | (1, 10, 64, 64) |
| Dask graph | 192 chunks in 2 graph layers | |
| Data type | uint16 numpy.ndarray | |


### Multiscales

Another important feature of OME-ZARR is the support for multiscale data.
This means that multiple resolutions of the same image are stored together,
which allows for efficient visualization and analysis of large images.
The cool thing about `ngff_zarr` and `ome-zarr-py` is that they can automatically generate multiscale images for us *and* calculate the correct metadata (scale, etc.).

In other words, when zooming out, lower resolution versions of the image can be used,
which are faster to load and render.
Let's try it!

In [7]:
ngff_multiscales = nz.to_multiscales(
    data=ngff_image,
    scale_factors = [2, 4]
)

ngff_multiscales

Multiscales(images=[NgffImage(data=dask.array<rechunk-merge, shape=(2, 60, 256, 256), dtype=uint16, chunksize=(2, 60, 128, 128), chunktype=numpy.ndarray>, dims=['c', 'z', 'y', 'x'], scale={'z': 0.5, 'y': 0.5, 'x': 0.5}, translation={'z': 0.0, 'y': 0.0, 'x': 0.0}, name='cells3d', axes_units=None, axes_orientations=None, computed_callbacks=[]), NgffImage(data=dask.array<setitem, shape=(2, 30, 128, 128), dtype=uint16, chunksize=(2, 30, 128, 128), chunktype=numpy.ndarray>, dims=('c', 'z', 'y', 'x'), scale={'z': 1.0, 'y': 1.0, 'x': 1.0}, translation={'z': 0.25, 'y': 0.25, 'x': 0.25}, name='image', axes_units=None, axes_orientations=None, computed_callbacks=[]), NgffImage(data=dask.array<setitem, shape=(2, 15, 64, 64), dtype=uint16, chunksize=(2, 15, 64, 64), chunktype=numpy.ndarray>, dims=('c', 'z', 'y', 'x'), scale={'z': 2.0, 'y': 2.0, 'x': 2.0}, translation={'z': 0.75, 'y': 0.75, 'x': 0.75}, name='image', axes_units=None, axes_orientations=None, computed_callbacks=[])], metadata=Metadata(

#### Exercise

Explore a bit how the multiscale data is structured.

- How do you find the individual scales?
- How do you find the scale factors?

In [8]:
ngff_multiscales.images

[NgffImage(data=dask.array<rechunk-merge, shape=(2, 60, 256, 256), dtype=uint16, chunksize=(2, 60, 128, 128), chunktype=numpy.ndarray>, dims=['c', 'z', 'y', 'x'], scale={'z': 0.5, 'y': 0.5, 'x': 0.5}, translation={'z': 0.0, 'y': 0.0, 'x': 0.0}, name='cells3d', axes_units=None, axes_orientations=None, computed_callbacks=[]),
 NgffImage(data=dask.array<setitem, shape=(2, 30, 128, 128), dtype=uint16, chunksize=(2, 30, 128, 128), chunktype=numpy.ndarray>, dims=('c', 'z', 'y', 'x'), scale={'z': 1.0, 'y': 1.0, 'x': 1.0}, translation={'z': 0.25, 'y': 0.25, 'x': 0.25}, name='image', axes_units=None, axes_orientations=None, computed_callbacks=[]),
 NgffImage(data=dask.array<setitem, shape=(2, 15, 64, 64), dtype=uint16, chunksize=(2, 15, 64, 64), chunktype=numpy.ndarray>, dims=('c', 'z', 'y', 'x'), scale={'z': 2.0, 'y': 2.0, 'x': 2.0}, translation={'z': 0.75, 'y': 0.75, 'x': 0.75}, name='image', axes_units=None, axes_orientations=None, computed_callbacks=[])]
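
The `scale` and `translation` values of the lower-resolution levels shown above can be reproduced by hand. The sketch below assumes that each output pixel summarizes a block of `factor` input pixels along every spatial axis, so the pixel size grows by `factor` and the pixel centre shifts by `(factor - 1) / 2` input pixels. It illustrates the arithmetic only and is not ngff-zarr's actual implementation; `level_metadata` is a hypothetical helper.

```python
def level_metadata(shape, dims, scale, translation, factor, spatial=("z", "y", "x")):
    """Shape, scale and translation of a level downsampled by `factor`."""
    # Only spatial axes shrink; the channel axis keeps its size.
    new_shape = tuple(
        size // factor if dim in spatial else size
        for dim, size in zip(dims, shape)
    )
    # Each new pixel covers `factor` old pixels along every spatial axis.
    new_scale = {dim: s * factor for dim, s in scale.items()}
    # The centre of a block of `factor` pixels lies (factor - 1) / 2 pixels
    # away from the centre of the first pixel in that block.
    new_translation = {
        dim: translation[dim] + (factor - 1) / 2 * scale[dim] for dim in scale
    }
    return new_shape, new_scale, new_translation


dims = ("c", "z", "y", "x")
shape = (2, 60, 256, 256)
scale = {"z": 0.5, "y": 0.5, "x": 0.5}
translation = {"z": 0.0, "y": 0.0, "x": 0.0}

levels = {f: level_metadata(shape, dims, scale, translation, f) for f in (2, 4)}
for factor, (lvl_shape, lvl_scale, lvl_translation) in levels.items():
    print(factor, lvl_shape, lvl_scale, lvl_translation)
```

For axis lengths that are not divisible by the factor, libraries may round or pad differently than the floor division used here.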

### Saving to disk

Finally, we can save the multiscale OME-ZARR to disk like this.
An important parameter here is the `version`, which specifies the OME-NGFF version to use.
Currently, version `0.5` is the latest released version of the [ngff specification](https://ngff.openmicroscopy.org/latest/index.html).

In [9]:
nz.to_ngff_zarr(
    store='cells3d.ome.zarr',
    multiscales=ngff_multiscales,
    version='0.5'
)

```{warning}
Note that running this command will always overwrite existing data at the specified location!
```

Lastly, use your file browser of choice to navigate to the saved `cells3d.ome.zarr` folder and explore its contents.
You should see a structure similar to this:

```
cells3d.ome.zarr/
├── zarr.json
├── scale0/
│   ├── zarr.json
│   └── cells3d/
│       └── c/
│           ├── 0/
│           └── 1/
├── scale1/
│   ├── zarr.json
│   └── cells3d/
│       └── c/
│           ├── 0/
│           └── 1/
...
```

This structure essentially reflects the chunking (`1` along the channel axis) and the multiscale structure (`scale0`, `scale1`, ...) that we specified earlier.
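
Because the metadata is stored as plain JSON, it can also be inspected without any OME-ZARR library. The following sketch assumes the OME-Zarr 0.5 layout, in which the root group's `zarr.json` file carries the multiscale metadata under `attributes` → `ome` → `multiscales`; `read_dataset_scales` is a hypothetical helper written for this tutorial.

```python
import json
from pathlib import Path


def read_dataset_scales(root):
    """Map each dataset path to its 'scale' transformation (OME-Zarr 0.5 layout assumed)."""
    metadata = json.loads((Path(root) / "zarr.json").read_text())
    multiscales = metadata["attributes"]["ome"]["multiscales"]
    scales = {}
    # Only the first multiscale image is inspected in this sketch.
    for dataset in multiscales[0]["datasets"]:
        for transform in dataset["coordinateTransformations"]:
            if transform["type"] == "scale":
                scales[dataset["path"]] = transform["scale"]
    return scales


store = Path("cells3d.ome.zarr")
if (store / "zarr.json").exists():
    for path, scale in read_dataset_scales(store).items():
        print(path, scale)
```

Reading the JSON directly is mainly useful for debugging; for real workflows a reader library is the more robust choice.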

#### Exercise

Play around with different chunk sizes and see how this affects the structure of the saved OME-ZARR file.

## *Optional*: Repeat with ome-zarr-py

The `ome-zarr-py` library provides similar functionality for saving OME-ZARR files.
Let's repeat the steps above with this library.

`ome-zarr-py` mirrors the read/write functionality of [zarr-python](https://zarr.readthedocs.io/en/stable/) more closely, which may or may not be an advantage depending on your use case. First, specify a target directory for the OME-ZARR file. The code below removes the directory if it already exists, so that we start from an empty location:

In [10]:
target_folder = r'cells3d_ome_zarr_py.ome.zarr'
shutil.rmtree(target_folder, ignore_errors=True)

store = parse_url(target_folder, mode='w').store
root = zarr.group(store=store)

### Multiscales and metadata

Applying the scaling and setting the `scale` information is slightly different here. First, we need to create a `Scaler` object, which takes care of the downsampling for us. The `downscale` parameter controls the downsampling factor between each scale level, and `max_layer` specifies how many levels to create. The `method` parameter defines the downsampling method to use.

In [11]:
max_layer = 2
factor = 2
scaler = Scaler(
    downscale=factor,
    max_layer=max_layer,
    method='local_mean'
)
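
To illustrate what a local-mean downsampling does conceptually, here is a small pure-Python sketch for a single 2D plane: every `factor` x `factor` block of pixels is replaced by its average. This is not the `Scaler` implementation; `local_mean_2d` is a hypothetical helper, and as a simplification it drops incomplete blocks at the border, whereas a library may pad them instead.

```python
def local_mean_2d(image, factor):
    """Downsample a 2D list of numbers by averaging factor x factor blocks."""
    n_rows = len(image) // factor
    n_cols = len(image[0]) // factor
    result = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Collect all pixels of the block that maps to output pixel (r, c)
            block = [
                image[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


plane = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
downsampled = local_mean_2d(plane, factor=2)
print(downsampled)
```

Applying the function repeatedly to its own output yields the successive pyramid levels.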


If we want to pass scale information, we need to create the relevant metadata structure ourselves and pass it to the writer:

In [12]:
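scale = [1.0, 0.5, 0.5, 0.5]  # c, z, y, x

# The pixel size grows with every downsampling step, so multiply by factor ** i.
# Assumption: z, y and x are all downsampled by `factor` at each level;
# if the scaler only downsamples y and x, keep the z scale constant instead.
transformations = []
for i in range(max_layer + 1):
    scales = [1] + [s * (factor ** i) for s in scale[1:]]
    transformations.append(
        [{'type': 'scale', 'scale': scales}]
    )

transformations

[[{'type': 'scale', 'scale': [1, 0.5, 0.5, 0.5]}],
 [{'type': 'scale', 'scale': [1, 1.0, 1.0, 1.0]}],
 [{'type': 'scale', 'scale': [1, 2.0, 2.0, 2.0]}]]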

#### Exercise

Use the `write_image` function to save the multiscale OME-ZARR file and the metadata we created to disk.

In [None]:
write_image?


## Reading

While you *can* use the libraries above to read OME-ZARR,
there is merit in using a more general library that can read multiple formats.
If you are using multiple formats in your work,
this allows your workflows to be format-agnostic.
One such library is [Bioio](https://github.com/bioio-devs/bioio),
which is the successor of the well-known [aicsimageio](https://allencellmodeling.github.io/aicsimageio/) library.
Here's how to read an OME-ZARR file with Bioio:

In [14]:
image = BioImage('cells3d.ome.zarr/')
image

<BioImage [plugin: bioio-ome-zarr installed at 2025-11-06 14:52:44.497175, Image-is-in-Memory: False]>

#### Exercise

Try to find out the scaling information from the loaded image.