# Reading OME-Zarr files

[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ome/EMBL-EBI-imaging-course-04-2024/blob/main/Day_5/Reading_images.ipynb)


## Learning Objectives

* Access OME-Zarr files over https
* Learn how to access a local OME-Zarr file in Python
* Learn how to access a remote OME-Zarr file in Python

There are several ways to access data. For the purpose of the topics covered in this workshop, we will access files over ``https`` and use [dask](https://dask.org/).

Some software packages need all the 2D planes in memory in order to work; others can work one plane at a time. We will now show two mechanisms for accessing the data, depending on the need, using ``dask.array.from_zarr``.


## Launch

This notebook uses the ``environment.yml`` file.

See [Setup](./workshop.ipynb).

### Install dependencies if required

The cell below will install dependencies if you choose to run the notebook in [Google Colab](https://colab.research.google.com/notebooks/intro.ipynb#recent=true). **Do not run the cell if you are not running the notebook on Google Colab**.


If using Google Colab, **do not** use the ``Runtime>Run all`` entry.

In [None]:
%pip install aiohttp==3.8.4 zarr==2.14.2

## How to access local OME-Zarr file using Python

In the [Conversion](./Conversion.ipynb) notebook, we viewed the locally generated OME-Zarr file using napari.
Now we will open the file using Python, which is useful when analysing the data.

We first look at an existing ``ome.zarr`` file, i.e. ``mri.ome.zarr``.

In [1]:
import dask
import dask.array as da
from dask.diagnostics import ProgressBar
import numpy

In [2]:
def load_binary_from_local_with_data(path):
    with ProgressBar():
        return numpy.asarray(da.from_zarr(path))

**Do not run the cell below if running in Colab**

In [4]:
%%time
image_location = 'images/mri.ome.zarr/s0'
data = load_binary_from_local_with_data(image_location)
print(data.shape)

[########################################] | 100% Completed | 106.21 ms
(27, 226, 186)
CPU times: user 171 ms, sys: 70 ms, total: 241 ms
Wall time: 1.95 s


**Exercise**: if you have generated a file locally as part of the [conversion workflow](Conversion.ipynb), change the ``image_location`` parameter to, for example, ``/tmp/conversion_out/B4_C3.ome.zarr/0/0``.
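
The last path component in ``image_location`` (``s0`` above, ``0`` in the exercise path) selects one array inside the image. As a minimal sketch, assuming the image group follows the OME-Zarr ``multiscales`` convention on a Zarr v2 store (metadata in a ``.zattrs`` JSON file at the image root), the available array paths can be listed with the standard library alone. The helper name ``list_resolution_paths`` is hypothetical and not part of any library.

```python
import json
from pathlib import Path


def list_resolution_paths(image_root):
    """Return the array paths listed in the first ``multiscales`` entry.

    Assumes a Zarr v2 layout where the image group stores its
    OME-Zarr metadata in ``<image_root>/.zattrs``.
    """
    attrs_file = Path(image_root) / ".zattrs"
    attrs = json.loads(attrs_file.read_text())
    multiscales = attrs.get("multiscales", [])
    if not multiscales:
        # No multiscales metadata found at this level
        return []
    return [dataset["path"] for dataset in multiscales[0]["datasets"]]
```

For example, ``list_resolution_paths('images/mri.ome.zarr')`` would list the arrays of the sample image if its metadata follows this layout; an empty list means no ``multiscales`` entry was found at that level.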

**Do not run the cell below if running in Colab**

In [5]:
import matplotlib.pyplot as plt
%matplotlib inline
from ipywidgets import *

n = 2
if len(data.shape) == 3:
    n = 0
    
def update(z=0):
    fig = plt.figure(figsize=(10, 10))
    plt.subplot(121)
    c = 1
    t = 0
    if len(data.shape) == 3: 
        plt.imshow(data[z, :, :])
    else:
        plt.imshow(data[t, c, z, :, :])
    fig.canvas.flush_events()




**Do not run the cell below if running in Colab**

In [6]:
interact(update, z= widgets.IntSlider(value=0, min=0, max=data.shape[n]-1, step=1, description="Select slice", continuous_update=False))

interactive(children=(IntSlider(value=0, continuous_update=False, description='Select slice', max=26), Output(…

<function __main__.update(z=0)>

In [7]:
import matplotlib.pyplot as plt
%matplotlib inline
from ipywidgets import *


    
# Redefine update for 5D data in (T, C, Z, Y, X) order
def update(z=0):
    fig = plt.figure(figsize=(10, 10))
    plt.subplot(121)
    c = 1
    t = 0
    plt.imshow(data[t, c, z, :, :])
    fig.canvas.flush_events()


## How to access OME-Zarr file on S3

To view the data in S3, several options are possible. 
For the purpose of this workshop, we will view the data over ``https``.

## Read the OME-Zarr file stored in S3 using Python
We use the same image as in Day 4. The TIFF image has been converted to OME-Zarr and is available on S3.

In [8]:
image_id = 6001247

In [9]:
ENPOINT_URL = 'https://uk1s3.embassy.ebi.ac.uk/'

### Option 1: Lazy Loading

The method below will return a dask array **without** any binary data i.e. **lazy loading**. The dimension order of the array returned is ``(TCZYX)``. 

The main point to keep in mind is that binary data is not loaded until it is used, i.e. it is **lazily loaded**. 
The plane will be loaded when the slider is moved.
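
The behaviour can be tried without any network access. The sketch below is an illustration only: it uses a small in-memory NumPy array as a stand-in for the remote image (an assumption made here so the example is self-contained) and wraps it with ``da.from_array``. Indexing returns another dask array, and pixel values are only produced when ``compute()`` is called.

```python
import dask.array as da
import numpy as np

# Small in-memory stand-in for a 5D image in (T, C, Z, Y, X) order
source = np.arange(1 * 2 * 4 * 8 * 8, dtype=np.uint16).reshape(1, 2, 4, 8, 8)

# One chunk per 2D plane, so selecting a plane touches a single chunk
lazy = da.from_array(source, chunks=(1, 1, 1, 8, 8))

# Indexing only builds a task graph; nothing is read yet
plane = lazy[0, 1, 2, :, :]
print(type(plane), plane.shape)

# compute() executes the graph and returns a NumPy array
loaded = plane.compute()
print(type(loaded), loaded.shape)
```

With an array opened through ``da.from_zarr``, the same ``compute()`` step is where the chunks for the selected plane are actually fetched.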

In [18]:
def load_binary_from_s3(name, resolution='0'):
    root = '%s/%s/' % (name, resolution)
    return da.from_zarr(ENPOINT_URL + root)

In [19]:
%%time 
name = 'idr/zarr/v0.1/%s.zarr' % image_id
data = load_binary_from_s3(name)
print(data.shape)

(1, 2, 257, 210, 253)
CPU times: user 21.1 ms, sys: 17 ms, total: 38.1 ms
Wall time: 316 ms
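
``load_binary_from_s3`` builds the URL by plain string concatenation, which only works as long as the endpoint ends with exactly one ``/``. A small sketch of a more forgiving variant follows; ``build_array_url`` is a hypothetical helper introduced here for illustration and is not part of the workshop code.

```python
def build_array_url(endpoint, name, resolution="0"):
    """Join endpoint, image name and resolution level into one URL.

    Leading and trailing slashes on each part are normalised so the
    result contains exactly one "/" between components.
    """
    return "%s/%s/%s/" % (
        endpoint.rstrip("/"),
        name.strip("/"),
        str(resolution).strip("/"),
    )


url = build_array_url("https://uk1s3.embassy.ebi.ac.uk/", "idr/zarr/v0.1/6001247.zarr")
print(url)
```

The resulting string can be passed directly to ``da.from_zarr``.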


In [29]:
# Each plane is loaded when the slider is moved
interact(update, z= widgets.IntSlider(value=0, min=0, max=data.shape[2]-1, step=1, description="Select Z", continuous_update=False))

interactive(children=(IntSlider(value=0, continuous_update=False, description='Select Z', max=256), Output()),…

<function __main__.update(z=0)>

### Option 2: Load the binary
Here we load the binary data for the whole 5D image into memory. This might be required when using software that needs access to the full 5D image to analyse the data. Use this approach only if the full 5D image is required.
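
Before loading everything, it can help to estimate how much memory the full image needs. The sketch below uses the shape printed earlier, ``(1, 2, 257, 210, 253)``; the bytes per pixel depend on the array's dtype, which is not stated in this notebook, so several values are shown as assumptions. ``estimate_nbytes`` is a hypothetical helper.

```python
import math


def estimate_nbytes(shape, itemsize):
    """Uncompressed size in bytes of an array with the given shape."""
    return math.prod(shape) * itemsize


shape = (1, 2, 257, 210, 253)

# itemsize is an assumption: 1 for 8-bit, 2 for 16-bit, 4 for 32-bit pixels
for itemsize in (1, 2, 4):
    mib = estimate_nbytes(shape, itemsize) / 2**20
    print("%d byte(s) per pixel: %.1f MiB" % (itemsize, mib))
```

On a lazily loaded dask array, ``data.dtype`` and ``data.nbytes`` report the actual values without downloading any pixel data.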

In [20]:
def load_binary_from_s3_with_data(name, resolution='0'):
    root = '%s/%s/' % (name, resolution)
    with ProgressBar():
        return numpy.asarray(da.from_zarr(ENPOINT_URL + root))

In [21]:
%%time 
name = 'idr/zarr/v0.1/%s.zarr' % image_id
data = load_binary_from_s3_with_data(name)
print(data.shape)

[########################################] | 100% Completed | 19.20 s
(1, 2, 257, 210, 253)
CPU times: user 3.8 s, sys: 728 ms, total: 4.53 s
Wall time: 19.6 s


This time, when the slider is moved, the plane is read from memory since the whole image has already been downloaded.

In [32]:
interact(update, z= widgets.IntSlider(value=0, min=0, max=data.shape[2]-1, step=1, description="Select Z", continuous_update=False))

interactive(children=(IntSlider(value=0, continuous_update=False, description='Select Z', max=256), Output()),…

<function __main__.update(z=0)>

### License (BSD 2-Clause)
Copyright (C) 2023 University of Dundee. All Rights Reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.