# Solution to the exercises from [Zarr notebook](1_Zarr.ipynb)

## Exercise: Load data from S3 and segment 2D-plane in parallel 

Define one method to "load" the data and one to analyze it.
We use ``image_id = 4007801``.

In [24]:
image_id = 4007801

The method below returns a dask array **without** loading any binary data. The dimension order of the returned array is ``(T, C, Z, Y, X)``. Data is only loaded when it is requested during the analysis.

In [25]:
import dask
import dask.array as da

def load_binary_from_s3(id, resolution='4'):
    endpoint_url = 'https://uk1s3.embassy.ebi.ac.uk/'
    root = 'idr/zarr/v0.1/%s.zarr/%s/' % (id, resolution)
    return da.from_zarr(endpoint_url + root)
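
The URL handed to ``da.from_zarr`` is just the endpoint followed by a path made of the image ID and the resolution level. A minimal sketch of that string construction, without any network access, could look like this; the helper name ``zarr_url`` and its keyword arguments are hypothetical and only mirror the values used in the cell above.

```python
def zarr_url(image_id, resolution="4",
             endpoint_url="https://uk1s3.embassy.ebi.ac.uk/",
             prefix="idr/zarr/v0.1"):
    """Build the URL of one resolution level of an image stored as Zarr.

    Hypothetical helper: it reproduces the string built in
    load_binary_from_s3 and does not contact the server.
    """
    return "%s%s/%s.zarr/%s/" % (endpoint_url, prefix, image_id, resolution)


# Same ID as in this exercise, at two different resolution levels
print(zarr_url(4007801))
print(zarr_url(4007801, resolution="0"))
```

Keeping the URL construction separate from the loading makes it easy to check which array is going to be opened before any request is made.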

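Define the analysis method to use with ``dask.delayed``. It reads one 2D plane from the global ``data`` array, smooths it, thresholds it at 33% of its maximum and labels the resulting regions.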

In [26]:
import dask_image.ndfilters
import dask_image.ndmeasure

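def analyze(t, c, z):
    # Select one 2D plane; `data` is the global dask array loaded below
    plane = data[t, c, z, :, :]
    # Smooth the plane to reduce noise before thresholding
    smoothed_image = dask_image.ndfilters.gaussian_filter(plane, sigma=[1, 1])
    # Threshold at 33% of the maximum intensity of the smoothed plane
    threshold_value = 0.33 * da.max(smoothed_image).compute()
    threshold_image = smoothed_image > threshold_value
    # Label the connected regions of the thresholded plane
    label_image, num_labels = dask_image.ndmeasure.label(threshold_image)
    name = "t:%s, c: %s, z:%s" % (t, c, z)
    return label_image, name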
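
To make the threshold-then-label idea concrete on a grid small enough to check by hand, here is a pure-Python sketch. It is not what ``dask_image`` does internally: the Gaussian smoothing is skipped, and the labelling is a simple flood fill with 4-connectivity, which is an assumption chosen for illustration. The function name ``threshold_and_label`` is hypothetical.

```python
from collections import deque


def threshold_and_label(plane, fraction=0.33):
    """Threshold a 2D list at fraction * max, then label connected regions.

    Illustrative stand-in only: 4-connectivity flood fill, no smoothing.
    """
    cutoff = fraction * max(max(row) for row in plane)
    rows, cols = len(plane), len(plane[0])
    mask = [[value > cutoff for value in row] for row in plane]
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for y in range(rows):
        for x in range(cols):
            if not mask[y][x] or labels[y][x]:
                continue  # background pixel, or already labelled
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:  # breadth-first flood fill of one region
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Toy plane: the maximum is 9, so the cutoff is 2.97 and the "2" is background
plane = [
    [0, 9, 9, 0, 0],
    [0, 9, 0, 0, 2],
    [0, 0, 0, 8, 8],
]
labels, num_labels = threshold_and_label(plane)
for row in labels:
    print(row)
print("regions:", num_labels)
```

Because the cutoff is relative to the maximum of each plane, every plane gets its own threshold, which is also the case in ``analyze``.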

Load the dask array. This is very quick since we **do not** load any binary data.

In [27]:
%time data = load_binary_from_s3(image_id)

CPU times: user 22.4 ms, sys: 46.1 ms, total: 68.5 ms
Wall time: 452 ms


Check the shape of the array.

In [7]:
print(data.shape)

(532, 2, 988, 128, 135)


Now use ``dask.delayed`` to segment 2D planes around the middle z-section and the middle timepoint, for the channel at index ``1``.
Again, this is very quick since we only build the task graph and do not perform the analysis yet.

In [8]:
%%time
lazy_results = []
middle_z = data.shape[2] // 2
middle_t = data.shape[0] // 2
range_t = 5
range_z = 5
for t in range(middle_t - range_t, middle_t + range_t):
    for z in range(middle_z - range_z, middle_z + range_z):
        lazy_result = dask.delayed(analyze)(t, 1, z)
        lazy_results.append(lazy_result)
print(lazy_results)

[Delayed('analyze-54e5c532-603c-4dbd-b496-1aaeb9ae5680'), Delayed('analyze-db5a4511-ffa6-4ea7-886b-e40f2a45643a'), Delayed('analyze-ab644f68-30fb-4ccd-b7dd-993760f23ba7'), Delayed('analyze-e66859b0-6911-4d44-b259-87d3826da0ae'), Delayed('analyze-de278989-18ea-40cf-9e23-c5f4bfef023d'), Delayed('analyze-6e096205-8eae-40e8-b128-ddcb8e93b2d6'), Delayed('analyze-32da6a70-3d3c-4deb-8f81-10d7e488e9aa'), Delayed('analyze-232ff2e5-d119-411b-a883-4edc8be70670'), Delayed('analyze-b2d0da79-7129-4eeb-8799-602ef929fb50'), Delayed('analyze-8fb6ef21-1855-4101-b09d-5021547ff08d'), Delayed('analyze-aca0a367-7592-41a2-8f77-1b169ffc9b34'), Delayed('analyze-7ab7453a-4943-4033-8db7-0451c75201e7'), Delayed('analyze-f9aa1431-282a-423b-9eff-6fc12059a30d'), Delayed('analyze-6fbfedc0-27d4-47d5-b35b-9dea4e6fb558'), Delayed('analyze-4cde3011-2fb8-4b10-9228-d8a7bb5568c3'), Delayed('analyze-0aa307c4-abff-4c5b-b5c4-7c6cbbe09874'), Delayed('analyze-b123465a-b26b-463c-9bc8-6a1aa8a78474'), Delayed('analyze-4ea15fed-f54e
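
The two nested loops select a window of timepoints and a window of z-sections around the middle of each axis. The following sketch recomputes those windows from the shape printed earlier, without touching the data; the helper name ``window`` is hypothetical.

```python
from itertools import product

shape = (532, 2, 988, 128, 135)  # (T, C, Z, Y, X), as printed above
half_width = 5                   # same value as range_t and range_z


def window(size, half_width):
    """Indices from middle - half_width up to, but excluding, middle + half_width."""
    middle = size // 2
    return range(middle - half_width, middle + half_width)


t_window = window(shape[0], half_width)
z_window = window(shape[2], half_width)
planes = list(product(t_window, z_window))

print(t_window)     # range(261, 271)
print(z_window)     # range(489, 499)
print(len(planes))  # 100
```

Ten timepoints times ten z-sections gives 100 delayed tasks, one per plane, which matches the number of results the display slider iterates over further down.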

Run the analysis. This is the step where the binary data is actually requested and processed.

In [9]:
%time results = dask.compute(*lazy_results)

CPU times: user 1min 44s, sys: 5.83 s, total: 1min 50s
Wall time: 1min 51s


Display the results, using a slider to select the plane.

In [11]:
import matplotlib.pyplot as plt
%matplotlib inline
from ipywidgets import interact, widgets

def display(i=0):
    # Each result is a (label_image, name) tuple returned by analyze
    r, name = results[i]
    fig = plt.figure(figsize=(10, 10))
    plt.subplot(121)
    plt.imshow(r)
    plt.title(name)
    fig.canvas.flush_events()

interact(display, i=widgets.IntSlider(value=0, min=0, max=len(results)-1, step=1, description="Select Plane", continuous_update=False))


interactive(children=(IntSlider(value=0, continuous_update=False, description='Select Plane', max=99), Output(…

<function __main__.display(i=0)>

If you prefer scripts to the notebook, see:
* [segmentation and dask.delayed](https://github.com/ome/omero-guide-python/blob/master/scripts/idr0044_zarr_segmentation_parallel.py)
* [segmentation and cluster](https://github.com/ome/omero-guide-python/blob/master/scripts/idr0044_zarr_segmentation_cluster.py)

## Exercise: Load data from S3 and display published labels

We will use ``image_id = 6001247``. This time we load the binary data immediately, load the labels and overlay them on top of the image.

In [12]:
image_id = 6001247

First define two methods: one to load the binary data corresponding to the image and one to load the labels.

In [16]:
import dask
import dask.array as da
from dask.diagnostics import ProgressBar
import numpy

def load_binary_from_s3(id, resolution='0'):
    endpoint_url = 'https://uk1s3.embassy.ebi.ac.uk/'
    root = 'idr/zarr/v0.1/%s.zarr/%s/' % (id, resolution)
    with ProgressBar():
        return numpy.asarray(da.from_zarr(endpoint_url + root))

In [17]:
def load_labels_from_s3(id, resolution='0'):
    endpoint_url = 'https://uk1s3.embassy.ebi.ac.uk/'
    root = 'idr/zarr/v0.1/%s.zarr/labels/%s/' % (id, resolution)
    return da.from_zarr(endpoint_url + root)

Load the binary data. This time it is not instant, since the whole array is downloaded and converted to a NumPy array.

In [18]:
%%time
data = load_binary_from_s3(image_id)
print(data.shape)

[########################################] | 100% Completed | 11.4s
(1, 2, 257, 210, 253)
CPU times: user 2.38 s, sys: 732 ms, total: 3.11 s
Wall time: 12 s
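
The eager load takes longer because every element of the array has to be transferred. As a rough sketch, the number of elements can be derived from the printed shape alone. The size in memory additionally depends on the pixel type, which is not shown in this notebook, so the 2 bytes per element used below is an assumption; ``data.dtype.itemsize`` gives the actual value.

```python
import math

shape = (1, 2, 257, 210, 253)  # (T, C, Z, Y, X), as printed above
bytes_per_element = 2          # ASSUMPTION: 16-bit pixels, not confirmed here

n_elements = math.prod(shape)
size_mib = n_elements * bytes_per_element / 2**20

print(n_elements, "elements")
print(round(size_mib, 1), "MiB under the assumed pixel size")
```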


Load the labels. Note that labels are not available on all planes. The labels have only been computed for one channel (``c = 1``), so the label array has a size of 1 along the ``C`` dimension.

In [20]:
%%time
labels = load_labels_from_s3(image_id)
print(labels.shape)

(1, 1, 257, 210, 253)
CPU times: user 16 ms, sys: 17.7 ms, total: 33.8 ms
Wall time: 330 ms
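
Comparing the two printed shapes axis by axis shows where the image and the labels differ. A small sketch, with a hypothetical helper name ``differing_axes``:

```python
AXES = "TCZYX"  # dimension order used throughout this notebook


def differing_axes(image_shape, labels_shape):
    """Map each axis name to (image size, labels size) where the sizes differ."""
    return {
        axis: (image_size, labels_size)
        for axis, image_size, labels_size in zip(AXES, image_shape, labels_shape)
        if image_size != labels_size
    }


image_shape = (1, 2, 257, 210, 253)   # data.shape, as printed above
labels_shape = (1, 1, 257, 210, 253)  # labels.shape, as printed above

diff = differing_axes(image_shape, labels_shape)
print(diff)  # {'C': (2, 1)}
```

Only the ``C`` axis differs, so the image is indexed with the channel of interest while the labels are always indexed with ``0`` along ``C``.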


Display the labels overlaid on the image.

In [23]:
import matplotlib.pyplot as plt
%matplotlib inline
from ipywidgets import interact, widgets

def update(z=0):
    fig = plt.figure(figsize=(10, 10))
    plt.subplot(121)
    c = 1
    plt.imshow(data[0, c, z, :, :], cmap='gray')
    try:
        # The labels have size 1 along the C-axis, so the channel index is 0
        plt.imshow(labels[0, 0, z, :, :], cmap='jet', alpha=0.5)
    except Exception:
        # Print the z index of any plane whose labels cannot be displayed
        print(z)

interact(update, z=widgets.IntSlider(value=0, min=0, max=data.shape[2]-1, step=1, description="Select Z", continuous_update=False))

interactive(children=(IntSlider(value=0, continuous_update=False, description='Select Z', max=256), Output()),…

<function __main__.update(z=0)>