# Practical 2: Dask with images

In the previous practical, we saw that dask can parallelise computations on arrays. This is useful for many common image operations, such as filtering.

In [None]:
# Let's load an example image

import numpy as np
from skimage import data
from scipy import ndimage
import tifffile

%matplotlib notebook

img = data.cells3d()
img = img.max(0)[1] # take only one channel and max project
img = ndimage.zoom(img, 10, order=1) # zoom in

tifffile.imshow(img)

How long does a gaussian filter take when applied to the entire image?

In [None]:
%%timeit -r 3
ndimage.gaussian_filter(img, sigma=5, mode='constant')

What if we subdivide the array into chunks and apply this filter to each chunk?

In [None]:
import dask.array as da

img_da = da.from_array(img,
                       chunks=(500, 500),
                       )
img_da
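
How `chunks` divides an array can be checked on a small example first. The sketch below uses a made-up array (`toy`, not the image above) whose shape is not a multiple of the chunk size, so the last chunk along the first axis is smaller.

```python
import numpy as np
import dask.array as da

# Hypothetical toy array; its shape is deliberately not a multiple of 500
toy = np.zeros((1200, 1000))
toy_da = da.from_array(toy, chunks=(500, 500))

# Chunk sizes along each axis, and the number of chunks per axis
print(toy_da.chunks)     # ((500, 500, 200), (500, 500))
print(toy_da.numblocks)  # (3, 2)
```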

### `map_blocks`
We can use `dask.array.map_blocks` to apply a function to each chunk (or block) of the dask array.
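
To see that the function really only receives one chunk at a time, here is a minimal sketch on a small toy array. The helper `subtract_block_mean` is a made-up example function, not part of dask.

```python
import numpy as np
import dask.array as da

x = da.from_array(np.arange(16).reshape(4, 4), chunks=(2, 2))

def subtract_block_mean(block):
    # `block` is a plain numpy array holding a single 2x2 chunk
    return block - block.mean()

y = da.map_blocks(subtract_block_mean, x, dtype=float)
result = y.compute()

# Each chunk was centred on its own mean, not on the mean of the whole array
print(result)
```

Every 2x2 chunk ends up with mean zero, which would not be the case if the function had seen the whole array at once.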

In [None]:
filtered = da.map_blocks(
            ndimage.gaussian_filter, # the function to apply to each chunk
            img_da, # the array to apply the function to
            sigma=5, # arguments to the function
            mode='constant',
            )
filtered

Does this improve the timing?

In [None]:
%%timeit -r 3
filtered.compute(scheduler='threads')

In [None]:
%%timeit -r 3
filtered.compute(scheduler='processes')

Performance comparison: Applying the gaussian filter to each chunk is faster when using multi-threading than when using multi-processing.

Why is this? While threads share memory, different processes need to send data back and forth, which can create considerable overhead.
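
The cost of sending data between processes can be illustrated with plain `pickle`. This is only a rough sketch: it assumes a chunk of 500 x 500 `uint16` values and does not reproduce the exact serialisation that dask's process-based scheduler performs.

```python
import pickle
import numpy as np

# Assumed stand-in for one chunk of the image
chunk = np.zeros((500, 500), dtype=np.uint16)

payload = pickle.dumps(chunk)     # bytes that would have to be sent to another process
restored = pickle.loads(payload)  # the receiving side has to rebuild the array

n_raw = chunk.nbytes   # 500 * 500 * 2 bytes
n_sent = len(payload)  # raw data plus a small header

print(n_raw, n_sent)
```

Threads can skip this round trip entirely because they all read the same array in memory.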

Let's have a look at the output image.

In [None]:
print('entire image')
filtered_ndimage = ndimage.gaussian_filter(img, sigma=5, mode='constant')
tifffile.imshow(filtered_ndimage)

print('dask.array.map_blocks')
tifffile.imshow(filtered)

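The `map_blocks` result shows artefacts along the chunk borders: each chunk is filtered in isolation, so `mode='constant'` pads it with zeros where the values of the neighboring chunk should be. We can prevent these border artefacts by using `map_overlap` instead of `map_blocks`.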

This:
1) adds neighboring chunk values to the borders of each chunk
2) applies `map_blocks` as before
3) trims the previously added overlap from each chunk
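
The three steps can be sketched by hand with numpy and scipy on a 1D toy signal split into two chunks. The names `signal`, `split` and `depth` are made up for this illustration, and `depth = 20` assumes scipy's default `truncate=4.0`, which gives a kernel radius of `int(4.0 * sigma + 0.5)`.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
signal = rng.random(400)  # toy 1D "image"
sigma = 5
depth = 20                # assumed overlap: kernel radius for sigma=5
split = 200               # border between the two chunks

reference = ndimage.gaussian_filter(signal, sigma=sigma, mode='constant')

# 1) extend each chunk with `depth` values from its neighbour
left = signal[:split + depth]
right = signal[split - depth:]

# 2) filter each extended chunk independently
left_f = ndimage.gaussian_filter(left, sigma=sigma, mode='constant')
right_f = ndimage.gaussian_filter(right, sigma=sigma, mode='constant')

# 3) trim the overlap again and stitch the chunks together
stitched = np.concatenate([left_f[:split], right_f[depth:]])

# For comparison: filtering the chunks without any overlap
naive = np.concatenate([
    ndimage.gaussian_filter(signal[:split], sigma=sigma, mode='constant'),
    ndimage.gaussian_filter(signal[split:], sigma=sigma, mode='constant'),
])

overlap_matches = np.allclose(stitched, reference)
naive_matches = np.allclose(naive, reference)
print(overlap_matches, naive_matches)
```

Only the version with overlap reproduces the result of filtering the whole signal at once.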

In [None]:
filtered_overlap = \
    da.map_overlap(
            ndimage.gaussian_filter, # the function to apply to each chunk
            img_da, # the array to apply the function to
            sigma=5, # arguments to the function
            mode='constant',
            depth={0: 20, 1: 20}, # overlap per axis; int(4 * sigma + 0.5) = 20 is the default kernel radius of gaussian_filter
            )
filtered_overlap
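
How large `depth` needs to be depends on the filter. The sketch below is one way to check this on a small random array: it compares `map_overlap` against filtering the whole array, once with a depth smaller than the kernel radius and once with the full radius. The helper `max_error` and the array `toy` are made up for this illustration, the radius formula assumes scipy's default `truncate=4.0`, and `boundary='none'` is passed explicitly so that the outer image border is handled by `mode='constant'` alone.

```python
import numpy as np
import dask.array as da
from scipy import ndimage

sigma = 5
truncate = 4.0                        # scipy's default for gaussian_filter
radius = int(truncate * sigma + 0.5)  # kernel half-width in pixels

rng = np.random.default_rng(0)
toy = rng.random((300, 300))
toy_da = da.from_array(toy, chunks=(100, 100))

reference = ndimage.gaussian_filter(toy, sigma=sigma, mode='constant')

def max_error(depth):
    # Largest absolute difference to the whole-array result for a given overlap
    out = da.map_overlap(
        ndimage.gaussian_filter,
        toy_da,
        depth=depth,
        boundary='none',  # do not pad the outer border of the array
        sigma=sigma,
        mode='constant',
    )
    return float(np.abs(out.compute(scheduler='synchronous') - reference).max())

err_small = max_error(11)     # overlap smaller than the kernel radius
err_full = max_error(radius)  # overlap equal to the kernel radius

print(err_small, err_full)
```

With the full radius the difference should vanish up to floating point precision, while a smaller depth leaves a small residual error near the chunk borders.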

In [None]:
tifffile.imshow(filtered_overlap.compute())

In [None]:
%%timeit -r 3
filtered_overlap.compute(scheduler='threads')

## dask-image

The Python package dask-image automatically deals with these border effects and other problems that can occur when applying functions from `scipy.ndimage` to chunked dask arrays.

https://image.dask.org/en/latest/

The available `ndimage` functions:
https://image.dask.org/en/latest/coverage.html

Among others:
- affine_transform
- label
- ...

In [None]:
from dask_image import ndfilters

filtered_di = ndfilters.gaussian_filter(img_da, sigma=5, mode='constant')
filtered_di

In [None]:
tifffile.imshow(filtered_di.compute())

In [None]:
%%timeit -r 3
filtered_di.compute()

## More dask-image features

### Connected components

In [None]:
img_da = da.from_array(img, chunks=500)
seg = (ndfilters.gaussian_filter(img_da, sigma=10, mode='constant') > 10000)
tifffile.imshow(seg)

In [None]:
# Let's calculate connected components on each chunk of the segmentation image

def connected_components(im):
    return ndimage.label(im)[0]

labels = seg.map_blocks(connected_components)
tifffile.imshow(labels)

In [None]:
# Using overlap does not help in this case

def connected_components(im):
    return ndimage.label(im)[0]

labels = seg.map_overlap(
    connected_components,
    depth=100,
)
tifffile.imshow(np.array(labels))
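
Why chunk-wise labelling fails can be seen on a tiny hand-made mask (a sketch with made-up data, using only scipy): label numbers restart in every chunk, and an object that crosses a chunk border is split into two pieces.

```python
import numpy as np
from scipy import ndimage

mask = np.zeros((4, 8), dtype=bool)
mask[1:3, 2:6] = True  # one object crossing the chunk border at column 4
mask[0, 7] = True      # a second object, entirely inside the right chunk

# Labelling the whole mask finds the two objects
whole_labels, n_whole = ndimage.label(mask)

# Labelling each half separately, as map_blocks would do
left_labels, n_left = ndimage.label(mask[:, :4])
right_labels, n_right = ndimage.label(mask[:, 4:])
chunked = np.concatenate([left_labels, right_labels], axis=1)

print(n_whole, n_left + n_right)  # 2 3
print(chunked)
```

Adding overlap does not fix this, because the label numbers are still assigned independently per chunk. The labels have to be merged across chunk borders in a separate step.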

In [None]:
# dask-image implements connected components

from dask_image import ndmeasure
labels = ndmeasure.label(seg)[0]
tifffile.imshow(labels)

### Affine transformations

In [None]:
# Define a transformation

from scipy.spatial.transform import Rotation as R

# rotation
matrix = R.from_rotvec(np.pi/4. * np.array([0, 0, 1])).as_matrix()[:2, :2]
offset = np.array([1200., -600])

print('Matrix:', matrix)
print('Offset:', offset)
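
Two things are worth checking before applying the transformation. First, the 2x2 block taken from the rotation object should be the usual 2D rotation matrix. Second, `scipy.ndimage.affine_transform` maps output coordinates to input coordinates: the value at output index `o` is read from the input at `matrix @ o + offset`. The sketch below checks both on a small toy array (`toy` and the offset `[1, 2]` are made up for this illustration).

```python
import numpy as np
from scipy import ndimage
from scipy.spatial.transform import Rotation as R

theta = np.pi / 4
matrix = R.from_rotvec(theta * np.array([0, 0, 1])).as_matrix()[:2, :2]

# The familiar 2D rotation matrix for the same angle
expected = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
same_matrix = np.allclose(matrix, expected)

# Pull convention: with an identity matrix, output[0, 0] is read from input[1, 2]
toy = np.arange(25, dtype=float).reshape(5, 5)
shifted = ndimage.affine_transform(toy, matrix=np.eye(2), offset=[1, 2], order=1)

print(same_matrix)
print(shifted[0, 0], toy[1, 2])
```

This is why a positive offset moves the image content towards the origin of the output rather than away from it.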

In [None]:
# Transform the image using plain scipy

img_t = ndimage.affine_transform(
    img,
    matrix=matrix,
    offset=offset,
    order=1, # linear interpolation
    )

tifffile.imshow(img_t)

In [None]:
# Transform the image using dask_image.ndinterp.affine_transform

from dask_image import ndinterp

img_t = ndinterp.affine_transform(
    img_da,
    matrix=matrix,
    offset=offset,
    order=1, # linear interpolation
    output_chunks=500,
    ).compute()
    
tifffile.imshow(img_t)

Performance comparison

In [None]:
%%timeit -r 1

img_t = ndimage.affine_transform(
    img,
    matrix=matrix,
    offset=offset,
    order=1, # linear interpolation
    )

In [None]:
%%timeit -r 1

img_t = ndinterp.affine_transform(
    img_da,
    matrix=matrix,
    offset=offset,
    order=1, # linear interpolation
    output_chunks=500,
    ).compute()

## Exercise: Apply a median filter

In [None]:
filtered = ...