## This example is modified from https://examples.dask.org/applications/image-processing.html

This is a simple demonstration of image processing using dask arrays with [ghost cells](http://dask.pydata.org/en/latest/array-ghost.html).

We apply the Canny edge detection algorithm to our image. It is suitable for ghosted arrays because it is relatively "local": each pixel depends only on pixels a small, fixed distance away.

The algorithm applies a Gaussian filter to the image and then takes the 2D gradient. Points where the gradient is larger than some threshold are "edges". (Also see the Notes section of https://scikit-image.org/docs/stable/api/skimage.feature.html#canny)
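
The two steps described above (smooth, then threshold the gradient magnitude) can be sketched with NumPy alone. This is a simplified illustration, not the scikit-image implementation, which additionally thins the edges and applies hysteresis thresholding. The helper names (`gaussian_kernel`, `smooth`, `simple_edges`), the 4-sigma kernel radius, and the threshold of 0.1 are assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(sigma):
    # 1D Gaussian truncated at about 4 sigma and normalised to sum to 1
    radius = int(4 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()

def smooth(image, sigma):
    # Separable Gaussian blur: pad with edge values, then convolve rows and columns
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image, radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def simple_edges(image, sigma=1.2, threshold=0.1):
    # Mark pixels whose gradient magnitude exceeds the (illustrative) threshold
    smoothed = smooth(image, sigma)
    gy, gx = np.gradient(smoothed)
    return np.hypot(gx, gy) > threshold

# Synthetic test image: dark left half, bright right half
image = np.zeros((32, 32))
image[:, 16:] = 1.0
edges = simple_edges(image)
```

Each output pixel only depends on input pixels within the kernel radius plus one pixel for the gradient, which is the "local" property that makes the algorithm a good fit for overlapping chunks.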

So we create a dask array, then use its `map_overlap` method to apply the edge detection function.

For more on image processing with dask:

- http://matthewrocklin.com/blog/work/2017/01/17/dask-images
- https://dask-image.github.io (new library, still in alpha, subject to change)

In [None]:
import numpy as np
import skimage as ski

import dask.array as da
from dask.diagnostics import ProgressBar

%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib

matplotlib.rcParams['figure.figsize'] = (15, 10)

In [None]:
! wget https://upload.wikimedia.org/wikipedia/commons/9/9b/Hs-2004-07-a-full_jpgNR.jpg

(Or try the even bigger one at https://stsci-opo.org/STScI-01EVSVTSTXG9BNAB0F66NFYXMC.png)

In [None]:
file_name = 'Hs-2004-07-a-full_jpgNR.jpg' # hubble ultra deep field
color_img = ski.io.imread(file_name) # ~ 100MB

This example is somewhat artificial because the image does fit in memory. However, it is quite possible that the results may not. Or consider an image stack, where you have 1000 of these to operate on.

In [None]:
color_img.shape, color_img.nbytes * 1e-6 # still in memory here

In [None]:
# convert to greyscale
img = ski.color.rgb2gray(color_img)  # collapses the RGB channel axis, so the array is 2D now
img.shape

So we have the image in a NumPy array. How does it look? We keep only every 15th pixel along each axis (about 225 times fewer pixels) so it does not crash the browser.

In [None]:
plt.imshow(img[::15, ::15], cmap='gray');
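
As a quick check of the downsampling arithmetic, here is a small sketch using a hypothetical 6000 x 6000 stand-in array (the real image size is not assumed). A stride of 15 along each axis reduces the pixel count by 15 * 15 = 225, and the slice is a view, so no pixel data is copied.

```python
import numpy as np

# Hypothetical stand-in for the greyscale image
full = np.zeros((6000, 6000), dtype=np.uint8)

step = 15
preview = full[::step, ::step]           # every 15th pixel along each axis
reduction = full.size / preview.size     # ratio of pixel counts

print(preview.shape, reduction)          # (400, 400) 225.0
print(np.shares_memory(full, preview))   # True: strided slicing returns a view
```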

So let's create a dask array from this NumPy array.

In [None]:
arr = da.from_array(img, chunks=(1000, 1000))
arr.nbytes * 1e-6

In [None]:
arr
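
To see how the `chunks` argument splits an array, here is a small sketch on a hypothetical 2500 x 3200 array (chosen so the shape is not a multiple of the chunk size). Dask keeps the requested chunk size and puts the remainder in a smaller final chunk along each axis.

```python
import numpy as np
import dask.array as da

# Hypothetical array whose shape is not a multiple of the chunk size
data = np.zeros((2500, 3200))
darr = da.from_array(data, chunks=(1000, 1000))

print(darr.numblocks)    # (3, 4)
print(darr.chunks)       # ((1000, 1000, 500), (1000, 1000, 1000, 200))
print(darr.npartitions)  # 12
```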

We wrap the scikit-image `canny` function so we can pass it to the dask array.

In [None]:
def func(block):
    return ski.feature.canny(block, sigma=1.2)
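
Before applying the wrapper to a large array, it can help to see what it returns on a tiny input. The sketch below runs the same `canny` call on a synthetic white square (an assumed test image, not part of the original example): the result is a boolean mask with the same shape as the input.

```python
import numpy as np
from skimage import feature

# Synthetic image: a bright square on a dark background
square = np.zeros((64, 64))
square[16:48, 16:48] = 1.0

mask = feature.canny(square, sigma=1.2)

print(mask.dtype, mask.shape)  # bool (64, 64)
```

Because the output has the same shape as the input block, the function can be mapped over chunks without any reshaping.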

Now we can add the padding. We choose 10 pixels of overlap along each axis and make the external boundary periodic.

In [None]:
padding = {0: 10, 1: 10}  # overlap depth per axis, in pixels
boundary = {0: 'periodic', 1: 'periodic'}  # wrap around at the outer edges
canny_array = arr.map_overlap(func, depth=padding, boundary=boundary)
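
To illustrate why the overlap matters, here is a minimal sketch that swaps Canny for a much simpler local operation (a hypothetical `neighbour_sum` that adds up the four nearest neighbours with wrap-around). Applied chunk by chunk without overlap, the result is wrong at the chunk edges; with `map_overlap` and a depth of 1 it matches the whole-array result.

```python
import numpy as np
import dask.array as da

def neighbour_sum(block):
    # Sum of the four nearest neighbours; np.roll wraps around the block edges
    return (np.roll(block, 1, axis=0) + np.roll(block, -1, axis=0)
            + np.roll(block, 1, axis=1) + np.roll(block, -1, axis=1))

rng = np.random.default_rng(0)
data = rng.random((8, 8))
darr = da.from_array(data, chunks=(4, 4))

# Reference: periodic neighbour sum over the whole array at once
expected = neighbour_sum(data)

# No overlap: each 4x4 chunk wraps onto itself, so chunk edges are wrong
naive = darr.map_blocks(neighbour_sum).compute()

# One pixel of overlap, periodic outer boundary: the padded ring is trimmed off
overlapped = darr.map_overlap(neighbour_sum, depth=1, boundary="periodic").compute()

print(np.allclose(naive, expected))       # False
print(np.allclose(overlapped, expected))  # True
```

The depth only needs to cover the reach of the operation: one pixel here, and a few sigma of the Gaussian filter in the Canny case.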

In [None]:
with ProgressBar():
    out = canny_array[4600:5400, 2400:3200].compute()

Let's zoom in on an interesting section of the image. Loading the whole thing might crash the browser.

In [None]:
f, (ax0, ax1) = plt.subplots(1, 2, figsize=(10, 10))
ax0.imshow(color_img[4600:5400, 2400:3200, :])
ax1.imshow(out, cmap='gray');

If you need to compute the edges of the entire image, then skimage provides the shorthand `apply_parallel` function:

In [None]:
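# use the same overlap depth and periodic boundary as the map_overlap call
edges = ski.util.apply_parallel(func, img, depth=10, mode='periodic')
plt.imshow(edges[4600:5400, 2400:3200], cmap='gray');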