
<center>

# dask-image: distributed image processing for large data

## Presenter: Genevieve Buckley
<img src="imgs/dask-icon.svg#thumbnail" alt="dask logo" width="100"/></center>

# Who needs dask-image?

If you're using `numpy` and/or `scipy.ndimage` and are running out of RAM, dask-image is for you.

## Two main use cases
1. Batch processing
2. Large field of view
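
The two cases correspond to different chunk layouts. A minimal sketch with plain `dask.array`; the shapes and chunk sizes are made-up illustrations, not recommendations:

```python
import dask.array as da

# Batch processing: many small images stacked along a leading axis,
# one image per chunk, so each image can be processed independently.
batch = da.zeros((100, 512, 512), chunks=(1, 512, 512), dtype="uint8")

# Large field of view: a single image too big for RAM,
# split into tiles that are processed chunk by chunk.
large = da.zeros((20_000, 20_000), chunks=(2_000, 2_000), dtype="uint8")

print(batch.numblocks)  # (100, 1, 1)
print(large.numblocks)  # (10, 10)
```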

# Motivating examples


# Getting started
https://github.com/dask/dask-image/

## conda
```
conda install -c conda-forge dask-image
```

## pip

```
pip install dask-image
```


# What's included?

* imread
* ndfilters
* ndfourier
* ndmeasure
* ndmorph



# Function coverage

<img src="imgs/function-coverage-table.png#thumbnail" alt="Table of function coverage: scipy.ndimage compared to dask-image http://image.dask.org/en/latest/coverage.html" width="900"/>


# Familiar API

# Let's build a pipeline!

1. Reading in data
2. Filtering images
3. Segmenting objects
4. Morphological operations
5. Measuring objects

# 1. Reading in data

```python
from dask_image.imread import imread

images = imread("path/to/files/*.tif")
```

## Alternate: read data from zarr (or n5, or hdf5)



Image display

# 2. Filtering images

# 3. Segmenting objects

# 4. Morphological operations


# 5. Measuring objects

# Pipeline


# Custom functionality

What if you want to do something that isn't included?

* scikit-image [apply_parallel()](https://scikit-image.org/docs/dev/api/skimage.util.html#skimage.util.apply_parallel)
* dask [map_overlap](https://docs.dask.org/en/latest/array-overlap.html?highlight=map_overlap#dask.array.map_overlap) / [map_blocks](https://docs.dask.org/en/latest/array-api.html?highlight=map_blocks#dask.array.map_blocks)
* dask [delayed](https://docs.dask.org/en/latest/delayed.html) 
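
As an example of the `map_overlap` route, the sketch below applies a custom neighbourhood function chunk by chunk. `local_range` is a hypothetical function invented for illustration; any function that maps a NumPy block to a block of the same shape could take its place.

```python
import numpy as np
import dask.array as da
from scipy import ndimage


def local_range(block):
    """Hypothetical custom filter: max minus min over a 3x3 neighbourhood."""
    return (ndimage.maximum_filter(block, size=3, mode="nearest")
            - ndimage.minimum_filter(block, size=3, mode="nearest"))


rng = np.random.default_rng(0)
frame = rng.random((32, 32))
image = da.from_array(frame, chunks=(16, 16))

# depth=1 shares a one-pixel halo between neighbouring chunks, which is
# what a 3x3 neighbourhood needs; boundary="nearest" pads the outer edge.
result = image.map_overlap(
    local_range,
    depth=1,
    boundary="nearest",
    dtype=frame.dtype,
)
print(result.compute().shape)
```

Without the overlap, pixels on chunk edges would see an incomplete neighbourhood and the result would differ from running the function on the whole array.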

# GPU support

The latest release includes GPU support for these modules:
* ndfilters
* ndmorph
* imread

Still to do:

* ndfourier
* ndmeasure

```python
# CPU example
import numpy as np
import dask.array as da
from dask_image.ndfilters import convolve

s = (10, 10)
a = da.from_array(np.arange(int(np.prod(s))).reshape(s), chunks=5)
w = np.ones(a.ndim * (3,), dtype=np.float32)
result = convolve(a, w)
result.compute()
```

```python
# Same example moved to the GPU
import cupy  # <- import cupy instead of numpy
import dask.array as da
from dask_image.ndfilters import convolve

s = (10, 10)
a = da.from_array(cupy.arange(s[0] * s[1]).reshape(s), chunks=5)  # <- dask array backed by cupy chunks
w = cupy.ones(a.ndim * (3,), dtype=cupy.float32)  # <- cupy array of kernel weights (not a dask array)
result = convolve(a, w)
result.compute()
```

# GPU benchmarking

| Architecture    | Time      |
|-----------------|-----------|
| Single CPU Core | 2hr 39min |
| Forty CPU Cores | 11min 30s |
| One GPU         | 1min 37s  |
| Eight GPUs      | 19s       |

Source: http://matthewrocklin.com/blog/work/2019/01/03/dask-array-gpus-first-steps


# dask-image

* Install: `conda install -c conda-forge dask-image` or `pip install dask-image`
* Documentation: https://dask-image.readthedocs.io
* GitHub: https://github.com/dask/dask-image/

<center><img src="imgs/dask-icon.svg#thumbnail" alt="dask logo" width="100"/></center>