# Scaling-up Deep Learning Inference to Large-Scale Bioimage Data (part 1)

## Contact info:
- Fernando Cervantes
- Systems Analyst in JAX's Research IT
- email: fernando.cervantes@jax.org

## Outcomes of this tutorial:
- Learn to use the Dask library with Zarr image data
- Implement and apply deep learning inference pipelines with Dask
- Save deep learning inference outputs as Zarr files

---
# Overview of the Dask package

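Dask is lazy: operations on a Dask array only build a task graph, and nothing is computed until a result is explicitly requested (for example, with `.compute()`).

Learn more in the [Dask Array documentation](https://docs.dask.org/en/stable/array.html).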

![image](https://docs.dask.org/en/stable/_images/dask-array.svg)

# 1. Manipulate Dask arrays

## 1.1 Create Dask arrays

- [ ] Create a $10\times10$ dask array of type `int16`, that is formed by chunks of size $5\times5$.

- [ ] Modify the content of the dask array using slice selection.
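
One possible way to approach the two tasks above is sketched below. The fill value `7` and the slice bounds are arbitrary choices made for illustration.

```python
import dask.array as da

# A 10x10 array of zeros with dtype int16, split into four 5x5 chunks
x = da.zeros((10, 10), dtype="int16", chunks=(5, 5))

# Slice assignment is added to the task graph; nothing is computed yet
x[2:4, 6:8] = 7

print(x.chunks)  # chunk sizes along each axis
print(x.dtype)
```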

---
## 1.3 Execute the computation graph

- [ ] Visualize the information of the dask array and note the "**Dask graph**" property.

- [ ] Use the `.compute()` method of the dask array to trigger the actual computation of the instructions.

- [ ] Add more steps to the computation graph.

- [ ] Inspect the chunk sizes of the resulting dask array.
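
A minimal sketch of these steps, assuming a small array of ones as the starting point (the arithmetic is arbitrary and only serves to add layers to the graph):

```python
import dask.array as da
import numpy as np

x = da.ones((10, 10), dtype="int16", chunks=(5, 5))

# Each operation adds steps to the graph without running them
y = (x + 1) * 2

print(y)         # shows shape, dtype, and chunk size, but no values
print(y.chunks)  # elementwise operations keep the 5x5 chunking

# .compute() executes the graph and returns a NumPy array
result = y.compute()
```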

---
## 1.4 Rechunk Dask arrays

- [ ] Use the `.rechunk(...)` method of the dask array to change the size of each of its chunks.

- [ ] Apply some math operations on the dask array using `numpy`.
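
As an illustration, the sketch below rechunks an array into row bands and applies NumPy functions to it. The chunk shape `(2, 10)` and the choice of `np.sqrt` and `np.sum` are assumptions made for the example.

```python
import dask.array as da
import numpy as np

x = da.full((10, 10), 4.0, chunks=(5, 5))

# Change the chunking from four 5x5 blocks to five 2x10 row bands
r = x.rechunk((2, 10))
print(r.chunks)

# NumPy functions applied to a dask array return another lazy dask array
s = np.sqrt(r)
total = np.sum(s)

print(total.compute())
```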

---
## 1.5 Persist vs Compute

- [ ] Use the `.persist()` method of the dask array to partially compute the operations graph.
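
The difference can be sketched as follows, assuming a random array as input. `.compute()` returns a NumPy array, while `.persist()` returns a dask array whose chunks have already been computed and are kept in memory, so later operations reuse them.

```python
import dask.array as da

x = da.random.random((1000, 1000), chunks=(250, 250))
y = (x + x.T) / 2

# Evaluate the graph up to this point, but keep the result as a dask array
p = y.persist()

# Both reductions start from the persisted chunks instead of rebuilding y
mean_value = p.mean().compute()
max_value = p.max().compute()
```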

---

## 1.6 Delayed operations

- [ ] Create a delayed function (decorated with `@dask.delayed`) that can be applied lazily

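```python
import numpy as np


def grid_x(height, width, offset=0):
    # Every row holds the column coordinates offset, ..., offset + width - 1
    x = np.arange(offset, offset + width)
    return np.tile(x, (height, 1))


def grid_y(height, width, offset=0):
    # Every column holds the row coordinates offset, ..., offset + height - 1
    y = np.arange(offset, offset + height)
    return np.tile(y[:, None], (1, width))
```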
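
A minimal sketch of a delayed version of `grid_x` is shown below. Wrapping the result with `dask.array.from_delayed` is one option for treating it as a dask array; the shape and dtype passed to it are assumptions that must match what the function actually returns.

```python
import dask
import dask.array as da
import numpy as np


@dask.delayed
def grid_x(height, width, offset=0):
    x = np.arange(offset, offset + width)
    return np.tile(x, (height, 1))


# Calling the decorated function returns a Delayed object, not an array
lazy_block = grid_x(4, 6, offset=10)

# Declare the expected shape and dtype so dask can treat it as an array
block = da.from_delayed(lazy_block, shape=(4, 6), dtype=np.arange(1).dtype)

result = block.compute()
```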

---
## 1.7 Stack, Concatenate, and Block

- [ ] Find a way to create a $1000\times1000$ pixel image by joining multiple $500\times500$ pixel blocks.
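
One possible approach uses `dask.array.block`, which assembles a nested list of arrays into a single array. The constant tile values below are placeholders so that each block is easy to tell apart.

```python
import dask.array as da

# Four 500x500 tiles, filled with the values 0, 1, 2, and 3
tiles = [
    [da.full((500, 500), 2 * row + col, dtype="uint8", chunks=(500, 500))
     for col in range(2)]
    for row in range(2)
]

# Arrange the tiles in a 2x2 layout to form a 1000x1000 image
image = da.block(tiles)

print(image.shape)
print(image.chunks)
```

The same result could be obtained with nested calls to `da.concatenate`, first along one axis and then along the other.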

---
# 2. Open Zarr files with Dask

- [ ] Use the `tifffile` library to open a `.svs` image file as if it were a `Zarr` store (`aszarr=True`).

```python
import zarr
import tifffile
```

- [ ] Inspect the `ZarrTiffStore` by opening the store object with `zarr.open`

- [ ] Pass the store object returned by `tifffile.imread` to the `dask.array.from_zarr` function to open the image as a `dask.array`.

- [ ] Rechunk the image to have chunks of size $512\times512$

- [ ] Extract a window from the image to analyze

ℹ Dask arrays already work with `matplotlib.pyplot.imshow` without calling `.compute()`
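
The rechunking and window extraction steps can be sketched on a synthetic array that stands in for the image opened from the `.svs` file. The image shape, the original chunking, and the window position are all assumptions made for illustration.

```python
import dask.array as da

# Stand-in for an RGB image already opened as a dask array
image = da.zeros((2048, 3072, 3), dtype="uint8", chunks=(256, 256, 3))

# Use 512x512 chunks, keeping the three color channels together
image = image.rechunk((512, 512, 3))

# Extract a 1024x1024 window; this is still lazy
window = image[512:1536, 1024:2048]

print(window.shape)
print(window.chunks)
```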

---
# 3. [Example] Perform image processing on Dask arrays

- [ ] Convert an image region from RGB color to grayscale.
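
A minimal sketch of the conversion as a weighted sum of the color channels is shown below. The weights are the luminance coefficients used by `skimage.color.rgb2gray`; the tiny test image is made up for the example.

```python
import dask.array as da
import numpy as np

# 4x4 white image with a single pure red pixel at the top-left corner
rgb_np = np.full((4, 4, 3), 255, dtype="uint8")
rgb_np[0, 0] = (255, 0, 0)
rgb = da.from_array(rgb_np, chunks=(2, 2, 3))

# Luminance weights for the R, G, and B channels
weights = np.array([0.2125, 0.7154, 0.0721])

# Broadcast the weights over the last axis and sum the channels (lazy)
gray = (rgb * weights).sum(axis=-1)

gray_np = gray.compute()
```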

---
# 4. [Exercise] Perform image analysis on Dask arrays

- [ ] Implement an object segmentation operation to detect nuclei pixels in a $2048\times2048$ pixel region.
    - [ ] Convert the image region from RGB to grayscale
    - [ ] Reduce noise in the image region with a Gaussian filter
    - [ ] Use a thresholding algorithm to discriminate between structures given their pixel intensity

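ℹ Many `skimage` functions accept Dask arrays directly, without an explicit call to `.compute()`. They typically convert the input to a NumPy array internally, so the whole region is loaded into memory.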

```python
from dask_image import ndfilters
from skimage.filters import threshold_multiotsu
```

- [ ] Visualize the results using `Matplotlib`