# Dask Integration with pyclesperanto

This demo illustrates a proper setup of GPU workers with the [Dask](https://docs.dask.org/en/stable/) client API.

pyclesperanto integrates well with Dask. However, a GPU often has less memory (VRAM) than the host (CPU) side. It is therefore recommended to set up a separate GPU worker to manage pyclesperanto operations.

In [1]:
# Import pyclesperanto and select a computing device. Here we select a GPU for the OpenCL operations.
import pyclesperanto as cle
cle.select_device()

(OpenCL) NVIDIA GeForce RTX 4090 (OpenCL 3.0 CUDA)
	Vendor:                      NVIDIA Corporation
	Driver Version:              535.216.01
	Device Type:                 GPU
	Compute Units:               128
	Global Memory Size:          24217 MB
	Maximum Object Size:         6054 MB
	Max Clock Frequency:         2625 MHz
	Image Support:               Yes

## Dask Crash Course

To begin, we create a large random Dask array. It is meant to be small enough for a single task to run on an 8 GB GPU.

In [2]:
from dask import array
dask_array = array.random.random((450, 1024, 1024))
# dask_array = array.random.random((128, 1024, 1024))
dask_array

| | Array | Chunk |
|---|---|---|
| Bytes | 3.52 GiB | 126.51 MiB |
| Shape | (450, 1024, 1024) | (255, 255, 255) |
| Dask graph | 50 chunks in 1 graph layer | |
| Data type | float64 numpy.ndarray | |
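
The chunk count and sizes in the summary above can be checked by hand. The sketch below is plain Python arithmetic, not a Dask call. The variable names are illustrative, and the item size of 8 bytes is the size of one float64 value.

```python
import math

shape = (450, 1024, 1024)        # array shape from the summary
chunk_shape = (255, 255, 255)    # chunk shape chosen by Dask in the summary
itemsize = 8                     # bytes per float64 element

# Each axis is split into ceil(length / chunk length) pieces;
# the last piece along an axis may be smaller than the others.
chunks_per_axis = [math.ceil(s / c) for s, c in zip(shape, chunk_shape)]
n_chunks = math.prod(chunks_per_axis)

array_gib = math.prod(shape) * itemsize / 2**30        # whole array, in GiB
chunk_mib = math.prod(chunk_shape) * itemsize / 2**20  # one full chunk, in MiB

print(chunks_per_axis, n_chunks)   # [2, 5, 5] 50
print(round(array_gib, 2))         # 3.52
print(round(chunk_mib, 2))         # 126.51
```

The reported chunk size is that of a full (255, 255, 255) chunk. Chunks at the edges of the array are smaller.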


Perform a Gaussian blur of dask_array on the GPU.

In [3]:
blurred_device = cle.gaussian_blur(dask_array)
blurred_device

| Property | Value |
|---|---|
| shape | (450, 1024, 1024) |
| dtype | float32 |
| size | 1.8 GB |
| min | 4.201730519071134e-09 |
| max | 1.0 |


Manage the GPU array with Dask

In [4]:
blurred_device_dask = array.from_array(blurred_device)
blurred_device_dask

| | Array | Chunk |
|---|---|---|
| Bytes | 1.76 GiB | 127.36 MiB |
| Shape | (450, 1024, 1024) | (322, 322, 322) |
| Dask graph | 32 chunks in 2 graph layers | |
| Data type | float32 numpy.ndarray | |
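
The byte counts reported for the blurred array follow from its shape and data type. The sketch below is plain arithmetic with illustrative variable names. It shows that the float32 result takes half the memory of the float64 input. It also shows that the value Dask reports (1.76 GiB) and the "1.8 GB" in the pyclesperanto summary are numerically consistent with the same quantity rounded to different precision. That last point is an inference from the numbers, not something taken from either library's documentation.

```python
import math

shape = (450, 1024, 1024)
n_elements = math.prod(shape)

float64_bytes = n_elements * 8   # input array: 8 bytes per element
float32_bytes = n_elements * 4   # blurred result: 4 bytes per element

float32_gib = float32_bytes / 2**30

print(float64_bytes // float32_bytes)  # 2
print(round(float32_gib, 2))           # 1.76
print(round(float32_gib, 1))           # 1.8
```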


Pulling data from the device to the host must be done manually. Collect the result back on the host before releasing the GPU memory.

In [5]:
blurred_host = array.from_array(blurred_device)  # pull the GPU array back to the host, then manage it with Dask
blurred_host

| | Array | Chunk |
|---|---|---|
| Bytes | 1.76 GiB | 127.36 MiB |
| Shape | (450, 1024, 1024) | (322, 322, 322) |
| Dask graph | 32 chunks in 2 graph layers | |
| Data type | float32 numpy.ndarray | |


Release the GPU memory. Use the buffer release with care: it must be accompanied by deletion of the variables, otherwise the kernel may crash because of an invalid memory pointer.

In [6]:
# Good practice: delete the variables to release the GPU memory
del blurred_device
del blurred_device_dask
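
Both variables are deleted because an object is only freed once nothing refers to it any more, and a wrapper built around the device array presumably keeps its own reference to it. The sketch below illustrates this with plain Python objects. DeviceBuffer and Wrapper are hypothetical stand-ins and not pyclesperanto or Dask classes. The immediate release after the last `del` relies on CPython reference counting, with `gc.collect()` added as a precaution.

```python
import gc
import weakref


class DeviceBuffer:
    """Hypothetical stand-in for an array living in GPU memory."""


class Wrapper:
    """Hypothetical stand-in for an object that wraps the buffer."""

    def __init__(self, source):
        self.source = source  # keeps the buffer alive


buffer = DeviceBuffer()
wrapper = Wrapper(buffer)
probe = weakref.ref(buffer)  # lets us observe the buffer without owning it

del buffer
gc.collect()
alive_after_first_del = probe() is not None   # wrapper still holds a reference

del wrapper
gc.collect()
released_after_second_del = probe() is None   # last reference is gone

print(alive_after_first_del, released_after_second_del)  # True True
```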

To learn more about using Dask and pyclesperanto together for multi-device tile processing, see [this notebook example](../examples/multi-gpu_tile_processing_with_dask.ipynb).