# Parallel calculations using Dask

From the [Dask documentation](https://docs.dask.org):

![Dask overview](https://docs.dask.org/en/stable/_images/dask-overview.svg)

## 1. Create task graphs

### 1.1 Dask Delayed

In [None]:
from dask.delayed import delayed

In [None]:
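# Calls to a @delayed function are lazy: they add a task to the graph
# instead of running the function body right away.
@delayed
def add(x, y):
    print(f"{x} + {y}")
    return x + y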

In [None]:
a_p = add(1, 2)

In [None]:
b_p = add(a_p, 3)

In [None]:
c_p = add(a_p, b_p)

In [None]:
import dask

In [None]:
dask.visualize(c_p, rankdir="LR")

In [None]:
c_p.compute()
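The cells above can be condensed into a minimal sketch that makes the lazy behaviour visible. The `calls` list is a hypothetical helper added here only to record when `add` actually runs; it is not part of the original notebook.

```python
from dask import delayed

calls = []  # hypothetical helper: records each real execution of add()


@delayed
def add(x, y):
    calls.append((x, y))
    return x + y


# Building the graph does not execute anything yet.
a = add(1, 2)
b = add(a, 3)
c = add(a, b)
calls_before_compute = list(calls)

# compute() walks the graph; the shared task `a` is evaluated only once.
result = c.compute()
```

After `compute()`, `calls` should hold three entries even though `a` appears twice in the graph, because Dask reuses the result of a shared task.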

### 1.2 Dask Arrays

In [None]:
import dask.array as da

In [None]:
# 2000 x 1000 array of random numbers, split into 500 x 500 chunks (4 x 2 blocks)
x = da.random.random((2000, 1000), chunks=(500, 500))

In [None]:
x

In [None]:
y = da.dot(x, x.T)

In [None]:
y

In [None]:
z = y.mean()

In [None]:
z

In [None]:
dask.visualize(z)

In [None]:
z.compute()
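As a sanity check on the chunked computation, the following sketch repeats the same steps on a smaller array built from a known NumPy array, so that the Dask result can be compared with plain NumPy. The array size, the seed, and the variable names are assumptions chosen for illustration.

```python
import numpy as np
import dask.array as da

# Small, reproducible input (sizes and seed are illustrative assumptions)
rng = np.random.default_rng(0)
x_np = rng.random((200, 100))

# Wrap the NumPy array in a Dask array with 50 x 50 chunks
x = da.from_array(x_np, chunks=(50, 50))

# Same lazy pipeline as above: matrix product, then mean
y = da.dot(x, x.T)
z = y.mean()

# Only compute() triggers the actual work
z_value = z.compute()
expected = (x_np @ x_np.T).mean()
```

`z_value` and `expected` should agree up to floating-point rounding, since the chunked computation only changes how the work is split, not what is computed.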

### 1.3 Xarray

From the [Xarray documentation](https://docs.xarray.dev):

![Xarray dataset diagram](https://docs.xarray.dev/en/stable/_images/dataset-diagram.png)

In [None]:
raster_path = '/project/stursdat/Data/RS-DAT/sentinel-2-l2a_AMS_2023-04/2023/4/30/S2B_31UFU_20230430_0_L2A/B02.tif'

In [None]:
import rioxarray

In [None]:
# Passing `chunks` makes the read lazy (Dask-backed);
# lock=False lets chunks be read in parallel.
raster = rioxarray.open_rasterio(raster_path, chunks={"x": 2048, "y": 2048}, lock=False)

In [None]:
raster

In [None]:
raster_max = raster.max()

In [None]:
dask.visualize(raster_max)

In [None]:
raster_max.compute()

## 2. Execute task graphs

### 2.1 Multi-threading/processing

In [None]:
%%time
# Local thread pool with two worker threads
raster_max.compute(scheduler="threads", num_workers=2)

In [None]:
%%time
raster_max.compute(scheduler="processes", num_workers=2)
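The choice of scheduler changes how a graph is executed, not what it returns. The sketch below illustrates this on a small synthetic array instead of the raster. It compares the threaded scheduler with Dask's single-threaded `"synchronous"` scheduler, which is used here only as a baseline and is not part of the original notebook.

```python
import dask.array as da

# Synthetic stand-in for the raster: 16 chunks of 250 x 250 ones
x = da.ones((1000, 1000), chunks=(250, 250))
total = x.sum()

# Same graph, two different local schedulers
threaded = total.compute(scheduler="threads", num_workers=2)
serial = total.compute(scheduler="synchronous")
```

Both calls should return the same value; timing them with `%%time`, as in the cells above, shows whether parallel execution pays off for a given workload.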

### 2.2 Distributed scheduler

In [None]:
from dask.distributed import Client

# Address of an already running Dask scheduler; replace it with your own
client = Client("tcp://10.0.2.120:46409")
client

In [None]:
raster_max.compute()
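When no remote scheduler is available, a similar workflow can be tried on a single machine. The sketch below assumes the `distributed` package is installed and starts a temporary in-process cluster; the worker counts and the synthetic array are illustrative choices, not settings from the original notebook.

```python
import dask.array as da
from dask.distributed import Client

# Start a temporary local cluster inside this process (assumed settings);
# the dashboard is disabled to avoid opening an extra port.
with Client(
    n_workers=1,
    threads_per_worker=2,
    processes=False,
    dashboard_address=None,
) as client:
    x = da.ones((1000, 1000), chunks=(250, 250))

    # With an active client, compute() is sent to the distributed scheduler.
    total = x.sum().compute()
    n_workers = len(client.scheduler_info()["workers"])
```

Leaving the `with` block closes the client and the cluster it started, so no workers are left running afterwards.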