# Lazy evaluation on Dask arrays


If you are unfamiliar with Dask, read
[Parallel computing with Dask](https://docs.xarray.dev/en/stable/user-guide/dask.html)
in the Xarray documentation first.

Recall that the regridding process is divided into two steps: computing the
weights and applying the weights. Dask support is much more advanced for the
latter, and this is what the first part of this notebook is about.
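The reason the second step parallelizes so easily can be sketched with plain NumPy. Applying the weights amounts to a matrix product over the flattened spatial dimensions, so every block of time steps can be processed independently. The weights below are random, dense stand-ins chosen only for illustration (real weights come from ESMF and are stored as a sparse matrix), and all sizes and variable names are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n_time, n_in, n_out = 12, 6, 10  # toy sizes, not the tutorial grid

# Step 1 (done once): weights mapping flattened input cells to output cells.
# Random stand-ins here; each row is normalized so it acts as a weighted mean.
weights = rng.random((n_out, n_in))
weights /= weights.sum(axis=1, keepdims=True)

# Step 2 (repeated for every time step): apply the weights.
data = rng.random((n_time, n_in))
all_at_once = data @ weights.T

# The same result, computed block by block along time.
chunked = np.concatenate([block @ weights.T for block in np.array_split(data, 3)])

print(np.allclose(all_at_once, chunked))
```

Because no block depends on another, each one can be held in memory and processed separately, which is the pattern Dask exploits when the input is chunked along `time`.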

Dask supports [lazy evaluation](https://en.wikipedia.org/wiki/Lazy_evaluation) and
[out-of-core computing](https://en.wikipedia.org/wiki/External_memory_algorithm),
which make it possible to process large volumes of data with limited memory. You
may also get a speed-up from parallelization in some cases, but a general rule of
thumb is that if the data fits in memory, regridding will be faster without
Dask.


In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import dask.array as da  # dask.array must be installed; `da` itself is only used to build the mask at the end
import xarray as xr
import xesmf as xe

## A simple example


### Prepare input data


In [2]:
ds = xr.tutorial.open_dataset("air_temperature", chunks={"time": 500})
ds

Unnamed: 0,Array,Chunk
Bytes,14.76 MiB,2.53 MiB
Shape,"(2920, 25, 53)","(500, 25, 53)"
Dask graph,6 chunks in 2 graph layers,6 chunks in 2 graph layers
Data type,float32 numpy.ndarray,float32 numpy.ndarray


In [3]:
ds.chunks

Frozen({'time': (500, 500, 500, 500, 500, 420), 'lat': (25,), 'lon': (53,)})
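The sizes reported in the array summary follow directly from the shape, the chunk shape and the 4-byte `float32` data type. As a quick sanity check, the arithmetic can be reproduced in plain Python; the helper name `mib` is an assumption of this sketch and not part of any library.

```python
from math import prod

FLOAT32_BYTES = 4  # size of one float32 value


def mib(shape, itemsize=FLOAT32_BYTES):
    """Size in MiB of an array with the given shape and item size."""
    return prod(shape) * itemsize / 2**20


print(f"{mib((2920, 25, 53)):.2f} MiB")  # whole array -> 14.76 MiB
print(f"{mib((500, 25, 53)):.2f} MiB")   # one full chunk -> 2.53 MiB
```

The same calculation is useful for choosing a chunk size along `time` that fits comfortably in the memory available to each worker.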

In [4]:
ds["air"].data

Unnamed: 0,Array,Chunk
Bytes,14.76 MiB,2.53 MiB
Shape,"(2920, 25, 53)","(500, 25, 53)"
Dask graph,6 chunks in 2 graph layers,6 chunks in 2 graph layers
Data type,float32 numpy.ndarray,float32 numpy.ndarray


### Build regridder


In [5]:
ds_out = xr.Dataset(
    {
        "lat": (["lat"], np.arange(16, 75, 1.0)),
        "lon": (["lon"], np.arange(200, 330, 1.5)),
    }
)

regridder = xe.Regridder(ds, ds_out, "bilinear")
regridder

xESMF Regridder 
Regridding algorithm:       bilinear 
Weight filename:            bilinear_25x53_59x87.nc 
Reuse pre-computed weights? False 
Input grid shape:           (25, 53) 
Output grid shape:          (59, 87) 
Periodic in longitude?      False

### Apply to xarray Dataset/DataArray


In [7]:
# only build the dask graph; actual computation happens later when calling compute()
%time ds_out = regridder(ds)
ds_out

CPU times: user 8.17 ms, sys: 2.32 ms, total: 10.5 ms
Wall time: 10.2 ms


Unnamed: 0,Array,Chunk
Bytes,57.18 MiB,9.79 MiB
Shape,"(2920, 59, 87)","(500, 59, 87)"
Dask graph,6 chunks in 8 graph layers,6 chunks in 8 graph layers
Data type,float32 numpy.ndarray,float32 numpy.ndarray


In [8]:
ds_out["air"].data

Unnamed: 0,Array,Chunk
Bytes,57.18 MiB,9.79 MiB
Shape,"(2920, 59, 87)","(500, 59, 87)"
Dask graph,6 chunks in 8 graph layers,6 chunks in 8 graph layers
Data type,float32 numpy.ndarray,float32 numpy.ndarray


In [9]:
%time result = ds_out['air'].compute()  # actually applies regridding

CPU times: user 756 ms, sys: 155 ms, total: 911 ms
Wall time: 600 ms


In [10]:
type(result.data), result.data.shape

(numpy.ndarray, (2920, 59, 87))

## Chunking behaviour

xESMF will adjust its default behaviour according to the input data. On spatial
dimensions where the data has only one chunk, the output of a `Regridder` call
will also have only one chunk, no matter the new dimension size. This is what
happened in the previous example.

However, if the input has more than one chunk along a spatial dimension, then
the regridder will try to preserve the chunk size. When regridding to a higher
resolution, this means the number of chunks increases, and with it the number of
Dask tasks added to the graph. This can actually decrease performance if the
graph becomes too large and fills up with many small tasks.

One can always override xESMF's default behaviour by passing `output_chunks` to
the `Regridder` call.
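The effect of a given chunk size on the number of chunks can be estimated before regridding. The sketch below assumes regular chunking, where one integer chunk size is applied along a dimension and any remainder forms a smaller trailing chunk; `chunk_tuple` is a hypothetical helper written for this illustration. The sizes 59 and 87 are the output grid dimensions of the regridder built earlier.

```python
def chunk_tuple(dim_size, chunk_size):
    """Chunk lengths along one dimension for a regular chunk size."""
    n_full, remainder = divmod(dim_size, chunk_size)
    return (chunk_size,) * n_full + ((remainder,) if remainder else ())


# Preserving a chunk size of 25 along the 87-point output longitude.
print(chunk_tuple(87, 25))  # (25, 25, 25, 12)

# Forcing 10 x 10 spatial chunks on the 59 x 87 output grid.
n_chunks = len(chunk_tuple(59, 10)) * len(chunk_tuple(87, 10))
print(n_chunks)  # 54
```

A small chunk size multiplies the number of tasks quickly, which is the overhead the paragraph above warns about.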

In the example below, the input has three chunks along `lon`:


In [13]:
ds_spatial = ds.chunk({"lat": 25, "lon": 25, "time": -1})  # name matches the cells below
ds_spatial.air.data

Unnamed: 0,Array,Chunk
Bytes,14.76 MiB,6.96 MiB
Shape,"(2920, 25, 53)","(2920, 25, 25)"
Dask graph,3 chunks in 3 graph layers,3 chunks in 3 graph layers
Data type,float32 numpy.ndarray,float32 numpy.ndarray


In this case, the output DataArray will have the same chunk size on longitude,
but still only one chunk along latitude.


In [14]:
ds_spatial_out = regridder(ds_spatial)  # Regridding ds_spatial
ds_spatial_out["air"].data

Unnamed: 0,Array,Chunk
Bytes,57.18 MiB,16.43 MiB
Shape,"(2920, 59, 87)","(2920, 59, 25)"
Dask graph,4 chunks in 10 graph layers,4 chunks in 10 graph layers
Data type,float32 numpy.ndarray,float32 numpy.ndarray


This default can be overridden by passing the `output_chunks` argument when
calling the regridder:


In [15]:
ds_spatial_out = regridder(ds_spatial, output_chunks={"lat": 10, "lon": 10})
ds_spatial_out["air"].data

Unnamed: 0,Array,Chunk
Bytes,57.18 MiB,1.11 MiB
Shape,"(2920, 59, 87)","(2920, 10, 10)"
Dask graph,54 chunks in 10 graph layers,54 chunks in 10 graph layers
Data type,float32 numpy.ndarray,float32 numpy.ndarray


# Parallel weight generation with Dask


Dask can also be used to build the regridder and compute its weights in
parallel. To do so, xESMF uses the chunks on the destination grid and computes
subsets of weights on each chunk in parallel.

This feature is currently experimental, and it forces Dask to use processes to
parallelize the computation. Moreover, it is slower than the normal method and
is thus _only_ useful if the **destination** grid does not fit in memory. Recall
that the parallelization is done over chunks of the destination grid and that
each iteration needs to load the source grid in memory.
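A toy case shows why splitting the destination grid works. Each row of a weight matrix belongs to one destination cell, so rows computed for separate destination chunks can be stacked into the full matrix, while every chunk still needs the complete source grid. The sketch below uses 1D linear interpolation as a stand-in; `linear_weights` is a hypothetical helper, not part of xESMF or ESMF, and it does not reproduce their algorithms.

```python
import numpy as np


def linear_weights(src_x, dst_x):
    """Dense (n_dst, n_src) weights for 1D linear interpolation (illustrative only)."""
    src_x = np.asarray(src_x, dtype=float)
    dst_x = np.asarray(dst_x, dtype=float)
    weights = np.zeros((dst_x.size, src_x.size))
    # Index of the source point to the left of each destination point.
    left = np.clip(np.searchsorted(src_x, dst_x, side="right") - 1, 0, src_x.size - 2)
    frac = (dst_x - src_x[left]) / (src_x[left + 1] - src_x[left])
    rows = np.arange(dst_x.size)
    weights[rows, left] = 1.0 - frac
    weights[rows, left + 1] = frac
    return weights


src_x = np.linspace(0.0, 10.0, 6)   # full source grid, needed by every chunk
dst_x = np.linspace(0.0, 10.0, 21)  # destination grid, split into chunks below

full = linear_weights(src_x, dst_x)
blocks = [linear_weights(src_x, chunk) for chunk in np.array_split(dst_x, 4)]
stacked = np.vstack(blocks)

print(np.allclose(full, stacked))
```

Each call inside the list comprehension is independent of the others, so the calls could run in separate processes, at the cost of handing the whole source grid to each of them.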

For a more performant way to generate weights in parallel, it might be better to
use `ESMF` directly instead, assuming you have an MPI-enabled version. See the
"Solving large problems using HPC" notebook.


## Parallel weight generation example


### Prepare input data


In [16]:
ds = xr.tutorial.open_dataset("air_temperature", chunks={"time": 500})
ds

Unnamed: 0,Array,Chunk
Bytes,14.76 MiB,2.53 MiB
Shape,"(2920, 25, 53)","(500, 25, 53)"
Dask graph,6 chunks in 2 graph layers,6 chunks in 2 graph layers
Data type,float32 numpy.ndarray,float32 numpy.ndarray


### Prepare output dataset and chunk it


In [17]:
ds_out = xr.tutorial.open_dataset("rasm")
ds_out = ds_out.chunk({"y": 50, "x": 50})
ds_out.chunks

Frozen({'time': (36,), 'y': (50, 50, 50, 50, 5), 'x': (50, 50, 50, 50, 50, 25)})

### Create regridder, generating the weights in parallel


In [18]:
para_regridder = xe.Regridder(ds, ds_out, "bilinear", parallel=True)
para_regridder



xESMF Regridder 
Regridding algorithm:       bilinear 
Weight filename:            bilinear_25x53_205x275.nc 
Reuse pre-computed weights? False 
Input grid shape:           (25, 53) 
Output grid shape:          (205, 275) 
Periodic in longitude?      False

Attempting to build the Regridder using the option `parallel=True` with either
`reuse_weights=True` or with `weights != None` will produce a warning. In both
cases, since the weights are already generated, the regridder will be built
skipping the parallel part.


### Using a mask to chunk an empty Dataset


If the destination grid has no variables and contains 1D lat/lon coordinates,
using xarray's `.chunk()` method will not work, since there is no data variable
to chunk:


In [19]:
ds_out = xr.Dataset(
    {
        "lat": (["lat"], np.arange(16, 75, 1.0), {"units": "degrees_north"}),
        "lon": (["lon"], np.arange(200, 330, 1.5), {"units": "degrees_east"}),
    }
)
ds_out

In [20]:
ds_out = ds_out.chunk({"lat": 25, "lon": 25})  # .chunk() returns a new object, so keep the result
ds_out.chunks

Frozen({})

To deal with this issue, we can create a `mask` and add it to `ds_out`. Using a
boolean mask ensures `ds_out` is not bloated by data and setting the mask to be
`True` everywhere will not affect regridding.


In [21]:
mask = da.ones((ds_out.lat.size, ds_out.lon.size), dtype=bool, chunks=(25, 25))
ds_out["mask"] = (("lat", "lon"), mask)  # name the dimensions explicitly, in the mask's axis order

# Now we check the chunks of ds_out
ds_out.chunks

Frozen({'lat': (25, 25, 9), 'lon': (25, 25, 25, 12)})