# Regrid

In [1]:
import xarray
import numpy
import pandas
import climtas
import xesmf
import dask.array

We have a large Dask dataset that we'd like to regrid to a different resolution, so that we can compare it with another dataset.

In [2]:
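# Daily timestamps for 2001 to 2003 (1095 days); the end date is excluded.
# `inclusive` needs pandas >= 1.4; older releases spelled it closed='left'.
time = pandas.date_range('20010101', '20040101', freq='D', inclusive='left')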

data = dask.array.random.random((len(time),50,100), chunks=(90,25,25))
lat = numpy.linspace(-90, 90, data.shape[1])
lon = numpy.linspace(-180, 180, data.shape[2], endpoint=False)

da = xarray.DataArray(data, coords=[('time', time), ('lat', lat), ('lon', lon)], name='temperature')
da.lat.attrs['standard_name'] = 'latitude'
da.lon.attrs['standard_name'] = 'longitude'

da

|       | Array           | Chunk         |
|-------|-----------------|---------------|
| Bytes | 41.77 MiB       | 439.45 kiB    |
| Shape | (1095, 50, 100) | (90, 25, 25)  |
| Count | 104 Tasks       | 104 Chunks    |
| Type  | float64         | numpy.ndarray |
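
The chunk count in the summary above follows directly from the array shape and the requested chunk sizes. As a minimal sketch (the helper name `count_chunks` is made up for illustration and is not part of Dask), each dimension is split into `ceil(size / chunk)` pieces and the total is the product over dimensions:

```python
import math

def count_chunks(shape, chunks):
    """Pieces per dimension, and in total, when each dimension of length
    shape[i] is cut into blocks of at most chunks[i] elements."""
    per_dim = [math.ceil(size / chunk) for size, chunk in zip(shape, chunks)]
    return per_dim, math.prod(per_dim)

# Shape and chunking of `da` as created above
per_dim, total = count_chunks((1095, 50, 100), (90, 25, 25))
print(per_dim, total)  # [13, 2, 4] 104
```

The last chunk along `time` is shorter than the others, because 1095 is not a multiple of 90.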


Here's a variable on the target grid that we'd like to regrid `da` to.

In [3]:
t_lat = numpy.linspace(-90, 90, 10)
t_lon = numpy.linspace(0, 360, 50, endpoint=False)

t_da = xarray.DataArray(dask.array.zeros((len(t_lat), len(t_lon))), coords=[('lat', t_lat), ('lon', t_lon)])

t_da

|       | Array    | Chunk         |
|-------|----------|---------------|
| Bytes | 3.91 kiB | 3.91 kiB      |
| Shape | (10, 50) | (10, 50)      |
| Count | 1 Tasks  | 1 Chunks      |
| Type  | float64  | numpy.ndarray |


The main way to regrid in Xarray is [xesmf](https://xesmf.readthedocs.io/en/latest/), which uses the ESMF library to do the actual regridding. However, xesmf raises an error when the data is chunked horizontally, which is a problem for very large grids where a full horizontal slice may be too big to hold comfortably in a single chunk.

In [4]:
re = xesmf.Regridder(da, t_da, method='bilinear')

try:
    re(da)
except Exception as e:
    print("Error", e)

Error dimension lat on 0th function argument to apply_ufunc with dask='parallelized' consists of multiple chunks, but is also a core dimension. To fix, either rechunk into a single dask array chunk along this dimension, i.e., ``.chunk(lat: -1)``, or pass ``allow_rechunk=True`` in ``dask_gufunc_kwargs`` but beware that this may significantly increase memory usage.

Instead, you have to remove the horizontal chunking before doing an xesmf regrid, so that every chunk spans the full `lat` and `lon` dimensions.

In [5]:
# -1 asks for a single chunk along that dimension, as the error message suggests
re(da.chunk({'lat': -1, 'lon': -1}))

|       | Array          | Chunk         |
|-------|----------------|---------------|
| Bytes | 4.18 MiB       | 351.56 kiB    |
| Shape | (1095, 10, 50) | (90, 10, 50)  |
| Count | 156 Tasks      | 13 Chunks     |
| Type  | float64        | numpy.ndarray |
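
The cost of removing horizontal chunking can be estimated with a little arithmetic. The sketch below computes the size of one float64 input chunk of `da` before and after rechunking, then repeats the calculation for a hypothetical 0.1 degree global grid (1800 by 3600 points, an assumption chosen for illustration, not a dataset used in this notebook). The helper name `chunk_nbytes` is also made up for this example.

```python
import math

def chunk_nbytes(chunk_shape, itemsize=8):
    """Bytes held by one chunk of the given shape (itemsize=8 for float64)."""
    return math.prod(chunk_shape) * itemsize

KIB = 1024
GIB = 1024 ** 3

tiled = chunk_nbytes((90, 25, 25))   # chunking of `da` as created above
full = chunk_nbytes((90, 50, 100))   # same time chunks, lat/lon in one piece
print(f"tiled: {tiled / KIB:.2f} KiB, unchunked: {full / KIB:.2f} KiB")

# Hypothetical 0.1 degree global grid with the same 90-step time chunks
big = chunk_nbytes((90, 1800, 3600))
print(f"hypothetical large grid: {big / GIB:.2f} GiB per chunk")
```

For the small example grid the per-chunk size only grows by a factor of eight, but on a large grid a single unchunked horizontal slab can reach several GiB unless the time chunks are made much smaller.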


The [climtas.regrid](api/regrid.rst) functions support regridding horizontally chunked data. By default the regridding is bilinear, with weights generated by CDO, but you can also supply your own regridding weights or have the library generate them for you from ESMF or CDO.

In [6]:
climtas.regrid.regrid(da, t_da)

|       | Array          | Chunk         |
|-------|----------------|---------------|
| Bytes | 4.18 MiB       | 351.56 kiB    |
| Shape | (1095, 10, 50) | (90, 10, 50)  |
| Count | 1124 Tasks     | 13 Chunks     |
| Type  | float64        | numpy.ndarray |
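
To make the idea of regridding weights more concrete, here is a minimal NumPy sketch of weight-based interpolation along a single axis. It is an illustration under simplifying assumptions (one dimension, a dense weight matrix, targets inside the source range) and is not a description of how climtas, CDO or ESMF implement regridding; the function name `linear_weights` is made up for this example. Each target point gets at most two non-zero weights, and the regridded field is the weight matrix multiplied by the source values. Bilinear regridding in two dimensions follows the same pattern with up to four source points per target point.

```python
import numpy

def linear_weights(src, dst):
    """Dense (len(dst), len(src)) matrix W so that W @ values linearly
    interpolates values from coordinates src to coordinates dst.
    Assumes src is strictly increasing and dst lies within its range."""
    src = numpy.asarray(src, dtype=float)
    dst = numpy.asarray(dst, dtype=float)
    # Index of the source point to the right of each target point
    right = numpy.clip(numpy.searchsorted(src, dst), 1, len(src) - 1)
    left = right - 1
    frac = (dst - src[left]) / (src[right] - src[left])
    weights = numpy.zeros((len(dst), len(src)))
    rows = numpy.arange(len(dst))
    weights[rows, left] = 1.0 - frac
    weights[rows, right] = frac
    return weights

src_lat = numpy.linspace(-90, 90, 50)
dst_lat = numpy.linspace(-90, 90, 10)
W = linear_weights(src_lat, dst_lat)

# A random (lat, lon) field standing in for one time step of the data
values = numpy.random.default_rng(0).random((50, 100))
regridded = W @ values
print(regridded.shape)
```

Because the weights depend only on the two grids and not on the data, they can be computed once and reused for every time step.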
