## Xarray engine: chunks

First, we get 2m temperature data for a whole year on a low-resolution regular latitude-longitude grid. It contains 2 fields per day (at 0 and 12 UTC). This data easily fits into memory, so it is used here for demonstration purposes only.

In [1]:
import earthkit.data as ekd
ds_fl = ekd.from_source("sample", "t2_1_year_hourly.grib")
len(ds_fl)

732

Next, we convert the GRIB fieldlist to Xarray using a chunk size of 10 fields along the "valid_time" dimension.

In [2]:
ds = ds_fl.to_xarray(time_dim_mode="valid_time", 
                     chunks={"valid_time": 10}, 
                     add_earthkit_attrs=False)
ds["2t"]

| | Array | Chunk |
|---|---|---|
| Bytes | 1.74 MiB | 24.38 kiB |
| Shape | (732, 13, 24) | (10, 13, 24) |
| Dask graph | 74 chunks in 2 graph layers | |
| Data type | float64 numpy.ndarray | |
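
The numbers in this summary follow from the array shape and the chunk size. The sketch below reproduces them with plain Python. The variable names are illustrative and not part of earthkit-data or Xarray; only the shape, the chunk size and the float64 data type are taken from the output above.

```python
import math

# Values taken from the output above: valid_time first,
# then the two horizontal grid dimensions
shape = (732, 13, 24)
chunk_len = 10   # chunk size along valid_time
itemsize = 8     # bytes per float64 value

# Number of chunks along valid_time; the last chunk is only partly filled
n_chunks = math.ceil(shape[0] / chunk_len)
last_chunk_len = shape[0] - (n_chunks - 1) * chunk_len

# Memory footprint of the whole array and of one full chunk
points_per_field = shape[1] * shape[2]
array_mib = shape[0] * points_per_field * itemsize / 2**20
chunk_kib = chunk_len * points_per_field * itemsize / 2**10

print(n_chunks, last_chunk_len)  # 74 2
print(f"{array_mib:.2f} MiB, {chunk_kib:.2f} kiB")
```

Since 732 is not a multiple of 10, the final chunk holds only 2 fields, which is why there are 74 chunks rather than 73.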


Finally, we compute the mean along the temporal dimension. Xarray loads the data chunk by chunk for this computation, which keeps memory usage low.

In [3]:
m = ds["2t"].mean(dim="valid_time").load()
m
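
To illustrate why a chunked reduction needs little memory, here is a minimal sketch of a mean computed one chunk at a time with NumPy alone. It is not how Xarray or Dask implement the reduction internally. The function `chunked_time_mean` and the random test data are hypothetical and only mimic the shape of the array above.

```python
import numpy as np

def chunked_time_mean(data, chunk_len=10):
    """Mean over axis 0, visiting the data one chunk at a time."""
    total = np.zeros(data.shape[1:], dtype=np.float64)
    for start in range(0, data.shape[0], chunk_len):
        # Only this slice has to be held in memory at once
        block = data[start:start + chunk_len]
        total += block.sum(axis=0)
    return total / data.shape[0]

# Synthetic stand-in for the 2t array (same shape, random values)
rng = np.random.default_rng(0)
data = rng.normal(280.0, 10.0, size=(732, 13, 24))

m_chunked = chunked_time_mean(data)
print(m_chunked.shape)                            # (13, 24)
print(np.allclose(m_chunked, data.mean(axis=0)))  # True
```

For simplicity the synthetic array here lives fully in memory. With a lazily loaded source, each slice would be read only when its chunk is processed, so the peak memory is roughly one chunk plus the running sum.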