## Xarray engine: mono variable with remapping

This notebook demonstrates how to generate an Xarray dataset with a single DataArray containing all the parameters from a GRIB fieldlist. This data structure is often needed for machine learning.
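As a rough, library-free illustration of the target layout (not earthkit-data's implementation), the sketch below stacks a few made-up fields along one combined parameter axis. The field metadata, the values and the names `fields`, `labels` and `stacked` are hypothetical; the label template mirrors the one passed as `remapping` later in this notebook.

```python
# Hypothetical fields: each has a parameter name, a level and flattened values.
fields = [
    {"param": "t", "level": 500, "values": [250.1, 251.3, 249.8]},
    {"param": "t", "level": 850, "values": [271.0, 272.4, 270.9]},
    {"param": "2t", "level": 0, "values": [288.2, 289.0, 287.5]},
]

# Combine parameter and level into one label per field.
template = "{param}_{level}"
labels = [template.format(param=f["param"], level=f["level"]) for f in fields]

# Stack the field values so that axis 0 is the combined parameter axis.
stacked = [f["values"] for f in fields]

print(labels)
print(len(stacked), len(stacked[0]))
```

Each row of `stacked` corresponds to one entry in `labels`, which is the idea behind keeping every parameter in a single array with a labelled parameter dimension.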

First, we get GRIB data containing multiple forecasts on surface and pressure levels, and select a single forecast from it.

In [1]:
import earthkit.data as ekd
ds_fl = ekd.from_source("sample", "mixed_pl_sfc.grib").sel(date=20240603, time=0)

In [2]:
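ds = ds_fl.to_xarray(
    fixed_dims=["valid_time", "param", "number"],  # dimensions of the result
    mono_variable=True,                            # one variable holding all parameters
    chunks={"valid_time": 1},                      # one chunk per valid time
    flatten_values=True,                           # flatten the values of each field
    add_earthkit_attrs=False,
    remapping={"param": "{param}_{level}"},        # combine param and level into one label
)
ds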

| Array | Shape | Chunk shape | Bytes | Chunk bytes | Dask graph | Data type |
|---|---|---|---|---|---|---|
| 1-D arrays (two, same layout) | (684,) | (684,) | 5.34 kiB | 5.34 kiB | 1 chunks in 2 graph layers | float64 numpy.ndarray |
| 4-D array | (2, 32, 1, 684) | (1, 32, 1, 684) | 342.00 kiB | 171.00 kiB | 2 chunks in 2 graph layers | float64 numpy.ndarray |


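When generating the Xarray dataset, we flattened the field values and chose the chunking so that each chunk contains all the data belonging to one valid time. The output confirms this: the array of shape (2, 32, 1, 684) is split into two chunks of shape (1, 32, 1, 684).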
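The sizes reported in the output can be checked with a little arithmetic. This is a minimal sketch that uses only the shapes and the float64 data type shown above; reading the axes as (valid_time, param, number, values) is an assumption based on the order of `fixed_dims`, and the helper `size_kib` is introduced here for illustration.

```python
from math import prod

ITEMSIZE = 8  # bytes per float64 value

# Shapes as reported in the output above.
# Assumed axis order: (valid_time, param, number, values).
array_shape = (2, 32, 1, 684)
chunk_shape = (1, 32, 1, 684)


def size_kib(shape):
    """Return the size in KiB of a float64 array with the given shape."""
    return prod(shape) * ITEMSIZE / 1024


array_kib = size_kib(array_shape)
chunk_kib = size_kib(chunk_shape)

# Number of chunks: ceiling division along each axis, then multiplied.
n_chunks = prod(-(-a // c) for a, c in zip(array_shape, chunk_shape))

print(array_kib, chunk_kib, n_chunks)  # 342.0 171.0 2
```

The two chunks of 171 KiB each add up to the 342 KiB of the full array, so every chunk holds all parameters and all values for one entry along the first axis.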

In [3]:
ds["data"]

| Array | Shape | Chunk shape | Bytes | Chunk bytes | Dask graph | Data type |
|---|---|---|---|---|---|---|
| 4-D array | (2, 32, 1, 684) | (1, 32, 1, 684) | 342.00 kiB | 171.00 kiB | 2 chunks in 2 graph layers | float64 numpy.ndarray |
| 1-D arrays (two, same layout) | (684,) | (684,) | 5.34 kiB | 5.34 kiB | 1 chunks in 2 graph layers | float64 numpy.ndarray |
