## Xarray engine: remapping

In [1]:
import earthkit.data as ekd

### Remapping used to define a custom dimension

Let us consider 3 ensemble members: 1 control (cf) and 2 perturbed members (pf).

In [2]:
ds_fl = ekd.from_source("sample", "ens_cf_pf.grib")
ds_fl.ls()

,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,500,20240603,0,0,cf,0,regular_ll
1,ecmf,t,isobaricInhPa,500,20240603,0,6,cf,0,regular_ll
2,ecmf,t,isobaricInhPa,500,20240603,0,0,pf,1,regular_ll
3,ecmf,t,isobaricInhPa,500,20240603,0,0,pf,2,regular_ll
4,ecmf,t,isobaricInhPa,500,20240603,0,6,pf,1,regular_ll
5,ecmf,t,isobaricInhPa,500,20240603,0,6,pf,2,regular_ll


Suppose we want to organise this field list along a custom dimension called ``"member"``, whose coordinates are constructed by combining the metadata keys ``"dataType"`` and ``"number"``, for example: ``["cf_0", "pf_1", "pf_2"]``.

To achieve this, we

- use the ``remapping`` option to define a virtual key ``"member"``, and

- declare ``"member"`` as a new dimension.
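The effect of the virtual key can be pictured with plain Python string formatting. The sketch below is an illustration only and does not reproduce earthkit-data's internal implementation: it applies the same template string to the metadata of the six fields listed above and collects the distinct results, which correspond to the coordinates of the ``"member"`` dimension.

```python
# Illustration only: mimics the effect of a remapping template with plain
# Python string formatting. This is not earthkit-data's implementation.

# Metadata of the six fields listed above (only the relevant keys).
fields = [
    {"dataType": "cf", "number": 0, "stepRange": 0},
    {"dataType": "cf", "number": 0, "stepRange": 6},
    {"dataType": "pf", "number": 1, "stepRange": 0},
    {"dataType": "pf", "number": 2, "stepRange": 0},
    {"dataType": "pf", "number": 1, "stepRange": 6},
    {"dataType": "pf", "number": 2, "stepRange": 6},
]

# Same template string as the one passed to the remapping option.
template = "{dataType}_{number}"

# Evaluate the virtual key for every field (unused keys are ignored).
labels = [template.format(**f) for f in fields]

# Distinct values in order of first appearance: the "member" coordinates.
members = list(dict.fromkeys(labels))
print(members)
```

Each field gets exactly one label, so fields that differ only in ``stepRange`` share the same ``"member"`` coordinate.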

In [3]:
ds = ds_fl.to_xarray(
    profile=None,
    remapping={"member": "{dataType}_{number}"}, 
    extra_dims="member", 
    add_earthkit_attrs=False
)
ds

Note that it is not necessary to explicitly remove the predefined
dimension ``"number"`` using the ``drop_dims`` option. The Xarray engine
drops it automatically because it is already incorporated into another
dimension, in this case ``"member"``.
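One way to picture this rule is to extract the metadata keys referenced by the remapping template and remove any predefined dimension that appears among them. The following is a minimal sketch of that idea using only the standard library. The helper ``keys_in_template`` and the list ``predefined_dims`` are hypothetical names introduced for illustration, and the actual logic inside the Xarray engine may differ.

```python
from string import Formatter


def keys_in_template(template):
    """Return the metadata keys referenced by a format template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


remapping = {"member": "{dataType}_{number}"}
extra_dims = ["member"]

# Hypothetical list of predefined dimensions, for illustration only.
predefined_dims = ["number", "step", "level"]

# Collect every key that is absorbed into a remapped extra dimension.
absorbed = set()
for dim in extra_dims:
    absorbed.update(keys_in_template(remapping.get(dim, "")))

# A predefined dimension already covered by a remapped one is dropped.
remaining = [d for d in predefined_dims if d not in absorbed]
dims = extra_dims + remaining
print(dims)
```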

Below, we present a more elaborate example illustrating how
``remapping`` can be used in conjunction with the ``extra_dims`` and
``dims_as_attrs`` options.

In [4]:
ds2 = ds_fl.to_xarray(
    profile=None,
    squeeze=True, 
    remapping={
        "member": "{dataType}_{number}", 
        "mars": "{class}_{stream}", 
    }, 
    extra_dims=["member", "mars"], 
    dims_as_attrs="mars", 
    add_earthkit_attrs=False
)
ds2

Above, we declared ``"mars"`` as a new dimension whose coordinates combine the ``"class"`` and ``"stream"`` metadata keys. Because this dimension has size 1, it is squeezed (``squeeze=True``). However, the ``dims_as_attrs`` option preserves the coordinate value of this dimension as a variable attribute.

In [5]:
ds2["t"].attrs

{'mars': 'od_enfo'}
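The interplay between squeezing and ``dims_as_attrs`` can be sketched in plain Python. This is a conceptual illustration under stated assumptions, not the engine's code: the function ``squeeze_dims`` is hypothetical, and the ``"step"`` and ``"level"`` dimension names are chosen only for the example.

```python
def squeeze_dims(coords, dims_as_attrs=()):
    """Drop size-1 dimensions, keeping selected ones as attributes."""
    kept = {}
    attrs = {}
    for name, values in coords.items():
        if len(values) == 1:
            # Size-1 dimension: squeezed away, optionally kept as an attribute.
            if name in dims_as_attrs:
                attrs[name] = values[0]
        else:
            kept[name] = values
    return kept, attrs


# Illustrative coordinates; "step" and "level" are assumed names.
coords = {
    "member": ["cf_0", "pf_1", "pf_2"],
    "step": [0, 6],
    "level": [500],
    "mars": ["od_enfo"],
}

kept, attrs = squeeze_dims(coords, dims_as_attrs=["mars"])
print(list(kept))
print(attrs)
```

In this sketch, ``"level"`` and ``"mars"`` both have size 1 and are squeezed, but only ``"mars"`` survives as an attribute because it is the only one listed in ``dims_as_attrs``.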

### Remapping used to define a custom variable name

The following GRIB dataset contains the parameters ``t`` and ``u`` on both pressure levels and hybrid (model) levels.

In [6]:
ds_fl2 = ekd.from_source("sample", "mixed_pl_ml.grib")
ds_fl2.ls()

,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,700,20240603,0,0,fc,,regular_ll
1,ecmf,u,isobaricInhPa,700,20240603,0,0,fc,,regular_ll
2,ecmf,t,isobaricInhPa,500,20240603,0,0,fc,,regular_ll
3,ecmf,u,isobaricInhPa,500,20240603,0,0,fc,,regular_ll
4,ecmf,t,isobaricInhPa,700,20240603,0,6,fc,,regular_ll
...,...,...,...,...,...,...,...,...,...,...
59,ecmf,u,hybrid,137,20240604,1200,0,fc,,regular_ll
60,ecmf,t,hybrid,90,20240604,1200,6,fc,,regular_ll
61,ecmf,u,hybrid,90,20240604,1200,6,fc,,regular_ll
62,ecmf,t,hybrid,137,20240604,1200,6,fc,,regular_ll


When converting this field list into an Xarray dataset, we must handle
the fact that the same parameters are defined on two incompatible level
types. One possible approach is to create a separate variable for
each combination of parameter, level type and level, for example:
``"t__hybrid_90"``, ``"t__hybrid_137"``, ``"t__isobaricInhPa_500"``, ``"t__isobaricInhPa_700"``, and similarly for
``u``.

In [7]:
ds3 = ds_fl2.to_xarray(
    profile="grib", 
    remapping={"my_custom_var_key": "{param}__{typeOfLevel}_{level}"}, 
    variable_key="my_custom_var_key",
    add_earthkit_attrs=False
)
ds3

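An alternative approach, which results in a more compact hypercube
structure, is to create one variable per combination of parameter and
level type and to set ``level_dim_mode="level_per_type"``, so that the
levels remain a dimension instead of being encoded in the variable names: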

In [8]:
ds4 = ds_fl2.to_xarray(
    profile="grib", 
    level_dim_mode="level_per_type", 
    remapping={"my_custom_var_key": "{param}_{typeOfLevel}"}, 
    variable_key="my_custom_var_key", 
    add_earthkit_attrs=False
)
ds4
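To see why the second template gives a more compact result, the two naming schemes can be compared with plain string formatting. The sketch below is illustrative only: it uses the parameters and levels mentioned above, ignores the date, time and step metadata, and relies on a hypothetical helper ``group_levels`` that is not part of earthkit-data.

```python
from itertools import product

params = ["t", "u"]
levels = [
    ("hybrid", 90),
    ("hybrid", 137),
    ("isobaricInhPa", 500),
    ("isobaricInhPa", 700),
]

# One metadata record per parameter/level combination (time keys ignored).
fields = [
    {"param": p, "typeOfLevel": lt, "level": lv}
    for p, (lt, lv) in product(params, levels)
]


def group_levels(template):
    """Map each variable name built from the template to its levels."""
    groups = {}
    for f in fields:
        groups.setdefault(template.format(**f), []).append(f["level"])
    return groups


per_level = group_levels("{param}__{typeOfLevel}_{level}")
per_type = group_levels("{param}_{typeOfLevel}")

print(len(per_level), "variables, one level each")
print(len(per_type), "variables:", per_type)
```

With the first template every variable holds a single level, whereas with the second each variable groups all levels of one level type, which is what allows them to be laid out along a level dimension.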