# List of indexing methods

See also [the Xarray docs](https://docs.xarray.dev/en/stable/user-guide/indexing.html).

In [None]:
import cedalion
import cedalion.datasets
import cedalion.xrutils as xrutils
import xarray as xr

In [None]:
rec = cedalion.datasets.get_fingertapping()
ts = rec["amp"]
geo3d = rec.geo3d
stim = rec.stim

In [None]:
# normal array indexing works as expected
display(ts)
ts[:,0,:] # first item along wavelength

In [None]:
ts[:,:,::3000] # every 3000th time point

In [None]:
# look up elements by label; this requires knowing the order of the dims
ts.loc["S1D1", 760, :]

In [None]:
# independent of the order of the dims: name each dimension explicitly
ts.sel(channel="S1D1", wavelength=760)

In [None]:
# boolean mask on the time coordinate: keep samples with 10 < time < 60
ts.sel(time=(ts.time > 10) & (ts.time < 60.))
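
A time window can also be selected with a label-based `slice`. The sketch below uses a small synthetic array (the name `sig` and its values are made up for illustration, not taken from the dataset) to show one difference worth keeping in mind: the boolean mask with strict inequalities excludes both endpoints, whereas a label-based slice includes them.

```python
import numpy as np
import xarray as xr

# Synthetic 1-D signal sampled every 0.5 s (illustrative only).
sig = xr.DataArray(
    np.arange(5.0),
    dims=["time"],
    coords={"time": [0.0, 0.5, 1.0, 1.5, 2.0]},
)

# Boolean mask with strict inequalities: both endpoints are excluded.
by_mask = sig.sel(time=(sig.time > 0.5) & (sig.time < 1.5))

# Label-based slice: both endpoints are included.
by_slice = sig.sel(time=slice(0.5, 1.5))

print(by_mask.time.values.tolist())   # [1.0]
print(by_slice.time.values.tolist())  # [0.5, 1.0, 1.5]
```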

In [None]:
# regular expression via the str accessor
# (inside a character class a comma is matched literally, so write [23], not [2,3])
ts.sel(channel=ts.channel.str.match("S[23]D[12]"))
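
As a rough illustration of how the pattern is applied, the following sketch runs the same kind of selection on a synthetic array (the channel labels and the name `demo` are invented for this example). `str.match` only anchors the pattern at the start of each label, so a trailing `$` is added here to reject labels such as `S3D10`.

```python
import numpy as np
import xarray as xr

# Synthetic channel labels (illustrative only).
labels = ["S1D1", "S2D1", "S2D2", "S3D2", "S3D10", "S4D1"]
demo = xr.DataArray(np.arange(6.0), dims=["channel"], coords={"channel": labels})

# Sources 2 or 3 combined with detectors 1 or 2.
# The "$" anchors the end of the label so that "S3D10" does not match.
mask = demo.channel.str.match(r"S[23]D[12]$")
selected = demo.sel(channel=mask)

print(selected.channel.values.tolist())  # ['S2D1', 'S2D2', 'S3D2']
```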

Use `isin` to select a fixed list of items:

In [None]:
ts.sel(channel=ts.channel.isin(["S1D1", "S8D8"]))

`.sel` relies on an index. Indexes are built for some coordinates (time, channel, wavelength); these are printed in bold face when the DataArray is displayed. Indexes make lookups efficient but are not strictly necessary, so they are not built by default for the remaining coordinates (e.g. `source`).


In [None]:
# `source` has no index, so `.sel` raises a KeyError
try:
    ts.sel(source="S1")
except KeyError as ex:
    print(ex)



If no index is available, it can either [be built](https://docs.xarray.dev/en/v2024.07.0/generated/xarray.DataArray.set_xindex.html#xarray.DataArray.set_xindex):

In [None]:
ts_with_index = ts.set_xindex("source")
ts_with_index.sel(source="S1")

... or one can resort to selecting with a boolean mask:

In [None]:
ts[ts.source == "S1"]
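
The two routes can be compared side by side on a small synthetic array. This is only a sketch: the array `demo` and its labels are made up, and it assumes an xarray version that provides `set_xindex`.

```python
import numpy as np
import xarray as xr

# Synthetic array with an indexed 'channel' coordinate and a
# non-indexed 'source' coordinate on the same dimension (illustrative only).
demo = xr.DataArray(
    np.arange(8.0).reshape(4, 2),
    dims=["channel", "time"],
    coords={
        "channel": ["S1D1", "S1D2", "S2D1", "S2D2"],
        "time": [0.0, 0.5],
        "source": ("channel", ["S1", "S1", "S2", "S2"]),
    },
)

# Only dimension coordinates get an index automatically.
has_source_index = "source" in demo.xindexes

# Without an index, label-based selection on 'source' fails.
try:
    demo.sel(source="S1")
    sel_failed = False
except KeyError:
    sel_failed = True

# Option 1: build the index, then use .sel
by_index = demo.set_xindex("source").sel(source="S1")

# Option 2: boolean mask, no index required
by_mask = demo[demo.source == "S1"]

print(has_source_index, sel_failed)
print(by_index.channel.values.tolist())
print(by_mask.channel.values.tolist())
```

Both options should return the same two channels; the mask variant avoids creating a modified copy of the array just to perform a single lookup.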

Coordinates from one array can be used to index another. Here `ts.source` selects values in `geo3d` along the 'label' dimension. Because `ts.source` belongs to the 'channel' dimension of `ts`, the resulting `xr.DataArray` has the dimensions 'channel' (from `ts.source`) and 'digitized' (from `geo3d`).

In [None]:
display(geo3d)
display(ts.source)
geo3d.loc[ts.source]

`.sel` also accepts a dictionary. This is useful when the dimension name is stored in a variable:

In [None]:
dim = 'wavelength'
dim_value = 760
ts.sel({dim : dim_value})
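
The dictionary form makes it straightforward to write helpers that are agnostic of the dimension they act on. A minimal sketch follows; `first_entry_along` is a hypothetical helper written for this example and `demo` is a synthetic array, neither is part of cedalion or xarray.

```python
import numpy as np
import xarray as xr

# Synthetic array with the same dimension names as `ts` (illustrative only).
demo = xr.DataArray(
    np.arange(12.0).reshape(2, 2, 3),
    dims=["channel", "wavelength", "time"],
    coords={
        "channel": ["S1D1", "S1D2"],
        "wavelength": [760.0, 850.0],
        "time": [0.0, 0.5, 1.0],
    },
)


def first_entry_along(da, dim):
    """Hypothetical helper: select the first coordinate label along `dim`."""
    first_label = da[dim].values[0]
    # The dimension name is only known at runtime, so pass a dictionary.
    return da.sel({dim: first_label})


for dim in demo.dims:
    reduced = first_entry_along(demo, dim)
    print(dim, "->", reduced.dims)
```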

`geo3d` uses the name of one of its dimensions to denote the coordinate reference system (CRS):

In [None]:
display(geo3d)
display(geo3d.points.crs) # get the name of the dimension that also names the CRS

In [None]:
xrutils.norm(geo3d, dim=geo3d.points.crs) # works regardless of which CRS geo3d uses

Splitting channels by a distance criterion:

In [None]:
dists = xrutils.norm(
    geo3d.loc[ts.source] - geo3d.loc[ts.detector], dim=geo3d.points.crs
)

mask = dists < 1.5 * cedalion.units.cm
display(ts.sel(channel=mask))
display(ts.sel(channel=~mask)) # logical not on boolean mask
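
To make the individual steps explicit, here is a sketch of the same idea in plain xarray and NumPy. It uses invented optode positions (interpreted as centimetres, without pint units) and computes the norm by hand instead of calling `xrutils.norm`; the names `geo`, `demo`, `short_ch` and `long_ch` are assumptions made for this example.

```python
import numpy as np
import xarray as xr

# Made-up optode positions, one row per label (illustrative only).
geo = xr.DataArray(
    [[0.0, 0.0, 0.0], [0.0, 3.0, 0.0], [1.0, 0.0, 0.0], [0.0, 4.0, 0.0]],
    dims=["label", "digitized"],
    coords={"label": ["S1", "S2", "D1", "D2"]},
)

# Synthetic time series with source/detector coordinates on 'channel'.
demo = xr.DataArray(
    np.zeros((4, 2)),
    dims=["channel", "time"],
    coords={
        "channel": ["S1D1", "S1D2", "S2D1", "S2D2"],
        "source": ("channel", ["S1", "S1", "S2", "S2"]),
        "detector": ("channel", ["D1", "D2", "D1", "D2"]),
    },
)

# Indexing with a coordinate of `demo` yields one position per channel.
src_pos = geo.loc[demo.source]    # dims: ('channel', 'digitized')
det_pos = geo.loc[demo.detector]  # dims: ('channel', 'digitized')

# Euclidean source-detector distance per channel.
dists = np.sqrt(((src_pos - det_pos) ** 2).sum("digitized"))

# Split the channels with a boolean mask and its negation.
mask = dists < 1.5
short_ch = demo.sel(channel=mask)
long_ch = demo.sel(channel=~mask)

print(short_ch.channel.values.tolist())  # ['S1D1', 'S2D2']
print(long_ch.channel.values.tolist())   # ['S1D2', 'S2D1']
```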

Changing the order of dimensions:

In [None]:
display(ts)
display(ts.transpose("time", "wavelength", "channel"))

Not all dimensions have to be named. Use `...` as a placeholder for the remaining ones:

In [None]:
display(ts.transpose(..., 'wavelength'))

In [None]:
# coordinates are DataArrays, too
display(ts.source)

In [None]:
# iterating over a coordinate yields 0-d DataArrays; often this is not what one wants:
for src in ts.source[:3]:
    print(src)

In [None]:
# use .values to iterate over the plain labels
for src in ts.source.values[:3]:
    print(src)

In [None]:
# accessing .values on a DataArray with units triggers a UnitStrippedWarning
ts.values

In [None]:
# use .pint.dequantify to move units into .attrs and avoid UnitStrippedWarning
ts_no_units = ts.pint.dequantify()
ts_no_units.values

In [None]:
# a single item is still a DataArray with coordinates
ts[0,0,0]

In [None]:
ts[0,0,0].item()

Sometimes it is useful to stack the wavelength and channel dimensions:

In [None]:
ts_flat = ts.stack({"flat_channel" : ["channel", "wavelength"]})
display(ts_flat)

The resulting dimension 'flat_channel' has a `MultiIndex` that combines the former 'channel' and 'wavelength' dimensions and still allows lookups on either of them.

In [None]:
ts_flat.sel(channel="S1D1")

It also allows for easy unstacking:

In [None]:
# first operate on the flattened array then unstack
ts_flat.sel(time=ts_flat.time < 60).unstack()
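
The stack, select and unstack steps can be checked on a small synthetic array. This is a sketch under the assumption that the array has no additional non-index coordinates on the stacked dimensions; `demo` and its values are made up for illustration.

```python
import numpy as np
import xarray as xr

# Synthetic array with the same dimension names as `ts` (illustrative only).
demo = xr.DataArray(
    np.arange(12.0).reshape(2, 2, 3),
    dims=["channel", "wavelength", "time"],
    coords={
        "channel": ["S1D1", "S1D2"],
        "wavelength": [760.0, 850.0],
        "time": [0.0, 0.5, 1.0],
    },
)

# Combine 'channel' and 'wavelength' into one dimension with a MultiIndex.
flat = demo.stack({"flat_channel": ["channel", "wavelength"]})

# Lookups on a single level of the MultiIndex still work.
one_channel = flat.sel(channel="S1D1")

# Operate on the flattened array, then restore the original dimensions.
roundtrip = flat.sel(time=flat.time < 1.0).unstack("flat_channel")
roundtrip = roundtrip.transpose("channel", "wavelength", "time")

print(flat.sizes)
print(one_channel.sizes)
print(roundtrip.sizes)
```

Since unstacking does not necessarily restore the original order of dimensions, the sketch transposes explicitly before comparing against the input.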

In [None]:
# get a zero-filled array with the same dimensions and coordinates
xr.zeros_like(ts)

In [None]:
# rename a dimension
geo3d.rename({geo3d.points.crs : "new_crs"})

In [None]:
# this also works for renaming a coordinate
ts.rename({"samples" : "counter"})

In [None]:
# drop coordinates that are no longer needed
ts.drop_vars(["source", "detector"])