# Data selection using pyfesom2's xarray accessor

## 1. Introduction
A key feature of pyfesom2 is spatial selection on FESOM's unstructured triangular grid, which commonly used Python libraries do not currently support. Xarray provides powerful label-based selection on datasets and variables (DataArrays). The `pyfesom2` accessor provides a similar interface for FESOM data, with some additional convenience features.


Xarray's `sel` method takes dimension names of a dataset or DataArray as arguments for selection. For rectilinear grids this gives an easy interface for selecting arbitrary points and rectangular regions (using slices) in latitude and longitude. On the unstructured FESOM grid, latitude and longitude are not orthogonal to each other and are not dimensions of the dataset; instead they are coordinates on a common dimension, `nod2`. The convenient spatial selection with latitude and longitude as indexers is therefore not possible for FESOM data through the regular `sel(lat=..., lon=...)` method. The `dataset.pyfesom2.select(lat=..., lon=...)` method of `pyfesom2` provides an alternative for such selections. Working in a constrained data environment also lets us add more convenience features, such as selecting arbitrary polygons.
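
The layout described above can be reproduced on a tiny synthetic dataset. The sketch below is not FESOM data: the node positions and the variable name `temp` are made up for illustration, and only plain xarray and NumPy calls are used. It shows that `lat` and `lon` are coordinates on `nod2` rather than dimensions, so a spatial selection has to go through a mask over nodes.

```python
import numpy as np
import xarray as xr

# Five made-up nodes of an unstructured grid (illustrative values only).
lon = np.array([-30.0, -10.0, 0.0, 20.0, 45.0])
lat = np.array([65.0, 70.0, 10.0, 75.0, 62.0])

toy_ds = xr.Dataset(
    {"temp": (("nod2",), np.arange(5, dtype="float32"))},
    coords={"lon": (("nod2",), lon), "lat": (("nod2",), lat)},
)

# lat and lon are coordinates, but the only dimension is nod2.
print(dict(toy_ds.sizes))        # {'nod2': 5}
print("lat" in toy_ds.dims)      # False
print("lat" in toy_ds.coords)    # True

# A bounding box therefore becomes a boolean mask over nodes.
inside = (
    (toy_ds.lon >= -20) & (toy_ds.lon <= 50)
    & (toy_ds.lat >= 60) & (toy_ds.lat <= 80)
)
subset = toy_ds.isel(nod2=np.flatnonzero(inside.values))
print(subset.lon.values)         # longitudes of the nodes inside the box
```

The accessor hides this bookkeeping behind `select`; the mask-based approach here is only meant to show why the regular `sel` interface does not apply directly.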


Because of differences in functionality and arguments, and to minimize confusion with xarray's `sel` method, the accessor's selection methods are prefixed with `select`.

### Load a tutorial dataset


For more examples of loading datasets, see the [datasets](https://nbviewer.jupyter.org/github/FESOM/pyfesom2/blob/master/notebooks/remote_datasets.ipynb) example notebook.

In [1]:
from pyfesom2.datasets import core
fesom_ds = core.load()
fesom_ds

Output (abridged): a dask-backed xarray Dataset with arrays of shape (126858,) float64, (243899, 3) uint32, (144, 126858, 47) float32 (3.43 GB) and (144, 126858) float32 (73.07 MB).


## 2. Region selection

<img src="images/region_selection.png"
     alt="Region Selection" width=30% 
     style="width=10px; float: left; margin-right: 10px;" /> 
     
To select a rectangular region, pass the bounds of the bounding rectangle, defined as `(minlon, minlat, maxlon, maxlat)`, to the `region` argument. To select an arbitrary polygon, pass a Shapely `Polygon` object to `region` instead. The region selection returns reindexed faces corresponding to the selected points. The subsetted data can be saved to disk using regular xarray methods such as `.to_netcdf("path_to_file.nc")`.
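
What "reindexed faces" means can be illustrated with plain NumPy on a toy mesh. This is a minimal sketch, not pyfesom2's implementation: the helper name `subset_region` is hypothetical, and keeping only triangles whose three nodes all fall inside the box is an assumption made for the illustration.

```python
import numpy as np

def subset_region(lon, lat, faces, bbox):
    """Select nodes inside bbox and renumber the triangles among them.

    bbox is (minlon, minlat, maxlon, maxlat); faces holds node indices.
    """
    minlon, minlat, maxlon, maxlat = bbox
    inside = (lon >= minlon) & (lon <= maxlon) & (lat >= minlat) & (lat <= maxlat)

    # Keep only triangles whose three corner nodes are all selected
    # (an assumption of this sketch).
    kept_faces = faces[inside[faces].all(axis=1)]

    # Map old node indices to positions in the subset (-1 marks dropped nodes).
    new_index = np.full(lon.size, -1, dtype=np.int64)
    new_index[inside] = np.arange(inside.sum())
    return np.flatnonzero(inside), new_index[kept_faces]


# Toy mesh: four nodes, two triangles sharing the edge between nodes 1 and 2.
lon = np.array([-30.0, -10.0, 10.0, 30.0])
lat = np.array([70.0, 65.0, 75.0, 70.0])
faces = np.array([[0, 1, 2], [1, 2, 3]])

nodes, new_faces = subset_region(lon, lat, faces, (-20, 60, 50, 80))
print(nodes)      # original indices of the selected nodes
print(new_faces)  # surviving triangles in the new node numbering
```

After the selection, the face indices refer to positions in the subset rather than in the full mesh, which is what makes the subset usable as a standalone dataset.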


Select a rectangular region.

In [2]:
fesom_ds.pyfesom2.select(region=(-20, 60, 50, 80))

Output (abridged): the selection keeps 6699 nodes; arrays have shape (6699,) float64, (144, 6699, 47) float32 (181.36 MB) and (144, 6699) float32 (3.86 MB).


Select a triangular region defined by a Shapely `Polygon`.

In [3]:
from shapely.geometry import Polygon
polygon_region = Polygon([(-70, 30), (-10, 0), (-10, 60)])  # a triangle in the Atlantic
fesom_ds.pyfesom2.select(region=polygon_region)

Output (abridged): the selection keeps 3887 nodes; arrays have shape (3887,) float64, (144, 3887, 47) float32 (105.23 MB) and (144, 3887) float32 (2.24 MB).


Dimensions orthogonal to lat and lon, such as `time` and `nz1`, can also be used as in xarray's `sel` method. Note that, for convenience and unlike xarray, `select` allows mixing arrays as indexers for lat and lon with slices for other dimensions.

In [4]:
fesom_ds.pyfesom2.select(region=polygon_region, time=slice('1950-01-01', '1950-05-02'))

Output (abridged): 3887 nodes and 4 time steps; arrays have shape (3887,) float64, (4, 3887, 47) float32 (2.92 MB) and (4, 3887) float32 (62.19 kB).


## 2. Point selection

<img src="images/point_selection.png"
     alt="Point Selection" width=30% 
     style="width=10px; float: left; margin-right: 10px;" /> 



To select the data closest to a single point in latitude and longitude, supply lat and lon values as indexers to the `select` method, for example:

> `fesom_ds.pyfesom2.select(lon=-20, lat=10)`


To select several points, which may define a transect in latitude and longitude, pass sequences of values of the same size as the lat and lon arguments.

<div class="alert alert-block alert-info"> Note that point selection defaults to nearest-neighbor selection in a geocentric projection. This is equivalent to using the tunnel distance to evaluate the nearest neighbor.</div>
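
The idea behind the note can be sketched with NumPy: longitudes and latitudes are converted to 3-D points on a unit sphere, and the nearest node is the one with the smallest straight-line (chord, or "tunnel") distance. The chord length grows monotonically with the great-circle angle, so both give the same nearest neighbor. The function names below are hypothetical, the search is brute force, and this is an illustration of the technique rather than pyfesom2's actual code.

```python
import numpy as np

def lonlat_to_xyz(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to points on the unit sphere."""
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    return np.stack(
        [np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat)],
        axis=-1,
    )

def nearest_nodes(node_lon, node_lat, query_lon, query_lat):
    """Index of the node closest to each query point by tunnel distance."""
    nodes = lonlat_to_xyz(node_lon, node_lat)                      # (n_nodes, 3)
    queries = np.atleast_2d(lonlat_to_xyz(query_lon, query_lat))   # (n_query, 3)
    # Squared chord length between every query point and every node.
    d2 = ((queries[:, None, :] - nodes[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

# Made-up node positions for illustration.
node_lon = np.array([170.0, -100.0, 0.0, 90.0])
node_lat = np.array([0.0, 0.0, 0.0, 45.0])

# The first query lies just across the date line from node 0: comparing raw
# longitudes would favour node 1, while the geocentric distance picks node 0.
idx = nearest_nodes(node_lon, node_lat, [-175.0, 80.0], [0.0, 50.0])
print(idx)
```

For large meshes a spatial index such as a k-d tree over the same Cartesian coordinates would replace the brute-force distance matrix.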



Define points in latitude and longitude as NumPy arrays and select the closest data.

In [5]:
import numpy as np
sel_lats = np.linspace(-90, 90, 10)
sel_lons = np.linspace(-180, 180, 10)
fesom_ds.pyfesom2.select(lat=sel_lats, lon=sel_lons)

Output (abridged): 10 selected points; arrays have shape (10,) float64, (144, 10, 47) float32 (270.72 kB) and (144, 10) float32 (5.76 kB).



<div class="alert alert-block alert-info">Note that point selection does not return faces among the coordinates: faces contain node indices, which have no relevance after the selection. In the future we may return lat and lon bounds.</div>

Other dimensions can also be passed as indexers, as in xarray's `sel` method. In that case an orthogonal selection is made, i.e., a transect at the selected times and levels.

In [6]:
fesom_ds.pyfesom2.select(lat=sel_lats, lon=sel_lons, nz1=slice(0, -1000))

Output (abridged): 10 selected points and 25 of the 47 levels; arrays have shape (10,) float64, (144, 10, 25) float32 (144.00 kB) and (144, 10) float32 (5.76 kB).


Note that mixing arrays and slices is allowed in the `select` method. For convenience, `select` also takes a `path` argument, which can be a Shapely `LineString`; this achieves the same selection as above but also returns the distance along the path as an additional coordinate.
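
A distance-along-path coordinate of this kind can be computed as a cumulative sum of great-circle segment lengths. The sketch below uses the haversine formula with NumPy; the function name, the Earth radius constant and the choice of kilometres are assumptions for illustration, and pyfesom2 may compute or scale its coordinate differently.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for this sketch

def distance_along_path(lon_deg, lat_deg):
    """Cumulative great-circle distance (km) along a sequence of points."""
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    dlon = np.diff(lon)
    dlat = np.diff(lat)
    # Haversine formula for each pair of consecutive points.
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat[:-1]) * np.cos(lat[1:]) * np.sin(dlon / 2) ** 2)
    segment = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))
    # The first point sits at distance zero.
    return np.concatenate([[0.0], np.cumsum(segment)])

# A quarter circle along the equator followed by a quarter circle to the pole.
dist = distance_along_path([0.0, 90.0, 90.0], [0.0, 0.0, 90.0])
print(dist)
```

Such a coordinate is convenient as the horizontal axis when plotting a transect, since it is monotonic even when latitude and longitude are not.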

### 2.1 Trajectory selection
<img src="images/trajectory_selection.png"
     alt="Trajectory Selection" width=30% 
     style="width=10px; float: left; margin-right: 10px;" /> 


To select points along a transect defined in more than just lat and lon (a trajectory), it is often convenient to use the `select_points` method. Arrays defining a trajectory can be passed directly as arguments to `select_points` to make a trajectory-like selection. The `select` method can also be used for this through xarray's advanced indexing. Apart from this simplification, `select_points` also returns additional diagnostics related to the point selection, such as the distance along the trajectory, which can be useful for plotting. In the future we intend to add more diagnostics around selection, such as error estimates.

In [7]:
import pandas as pd
# Define selection values in time, depth, lat, lon
sel_times = pd.date_range('1950-01-01', freq='M', periods=5)
sel_levels = [0, -10, -5, -30, -15]
sel_lats = np.linspace(-90,90,5) 
sel_lons = np.linspace(-180,180,5)

In [8]:
fesom_ds.pyfesom2.select_points(lon=sel_lons, lat=sel_lats, nz1=sel_levels, time=sel_times)

Output (abridged): 5 trajectory points; all arrays have shape (5,), float64 for the coordinates and float32 for the data variables.


Note that in such a trajectory selection, all indexers need to be of the same size. The returned dataset has the selected indexers on a common dimension, `nod2`. See the accessor_plotting.ipynb example for opinionated plotting of such a transect.
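
The difference between an orthogonal selection and a trajectory-like (pointwise) selection can be seen with plain xarray on a small rectilinear array. The array below is synthetic and unrelated to FESOM; the dimension name `points` is an arbitrary choice for this sketch.

```python
import numpy as np
import xarray as xr

# Toy rectilinear field (not FESOM data): 4 times x 3 depth levels.
toy = xr.DataArray(
    np.arange(12).reshape(4, 3),
    dims=("time", "nz1"),
    coords={"time": [0, 1, 2, 3], "nz1": [0.0, -10.0, -20.0]},
)

# Orthogonal selection: every chosen time is combined with every chosen level.
orthogonal = toy.sel(time=[0, 2, 3], nz1=[0.0, -10.0, -20.0])
print(orthogonal.shape)   # (3, 3)

# Pointwise selection: the indexers share one dimension and are paired
# element by element, which is why they must have the same size.
times = xr.DataArray([0, 2, 3], dims="points")
levels = xr.DataArray([0.0, -10.0, -20.0], dims="points")
pointwise = toy.sel(time=times, nz1=levels)
print(pointwise.values)   # one value per (time, level) pair
```

With three indexer values each, the orthogonal selection returns nine values while the pointwise selection returns three, one per trajectory point.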

## 3. Advanced selections

<img src="images/advanced_selection.png"
     alt="Multi Trajectory Selection" width=30% 
     style="width=10px; float: left; margin-right: 10px;"
 />
Both `select` and `select_points` support xarray's advanced indexing, which can be leveraged to select multiple trajectories at once. To use xarray's advanced indexing, the indexers have to be defined as DataArrays with common dimensions.
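
The shape rule behind this can be checked with plain xarray on a toy array: when all indexers are DataArrays with the same dimensions, the result takes on those dimensions. The data below is synthetic, and the dimension names `ntraj` and `trajectory` are simply labels chosen for the indexers.

```python
import xarray as xr
import numpy as np

# Toy rectilinear field (not FESOM data): 4 times x 5 depth levels.
toy = xr.DataArray(
    np.arange(20).reshape(4, 5),
    dims=("time", "nz1"),
    coords={"time": [0, 1, 2, 3], "nz1": [0.0, -10.0, -20.0, -30.0, -40.0]},
)

# Two trajectories with three points each; both indexers share the same dims.
t_idx = xr.DataArray([[0, 1, 2], [1, 2, 3]], dims=("ntraj", "trajectory"))
z_idx = xr.DataArray(
    [[0.0, -10.0, -20.0], [-40.0, -30.0, -20.0]],
    dims=("ntraj", "trajectory"),
)

picked = toy.sel(time=t_idx, nz1=z_idx)
print(picked.dims)     # ('ntraj', 'trajectory')
print(picked.values)   # one row of values per trajectory
```

Each row of the result corresponds to one trajectory, so a single trajectory can be pulled out afterwards with `picked.isel(ntraj=0)`.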

Define three sample trajectories in lat, lon, time and depth. 

In [9]:
import xarray as xr
lons_ref = np.array([-25., -33., -36., -37., -45])
lons = xr.DataArray([lons_ref, lons_ref-5., lons_ref+5.], dims=('ntraj', 'trajectory'))

lats_ref = [ 2.,  7., 11. , 16., 20.]
lats = xr.DataArray([lats_ref, lats_ref, lats_ref], dims=('ntraj', 'trajectory'))

levs = np.array([[-800., -700., -750., -900., -950.],
                 [-800., -500., -400., -600., -700.],
                 [-800., -900., -750., -600., -700.]])
levs = xr.DataArray(levs, dims=('ntraj', 'trajectory'))

times_ref = pd.date_range('1950-01-01', freq='M', periods=5)
times = xr.DataArray([times_ref, times_ref, times_ref], dims=('ntraj', 'trajectory'))

Select the trajectories at once. Dimension `ntraj` can be used to identify the trajectory.

In [10]:
fesom_ds.pyfesom2.select_points(lon=lons, lat=lats, time=times, nz1=levs)

Output (abridged): 3 trajectories of 5 points each; all arrays have shape (3, 5), float64 for the coordinates and float32 for the data variables.


## 4. Selection on variables

In the examples above, selections were performed on the entire dataset, which is convenient, but the `.pyfesom2` accessor can also be used on individual data variables (DataArrays) of a dataset.

In [11]:
fesom_ds.temp.pyfesom2.select_points(lon=sel_lons, lat=sel_lats, nz1=sel_levels).compute()


Here `select_points` is used on the variable `temp` to select a transect defined by `sel_lons`, `sel_lats` and `sel_levels`.

<div class="alert alert-block alert-info" style="font-size:120%"> 
Region selection on a data variable returns an xarray Dataset. This is unlike selection on data variables in xarray, which always returns a DataArray. Returning a Dataset for region selection is necessary to retain the `faces` coordinate variable, which could not otherwise be retained in a DataArray. The `faces` coordinate variable is needed for spatial plots and for saving the region subset as a standalone dataset.
</div>    