# Notes - Xarray

## Terminology

- DataArray: A multi-dimensional array with labeled (named) dimensions. On top of the raw values, a DataArray carries metadata: dimension names, coordinates, and attributes. For example, a variable var(time, level, lat, lon).
- Dataset: A dict-like collection of DataArray objects with aligned dimensions. For example, a dataset containing temperature(time, level, lat, lon) and precipitation(time, lat, lon).
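
The two terms above can be illustrated with a minimal sketch. The coordinate values and the all-zero data are assumptions made up for illustration; only the variable names and dimension layout come from the definitions.

```python
import numpy as np
import xarray as xr

# Hypothetical coordinate values, chosen only for illustration
time = np.arange(2)
level = np.array([850.0, 500.0])
lat = np.linspace(-60.0, 60.0, 3)
lon = np.linspace(0.0, 270.0, 4)

# Two DataArrays that share the time, lat, and lon dimensions
temperature = xr.DataArray(
    np.zeros((2, 2, 3, 4)),
    dims=["time", "level", "lat", "lon"],
    coords={"time": time, "level": level, "lat": lat, "lon": lon},
)
precipitation = xr.DataArray(
    np.zeros((2, 3, 4)),
    dims=["time", "lat", "lon"],
    coords={"time": time, "lat": lat, "lon": lon},
)

# A Dataset maps variable names to DataArrays, much like a dict
ds = xr.Dataset({"temperature": temperature, "precipitation": precipitation})

print(ds["temperature"].dims)
print(ds["precipitation"].dims)
print(dict(ds.sizes))
```

Variables in a Dataset do not need the same set of dimensions; the dimensions they do share must line up on the same coordinates.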

## References

- Unidata Xarray Introduction, https://unidata.github.io/python-training/workshop/XArray/xarray-introduction/
- Xarray quick overview, https://docs.xarray.dev/en/stable/getting-started-guide/quick-overview.html
- Xarray computation, https://docs.xarray.dev/en/stable/user-guide/computation.html


In [66]:
import numpy as np
import xarray as xr
import io, os, sys, types

## Create a DataArray

Xarray - https://docs.xarray.dev/en/stable/getting-started-guide/quick-overview.html#create-a-dataarray

Unidata - https://unidata.github.io/python-training/workshop/XArray/xarray-introduction/#DataArray

In [67]:
#--- Create some sample "temperature" data
data = 283 + 5 * np.random.randn(5, 3, 4)

time = np.arange(0,5)
lat = np.linspace(-120., 60., 3)
lon = np.linspace(25.,55.,4)

#--- create a DataArray & set attributes
temp = xr.DataArray(data, dims=['time', 'lat', 'lon'], coords=[time, lat, lon])

temp.attrs['units'] = "K"
temp.attrs['long_name'] = "Temperature"

with xr.set_options(keep_attrs=True):  # keep attributes after the operation
    temp_degC = temp - 273.15
temp_degC.attrs['units'] = "C"
temp_degC
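
As a small check on the cell above, the sketch below rebuilds a comparable array (random values, so the numbers will differ) and confirms that the attributes survive the subtraction inside the `keep_attrs=True` context. It also shows the per-call `keep_attrs` argument that reductions such as `mean` accept. The name `time_mean` is introduced here only for illustration.

```python
import numpy as np
import xarray as xr

temp = xr.DataArray(
    283 + 5 * np.random.randn(5, 3, 4),
    dims=["time", "lat", "lon"],
    attrs={"units": "K", "long_name": "Temperature"},
)

# Option 1: context manager, applies to every operation inside the block
with xr.set_options(keep_attrs=True):
    temp_degC = temp - 273.15

# The result has its own attrs dict, so editing it leaves temp untouched
temp_degC.attrs["units"] = "C"
print(temp.attrs["units"], temp_degC.attrs["units"])

# Option 2: per-call flag on a reduction
time_mean = temp.mean(dim="time", keep_attrs=True)
print(time_mean.attrs)
```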

## Selection

Unidata - https://unidata.github.io/python-training/workshop/XArray/xarray-introduction/#Selection

### Selection Method 1: use indexing

In [68]:
#--- Method 1: use indexing
var = temp[0, 1:2, :]
var
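
Positional indexing depends on remembering the order of the dimensions. A minimal sketch of the same selection written with `isel`, which takes integer positions by dimension name, is shown below. The array is rebuilt with random values so the block runs on its own; `by_position` and `by_name` are illustrative names.

```python
import numpy as np
import xarray as xr

temp = xr.DataArray(
    283 + 5 * np.random.randn(5, 3, 4),
    dims=["time", "lat", "lon"],
    coords=[np.arange(0, 5), np.linspace(-120.0, 60.0, 3), np.linspace(25.0, 55.0, 4)],
)

# NumPy-style indexing: the position of each index determines the dimension
by_position = temp[0, 1:2, :]

# isel: integer positions again, but the dimensions are named explicitly
by_name = temp.isel(time=0, lat=slice(1, 2))

print(by_position.dims, by_position.shape)
print(by_name.identical(by_position))
```

Note that the slice `1:2` keeps `lat` as a length-1 dimension, whereas the scalar index `0` drops `time` from the result's dimensions.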

### Selection Method 2: use named dimensions & slicing

In [69]:
#--- Method 2: use named dimensions
print(temp.coords)  # inspect the coordinate variables

#--- select specific values in coordinates
var = temp.sel(time=1, lat=-30., lon=25)
print('------------')
print(var)

var = temp.sel(time=1, lon=25)
print('------------')
print(var)

#--- Slicing with Selection
var = temp.sel(time=slice(0,2), lat=-30., lon=slice(-1000.,1000.))
print('------------')
print(var)

Coordinates:
  * time     (time) int64 0 1 2 3 4
  * lat      (lat) float64 -120.0 -30.0 60.0
  * lon      (lon) float64 25.0 35.0 45.0 55.0
------------
<xarray.DataArray ()>
array(287.35977527)
Coordinates:
    time     int64 1
    lat      float64 -30.0
    lon      float64 25.0
Attributes:
    units:      K
    long_name:  Temperature
------------
<xarray.DataArray (lat: 3)>
array([283.15851494, 287.35977527, 281.09325351])
Coordinates:
    time     int64 1
  * lat      (lat) float64 -120.0 -30.0 60.0
    lon      float64 25.0
Attributes:
    units:      K
    long_name:  Temperature
------------
<xarray.DataArray (time: 3, lon: 4)>
array([[284.89077439, 291.46640506, 281.36236479, 285.02193589],
       [287.35977527, 284.30500622, 284.78217759, 284.89501531],
       [280.96763131, 293.1217846 , 283.01422977, 278.15127847]])
Coordinates:
  * time     (time) int64 0 1 2
    lat      float64 -30.0
  * lon      (lon) float64 25.0 35.0 45.0 55.0
Attributes:
    units:      K
    long_name:  Temperature
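
`sel` matches labels exactly by default, so a value that is not on the grid raises an error. The sketch below, which rebuilds the array with random values, shows the `method="nearest"` option for that case and confirms that label-based slices include both endpoints. The requested values -25.0 and 33.0 are arbitrary off-grid points picked for illustration.

```python
import numpy as np
import xarray as xr

# Same coordinate layout as the temp array used above
temp = xr.DataArray(
    283 + 5 * np.random.randn(5, 3, 4),
    dims=["time", "lat", "lon"],
    coords=[np.arange(0, 5), np.linspace(-120.0, 60.0, 3), np.linspace(25.0, 55.0, 4)],
)

# An exact-match lookup fails when the label is not on the grid
try:
    temp.sel(lat=-25.0)
except KeyError:
    print("lat=-25.0 is not a coordinate value")

# method="nearest" snaps each requested label to the closest grid point
nearest = temp.sel(lat=-25.0, lon=33.0, method="nearest")
print(float(nearest.lat), float(nearest.lon))

# Label-based slices include both endpoints, unlike positional slices
subset = temp.sel(time=slice(0, 2))
print(subset.sizes["time"], temp[0:2].sizes["time"])
```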

### Selection Method 3: use .loc

In [70]:
#*** Useful when the label range for each coordinate is already known

# temp is temp(time, lat, lon)
var = temp.loc[0:4, -120:30, :]
print(var)

<xarray.DataArray (time: 5, lat: 2, lon: 4)>
array([[[277.24715421, 285.4167252 , 282.77669352, 287.99049392],
        [284.89077439, 291.46640506, 281.36236479, 285.02193589]],

       [[283.15851494, 286.26781164, 283.24297013, 283.05391983],
        [287.35977527, 284.30500622, 284.78217759, 284.89501531]],

       [[276.02595033, 283.66636883, 276.76042649, 275.57431526],
        [280.96763131, 293.1217846 , 283.01422977, 278.15127847]],

       [[288.44392064, 286.14862726, 283.10960537, 290.72594854],
        [271.52238872, 285.93310275, 281.97704154, 277.67534916]],

       [[280.57054103, 281.02944126, 290.51813653, 286.0801705 ],
        [289.66821092, 281.04924208, 275.10471027, 284.43271723]]])
Coordinates:
  * time     (time) int64 0 1 2 3 4
  * lat      (lat) float64 -120.0 -30.0
  * lon      (lon) float64 25.0 35.0 45.0 55.0
Attributes:
    units:      K
    long_name:  Temperature
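
Like positional indexing, `.loc` with a tuple relies on dimension order. As a sketch, the same label-based selection can be written with `sel`, or with a dict passed to `.loc`, so that each slice is tied to a dimension name. The array is rebuilt with random values so the block is self-contained; `via_loc`, `via_sel`, and `via_loc_dict` are illustrative names.

```python
import numpy as np
import xarray as xr

temp = xr.DataArray(
    283 + 5 * np.random.randn(5, 3, 4),
    dims=["time", "lat", "lon"],
    coords=[np.arange(0, 5), np.linspace(-120.0, 60.0, 3), np.linspace(25.0, 55.0, 4)],
)

# Tuple form: slices are matched to dimensions by position (time, lat, lon)
via_loc = temp.loc[0:4, -120:30, :]

# sel form: the same label slices, matched by dimension name
via_sel = temp.sel(time=slice(0, 4), lat=slice(-120, 30))

# Dict form of .loc: omitted dimensions are kept whole
via_loc_dict = temp.loc[dict(lat=slice(-120, 30))]

print(via_loc.sizes["lat"])
print(via_loc.identical(via_sel), via_loc.identical(via_loc_dict))
```

The upper bound 30 is not a latitude on the grid; the label slice simply keeps every value between -120 and 30, which here means -120.0 and -30.0.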


## Computation

## Plotting