# General Dataset Utilities

Author: [Tom Vo](https://github.com/tomvothecoder/) 

Date: 05/26/22

This notebook demonstrates general utility methods available in `xcdat`: reorienting the
longitude axis, centering time coordinates using time bounds, and adding and retrieving
bounds.

In [1]:
import xcdat

## Open a multi-file dataset

In [2]:
dataset_links = [
    "https://esgf-data2.llnl.gov/thredds/dodsC/user_pub_work/E3SM/1_0/amip_1850_aeroF/1deg_atm_60-30km_ocean/atmos/180x360/time-series/mon/ens2/v3/TS_187001_189412.nc",
    "https://esgf-data2.llnl.gov/thredds/dodsC/user_pub_work/E3SM/1_0/amip_1850_aeroF/1deg_atm_60-30km_ocean/atmos/180x360/time-series/mon/ens2/v3/TS_189501_191912.nc",
]


In [3]:
ds = xcdat.open_mfdataset(dataset_links)

In [4]:
ds

Dask array summaries in the output (one row per array):

| Array bytes | Chunk bytes | Array shape | Chunk shape | Count | Type |
|---|---|---|---|---|---|
| 2.81 kiB | 2.81 kiB | (180, 2) | (180, 2) | 5 Tasks, 1 Chunks | float64, numpy.ndarray |
| 5.62 kiB | 5.62 kiB | (360, 2) | (360, 2) | 5 Tasks, 1 Chunks | float64, numpy.ndarray |
| 1.41 kiB | 1.41 kiB | (180,) | (180,) | 5 Tasks, 1 Chunks | float64, numpy.ndarray |
| 9.38 kiB | 4.69 kiB | (600, 2) | (300, 2) | 6 Tasks, 2 Chunks | object, numpy.ndarray |
| 506.25 kiB | 506.25 kiB | (180, 360) | (180, 360) | 5 Tasks, 1 Chunks | float64, numpy.ndarray |
| 148.32 MiB | 74.16 MiB | (600, 180, 360) | (300, 180, 360) | 6 Tasks, 2 Chunks | float32, numpy.ndarray |


## Reorient the longitude axis from (0, 360) to (-180, 180)

* API: ``xcdat.swap_lon_axis()``
* Alternative solution: ``xcdat.open_mfdataset(dataset_links, lon_orient=(-180, 180))``


In [5]:
ds.lon

In [6]:
ds2 = xcdat.swap_lon_axis(ds, to=(-180, 180))

In [7]:
ds2.lon
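
The reorientation from (0, 360) to (-180, 180) comes down to modular arithmetic on the longitude values. The following is a minimal sketch in plain Python, not xcdat's implementation; the helper names `to_minus180_180` and `swap_and_sort` and the four-point grid are hypothetical and chosen only for illustration. It also sorts the converted values, on the assumption that a monotonically increasing axis is wanted afterwards.

```python
def to_minus180_180(lon_deg):
    """Map a longitude in degrees east onto the half-open interval [-180, 180)."""
    return ((lon_deg + 180.0) % 360.0) - 180.0


def swap_and_sort(lons):
    """Convert every longitude, then sort so the axis stays monotonic.

    Returns the new longitudes and the original index of each one.
    """
    converted = [to_minus180_180(lon) for lon in lons]
    order = sorted(range(len(converted)), key=lambda i: converted[i])
    return [converted[i] for i in order], order


# Hypothetical 90-degree grid with cell centers at 45, 135, 225 and 315 degrees east.
new_lons, order = swap_and_sort([45.0, 135.0, 225.0, 315.0])
print(new_lons)  # [-135.0, -45.0, 45.0, 135.0]
print(order)     # [2, 3, 0, 1]
```

The `order` list records where each new longitude came from, so the same permutation could be applied to any data variable along its longitude dimension.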

## Center the time coordinates

* API: ``ds.temporal.center_times()``
* Alternative solution: ``xcdat.open_mfdataset(dataset_links, center_times=True)``

In [8]:
ds.time

In [9]:
ds3 = ds.temporal.center_times()

In [10]:
ds3.time
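
Centering a time coordinate using its bounds means replacing each time value with the midpoint of its lower and upper bound. The sketch below illustrates that arithmetic with the standard library only. It is not xcdat's implementation: the `midpoint` helper and the two monthly intervals are hypothetical, and it uses `datetime` (proleptic Gregorian calendar) rather than the calendar-aware time objects a model dataset may carry.

```python
from datetime import datetime


def midpoint(lower, upper):
    """Return the instant halfway between two datetimes."""
    return lower + (upper - lower) / 2


# Hypothetical monthly time bounds (start, end) for January and February 1870.
time_bnds = [
    (datetime(1870, 1, 1), datetime(1870, 2, 1)),
    (datetime(1870, 2, 1), datetime(1870, 3, 1)),
]

centered = [midpoint(lo, hi) for lo, hi in time_bnds]
for t in centered:
    print(t.isoformat())
# 1870-01-16T12:00:00
# 1870-02-15T00:00:00
```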

## Add bounds

* API: ``ds.bounds.add_bounds("time")``
* Alternative solution: ``xcdat.open_mfdataset(dataset_links, add_bounds=True)``, assuming the file doesn't already have time bounds

In [11]:
# We are dropping the existing bounds to demonstrate adding bounds.
ds4 = ds.drop_vars("time_bnds")

In [12]:
try:
    ds4.bounds.get_bounds("time")
except KeyError as e:
    print(e)

'T bounds were not found, they must be added.'


In [13]:
# A `width` kwarg can be specified. It is the width of the bounds relative to
# the position of the nearest points. The default value is 0.5.
ds4 = ds4.bounds.add_bounds("time", width=0.5)

In [14]:
ds4.bounds.get_bounds("time")
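
To make the `width` idea concrete, here is one plausible way to derive bounds from coordinate points, written in plain Python. It is a sketch under stated assumptions, not xcdat's actual code: the `make_bounds` helper is hypothetical, it assumes at least two monotonic points, and it assumes the edge points reuse the spacing of their nearest neighbor. With `width=0.5`, each bound falls halfway between adjacent points.

```python
def make_bounds(points, width=0.5):
    """Build (lower, upper) bounds around each coordinate point.

    Assumes at least two monotonic points. The spacing on each side of a
    point is the distance to its neighbor; the first and last points reuse
    the nearest interior spacing.
    """
    diffs = [b - a for a, b in zip(points, points[1:])]
    # Pad so every point has a spacing to its left and to its right.
    diffs = [diffs[0]] + diffs + [diffs[-1]]
    return [
        (p - width * diffs[i], p + (1 - width) * diffs[i + 1])
        for i, p in enumerate(points)
    ]


# Hypothetical latitude cell centers on a 1-degree grid.
lat = [-89.5, -88.5, -87.5]
print(make_bounds(lat))
# [(-90.0, -89.0), (-89.0, -88.0), (-88.0, -87.0)]
```

Each upper bound equals the next lower bound here, which gives contiguous cells.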

## Add missing bounds for all axes supported by xcdat (X, Y, T, Z)

In [15]:
# We drop the dataset's axis bounds to demonstrate generating missing bounds.
ds5 = ds.drop_vars(["time_bnds", "lat_bnds", "lon_bnds"])

In [16]:
ds5

Dask array summaries in the output (one row per array):

| Array bytes | Chunk bytes | Array shape | Chunk shape | Count | Type |
|---|---|---|---|---|---|
| 1.41 kiB | 1.41 kiB | (180,) | (180,) | 5 Tasks, 1 Chunks | float64, numpy.ndarray |
| 506.25 kiB | 506.25 kiB | (180, 360) | (180, 360) | 5 Tasks, 1 Chunks | float64, numpy.ndarray |
| 148.32 MiB | 74.16 MiB | (600, 180, 360) | (300, 180, 360) | 6 Tasks, 2 Chunks | float32, numpy.ndarray |


In [17]:
ds5 = ds5.bounds.add_missing_bounds(width=0.5)

In [18]:
ds5

Dask array summaries in the output (one row per array):

| Array bytes | Chunk bytes | Array shape | Chunk shape | Count | Type |
|---|---|---|---|---|---|
| 1.41 kiB | 1.41 kiB | (180,) | (180,) | 5 Tasks, 1 Chunks | float64, numpy.ndarray |
| 506.25 kiB | 506.25 kiB | (180, 360) | (180, 360) | 5 Tasks, 1 Chunks | float64, numpy.ndarray |
| 148.32 MiB | 74.16 MiB | (600, 180, 360) | (300, 180, 360) | 6 Tasks, 2 Chunks | float32, numpy.ndarray |
