# Analysis and Visualization of E3SM Data using UXarray (Experimental)

E3SM Tutorial Workshop 2024

05/07/2024

Authors: [Tom Vo](https://github.com/tomvothecoder) and [Stephen Po-Chedley](https://github.com/pochedls)


## Overview

This exercise notebook will walk you through the core UXarray data models and a few
analysis and visualization features to help you gain practical hands-on experience. Please visit the [UXarray documentation](https://uxarray.readthedocs.io/en/stable/) for more information on all of the available features.

**Please note, UXarray is actively being developed, hence this is an "experimental"
notebook exploring some of the current features and capabilities.**

### Sections

1. Prerequisite: Set up the Conda Environment and select the Python Kernel for this notebook
2. Setup Code
3. Open E3SM Data with Grid Files
4. View Grid Information
5. Visualize Grid Topology
6. Face Area Calculations
7. Visualize Data as Polygons
8. Interoperability with xCDAT

### Helpful Prerequisite Knowledge

If you are unfamiliar with the topics below and would like to learn more, follow the
links for more information.

- [Differences between structured and unstructured grids](https://github.com/ProjectPythia/unstructured-grid-viz-cookbook/blob/main/notebooks/01-intro/01-unstructured-grid-overview.ipynb)
- [Data Mapping](https://github.com/ProjectPythia/unstructured-grid-viz-cookbook/blob/main/notebooks/01-intro/03-data-mapping.ipynb)
- [Plotting Libraries](https://github.com/ProjectPythia/unstructured-grid-viz-cookbook/blob/main/notebooks/02-methods/01-plotting-libraries.ipynb)
- [Rendering Techniques](https://github.com/ProjectPythia/unstructured-grid-viz-cookbook/blob/main/notebooks/02-methods/02-rendering-techniques.ipynb)

### Resources

- [UXarray documentation](https://uxarray.readthedocs.io/en/stable/)
- This notebook was adapted from the [UXarray Usage Examples](https://uxarray.readthedocs.io/en/stable/examples.html) and the [Project Pythia Notebooks](https://projectpythia.org/unstructured-grid-viz-cookbook/README.html).


## Prerequisite: Set up the Conda Environment and select the Python Kernel for this notebook

1. Open a terminal in Jupyter Hub.
2. Run the commands below to add the kernel to NERSC Jupyter Hub.
   ```bash
   module load conda
   conda create --name uxarray_practicum -c conda-forge python uxarray spatialpandas antimeridian ipykernel

   # Activate the new environment so the kernel is registered with its Python.
   conda activate uxarray_practicum

   python -m ipykernel install \
   --user --name uxarray_practicum --display-name uxarray_practicum
   ```
    &mdash; <cite>https://docs.nersc.gov/services/jupyter/how-to-guides/#how-to-use-a-conda-environment-as-a-python-kernel</cite>

3. Refresh this page.
4. Select the kernel for this notebook by clicking the current kernel in the top-right
   (where it says NERSC Python in the screenshot).

   <img src="kernel-instructions-1.png" width=500px/>

5. Select `uxarray_practicum` from the list of environments.

   <img src="kernel-instructions-3.png" width=500px/>


## Setup Code


In [None]:
import glob

import numpy as np
import xarray as xr
import uxarray as ux

# Glob pattern matching the NetCDF history files in the data directory.
data_dir = "/global/cfs/cdirs/e3sm/www/Tutorials/2024/simulations/extendedOutput.v3.LR.historical_0101/archive/atm/hist/*.h0.*.nc"

# The absolute paths to each matching NetCDF file, sorted so the order is deterministic.
data_paths = sorted(glob.glob(data_dir))

# The path to the grid file.
grid_path = "/global/cfs/cdirs/e3sm/diagnostics/grids/ne30pg2.nc"
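
`glob.glob` does not guarantee any particular ordering of its results, so taking "the first few files" from the list is only reproducible if the paths are sorted first. Here is a minimal sketch of that idea, using temporary placeholder files with made-up names (not the real E3SM history files):

```python
import glob
import os
import tempfile

# Hypothetical file names that mimic monthly history files.
names = ["run.h0.2000-03.nc", "run.h0.2000-01.nc", "run.h0.2000-02.nc"]

with tempfile.TemporaryDirectory() as tmp_dir:
    for name in names:
        # Create empty placeholder files; no NetCDF content is needed here.
        open(os.path.join(tmp_dir, name), "w").close()

    # Sorting makes the order deterministic (here, chronological by name).
    pattern = os.path.join(tmp_dir, "*.h0.*.nc")
    sorted_names = [os.path.basename(p) for p in sorted(glob.glob(pattern))]

print(sorted_names)
```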

## First off, what are `ux.UxDataset`, `ux.UxDataArray`, and `ux.Grid` objects?

- A [ux.UxDataset](https://uxarray.readthedocs.io/en/stable/user_api/generated/uxarray.UxDataset.html#uxarray.UxDataset) object is an `xarray.Dataset`-like, multi-dimensional, in-memory array database. This object inherits from `xarray.Dataset` and has its own unstructured grid-aware dataset operators and attributes through the `uxgrid` accessor.
- A [ux.UxDataArray](https://uxarray.readthedocs.io/en/stable/user_api/generated/uxarray.UxDataArray.html) object is an N-dimensional, `xarray.DataArray`-like array. It inherits from `xarray.DataArray` and has its own unstructured grid-aware array operators and attributes through the `uxgrid` accessor.
- A [ux.Grid](https://uxarray.readthedocs.io/en/stable/user_api/generated/uxarray.Grid.html#) object represents a two-dimensional unstructured grid encoded following the UGRID conventions and provides grid-specific functionality.
  - It can be used standalone to work with unstructured grids, or it can be paired with either a `ux.UxDataArray` or `ux.UxDataset` and accessed through the `.uxgrid` attribute.
  - For constructing a grid from non-UGRID datasets or other types of supported data, see the `ux.open_grid` function or the specific class methods (`Grid.from_dataset`, `Grid.from_face_vertices`, etc.).


## Open E3SM Dataset with Grid Files using UXarray

When working with unstructured grids, the grid definition and the data variables are often stored in separate files. These files need to be read and linked together to represent the entire dataset.


#### 💻 Your turn:

Use `ux.open_mfdataset()` to open the grid file and the first two netCDF files as a `ux.UxDataset` object. We are only loading the first two netCDF files for the purpose of this notebook because the dataset is large
in its entirety.

- Documentation: https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.open_mfdataset.html
- Hint: Use `grid_path` and `data_paths[0:2]` as function arguments.


In [None]:
# Your code here. When ready, click on the three dots below for the solution.

In [None]:
uxds = ux.open_mfdataset(grid_path, data_paths[0:2])

#### 💻 Your turn:

Access the `TREFHT` variable by indexing the `UxDataset` object to obtain a `UxDataArray` object.


In [None]:
# Your code here. When ready, click on the three dots below for the solution.

In [None]:
uxds["TREFHT"]

## Viewing Grid Information


#### 💻 Your turn:

View the grid information stored in `uxds` through the `uxgrid` attribute.

- Documentation: https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.UxDataset.uxgrid.html


In [None]:
# Your code here. When ready, click on the three dots below for the solution.

In [None]:
grid = uxds.uxgrid
grid

#### Grid Attributes

If our input grid contained additional attributes that were not representable by the UGRID conventions, they would be stored here.


In [None]:
grid.parsed_attrs

#### Grid Coordinates

The coordinates by default are represented in terms of longitude and latitude.


Documentation:

- https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.Grid.node_lon.html
- https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.Grid.node_lat.html


In [None]:
grid.node_lon

In [None]:
grid.node_lat

If you wish to use the Cartesian coordinate system, you can access the following attributes, which will internally construct a set of Cartesian coordinates derived from the previous set.

Documentation:

- https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.Grid.node_x.html
- https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.Grid.node_y.html
- https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.Grid.node_z.html


In [None]:
grid.node_x

In [None]:
grid.node_y

In [None]:
grid.node_z
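
The relationship between the two coordinate systems can be sketched with the standard spherical-to-Cartesian formulas on a unit sphere. The helper name `lonlat_to_xyz` is hypothetical, and this is an illustration of the math rather than UXarray's internal implementation:

```python
import math


def lonlat_to_xyz(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to Cartesian (x, y, z) on a unit sphere."""
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    return x, y, z


# A point on the equator at 90 degrees east lies on the +y axis.
x, y, z = lonlat_to_xyz(90.0, 0.0)
print(round(x, 12), round(y, 12), round(z, 12))

# Every converted point has unit length, because it lies on the unit sphere.
px, py, pz = lonlat_to_xyz(-120.0, 35.0)
print(math.sqrt(px**2 + py**2 + pz**2))
```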

#### Grid Connectivity

Connectivity variables describe how the geometric elements of a grid (nodes, edges, faces) are connected to one another to represent the topology of the unstructured grid.

As described in the UGRID conventions, these connectivity variables are stored as integer arrays and may contain a Fill Value. UXarray standardizes both of these at the data loading step, meaning that the data type and fill value can always be guaranteed to be the following:



In [None]:
ux.INT_DTYPE

In [None]:
ux.INT_FILL_VALUE

Below we can see how to access these connectivity variables.


In [None]:
grid.face_node_connectivity

In [None]:
grid.n_nodes_per_face
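
To illustrate why a fill value is needed, the sketch below builds a tiny, made-up mesh with one quadrilateral and one triangle. Because the connectivity is stored as a rectangular array, the shorter row is padded. The value `-1` is only a stand-in for this example; in UXarray the actual fill value is whatever `ux.INT_FILL_VALUE` reports.

```python
# Stand-in fill value for this sketch only; UXarray exposes its own as ux.INT_FILL_VALUE.
FILL = -1

# Node coordinates (lon, lat) in degrees for a tiny, made-up mesh.
node_lon = [0.0, 10.0, 10.0, 0.0, 20.0]
node_lat = [0.0, 0.0, 10.0, 10.0, 5.0]

# One quadrilateral and one triangle; the triangle row is padded with FILL
# so that every row has the same length.
face_node_connectivity = [
    [0, 1, 2, 3],
    [1, 4, 2, FILL],
]

# Count the valid (non-fill) node indices in each row.
n_nodes_per_face = [
    sum(1 for node in face if node != FILL) for face in face_node_connectivity
]
print(n_nodes_per_face)  # [4, 3]

# Look up the corner coordinates of the triangle, skipping the padding.
triangle = [
    (node_lon[n], node_lat[n]) for n in face_node_connectivity[1] if n != FILL
]
print(triangle)
```

Code that loops over the nodes of a face has to skip the fill entries in the same way, which is why a guaranteed fill value is convenient.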

## Visualize the Grid Topology


#### Using the `Grid.plot()` Accessor

Each Grid object is initialized with a plotting accessor, which enables plotting routines to be called directly on the object. By default, calling `.plot()` on a `Grid` instance plots all the edges of a grid.

All of the plotting methods are built around the Holoviews package, so you can select between Matplotlib and Bokeh backends if desired (Bokeh is the default and is suggested).


#### 💻 Your turn:

Extract the grid topology from the `grid` and plot it with `height=350` and `width=700`.

- Documentation: https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.Grid.plot.html


In [None]:
# Your code here. When ready, click on the three dots below for the solution.

In [None]:
grid.plot(title="Default Grid Plot Method", height=350, width=700)

## Face Area Calculations

This section covers the different area calculation options provided by `uxarray`.
Note that this is only a subset of the available options.


#### 💻 Your turn:

Calculate the total face area for the grid.

- Documentation: https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.Grid.calculate_total_face_area.html
- Hint: Use `.calculate_total_face_area()`


In [None]:
# Your code here. When ready, click on the three dots below for the solution.

In [None]:
t4_area = grid.calculate_total_face_area()
t4_area

#### 💻 Your turn:

Calculate the total face area using the triangular quadrature rule with an order of 1.

- Documentation: https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.Grid.calculate_total_face_area.html
- Hint: Use `.calculate_total_face_area()` with the `quadrature_rule` and `order` arguments.

Order:

```
   1 to 10              for gaussian
   1, 4, 8, 10 and 12   for triangular
```


In [None]:
# Your code here. When ready, click on the three dots below for the solution.

In [None]:
t1_area = grid.calculate_total_face_area(quadrature_rule="triangular", order=1)
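
In numerical quadrature generally, a higher order means the integrand is sampled at more points, which usually reduces the error. As a rough one-dimensional illustration (this is not UXarray's face-area code), the sketch below applies Gauss-Legendre rules with 1, 2, and 3 points to the integral of `cos(x)` over `[-1, 1]`, using the standard tabulated nodes and weights:

```python
import math

# Standard Gauss-Legendre nodes and weights on [-1, 1] for 1, 2, and 3 points.
RULES = {
    1: ([0.0], [2.0]),
    2: ([-1 / math.sqrt(3), 1 / math.sqrt(3)], [1.0, 1.0]),
    3: ([-math.sqrt(3 / 5), 0.0, math.sqrt(3 / 5)], [5 / 9, 8 / 9, 5 / 9]),
}


def gauss_legendre(func, n_points):
    """Approximate the integral of func over [-1, 1] with an n-point rule."""
    nodes, weights = RULES[n_points]
    return sum(w * func(x) for x, w in zip(nodes, weights))


exact = 2 * math.sin(1.0)  # Analytic value of the integral of cos(x) on [-1, 1].
errors = {n: abs(gauss_legendre(math.cos, n) - exact) for n in RULES}

for n, err in errors.items():
    print(f"{n}-point rule: absolute error = {err:.2e}")
```

The error shrinks as points are added, which is the same trade-off between cost and accuracy that the `order` argument exposes.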

#### 💻 Your turn:

View the individual face areas using `Grid.face_areas`.

- Documentation: https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.Grid.face_areas.html


In [None]:
# Your code here. When ready, click on the three dots below for the solution.

In [None]:
grid.face_areas

#### 💻 Your turn:

Calculate the face areas using `Grid.compute_face_areas()` and get the sum of all the face areas.
Use `quadrature_rule="gaussian"` and `order=4`.

- Documentation: https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.Grid.compute_face_areas.html
- Hint: `compute_face_areas()` returns two arrays: (1) the area of each face in the mesh and (2) the Jacobian of each face in the mesh. You only need the first array; call `sum()` on it.


In [None]:
# Your code here. When ready, click on the three dots below for the solution.

In [None]:
all_face_areas, all_face_jacobians = grid.compute_face_areas(
    quadrature_rule="gaussian", order=4
)
g4_area = all_face_areas.sum()
g4_area

Now we compare each computed value with the known surface area of a unit sphere (4π) and report the absolute error for each of the three cases above.

Just execute the cell below to view the outputs.


In [None]:
actual_area = 4 * np.pi
diff_t4_area = np.abs(t4_area - actual_area)
diff_t1_area = np.abs(t1_area - actual_area)
diff_g4_area = np.abs(g4_area - actual_area)

diff_t1_area, diff_t4_area, diff_g4_area
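
The reference value of 4π can also be checked independently on a very coarse mesh. The sketch below covers the unit sphere with the eight spherical triangles of an octahedron and computes each area from its spherical excess (Girard's theorem: the sum of the corner angles minus π). This is a different technique from the quadrature rules used by UXarray, and the helper names are hypothetical.

```python
import itertools
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def corner_angle(a, b, c):
    """Angle at vertex a between the great-circle arcs a-b and a-c."""
    # Project b and c onto the plane tangent to the sphere at a.
    tb = [bi - dot(a, b) * ai for ai, bi in zip(a, b)]
    tc = [ci - dot(a, c) * ai for ai, ci in zip(a, c)]
    cos_angle = dot(tb, tc) / math.sqrt(dot(tb, tb) * dot(tc, tc))
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def spherical_triangle_area(a, b, c):
    """Area of a unit-sphere triangle from its spherical excess."""
    angles = corner_angle(a, b, c) + corner_angle(b, c, a) + corner_angle(c, a, b)
    return angles - math.pi


# Eight triangles of an octahedron: one vertex on each of the x, y, and z axes.
faces = [
    ((sx, 0.0, 0.0), (0.0, sy, 0.0), (0.0, 0.0, sz))
    for sx, sy, sz in itertools.product((1.0, -1.0), repeat=3)
]

total_area = sum(spherical_triangle_area(*face) for face in faces)
print(total_area, 4 * math.pi)
```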

## Visualizing E3SM Data as Polygons

Polygon plotting is the primary method for visualizing face-centered data variables in UXarray.

 <div class="alert alert-block alert-info">
<b>Info:</b> UXarray’s Plotting API is built around the <a href="https://holoviews.org/">Holoviews</a> package. For details about customization and accepted parameters, please refer to their documentation.
</div>


#### 💻 Your turn:

Visualize the first time step of the `"TREFHT"` variable using polygons,
with `line_width=0.1` and `title="Vector Polygon Plot"`.

- Documentation: https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.UxDataArray.plot.polygons.html
- Hint: Use `isel` on the `time` dimension, then `plot.polygons`.


In [None]:
# Your code here. When ready, click on the three dots below for the solution.

In [None]:
uxds["TREFHT"].isel(time=0).plot.polygons(line_width=0.1, title="Vector Polygon Plot")

### Excluding Antimeridian

For larger datasets, it is suggested to set `exclude_antimeridian=True`. This excludes the polygons that cross the antimeridian, which would otherwise require expensive recomputation to split them along it.
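
To give a sense of which polygons are affected, here is a simplified sketch that flags a polygon when any of its edges spans more than 180 degrees of longitude. This is only a heuristic on made-up data (it ignores special cases such as polygons containing a pole), and it is not how UXarray detects these polygons internally.

```python
def crosses_antimeridian(lons_deg):
    """Heuristic: flag a polygon if any edge spans more than 180 degrees of longitude."""
    n = len(lons_deg)
    return any(abs(lons_deg[i] - lons_deg[(i + 1) % n]) > 180.0 for i in range(n))


# Made-up polygons, given as corner longitudes in degrees within [-180, 180].
polygons = {
    "interior": [10.0, 20.0, 20.0, 10.0],
    "wraps": [175.0, -175.0, -175.0, 175.0],
}

kept = [name for name, lons in polygons.items() if not crosses_antimeridian(lons)]
print(kept)  # ['interior']
```

Drawn naively, the `"wraps"` polygon would stretch across the whole map, which is why such polygons must be either split or excluded.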


#### 💻 Your turn:

Exclude the antimeridian for the same plot as above.

- Documentation: https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.UxDataArray.plot.polygons.html


In [None]:
# Your code here. When ready, click on the three dots below for the solution.

In [None]:
uxds["TREFHT"].isel(time=0).plot.polygons(
    line_width=0.1,
    title="Vector Polygon Plot (Excluding Antimeridian)",
    exclude_antimeridian=True,
)

### Rasterized Polygon Plots


#### 💻 Your turn:

Generate the rasterized version of the polygon plot.

- Documentation: https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.UxDataArray.plot.rasterize.html
- Hint: Use `plot.rasterize` with `method="polygon"` and `title="Raster Polygon Plot"`.


In [None]:
# Your code here. When ready, click on the three dots below for the solution.

In [None]:
uxds["TREFHT"].isel(time=0).plot.rasterize(
    method="polygon", title="Raster Polygon Plot"
)

### Dynamic Rasterized Polygon Plots

By using the `dynamic=True` parameter, the plot will automatically re-rasterize itself when zooming or panning, leading to better data fidelity. It is also suggested to set a static `clim=(min, max)` to prevent the colorbar range from changing as well.

#### 💻 Your turn:

Generate the dynamic rasterized version of the polygon plot.

- Documentation: https://uxarray.readthedocs.io/en/latest/user_api/generated/uxarray.UxDataArray.plot.rasterize.html
- Hint: Add `dynamic=True` to the same call to `rasterize()` in the above exercise.


In [None]:
uxds["TREFHT"].isel(time=0).plot.rasterize(
    method="polygon", title="Raster Polygon Plot (Dynamic)", dynamic=True
)

## Interoperability with xCDAT

Since `ux.UxDataset` and `ux.UxDataArray` extend the `xr.Dataset` and `xr.DataArray` classes,
_most_ xCDAT APIs are interoperable with UXarray objects.

- The exception is xCDAT's [spatial averager](https://xcdat.readthedocs.io/en/latest/generated/xarray.Dataset.spatial.average.html), which requires data on rectilinear grids. The data must first be remapped from the unstructured grid to a rectilinear grid using another tool such as `nco` (`ncremap`).
- There are plans to support unstructured to structured regridding in UXarray in the future.

Resources:

- [xCDAT Documentation Homepage](https://xcdat.readthedocs.io/en/stable/)
- [xCDAT API Reference Guide](https://xcdat.readthedocs.io/en/stable/api.html)

## Next Steps

Feel free to jump over to the `xcdat_practicum_notebook.ipynb` to work with `nco` and `xcdat`.
