# CSS 120: Environmental Data Science

## Xarray Operations

# Packages

In [None]:
# imports
from datetime import timedelta
import numpy as np
import pandas as pd
import xarray as xr
from matplotlib import pyplot as plt
from pythia_datasets import DATASETS

##  Some Environmental Sciences

### Oceanic Climate Systems

Like the atmospheric climate system, the oceanic climate system is key to understanding the environment.

One process that we have to study is ocean circulation, which behaves differently near the surface and at depth.

- In the upper 100 meters of the ocean, currents are driven by wind.

    + These currents transport nutrients and heat to different places.

- In deeper parts of the ocean, currents are driven by differences in water density, which depend on temperature and salinity.

    + The deep water currents mostly form in the polar regions of the planet.

##  Some Environmental Sciences

### Oceanic Climate Systems

![](l11img01.png)

##  Some Environmental Sciences

### Oceanic Climate Systems

![](l11img02.png)

##  Some Environmental Sciences

### Oceanic Climate Systems

![](l11img03.png)

##  Some Environmental Sciences

### Oceanic Climate Systems

![](l11img04.png)

## Some Environmental Sciences

Alongside the latitudinal temperature gradients set by solar radiation, the large-scale ocean circulation patterns are one of the main controls on global sea surface temperature (SST, stored in our dataset as the variable `tos`).

The surface currents distort this meridional gradient and can transport heat globally. In this tutorial, we'll use a series of tools in Xarray to interpret sea surface temperature data. 

Specifically, we'll import monthly SST data from the Community Earth System Model v2 (CESM2), which is a Global Climate Model.

A climate model is a mathematical representation of Earth's climate system components and their interactions. 

Climate models are based on well-documented physical processes to simulate the transfer of energy and materials through the climate system. 

## Some Environmental Sciences

You'll learn more about climate models later this week and next week, but for now, we're going to be working with SST data produced from a climate model. 

To assess global variations in this SST dataset, we will practice using several features of Xarray:

*   Arithmetic methods to convert temperatures from Celsius to Kelvin.
*   Aggregation methods to calculate mean, median, minimum and maximum values of the data.

Finally, we'll create a map of global mean annual SST to visualize spatial variations in SST.

# Arithmetic Operations

Arithmetic operations with a single `DataArray` automatically apply over all array values (like NumPy). This process is called **vectorization**.  

First, let's open the monthly sea surface temperature (SST) data from the Community Earth System Model v2 (CESM2), which is a Global Climate Model.

In [None]:
filepath = DATASETS.fetch("CESM2_sst_data.nc")
ds = xr.open_dataset(filepath)
ds

# Arithmetic Operations

And look at the temperature variable `tos`.

In [None]:
ds.tos

# Arithmetic Operations

Note in the attributes that the units are 'degC'. One arithmetic operation we can do is convert the temperature from degrees Celsius to Kelvin:

In [None]:
ds.tos + 273.15
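To see what this conversion does without downloading the model output, here is a minimal sketch on a tiny synthetic array. The 2x2 grid, its coordinates, and the names `tos_c` and `tos_k` are assumptions made up for illustration; they are not part of the CESM2 file. Whether the original `units` attribute carries over to the result can depend on your xarray version and its `keep_attrs` setting, so the sketch sets the new units explicitly.

```python
import numpy as np
import xarray as xr

# Hypothetical 2x2 grid of temperatures in degrees Celsius (not the CESM2 data).
tos_c = xr.DataArray(
    np.array([[0.0, 10.0], [20.0, 30.0]]),
    dims=("lat", "lon"),
    coords={"lat": [-10.0, 10.0], "lon": [100.0, 120.0]},
    attrs={"units": "degC"},
    name="tos",
)

# The scalar is added to every element; dimensions and coordinates are preserved.
tos_k = tos_c + 273.15

# Record the new units explicitly instead of relying on attribute propagation.
tos_k.attrs["units"] = "K"

print(tos_k.values)
print(tos_k.attrs["units"])
```

The same pattern should apply to `ds.tos`, with the result assigned to a new variable so that the original data stay untouched.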

# Arithmetic Operations

You may notice that there are a lot of NaN values in the DataArray for `tos`. NaN isn't a bad thing; it just means there is no data at those coordinates.

In this case, there's no `tos` data for areas with land since this dataset only contains SST values.
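The effect of NaN values on later calculations can be sketched on a small synthetic grid. The array below and the names `sst`, `n_ocean`, and `n_land` are hypothetical stand-ins for ocean and land cells, not values from the CESM2 file. The sketch relies on xarray's default behavior of skipping NaN in aggregations of floating-point data.

```python
import numpy as np
import xarray as xr

# Hypothetical 2x3 grid; NaN marks "land" cells with no SST (not the CESM2 data).
sst = xr.DataArray(
    np.array([[np.nan, 10.0, 20.0], [np.nan, np.nan, 30.0]]),
    dims=("lat", "lon"),
)

n_ocean = int(sst.count())                   # cells that hold a value
n_land = int(sst.isnull().sum())             # cells that are NaN
mean_ocean = float(sst.mean())               # NaN cells are skipped by default
mean_strict = float(sst.mean(skipna=False))  # NaN propagates if skipping is disabled

print(n_ocean, n_land, mean_ocean, mean_strict)
```

Because missing cells are skipped, aggregations over `tos` summarize ocean cells only rather than returning NaN.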


Just to practice another arithmetic operation, let's square all values in `tos`:

In [None]:
ds.tos**2

# Aggregation Methods

A very common step during data analysis is to summarize the data by computing aggregations such as `sum()`, `mean()`, `median()`, `min()`, and `max()`. The reduced data provide insight into the nature of the larger dataset. For example, in the introductory video for this tutorial, we saw maps of the mean annual sea surface temperature and sea surface density.


The following table summarizes some other built-in xarray aggregations:

| Aggregation                | Description                              |
|----------------------------|------------------------------------------|
| ``count()``                | Total number of items                    |
| ``mean()``, ``median()``   | Mean and median                          |
| ``min()``, ``max()``       | Minimum and maximum                      |
| ``std()``, ``var()``       | Standard deviation and variance          |
| ``prod()``                 | Product of elements                      |
| ``sum()``                  | Sum of elements                          |
| ``argmin()``, ``argmax()`` | Index of the minimum and maximum value   |


Let's explore some of these aggregation methods.
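As a warm-up, the methods in the table can be tried on a series small enough to check by hand. The four values and the names `da` and `summary` below are made up for illustration and are not taken from the SST data.

```python
import numpy as np
import xarray as xr

# Hypothetical series of four values along "time" (not the CESM2 data).
da = xr.DataArray(np.array([2.0, 4.0, 4.0, 6.0]), dims="time")

# Each aggregation reduces the series to a single number.
summary = {
    "count": int(da.count()),
    "mean": float(da.mean()),
    "median": float(da.median()),
    "min": float(da.min()),
    "max": float(da.max()),
    "std": float(da.std()),
    "var": float(da.var()),
    "sum": float(da.sum()),
    "prod": float(da.prod()),
    "argmin": int(da.argmin(dim="time")),  # position of the smallest value
    "argmax": int(da.argmax(dim="time")),  # position of the largest value
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Note that `argmin()` and `argmax()` return positions along the dimension, not the values themselves.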


# Aggregation Methods

Compute the temporal minimum:

In [None]:
ds.tos.min(dim="time")

# Aggregation Methods

Compute the spatial sum:

In [None]:
ds.tos.sum(dim=["lat", "lon"])

# Aggregation Methods

Compute the temporal median:

In [None]:
ds.tos.median(dim="time")

# Aggregation Methods

Compute the mean SST:

In [None]:
ds.tos.mean()

# Aggregation Methods

Because we specified no `dim` argument, the function was applied over all dimensions, computing the mean of every element of `tos` across time and space. 

It is possible to specify a dimension along which to compute an aggregation. 

For example, to calculate the mean over time at every location (i.e. a map of the mean annual SST), specify the time dimension as the dimension along which the mean should be calculated:

In [None]:
ds.tos.mean(dim="time").plot(size=7, vmin=-2, vmax=32, cmap="coolwarm")
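The effect of the `dim` argument is easiest to see from the dimensions that remain after the reduction. The sketch below uses a small synthetic array with the same dimension names as `tos`; the sizes and values are assumptions chosen so the results can be checked by hand.

```python
import numpy as np
import xarray as xr

# Hypothetical array: 3 time steps, 2 latitudes, 4 longitudes (values 0..23).
data = np.arange(24, dtype=float).reshape(3, 2, 4)
da = xr.DataArray(data, dims=("time", "lat", "lon"))

overall = da.mean()                         # no dim: reduces over everything
time_mean = da.mean(dim="time")             # one value per (lat, lon) cell
spatial_mean = da.mean(dim=["lat", "lon"])  # one value per time step

print(overall.dims, time_mean.dims, spatial_mean.dims)
print(spatial_mean.values)
```

The dimension named in `dim` is the one that disappears: averaging over `time` leaves a map, while averaging over `lat` and `lon` leaves a time series.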

### Question: Climate Connection

Observe the spatial patterns in SST and consider the following in the context of the components of the ocean climate system we learned about:

Recall that upwelling commonly occurs off the west coast of continents, for example, in the eastern tropical Pacific off the west coast of South America. 

Do you see evidence for upwelling in this region? How do you think the mean SST in this region would change if you looked at a specific season rather than the annual mean? Would upwelling be more or less evident?
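One possible way to explore the seasonal part of this question is to group the data by season before averaging. The sketch below uses a synthetic monthly series in which each value is simply the calendar month number; the names `time`, `da`, and `seasonal` are hypothetical. Applying the same `groupby("time.season")` call to `ds.tos` is an assumption that has not been verified here against the CESM2 file's time coordinate.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical monthly series: two years, value = calendar month number.
time = pd.date_range("2000-01-01", periods=24, freq="MS")
da = xr.DataArray(
    time.month.values.astype(float),
    dims="time",
    coords={"time": time},
)

# Group the time steps into DJF, MAM, JJA, and SON, then average each group.
seasonal = da.groupby("time.season").mean(dim="time")

print(seasonal.season.values)
print(float(seasonal.sel(season="JJA")))
```

With real SST data, selecting one season from the result and plotting it would give a seasonal map to compare against the annual mean.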