# Calculating ENSO with Xarray



---

## Overview

In this notebook (adapted from Project Pythia), we will:

1. Load SST data from the CESM2 model
2. Mask data using `.where()`
3. Compute climatologies and anomalies using `.groupby()`
4. Use `.rolling()` to compute a moving average
5. Compute, normalize, and plot the Niño 3.4 Index

## Prerequisites


| Concepts | Importance | Notes |
| --- | --- | --- |
| [Introduction to Xarray](xarray-intro) | Necessary | |
| [Computation and Masking](computation-masking) | Necessary | |



- **Time to learn**: 20 minutes

## Introduction to El Niño Southern Oscillation (ENSO)
During many of the tutorials yesterday, we practiced using Xarray to plot and interpret monthly global sea surface temperature (SST) data from the Community Earth System Model v2 (CESM2). With this data, we:


*   Applied arithmetic and aggregation methods
*   Explored climatology
*   Computed climate anomalies and running averages
*   Masked data with one and multiple conditions 


During the final tutorial yesterday, we learned how to mask data using multiple conditions. In particular, we used `.where()` to isolate SST data between 5ºN-5ºS and 190ºE-240ºE (equivalently 170ºW-120ºW). This geographic region, known as the Niño 3.4 region, is in the tropical Pacific Ocean and is commonly used as a metric for determining the phase of the El Niño-Southern Oscillation (ENSO). ENSO is a recurring climate pattern involving changes in SST in the central and eastern tropical Pacific Ocean, which has two alternating phases:

*   **El Niño:** the phase of ENSO characterized by warmer-than-average SSTs in the central and eastern tropical Pacific Ocean, weakened east-to-west equatorial winds, and increased rainfall in the eastern tropical Pacific.
*   **La Niña:** the phase of ENSO characterized by cooler-than-average SSTs in the central and eastern tropical Pacific Ocean, stronger east-to-west equatorial winds, and decreased rainfall in the eastern tropical Pacific.

## Tropical Pacific Climate Processes
To better understand the climate system processes that result in El Niño and La Niña events, let's first consider typical climate conditions in the tropical Pacific Ocean. Recall from W1D1 that **trade winds** are winds that blow from east to west just north and south of the equator (they are sometimes called "easterly" winds because they originate in the east and blow toward the west). As we discussed yesterday, the trade winds blow from east to west because of Earth's rotation, which causes winds in the Northern Hemisphere to curve to the right and winds in the Southern Hemisphere to curve to the left. This is known as the **Coriolis effect**.

If Earth's rotation affects air movement, do you think it also influences surface ocean water movement? It does! As trade winds blow across the tropical Pacific Ocean, they move water because of friction at the ocean surface. But because of the Coriolis effect, surface water moves to the right of the wind direction in the Northern Hemisphere and to the left of the wind direction in the Southern Hemisphere. The speed and direction of water movement also change with depth. Ocean surface water moves at an angle to the wind, the water just below it moves at a slightly larger angle, and the water below that turns at an even larger angle. The average direction of all this turning water is about a right angle from the wind direction. This average is known as **Ekman transport**. Since this process is driven by the trade winds, the strength of this ocean water transport varies in response to changes in the strength of the trade winds.
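
The turning-with-depth picture above can be checked numerically. The following is a minimal sketch that assumes the classical Ekman spiral solution, in which the surface current flows 45° to the right of the wind in the Northern Hemisphere; that angle, the surface speed `V0`, and the layer depth `D` are illustrative assumptions and are not taken from the CESM2 data.

```python
import cmath
import math

# Classical Ekman-spiral sketch (an assumption; not part of the CESM2 workflow).
# The wind blows toward +y (north) in the Northern Hemisphere. The velocity at
# depth z <= 0 is stored as a complex number w = u + i*v (u eastward, v northward).
V0 = 0.1   # hypothetical surface current speed, m/s
D = 50.0   # hypothetical Ekman layer depth, m
a = math.pi / D


def ekman_velocity(z):
    """Complex velocity at depth z: speed decays and direction rotates with depth."""
    return V0 * cmath.exp(1j * math.pi / 4) * cmath.exp((1 + 1j) * a * z)


def angle_right_of_wind(w):
    """Angle in degrees, measured clockwise from the wind direction (+y)."""
    return math.degrees(math.atan2(w.real, w.imag))


# Integrate the velocity from deep water up to the surface (midpoint rule)
n, z_bottom = 20000, -10 * D
dz = -z_bottom / n
transport = sum(ekman_velocity(z_bottom + (i + 0.5) * dz) for i in range(n)) * dz

surface_angle = angle_right_of_wind(ekman_velocity(0.0))
deeper_angle = angle_right_of_wind(ekman_velocity(-10.0))
transport_angle = angle_right_of_wind(transport)
print(surface_angle, deeper_angle, transport_angle)
```

In this idealized setup the angle grows with depth and the depth-integrated transport points at a right angle to the wind. In the Southern Hemisphere the same construction is mirrored, with the deflection to the left.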


## Ocean-Atmosphere Interactions during El Niño and La Niña
So, how does all of this relate to El Niño and La Niña? Changes in the strength of Pacific Ocean trade winds and the resulting impact on Ekman transport create variations in the tropical Pacific Ocean SST, which further results in changes to atmospheric circulation patterns and rainfall.


During an El Niño event, ***easterly trade winds are weaker***. As a result, less warm surface water is transported to the west via Ekman transport, which causes a build-up of warm surface water in the eastern equatorial Pacific. This creates warmer than average SSTs in the eastern equatorial Pacific Ocean. The atmosphere responds to this warming with increased rising air motion and above-average rainfall in the eastern Pacific. In contrast, during a La Niña event, ***easterly trade winds are stronger***. As a result, more warm surface water is transported to the west via Ekman transport, and cool water from deeper in the ocean rises up in the eastern Pacific in a process known as upwelling. This creates cooler than average SSTs in the eastern equatorial Pacific Ocean. This cooling decreases rising air movement in the eastern Pacific, resulting in drier than average conditions.


In this tutorial, we'll examine SST data to further explore variations in the climate system that occur during El Niño and La Niña events. Specifically, we will plot and interpret CESM2 SST data from the Niño 3.4 region.

In [None]:
# #<Yosmely Bermúdez> comments
# #Install the dependencies
# !pip install pythia_datasets
# !pip install cartopy
# !pip install xarray

---

## Imports 


In [None]:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import xarray as xr
from pythia_datasets import DATASETS

## The Niño 3.4 Index


In this notebook, we are going to combine several topics we've covered so far to compute the [Niño 3.4 Index](https://climatedataguide.ucar.edu/climate-data/nino-sst-indices-nino-12-3-34-4-oni-and-tni) for the CESM2 submission for the [CMIP6 project](https://esgf-node.llnl.gov/projects/cmip6/). 

> Niño 3.4 (5N-5S, 170W-120W): The Niño 3.4 anomalies may be thought of as representing the average equatorial SSTs across the Pacific from about the dateline to the South American coast. The Niño 3.4 index typically uses a 5-month running mean, and El Niño or La Niña events are defined when the Niño 3.4 SSTs exceed +/- 0.4C for a period of six months or more.

> Nino X Index computation: (a) Compute area averaged total SST from Niño X region; (b) Compute monthly climatology (e.g., 1950-1979) for area averaged total SST from Niño X region, and subtract climatology from area averaged total SST time series to obtain anomalies; (c) Smooth the anomalies with a 5-month running mean; (d) Normalize the smoothed values by its standard deviation over the climatological period.
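
In symbols, the four quoted steps can be written as follows. The notation is introduced here for illustration and is not from the source: $T_i(t)$ is the SST in grid cell $i$ at month $t$, $a_i$ is the cell area, $P$ is the climatological period, and $P_{m(t)}$ is the set of months in $P$ that share the calendar month of $t$. The centered 5-month window is an assumption that matches the `center=True` setting used later in this notebook.

$$
\begin{aligned}
\bar{T}(t) &= \frac{\sum_i a_i \, T_i(t)}{\sum_i a_i} \\
T'(t) &= \bar{T}(t) - \frac{1}{|P_{m(t)}|} \sum_{s \in P_{m(t)}} \bar{T}(s) \\
S(t) &= \frac{1}{5} \sum_{k=-2}^{2} T'(t+k) \\
N(t) &= \frac{S(t)}{\sigma_S}, \qquad \sigma_S = \text{standard deviation of } S \text{ over } P
\end{aligned}
$$

The notebook subtracts the climatology at each grid point before taking the area average. Both operations are linear, so the two orders agree when the weights and the set of valid cells do not change in time.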

![](https://www.ncdc.noaa.gov/monitoring-content/teleconnections/nino-regions.gif)

At the end of this notebook, you should be able to produce a plot that looks similar to this [Oceanic Niño Index plot](https://climatedataguide.ucar.edu/sites/default/files/styles/extra_large/public/2022-03/indices_oni_2_2_lg.png):

![ONI index plot from NCAR Climate Data Guide](https://climatedataguide.ucar.edu/sites/default/files/styles/extra_large/public/2022-03/indices_oni_2_2_lg.png)

Open the SST and areacello datasets, and use Xarray's `merge` function to combine them into a single dataset:

In [None]:
import pandas as pd

filepath = DATASETS.fetch('CESM2_sst_data.nc')
# <Yosmely Bermúdez> comments
# Open with decode_times=False and fix up the time variable "manually".
# You can use xr.decode_cf() or simply assign a new pandas time index to the time variable.
# Assigning a new pandas time index is preferred here because the later groupby by month relies on it.
data = xr.open_dataset(filepath, decode_times=False)
data['time'] = pd.DatetimeIndex(data['time'].values)
filepath2 = DATASETS.fetch('CESM2_grid_variables.nc')
areacello = xr.open_dataset(filepath2).areacello

ds = xr.merge([data, areacello])
ds
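
To see what `xr.merge` does in the cell above without downloading any files, here is a minimal sketch on synthetic data. The grid, the constant SST value, and the unit cell areas are all hypothetical stand-ins for the CESM2 files.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-ins (hypothetical values) for the SST and cell-area files
time = pd.date_range("2000-01-01", periods=3, freq="MS")
lat = [-0.5, 0.5]
lon = [190.5, 191.5]

sst = xr.Dataset(
    {"tos": (("time", "lat", "lon"), np.full((3, 2, 2), 27.0))},
    coords={"time": time, "lat": lat, "lon": lon},
)
area = xr.DataArray(
    np.ones((2, 2)),
    dims=("lat", "lon"),
    coords={"lat": lat, "lon": lon},
    name="areacello",  # merge needs a named DataArray
)

# merge aligns on the shared lat/lon coordinates and keeps both variables
merged = xr.merge([sst, area])
print(sorted(merged.data_vars))
```

The merged dataset holds the time-dependent `tos` and the time-independent `areacello` side by side, which is what makes the area-weighted average later in the notebook convenient.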

In [None]:
# # Sloane Garelick comments
# # When I ran this cell it caused the rest of the notebook to crash. I think maybe we should remove this cell?

# # <Yosmely Bermúdez> comments
#    # You have to be careful with shapely as it has problems with cartopy, so you have to install other dependencies
# !apt-get install libproj-dev proj-data proj-bin
# !apt-get install libgeos-dev
# !pip install cython
# !apt-get -qq install python-cartopy python3-cartopy
# !pip uninstall -y shapely    # cartopy and shapely aren't friends (early 2020)
# !pip install shapely --no-binary shapely

In [None]:
# <Yosmely Bermúdez> comments
# This module is needed to be able to plot with cartopy
import cartopy.io.shapereader as shapereader

Visualize the first time slice to make sure the data looks as expected:

In [None]:
fig = plt.figure(figsize=(12, 6))
ax = plt.axes(projection=ccrs.Robinson(central_longitude=180))
ax.coastlines()
ax.gridlines()
ds.tos.isel(time=0).plot(
    ax=ax, transform=ccrs.PlateCarree(), vmin=-2, vmax=30, cmap='coolwarm'
);

## Select the Niño 3.4 region 

There are a couple of ways to select the Niño 3.4 region:

1. Use `sel()` or `isel()`
2. Use `where()` and select all values within the bounds of interest

In [None]:
tos_nino34 = ds.sel(lat=slice(-5, 5), lon=slice(190, 240))
tos_nino34

The other option for selecting our region of interest is to use `where()` with `drop=True`:

In [None]:
tos_nino34 = ds.where(
    (ds.lat < 5) & (ds.lat > -5) & (ds.lon > 190) & (ds.lon < 240), drop=True
)
tos_nino34
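
The two selection approaches treat the region boundaries differently. The sketch below uses a hypothetical 1-degree grid whose points fall exactly on the boundaries, which is an assumption made to expose the difference: label-based slices include both end points, while the strict inequalities exclude them.

```python
import numpy as np
import xarray as xr

# Hypothetical 1-degree grid with points exactly on the region boundaries
lat = np.arange(-10.0, 11.0, 1.0)
lon = np.arange(180.0, 251.0, 1.0)
field = xr.DataArray(
    np.zeros((lat.size, lon.size)),
    dims=("lat", "lon"),
    coords={"lat": lat, "lon": lon},
    name="tos",
)

# Label-based slicing includes both end points
by_sel = field.sel(lat=slice(-5, 5), lon=slice(190, 240))

# Strict inequalities exclude points that sit exactly on a boundary
by_where = field.where(
    (field.lat < 5) & (field.lat > -5) & (field.lon > 190) & (field.lon < 240),
    drop=True,
)

print(by_sel.shape, by_where.shape)
```

On a grid whose cell centers do not coincide with the boundary values, both approaches return the same points.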

Let's plot the selected region to make sure we are doing the right thing.

In [None]:
fig = plt.figure(figsize=(12, 6))
ax = plt.axes(projection=ccrs.Robinson(central_longitude=180))
ax.coastlines()
ax.gridlines()
tos_nino34.tos.isel(time=0).plot(
    ax=ax, transform=ccrs.PlateCarree(), vmin=-2, vmax=30, cmap='coolwarm'
)
ax.set_extent((120, 300, 10, -10))

## Compute the anomalies

We first group by month and subtract the mean SST at each point in the Niño 3.4 region. We then compute the weighted average over the region to obtain the anomalies:

In [None]:
gb = tos_nino34.tos.groupby('time.month')
tos_nino34_anom = gb - gb.mean(dim='time')
index_nino34 = tos_nino34_anom.weighted(tos_nino34.areacello).mean(dim=['lat', 'lon'])
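
The same three lines can be traced on a small synthetic example where the answer is known in advance. Everything here is hypothetical: three years of monthly values built from a seasonal cycle plus a per-year offset that is twice as large in the second latitude row, and made-up cell areas.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly data (hypothetical values) on a 2x2 grid
time = pd.date_range("2000-01-01", periods=36, freq="MS")
month = time.month.to_numpy()
seasonal = 2.0 * np.sin(2 * np.pi * (month - 1) / 12)
offset = np.repeat([-0.5, 0.0, 0.5], 12)            # one value per year
scale = np.array([1.0, 2.0])[None, :, None]         # offset doubles in the second row
values = 27.0 + seasonal[:, None, None] + offset[:, None, None] * scale * np.ones((1, 1, 2))

coords = {"time": time, "lat": [-0.5, 0.5], "lon": [190.5, 191.5]}
tos = xr.DataArray(values, dims=("time", "lat", "lon"), coords=coords, name="tos")
area = xr.DataArray(
    [[1.0, 1.0], [3.0, 3.0]],
    dims=("lat", "lon"),
    coords={"lat": coords["lat"], "lon": coords["lon"]},
)

# Monthly climatology: the mean of all Januaries, all Februaries, and so on
gb = tos.groupby("time.month")
anom = gb - gb.mean(dim="time")

# Area-weighted regional mean of the anomalies
index = anom.weighted(area).mean(dim=["lat", "lon"])
print(index.values[[0, 12, 24]])
```

Subtracting the monthly climatology removes the seasonal cycle and leaves only the per-year offsets. Because the second row carries three times the area weight, the regional mean is 1.75 times the offset rather than the unweighted 1.5 times.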

Now, smooth the anomalies with a 5-month running mean:

In [None]:
index_nino34_rolling_mean = index_nino34.rolling(time=5, center=True).mean()
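
A short synthetic series shows what the centered window does at the ends of the record. The values below are hypothetical; the `min_periods` argument is shown as an optional variant and is not used in the notebook.

```python
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2000-01-01", periods=8, freq="MS")
series = xr.DataArray(np.arange(8.0), dims="time", coords={"time": time})

# Centered 5-month window: each value averages the month itself and two on either side.
# Months without a full window (the first two and last two) become NaN.
smoothed = series.rolling(time=5, center=True).mean()

# Optional variant: accept partial windows at the edges instead of returning NaN
smoothed_partial = series.rolling(time=5, center=True, min_periods=1).mean()

print(smoothed.values)
print(smoothed_partial.values)
```

The NaN values at both ends explain why the smoothed curve in the plot starts later and ends earlier than the raw anomaly.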

In [None]:
index_nino34.plot(size=8)
index_nino34_rolling_mean.plot()
plt.legend(['anomaly', '5-month running mean anomaly'])
plt.title('SST anomaly over the Niño 3.4 region');


Looking at the time series of SST anomaly over the Niño 3.4 region, consider the following questions:


1.   If the SST anomaly is greater than zero, would that be an El Niño event or a La Niña event?
2.   Based on this data, how frequently do El Niño and La Niña events occur?




Compute the standard deviation of the SST in the Niño 3.4 region, over the entire time period of the data array:

In [None]:
std_dev = tos_nino34.tos.std()
std_dev

Then we'll normalize the values by dividing the rolling mean by the standard deviation of the SST in the Niño 3.4 region:

In [None]:
normalized_index_nino34_rolling_mean = index_nino34_rolling_mean / std_dev
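
The cell above divides by the standard deviation of the raw SST in the region. The recipe quoted earlier instead normalizes the smoothed values by their own standard deviation over the climatological period. A minimal sketch of that variant follows, on a hypothetical random series in which the whole record stands in for the climatological period.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical anomaly series: ten years of monthly random values
rng = np.random.default_rng(0)
time = pd.date_range("2000-01-01", periods=120, freq="MS")
anomaly = xr.DataArray(
    rng.normal(0.0, 0.8, time.size), dims="time", coords={"time": time}
)

smoothed = anomaly.rolling(time=5, center=True).mean()

# Variant from the quoted recipe: scale by the standard deviation of the
# smoothed series itself (NaN values at the edges are skipped by default)
normalized = smoothed / smoothed.std()

print(float(normalized.std()))
```

With this choice the normalized index has unit standard deviation by construction, so the two normalizations generally give curves of the same shape but different amplitude.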

## Visualize the computed Niño 3.4 index

We will highlight values in excess of $\pm$0.4, the threshold given in the Niño 3.4 definition quoted above, roughly corresponding to El Niño (warm) and La Niña (cold) events.

In [None]:
fig = plt.figure(figsize=(12, 6))

plt.fill_between(
    normalized_index_nino34_rolling_mean.time.data,
    normalized_index_nino34_rolling_mean.where(
        normalized_index_nino34_rolling_mean >= 0.4
    ).data,
    0.4,
    color='red',
    alpha=0.9,
)
plt.fill_between(
    normalized_index_nino34_rolling_mean.time.data,
    normalized_index_nino34_rolling_mean.where(
        normalized_index_nino34_rolling_mean <= -0.4
    ).data,
    -0.4,
    color='blue',
    alpha=0.9,
)

normalized_index_nino34_rolling_mean.plot(color='black')
plt.axhline(0, color='black', lw=0.5)
plt.axhline(0.4, color='black', linewidth=0.5, linestyle='dotted')
plt.axhline(-0.4, color='black', linewidth=0.5, linestyle='dotted')
plt.title('Niño 3.4 Index');
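
The plot shades every month beyond the threshold, but the definition quoted earlier also requires the index to stay beyond ±0.4 for six months or more. A minimal sketch of that rule is shown below; `find_events` is a hypothetical helper written for this illustration, and the demo values are made up.

```python
import numpy as np


def find_events(index, threshold=0.4, min_months=6):
    """Return (start, end, kind) tuples for runs of at least min_months
    consecutive values beyond +threshold ('El Niño') or -threshold ('La Niña')."""
    values = np.asarray(index, dtype=float)
    # +1 above the warm threshold, -1 below the cold threshold, 0 otherwise (NaN -> 0)
    state = np.where(values >= threshold, 1, np.where(values <= -threshold, -1, 0))
    events, start = [], 0
    for i in range(1, len(state) + 1):
        # Close the current run when the state changes or the series ends
        if i == len(state) or state[i] != state[start]:
            if state[start] != 0 and i - start >= min_months:
                kind = "El Niño" if state[start] > 0 else "La Niña"
                events.append((start, i - 1, kind))
            start = i
    return events


# Hypothetical index: a 7-month warm spell, a 3-month cold blip, a 6-month cold spell
demo = [0.1] * 2 + [0.9] * 7 + [0.0] + [-0.6] * 3 + [0.2] * 2 + [-0.8] * 6 + [0.0]
events = find_events(demo)
print(events)
```

The 3-month cold blip is ignored because it is too short. Applied to this notebook, the helper could be called as `find_events(normalized_index_nino34_rolling_mean.values)`, with the returned positions used to look up dates in the `time` coordinate.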


Now that we've normalized the data and highlighted SST anomalies that correspond to El Niño (warm) and La Niña (cold) events, consider the following questions:


1.   When were the strongest El Niño and La Niña events over this time period?
2.   Considering the ocean-atmosphere interactions that cause El Niño and La Niña events, can you hypothesize potential reasons one El Niño or La Niña event may be stronger than others?




---

## Summary

We have applied a variety of Xarray's selection, grouping, and statistical functions to compute and visualize an important climate index. 

## Resources and References

- [Niño 3.4 index](https://climatedataguide.ucar.edu/climate-data/nino-sst-indices-nino-12-3-34-4-oni-and-tni)
- [Matplotlib's `fill_between` method](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.fill_between.html)
- [Matplotlib's `axhline` method](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.axhline.html) (see also its analogous `axvline` method)