# Lab 3: The El Niño-Southern Oscillation (ENSO)

## Overview

The El Niño–Southern Oscillation (ENSO) is a coupled ocean‐atmosphere climate phenomenon with global impacts on temperature, sea level pressure, and precipitation patterns (Bjerknes, 1969; Ropelewski & Halpert, 1987). This lab explores observed ENSO variability from 1950-Present. 

Please carefully read the dataset descriptions prior to starting the lab and use comments (`#`) throughout to organize your code and make it more readable. **This lab is worth 60 points.**

### Learning objectives

**In this lab you will learn and practice the following skills:**

- Interpreting global atmospheric and sea surface temperature patterns associated with ENSO
- Geospatial analysis 
- Statistical analysis
- Manipulating gridded data with Xarray
- Data visualization with Cartopy, Matplotlib, and cmocean

### To submit via Canvas:

To foster a collaborative learning environment, you are encouraged to work in groups of 2-3, but each person must write and submit their own code (and comments), and individually answer all interpretation questions. All students must complete all problems - it is against the honor code to divide the problems up among different individuals. 

- Prior to submitting your lab assignments on Canvas, please name your Notebook files using the following syntax: *LastName_FirstName_EV333_Lab3.ipynb.*
- Type your responses to the short-answer questions in this Notebook. Add a cell and select 'Markdown' instead of 'Code' from the drop down menu. Add **Answer:** and provide your written responses.
- Re-run all cells to ensure that the Notebook runs completely through without errors and that all figures are displayed.
- Upload your final Jupyter Notebook (.ipynb file) to Canvas

---
## Description of Datasets

**Required data:** (Download from GitHub)
- **Nino3.4 sea surface temperature anomaly:** detrend.nino34.monthly.txt
- **ERA5 Sea surface temperature:** ERA5_monthly_sst_regrid.nc 
- **ERA5 Mean sea level pressure:** ERA5_monthly_msl_regrid.nc

### Niño3.4 Monthly SST Anomalies

Sea surface temperature anomalies (SSTA) averaged across the Niño 3.4 region (5°N-5°S, 120°-170°W) in the central equatorial Pacific are used to define El Niño (warm phase) and La Niña (cool phase) events. Monthly SST anomalies from January 1950 to February 2024 are provided in `detrend.nino34.monthly.txt`. The anomalies are based on centered 30-year reference periods updated every 5 years (i.e., sliding climatologies). This approach detrends the data, removing the long-term observed warming signal. The data were downloaded from [NOAA/CPC](https://www.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/detrend.nino34.ascii.txt).


### ERA5 Reanalysis

ERA5 is produced by the Copernicus Climate Change Service at the European Centre for Medium-Range Weather Forecasts (ECMWF). The data was accessed from the [ECMWF Climate Data Portal](https://www.ecmwf.int/en/forecasts/dataset/ecmwf-reanalysis-v5).

The full reanalysis product covers the period from January 1940 to present. The original monthly datasets were regridded to a much coarser ~4$^\circ$ latitude x 4$^\circ$ longitude horizontal resolution (16.7 MB per file). The 4$^\circ$ resolution still shows the broad global patterns. You will work with sea surface temperature (SST) and mean sea level pressure (SLP) data in this lab.

**Sea surface temperature (K):** This parameter (SST) is the temperature of sea water near the surface. In ERA5, this parameter is a foundation SST, which means there are no variations due to the daily cycle of the sun (diurnal variations). SST, in ERA5, is given by two external providers. Before September 2007, SST from the HadISST2 dataset is used and from September 2007 onwards, the OSTIA dataset is used. This parameter has units of kelvin (K). Temperature measured in kelvin can be converted to degrees Celsius by subtracting 273.15.

**Mean sea level pressure (Pa):** This parameter is the pressure (force per unit area) of the atmosphere at the surface of the Earth, adjusted to the height of mean sea level. It is a measure of the weight that all the air in a column vertically above a point on the Earth's surface would have, if the point were located at mean sea level. It is calculated over all surfaces - land, sea and inland water. Maps of mean sea level pressure are used to identify the locations of low and high pressure. Contours of mean sea level pressure also indicate the strength of the wind. Tightly packed contours show stronger winds. The units of this parameter are pascals (Pa). Mean sea level pressure is often measured in hPa and sometimes is presented in units of millibars, mb (1 hPa = 1 mb = 100 Pa).
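The two unit conversions described above are simple arithmetic. The following is a minimal sketch using made-up values (the numbers and the variable names `sst_K`, `msl_Pa`, `sst_C`, and `msl_hPa` are illustrative and are not taken from the ERA5 files):

```python
import numpy as np

# illustrative values only (not ERA5 data)
sst_K = np.array([273.15, 288.15, 301.15])         # sea surface temperature in kelvin
msl_Pa = np.array([98000.0, 101325.0, 103000.0])   # mean sea level pressure in pascals

# kelvin to degrees Celsius: subtract 273.15
sst_C = sst_K - 273.15

# pascals to hectopascals (equivalently millibars): divide by 100
msl_hPa = msl_Pa / 100.0

print(sst_C)
print(msl_hPa)
```

The same arithmetic applies element-wise to gridded Xarray objects.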

In [None]:
# import Python packages
import xarray as xr
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import cmocean
import cartopy.crs as ccrs
import cartopy.feature as cfeature
from cartopy.util import add_cyclic_point

## Part 1. Load & plot Niño3.4 monthly SSTA

The NOAA/CPC monthly Niño 3.4 detrended anomalies have already been calculated for you. 

**<span style='color:Red'> Insert a cell below (`+`).  </span> Define a variable called `fileName`** specifying the full file path for the `detrend.nino34.monthly.txt` dataset that contains the monthly Niño 3.4 SST anomalies:

```
fileName = '<add your path here>'
```

Load the file as a Pandas dataframe called `df`. The goal of this lab is to practice manipulating and plotting the data, so the pre-processing steps are completed for you. The processed `df` DataFrame is displayed. Please consult Lab 0 or the other Python tutorials if you need a refresher about working with Pandas dataframes. 

In [None]:
# load data as a Pandas DataFrame (df)
df = pd.read_fwf(fileName)

# remove data that isn't needed
df = df.drop(['TOTAL','ClimAdjust'], axis=1)

# convert to datetime
time = pd.to_datetime(dict(year=df.YR, month=df.MON, day=1))
df = df.drop(['MON'], axis=1)
df = df.rename(columns={'YR': 'time'})
df['time'] = time

df
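
If you would like to see what each pre-processing step does, the sketch below applies the same steps to a few made-up rows held in memory. It assumes the text file is a fixed-width table with the column names `YR`, `MON`, `TOTAL`, `ClimAdjust`, and `ANOM` used in the cell above; the values and the names `lines`, `demo`, and `demo_time` are illustrative only.

```python
import io
import pandas as pd

# made-up rows laid out like a fixed-width table (values are illustrative only)
lines = [
    "  YR  MON  TOTAL  ClimAdjust   ANOM",
    "1950    1  24.50       26.00  -1.50",
    "1950    2  25.00       26.25  -1.25",
    "1950    3  26.00       27.00  -1.00",
]
demo = pd.read_fwf(io.StringIO("\n".join(lines)))

# keep only the year, month, and anomaly columns
demo = demo.drop(["TOTAL", "ClimAdjust"], axis=1)

# build a datetime for the first day of each month,
# then replace the YR/MON columns with a single 'time' column
demo_time = pd.to_datetime(dict(year=demo.YR, month=demo.MON, day=1))
demo = demo.drop(["MON"], axis=1).rename(columns={"YR": "time"})
demo["time"] = demo_time

print(demo)
```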

Let's average out some of the shorter-term "weather" noise in the data to better highlight variations on year-to-year or "interannual" timescales. Here, we define a variable called `df_smooth` and calculate the 5-month running mean. This step has been done for you in the cell below and the output is displayed.

In [None]:
df_smooth = df.ANOM.rolling(window=5, center=True).mean().to_frame()
df_smooth.insert(0,'time',df.time)
df_smooth = df_smooth.dropna()
df_smooth
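
To see how a centered running mean behaves, here is a small sketch on a made-up series (the values and the names `s` and `smooth` are illustrative, not Niño3.4 data):

```python
import pandas as pd

# illustrative monthly values (not real Niño3.4 data)
s = pd.Series([1.0, 3.0, 2.0, 6.0, 3.0, 1.0, 2.0])

# centered 5-point running mean: each value is the average of itself,
# the two values before it, and the two values after it
smooth = s.rolling(window=5, center=True).mean()

print(smooth)
```

The first two and last two entries are `NaN` because a full 5-point window is not available there, which is why `dropna()` is applied in the cell above.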

### Generate a plot of Niño 3.4 SSTA versus time

Each column in a **DataFrame** is a **Series**. To select a single column, use square brackets `[]` with the name of the column of interest as a string. For example: `ssta = df['ANOM']` where `df` is the DataFrame and `ANOM` is the name of a variable in the DataFrame. Both single and double quotes will work for strings.
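As a quick illustration of column selection, the sketch below builds a tiny DataFrame with made-up values (the name `demo` and the numbers are illustrative only) and pulls out one column as a Series:

```python
import pandas as pd

# small illustrative DataFrame (values are made up)
demo = pd.DataFrame({
    "time": pd.date_range("2000-01-01", periods=3, freq="MS"),
    "ANOM": [0.5, -0.2, 1.1],
})

# square brackets with the column name return a pandas Series
ssta = demo["ANOM"]

print(type(ssta).__name__)
print(ssta.max())
```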

**[6pts] <span style='color:Red'> Insert a cell below (`+`) </span> and generate an x, y line plot of Niño3.4 SSTA vs. time. Plot the monthly data as a red line and the 5-month running mean (smoothed) data as a thicker black line.** Add a horizontal line at 0 deg C to help the viewer easily differentiate positive and negative anomalies. This is done with the code: `plt.axhline(y = 0, color = 'k', linestyle = '-')`

When defining the figure, set `figsize=[12,4]`. Add a title and axes labels (with units!) to your plot. Save the figure as `Lab3_Nino34_SSTA.png`. Some sample code is provided for you to build upon.

```
# define figure
plt.figure(figsize=[12,4])

# add a horizontal line at 0 deg C
plt.axhline(y = 0, color = 'k', linestyle = '-') 

# plot Nino34 SSTA vs. time (monthly)
...add your code here...

# plot Nino34 SSTA vs. time (5-month running mean)
...add your code here...

# add a title and axes labels (with units)
...add your code here...

# save the figure
plt.savefig('Lab3_Nino34_SSTA.png', bbox_inches='tight')

```

## Part 2: Load ERA5 sea-surface temperature (SST) data and generate a map of global mean SST

Part 2 builds upon the skills you developed in Lab 2 General Atmospheric Circulation, but with a new dataset.

1. Define a variable `SST` and load the ERA5 SST data using the Python xarray package (see Lab 2 for a refresher)
2. Subset the data to January 1950-Present: `SST = SST.isel(time=(SST.time.dt.year > 1949))`
3. Convert kelvin to degrees Celsius

For this lab we want to use the SST variable (abbreviated `sst`). We can access `sst` by treating our Dataset like a dictionary: `SST['sst']`.
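The dictionary-style access and the year-based subsetting can be tried on a small synthetic Dataset first. This is a sketch under stated assumptions: the Dataset `demo`, its grid, and its constant values are made up to stand in for the ERA5 file, and only the variable name `sst` and the coordinate name `time` follow the lab.

```python
import numpy as np
import pandas as pd
import xarray as xr

# synthetic Dataset standing in for the ERA5 file (values are made up)
time = pd.date_range("1948-01-01", "1951-12-01", freq="MS")
demo = xr.Dataset(
    {"sst": (("time", "lat", "lon"), np.full((time.size, 2, 3), 290.0))},
    coords={"time": time, "lat": [-2.0, 2.0], "lon": [180.0, 184.0, 188.0]},
)

# dictionary-style access returns the variable as a DataArray
sst = demo["sst"]

# a boolean mask on the year keeps only time steps from 1950 onward
sst_1950 = sst.isel(time=(sst.time.dt.year > 1949))

print(sst_1950.time.values[0])
print(dict(sst_1950.sizes))
```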

**[3pts] <span style='color:Red'> Insert a cell below (`+`).  </span> Load, subset, and convert the SST data following the steps above.** An outline with sample code is provided below:

```
# load the data using xarray
SST = ...add your code here...

# subset the data to 1950-Present
SST = SST.isel(time=(SST.time.dt.year > 1949))

# convert from K to deg C
SST = ...add your code here...

SST

```

### Calculate mean SST and generate a Pacific-centered global map

**[1pt] <span style='color:Red'> Insert a cell below (`+`)</span>** and define a variable `m_SST` equal to the time-mean SST (the average over the `time` dimension).

Recall that to plot maps using Cartopy we need to specify a map projection, a colormap variable (e.g., `cmap`) from the cmocean package, and the levels for the colorbar (e.g., `lev`). Example code is provided below. Please refer to Lab 2 for additional examples.

```
# map projection
proj = ccrs.Robinson(central_longitude=180)

# selected color map from cmocean colormaps for oceanography
cmap = cmocean.cm.thermal

# NumPy array for the color bar levels, here from 0 degC to +30 degC, in increments of 2 degC
# (np.arange excludes the stop value, so stop is set to 32 to include 30)
lev = np.arange(0, 32, 2)
```
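One detail worth checking when you choose colorbar levels: `np.arange` stops before its `stop` value, so the last level is one increment below it. A quick check (the names `lev_demo` and `lev_incl` are illustrative):

```python
import numpy as np

# np.arange(start, stop, step) excludes the stop value
lev_demo = np.arange(0, 30, 2)
print(lev_demo[-1])   # last level is 28

# to include 30 as the top level, extend stop by one step
lev_incl = np.arange(0, 32, 2)
print(lev_incl[-1])   # last level is 30
```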

**<span style='color:Red'> Insert a cell below (`+`)</span>** to specify the mapping variables.

**[3pts] <span style='color:Red'> Insert a cell below (`+`)</span>** and generate a map of mean SST, using the sample code below as a guide.

```
# define figure and axes, figure size, and resolution (300 dpi)
fig = plt.figure(figsize=(9, 4.5), dpi=300)
ax = plt.axes(projection = proj)

# filled contour map of mean temperature
m_SST.plot.contourf(
    x = 'lon',
    y = 'lat',
    ax=ax,
    transform=ccrs.PlateCarree(),
    levels=lev,
    extend='both',
    cmap=cmap,
    add_colorbar=True,
    cbar_kwargs = {"label":"<update the color bar label with units>"})

# add coastlines
ax.coastlines(
   resolution='110m')  # can be one of '110m', '50m', or '10m'

# mask the land in light gray
ax.add_feature(cfeature.NaturalEarthFeature('physical', 'land', '110m', edgecolor='k', facecolor='lightgray'))

# add grid lines
gl = ax.gridlines(crs=ccrs.PlateCarree(),
                  draw_labels=True,
                  linewidth=1,
                  color='gray',
                  alpha=0.5,
                  linestyle='--')

# add title
ax.set_title("UPDATE TITLE")

# save figure 
fig.savefig('Lab3_ERA5_mean_SST.png', facecolor = 'white', transparent = False, bbox_inches ='tight')
```

## Part 3: Load ERA5 mean sea level pressure (SLP) data and generate a global map of mean SLP

Repeat the steps in Part 2 for mean sea level pressure:

1. Define a variable `SLP` and load the ERA5 mean sea level ('msl') pressure data using the Python xarray package
2. Subset the data to January 1950-Present: `SLP = SLP.isel(time=(SLP.time.dt.year > 1949))`
3. Convert Pa to hPa. We can access the `msl` variable by treating our Dataset like a dictionary: `SLP['msl']`.
4. Define a variable `m_slp` and calculate the time mean
5. Generate a global map, similar to SST. Use an appropriate cmocean colormap.
6. Format the map by adding a title, axes labels with units, coastlines, etc. Do not mask the land as you did for the SST map.
7. Save the figure as `Lab3_ERA5_mean_slp.png`

**[5pts] <span style='color:Red'> Insert a cell below (`+`) </span> and generate a map of mean sea level pressure.**

## Part 4: Calculate climatology-removed monthly SST anomalies (SSTA) 

Which month *on average* has the highest SST? Which month has the highest SLP? In Lab 0 you learned how to answer these questions by calculating an average seasonal cycle, also known as the climatology. This involved calculating the average January value, the average February value, the average March value, ..., etc. 

This will be accomplished using the [groupby](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html) operation, which involves splitting the object, applying a function, and combining the results. This is useful for grouping large amounts of data and computing operations on these groups. To group a DataArray, use the syntax: `ds.groupby(ds['Year_Month'].dt.month)` where `ds` is a generic variable name meant to represent the DataArray and `Year_Month` is the name of its time coordinate (in the ERA5 files used in this lab, the time coordinate is named `time`). This says to group the data by `month`. Then you can calculate the mean by applying the `.mean()` method to the grouped DataArray. 
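As a small illustration of the split-apply-combine idea, the sketch below groups a synthetic monthly series by calendar month and averages each group. The series is made up (its value is the calendar month plus the number of years since 2000), and it assumes the time coordinate is named `time`.

```python
import numpy as np
import pandas as pd
import xarray as xr

# synthetic monthly series: value = calendar month + years since 2000
time = pd.date_range("2000-01-01", "2002-12-01", freq="MS")
values = np.asarray(time.month + (time.year - 2000), dtype=float)
da = xr.DataArray(values, coords={"time": time}, dims="time")

# split by calendar month, then average each group
clim = da.groupby(da["time"].dt.month).mean()

print(dict(clim.sizes))            # one entry per calendar month
print(float(clim.sel(month=1)))    # mean of the three January values
```

The result has a new `month` dimension of length 12 in place of `time`.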

In this lab, we want to *remove* the seasonal cycle from the SST and SLP data to yield monthly anomalies. But first we need to remove the long-term warming trend from the data. A future question will ask you why this is the case.

The cell below contains a function that will by default remove a linear trend from a dataset. You do not need to modify the function.

In [None]:
def detrend(da, dim, deg=1):
    # detrend along a single dimension
    p = da.polyfit(dim=dim, deg=deg)
    fit = xr.polyval(da[dim], p.polyfit_coefficients)
    return da - fit
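
To build intuition for what `detrend` does, the sketch below applies it to a synthetic series made of a straight-line trend plus a repeating cycle. The function is repeated so the example runs on its own; the series, the integer coordinate `step`, and the names `resid` and `slope_after` are assumptions for illustration only.

```python
import numpy as np
import xarray as xr

def detrend(da, dim, deg=1):
    # same function as in the lab, repeated so this sketch runs on its own
    p = da.polyfit(dim=dim, deg=deg)
    fit = xr.polyval(da[dim], p.polyfit_coefficients)
    return da - fit

# synthetic series: a linear trend plus a repeating 12-step cycle
step = np.arange(120)
signal = 0.01 * step + np.sin(2 * np.pi * step / 12)
da = xr.DataArray(signal, coords={"step": step}, dims="step")

resid = detrend(da, "step")

# slope of a straight line fitted to the residual (should be close to zero)
slope_after = np.polyfit(step, resid.values, 1)[0]
print(slope_after)
```

The cycle is still present in `resid`; only the best-fit straight line has been subtracted.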

**[2pts] <span style='color:Red'> Insert a cell below (`+`)</span>** and use the detrend function to linearly detrend the gridded SST dataset. Call the variable `SST_Filt`.

Now, using the detrended data `SST_Filt`, calculate the climatology-removed monthly SSTA as a variable called `anom_SST`. You will first need to calculate the climatology (average seasonal cycle) and then subtract it. The groupby operation is shown in the example code below.

Important! This step means that the average January value is subtracted from every January value, the average February value is subtracted from every February value, and so on. Example code is provided below, but please make sure you understand what the different lines do before proceeding.

```
# climatology (average seasonal cycle) of the filtered data
clim_SST = SST_Filt.groupby(SST_Filt['time'].dt.month).mean()

# monthly anomalies
anom_SST = SST_Filt.groupby(SST_Filt['time'].dt.month) - clim_SST

anom_SST
```
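One way to check your understanding of this step is to run it on a synthetic series where the answer is known. In the sketch below (all names and values are made up for illustration), the series is a fixed seasonal cycle plus an offset that grows by 1 each year, so the anomalies should contain only the year-to-year offset relative to its average.

```python
import numpy as np
import pandas as pd
import xarray as xr

# synthetic monthly series: a fixed seasonal cycle plus a year-to-year offset
time = pd.date_range("2000-01-01", "2003-12-01", freq="MS")
month = np.asarray(time.month)
year = np.asarray(time.year)
seasonal = 10.0 * np.cos(2 * np.pi * (month - 1) / 12)
da = xr.DataArray(seasonal + (year - 2000), coords={"time": time}, dims="time")

# climatology: mean of each calendar month
clim = da.groupby(da["time"].dt.month).mean()

# anomalies: each value minus the mean for its calendar month
anom = da.groupby(da["time"].dt.month) - clim

# averaging the anomalies by calendar month should give values close to zero
check = anom.groupby(anom["time"].dt.month).mean()
print(float(abs(check).max()))
```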

**[3pts] <span style='color:Red'> Insert a cell below (`+`)</span>** and calculate the climatology-removed monthly SST anomalies.

## Part 5: Calculate climatology-removed monthly SLP anomalies (SLP_anom) 

Repeat Part 4 for SLP:
1. Detrend the data using the `detrend` function
2. Calculate the climatology as a variable called `clim_SLP`
3. Calculate the monthly anomalies

**[4pts] <span style='color:Red'> Insert a cell below (`+`)</span>** and calculate the monthly SLP anomalies. Call this variable `anom_SLP`.

## Part 6. Identify SST and SLP anomalies during select ENSO events

Now, let's explore the observed patterns of SST and SLP anomalies during El Niño and La Niña events! 

Using your Niño3.4 anomaly time series, identify a year with a large El Niño event. ENSO events peak during Northern Hemisphere winter (e.g., November-January or December-February) and thus bridge calendar years. For example, the December-February peak of the 1982-83 El Niño spans December 1982 through February 1983. 

Define two variables `start_time` and `end_time`. These variables will be assigned a datetime string following the convention `YYYY-MM-DD`. Specify a starting Year/Month and ending Year/Month for select ENSO events. For the 1982-83 El Niño you would use:

```
start_time = '1982-11-01'
end_time = '1983-01-31'
```

Use the `.sel(time=slice())` method to subset the SSTA and the SLP anomalies to the time interval specified by `start_time` and `end_time` and then calculate the mean using the `.mean()` method. For example:

```
ENSO_SSTA = anom_SST.sel(time=slice(start_time, end_time)).mean(dim='time')

```
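To confirm which months a given slice selects, you can test it on a synthetic series first. In this sketch the DataArray `demo` is made up (its value is simply a month counter starting at 0 in January 1982), so the selected months and their mean are easy to verify by hand.

```python
import numpy as np
import pandas as pd
import xarray as xr

# synthetic monthly series: the value is the month counter 0, 1, 2, ...
time = pd.date_range("1982-01-01", "1983-12-01", freq="MS")
demo = xr.DataArray(np.arange(time.size, dtype=float), coords={"time": time}, dims="time")

start_time = "1982-11-01"
end_time = "1983-01-31"

# label-based slicing includes both end points, so this keeps
# November 1982, December 1982, and January 1983
subset = demo.sel(time=slice(start_time, end_time))
event_mean = subset.mean(dim="time")

print(subset.time.dt.strftime("%Y-%m").values)
print(float(event_mean))
```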

**<span style='color:Red'> Insert a cell below (`+`)</span>**. Define the `start_time` and `end_time` variables and subset the data to that time interval. Start with the 1982-83 El Niño: November 1982 through January 1983 (`1982-11-01` to `1983-01-31`).

Now generate Pacific-centered maps of mean SSTA and SLP anomalies for the selected ENSO event. Specify a colormap appropriate for anomalies and ensure that the colorbar is centered on zero (no change). Adjust the colorbar levels such that your final maps are well-formatted and visually appealing.

Mask all the land masses with another color for the SSTA map only. Display all the data for the SLP anomaly map.

Save your SSTA figure using the following code: 
```
fig.savefig('Lab3_ERA5_sst_anom_' + start_time + '_' + end_time + '.png', facecolor = 'white', transparent = False, bbox_inches ='tight')
```

Save your SLP anom figure using the following code: 
```
fig.savefig('Lab3_ERA5_slp_anom_' + start_time + '_' + end_time + '.png', facecolor = 'white', transparent = False, bbox_inches ='tight')
```

**[8pts, 4 pts per variable] <span style='color:Red'> Insert a cell(s) below (`+`)</span>** and generate your two anomaly maps. 

## Part 7. Synthesis questions

Insert a Markdown cell below and type your short-answer responses. Please save your figures, add them to a Word document, and upload the file to Canvas as a PDF. 

1. [1pt] What is a climatology? Describe.
2. [2pts] What is a climatology-removed anomaly? Describe.
3. [0.5pt] What does a 5-month running mean do to the time series?
4. [3pts] For the 1982-83 El Niño, please describe the global SST and SLP anomaly patterns you observe. 
5. [4pts] How may a negative SLP anomaly impact other atmospheric properties we've learned about in class? What about a positive SLP anomaly? 
6. [4.5pts, 1.5 each] Re-run the code to generate your anomaly maps for *at least* 3 other El Niño events. You should ONLY change the start and ending time periods. Do not copy and paste any mapping code. How do the patterns compare? Please add your figures to a Word document. *Hint: If you are having trouble identifying an event you may consult the [NOAA Oceanic Nino Index for inspiration](https://origin.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ONI_v5.php).*
7. [3pts, 1.5 each] Re-run the code to generate your anomaly maps for *at least* 2 La Niña events. How do the patterns compare between different La Niña events? How do they compare to El Niño? Please add your figures to a Word document. 
8. [2pts] How do the magnitudes of the anomalies compare between warm phase and cool phase events? Which tend to yield larger anomalies in the Niño3.4 region: El Niño or La Niña?
9. [5pts] What are some of the global impacts of ENSO? You may need to consult your notes, the assigned readings, or the ENSO blog to answer this question. Please provide at least 4 examples of impacts.

## Congratulations! You completed your final EV333 Python lab! 

**To submit your lab:**
1. Set the selected ENSO event back to the 1982-83 El Niño
2. Run the entire Notebook from the beginning and check that it generates all figures and does not have any errors.
3. Save your Jupyter Notebook and upload it to Canvas with the following file name: *LastName_FirstName_EV333_Lab3.ipynb*
4. Upload the PDF with your figures to Canvas.