# CSS 120: Environmental Data Science

## Carbon Cycles, Feedback Loops, and Greenhouse Effect

### Umberto Mignozzetti (UCSD)

(Based on [ClimateMatch Academy](https://comptools.climatematch.io/tutorials/W1D1_ClimateSystemOverview/student/W1D1_Tutorial7.html))

# Packages

In [None]:
# imports
from datetime import timedelta
import numpy as np
import pandas as pd
import xarray as xr
from matplotlib import pyplot as plt
from pythia_datasets import DATASETS

##  Some Environmental Sciences

### Carbon Cycles

![](l14img01.png)

##  Some Environmental Sciences

### Carbon Cycles

![](l14img02.png)

##  Some Environmental Sciences

### Carbon Cycles

![](l14img03.png)

##  Some Environmental Sciences

### Feedback Loops

![](l14img04.png)

##  Some Environmental Sciences

![](l14img05.png)

# Data Resolution and Resampling Methods

To study these phenomena, we often need to change the resolution of the data. `xarray` offers three methods for this:

1. `resample`: Useful for temporal upsampling and downsampling
    + E.g.: Go from hourly to every six hours<br><br>

1. `rolling`: Useful for aggregations on moving windows
    + E.g.: Moving averages (smooth out short-term fluctuations and highlight longer-term trends or cycles)<br><br>

1. `coarsen`: Downsample the data by aggregating fixed-size blocks along one or more dimensions
    + E.g.: Block means for a given window<br><br>
    
Each strategy affects the data differently and suits different problems. Let us see each of them in practice.
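Before turning to the real data, here is a minimal sketch of the three methods on a synthetic monthly series. The series `toy` and its values are made up for illustration; only the `resample`, `rolling`, and `coarsen` calls mirror what the lecture uses.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly series: 24 months with values 0, 1, ..., 23 (illustrative only).
time = pd.date_range("2000-01-01", periods=24, freq="MS")
toy = xr.DataArray(np.arange(24.0), coords={"time": time}, dims="time", name="toy")

# resample: group by calendar year, then aggregate each group.
yearly = toy.resample(time="YS").mean()

# rolling: 6-month moving window; the output keeps the original length.
smooth = toy.rolling(time=6, center=True).mean()

# coarsen: non-overlapping blocks of 4 consecutive time steps.
blocks = toy.coarsen(time=4).mean()

print(yearly.sizes["time"], smooth.sizes["time"], blocks.sizes["time"])
```

The three results have 2, 24, and 6 time steps respectively: `resample` and `coarsen` shorten the series, while `rolling` keeps its length.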

# Data Resolution and Resampling Methods

In [None]:
filepath = DATASETS.fetch("CESM2_sst_data.nc")
ds = xr.open_dataset(filepath)
ds

# Resample

The data are monthly. We can downsample them to yearly resolution by grouping on the start of each year (`"YS"`):

In [None]:
tos_yearly = ds.tos.resample(time="YS")
tos_yearly

# Resample

Taking the mean of each yearly group, and then averaging over latitude and longitude, gives the annual global mean:

In [None]:
annual_mean = tos_yearly.mean()
annual_mean_global = annual_mean.mean(dim=["lat", "lon"])
annual_mean_global.plot()
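Note that `mean(dim=["lat", "lon"])` gives every grid cell the same weight. On a regular latitude-longitude grid, cells near the poles cover less area than cells near the equator, so one common refinement is to weight by the cosine of latitude. The sketch below shows this on a synthetic field; the grid, the values, and the names `field` and `weights` are assumptions for illustration, not part of the lecture data.

```python
import numpy as np
import xarray as xr

# Synthetic field on a regular 10-degree grid (illustrative only):
# values decrease from the equator toward the poles.
lat = np.arange(-85.0, 90.0, 10.0)
lon = np.arange(5.0, 360.0, 10.0)
field = xr.DataArray(
    30.0 * np.cos(np.deg2rad(lat))[:, None] * np.ones(lon.size)[None, :],
    coords={"lat": lat, "lon": lon},
    dims=("lat", "lon"),
)

# Unweighted mean: every grid cell counts equally.
plain_mean = field.mean(dim=["lat", "lon"])

# Area-weighted mean: weight each latitude band by cos(latitude).
weights = np.cos(np.deg2rad(field.lat))
weights.name = "weights"
weighted_mean = field.weighted(weights).mean(dim=["lat", "lon"])

print(float(plain_mean), float(weighted_mean))
```

In this toy field the weighted mean is larger than the plain mean, because the warm equatorial cells count for more once area is taken into account.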

# Rolling

If you remember COVID, the daily data showed lots of variability. This is why the numbers we received were usually reported as moving averages.

We can do the same with our data, for example by computing moving averages over six-month windows.

In [None]:
tos_m_avg = ds.tos.rolling(time=6, center=True).mean()
tos_m_avg
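A rolling mean cannot fill a full window at the start and end of the series, so by default those positions come back as missing values. The sketch below, on a made-up 12-month series, shows this and one possible workaround using the `min_periods` argument of `rolling`.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly series with values 0, 1, ..., 11 (illustrative only).
time = pd.date_range("2000-01-01", periods=12, freq="MS")
toy = xr.DataArray(np.arange(12.0), coords={"time": time}, dims="time")

# Default: a value is produced only when all 6 points of the window exist.
strict = toy.rolling(time=6, center=True).mean()

# min_periods=1: average whatever points are available near the edges.
relaxed = toy.rolling(time=6, center=True, min_periods=1).mean()

print(int(strict.isnull().sum()), int(relaxed.isnull().sum()))
```

Whether to relax `min_periods` is a judgment call: the edge values are then averages over fewer months and are noisier than the rest.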

# Rolling

In [None]:
tos_m_avg_global = tos_m_avg.mean(dim=["lat", "lon"])
tos_m_avg_global.plot()

# Coarsening

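Coarsening works like block aggregation along one or more dimensions.

For instance, let us average over blocks of four months. Setting the latitude and longitude windows to the full size of each dimension also collapses the whole grid into a single global block: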

In [None]:
coarse_data = ds.coarsen(time=4, lat=len(ds.lat), lon=len(ds.lon)).mean()
coarse_data
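By default, `coarsen` expects each dimension to split evenly into blocks. The following sketch uses a made-up series of 10 steps to show what happens when it does not, and how the `boundary` argument can handle the leftover steps.

```python
import numpy as np
import xarray as xr

# 10 time steps cannot be split evenly into blocks of 4 (illustrative data).
toy = xr.DataArray(
    np.arange(10.0), coords={"time": np.arange(10.0)}, dims="time"
)

# boundary="exact" (the default) raises an error for an uneven split.
try:
    toy.coarsen(time=4).mean()
    raised = False
except ValueError:
    raised = True

# boundary="trim" drops the leftover steps at the end.
trimmed = toy.coarsen(time=4, boundary="trim").mean()

print(raised, trimmed.values)
```

Here the last two steps are discarded and two block means remain. If dropping data is not acceptable, `boundary="pad"` is the other documented option.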

# Coarsening

And this gives us:

In [None]:
coarse_data.tos.plot()

# Resampling Methods

Plotting the three resampling methods together with the original monthly data gives the following:

In [None]:
# Plot order must match the legend order below
ds.tos.mean(dim=["lat", "lon"]).plot(size=6)
coarse_data.tos.plot()
tos_m_avg_global.plot()
annual_mean_global.plot()

plt.legend(["original data (monthly)", "coarsened (4 months)",
            "moving average (6 months)", "annually resampled (12 months)"])

# Masking Data

Suppose we want to analyze a single time period. We can select it by label using the `sel()` method:

In [None]:
sample = ds.tos.sel(time='2014-09')
sample

# Masking Data

Selecting a period is not the same as masking values within it.

To mask, we use the `.where()` method. Unlike `sel`, which changes the shape of the data, `where` keeps the shape and puts missing values wherever the condition is false.

For instance, suppose we want to keep only the places with temperatures at or below zero degrees Celsius (32 °F, the freezing point of fresh water) and mask everything else.

Here is what we do:

In [None]:
masked_sample = sample.where(sample <= 0.0)
masked_sample
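To see the shape-preserving behavior of `where` on something small enough to inspect by eye, here is a sketch on a made-up 2 x 3 temperature grid. The values and coordinates are illustrative; the `drop=True` option is shown as one way to trim rows and columns that end up fully masked.

```python
import numpy as np
import xarray as xr

# Small synthetic temperature grid in degrees Celsius (illustrative only).
temps = xr.DataArray(
    np.array([[-1.5, 0.5, 3.0], [-0.2, 12.0, 25.0]]),
    coords={"lat": [60.0, 70.0], "lon": [0.0, 10.0, 20.0]},
    dims=("lat", "lon"),
)

# where keeps the shape and writes NaN wherever the condition is False.
frozen = temps.where(temps <= 0.0)

# drop=True additionally removes rows/columns that are entirely masked.
frozen_trimmed = temps.where(temps <= 0.0, drop=True)

print(frozen.shape, int(frozen.notnull().sum()), frozen_trimmed.shape)
```

The masked array still has shape `(2, 3)` with two valid values, while the trimmed version shrinks to `(2, 1)` because only the first longitude column contains values at or below zero.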

# Masking Data

And this is what we get:

In [None]:
fig, axes = plt.subplots(ncols = 2, figsize = (19, 6))
sample.plot(ax = axes[0]); masked_sample.plot(ax = axes[1])

# Masking Data

Sometimes we need to mask on more than one condition.

To do that, we can pass `where` several expressions, each enclosed in parentheses. To combine the expressions, we can use:

- `&` (and)
- `|` (or)
- `~` (not)
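The three operators listed above can be tried on a tiny made-up array first. The name `sst` and its four values are assumptions for illustration.

```python
import numpy as np
import xarray as xr

# Four synthetic temperatures (illustrative only).
sst = xr.DataArray(np.array([22.0, 26.0, 28.0, 31.0]), dims="x")

# Parentheses are required: & and | bind more tightly than comparisons.
warm_band = sst.where((sst > 25) & (sst < 30))       # and
outside_band = sst.where((sst <= 25) | (sst >= 30))  # or
not_warm = sst.where(~(sst > 25))                    # not

print(warm_band.values, outside_band.values, not_warm.values)
```

Leaving out the parentheses, as in `sst > 25 & sst < 30`, does not do what it looks like, because Python evaluates `25 & sst` before the comparisons.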

For example, suppose we are studying [El Niño](https://www.pmel.noaa.gov/elnino/what-is-el-nino).

![](l14img06.png)

# Masking Data

In [None]:
# Select the last time step
sample = ds.tos.isel(time=-1)

# Keep only temperatures strictly between 25 and 30
sample.where((sample > 25) & (sample < 30)).plot(size=6)

# Masking Data

Now let us mask on coordinates instead, isolating the region used to track ENSO (El Niño Southern Oscillation): latitudes between 5°S and 5°N and longitudes between 190°E and 240°E.

In [None]:
sample.where(
    (sample.lat < 5) & (sample.lat > -5) & (sample.lon > 190) & (sample.lon < 240)
).plot(size=6)

# Masking Data

And after zooming in:

In [None]:
sample.where(
    (sample.lat < 5) & (sample.lat > -5) & (sample.lon > 190) & (sample.lon < 240)
).plot(size = 6)
plt.xlim(180,250); plt.ylim(-10,10)

# Masking Data

We can also look at a time series of averages in the region:

In [None]:
# Apply the same latitude-longitude box to every time step
nino = ds.tos.where(
    (ds.lat < 5) & (ds.lat > -5) & (ds.lon > 190) & (ds.lon < 240)
)

nino_mean = nino.mean(dim=["lat", "lon"])
nino_mean
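When the region is a simple rectangle, label-based slicing with `sel` is an alternative to masking. The sketch below compares the two on a synthetic 1-degree grid; the grid layout (cell centers at half degrees, ascending latitude) and the random values are assumptions, and the real dataset may be laid out differently.

```python
import numpy as np
import xarray as xr

# Synthetic 1-degree grid with cell centers at half degrees (illustrative only).
lat = np.arange(-89.5, 90.0, 1.0)
lon = np.arange(0.5, 360.0, 1.0)
rng = np.random.default_rng(0)
field = xr.DataArray(
    rng.normal(size=(lat.size, lon.size)),
    coords={"lat": lat, "lon": lon},
    dims=("lat", "lon"),
)

# Masking keeps the full grid and fills everything outside the box with NaN.
masked = field.where(
    (field.lat < 5) & (field.lat > -5) & (field.lon > 190) & (field.lon < 240)
)

# Label-based slicing returns only the box itself (slice bounds are inclusive).
boxed = field.sel(lat=slice(-5, 5), lon=slice(190, 240))

print(masked.shape, boxed.shape)
print(float(masked.mean()), float(boxed.mean()))
```

Both approaches give the same regional mean on this grid, but `where` keeps the full `(180, 360)` shape while `sel` returns a `(10, 50)` array. If grid points fell exactly on the box edges, the two would differ, since the `where` conditions here are strict and `slice` includes its endpoints.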

# Masking Data

Plotting the regional mean over time gives:

In [None]:
nino_mean.plot()

## Questions?

## See you next class!