# Time Series Analysis: Comparing Two Different Global Models

##### By Amanda Overbye

## Import Libraries

In [1]:
import s3fs
import intake
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature



## Catalog Loading and Choosing Data

In [None]:
# Open catalog and store as 'catalog'
catalog = intake.open_esm_datastore('https://cmip6-pds.s3.amazonaws.com/pangeo-cmip6.json')

In [None]:
# View catalog
catalog

In [None]:
# Convert the catalog to a Pandas dataframe
cat_df = catalog.df

In [None]:
# View unique CMIP6 models (source_id values)
catalog.df.source_id.unique()

In [None]:
# Example catalog search (GFDL models; `res` is overwritten by the search further down)
res = catalog.search(activity_id = ["CMIP", "ScenarioMIP"],
                     source_id = ["GFDL-CM4", "GFDL-ESM4"],
                     experiment_id = ["historical", "ssp370", "ssp585"],
                     table_id = "Amon", 
                     member_id = "r1i1p1f1",
                     variable_id = "tas")
# Display contents of the catalog
display(res.df)

In [None]:
# Project type: CMIP (historical) and ScenarioMIP (future)
activity_ids = ['ScenarioMIP', 'CMIP']

# Models: CanESM5 and CESM2
source_id = ['CanESM5', 'CESM2']

# Experiments: historical, SSP3-7.0, SSP5-8.5
experiment_ids = ['historical', 'ssp370', 'ssp585']

# Ensemble member: r10i1p1f1
member_id = 'r10i1p1f1'

# Data type: monthly atmospheric output
table_id = 'Amon'

# Variable: surface air temperature (tas)
variable_id = 'tas'


In [None]:
# Search catalog based on specified terms
res = catalog.search(activity_id=activity_ids, source_id = source_id, experiment_id=experiment_ids, 
                     member_id=member_id, table_id=table_id, variable_id=variable_id)

# Display data frame associated with results
display(res.df)

In [None]:
# Read in the historical data file for CanESM5
# (row positions follow the order of res.df displayed above)
hist_data = xr.open_zarr(res.df["zstore"][1], storage_options = {"anon": True})

# Read in the SSP5-8.5 (high emissions) data file for CanESM5
ssp585_data = xr.open_zarr(res.df["zstore"][2], storage_options = {"anon": True})

# Read in the SSP3-7.0 (medium-to-high emissions) data file for CanESM5
ssp370_data = xr.open_zarr(res.df["zstore"][3], storage_options = {"anon": True})
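
The positional indices used above depend on the row order of `res.df`, which could change if the catalog is updated. As a minimal sketch, the store path could instead be looked up by model and experiment. The helper name `zstore_for` is hypothetical, and the toy DataFrame below only stands in for `res.df` with placeholder paths.

```python
import pandas as pd

def zstore_for(df, source_id, experiment_id):
    """Return the single zstore path for a model/experiment pair (hypothetical helper)."""
    match = df[(df["source_id"] == source_id) & (df["experiment_id"] == experiment_id)]
    if len(match) != 1:
        raise ValueError(
            f"expected exactly one match for {source_id}/{experiment_id}, found {len(match)}"
        )
    return match["zstore"].iloc[0]

# Toy stand-in for res.df; the paths are placeholders, not real catalog entries
toy_df = pd.DataFrame({
    "source_id": ["CESM2", "CanESM5", "CanESM5"],
    "experiment_id": ["historical", "historical", "ssp585"],
    "zstore": ["path/a", "path/b", "path/c"],
})

print(zstore_for(toy_df, "CanESM5", "ssp585"))
```

With the real search results, the same call would look like `zstore_for(res.df, "CanESM5", "historical")`, and the returned path could be passed to `xr.open_zarr` as before.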

## Choosing Geographic Area
For my area of interest, I chose the Southeast United States. The Southeast has a unique mix of variable temperatures, human activities, and diverse ecology. Additionally, I used to live there, and I would like to see how these models project the impact of different emission scenarios on the area. The Southeast is generally classified as a temperate forest zone; however, it also includes coastal plains and part of the Appalachian mountain range. Weather-wise, the Southeast is warm, rainy, and humid. Winters tend to be mild, but spring and fall can bring severe storms. I am focusing on surface air temperature because the region is known for its warmth.

## Choosing Models

For this project, I use the Community Earth System Model version 2 (CESM2) and the Canadian Earth System Model version 5 (CanESM5). The first thing I looked for when choosing the models was which emission scenarios were available. Both models provide ssp370 and ssp585. Additionally, I appreciate that CanESM5 includes the Canadian Terrestrial Ecosystem Model (CTEM). I thought that would be useful for this region considering how ecologically rich the Southeast is, particularly in the Great Smoky Mountains. CESM2 seems to be more focused on abiotic factors but does take anthropogenic land use into account. I thought it would be interesting to compare the two models based on these differences.

## Choosing Scenarios

I am using the ssp370 and ssp585 experiments. SSP5-8.5 (ssp585) represents the scenario with the highest emissions. SSP3-7.0 (ssp370) represents medium-to-high emissions and warming. I chose to look at the more severe end of the spectrum because I think it is a more probable outcome at this point than the milder scenarios.

In [None]:
# Define area of interest
lat_min, lat_max = 25.0, 37.5 
lon_min, lon_max = 265.0, 285.0 
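
The longitude bounds above are written in degrees east on a 0 to 360 scale, which is the convention these model grids use, so 265 and 285 correspond to 95°W and 75°W. A small sketch of converting between the two conventions; the function names `to_360` and `to_180` are hypothetical.

```python
def to_360(lon_deg):
    """Convert a longitude from the -180..180 convention to 0..360 degrees east."""
    return lon_deg % 360

def to_180(lon_deg):
    """Convert a longitude from 0..360 degrees east to the -180..180 convention."""
    return ((lon_deg + 180) % 360) - 180

# 95 W and 75 W expressed in degrees east
print(to_360(-95.0), to_360(-75.0))

# The bounding-box longitudes expressed as negative-west values
print(to_180(265.0), to_180(285.0))
```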

### Plot Area of Interest

In [None]:
# Create plot to ensure the coordinates are where I want 
fig = plt.figure(figsize=(10, 6))
ax = plt.axes(projection=ccrs.PlateCarree(central_longitude=180)) 

# Add features
ax.set_extent([lon_min, lon_max, lat_min, lat_max], crs=ccrs.PlateCarree())
ax.coastlines()
ax.add_feature(cfeature.BORDERS)
ax.add_feature(cfeature.STATES, linestyle=':')

# Add the bounding box
ax.plot([lon_min, lon_max, lon_max, lon_min, lon_min],
        [lat_min, lat_min, lat_max, lat_max, lat_min],
        transform=ccrs.PlateCarree(),
        color='Black', linewidth=2, label='Southeast U.S.')

## Prepare CanESM5 Model Data

In [None]:
# Merge historical and future projection data
canesm5_585_data = xr.concat([hist_data, ssp585_data], dim="time")
canesm5_370_data = xr.concat([hist_data, ssp370_data], dim="time")

# Convert time to datetime64 format (note: `time` is not used in the steps below)
time = canesm5_585_data.time.astype('datetime64[ns]')
time = canesm5_370_data.time.astype('datetime64[ns]')

# Define logical mask for Southeast U.S.
tas_SE_lat = (canesm5_585_data.lat >= lat_min) & (canesm5_585_data.lat <= lat_max)
tas_SE_lon = (canesm5_585_data.lon >= lon_min) & (canesm5_585_data.lon <= lon_max)
tas_SE_370_lat = (canesm5_370_data.lat >= lat_min) & (canesm5_370_data.lat <= lat_max)
tas_SE_370_lon = (canesm5_370_data.lon >= lon_min) & (canesm5_370_data.lon <= lon_max)

# Apply mask and average
tas_SE_585 = canesm5_585_data.where(tas_SE_lat & tas_SE_lon, drop=True).mean(dim=["lat", "lon"])
tas_SE_370 = canesm5_370_data.where(tas_SE_370_lat & tas_SE_370_lon, drop=True).mean(dim=["lat", "lon"])

# Calculate annual mean temperature
canesm5_mean_ssp370 = tas_SE_370.groupby("time.year").mean()
canesm5_mean_ssp585 = tas_SE_585.groupby("time.year").mean()

# Convert to Celsius
canesm5_mean_ssp370 = canesm5_mean_ssp370 - 273.15
canesm5_mean_ssp585 = canesm5_mean_ssp585 - 273.15
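
The spatial mean above gives every grid cell in the box equal weight. On a regular latitude-longitude grid, cells cover less area at higher latitudes, so a common refinement is to weight by the cosine of latitude. The following is a sketch of that alternative using xarray's `weighted`, not the method used in this notebook; `weighted_box_mean` is a hypothetical helper, and the synthetic field stands in for the model output.

```python
import numpy as np
import xarray as xr

def weighted_box_mean(da, lat_min, lat_max, lon_min, lon_max):
    """Cosine-latitude weighted mean of `da` over a lat/lon box (hypothetical helper)."""
    # Keep only grid cells inside the bounding box
    box = da.where(
        (da.lat >= lat_min) & (da.lat <= lat_max)
        & (da.lon >= lon_min) & (da.lon <= lon_max),
        drop=True,
    )
    # Weights proportional to grid-cell area on a regular lat/lon grid
    weights = np.cos(np.deg2rad(box.lat))
    return box.weighted(weights).mean(dim=["lat", "lon"])

# Synthetic 5-degree grid with a constant field of 290 K (illustrative only)
lat = np.arange(-87.5, 90.0, 5.0)
lon = np.arange(0.0, 360.0, 5.0)
toy = xr.DataArray(
    np.full((lat.size, lon.size), 290.0),
    coords={"lat": lat, "lon": lon},
    dims=["lat", "lon"],
)

result = weighted_box_mean(toy, 25.0, 37.5, 265.0, 285.0)
print(float(result))
```

For a box this small the weighted and unweighted means are likely to be close, but the difference grows for regions that span a wide range of latitudes.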

## Create plots for CanESM5 Model Data

In [None]:
# Create figure
fig, ax = plt.subplots(figsize=(20, 8))

# Plot SSP585 (high emissions)
ax.plot(canesm5_mean_ssp585.year, canesm5_mean_ssp585.tas, 
        label='SSP585 (Temperature)', color='#990000')

# Titles and axis labels
ax.set_title("Projected Surface Air Temperature (SSP585 - High Emissions) in Southeastern U.S. (1850–2100) under CanESM5 Model", fontsize=20)
ax.set_xlabel("Year", fontsize=20)
ax.set_ylabel("Temperature (°C)", fontsize=20)

# Add legend and grid
ax.legend(fontsize=20)
ax.grid()

# Show plot
plt.show()

In [None]:
# Create figure
fig, ax = plt.subplots(figsize=(20, 8))

# Plot SSP370 (moderate emissions)
ax.plot(canesm5_mean_ssp370.year, canesm5_mean_ssp370.tas, 
        label='Temperature', color='#4F4789')

# Titles and axis labels
ax.set_title("Projected Surface Air Temperature (SSP370 - Moderate Emissions) in Southeastern U.S. (1850–2100) under CanESM5 Model", fontsize=20)
ax.set_xlabel("Year", fontsize=20)
ax.set_ylabel("Temperature (°C)", fontsize=20)

# Add legend and grid
ax.legend(fontsize=20)
ax.grid()

# Show plot
plt.show()

In [None]:
# Create figure
fig, ax = plt.subplots(figsize=(14, 7))

# Plot SSP370 (moderate emissions)
ax.plot(canesm5_mean_ssp370.year, canesm5_mean_ssp370.tas, 
        label="SSP370 (Moderate Emissions)", color="#4F4789")

# Plot SSP585 (high emissions)
ax.plot(canesm5_mean_ssp585.year, canesm5_mean_ssp585.tas, 
        label="SSP585 (High Emissions)", color="#990000")

# Add major and minor grid lines
ax.grid(which='major', color='lightgray', linewidth=0.8)
ax.grid(which='minor', color='lightgray', linestyle=':', linewidth=0.5)

# Titles and axis labels
ax.set_title("Projected Surface Air Temperature in Southeastern U.S. (1850–2100) under CanESM5 Model", fontsize=20)
ax.set_xlabel("Year", fontsize=16)
ax.set_ylabel("Temperature (°C)", fontsize=16)

# Add legend
ax.legend(fontsize=14)

# Show plot
plt.show()

## Prepare CESM2 Model Data

In [None]:
# Read historical and future scenario data for CESM2
hist_data = xr.open_zarr(res.df["zstore"][0], storage_options={"anon": True})
ssp370_data = xr.open_zarr(res.df["zstore"][4], storage_options={"anon": True})
ssp585_data = xr.open_zarr(res.df["zstore"][5], storage_options={"anon": True})

# Merge historical with future projection data
cesm2_370_data = xr.concat([hist_data, ssp370_data], dim="time")
cesm2_585_data = xr.concat([hist_data, ssp585_data], dim="time")

# Convert time to datetime64 format (note: `time` is not used in the steps below)
time = cesm2_585_data.time.astype('datetime64[ns]')

# Define logical mask for Southeast U.S.
tas_SE_lat = (cesm2_585_data.lat >= lat_min) & (cesm2_585_data.lat <= lat_max)
tas_SE_lon = (cesm2_585_data.lon >= lon_min) & (cesm2_585_data.lon <= lon_max)
tas_SE_370_lat = (cesm2_370_data.lat >= lat_min) & (cesm2_370_data.lat <= lat_max)
tas_SE_370_lon = (cesm2_370_data.lon >= lon_min) & (cesm2_370_data.lon <= lon_max)

# Apply mask and average spatially
tas_SE_585 = cesm2_585_data.where(tas_SE_lat & tas_SE_lon, drop=True).mean(dim=["lat", "lon"])
tas_SE_370 = cesm2_370_data.where(tas_SE_370_lat & tas_SE_370_lon, drop=True).mean(dim=["lat", "lon"])

# Calculate annual mean temperature
cesm2_mean_ssp370 = tas_SE_370.groupby("time.year").mean()
cesm2_mean_ssp585 = tas_SE_585.groupby("time.year").mean()

# Convert from Kelvin to Celsius
cesm2_mean_ssp370 = cesm2_mean_ssp370 - 273.15
cesm2_mean_ssp585 = cesm2_mean_ssp585 - 273.15

## Create plots for CESM2 Model Data

In [None]:
# Create figure
fig, ax = plt.subplots(figsize=(14, 7))

# Plot SSP370 (moderate emissions)
ax.plot(
    cesm2_mean_ssp370.year,
    cesm2_mean_ssp370["tas"],
    label="SSP370 (Moderate Emissions)",
    color="#4F4789"
)

# Plot SSP585 (high emissions)
ax.plot(
    cesm2_mean_ssp585.year,
    cesm2_mean_ssp585["tas"],
    label="SSP585 (High Emissions)",
    color="#990000"
)

# Add grid lines
ax.grid(which='major', color='lightgray', linewidth=0.8)
ax.grid(which='minor', color='lightgray', linestyle=':', linewidth=0.5)

# Add title and axis labels
ax.set_title(
    "Projected Surface Air Temperature in the Southeastern U.S. (1850–2100) under CESM2 Model",
    fontsize=20
)
ax.set_xlabel("Year", fontsize=16)
ax.set_ylabel("Temperature (°C)", fontsize=16)

# Add a legend
ax.legend(fontsize=14)

# Show plot
plt.show()

## Comparing Results

Both graphs show a similar shape, indicating when temperatures have risen and are expected to rise. CESM2 shows higher peaks and lower valleys than CanESM5. Another difference is the gap between the two emission scenarios: CESM2 shows a little over a 2 °C difference, whereas CanESM5 shows less than a 1 °C difference. Comparing the ssp585 scenarios, CanESM5 projects a higher temperature (over 28 °C) than CESM2 (under 27 °C). Both models show a very concerning climate trend, which is not unexpected but is still distressing. I am surprised that CanESM5 showed more warming; I thought its slightly more ecological focus might make it a bit more optimistic. However, ecosystems across the world are in decline, so I can see why this would make sense.
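
The scenario gaps quoted above were read from the plots. As a rough sketch, the same gap could be computed from annual-mean series indexed by `year`. The 2081 to 2100 averaging window and the helper name `late_century_gap` are assumptions, and the synthetic series below only stand in for the real model output.

```python
import numpy as np
import xarray as xr

def late_century_gap(high, low, start=2081, end=2100):
    """Difference in mean temperature between two scenarios over a late-century window."""
    window = slice(start, end)  # label-based selection, inclusive of both ends
    return float(high.sel(year=window).mean() - low.sel(year=window).mean())

# Synthetic annual series in degrees Celsius (illustrative values only)
years = np.arange(1850, 2101)
low = xr.DataArray(
    np.linspace(18.0, 22.0, years.size),
    coords={"year": years},
    dims="year",
)
high = low + 1.5  # constant offset standing in for a higher-emissions scenario

print(late_century_gap(high, low))
```

With the notebook's variables, the call would look like `late_century_gap(cesm2_mean_ssp585.tas, cesm2_mean_ssp370.tas)`, and likewise for the CanESM5 series.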

## Impact on the Region

The impact of this warming will be severe for both ecosystems and the people in the region. Heat waves could create major public health problems, as the area has some of the highest poverty rates in the US. Rising energy costs will make it difficult for people to afford air conditioning, which can have dire consequences in this region. With higher temperatures, increased flooding is also likely. This will be devastating to areas still trying to recover from previous floods. Additionally, the high poverty rates will make it difficult for residents to rebuild.

The Southeast is one of the most biodiverse places in the United States. The historically warm and wet weather of this region has made it one of the most favorable for amphibians. The Great Smoky Mountains are considered to be the "Salamander Capital of the World." Because amphibians are ectotherms, they rely on stable temperatures for survival. Increasing temperatures will likely lead to species loss and extinction.

Lastly, the Southeast is a largely agricultural area. Higher temperatures could lead to increases in crop disease, plant stress, and severe weather. This will likely harm food production and negatively impact supply chains across the US. 