# Homework 2 - Geospatial Analysis 

### EDS 296 

May 11, 2025 

Jordan Sibley 

## Accessing CMIP6 data 

1. Using the CMIP6 database hosted on Amazon Web Services, choose any two models you like: use both their historical simulations and future projections from one of the four major SSPs (ssp126, ssp245, ssp370, or ssp585). **Provide a brief description of the models and scenarios you chose to include.**

2. Access information from your chosen models and scenario, for any climate variable you like; however, note that three-dimensional data is generally larger and can be slower to load, so I recommend either choosing a two-dimensional data field or reading in only the surface level if you choose 3D information. Some common choices of variables to analyze, and their CMIP names, include:
- Surface air temperature (tas)
- Precipitation (pr)
- Sea surface temperature (tos)

### Set up 

In [1]:
# Import libraries 
import xarray as xr
import matplotlib.pyplot as plt
import intake
import numpy as np
import pandas as pd
import cartopy.crs as ccrs
import cartopy.feature as cfeature

ERROR 1: PROJ: proj_create_from_database: Open of /opt/anaconda3/envs/eds296-stevenson/share/proj failed


In [2]:
# Open the CMIP6 data catalog, store as a variable
catalog = intake.open_esm_datastore('https://cmip6-pds.s3.amazonaws.com/pangeo-cmip6.json')

### Query the Database

**CMIP6 Models**:
- CanESM5 
- MIROC6


I chose the Canadian Earth System Model (CanESM5) and the Model for Interdisciplinary Research on Climate (MIROC6) because both perform well in simulating North American air temperature, and both provide historical simulations as well as future projections. 


**Emission Scenario**: SSP3-7.0 (`ssp370`)

SSP3-7.0 is a medium to high-end climate scenario within the Shared Socioeconomic Pathways (SSPs) framework. I chose this emission scenario because it is modeled after a future marked by regional competition and conflict, in which no additional climate policies are implemented ([Climatedata.ca](https://climatedata.ca/resource/understanding-shared-socio-economic-pathways-ssps/)). Given the current US administration, this kind of future seems plausible, so I was interested in exploring a more extreme emission scenario such as this one. 

In [3]:
# Query catalog for my two models 
# Choose historical and projected activities: CMIP = historical data, ScenarioMIP = future projections
activity_ids = ['ScenarioMIP', 'CMIP']

# Select two models 
source_id = ['CanESM5', 'MIROC6']

# Select historical and ssp370 experimental configurations  
experiment_ids = ['historical', 'ssp370']

# Choose ensemble member id (starting conditions, internal variation) 
member_id = 'r1i1p1f1'

# Choose monthly time resolution
table_id = 'Amon'

# Select air temperature (tas) for the environmental variable
variable_id = 'tas'

In [4]:
# Search through catalog, store results in "res" variable
res = catalog.search(activity_id=activity_ids, source_id=source_id, experiment_id=experiment_ids, 
                     member_id=member_id, table_id=table_id, variable_id=variable_id)

# Display data frame associated with results
display(res.df)

Unnamed: 0,activity_id,institution_id,source_id,experiment_id,member_id,table_id,variable_id,grid_label,zstore,dcpp_init_year,version
0,CMIP,MIROC,MIROC6,historical,r1i1p1f1,Amon,tas,gn,s3://cmip6-pds/CMIP6/CMIP/MIROC/MIROC6/histori...,,20181212
1,ScenarioMIP,CCCma,CanESM5,ssp370,r1i1p1f1,Amon,tas,gn,s3://cmip6-pds/CMIP6/ScenarioMIP/CCCma/CanESM5...,,20190429
2,CMIP,CCCma,CanESM5,historical,r1i1p1f1,Amon,tas,gn,s3://cmip6-pds/CMIP6/CMIP/CCCma/CanESM5/histor...,,20190429
3,ScenarioMIP,MIROC,MIROC6,ssp370,r1i1p1f1,Amon,tas,gn,s3://cmip6-pds/CMIP6/ScenarioMIP/MIROC/MIROC6/...,,20190627


In [5]:
# Read in data and store as xarray object (MIROC)
hist_miroc = xr.open_zarr(res.df['zstore'][0], storage_options={'anon': True})
ssp_miroc = xr.open_zarr(res.df['zstore'][3], storage_options={'anon': True})

# Read in data and store as xarray object (CanESM5)
hist_canESM = xr.open_zarr(res.df['zstore'][2], storage_options={'anon': True})
ssp_canESM = xr.open_zarr(res.df['zstore'][1], storage_options={'anon': True})
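The cell above picks each store by its row position in `res.df`, which only works while the catalog returns rows in the order shown in the table. A minimal sketch of a label-based lookup follows, assuming the data frame has the `source_id`, `experiment_id`, and `zstore` columns shown above. The helper name `find_zstore` and the toy paths are hypothetical, made up for illustration.

```python
import pandas as pd


def find_zstore(df, source_id, experiment_id):
    """Return the single zstore entry matching a model and experiment.

    Hypothetical helper: raises if the match is missing or ambiguous,
    so a changed row order or an extra version cannot go unnoticed.
    """
    match = df[(df["source_id"] == source_id) & (df["experiment_id"] == experiment_id)]
    if len(match) != 1:
        raise ValueError(
            f"expected exactly one match for {source_id}/{experiment_id}, found {len(match)}"
        )
    return match["zstore"].iloc[0]


# Toy stand-in for res.df; the paths are placeholders, not real store URLs
toy_df = pd.DataFrame({
    "source_id": ["MIROC6", "CanESM5", "CanESM5", "MIROC6"],
    "experiment_id": ["historical", "ssp370", "historical", "ssp370"],
    "zstore": ["path/miroc_hist", "path/canesm_ssp", "path/canesm_hist", "path/miroc_ssp"],
})

print(find_zstore(toy_df, "CanESM5", "historical"))
```

With the real catalog, the returned string could be passed to `xr.open_zarr` in place of `res.df['zstore'][2]`.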

In [7]:
# Concatenate historical and future projection data
# MIROC model
miroc_tas = xr.concat([hist_miroc, ssp_miroc], dim="time")

# CanESM5 model 
canESM_tas = xr.concat([hist_canESM, ssp_canESM], dim="time")


# Convert time to datetime64 format (one variable per model so neither is overwritten)
time_miroc = miroc_tas.time.astype('datetime64[ns]')
time_canESM = canESM_tas.time.astype('datetime64[ns]')

## Choosing a study region 

3. Choose a region that you’re interested in to analyze, anywhere in the world. This should be a region that’s fairly large - think, the size of a large country or a sizable fraction of a continent. As you did for HW1, describe in markdown text some aspects of the climate of your region: **what are the interesting features there, and how might you expect that climate change would impact the area?**

The study area I have chosen is a section of North America that includes Canada, the United States, and Mexico. Climate varies widely across this large region, with cooler average temperatures in the north (Canada) and warmer average temperatures in the south (Mexico). As temperatures rise with climate change, the whole region is likely to experience much higher average temperatures, including potentially dangerous air temperatures in parts of the US and Mexico, which would be detrimental to human and animal life. 
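The placeholder longitude bounds in the next cell (285.5 to 287) are on a 0 to 360 degrees-east scale, which suggests the model grids use that convention rather than -180 to 180. A small sketch of converting between the two conventions may help when picking real bounds for North America from a map. The function names `to_360` and `to_180` are hypothetical, and the assumption about the grid convention should be checked against `miroc_tas.lon` and `canESM_tas.lon`.

```python
def to_360(lon_deg):
    """Convert a longitude in degrees east from [-180, 180] to [0, 360)."""
    return lon_deg % 360


def to_180(lon_deg):
    """Convert a longitude in degrees east from [0, 360) to [-180, 180)."""
    return ((lon_deg + 180) % 360) - 180


# The placeholder bounds correspond to roughly 74.5W and 73W
print(to_180(285.5), to_180(287))
print(to_360(-74.5), to_360(-73))
```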

In [None]:
# Define min/max bounds for region of North America (NOT CORRECT YET)
lat_min, lat_max = 40, 41.5
lon_min, lon_max = 285.5, 287


# Define logical mask: True when lat/lon inside the valid ranges, False elsewhere
miroc_NA_lat = (miroc_tas.lat >= lat_min) & (miroc_tas.lat <= lat_max)
miroc_NA_lon = (miroc_tas.lon >= lon_min) & (miroc_tas.lon <= lon_max)

# Find points where the mask value is True, drop all other points
miroc_tas_NA = miroc_tas.where(miroc_NA_lat & miroc_NA_lon, drop=True)

# Average over lat, lon dimensions to get a time series
miroc_tas_NA = miroc_tas_NA.mean(dim=["lat", "lon"])



# Define logical mask: True when lat/lon inside the valid ranges, False elsewhere
canESM_NA_lat = (canESM_tas.lat >= lat_min) & (canESM_tas.lat <= lat_max)
canESM_NA_lon = (canESM_tas.lon >= lon_min) & (canESM_tas.lon <= lon_max)

# Find points where the mask value is True, drop all other points
canESM_tas_NA = canESM_tas.where(canESM_NA_lat & canESM_NA_lon, drop=True)

# Average over lat, lon dimensions to get a time series
canESM_tas_NA = canESM_tas_NA.mean(dim=["lat", "lon"])
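The regional averages above weight every grid cell equally. On a regular latitude-longitude grid, cells shrink toward the poles, so a region that spans Mexico to northern Canada may be biased toward its northern cells. A minimal sketch of a cosine-latitude weighted mean follows. It uses plain NumPy on a toy field and assumes a regular grid with no missing values; the function name `area_weighted_mean` is hypothetical.

```python
import numpy as np


def area_weighted_mean(field, lat_deg):
    """Average a (lat, lon) field using cos(latitude) as the area weight.

    Assumes a regular lat/lon grid and no NaNs in `field`.
    """
    weights = np.cos(np.deg2rad(lat_deg))   # one weight per latitude row
    zonal_mean = field.mean(axis=1)         # average over longitude first
    return np.sum(zonal_mean * weights) / np.sum(weights)


# Toy example: two latitude rows, two longitude columns
lat = np.array([0.0, 60.0])
field = np.array([[10.0, 10.0],
                  [20.0, 20.0]])

print(area_weighted_mean(field, lat))  # lower than the unweighted mean of 15
```

In xarray, a comparable result can be obtained with `DataArray.weighted(weights).mean(dim=["lat", "lon"])`, where `weights` is `np.cos(np.deg2rad(da.lat))`.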

## Choosing two time periods to compare

4. Choose two separate time periods, each 30-50 years in length, and describe briefly why you chose these periods: then make some maps of the time average of your selected variable.
- Map the average over each time period separately
- Map the difference in the averages between the two time periods (note: make sure to label which time period you subtracted from which!)

For both your sets of maps, display some relevant political/geographic boundaries overlaid on the region: we saw some examples of how to do this using the Cartopy “feature” toolbox in the mapping tutorials.
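The time-averaging and differencing steps described above can be sketched as follows. This is a sketch on synthetic data with plain NumPy, not the notebook's actual analysis: the two periods (1981-2010 and 2071-2100), the function name `period_mean`, and the linear warming trend are all assumptions chosen for illustration.

```python
import numpy as np


def period_mean(data, years, start_year, end_year):
    """Time-average a (time, lat, lon) array over start_year..end_year inclusive."""
    in_period = (years >= start_year) & (years <= end_year)
    if not in_period.any():
        raise ValueError("no time steps fall inside the requested period")
    return data[in_period].mean(axis=0)


# Synthetic annual data on a 2 x 3 grid, warming by 0.01 K per year
years = np.arange(1850, 2101)
data = 280.0 + 0.01 * (years - 1850)[:, None, None] * np.ones((1, 2, 3))

early = period_mean(data, years, 1981, 2010)   # hypothetical baseline period
late = period_mean(data, years, 2071, 2100)    # hypothetical future period

# Difference is future minus baseline, so positive values mean warming
diff = late - early
print(diff)
```

With the xarray objects in this notebook, the same selection can be written as `.sel(time=slice("1981", "2010")).mean(dim="time")`. When mapping the result on a Cartopy axis, boundaries can be overlaid with calls such as `ax.add_feature(cfeature.BORDERS)` and `ax.coastlines()`.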

## Results 

5. Write 1-2 paragraphs in markdown text describing the results of your plot, and what you think they might mean for humans or ecosystems located in your study region. 

INSERT TEXT HERE 