<a href="https://colab.research.google.com/github/envgp/taking_the_pulse_of_the_planet/blob/main/notebooks/pulse_assignment_2_part_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Assignment 2 Part 2: Taking the Pulse of the Land Surface - Changes in Precipitation and Temperature
Rosemary Knight (rknight@stanford.edu) & Michael Morphew (mmorphew@stanford.edu), Stanford Environmental Geophysics Group

This assignment is presented in two parts, both due on `2023-1-31`. Please put your answers within this notebook and share the completed notebook with the graders, bsalvado@stanford.edu and mmorphew@stanford.edu, using the Share banner located at the top right corner of this notebook. When sharing your notebook, please change the name of the notebook and add your name and sunetid (e.g., pulse_course_assignment_2_firstname_lastname_sunetid.ipynb).

## INTRODUCTION TO THE ASSIGNMENT
In this assignment, we will explore the changes occurring on the land surface in response to climate change, and consider these changes in the context of sustainability.

## DATA SETS

The data sets used in this assignment are: 1) precipitation; 2) temperature. Here is a brief description of each:

1) Precipitation

Time period: 1985-2021

Temporal Resolution: Monthly

Spatial Extent: Global Land

Spatial Resolution: 1 degree

Unit: Meters

A reanalysis dataset blends observations and climate models in an attempt to produce the most complete and accurate map of historical and recent climate data. Our precipitation data come from the ERA5 dataset made available by the European Centre for Medium-Range Weather Forecasts, which combines their own models with data from satellites using active and passive microwave sensors. For more information see: https://confluence.ecmwf.int/display/CKB/ERA5%3A+data+documentation#ERA5:datadocumentation

2) Land Surface Temperature

Time Period: 1985-2021

Temporal Resolution: Monthly

Spatial Extent: Global Land

Spatial Resolution: 1 degree

Unit: Kelvin

The temperature data also come from the ERA5 dataset linked above, which again combines models with satellite data (and for temperature, ground-based observations are also used). This particular variable is the model's estimate of the air temperature 2 m above the land surface.

## TOOLBOX

All the Python packages you will use in this assignment are in the toolbox for the course (https://github.com/envgp/taking_the_pulse_of_the_planet/blob/main/notebooks/pulse_toolbox.ipynb). Additionally, new tools are introduced in this notebook as needed in a guided format.

## THE LEARNING GOALS FOR THE WEEK

(where the course learning goals are in plain text, and the focus this week is in italics)

•	learn about the ways in which climate change and human activity are impacting planet Earth, *with a focus this week on determining, through analysis of the data, how and where precipitation and surface temperature are changing.*

•	become familiar with the wide range of sensors available to study various components of the Earth system. These include sensors on satellites, aircraft, and ground-based platforms, and sensors deployed above or beneath the surface on land or water. *This week we will work with reanalysis data sets, which use models to integrate measurements from satellites.*

•	become familiar with the basic physical principles (resolution, sampling, processing workflows, etc.) common to all sensors, *working this week with two data sets.*

•	work with various sources of data, learning how to access, analyze, synthesize, and describe the data to quantify trends; think critically and creatively about how to project these trends into the future. *In part 1 of the assignment you design your own workflow, using your choice of data analysis methods and tools to explore the changes in precipitation and temperature. In part 2, we will lead you through some analyses.*

•	describe the complex interactions between human activity and various components of the Earth system, *this week framing this under the heading of sustainability, where the component of the Earth system is the land surface. We will consider: how are the changes in precipitation and temperature introducing sustainability challenges in different countries? How could human activity mitigate the negative effects? How are human activities amplifying the negative impacts?*

•	become motivated to think about new sensors and new ways of using sensor data to study the planet. *This is always the last question in each assignment. Given all that you now know about changes in precipitation and temperature, and how we measure and monitor them, what do the planet and all forms of life need you to design and deploy?*

## Download required data and install Packages

In [None]:
!pip install xarray numpy pandas geopandas cartopy==0.19.0.post1 rioxarray ipywidgets 

In [None]:
!pip uninstall -y shapely

In [None]:
!pip install shapely --no-binary shapely

In [None]:
!git clone https://premonition.stanford.edu/mmorphew/taking-the-pulse-global-data.git

In [None]:
!git clone https://premonition.stanford.edu/sgkang09/taking_the_pulse_atmosphere_data.git

In [None]:
import numpy as np
import xarray as xr
import pandas as pd
import geopandas as gpd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib
import cartopy.crs as ccrs
import cartopy.feature as cf
import datetime
import rioxarray
from shapely.geometry import mapping
matplotlib.rcParams['font.size'] = 14
from ipywidgets import widgets, interact

In [None]:
gdf_boundaries = gpd.read_file("./taking_the_pulse_atmosphere_data/world-administrative-boundaries.geojson")

## ASSIGNMENT PART 2

## Q1: Historical Temperature and Precipitation

### Q1-a: Historical Trend Maps

Calculate the trend in T and P in each individual grid cell and produce a global map of the gridded annual trends in each variable from 1985-2021. (Each cell will have one number, the slope over 1985-2021, color coded: red for hotter or drier, blue for cooler or wetter.) We will refer to these as the “historical trend maps.” What patterns do you observe? In other words, where are areas getting wetter or drier, hotter or cooler? Pick two regions exhibiting either large increases or decreases in T or P. Based on what you know about these areas, are these patterns consistent with what you would expect due to predicted climate change dynamics and/or land use?
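
As a rough illustration of what "one slope per grid cell" means, the sketch below fits a straight line through time in every cell of a small synthetic array. It is not the course's reference solution: the array shape, the synthetic values, and the helper name `trend_per_cell` are all assumptions made for the example, and it presumes the monthly data have already been averaged to one value per year. Cells that contain NaN (for example ocean cells in a land-only dataset) would need to be masked out before a fit like this.

```python
import numpy as np

# Synthetic stand-in for an annual-mean field with dimensions (year, y, x).
rng = np.random.default_rng(0)
years = np.arange(1985, 2022)                  # 37 years, 1985-2021
ny, nx = 4, 5                                  # tiny grid, for illustration only
true_slope = np.linspace(-0.02, 0.05, ny * nx).reshape(ny, nx)  # K per year
field = 288.0 + true_slope * (years[:, None, None] - years[0])
field = field + rng.normal(0.0, 0.05, size=field.shape)


def trend_per_cell(t, data):
    """Least-squares slope of data (time, y, x) against t, one slope per cell."""
    n_time = data.shape[0]
    flat = data.reshape(n_time, -1)            # (time, cells)
    # np.polyfit accepts a 2-D y and fits every column independently.
    # Centering t improves conditioning and does not change the slope.
    coeffs = np.polyfit(t - t.mean(), flat, deg=1)
    return coeffs[0].reshape(data.shape[1:])   # row 0 holds the slopes


slope_map = trend_per_cell(years, field)
print(slope_map.shape)                         # one slope per (y, x) cell
print(np.abs(slope_map - true_slope).max())    # small recovery error
```

The resulting 2-D array of slopes is what would be drawn on the map, typically with a diverging colormap centered on zero. xarray also offers a `polyfit` method on `DataArray` objects, which may be more convenient once the real dataset is loaded.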


In [None]:
temp_precip_data = xr.load_dataset("./taking-the-pulse-global-data/global_precip_and_temp.nc")

In [None]:
temp_precip_data

In [None]:
### your code here

### Q1-b: Yearly Histograms

Pick a few (2-3) years throughout 1985-2021, and produce two histograms, one histogram for temperature and one histogram for precipitation, for each year you selected. We recommend picking years that are spread apart. How are the histograms similar or different across the years? Why might the distributions be similar or different? Comment on the differences between using the trend maps and the single-year histograms to understand historical temperature and precipitation. Which do you trust more? Which do you prefer?

Because we did not extensively cover histograms in assignment 1, this question will be partially guided.

The following two code cells plot histograms for global data for a single year.

In [None]:
# If we want to plot a different year, we can select different times below.
temp_precip_1990 = temp_precip_data.sel(time=['01-01-1990', '01-01-1991'], method='nearest')
fig = plt.figure(figsize=(10, 10))
ax = plt.axes()
out = temp_precip_1990.temperature_2m.plot.hist(ax=ax, bins=20)
# set your labels and titles so that they make sense.
ax.set_xlabel("Temperature (K)")
ax.set_ylabel("counts")
ax.set_title("Global Temperature Histogram 1990")

In [None]:
# If we want to plot a different year, we can select different times below.
temp_precip_1990 = temp_precip_data.sel(time=['01-01-1990', '01-01-1991'], method='nearest')
fig = plt.figure(figsize=(10, 10))
ax = plt.axes()
out = temp_precip_1990.total_precipitation.plot.hist(ax=ax, bins=100)
# set your labels and titles so that they make sense.
ax.set_xlabel("Precipitation (m)")
ax.set_ylabel("counts")
ax.set_title("Global Precipitation Histogram 1990")

Now that you've seen global histograms for one year, try plotting a few more years and comparing them.
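
When comparing years, it can help to count every year into the same bins so the bars line up. The sketch below does this on synthetic numbers with plain NumPy; the sample arrays `temps_1990` and `temps_2020` are made up for illustration and stand in for the flattened, NaN-free grid-cell values of two years. Note also that the cells above pass a list of two dates with `method='nearest'`, which appears to pick only the two time steps closest to those dates; if all twelve months of a year are wanted, `time=slice('1990-01-01', '1990-12-31')` is one option.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic stand-ins for the grid-cell temperatures (K) of two different years.
temps_1990 = rng.normal(285.0, 12.0, size=5000)
temps_2020 = rng.normal(286.0, 12.0, size=5000)

# One set of bin edges spanning both samples, so the histograms are comparable.
lo = min(temps_1990.min(), temps_2020.min())
hi = max(temps_1990.max(), temps_2020.max())
edges = np.linspace(lo, hi, 21)                # 21 edges give 20 bins

counts_1990, _ = np.histogram(temps_1990, bins=edges)
counts_2020, _ = np.histogram(temps_2020, bins=edges)

# Every sample lies inside the shared range, so each histogram sums to its sample size.
print(counts_1990.sum(), counts_2020.sum())
# The per-bin difference shows where the distribution gained or lost cells.
print(counts_2020 - counts_1990)
```

The same `edges` array can be passed as `bins=edges` to matplotlib's `hist` to draw both years on one set of axes.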

In [None]:
### your code here

### Q1-c: Local Histograms

The above histograms may be difficult to interpret because global temperatures vary widely from the equator to the poles. Pick 2 countries that are spatially contiguous (e.g., not the U.S.), and repeat the above histograms for data within those countries specifically. Do these histograms align more closely with your expectations? Why or why not? Because we did not cover how to do this in assignment 1, this question will be partially guided.

In [None]:
gdf_boundaries = gpd.read_file("./taking_the_pulse_atmosphere_data/world-administrative-boundaries.geojson")
country_names = np.sort(gdf_boundaries.name.values)
gdf_boundaries = gdf_boundaries.set_index('name')
widget_country = widgets.Select(options=country_names)
widget_country

In [None]:
widget_country.value

In [None]:
temp_precip_data_copy = xr.load_dataset("./taking-the-pulse-global-data/global_precip_and_temp.nc")

In [None]:
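# Select the boundary polygon for one country.
# Replace 'China' with widget_country.value to use the widget selection above.
country_boundary = gdf_boundaries.loc[['China']]
country_boundary.crs  # coordinate reference system of the boundary
# Tell rioxarray which dimensions hold the horizontal coordinates.
temp_precip_data_copy.rio.set_spatial_dims(x_dim="x", y_dim="y", inplace=True)
# Keep only the grid cells that fall inside the country boundary.
temp_precip_clipped = temp_precip_data_copy.rio.clip(
    country_boundary.geometry.apply(mapping), 
    country_boundary.crs, 
    drop=True
)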

In [None]:
### we can change the time values to plot different years or time periods
temp_precip_clipped_1990 = temp_precip_clipped.sel(time=['01-01-1990', '01-01-1991'], method='nearest')
fig = plt.figure(figsize=(10, 10))
ax = plt.axes()
out = temp_precip_clipped_1990.temperature_2m.plot.hist(ax=ax, bins=20)
ax.set_xlabel("Temperature (K)")
ax.set_ylabel("counts")
ax.set_title("China Temperature Histogram 1990")

In [None]:
### we can change the time values to plot different years or time periods
temp_precip_clipped_1990 = temp_precip_clipped.sel(time=['01-01-1990', '01-01-1991'], method='nearest')
fig = plt.figure(figsize=(10, 10))
ax = plt.axes()
out = temp_precip_clipped_1990.total_precipitation.plot.hist(ax=ax, bins=100)
ax.set_xlabel("Precipitation (m)")
ax.set_ylabel("counts")
ax.set_title("China Precipitation Histogram 1990")

Now that you've seen an example, plot a few more years for a country of your choice. Compare the histograms and note any differences, both temporally and spatially. Do your local histograms vary from global histograms?

In [None]:
### your code here

## Q2: Recent Temperature and Precipitation

### Q2-a: Recent vs. Historical Trend Maps

Redo the temperature and precipitation trend maps but for 2003-2021, which we will call the “recent trend maps.” Compare and contrast the historical and recent trend maps.

In [None]:
### your code here

### Q2-b: Recent vs. Historical Climatologies
A “monthly climatology” is the average value of a variable for each calendar month (e.g. the average of all Januarys, the average of all Februarys, etc.) over a time period, and it allows us to consider the average monthly to seasonal variation in that variable. Calculate the monthly climatology for T and P for the full historical record and the recent record. Plot the T climatologies for the historical and recent periods in a single plot and repeat for the P climatologies in a single plot. Comment on the historical and recent climatologies for T and P separately. Is each variable experiencing similar variability between the two time periods? Are T and P exhibiting similar differences to one another between the time periods? How might climate change be impacting the differences between the historical and recent periods? 

Because we did not cover climatologies in assignment 1, this question will be guided.

We can use xarray's "groupby" feature to group the data by month and then take the mean over time and space. This reduces our dataset to 12 points, each representing the global mean for each month. We can plot it as a bar chart to show the climatology for each variable.
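
To make the grouping step concrete, here is a NumPy-only sketch of the same reduction on synthetic data: average over time within each calendar month, then average over space, leaving 12 values. The array shapes, the synthetic seasonal cycle and warming trend, and the helper name `monthly_climatology` are assumptions made for this illustration; the notebook itself does the equivalent with xarray's `groupby('time.month')`.

```python
import warnings
import numpy as np

rng = np.random.default_rng(2)
n_years, ny, nx = 37, 3, 4                              # 1985-2021, tiny grid
months = np.tile(np.arange(1, 13), n_years)             # month label per time step
years = np.repeat(np.arange(1985, 1985 + n_years), 12)  # year label per time step
seasonal = 10.0 * np.sin(2 * np.pi * (months - 4) / 12) # synthetic cycle, peaks in July
warming = 0.03 * (years - 1985)                         # synthetic trend, K per year
data = 280.0 + (seasonal + warming)[:, None, None]
data = data + rng.normal(0.0, 0.5, size=(n_years * 12, ny, nx))
data[:, 0, 0] = np.nan                                  # pretend one cell is ocean


def monthly_climatology(values, month_labels):
    """Mean over time for each calendar month, then mean over space, skipping NaN."""
    clim = np.empty(12)
    with warnings.catch_warnings():
        # Cells that are NaN at every time step trigger a harmless "empty slice" warning.
        warnings.simplefilter("ignore", RuntimeWarning)
        for m in range(1, 13):
            per_cell = np.nanmean(values[month_labels == m], axis=0)  # (y, x) map
            clim[m - 1] = np.nanmean(per_cell)                        # one global value
    return clim


historical = monthly_climatology(data, months)
recent_mask = years >= 2003
recent = monthly_climatology(data[recent_mask], months[recent_mask])
print(np.round(historical - 273.15, 2))   # 12 values, degrees Celsius
print(np.round(recent - historical, 2))   # positive where the recent period is warmer
```

Both this sketch and the chained `.mean()` calls in the notebook give every grid cell equal weight. On a 1-degree latitude-longitude grid, cells near the poles cover less area than cells near the equator, so an area-weighted mean is a possible refinement.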

In [None]:
# In English: this says "get me all the data between 2003 and 2022"
recent_temp_precip_data = temp_precip_data.sel(
    time=slice("2003-01-01", "2022-01-01"))

In [None]:
# for both our original dataset and new recent dataset, let's group by month and take the mean
# we'll take the mean once to get mean temperatures monthly, and then we'll take the mean spatially to get a single
# monthly value for the globe
historical_temp_precip_monthly = temp_precip_data.groupby('time.month').mean().mean(['x','y'])
recent_temp_precip_monthly = recent_temp_precip_data.groupby('time.month').mean().mean(['x','y'])

In [None]:
fig, ax = plt.subplots(figsize=(9, 9))
# here we convert the data to a more usable format and change degrees to Celsius
# If we had a different data set, we would change the dataset and variable name below
series = historical_temp_precip_monthly['temperature_2m'].to_numpy()-273.15
series2 = recent_temp_precip_monthly['temperature_2m'].to_numpy()-273.15
index = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 
'Sep', 'Oct', 'Nov', 'Dec']
combined_df = pd.DataFrame({'historical': series, 'recent': series2},
                           index=index)

# now we plot it
combined_df.plot.bar(ax=ax)
ax.set_ylabel('Temperature (C)')
ax.set_xticklabels(('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 
                    'Sep', 'Oct', 'Nov', 'Dec'))
ax.set_title('Monthly Climatology: Temperature')


Copy the code above and change it so that we instead get the climatology for precipitation. Below, write your observations and answer the questions posed at the beginning of this question.

In [None]:
### your code here

## Q3: Taking the Pulse of the Land Surface in the Future

Given all that you now know about changes in precipitation and temperature, and how we measure and monitor them, what do the planet and all forms of life need you to design and deploy?


**your answer here**