# **Tutorial 9: Paleoclimate Data Assimilation**
**Week 1, Day 4, Paleoclimate**

**Content creators:** Sloane Garelick

**Content reviewers:** Brodie Pearson

**Content editors:** Agustina Pesce

**Production editors:** TBD

**Our 2023 Sponsors:** TBD

### **Code and Data Sources**

The code for this notebook is based on [code available from Erb et al. (2022)](https://github.com/Holocene-Reconstruction/Holocene-code) and workflow presented during the [Paleoclimate Data Assimilation Workshop 2022](https://github.com/michaelerb/da-workshop).

Data from the following sources are used in this tutorial:

*   Matthew B. Osman, Jessica E. Tierney, Jiang Zhu, Robert Tardif, Gregory J. Hakim, Jonathan King, Christopher J. Poulsen. 2021. Globally resolved surface temperatures since the Last Glacial Maximum. Nature, 599, 239-244. doi: 10.1038/s41586-021-03984-4
*   King, J. M., Tierney, J., Osman, M., Judd, E. J., & Anchukaitis, K. J. (2023). DASH: A MATLAB Toolbox for Paleoclimate Data Assimilation. Geoscientific Model Development, (in review).



















# **Tutorial 9 Objectives**

As we just discussed in the introductory video, proxies and models both have advantages and limitations for reconstructing past changes in Earth's climate system. Data assimilation is one approach that combines the strengths of both. In this tutorial, we'll look at a paleoclimate reconstruction that was made with data assimilation: the Last Glacial Maximum Reanalysis (LGMR) from [Osman et al. (2021)](https://www.nature.com/articles/s41586-021-03984-4), which contains temperature and d18O data for the past 24,000 years.


By the end of this tutorial you will be able to:

*   Understand how data assimilation works
*   Identify existing paleoclimate data assimilation datasets
*   Explain why data assimilation is useful
*   Create time series and maps of data assimilation results
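To make the first objective concrete, here is a minimal sketch of the core idea behind data assimilation: a model-based prior estimate is nudged toward a proxy observation, with the size of the nudge set by their relative uncertainties. This is a toy scalar Kalman-style update, not the procedure used to build the LGMR; the function name `kalman_update` and all numbers are hypothetical and chosen only for illustration.

```python
def kalman_update(prior_mean, prior_var, obs, obs_var):
    """Blend a model prior with one proxy observation (toy scalar case)."""
    # Gain: the fraction of the prior-observation mismatch that is applied.
    # It approaches 1 when the prior is very uncertain relative to the proxy.
    gain = prior_var / (prior_var + obs_var)
    post_mean = prior_mean + gain * (obs - prior_mean)
    # The updated variance is never larger than the prior variance.
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


# Hypothetical values: a model prior of -4.0 degC (variance 4.0)
# and a proxy-based estimate of -6.0 degC (variance 1.0).
post_mean, post_var = kalman_update(-4.0, 4.0, -6.0, 1.0)
print("Updated mean:", post_mean)
print("Updated variance:", post_var)
```

Because the proxy is assumed to be more certain than the prior here, the updated estimate (about -5.6) lands closer to the proxy value, and the updated variance (about 0.8) is smaller than either input variance.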






In [None]:
# # Install libraries
# !pip install --no-binary shapely shapely --force # Needed so that cartopy does not crash
# !pip install cartopy
# !pip install pooch
# !pip install xarray

In [None]:
# Import libraries
import pooch
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt

import cartopy.crs as ccrs
import cartopy.util as cutil

## Load the LGMR paleoclimate data assimilation reconstruction

This dataset contains reconstructions of surface air temperature (SAT), d18O, and global mean surface temperature (GMST). Let's download the paleoclimate reconstruction for SAT. 

In [None]:
data_path= pooch.retrieve(
  url="https://www.ncei.noaa.gov/pub/data/paleo/reconstructions/osman2021/LGMR_SAT_climo.nc",
  known_hash=None,
)

dataset = xr.open_dataset(data_path)
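Passing `known_hash=None` tells pooch to skip checksum verification of the downloaded file. If you want to pin the file, one option is to compute its SHA-256 digest once and pass it as `known_hash` on later runs. The sketch below uses only the standard library; the helper name `sha256_of_file` is hypothetical, and the demonstration hashes a small temporary file rather than the LGMR file.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files do not need to fit in memory
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a throwaway file (not the LGMR data)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example bytes")
    tmp_path = tmp.name

demo_hash = sha256_of_file(tmp_path)
os.remove(tmp_path)
print("sha256:" + demo_hash)
```

In the notebook, you could call `sha256_of_file(data_path)` after the first download and then pass the resulting `"sha256:..."` string as `known_hash`, assuming the hosted file does not change.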

To see what's in the file you loaded, display the xarray dataset:

In [None]:
# Variables in the file
dataset

In [None]:
print('\n=== Notes about extracted variables ===')
print('Shape of "sat":', dataset['sat'].shape)
print('Range of "age":', dataset['age'].min().values, dataset['age'].max().values)
print('Range of "lat":', dataset['lat'].min().values, dataset['lat'].max().values)
print('Range of "lon":', dataset['lon'].min().values, dataset['lon'].max().values)

## Plotting a **time series** of the paleoclimate data assimilation

Now that the data is loaded, we can plot a time series of the temperature data to begin assessing global changes.

However, the `sat` variable is a 3D array with dimensions of age-lat-lon, so we first need to calculate a global mean. The function below calculates the mean temperature over a specified region, weighting each latitude by the cosine of its latitude so that the smaller grid cells near the poles do not count as much as the larger cells near the equator. In this case, we'll be looking at the global mean.

In [None]:
def spatial_mean(dataset, variable, region):
  """A function to compute a regional-mean from a time-lat-lon variable"""
  i_selected = np.where((dataset.lon >= region[0]) & (dataset.lon <= region[1]))[0]
  j_selected = np.where((dataset.lat >= region[2]) & (dataset.lat <= region[3]))[0]
  print(
    'Computing spatial mean.',
    f'lats: {dataset.lat.values[j_selected[0]]} - {dataset.lat.values[j_selected[-1]]}',
    f'lons: {dataset.lon.values[i_selected[0]]} - {dataset.lon.values[i_selected[-1]]}.',
    'Points are inclusive.'
  )
  lat_weights = np.cos(np.radians(dataset.lat))
  variable_zonal = np.nanmean(dataset[variable].values[:, :, i_selected], axis=2)
  variable_mean = np.average(
      variable_zonal[:, j_selected],
      axis=1,
      weights=lat_weights[j_selected]
  )

  return variable_mean
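To see why `spatial_mean` weights by the cosine of latitude, here is a small sketch on synthetic data (not LGMR values). On a regular latitude-longitude grid, cells shrink toward the poles, so an unweighted average over-counts polar values. The latitudes and field values below are made up for illustration.

```python
import numpy as np

# Synthetic zonal-mean field (assumption): 1.0 everywhere except
# two polar bands with a much larger value of 10.0
lat = np.array([-80.0, -40.0, 0.0, 40.0, 80.0])
field = np.array([10.0, 1.0, 1.0, 1.0, 10.0])

# Grid-cell area on a regular grid scales with cos(latitude)
weights = np.cos(np.radians(lat))

unweighted = field.mean()
weighted = np.average(field, weights=weights)

print("Unweighted mean:", unweighted)
print("Area-weighted mean:", weighted)
```

The area-weighted mean is noticeably lower than the unweighted mean because the large polar values cover only a small fraction of the globe.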

In [None]:
print('Temperature:', dataset['sat'][:,40,40].values)

Call the function above, `spatial_mean`, to compute a global mean surface temperature.

In [None]:
# Calculate the global mean surface temperature
region = [0,360,-90,90]
tas_global_mean = spatial_mean(dataset, 'sat', region)

Now that we have calculated the global mean, we can plot the results as a time series to assess changes in temperature over the past 24,000 years:

In [None]:
# Plot the global mean surface temperature
f,ax1 = plt.subplots(1, 1, figsize=(12,6))
ax1.plot(dataset['age'], tas_global_mean, linewidth=3)

ax1.set_xlim(dataset['age'].max().values, dataset['age'].min().values)
# Raw strings keep the LaTeX backslashes from being read as escape sequences
ax1.set_ylabel(r'$\Delta$T ($^\circ$C)', fontsize=16)
ax1.set_xlabel('Age (yr BP)', fontsize=16)
ax1.set_title(
  rf'Mean $\Delta$T ($^\circ$C) for LGMR, region: {region}',
  fontsize=18,
  loc='center'
)
plt.show()

Consider the following questions:


*   How has global temperature varied over the past 24,000 years?
*   What climate forcings may have contributed to the increase in temperature ~17,000 years ago? 



## Plotting a **temperature anomaly map** of the paleoclimate data assimilation

Data assimilation creates spatial reconstructions, so we can also make figures showing spatial temperature anomalies for different time periods. The function below makes two figures: a map of the reconstructed temperature difference between two time periods, and a plot of the zonal mean of that difference.

In [None]:
# A function to make a map of differences between two time periods
def map_temp_anom(dataset, variable_name, ages_anom,ages_ref):
    # Compute the difference between the periods specified above.
    ind_anom = np.where((dataset.age >= ages_anom[0]) & (dataset.age <= ages_anom[1]))[0]
    ind_ref  = np.where((dataset.age >= ages_ref[0])  & (dataset.age <= ages_ref[1]))[0]

    tas_change = np.mean(dataset[variable_name][ind_anom, :, :], axis=0) - np.mean(dataset[variable_name][ind_ref, :, :], axis=0)
    tas_change_zonal = np.mean(tas_change, axis=1)

    # Make a map of changes
    plt.figure(figsize=(12,8))
    ax = plt.axes(projection=ccrs.Robinson())
    ax.set_global()
    tas_change.plot(
        ax=ax,
        transform=ccrs.PlateCarree(), x="lon", y="lat",
        cbar_kwargs={'orientation': 'horizontal', 'label': r'$\Delta$T ($^\circ$C)'}
    )
    ax.coastlines()
    ax.set_title(
        rf'$\Delta$T ($^\circ$C) for LGMR, ages: anom = {ages_anom}, ref = {ages_ref}',
        loc='center',
        fontsize=16
    )
    ax.gridlines(color='k',linewidth=1,linestyle=(0,(1,5)))
    ax.spines['geo'].set_edgecolor('black')
    plt.show()

    # Make a zonal mean figure of the changes
    fig, ax1 = plt.subplots(1, 1)
    # Draw on ax1 explicitly rather than relying on the current axes
    tas_change_zonal.plot(ax=ax1, linewidth=3, y="lat")
    ax1.axvline(x=0,color='gray',alpha=1,linestyle=':',linewidth=2)
    ax1.set_ylim(-90, 90)
    ax1.set_xlabel(r'$\Delta$T ($^\circ$C)')
    ax1.set_ylabel(r'Latitude ($^\circ$)')
    ax1.set_title(
        rf'Zonal-mean $\Delta$T ($^\circ$C), ages: anom = {ages_anom}, ref = {ages_ref}',
        loc='center',
    )
    plt.show()
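The anomaly calculation inside `map_temp_anom` can be checked on synthetic data. The sketch below builds a made-up age-lat-lon array that cools linearly with age, averages it over two age windows, and takes the difference. The age axis, grid size, cooling rate, and the helper name `window_mean` are all assumptions for illustration and are not taken from the LGMR file.

```python
import numpy as np

# Hypothetical age axis: 200-year steps (assumption)
age = np.arange(100, 24000, 200)
nlat, nlon = 4, 8

# Synthetic temperatures: 0.0002 degC cooler per year of age at every grid point
temps = -0.0002 * age[:, None, None] * np.ones((1, nlat, nlon))


def window_mean(values, age, window):
    """Average an age-lat-lon array over an inclusive age window."""
    mask = (age >= window[0]) & (age <= window[1])
    if not mask.any():
        # An empty selection would otherwise produce a map of NaNs
        raise ValueError(f"No ages fall inside the window {window}")
    return values[mask].mean(axis=0)


ages_anom = [20500, 21500]
ages_ref = [0, 1000]
anom = window_mean(temps, age, ages_anom) - window_mean(temps, age, ages_ref)

print("Anomaly map shape:", anom.shape)
print("Anomaly at one grid point:", anom[0, 0])
```

With this synthetic cooling rate, the window centered on 21,000 years is about 4.1 degrees colder than the reference window at every grid point. The explicit check for an empty window is a design choice: it turns a silent all-NaN result into a clear error when the requested ages fall outside the dataset.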

Before making the figures, double-check the ages in your dataset by printing the beginning and end of the age variable.

In [None]:
print('First 5 ages:', dataset.age[:5].values)
print('Last 5 ages:', dataset.age[-5:].values)

The code below makes figures that show the temperature anomaly of 21,000 years ago relative to today.

In [None]:
# Make a map of differences between ages
ages_anom = [20500, 21500]  # 21 ka
ages_ref = [0, 1000]        # 0 ka

map_temp_anom(dataset, 'sat', ages_anom, ages_ref)

What do you notice about the spatial differences in the temperature anomalies between the Last Glacial Maximum (LGM) and the present?


*   How does the temperature anomaly vary with latitude?
*   Where was the largest temperature change? Why might this region have undergone the largest temperature change during this time?

If you'd like, you can take a look at temperature anomalies during other time periods as well by changing the `ages_anom` and `ages_ref` values.