# Worksheet 4: Creating future climate scenarios and analysing climate change

The following exercises demonstrate basic methods for analysing changes in climate, using two CORDEX-Core experiments (REMO2015 driven by HadGEM2-ES and by MPI-ESM-LR). As with worksheets 2 & 3, these are examples of some of the many types of analysis that can be performed using Python and Iris.

<div class="alert alert-block alert-warning">
<b>By the end of this worksheet you should be able to:</b><br> 

- Calculate difference and percentage differences across cubes<br>
- Plot cubes using different plotting methods and with an appropriate colour scale <br>
- Create time series anomalies of precipitation and temperature<br>
</div>

<div class="alert alert-block alert-info">
<b>Note:</b> As in Worksheet 2, the data used here has been processed in the same way as in Worksheet 1. The whole period has been concatenated into a single file to avoid issues with loading multiple files.
</div>

## Contents
### [4.1: Calculate future OND mean precipitation](#4.1) 
### [4.2: Find OND anomalies](#4.2)
### [4.3: Plot precipitation and temperature](#4.3)
### [4.4: Future time series](#4.4)

## Preamble

In [None]:
# Code preamble - these libraries will be used in this worksheet.
# This code block needs to be re-run every time you restart this worksheet!
%matplotlib inline 
import os
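import iris
# Import the Iris submodules used below explicitly rather than relying on side effects
import iris.analysis
import iris.analysis.cartography
import iris.analysis.maths
import iris.coord_categorisation
import iris.util
# equalise_attributes lives in iris.util from Iris 3.0 onwards
from iris.util import equalise_attributes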
import iris.quickplot as qplt
import iris.plot as iplt
import matplotlib.pyplot as plt
import numpy as np
import numpy.ma as ma

# Some helpful data locations
DATADIR = 'data_afr22/AFR-22/'
CLIMDIR = os.path.join(DATADIR, 'climatology')
HISTDIR = os.path.join(DATADIR, 'historical')
FUTRDIR = os.path.join(DATADIR, 'rcp85')
GCMIDS = ['hadgem2-es', 'mpi-esm-lr']

<a id='4.1'></a>
## 4.1 Calculate future OND mean precipitation
**a)** First, we **calculate future OND (October, November, December) mean precipitation** for the period 2041-2060 for the HadGEM2-ES-driven and the MPI-ESM-LR-driven REMO2015 simulations:

In [None]:
for gcmid in GCMIDS:
    infile = os.path.join(FUTRDIR, gcmid + '.mon.2041_2060.GERICS-REMO2015.pr.mmday-1.nc')
    data = iris.load_cube(infile)

    # in order to calculate OND mean, we divide the months into two seasons: 
    # one for OND and a second for the remaining months
    iris.coord_categorisation.add_season(data, 'time', name='seasons', seasons=('jfmamjjas','ond'))

    # Extract the data for the OND season only
    data_ond = data.extract(iris.Constraint(seasons='ond'))

    # Now calculate the mean over the OND season
    ond_mean = data_ond.aggregated_by(['seasons'], iris.analysis.MEAN)

    # save the OND mean as a netCDF
    outfile = os.path.join(CLIMDIR, gcmid + '.OND.mean.2041_2060.GERICS-REMO2015.pr.mmday-1.nc')
    iris.save(ond_mean, outfile)
    print('Saved: {}'.format(outfile))

---
<div class="alert alert-block alert-success">
<b>Question:</b> Within the loop, we have created two cubes: a seasonal OND constrained cube <code>data_ond</code>, and a seasonal mean cube <code>ond_mean</code>.  Inspect the cube metadata.  What are the differences? 
</div>

In [None]:
# Use this code block to inspect the two cubes
# e.g. print(cube)



<b>Answer</b><br>
    *Type your answer here...*

---

<div class="alert alert-block alert-info">
    <b>Note:</b> Remember, the loop has created and saved two cubes, <b>one for each downscaled GCM</b>.
</div>

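To make the season step in section 4.1 concrete, here is a minimal NumPy-only sketch of what labelling, extracting and averaging the OND months amounts to. It is an illustration rather than the Iris implementation: the arrays `months` and `precip` are hypothetical stand-ins for the cube's time coordinate and data, and the series is assumed to start in January.

```python
import numpy as np

# Hypothetical monthly series: 2 years x 12 months (values in mm/day)
months = np.tile(np.arange(1, 13), 2)    # calendar month of each time step
precip = np.arange(24, dtype=float)      # one made-up value per time step

# Label each time step, mirroring seasons=('jfmamjjas', 'ond')
is_ond = np.isin(months, [10, 11, 12])

# Keep only the OND time steps (the "extract" step) ...
ond_values = precip[is_ond]

# ... then average them into a single value (the "aggregated_by" step)
ond_mean = ond_values.mean()

print(ond_values)
print(ond_mean)
```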
<a id='4.2'></a>
## 4.2 Find OND anomalies
**b)** Next, we **subtract the baseline (1986-2005) mean from the future (2041-2060) mean** for OND to get the change in precipitation (or **anomaly**) from both simulations.  The changes are also converted to percentages:

In [None]:
for gcmid in GCMIDS:
    # Load the baseline cube
    infile = os.path.join(CLIMDIR, gcmid + '.OND.mean.1986_2005.pr.mmday-1.nc')
    OND_baseline = iris.load_cube(infile)
    # Set the correct units
    OND_baseline.units = "mm day-1"
    # Load the future cube
    infile = os.path.join(CLIMDIR, gcmid + '.OND.mean.2041_2060.GERICS-REMO2015.pr.mmday-1.nc')
    OND_future = iris.load_cube(infile)
    # Subtract the baseline cube from the future cube
    diff = iris.analysis.maths.subtract(OND_future, OND_baseline)
    # rename the cube to reflect the data processing
    diff.rename('precipitation flux difference')
    # Save the resulting cube
    outfile = os.path.join(CLIMDIR, gcmid + '.OND.mean.diff.GERICS-REMO2015.pr.mmday-1.nc')
    iris.save(diff, outfile)
    print('Saved {}'.format(outfile))
    # Find the percentage change
    pcent_change = iris.analysis.maths.multiply(iris.analysis.maths.divide(diff, OND_baseline), 100)
    # remember to change the title and units to reflect the data processing
    pcent_change.rename('precipitation flux percent difference')
    pcent_change.units = '%'
    # And save this too
    outfile = os.path.join(CLIMDIR, gcmid + '.OND.mean.diffperc.GERICS-REMO2015.pr.mmday-1.nc')
    iris.save(pcent_change, outfile)
    print('Saved {}'.format(outfile))

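The percentage change is 100 × (future − baseline) / baseline, which is undefined wherever the baseline is zero (for example, a grid cell with no OND rainfall). The sketch below shows one possible way to handle this with NumPy masked arrays, by masking zero-baseline cells before dividing. The function name `percent_change` and the sample values are hypothetical, and this is not necessarily how Iris treats such cells.

```python
import numpy as np
import numpy.ma as ma

def percent_change(future, baseline):
    """Return 100 * (future - baseline) / baseline, masking zero-baseline cells."""
    # Hide cells where the baseline is exactly zero so they are never divided by
    baseline = ma.masked_equal(np.asarray(baseline, dtype=float), 0.0)
    future = np.asarray(future, dtype=float)
    return 100.0 * (future - baseline) / baseline

# Hypothetical OND means in mm/day; the last cell has a dry baseline
baseline = np.array([2.0, 4.0, 0.0])
future = np.array([3.0, 3.0, 1.0])

change = percent_change(future, baseline)
print(change)
```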
**c)** Now, repeat the calculations yourself for **temperature**. 

First, we calculate the **OND mean** temperatures. 

**Fill in the missing commands in the code block below**:

In [None]:
# HINT: Your filenames should have the format: 
# gcmid + '.OND.mean.' + period + '.GERICS-REMO2015.tm.C.nc'
# where period is '1986_2005' (historical) or '2041_2060' (RCP8.5)

for gcmid in GCMIDS:

    # first historical

    # in order to calculate OND mean, we divide the months into two seasons: 
    # one for OND and a second for the remaining months

    # Extract the data for the OND season only

    # Now calculate the mean over the OND season

    # save the OND mean as a netCDF
    outfile = os.path.join(CLIMDIR, gcmid + '.OND.mean.1986_2005.GERICS-REMO2015.tm.C.nc')
    
    # then RCP8.5

    # in order to calculate OND mean, we divide the months into two seasons: 
    # one for OND and a second for the remaining months

    # Extract the data for the OND season only

    # Now calculate the mean over the OND season

    # save the OND mean as a netCDF
    outfile = os.path.join(CLIMDIR, gcmid + '.OND.mean.2041_2060.GERICS-REMO2015.tm.C.nc')


**d)** Next, we **calculate the difference** between the baseline and future periods.

In [None]:
for gcmid in GCMIDS:
    # Load files:
    baselinefile = os.path.join(CLIMDIR, gcmid + '.OND.mean.1986_2005.GERICS-REMO2015.tm.C.nc')
    futurefile = os.path.join(CLIMDIR, gcmid + '.OND.mean.2041_2060.GERICS-REMO2015.tm.C.nc')
    OND_baseline = iris.load_cube(baselinefile)
    OND_future = iris.load_cube(futurefile)
    
    # Calculate 'future mean' minus 'baseline mean':
    diff = iris.analysis.maths.subtract(OND_future, OND_baseline)
    diff.rename('surface temperature difference')
    
    # Save
    outfile = os.path.join(CLIMDIR, gcmid + '.OND.mean.diff.GERICS-REMO2015.tm.C.nc')
    iris.save(diff, outfile)
    print('Saved: {}'.format(outfile))

<a id='4.3'></a>
## 4.3 Plot precipitation and temperature

**e)** **Plot changes** to precipitation (in %) and temperature (in deg.C)

In [None]:
# Create a figure of the size 12x12 inches
plt.figure(figsize=(12, 12))

# Read in the percentage changes in precipitation
for n, gcmid in enumerate(GCMIDS):
    prpath = os.path.join(CLIMDIR, gcmid + '.OND.mean.diffperc.GERICS-REMO2015.pr.mmday-1.nc')
    tmpath = os.path.join(CLIMDIR, gcmid + '.OND.mean.diff.GERICS-REMO2015.tm.C.nc')
    pcent_change = iris.load_cube(prpath)
    degc_change = iris.load_cube(tmpath)

    # Remove extra time dimension using an iris utility 
    pcent_change = iris.util.squeeze(pcent_change)
    degc_change = iris.util.squeeze(degc_change)
    
    plot_num = n*2 + 1
    plt.subplot(2, 2, plot_num) # Select panel number plot_num in a grid of 2 rows and 2 columns
    qplt.pcolormesh(pcent_change, vmax=30, vmin=-30, cmap='BrBG')
    plt.title(gcmid + ' precipitation change (%)')
    ax = plt.gca()              # gca function that returns the current axes
    ax.coastlines()             # adds coastlines defined by the axes of the plot

    plt.subplot(2, 2, plot_num+1)
    qplt.pcolormesh(degc_change, vmax=2.5, vmin=0, cmap='Reds')
    plt.title(gcmid + ' temperature change (°C)')
    ax = plt.gca()
    ax.coastlines()

plt.show()

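For a diverging colour map such as `BrBG`, the colour scale is easiest to read when it is centred on zero, so that positive and negative changes of the same size get equally strong colours. The following is a small sketch of one way to derive such a set of evenly spaced levels from the data. The helper `symmetric_levels` is hypothetical and not part of Iris or Matplotlib, and the sample values are made up. With an odd number of levels, zero falls on a level boundary.

```python
import numpy as np
import numpy.ma as ma

def symmetric_levels(data, n_levels=7):
    """Return n_levels evenly spaced levels centred on zero."""
    values = ma.masked_invalid(data).compressed()   # drop masked and NaN cells
    limit = np.abs(values).max()                    # largest absolute change
    return np.linspace(-limit, limit, n_levels)

# Hypothetical percentage changes, including one missing value
change = np.array([-12.0, 5.0, 30.0, np.nan])
levels = symmetric_levels(change)
print(levels)
```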
---
<div class="alert alert-block alert-success">
<b>Question:</b> How would you use a discrete contoured colour map to show changes in precipitation and temperature? <br>
    Modify the code above to use <strong>qplt.contourf()</strong>.  Remember to choose appropriate contour levels.
</div>

<div class="alert alert-block alert-success">
<b>Question:</b>  The plots show projected changes in precipitation and temperature using two models; what are the common features between the two model projections? 
    
What differences exist between the two model projections? Which is hotter? Which is wetter? How does the spatial distribution differ? 
    
</div>

<b>Answer:</b><br>
*Type your answers here...*

---

<a id='4.4'></a>
## 4.4 Future time series

**f)** Calculate and then plot a 2041-2060 monthly **time series of precipitation anomalies** for land
points only, relative to the 1986-2005 baseline mean. Do this for both the downscaled HadGEM2-ES and MPI-ESM-LR simulations.


In [None]:
# Read in the land-sea mask. 
# The cube data array has a land fraction associated with it which we'll use to mask out ocean points.
land_fraction_file = 'sftlf_AFR-22_MOHC-HadGEM2-ES_historical_r0i0p0_GERICS-REMO2015_v1_fx_r0i0p0.nc'
land_fraction = iris.load_cube(DATADIR + land_fraction_file)

# convert this to a binary (i.e. 1 or 0 mask)
land_sea_mask = land_fraction.copy()
land_sea_mask.data = np.where(land_sea_mask.data < 50, 0, 1)
land_sea_mask.rename('land_binary_mask')
# apply a mask to the cube 
land_sea_mask = iris.util.mask_cube(land_sea_mask, land_sea_mask.data < 0.5)


for gcmid in GCMIDS:
    # Read in original data for baseline and future
    baselinepath = os.path.join(HISTDIR, gcmid + '.mon.1986_2005.GERICS-REMO2015.pr.mmday-1.nc')
    futurepath = os.path.join(FUTRDIR, gcmid + '.mon.2041_2060.GERICS-REMO2015.pr.mmday-1.nc')
    baseline = iris.load_cube(baselinepath)
    future = iris.load_cube(futurepath)
    
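    # Apply land mask: repeat the 2D ocean mask along the time dimension.
    # (Using the boolean mask directly ensures ocean points with a value of exactly 0 are masked too.)
    ocean_mask = land_sea_mask.data.mask
    baseline.data = ma.array(baseline.data, mask=np.broadcast_to(ocean_mask, baseline.shape).copy())
    future.data = ma.array(future.data, mask=np.broadcast_to(ocean_mask, future.shape).copy())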

    # the code to calculate area weights requires a single longitude and latitude coordinate - 
    # remove the 2D "true" lat and lon
    baseline.remove_coord('longitude')
    baseline.remove_coord('latitude')
    future.remove_coord('longitude')
    future.remove_coord('latitude')
    
    # Guess bounds
    for cube in [baseline, future]:
        for coord in ['grid_longitude', 'grid_latitude']:
            cube.coord(coord).guess_bounds()
    
    # Calculate mean values over land points
    baseline_land = baseline.collapsed(['grid_longitude', 'grid_latitude'], iris.analysis.MEAN,
                                      weights = iris.analysis.cartography.area_weights(baseline))
    future_land = future.collapsed(['grid_longitude', 'grid_latitude'], iris.analysis.MEAN,
                                  weights = iris.analysis.cartography.area_weights(future))

    # Save future & baseline area averaged monthly data (time series)
    baselineout = os.path.join(CLIMDIR, gcmid + '.mon.1986_2005.series.GERICS-REMO2015.pr.mmday-1.nc')
    futureout = os.path.join(CLIMDIR, gcmid + '.mon.2041_2060.series.GERICS-REMO2015.pr.mmday-1.nc')
    iris.save(baseline_land, baselineout)
    iris.save(future_land, futureout)

    # Subtract the baseline time mean from each future month
    diff = future_land.copy()
    diff.data = future_land.data - baseline_land.data.mean()
    diff.rename('future anomaly relative to mean historical precipitation')

    # Save the area averaged monthly future anomalies (time series)
    outpath = os.path.join(CLIMDIR, gcmid + '.mon.2041_2060.anom.series.GERICS-REMO2015.pr.mmday-1.nc')
    iris.save(diff, outpath)
    print('Saved: {}'.format(outpath))

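The land masking and area averaging in the cell above can be illustrated on a tiny array. This is a NumPy-only sketch under stated assumptions: the field `data`, the `land_fraction` values and the grid-cell areas in `area` are all made up, whereas in the worksheet the weights come from `iris.analysis.cartography.area_weights`.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical field: 2 time steps on a 2 x 2 grid
data = np.array([[[1.0, 2.0], [3.0, 4.0]],
                 [[5.0, 6.0], [7.0, 8.0]]])

# Land fraction in percent; cells below 50 % are treated as ocean
land_fraction = np.array([[100.0, 0.0], [60.0, 20.0]])
ocean = land_fraction < 50            # True where the cell should be hidden

# Repeat the 2D mask along the time axis and attach it to the data
masked = ma.array(data, mask=np.broadcast_to(ocean, data.shape).copy())

# Hypothetical grid-cell areas, used as weights in the spatial mean
area = np.array([[1.0, 1.0], [3.0, 3.0]])
weights = np.broadcast_to(area, data.shape)

# Flatten the two spatial axes, then take the weighted mean per time step;
# masked (ocean) cells contribute neither values nor weight
n_times = data.shape[0]
land_mean = ma.average(masked.reshape(n_times, -1), axis=1,
                       weights=weights.reshape(n_times, -1))
print(land_mean)
```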
---
<div class="alert alert-block alert-success">
    <b>Question:</b> Why do we only want to produce a time series for changes over land?
</div>

**Answer:**

_Type your answer here..._

---

**g)** **Plot the precipitation anomalies** for the downscaled HadGEM2-ES and MPI-ESM-LR simulations.

In [None]:
# Read in the monthly series
hadgem2es = iris.load_cube(CLIMDIR + '/hadgem2-es.mon.2041_2060.anom.series.GERICS-REMO2015.pr.mmday-1.nc')
mpiesm = iris.load_cube(CLIMDIR + '/mpi-esm-lr.mon.2041_2060.anom.series.GERICS-REMO2015.pr.mmday-1.nc')
time = hadgem2es.coord('time')

# Plot the two model time series on the same figure
plt.figure(figsize=(16,5))
iplt.plot(time, hadgem2es, label = 'HadGEM2-ES')
iplt.plot(time, mpiesm, label = 'MPI-ESM-LR')
plt.legend()
plt.suptitle('2041-2060 Precipitation anomaly (relative to 1986-2005)')
plt.ylabel(f'Precipitation change ({hadgem2es.units})')
plt.xlabel('Years')
plt.show()

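The anomalies saved in (f) subtract a single number, the time mean of the whole baseline series, so the plotted series still contains the seasonal cycle. One alternative, sketched here on made-up data, is to subtract the baseline mean of the matching calendar month instead. The sketch assumes both series start in January and cover whole years, and all values are hypothetical.

```python
import numpy as np

# Hypothetical area-averaged monthly series covering 2 years each (mm/day)
seasonal_cycle = np.array([1, 1, 2, 4, 3, 1, 0, 0, 1, 3, 4, 2], dtype=float)
baseline = np.tile(seasonal_cycle, 2)
future = np.tile(seasonal_cycle, 2) + 0.5    # uniform 0.5 mm/day increase

# Approach used in (f): subtract the overall baseline time mean
anom_overall = future - baseline.mean()

# Alternative: subtract the baseline mean of the matching calendar month
monthly_clim = baseline.reshape(-1, 12).mean(axis=0)
anom_monthly = (future.reshape(-1, 12) - monthly_clim).ravel()

# The first series keeps the seasonal swing; the second isolates the change
print(anom_overall.max() - anom_overall.min())
print(anom_monthly)
```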
**h) Produce and plot a monthly time series of temperature data** relative to the 1986-2005 baseline.  As for (f) and (g), produce time series for both the HadGEM2-ES and MPI-ESM-LR driven runs.

**Fill in the missing commands in the code blocks below**:

In [None]:
# HINT: The temperature data has filenames with the pattern:
# gcmid + '.mon.1986_2005.GERICS-REMO2015.tm.C.nc' or gcmid + '.mon.2041_2060.GERICS-REMO2015.tm.C.nc'

# Loop over GCMIDS


# Read in original data for baseline and future

# Apply land mask

# remove the 2D "true" lat and lon

# Guess bounds

# Calculate mean values over land points

# Save future & baseline area averaged monthly data (time series)

# Subtract baseline from future

# Save the data, make sure you follow the file naming convention!

    

In [None]:
# Do some plotting...
# Read in the monthly series


# Plot the two model time series on the same figure


---
<div class="alert alert-block alert-success">
<b>Question:</b> Write a short summary of these two graphs. Include:
        
- A description of what each plot shows
- The differences between the two models
- A consideration of the ways the climate in Africa might be different in the future
</div>

**Answer:**

_Type your answer here..._

<div class="alert alert-block alert-success">
    <b>Question:</b> Consider the plots we produced in Section 4.3.  What <b>additional</b> time series analysis could you do to support your consideration of future changes to climate in the question above?
</div>

---

<div class="alert alert-block alert-warning">
<center><b>This completes worksheet 4.</b></center><br>
    You have used Iris to investigate projected changes in model output by comparing 20 years of baseline data (1986-2005) against a future period (2041-2060). <br>
To do so, you have:
    
- calculated and plotted seasonal mean changes in temperature and precipitation
- masked out ocean data to focus on changes over land
- calculated anomalies by comparing future data to the historical mean period
- plotted time series of both temperature and precipitation anomalies over land for two different models<br>

In worksheet 5, you will investigate climate extremes using threshold and extreme climate indices.
</div>

<p><img src="img/MO_MASTER_black_mono_for_light_backg_RBG.png" alt="python + iris logo" style="float: center; height: 100px;"/></p>
<center>© Crown Copyright 2022, Met Office</center>