# Worksheet 2: Using the Python Iris library for analysis and visualisation

In this worksheet, sample PRECIS output over Southeast Asia driven by HadCM3Q0 and ECHAM5 is compared with observations for validation purposes. Validation of model results by comparison with observed data is an essential step - this is the measure by which we can assess the quality of the model and it informs appropriate uses of the data.


Here, we use PRECIS output driven by two different GCMs. Using data from both experiments will give us two representations of present day climate and two possible climate scenarios. For more details on multimodel approaches see the PRECIS workshop lecture on climate model ensembles.


The following are examples of types of analyses undertaken as part of a model validation. The methods shown are not necessarily the only way to proceed and are intended to demonstrate the use of Iris in model validation, and provide a starting point for your own analyses. For further help on validating your PRECIS simulations, refer to the PRECIS workshop lecture notes.

<div class="alert alert-block alert-warning">
<b>By the end of this worksheet you should be able to:</b><br> 
- Apply basic statistical operations to Iris cubes. <br>
- Plot information from Iris cubes.<br>
- Code a basic Python workflow for temperature and rainfall analysis.
</div>

<div class="alert alert-block alert-info">
<b>Note:</b> The data used here have been processed in the same way as in Worksheet 1. The 8-point rim has been removed and the data have been converted from PP to NetCDF format.
</div>

In [None]:
# Code preamble - these libraries will be used in this worksheet.
# This code block needs to be re-run every time you restart this worksheet!
%matplotlib inline
import os
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.dates as mdates
import calendar
import iris
import iris.coord_categorisation
import iris.quickplot as qplt
import cartopy.crs as ccrs
from mpl_toolkits.axes_grid1 import AxesGrid
from cartopy.mpl.geoaxes import GeoAxes

## 2.1 Inspecting the data

The datasets used here are daily and monthly data from two PRECIS runs carried out over Southeast Asia, one driven by HadCM3Q0 and the other driven by ECHAM5. The observations used for comparison are from the APHRODITE gridded observational data set (http://www.chikyu.ac.jp/precip/).

In Iris, data are read into an object called a cube. A cube contains the data of interest (e.g., temperature, rainfall, wind speeds) and metadata about a phenomenon. A single cube describes only one type of data. It is not possible for a cube to contain both temperature and rainfall, for example. A cube always has a name, a unit and an n-dimensional data array that represents the cube’s data. Additionally, the cube contains collections of coordinates. Coordinate types can include spatial information (latitude, longitude, altitude), a time dimension, or other information, e.g., an ensemble number.

<p><img src="img/multi_array_to_cube.png" alt="Example Iris cube" style="float: center; height: 300px;"/></p>

__a) Load the netCDF file for the HadCM3Q0 and ECHAM5 model data and the APHRODITE rainfall observation data and print the cube output__

A cube has coordinates (for example time, longitude, latitude or model levels), and this information can be accessed with commands. In the following exercise we follow a similar example to that in the [Iris documentation](http://scitools.org.uk/iris/docs/latest/userguide/navigating_a_cube.html#accessing-coordinates-on-the-cube) and find the latitudes and longitudes of the corners of the APHRODITE grid. You can do so by printing the points of the latitude and longitude coordinates (`.points`) and noting the first and last values in each array.

Before running the code take a look at it line by line to understand what steps are being made. Add code where prompted and then click in the box and press <kbd>ctrl</kbd> + <kbd>enter</kbd> to run the code.

In [None]:
# Feed in the names of the directories where the netCDF model files are stored
DATADIR = '/project/precis/worksheets/data/'
path_in_cahpa   = os.path.join(DATADIR, 'netcdf/cahpa/')
path_in_cahpb   = os.path.join(DATADIR, 'netcdf/cahpb/')
path_in_APHRODITE   = os.path.join(DATADIR, 'APHRODITE/')

# Load and print the HadCM3Q0 (cahpa) model cube data
cahpaData = iris.load_cube(path_in_cahpa + 'cahpa.pm.1981_1983.pr.norim.nc')
print(cahpaData)

# Load and print the ECHAM5 (cahpb) model cube data
cahpbData = iris.load_cube(path_in_cahpb + 'cahpb.pm.1981_1983.pr.norim.nc')
print(cahpbData)

# Load and print the APHRODITE observation cube data
aphroData = iris.load_cube(path_in_APHRODITE + 'aphro.mon.6190.nc')
print(aphroData)

__b) Extract a subset of the data within a cube__

The ability to extract data is an important function in Iris. The extraction of a subset of data is called slicing. For example, it could be necessary to extract data over all latitude and longitude grid points on the first time step. For more information on subsetting cubes, please read further [here](http://scitools.org.uk/iris/docs/latest/userguide/subsetting_a_cube.html#cube-indexing).

__Using the HadCM3Q0 data, the example below shows how to subset a cube for the first and last time steps. This method will be used later for plotting data from a cube.__

Work through the example below line by line then click in the box and press <kbd>Ctrl</kbd> + <kbd>Enter</kbd> to run the code.

This is the cube with all the time steps:

In [None]:
cahpaData

Iris cubes can be sliced in a similar way to arrays. 

This is the first time step of the cube:

In [None]:
cahpaData[0, :, :]

And this is the last time step of the cube:

In [None]:
cahpaData[-1, :, :]

<div class="alert alert-block alert-success">
<b>Question</b> What is the order of dimensions in a cube? <br>
a) [longitude, latitude, time]<br>
b) [time, latitude, longitude]<br>
c) [latitude, longitude, time]
</div>

## 2.2 Converting units

__a) Convert the OND (October, November, December) seasonal precipitation fields for both runs from kg/m2/s (equivalent to mm/s) to mm/day__

We could just multiply the raw data in mm/s by 86400 seconds, but a clearer way is to use the __`.convert_units()`__ method with the name of the units we want to convert the data into.
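
As a quick sanity check on the factor of 86400, the arithmetic can be written out without Iris. This is only a sketch: the function name and the example rate are made up for illustration and are not taken from the model output.

```python
# 1 kg of water spread over 1 m2 forms a layer 1 mm deep (water density ~1000 kg m-3),
# so a rate in kg m-2 s-1 is numerically the same as a rate in mm s-1.
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def kg_m2_s_to_mm_day(rate):
    """Convert a rainfall rate from kg m-2 s-1 to mm day-1."""
    return rate * SECONDS_PER_DAY


example_rate = 5.0e-5  # made-up value in kg m-2 s-1
print('{:.2f} mm day-1'.format(kg_m2_s_to_mm_day(example_rate)))
```

The `.convert_units()` method does this scaling for you and updates the cube's unit metadata at the same time, which is why it is preferred over multiplying the raw data by hand.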

For clarity let's do this for the __cahpa__ historical data first and break down the steps as follows:

* Print the units and a summary statistic for the data
* Convert the units and print the information again
* Rename the units in the cube and save it as a new netCDF file

In [None]:
print(cahpaData)

In [None]:
# print the unit
print('The current unit for data is: {}'.format(cahpaData.units))

# print the summary statistic (maximum monthly precipitation)
maxpr = np.max(cahpaData.data)
print('This is an example rainfall rate (kg m-2 s-1) prior to conversion: {:f}'.format(maxpr))

In [None]:
# Convert units to kg m-2 day-1 (same as multiplying by 86400 seconds)
cahpaData.convert_units('kg m-2 day-1')
# Print cube.units to view new units for precipitation
print('The new rainfall units are: {}'.format(cahpaData.units))
maxpr = np.max(cahpaData.data)

# print the summary statistic (maximum monthly precipitation) after the unit conversion
print('This is the same rainfall rate but in (kg m-2 day-1): {:f}'.format(maxpr))

Rename the new cube units for consistency, then save the converted cube:

In [None]:
# Rename cube units
cahpaData.units = 'mm day-1'
# Remove extraneous cube metadata.  This helps make cube comparisons easier later.
cahpaData.remove_coord('forecast_period')
cahpaData.remove_coord('forecast_reference_time')
# Save the new cube as a new netCDF file
iris.save(cahpaData, path_in_cahpa + 'cahpa.pm.1981_1983.pr.norim.mmday-1.nc')

Complete the following code block to repeat the same procedure for __cahpb__:

In [None]:
# Print the current cahpb cube units
print('The current unit for data is: {}'.format( ))

# convert units to kg m-2 day-1


# Rename the units to mm day-1. 1 kg m-2 is equivalent to 1 mm of rain


# save the cube into a new netCDF file


In [None]:
cahpbData.convert_units('kg m-2 day-1')
cahpbData.units = 'mm day-1'
cahpbData.remove_coord('forecast_period')
cahpbData.remove_coord('forecast_reference_time')
iris.save(cahpbData, path_in_cahpb + 'cahpb.pm.1981_1983.pr.norim.mmday-1.nc')

## 2.3 Climatological mean calculation

__a) Calculate the 1961-1990 seasonal mean precipitation field for October-December (OND) from both the HadCM3Q0 (cahpa) and ECHAM5 (cahpb) driven PRECIS runs__

Work through the example below line by line then click in the box and press <kbd>ctrl</kbd> + <kbd>enter</kbd> to run the code.

In [None]:
# Define output location
outdir = os.path.join(DATADIR, 'climatology/')
# Check to see if this directory exists, if not create it
if not os.path.isdir(outdir):
    # Make directory
    os.mkdir(outdir)
    # Set directory permissions 
    os.chmod(outdir, 0o776)

# Loop through two model runs
for jobid in ['cahpa', 'cahpb']:
    infile = os.path.join(DATADIR, 'netcdf', jobid, jobid + '.pm.1981_1983.pr.norim.mmday-1.nc')

    # Load the data
    data = iris.load_cube(infile)

    # In order to calculate OND mean, we use the command below to add season membership coordinate
    # The seasons can be any sequence of months, identified by the first letters of the names of the months.
    # Here, we define two seasons, jfmamjjas (the months we are not interested in) and ond (October, November and
    # December), the months we do want.
    iris.coord_categorisation.add_season(data, 'time', name='seasons', seasons=('jfmamjjas','ond'))

    # This command extracts data for the OND season using a constraint
    data_ond = data.extract(iris.Constraint(seasons='ond'))

    # The cube data_ond contains data for October-December for all years. The command below
    # calculates the mean over all years.
    seasonal_mean = data_ond.aggregated_by(['seasons'], iris.analysis.MEAN)
    
    # Save the OND seasonal mean as a netCDF
    outfile = os.path.join(outdir, jobid + '.a.OND.mean.baseline.pr.mmday-1.nc')
    iris.save(seasonal_mean, outfile)

In [None]:
print(seasonal_mean)
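
To make the categorise, extract and aggregate steps more concrete, here is a library-free sketch of the same logic applied to a plain list of monthly values. The records are synthetic (each month's "rainfall" is simply its month number) and the helper `season_of` is hypothetical; it only mimics the two-season split used above and is not how Iris implements `add_season`.

```python
from statistics import mean

# Synthetic monthly records: (year, month, value in mm day-1)
records = [(year, month, float(month))
           for year in (1981, 1982, 1983)
           for month in range(1, 13)]


def season_of(month):
    """Label a month number (1-12) with one of the two seasons used above."""
    return 'ond' if month in (10, 11, 12) else 'jfmamjjas'


# "Extract": keep only the values whose season label is 'ond'
ond_values = [value for (_, month, value) in records if season_of(month) == 'ond']

# "Aggregate": average all OND months from all years into a single number
ond_mean = mean(ond_values)
print(len(ond_values), ond_mean)
```

In the Iris version the same averaging is done independently at every grid point, so the result is a map rather than a single number.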

__b) Calculate the 1961-1990 seasonal mean for OND from the APHRODITE observation data__

APHRODITE is a daily high resolution (0.25 degree) rain gauge-based precipitation data set over Asia for 1950-2007. See http://www.chikyu.ac.jp/precip/ for more information.

Follow step a) and complete the code yourself.  The file name to load is: `aphro.mon.6190.nc`

In [None]:
# Directory names where data is read from and stored to
infile = os.path.join(DATADIR, 'APHRODITE', 'aphro.mon.6190.nc')

# Load the APHRODITE data


# In order to calculate the OND mean, we need to add a season membership coordinate


# Then constrain the cube just for the OND season


# Now calculate the mean over this season


# save the seasonal mean as a netCDF
outfile = os.path.join(outdir, 'aphro.a.OND.mean.baseline.pr.mmday-1.nc')


In [None]:
infile = os.path.join(DATADIR, 'APHRODITE', 'aphro.mon.6190.nc')
aphro_cube = iris.load_cube(infile)
iris.coord_categorisation.add_season(aphro_cube, 'time', name='seasons', seasons=('jfmamjjas','ond'))
aphro_cube_ond = aphro_cube.extract(iris.Constraint(seasons='ond'))
seasonal_mean = aphro_cube_ond.aggregated_by(['seasons'], iris.analysis.MEAN)
outfile = os.path.join(outdir, 'aphro.a.OND.mean.baseline.pr.mmday-1.nc')
iris.save(seasonal_mean, outfile)

<div class="alert alert-block alert-success">
<b>Question:</b> How would you calculate the standard deviation of mean rainfall?  How about annual maximum rainfall?
</div>

## 2.4 Iris quick plotting and visualising data

Now we will plot the output to take a first look at the 1961-1990 OND seasonal mean precipitation for each dataset. This is a first introduction to visualising data quickly with Iris; for further reading and instructions, please visit: http://scitools.org.uk/iris/docs/latest/userguide/plotting_a_cube.html

What are the differences between the plots? Note the colour bars.  Where are the largest daily rainfall rates distributed? Why do you think this is happening?

In [None]:
# Directory name where data is read from
indir = os.path.join(DATADIR, 'climatology')

# load cahpa model data
cahpa_cube = iris.load_cube(indir + '/cahpa.a.OND.mean.baseline.pr.mmday-1.nc')

# load cahpb model data
cahpb_cube = iris.load_cube(indir + '/cahpb.a.OND.mean.baseline.pr.mmday-1.nc')

# load APHRODITE data
obs_cube   = iris.load_cube(indir + '/aphro.a.OND.mean.baseline.pr.mmday-1.nc')

# Do some plotting!
# Create a figure of the size 12x10 inches
plt.figure(figsize=(12, 10))

plt.subplot(1, 3, 1)           # Create a new subplot for the model data: 1 row, 3 columns, 1st plot
levels = range(0, 22, 2)       # Define the contour levels for all plots

# Note this is where cube slicing is needed, as qplt.contourf can only plot
# two-dimensional cubes. Here we select index 0 along the time dimension, as there
# is only one timestep (the baseline 1961-1990 mean)
qplt.contourf(cahpa_cube[0], levels=levels, cmap=cm.RdBu)
                               

plt.title('Q0 model')          # plots a title for the plot
ax = plt.gca()                 # gca function that returns the current axes
ax.coastlines()                # adds coastlines defined by the axes of the plot

plt.subplot(1, 3, 2)           # Create a new subplot for the model data: 1 row, 3 columns, 2nd plot
qplt.contourf(cahpb_cube[0], levels=levels, cmap=cm.RdBu)

plt.title('ECHAM5 model')       # plots a title for the plot
ax = plt.gca()                 # gca function that returns the current axes
ax.coastlines()                # adds coastlines defined by the axes of the plot

plt.subplot(1, 3, 3)           # Create a new subplot for the observed data: 1 row, 3 columns, 3rd plot
                               # This plot sits to the right of the two model plots
qplt.contourf(obs_cube[0], levels=levels, cmap=cm.RdBu)

plt.title('APHRODITE obs')     # plots a title for the plot
ax = plt.gca()                 # gca function that returns the current axes
ax.coastlines()                # adds coastlines defined by the axes of the plot

plt.tight_layout()             # automatically adjusts subplot(s) to fit in to the figure area
plt.show()

## 2.5 Mean annual cycle calculation

If you have an area or region you want to focus on you can extract data for the region of interest. This example works through how to constrain your cube.

__a) Extract the area around Kuala Lumpur from the monthly precipitation data for both the HadCM3Q0 (cahpa) and ECHAM5 (cahpb) driven runs by specifying latitude and longitude coordinates__

In [None]:
# Constrain the cube area over Kuala Lumpur (KL).
# PRECIS uses a rotated grid, so the co-ordinates required are different to real world coordinates.

for jobid in ['cahpa', 'cahpb']:
    # Directory name where data are read from and stored to
    infile = os.path.join(DATADIR, 'netcdf', jobid, jobid + '.pm.1981_1983.pr.norim.mmday-1.nc')
    
    # Load the baseline precipitation data into a cube
    data = iris.load_cube(infile)
    # All grid cells whose longitudes and latitudes lie within the limits shown will be selected.
    data_KL = data.intersection(grid_longitude=(-8.17, -7.43),
                                grid_latitude=(-12.10, -11.38))

    # save the constrained cube
    outfile = os.path.join(DATADIR, 'netcdf', jobid, jobid + '.pm.1981_1983.pr.norim.mmday-1.KL.nc')
    iris.save(data_KL, outfile)
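
The selection made by `intersection` can be pictured with a small library-free sketch. The grid below is hypothetical (a made-up 0.22 degree spacing and origin), and the helper `select_within` only tests cell centres, whereas Iris also deals with cell bounds and longitude wrap-around.

```python
def select_within(points, lower, upper):
    """Return (index, value) pairs for coordinate points inside [lower, upper]."""
    return [(i, p) for i, p in enumerate(points) if lower <= p <= upper]


# Hypothetical rotated-grid coordinate points, for illustration only
grid_longitude = [round(-9.0 + 0.22 * i, 2) for i in range(10)]
grid_latitude = [round(-12.5 + 0.22 * j, 2) for j in range(10)]

# The same limits as used for the KL area above
lon_hits = select_within(grid_longitude, -8.17, -7.43)
lat_hits = select_within(grid_latitude, -12.10, -11.38)

print('longitude cells kept:', lon_hits)
print('latitude cells kept: ', lat_hits)
```

The extracted cube keeps every combination of the selected latitude and longitude cells, for all time steps.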

Now do the same for APHRODITE:

In [None]:
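# Note that the APHRODITE data are on a regular grid (unlike the model data), so real latitudes and longitudes are
# used to define the region around KL (more on this in section 2.6).
# Use the monthly APHRODITE data loaded in section 2.1 (aphroData) rather than the OND mean,
# because the annual cycle calculation below needs every month.
obs_cube_KL = aphroData.intersection(longitude=(101.25, 102.15),
                                     latitude=(2.74, 3.48))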

# save the constrained cube to directory
outfile = os.path.join(DATADIR, 'APHRODITE', 'aphro.pm.6190.KL.nc')
iris.save(obs_cube_KL, outfile)

__b) We now calculate monthly mean fields for 1961-1990 for each of the twelve months for the KL area__

In [None]:
for jobid in ['cahpa', 'cahpb']:
    # Set up the path to the data
    infile = os.path.join(DATADIR, 'netcdf', jobid, jobid + '.pm.1981_1983.pr.norim.mmday-1.KL.nc')
    
    # Load the data extracted around Kuala Lumpur created in previous step
    data = iris.load_cube(infile)

    # Add monthly coord categorisation to the time dimension coordinate
    iris.coord_categorisation.add_month_number(data, 'time', name='month_number')

    # Calculate monthly mean values
    monthly_mean = data.aggregated_by(['month_number'], iris.analysis.MEAN)

    # Calculate area averaged monthly mean rainfall 
    monthly_mean = monthly_mean.collapsed(['grid_longitude', 'grid_latitude'], iris.analysis.MEAN)

    # Save the area averaged monthly mean data
    outfile = os.path.join(DATADIR, 'climatology', jobid + '.monmean.1981_1983.pr.norim.mmday-1.KL.nc')
    iris.save(monthly_mean, outfile)
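
The `collapsed` call above takes a simple unweighted mean over the grid cells, which is reasonable for a small box near the (rotated) equator. For larger regions on a regular latitude-longitude grid, cells shrink towards the poles, so an area-weighted mean is usually preferred. The sketch below shows the idea with cos(latitude) weights; the function `area_mean` and the sample values are made up for illustration. Iris offers `iris.analysis.cartography.area_weights` for the same purpose.

```python
import math


def area_mean(field, latitudes, weighted=True):
    """Mean of field[j][i] over a box; row j of the field lies at latitudes[j] (degrees)."""
    total = 0.0
    weight_sum = 0.0
    for row, lat in zip(field, latitudes):
        # Cell area on a regular lat-lon grid scales with cos(latitude)
        weight = math.cos(math.radians(lat)) if weighted else 1.0
        for value in row:
            total += weight * value
            weight_sum += weight
    return total / weight_sum


# Made-up rainfall values (mm day-1) on a 3 x 2 box near Kuala Lumpur
kl_lats = [2.75, 3.0, 3.25]
kl_rain = [[6.0, 7.0], [8.0, 9.0], [7.0, 6.0]]

print('unweighted:', area_mean(kl_rain, kl_lats, weighted=False))
print('weighted:  ', area_mean(kl_rain, kl_lats, weighted=True))
```

For a box this small and this close to the equator the two results are almost identical, which is why the worksheet uses the simpler unweighted mean.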

__c) What is the KL area averaged monthly mean precipitation amount in mm/day for the HadCM3Q0 and ECHAM5 driven PRECIS runs?__ 

By plotting the data of the cubes note down the approximate values in mm day-1.

In [None]:
for jobid in ['cahpa', 'cahpb']:
    # Load the model cube
    inpath = os.path.join(DATADIR, 'climatology', jobid + '.monmean.1981_1983.pr.norim.mmday-1.KL.nc')
    cube = iris.load_cube(inpath)
    
    # Quick line plot for each cube 
    qplt.plot(cube.coord('month_number'), cube, label=jobid)
    plt.title('KL area averaged ' + jobid + ' monthly\n average of daily rainfall')
    ax = plt.gca()
    ax.xaxis.set_label_text('Month Number')
    ax.set_xlim(0.5, 12.5)
    plt.show()

__d) Now, following the same method as in b) above, find the 1961-1990 monthly means of the APHRODITE observations for the KL area__

In [None]:
# Directories to load from and save to
path_in = os.path.join(DATADIR, 'APHRODITE')
path_out = os.path.join(DATADIR, 'climatology')

# Load the KL extracted data created in step a)
aphrod = iris.load_cube(os.path.join(path_in, 'aphro.pm.6190.KL.nc'))

# Add monthly coord categorisation to the time dim coordinate
iris.coord_categorisation.add_month_number(aphrod, 'time', name='month_number')

# Now calculate monthly means
aphro_monthly_mean = aphrod.aggregated_by(['month_number'], iris.analysis.MEAN)

# Create the area averaged monthly mean of daily rainfall
aphro_monthly_mean = aphro_monthly_mean.collapsed(['longitude', 'latitude'], iris.analysis.MEAN)

# Save output
iris.save(aphro_monthly_mean, os.path.join(path_out, 'aphro.monmean.baseline.05216.rr8.ext.mmday.nc'))

<div class="alert alert-block alert-success">
<b>Question:</b> Plot the observations and the HadCM3Q0 and ECHAM5 driven PRECIS runs. What are the differences between the observations and the models, and in which months are the differences greatest?
</div>

## 2.6 Comparing models and observations

In 2.4 we saw how to plot individual model output on a map, but to compare spatial model and observation fields properly they must be on the same grid. We will regrid to the coarsest grid. For the data used here, the observations have the coarsest resolution so we will regrid the model data onto the observation grid.

The PRECIS model data are on a grid known as a Rotated Grid. The idea is that the "real" north pole in the Arctic is shifted such that the equator relative to our rotated pole then runs through the centre of the regional model domain.
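
The relationship between true and rotated coordinates can be sketched with a little spherical trigonometry. The function `to_rotated` below is a hypothetical helper written for this illustration, and the pole position is an illustrative value, not the one used for the Southeast Asia domain. It assumes the common convention in which rotated longitude 0 lies on the meridian opposite the rotated pole. In practice, Iris provides `rotate_pole` and `unrotate_pole` in `iris.analysis.cartography` for this job.

```python
import math


def to_rotated(lon, lat, pole_lon, pole_lat):
    """Convert true (lon, lat) in degrees to rotated-pole (grid_lon, grid_lat)."""
    lam = math.radians(lon - pole_lon)
    phi = math.radians(lat)
    sin_p = math.sin(math.radians(pole_lat))
    cos_p = math.cos(math.radians(pole_lat))
    # Unit vector of the point, measured from the meridian of the rotated pole
    x = math.cos(phi) * math.cos(lam)
    y = math.cos(phi) * math.sin(lam)
    z = math.sin(phi)
    # Tilt about the y axis so that the rotated pole lands on the z axis
    x_r = x * sin_p - z * cos_p
    z_r = x * cos_p + z * sin_p
    # Rotated longitude 0 is taken on the side of the globe opposite the pole
    grid_lon = math.degrees(math.atan2(-y, -x_r))
    grid_lat = math.degrees(math.asin(max(-1.0, min(1.0, z_r))))
    return grid_lon, grid_lat


# Illustrative pole position (an assumption, not the worksheet domain's pole)
POLE_LON, POLE_LAT = 177.5, 37.5

# The point opposite the pole sits at the origin of the rotated grid
print(to_rotated(-2.5, 52.5, POLE_LON, POLE_LAT))
```

A point at the centre of the domain therefore has rotated coordinates close to (0, 0), where grid cells are nearly equal in area.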

__a) Regrid the multiannual OND mean model fields onto the observations grid__

Here we use the `regrid` method to regrid the model cubes onto the observation grid, using the nearest-neighbour scheme. First, load in the data.


In [None]:
# directory where data is stored
data_path = os.path.join(DATADIR, 'climatology')

# load cahpa
cahpa_model_cube = iris.load_cube(data_path + '/cahpa.a.OND.mean.baseline.pr.mmday-1.nc')
# load cahpb
cahpb_model_cube = iris.load_cube(data_path + '/cahpb.a.OND.mean.baseline.pr.mmday-1.nc')
# load APHRODITE into a cube
obs_cube = iris.load_cube(data_path + '/aphro.a.OND.mean.baseline.pr.mmday-1.nc')

Before we can regrid the model data to the grid used by the observations, the coordinate system used for the observations must be supplied.  In this case it is missing from the original NetCDF file, but the observations are on a regular longitude-latitude grid so the correct coordinate system is WGS84.

We define the WGS84 coordinate system and then apply it to the x- and y-axes (i.e. longitudes and latitudes) of the observations.  The coordinate system used by the model (the rotated grid) is already defined.

In [None]:
# Define WGS84 projection for obs data
wgs84 = iris.coord_systems.GeogCS(semi_major_axis=6378137.0, inverse_flattening=298.257223563)

# Apply WGS84 to obs cube
obs_cube.coord(axis='x').coord_system = wgs84
obs_cube.coord(axis='y').coord_system = wgs84

# Print out and compare the two coordinate systems
print(obs_cube.coord_system())
print(cahpa_model_cube.coord_system())

The next few lines regrid the model data to the regular grid used by the observations. From the figures above, we know that the area over which APHRODITE data are available is larger than the PRECIS model domain. Hence, the extrapolation mode is set to 'mask' so that any grid cells on the APHRODITE grid which do not overlap with model grid cells are masked off; otherwise, the model data would be extrapolated beyond the model domain, which would produce misleading results.

In [None]:
# Regrid the climate model data onto APHRODITE grid
cahpa_regrid = cahpa_model_cube.regrid(obs_cube, iris.analysis.Nearest(extrapolation_mode='mask'))
cahpb_regrid = cahpb_model_cube.regrid(obs_cube, iris.analysis.Nearest(extrapolation_mode='mask'))

# Save output
iris.save(cahpa_regrid, data_path + '/cahpa.a.OND.mean.baseline.pr.mmday-1.rg.nc')
iris.save(cahpb_regrid, data_path + '/cahpb.a.OND.mean.baseline.pr.mmday-1.rg.nc')
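
To illustrate what nearest-neighbour regridding with masking does, here is a one-dimensional, library-free sketch. The function `regrid_nearest_masked` and all the coordinate and rainfall values are made up; Iris works in two dimensions and uses masked arrays rather than `None`, so this only conveys the idea.

```python
def regrid_nearest_masked(src_points, src_values, target_points):
    """Nearest-neighbour regrid in 1-D; targets outside the source extent become None."""
    lower, upper = min(src_points), max(src_points)
    result = []
    for target in target_points:
        if target < lower or target > upper:
            # Outside the model domain: mask rather than extrapolate
            result.append(None)
        else:
            nearest = min(range(len(src_points)),
                          key=lambda i: abs(src_points[i] - target))
            result.append(src_values[nearest])
    return result


# Made-up model longitudes and rainfall, and a wider observation grid
model_lon = [100.0, 100.2, 100.4, 100.6]
model_rain = [2.0, 4.0, 6.0, 8.0]
obs_lon = [99.5, 100.0, 100.25, 100.55, 101.0]

regridded = regrid_nearest_masked(model_lon, model_rain, obs_lon)
print(regridded)
```

The first and last observation points lie outside the model longitudes, so they are masked; every other point takes the value of its closest model cell.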

Now that the model data have been regridded onto the observation grid: (i) load the netCDF files, and (ii) plot the APHRODITE and model data again (as in section 2.4) to compare them visually.

In [None]:
# Directory name where data is read from
indir = os.path.join(DATADIR, 'climatology')

# load cahpa model data
cahpa_cube = iris.load_cube(indir + '/cahpa.a.OND.mean.baseline.pr.mmday-1.rg.nc')

# load cahpb model data
cahpb_cube = iris.load_cube(indir + '/cahpb.a.OND.mean.baseline.pr.mmday-1.rg.nc')

# load APHRODITE data
obs_cube   = iris.load_cube(indir + '/aphro.a.OND.mean.baseline.pr.mmday-1.nc')

# Do some plotting!
plt.figure(figsize=(12, 10))

plt.subplot(1, 3, 1)
levels = range(0, 22, 2)


qplt.contourf(cahpa_cube[0], levels=levels, cmap=cm.RdBu)
plt.title('HadCM3Q0 precipitation \n on a global longitude latitude grid')
ax = plt.gca()                 # gca function that returns the current axes
ax.coastlines()                # adds coastlines defined by the axes of the plot

plt.subplot(1, 3, 2) 
qplt.contourf(cahpb_cube[0], levels=levels, cmap=cm.RdBu)
plt.title('ECHAM5 precipitation \n on a global longitude latitude grid')
ax = plt.gca()
ax.coastlines()

plt.subplot(1, 3, 3)
qplt.contourf(obs_cube[0], levels=levels, cmap=cm.RdBu)
plt.title('Observational APHRODITE precipitation \n on a coarser global longitude latitude grid')
ax = plt.gca()
ax.coastlines()
plt.tight_layout()
plt.show()

<div class="alert alert-block alert-success">
<b>Question:</b> What differences do you see?
</div>

__b) Find the difference between the model and the observation OND multiannual mean fields and plot maps to view the differences__

We can simply subtract the observations from the model data, so that positive values show where the model is wetter than the observations. There is a subtract function within Iris, but it cannot be used here: the model cubes contain extra coordinates which are not present in the obs cube, and Iris requires all coordinates within the cubes to match exactly.

In [None]:
# Make sure units are the same
obs_cube.units = cahpb_cube.units

# Make receiving cubes
cahpa_obs_diff = obs_cube.copy()
cahpb_obs_diff = obs_cube.copy()

# Replace data with the differences
cahpa_obs_diff.data = cahpa_cube.data - obs_cube.data

# cahpb - aphrodite differences
cahpb_obs_diff.data = cahpb_cube.data - obs_cube.data

# Save the differences
# iris.save(cahpa_obs_diff, os.path.join(data_path, 'diff.cahpa_aphro.OND.baseline.nc'))
# iris.save(cahpb_obs_diff, os.path.join(data_path, 'diff.cahpb_aphro.OND.baseline.nc'))

# Plotting
plt.figure(figsize=(12, 10))
plt.subplot(1, 2, 1)           # Create a new subplot for the first differences: 1 row, 2 columns, 1st plot

# Cut-out region with data. We use the intersection method to plot the region with data
qplt.pcolormesh(cahpa_obs_diff[0].intersection(longitude=(90, 135), latitude=(-20, 32)),
                vmax=10, vmin=-10,
                cmap=plt.get_cmap('RdBu'))   # Cube slicing is needed here because qplt.pcolormesh can only plot
                               # two-dimensional cubes; index 0 selects the only timestep
                               # (the baseline 1961-1990 mean)

plt.title('cahpa - obs')       # plots a title for the plot
ax = plt.gca()                 # gca function that returns the current axes
ax.coastlines()                # adds coastlines defined by the axes of the plot

plt.subplot(1, 2, 2)           # Create a new subplot for the second differences: 1 row, 2 columns, 2nd plot
qplt.pcolormesh(cahpb_obs_diff[0].intersection(longitude=(90, 135), latitude=(-20, 32)),
             vmax=10, vmin=-10,
             cmap=plt.get_cmap('RdBu'))

plt.title('cahpb - obs')       # plots a title for the plot
ax = plt.gca()                 # gca function that returns the current axes
ax.coastlines()                # adds coastlines defined by the axes of the plot

plt.show()
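
A difference map can also be condensed into a single number, such as the mean bias over all cells where both model and observations have data. The sketch below shows this on tiny made-up fields, using `None` to stand in for masked cells; the helper names `difference` and `mean_bias` are hypothetical.

```python
def difference(model, obs):
    """Cell-by-cell model minus obs; a cell masked (None) in either field stays masked."""
    return [[None if m is None or o is None else m - o
             for m, o in zip(model_row, obs_row)]
            for model_row, obs_row in zip(model, obs)]


def mean_bias(diff):
    """Average of all unmasked differences."""
    valid = [d for row in diff for d in row if d is not None]
    return sum(valid) / len(valid)


# Made-up 2 x 2 rainfall fields in mm day-1; None marks a cell outside the model domain
model = [[5.0, 7.0], [None, 9.0]]
obs = [[4.0, 8.0], [3.0, 6.0]]

diff = difference(model, obs)
print(diff)
print('mean bias (mm day-1):', mean_bias(diff))
```

A positive mean bias indicates that, on average, the model is wetter than the observations over the unmasked cells, although a small mean can hide large differences of opposite sign.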

## 2.7 Climatological mean and annual cycle for an ensemble

So far data from two models downscaled with PRECIS have been analysed. In this section we will look at some additional HadCM3 ensemble members and the CRU data set.

This gives us an ensemble of RCM data:

* HadCM3Q0 (cahpa)
* ECHAM5 (cahpb)
* HadCM3Q3 (cahpc)
* HadCM3Q10 (cahpd)
* HadCM3Q11 (cahpe)
* HadCM3Q13 (cahpf)

And observational datasets:

* APHRODITE
* CRU

The CRU data are a monthly global land-only dataset (1901-present) at 0.5 degree resolution. Nine variables are available, including mean, min, max temperature and precipitation. For further details please see: http://www.cru.uea.ac.uk/~timm/grid/CRU_TS_2_1.html

Taking an ensemble approach allows us to account for a range of uncertainty in the model projections.

Write a series of scripts to do the following:

__a) Calculate the OND seasonal mean and annual cycle (for the KL area) for 1.5m temperature and precipitation for CRU and APHRODITE observations__

__b) Calculate OND seasonal-mean and monthly-mean anomalies for the KL area for the 4 additional HadCM3Q ensemble members (cahpc, cahpd, cahpe & cahpf)__

__c) Plot a series of figures that shows 1) the monthly cycles of temperature and rainfall comparing the 6 models and the observations; and 2) the monthly differences between the models and observations__

In [None]:
'''
Here are some useful variables you might like to use in your scripts
'''
# Some helpful data locations
DATADIR = '/project/precis/worksheets/data'
APHRODIR = os.path.join(DATADIR, 'APHRODITE')
CRUDIR = os.path.join(DATADIR, 'CRU')
CLIMDIR = os.path.join(DATADIR, 'climatology')
MODELDIR = os.path.join(DATADIR, 'netcdf')

# Some helpful model variables
JOBID = ['cahpa', 'cahpb', 'cahpc', 'cahpd', 'cahpe', 'cahpf']
STASHCODES = ['03236', '05216']

# Kuala Lumpur domains...
# ... in rotated pole coordinates
grid_longitude=(-8.17, -7.43)
grid_latitude=(-12.10, -11.38)
# ... in true lat-lon coordinates
longitude=(101.25, 102.15)
latitude=(2.74, 3.48)

In [None]:
'''
a) Calculate the OND seasonal-mean and monthly-mean 1.5m temperature and precipitation 
for the KL area, for CRU and APHRODITE observations
'''
# Load APHRODITE data

# Load CRU data

# Extract KL area

# Add OND season categorisation

# Add monthly categorisation

# Extract season

# Aggregate cubes

# Find KL area average

# Check and add cube metadata

# Save cubes to CLIMDIR
# Remember to use the same naming convention we used earlier


In [None]:
'''
b) Calculate OND seasonal-mean and monthly-mean anomalies for the KL area 
for the 4 additional HadCM3Q ensemble members (cahpc, cahpd, cahpe & cahpf)
'''
# Load ensemble members
# Remember you need to do this for both precipitation AND temperature

# Regrid ensemble members onto observational grid
# Remember you need to check your model and obs cubes have the appropriate coordinate systems defined

# Extract the KL area. Remember you are now working in true lat-lon coordinates!

# Find OND and monthly means

# Calculate model anomalies
# Remember temp anomaly   = model - CRU data
#          precip anomaly = model - APHRO data

# Check cube metadata consistency and save


<div class="alert alert-block alert-success">
    <b>Question:</b> What difference would it make if we first extracted the KL area and <em>then</em> regridded the models? <br> 
Which order is best for preserving data integrity?
</div>

__c) Create four figures:__
    
    i) the monthly cycle of temperature (model and observations) 
    ii) the monthly cycle of rainfall (model and observations)
    iii) the monthly temperature anomaly for each model
    iv) the monthly precipitation anomaly for each model

In [None]:
'''
Plot 1: The monthly cycle of temperature (model and observations)
'''


In [None]:
'''
Plot 2: The monthly cycle of precipitation (model and observations)
'''


In [None]:
'''
Plot 3: The monthly temperature anomaly for each model
'''


In [None]:
'''
Plot 4: The monthly precipitation anomaly for each model
'''


<div class="alert alert-block alert-success">
    <b>Question:</b> How could you summarise the ensemble variability amongst model members in a plot?
</div>

<div class="alert alert-block alert-success">
    <b>Question:</b> How do the monthly temperature and precipitation anomalies compare to the OND average?
</div>

<div class="alert alert-block alert-success">
<b>Question:</b> What are the advantages and disadvantages of plotting spatial maps of temperature and rainfall variability over Kuala Lumpur?
</div>

© Crown Copyright 2018, Met Office