In [None]:
%matplotlib inline

## Iris introduction course
# 6. Data Processing

**Learning Outcome**: by the end of this section, you will be able to apply arithmetic and statistical operations on cube data.

**Duration:** 1.5 hours

**Overview:**<br>
6.1 [Cube Arithmetic](#arithmetic)<br>
6.2 [Aggregation and Statistics](#agg_and_stats)<br>
6.3 [Exercise](#ex_5)<br>
6.4 [Summary of the Section](#summary)

## Setup

In [None]:
import iris
import numpy as np

## 6.1 Cube Arithmetic<a id='arithmetic'></a>

Cubes support the basic mathematical operators, so you can add, subtract, multiply and divide cubes of compatible shape, as well as perform other mathematical operations on them:

In [None]:
a1b = iris.load_cube(iris.sample_data_path('A1B_north_america.nc'))
e1 = iris.load_cube(iris.sample_data_path('E1_north_america.nc'))

print(e1.summary(True))
print(a1b)

In [None]:
scenario_difference = a1b - e1
print(scenario_difference)

Notice that the resultant cube's name is now `unknown`, and that its `attributes` and `cell_methods` have disappeared; this is because these all differed between the two input cubes.

----

<div class="alert alert-block alert-warning">
    <b><font color='brown'>Exercise: </font></b>
    <p>Work out which aspects of the cube metadata have changed in the difference result, and print them out.</p>
    <p>What do you think is the purpose of these changes in the result ?</p>
    <p>What other metadata <i>is</i> preserved ?</p>
</div>

In [None]:
#
# edit space for user code
#

In [None]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.1a

----

It is also possible to operate on cubes with numeric scalars, NumPy arrays and even cube coordinates.

<div class="alert alert-block alert-warning">
    <b><font color='brown'>Exercise: </font></b>
    <p>Can you multiply the 'e1' air temperature cube by its own latitude coordinate ?</p>
    <p>What are the units of the result ?</p>
</div>

In [None]:
#
# edit space for user code
#

In [None]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.1b

----

Although a cube's units can be freely set to any valid unit, the calculation of result units and compatibility checking is built into the arithmetic operations.

For example:

In [None]:
six_feet = iris.cube.Cube(6.0, units='feet')
twelve_days = iris.cube.Cube(12.0, units='days')
print(six_feet / twelve_days)

<div class="alert alert-block alert-warning">
    <b><font color='brown'>Exercise: </font></b>
    <p>What do you predict will result from adding together the 'six_feet' and 'twelve_days' cubes ?</p>
</div>

In [None]:
#
# edit space for user code
#

In [None]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.1c

----

Note that you can update a cube's data and metadata directly, for instance by assigning to `cube.data`, `cube.standard_name` or `cube.units`.  When you do this, though, you need to take care that the metadata is still an accurate description of the data.  By changing data or metadata explicitly, you are taking responsibility for the result being correct.

<div class="alert alert-block alert-warning">
    <b><font color='brown'>Exercise: </font></b>
    <p>What happens if you change the name of `e1` to 'potential_temperature' ?</p>
    <p>What is the meaning of the resulting data cube ?</p>
    <p>What happens if you then set the units of this to a time period ?</p>
</div>

In [None]:
#
# edit space for user code
#

In [None]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.1d

----

Cube arithmetic also supports 'broadcasting' in the NumPy sense: operations between data with different shapes.

In fact we already saw this above, with `product = e1 * e1.coord('latitude')`.

Broadcasting is simpler in Iris than in numpy, because how the dimensions "line up" is determined by matching coordinates, rather than depending on the ordering of dimensions.
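
For comparison, here is a minimal sketch of the same kind of operation in plain NumPy, where the dimensions must be lined up by hand. The array shape, the values and the names `data`, `latitudes` and `product` are made up for illustration and are not taken from the `e1` cube.

```python
import numpy as np

# Hypothetical small array standing in for cube data,
# ordered (time, latitude, longitude).
data = np.ones((2, 3, 4))
latitudes = np.array([10.0, 20.0, 30.0])  # one value per latitude row

# Plain NumPy aligns trailing dimensions, so a length-3 array cannot be
# broadcast against a trailing (longitude) dimension of length 4.
try:
    data * latitudes
    naive_multiply_failed = False
except ValueError:
    naive_multiply_failed = True

# The latitude axis has to be lined up explicitly.
product = data * latitudes[np.newaxis, :, np.newaxis]
print(naive_multiply_failed, product.shape)
```

In Iris the matching is done through the coordinates, so no manual reshaping of this kind is needed.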

<div class="alert alert-block alert-warning">
    <b><font color='brown'>Exercise: </font></b>
    <p>The basic result values in the above example,
        <b><font face=courier color="black">product = e1 * e1.coord('latitude')</font></b>, 
        come from multiplying <font face=courier color="black">e1<b>.data</b></font> 
        times <font face=courier color="black">e1.coord('latitude')<b>.points</b></font>.</p>
    <p>What happens if you simply multiply those two arrays, and are the values the same ?</p>
</div>

In [None]:
#
# edit space for user code
#

In [None]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.1e

An even simpler example of broadcasting is doing arithmetic between a cube and a scalar value.

<div class="alert alert-block alert-warning">
    <b><font color='brown'>Exercise: </font></b>
    <p>What happens if you add <b><font face='courier' color='black'>5.2</font></b> to the <b><font face='courier' color='black'>e1</font></b> cube ?</p>
    <p>What is the meaning of the result ?</p>
</div>

In [None]:
#
# edit space for user code
#

In [None]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.1f

If the scalar is just a plain number, like this one, then it is assumed to have the same units as the cube.

However, a scalar _cube_ or _coordinate_ has its own units, which take part in the calculation,
as seen above in the "feet per day" calculation.

## 6.2 Cube aggregation and statistics<a id='agg_and_stats'></a>

Iris provides many standard univariate aggregations. An aggregation statistically collapses one or more dimensions of a cube, for example replacing a vertical dimension by its mean. Iris uses the term 'aggregators' for the statistical operations that can be used in this way.

A list of aggregators is available at http://scitools.org.uk/iris/docs/latest/iris/iris/analysis.html.

In [None]:
fname = iris.sample_data_path('uk_hires.pp')
cube = iris.load_cube(fname, 'air_potential_temperature')
print(cube.summary(True))

To take the vertical mean of this cube:

In [None]:
print(cube.collapsed('model_level_number', iris.analysis.MEAN))

NOTE: the printout shows that the result has a cell method of "mean: model_level_number".  Cell methods are a [CF metadata convention](http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#cell-methods) which records that data are the results of statistical operations.
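
As a rough illustration of what the collapse above computes on the underlying array, the following sketch uses plain NumPy on a small made-up array. The shape, the values and the name `level_axis` are assumptions for illustration and are not taken from the `uk_hires.pp` data.

```python
import numpy as np

# Hypothetical data ordered (time, model_level_number, y, x).
data = np.arange(3 * 7 * 4 * 5, dtype=float).reshape(3, 7, 4, 5)

# In NumPy the dimension to reduce is identified by its position.
level_axis = 1
vertical_mean = data.mean(axis=level_axis)

# The level dimension has gone; the other dimensions are unchanged.
print(data.shape, '->', vertical_mean.shape)
```

The difference in Iris is that the dimension is identified by a coordinate name rather than an axis number, and the operation is recorded in the result as a cell method.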

----

<div class="alert alert-block alert-warning">
    <b><font color='brown'>Exercise: </font></b>
    <p>How can you calculate all-time minimum temperatures for this data, and what is the form of the result ?</p>
</div>

In [None]:
#
# edit space for user code
#

In [None]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.2a

----

In addition to `collapsed`, other types of statistical reduction are possible.  These also use aggregators to define the statistic.  See the following documentation areas:

 * [Cube.collapsed](https://scitools.org.uk/iris/docs/latest/iris/iris/cube.html?highlight=collapsed#iris.cube.Cube.collapsed), as discussed above.
 * [Cube.rolling_window](https://scitools.org.uk/iris/docs/latest/iris/iris/cube.html?highlight=rolling#iris.cube.Cube.rolling_window).
 * [Cube.aggregated_by](https://scitools.org.uk/iris/docs/latest/iris/iris/cube.html?highlight=aggregated_by#iris.cube.Cube.aggregated_by), used with the [coord_categorisation module](https://scitools.org.uk/iris/docs/latest/iris/iris/coord_categorisation.html?highlight=categor#module-iris.coord_categorisation).  
This provides calculations of a "group-by-and-reduce" pattern; these are explained later, in the "Advanced Concepts" section.
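
To give a feel for these two kinds of reduction, here is a minimal sketch in plain NumPy on a made-up 1-D series. It shows only the underlying arithmetic and does not use the Iris API; the names `values`, `years`, `rolling_mean` and `group_mean` are hypothetical, and `sliding_window_view` assumes NumPy 1.20 or later.

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Rolling window: the mean of each run of 3 consecutive points.
windows = np.lib.stride_tricks.sliding_window_view(values, 3)
rolling_mean = windows.mean(axis=-1)

# Group-by-and-reduce: a hypothetical category label for each point,
# then one mean per distinct label.
years = np.array([2000, 2000, 2001, 2001, 2001, 2002])
unique_years, group_index = np.unique(years, return_inverse=True)
group_mean = np.bincount(group_index, weights=values) / np.bincount(group_index)

print(rolling_mean)
print(dict(zip(unique_years.tolist(), group_mean.tolist())))
```

A rolling window shortens the dimension by the window length minus one, whereas group-by-and-reduce leaves one point per category.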

## 6.3 Section Review Exercise: Arithmetic and Statistics<a id='ex_5'></a>

Let's apply all that we've learned about data processing and visualisation in Iris, by comparing two possible future climate scenarios, called the A1B scenario and the E1 scenario.

#### 1\. Load data
Load as cubes the datasets found at `iris.sample_data_path('E1_north_america.nc')` and `iris.sample_data_path('A1B_north_america.nc')`. Print the summary of each cube.

In [None]:
# user code ...

In [None]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.3a

----

#### 2a\. Plot E1, A1B and difference data

Plot the following data in a single figure with three maps, side-by-side in one row:

 * the air temperature in the E1 scenario for the year 2099, 
 * the air temperature in the A1B scenario for the year 2099, and
 * the difference between the two scenarios.

Think about the most appropriate matplotlib colormap(s) to use for each plot.

Hint: the different matplotlib colormaps can be seen at https://matplotlib.org/1.5.3/examples/color/colormaps_reference.html. 

2b\. What information do your plots show? 

In [None]:
# user code ...

In [None]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.3b

----

#### 3. Produce time sequences of global area-averaged air temperature

Perform an average over all X and Y (making a 1-D time sequence), for each scenario. Calculate the model difference between these two cubes.

HINT: see the documentation on [iris.cube.Cube.collapsed](https://scitools.org.uk/iris/docs/latest/iris/iris/cube.html#iris.cube.Cube.collapsed)
and [iris.analysis.cartography.area_weights](https://scitools.org.uk/iris/docs/latest/iris/iris/analysis/cartography.html#iris.analysis.cartography.area_weights)
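
To see why weights matter for an area average, consider this small NumPy sketch on a made-up two-row grid. It uses the cosine of latitude as an approximate weight, which is an assumption for illustration only: `area_weights` works from the grid-cell bounds of the cube rather than from this shortcut. All names and values here are hypothetical.

```python
import numpy as np

# Hypothetical grid: one row at the equator and one at 60 degrees north.
latitudes = np.array([0.0, 60.0])
data = np.array([[10.0, 10.0],
                 [20.0, 20.0]])  # ordered (latitude, longitude)

# Grid cells shrink towards the poles, roughly as cos(latitude).
weights = np.cos(np.deg2rad(latitudes))[:, np.newaxis] * np.ones_like(data)

unweighted = data.mean()
weighted = np.average(data, weights=weights)
print(unweighted, weighted)
```

The high-latitude row covers less area, so it contributes less to the weighted mean than to the simple mean.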


In [None]:
# user code ...

In [None]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.3c

----

#### 4\. Draw comparison line plots

Make a single plot with the data from the two absolute temperature cubes you produced in part 3. Make sure you label the lines you plot.  Also plot the difference "e1 - a1b" for comparison.

In [None]:
# user code ...

In [None]:
# SAMPLE SOLUTION
# %load solutions/iris_exercise_6.3d

----

## 6.4 Summary of Section: Data processing<a id='summary'></a>

In this section we learnt:
* cubes can be combined with arithmetic operators like addition, as for NumPy arrays.  Broadcasting also works.
* coordinates can also be used in cube arithmetic. 
* aggregators are provided to perform statistical aggregations of cube data.
* statistics can be calculated over selected dimensions, identified by coordinates.
