## Loading cubes when Cell Methods are present 

### Introduction

Iris can restrict loading to a subset of a dataset based on the metadata it interprets onto each cube. This includes the ability to constrain on a cube's cell methods, if present. Cell methods record operations that have already been applied to the data, such as climatological and diurnal statistics. Available methods include point, sum, mean, maximum, minimum, mid_range, standard_deviation, variance, mode, and median. A cell method is associated with one or more coordinates, and an interval is often defined as well. An example cell method might be a *time mean of two hours*: the method is "mean", the coordinate name is "time", and the interval of two hours indicates that the mean was taken over two-hour periods.

This worked example shows how to constrain the loading of cubes dependent on whether they have one or more cell methods present.

### Writing a constraining function

The `iris.Constraint` class can be used to constrain cube loading in many different ways. The one of interest here is the keyword argument `cube_func`, which lets us supply a function that takes a cube as its only argument and returns either `True` or `False`.

Below is a function that does precisely that, returning `True` or `False` depending on whether the cube passed has cell methods set or not.

We will also import Iris, as it's going to be needed later on.

In [1]:
import iris
print(f'Iris version: {iris.__version__}')

Iris version: 3.10.0


In [2]:
def has_cell_methods(cube):
    cm = cube.cell_methods
    return len(cm) > 0

The `cell_methods` attribute is always a tuple, which is empty if no cell methods are set on the cube. An empty tuple has length zero, which gives us a logical test to run against the input cube to determine whether any cell methods are set, as performed in the `return` line above.
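Because an empty tuple is falsy in Python, the length test can also be written as a truthiness check. The sketch below is a minimal illustration that does not need Iris: the `SimpleNamespace` objects are hypothetical stand-ins for cubes that model only the `cell_methods` attribute.

```python
from types import SimpleNamespace


def has_cell_methods(cube):
    # An empty tuple is falsy, so bool() gives the same answer as len(...) > 0.
    return bool(cube.cell_methods)


# Hypothetical stand-ins for cubes; a real Iris cube exposes far more than this.
cube_without = SimpleNamespace(cell_methods=())
cube_with = SimpleNamespace(cell_methods=("placeholder cell method",))

print(has_cell_methods(cube_without))  # False
print(has_cell_methods(cube_with))     # True
```

Either form should work as a `cube_func`; the explicit `len` comparison simply makes the intent more visible.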

### Simple constraining

All that remains to do is to use our cell methods constraint function when loading some data. We will use three Iris sample data files to test whether it is working. They are:

* A1B_north_america.nc (one cube of air temperature that does have cell methods),
* ostia_monthly.nc (one cube of surface_temperature with cell methods), and
* colpex.pp (comprising two cubes [air potential temperature and air pressure], neither of which has any cell methods).
    
If our constraint function works as expected, it should keep the cubes from the first two files and reject both cubes from the third file.

Let's test this. First we need to add our function to a constraint that we can use when loading our data:

In [3]:
cm_constraint = iris.Constraint(cube_func=has_cell_methods)

Now we can test it out on our sample data. Let's start by loading the data into an Iris `CubeList` containing the four cubes held in our three files:

In [4]:
a1b_fname = iris.sample_data_path('A1B_north_america.nc')
colpex_fname = iris.sample_data_path('colpex.pp')
ostia_fname = iris.sample_data_path('ostia_monthly.nc')

cubelist = iris.load([a1b_fname, colpex_fname, ostia_fname])
for cube in cubelist:
    print('{}\n\tCell Methods = {}\n'.format(cube.summary(True), cube.cell_methods))

air_potential_temperature / (K)     (time: 6; model_level_number: 10; grid_latitude: 83; grid_longitude: 83)
	Cell Methods = ()

air_pressure / (Pa)                 (time: 6; model_level_number: 10; grid_latitude: 83; grid_longitude: 83)
	Cell Methods = ()

air_temperature / (K)               (time: 240; latitude: 37; longitude: 49)
	Cell Methods = (CellMethod(method='mean', coord_names=('time',), intervals=('6 hour',), comments=()),)

surface_temperature / (K)           (time: 54; latitude: 18; longitude: 432)
	Cell Methods = (CellMethod(method='mean', coord_names=('month', 'year'), intervals=(), comments=()),)





We can now extract from our cubelist using the constraint we defined above. This will leave only the cubes where cell methods are defined:

In [5]:
print(cubelist.extract(cm_constraint))

0: air_temperature / (K)               (time: 240; latitude: 37; longitude: 49)
1: surface_temperature / (K)           (time: 54; latitude: 18; longitude: 432)


Good news! Using our constraint to extract from the cubelist has returned only the two cubes with cell methods defined.

### Constraining based on the methods used

We can go further than the example above and constrain on the specifics of the cell methods used. The printed cubelist above shows that while both cubes with cell methods use the same method (`mean`), the details of that mean differ, in terms of both the coordinates affected and the interval defined.

We can use these differences to select *just one* of the two cubes with cell methods. Let's define a constraint that will only load the `air_temperature` cube in the list immediately above:

In [6]:
interval_6_hr = iris.coords.CellMethod('mean', coords='time', intervals='6 hour')
constraint_6_hr = iris.Constraint(cube_func=lambda cube: interval_6_hr in cube.cell_methods)

This is a two-step process. First we define a cell method that matches the one we want to find. Then, in the new constraint, we test whether that cell method is among the cell methods of a given cube. Let's test our new constraint on our cubelist:
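The `in` test relies on equality between cell methods, so a match is only expected when the method, coordinate names, and interval all agree. The sketch below illustrates that idea without Iris: `FakeCellMethod` is a hypothetical named tuple mirroring the fields shown in the printed output earlier, and it assumes that the real `CellMethod` compares equal on the same fields.

```python
from collections import namedtuple

# Hypothetical stand-in with the same fields as the printed CellMethod repr.
FakeCellMethod = namedtuple(
    "FakeCellMethod", ["method", "coord_names", "intervals", "comments"]
)

# The cell method we want to find: a 6-hourly mean over time.
wanted = FakeCellMethod("mean", ("time",), ("6 hour",), ())

# Cell methods as reported for the two sample cubes that have them.
air_temperature_methods = (FakeCellMethod("mean", ("time",), ("6 hour",), ()),)
surface_temperature_methods = (
    FakeCellMethod("mean", ("month", "year"), (), ()),
)

print(wanted in air_temperature_methods)      # True
print(wanted in surface_temperature_methods)  # False
```

Both cubes use the `mean` method, yet only the first matches, because the coordinates and interval also take part in the comparison.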

In [7]:
print(cubelist.extract(constraint_6_hr))

0: air_temperature / (K)               (time: 240; latitude: 37; longitude: 49)


So now we can extract from a cubelist based not only on the general presence of cell methods, but also on the presence of a specific, pre-defined cell method instance.
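A looser variation is to match on the method name alone, ignoring coordinates and intervals. The sketch below is one possible way to build such a predicate; `uses_method` is a hypothetical helper, not part of Iris, and the `SimpleNamespace` objects are stand-ins that model only the attributes the predicate reads.

```python
from types import SimpleNamespace


def uses_method(method_name):
    """Build a predicate that accepts cubes with any cell method of this name."""
    def cube_func(cube):
        # any() over an empty tuple is False, so cubes without
        # cell methods are rejected automatically.
        return any(cm.method == method_name for cm in cube.cell_methods)
    return cube_func


has_mean = uses_method("mean")
# With Iris available, this could be wrapped in the same way as before:
# iris.Constraint(cube_func=has_mean)

# Hypothetical stand-ins for cubes and their cell methods.
mean_cube = SimpleNamespace(cell_methods=(SimpleNamespace(method="mean"),))
max_cube = SimpleNamespace(cell_methods=(SimpleNamespace(method="maximum"),))
bare_cube = SimpleNamespace(cell_methods=())

print(has_mean(mean_cube))  # True
print(has_mean(max_cube))   # False
print(has_mean(bare_cube))  # False
```

Applied to the sample cubelist, a predicate like this would be expected to keep both the `air_temperature` and `surface_temperature` cubes, since each reports a `mean` cell method.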