# Temporal aggregation

This is a demonstration of how to run temporal aggregations using the Cython library code. The same approach works for any other aggregation of multiple files into one, not only temporal ones.

The core code is written in Cython (`raster_utilities/aggregation/temporal/core/temporal.pyx`), and a helper class in `raster_utilities/aggregation/temporal/temporal_aggregation_runner.py` is provided to load the data and pass it to the core function.

This notebook demonstrates how to use the helper code.

Import the aggregation helper class

In [1]:
from raster_utilities.aggregation.temporal.temporal_aggregation_runner import TemporalAggregator


We initialise the class with a dictionary in which the keys are the required output tags (e.g. years) and each value is a list of the files corresponding to that period. The first step in building that dictionary is to find the input files.

In [2]:
import glob

In [3]:
inFilePattern = r'\\map-fs1.ndph.ox.ac.uk\map_data\mastergrids\Other_Global_Covariates\Rainfall\CHIRPS\10k\*.tif'
inFiles = glob.glob(inFilePattern)
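
`glob.glob` returns an empty list rather than raising an error when nothing matches, and it does not guarantee the order of its results, so a quick check at this point can save a confusing failure later. The following is a minimal sketch only: `find_inputs` is a hypothetical helper, not part of `raster_utilities`, and the temporary folder and filenames are made up to stand in for the network share.

```python
import glob
import os
import tempfile

def find_inputs(pattern):
    """Return the sorted list of files matching pattern, failing loudly if none are found."""
    files = sorted(glob.glob(pattern))
    if not files:
        raise IOError("No input files matched: " + pattern)
    return files

# Demonstration with made-up filenames in a temporary folder
demo_dir = tempfile.mkdtemp()
for name in ["CHIRPS.2001.02.tif", "CHIRPS.2001.01.tif", "notes.txt"]:
    open(os.path.join(demo_dir, name), "w").close()

demo_files = find_inputs(os.path.join(demo_dir, "*.tif"))
print([os.path.basename(f) for f in demo_files])
```

Sorting is not required for a sum, but it makes the run reproducible and the file lists easier to inspect.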

Build the dictionary by extracting the date from each filename. This will need changing to suit the filename pattern in use and the type of output wanted (annual, monthly, or synoptic months).

We can use a `defaultdict` rather than a plain `dict`, which simplifies the loop a bit.
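
To illustrate the difference, here is a small comparison of the two approaches on made-up `(key, file)` pairs. The data are purely illustrative and are not taken from the CHIRPS folder.

```python
from collections import defaultdict

# Made-up (key, file) pairs for illustration
pairs = [("2000", "a.tif"), ("2001", "b.tif"), ("2000", "c.tif")]

# Plain dict: each key must be created before it can be appended to
plain = {}
for key, fn in pairs:
    if key not in plain:
        plain[key] = []
    plain[key].append(fn)

# defaultdict(list): a missing key is created as an empty list on first access
grouped = defaultdict(list)
for key, fn in pairs:
    grouped[key].append(fn)

print(dict(grouped) == plain)
```

One caveat: simply reading a missing key from a `defaultdict` creates it, so use `key in grouped` to test for membership rather than `grouped[key]`.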

The dictionary keys are used to create the output filenames, so you might want to alter the strings slightly to make them more informative.

In [None]:
import os
from collections import defaultdict

In [10]:
# Build a dictionary keyed by year, to create annual outputs
processingKey = defaultdict(list)
for fn in inFiles:
    # The second dot-separated token of the filename is taken to be the year
    parts = os.path.basename(fn).split('.')
    yr = parts[1]
    outkey = "CHIRPS." + yr
    processingKey[outkey].append(fn)
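
For the other output types mentioned above, only the key needs to change. The sketch below is an illustration under stated assumptions rather than part of the library: `build_processing_key` is a hypothetical helper, and it assumes filenames of the form `CHIRPS.<year>.<month>.<rest>.tif`. Only the position of the year token is confirmed by the loop above; the position of the month token is a guess that must be checked against the real filenames.

```python
import os
from collections import defaultdict

def build_processing_key(files, mode="annual", prefix="CHIRPS"):
    """Group files by output tag.

    Assumes (hypothetically) names like 'CHIRPS.2002.07.Data.tif', where the
    second dot-separated token is the year and the third is the month.
    """
    key = defaultdict(list)
    for fn in files:
        parts = os.path.basename(fn).split(".")
        yr, mth = parts[1], parts[2]
        if mode == "annual":            # one output per year
            tag = "{}.{}".format(prefix, yr)
        elif mode == "monthly":         # one output per year-month
            tag = "{}.{}.{}".format(prefix, yr, mth)
        elif mode == "synoptic_month":  # one output per calendar month, across all years
            tag = "{}.Synoptic.{}".format(prefix, mth)
        else:
            raise ValueError("Unknown mode: " + mode)
        key[tag].append(fn)
    return key

# Made-up filenames for illustration
demo = ["CHIRPS.2001.01.Data.tif", "CHIRPS.2001.02.Data.tif", "CHIRPS.2002.01.Data.tif"]
print(sorted(build_processing_key(demo, "synoptic_month").items()))
```

The monthly mode is only meaningful when the inputs are finer than monthly (for example daily files); with one file per month each group would contain a single file.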

We also need to specify the output folder, the output nodata value, and whether to create a synoptic (overall) output as well. The synoptic output doubles memory use, so only enable it if you need it.

In [5]:
outDir = r"E:\Data\Harry\Documents\dataprep\CHIRPS_Summary"
outNDV = -9999
doSynoptic = True

Finally we need to specify which statistics to calculate; what is appropriate will depend on the data. For rainfall we just want a sum.
The statistics must be specified as a list of values from the `TemporalAggregationStats` class. You can also use `TemporalAggregationStats.ALL`.

In [6]:
from raster_utilities.aggregation.aggregation_values import TemporalAggregationStats

In [7]:
stats = [TemporalAggregationStats.SUM]
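
To make concrete what a sum with a nodata value means per pixel, here is a conceptual NumPy sketch. It is not the library's Cython implementation: `nodata_aware_sum` is a hypothetical function, and the rule that a pixel with no valid inputs becomes nodata in the output is an assumption about sensible behaviour, not something confirmed from the library.

```python
import numpy as np

def nodata_aware_sum(arrays, in_ndv, out_ndv):
    """Per-pixel sum across a stack of 2D arrays, skipping nodata cells."""
    stack = np.stack(arrays).astype(np.float64)
    valid = stack != in_ndv
    # Nodata cells contribute zero to the running total
    total = np.where(valid, stack, 0.0).sum(axis=0)
    count = valid.sum(axis=0)
    # Assumption: a pixel with no valid inputs at all becomes nodata in the output
    return np.where(count > 0, total, out_ndv)

# Two made-up 2x2 "months" of rainfall, with -9999 marking nodata
jan = np.array([[1.0, -9999.0], [2.0, -9999.0]])
feb = np.array([[3.0, 4.0], [5.0, -9999.0]])
result = nodata_aware_sum([jan, feb], in_ndv=-9999, out_ndv=-9999)
print(result)
```

Here the top-right pixel is valid in only one input and so takes that single value, while the bottom-right pixel is nodata in both inputs and stays nodata.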

Now we just need to instantiate the class and run the aggregation

In [12]:
agg = TemporalAggregator(processingKey, outDir, outNDV, stats, doSynoptic)

In [13]:
agg.RunAggregation()

Running entire extent in one pass
CHIRPS.2002
CHIRPS.2003
CHIRPS.2000
CHIRPS.2001
CHIRPS.2006
CHIRPS.2007
CHIRPS.2004
CHIRPS.2005
CHIRPS.2015
CHIRPS.2014
CHIRPS.2008
CHIRPS.2016
CHIRPS.2011
CHIRPS.2010
CHIRPS.2013
CHIRPS.2012
CHIRPS.2009
All done!
