# Aggregation (spatial downsampling)

### Continuous aggregation

The core aggregation code is written in Cython, in raster_utilities.aggregation.spatial.core.continuous.pyx. 

A helper class raster_utilities.aggregation.spatial.SpatialAggregator is provided to manage calling the Cython code.

This notebook demonstrates using the helper class to aggregate a series of continuous-type raster files.

The code is written to handle input rasters of theoretically unlimited size: the inputs are read in tiles, which are used to build up the coarser (smaller) output grids. Memory use is therefore determined by the size of the output files and by the number of categories, i.e. the number of output files that are created.
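
As a rough illustration of the tiling idea described above, here is a minimal NumPy sketch. It is not the Cython implementation: it assumes a block-mean statistic, an integer aggregation factor, and an in-memory array standing in for a raster that would really be read one window at a time. The function name and its parameters are hypothetical.

```python
import numpy as np

def aggregate_mean_tiled(src, factor, ndv_in, ndv_out, tile_rows=4):
    """Block-mean aggregation of a 2D array, consuming `src` in strips of rows.

    Hypothetical sketch: only the output-sized accumulators are held for the
    whole run, so memory scales with the output grid, not the input.
    """
    in_h, in_w = src.shape
    out_h = -(-in_h // factor)  # ceiling division
    out_w = -(-in_w // factor)
    sums = np.zeros((out_h, out_w), dtype=np.float64)
    counts = np.zeros((out_h, out_w), dtype=np.int64)

    for row0 in range(0, in_h, tile_rows):
        # In a real run this would be a windowed read from the raster file.
        tile = src[row0:row0 + tile_rows]
        rows, cols = np.nonzero(tile != ndv_in)  # skip incoming nodata cells
        out_r = (rows + row0) // factor          # output row of each valid cell
        out_c = cols // factor                   # output column of each valid cell
        # np.add.at accumulates correctly when indices repeat.
        np.add.at(sums, (out_r, out_c), tile[rows, cols])
        np.add.at(counts, (out_r, out_c), 1)

    # Output cells that received no valid input keep the output nodata value.
    out = np.full((out_h, out_w), ndv_out, dtype=np.float64)
    valid = counts > 0
    out[valid] = sums[valid] / counts[valid]
    return out
```

Because the running sums and counts are kept per output cell, the strip height does not need to be a multiple of the aggregation factor.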

In [2]:
# The helper class
from raster_utilities.aggregation.spatial import SpatialAggregator

In [3]:
# Enumerations providing the accepted values for the aggregation parameters,
# so there is no need to remember the strings
from raster_utilities.aggregation.aggregation_values import (
    AggregationTypes,
    ContinuousAggregationStats,
)

In [4]:
import glob

### Run a continuous aggregation across a series of files in a folder

In [5]:
# The files to be aggregated should be provided as a list of filepaths. 
# (Just make a single-item list for one file)
inContFiles = glob.glob(r'J:\MOD11A2_DiurnalDiffs\1km\Synoptic\*.Synoptic.*.*.1km.Data.tif')

# Also provide the output folder
outDir = r'J:\MOD11A2_DiurnalDiffs\5km\Synoptic'

Specify the output nodata value. It doesn't have to match the input: the incoming nodata value (NDV) is read from each file, so make sure it is set correctly there.

In [None]:
ndvOut = -9999

Specify the aggregation statistics to create. This must be a list of items from the ContinuousAggregationStats enumeration, or their string representations.
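
To show how an enumeration can accept either its members or their string representations, here is a small self-contained sketch. `DemoStats` and `normalise_stats` are hypothetical stand-ins written for this example; they are not the library's ContinuousAggregationStats, whose actual members and values may differ.

```python
from enum import Enum

class DemoStats(Enum):
    """Hypothetical stand-in for a statistics enumeration."""
    MEAN = "mean"
    MAX = "max"
    # Inside the class body MEAN and MAX are still plain strings, so the
    # value of ALL is the list ["mean", "max"].
    ALL = [MEAN, MAX]

def normalise_stats(stats):
    """Accept enum members or their string values; return enum members."""
    return [s if isinstance(s, DemoStats) else DemoStats(s) for s in stats]

# Members and strings can be mixed freely.
requested = normalise_stats([DemoStats.MEAN, "max"])
# A convenience member can expand to every statistic via its value.
everything = normalise_stats(DemoStats.ALL.value)
```

Under this assumed design, passing the `.value` of a convenience member is just a shorthand for listing every statistic by name.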

In [6]:
# e.g.
# stats = [ContinuousAggregationStats.MEAN, ContinuousAggregationStats.MAX]

# or, to do all of them, use this convenience:
stats = ContinuousAggregationStats.ALL.value

Finally, configure the aggregation. The final parameter for the SpatialAggregator constructor should be a dictionary that configures how the aggregation will run.

* The dictionary must contain a key that is a member of the AggregationTypes enumeration, i.e. AggregationTypes.RESOLUTION, AggregationTypes.FACTOR, or AggregationTypes.SIZE. This key determines which of three ways is used to set the resolution of the output files.
* The value of this key should be as follows:
    * AggregationTypes.RESOLUTION: Float value, or one of the strings "1km", "5km" or "10km"
    * AggregationTypes.FACTOR: Int value (e.g. 5 to go from 1km rasters to 5km rasters)
    * AggregationTypes.SIZE: 2-tuple specifying the (height, width) of the output rasters

* A key "resolution_name" may be provided, which provides the name for the output resolution to be used as the fifth token of the 6-token output filenames (e.g. "5km")

* A key "mem_limit_gb" may be provided to limit memory use (the default is 30GB). The limit is only approximate, so set it conservatively!
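
The three ways of specifying the output grid, and the link between output size and memory use, can be sketched as follows. This is an illustration under stated assumptions, not the library's logic: `AggKind`, `output_shape` and `estimate_memory_gb` are hypothetical names, resolution is handled only as a float in the same units as the input cell size (the "1km"-style strings are not covered), and the memory estimate simply assumes one 8-byte value per output cell per output file.

```python
from enum import Enum
import math

class AggKind(Enum):
    """Hypothetical stand-in for the aggregation-type enumeration."""
    RESOLUTION = "resolution"
    FACTOR = "factor"
    SIZE = "size"

def output_shape(in_shape, in_res, kind, value):
    """Return the (height, width) of the output grid for one config entry."""
    in_h, in_w = in_shape
    if kind is AggKind.SIZE:
        out_h, out_w = value
        return int(out_h), int(out_w)
    if kind is AggKind.FACTOR:
        factor = int(value)
    elif kind is AggKind.RESOLUTION:
        factor = value / in_res  # ratio of output to input cell size
    else:
        raise ValueError(f"Unknown aggregation kind: {kind!r}")
    # Round before taking the ceiling to absorb floating-point noise.
    return (math.ceil(round(in_h / factor, 6)),
            math.ceil(round(in_w / factor, 6)))

def estimate_memory_gb(out_shape, n_outputs, bytes_per_cell=8):
    """Rough memory needed to hold every output grid at once, in GiB."""
    out_h, out_w = out_shape
    return out_h * out_w * n_outputs * bytes_per_cell / 1024 ** 3

# Example: a factor of 5 applied to a 100 x 200 input gives a 20 x 40 output.
shape = output_shape((100, 200), 1.0, AggKind.FACTOR, 5)
needed_gb = estimate_memory_gb(shape, n_outputs=4)
```

An estimate like this can be compared against the chosen memory limit before starting a long run, bearing in mind that any real accounting will include overheads not modelled here.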


In [8]:
# e.g.
# Resolution can be a floating point number, or a string representing 
# one of the core mastergrid resolutions "1km", "5km", or "10km".
aggArgs = {AggregationTypes.RESOLUTION:"5km", "resolution_name":"5km"}

Now just instantiate and run the aggregation:

In [7]:
agg = SpatialAggregator(inContFiles, outDir, ndvOut, stats, aggArgs)

In [None]:
agg.RunAggregation()