<img src="https://raw.githubusercontent.com/EO-College/cubes-and-clouds/main/icons/cnc_3icons_process_circle.svg"
     alt="Cubes & Clouds logo"
     style="float: center; margin-right: 10px;" />

In [1]:
# pip install openeo==0.23.0

# 2.3 Data Access and Basic Processing

## Reduce Operators

When computing statistics over time or indices based on multiple bands, it is possible to use reduce operators.

In openEO we can use the [reduce_dimension](https://processes.openeo.org/#reduce_dimension) process, which applies a reducer to a data cube dimension by collapsing all the values along the specified dimension into an output value computed by the reducer.
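
To make the collapsing behaviour concrete, here is a minimal NumPy sketch that is independent of openEO. The toy array, its axis order `(time, band, y, x)` and the variable names are assumptions for illustration only; openEO addresses dimensions by name rather than by axis position.

```python
import numpy as np

# Toy data cube with an assumed axis order of (time, band, y, x)
cube = np.arange(3 * 2 * 2 * 2, dtype=float).reshape(3, 2, 2, 2)

# Reducing along axis 0 ("time") collapses the 3 time steps
# into a single value for every (band, y, x) cell
min_over_time = cube.min(axis=0)

print(cube.shape)           # (3, 2, 2, 2)
print(min_over_time.shape)  # (2, 2, 2)
```

The reduced dimension disappears from the result, while all other dimensions keep their size.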

Reduce the temporal dimension to a single value per pixel and band, the minimum for instance:

In [2]:
import openeo
from openeo.local import LocalConnection

# Local connection: processing runs on this machine instead of a remote back-end
local_conn = LocalConnection('')

# Sentinel-2 L2A STAC collection and the area, period and bands of interest
url = "https://earth-search.aws.element84.com/v1/collections/sentinel-2-l2a"
spatial_extent = {"west": 11.259613, "east": 11.406212, "south": 46.461019, "north": 46.522237}
temporal_extent = ["2022-05-10T00:00:00Z", "2022-06-30T00:00:00Z"]
bands = ["red", "nir"]
datacube = local_conn.load_stac(url=url,
                                spatial_extent=spatial_extent,
                                temporal_extent=temporal_extent,
                                bands=bands)

# Collapse the time dimension, keeping the minimum value of each pixel and band
datacube_min_time = datacube.reduce_dimension(dimension="time", reducer="min")
datacube_min_time



Check what happens to the datacube by inspecting the resulting xarray object:

In [3]:
datacube_min_time.execute()

| | Array | Chunk |
|---|---|---|
| Bytes | 12.46 MiB | 3.98 MiB |
| Shape | (2, 713, 1145) | (1, 606, 860) |
| Dask graph | 8 chunks in 8 graph layers | 8 chunks in 8 graph layers |
| Data type | float64 numpy.ndarray | float64 numpy.ndarray |
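
The reported array size can be checked by hand: the reduced cube holds 2 × 713 × 1145 values of 8 bytes each (`float64`). The sketch below reproduces that arithmetic; the shape is copied from the output above, and reading it as `(band, y, x)` is an assumption based on the dimension that was reduced.

```python
import math

shape = (2, 713, 1145)   # assumed to be (band, y, x) after reducing time
bytes_per_value = 8      # float64

n_values = math.prod(shape)
size_mib = n_values * bytes_per_value / 2**20

print(n_values)            # 1632770
print(round(size_mib, 2))  # 12.46
```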


Any dimension of the datacube can be reduced in the same way.

We can, for instance, reduce the band dimension just as we did for the temporal dimension:

In [4]:
datacube_mean_band = datacube.reduce_dimension(dimension="band",reducer="mean")

The result now contains, for each pixel and time step, the average of the two bands:

In [5]:
datacube_mean_band.execute()

| | Array | Chunk |
|---|---|---|
| Bytes | 130.80 MiB | 3.98 MiB |
| Shape | (21, 713, 1145) | (1, 606, 860) |
| Dask graph | 84 chunks in 6 graph layers | 84 chunks in 6 graph layers |
| Data type | float64 numpy.ndarray | float64 numpy.ndarray |


**Quiz hint: look carefully at the number of pixels of the loaded datacube!**

The reducer can again be a single process, but spectral indices such as NDVI or NDSI are computed with an arithmetic formula instead.

For instance, the [NDVI](https://en.wikipedia.org/wiki/Normalized_difference_vegetation_index) formula can be expressed using a `reduce_dimension` process over the `band` dimension:

$$ NDVI = {{NIR - RED} \over {NIR + RED}} $$

In [6]:
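def NDVI(data):
    # Index order follows the bands list used when loading: ["red", "nir"]
    red = data.array_element(index=0)
    nir = data.array_element(index=1)
    ndvi = (nir - red) / (nir + red)
    return ndvi

# Apply the NDVI function as reducer over the band dimension
ndvi = datacube.reduce_dimension(reducer=NDVI, dimension="band")
ndvi.execute()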

| | Array | Chunk |
|---|---|---|
| Bytes | 130.80 MiB | 3.98 MiB |
| Shape | (21, 713, 1145) | (1, 606, 860) |
| Dask graph | 84 chunks in 9 graph layers | 84 chunks in 9 graph layers |
| Data type | float64 numpy.ndarray | float64 numpy.ndarray |
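
As a quick sanity check of the NDVI formula, the sketch below applies it to a few made-up reflectance values with NumPy. The numbers and the `(band, pixel)` layout are assumptions for illustration; row 0 holds red and row 1 holds NIR, mirroring the order of `bands` used when loading the data.

```python
import numpy as np

# Made-up reflectances for three pixels; rows follow the band order ["red", "nir"]
pixels = np.array([
    [0.1, 0.2, 0.3],   # red
    [0.5, 0.2, 0.1],   # nir
])

red, nir = pixels[0], pixels[1]

# Two band values go in, one index value comes out per pixel
ndvi_values = (nir - red) / (nir + red)

print(ndvi_values.shape)  # (3,)
print(ndvi_values)
```

For non-negative reflectances the result always lies between -1 and 1: positive where NIR exceeds red, zero where both are equal, and negative where red exceeds NIR.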


Additionally, it is possible to reduce both spatial dimensions of the datacube at the same time.

To do this, we need the `reduce_spatial` process.

The following example uses it to compute the standard deviation (`sd`) over space.

In [7]:
datacube_spatial_sd = datacube.reduce_spatial(reducer="sd")
datacube_spatial_sd

Verify that the spatial dimensions were collapsed:

In [8]:
datacube_spatial_sd.execute()

| | Array | Chunk |
|---|---|---|
| Bytes | 336 B | 8 B |
| Shape | (21, 2) | (1, 1) |
| Dask graph | 42 chunks in 7 graph layers | 42 chunks in 7 graph layers |
| Data type | float64 numpy.ndarray | float64 numpy.ndarray |
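
The same collapse of both spatial dimensions can be sketched with plain NumPy. The random toy cube and its axis order `(time, band, y, x)` are assumptions for illustration. `ddof=1` is used on the assumption that `sd` denotes the sample standard deviation; how a given back-end treats no-data pixels is not covered here.

```python
import numpy as np

# Toy cube with an assumed axis order of (time, band, y, x)
rng = np.random.default_rng(seed=0)
cube = rng.random((4, 2, 5, 6))

# Collapse both spatial axes at once; ddof=1 gives the sample standard deviation
spatial_sd = cube.std(axis=(2, 3), ddof=1)

print(cube.shape)        # (4, 2, 5, 6)
print(spatial_sd.shape)  # (4, 2)
```

Only the time and band dimensions remain, which matches the `(21, 2)` shape of the real result: one value per time step and band.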
