<!--COURSE_INFORMATION-->
<img align="left" style="padding-right:10px;" src="https://sitejerk.com/images/google-earth-logo-png-5.png" width=5% >
<img align="right" style="padding-left:10px;" src="https://colab.research.google.com/img/colab_favicon_256px.png" width=6% >


>> *This notebook is part of the free course [EEwPython](https://colab.research.google.com/github/csaybar/EEwPython/blob/master/index.ipynb); the content is available [on GitHub](https://github.com/csaybar/EEwPython)* and released under the [Apache 2.0 License](https://www.gnu.org/licenses/gpl-3.0.en.html). 99% of this material has been adapted from [Google Earth Engine Guides](https://developers.google.com/earth-engine/).


<a href="https://colab.research.google.com/github/csaybar/EEwPython/blob/master/5_Reducer.ipynb"><img align="left" src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab" title="Open and Execute in Google Colaboratory"></a>

<center>
<h1>Google Earth Engine with Python </h1>
<h2> Reducers </h2>
</center>
<h2> Topics:</h2>

1. Reducer Overview
2. ImageCollection Reductions
3. Image Reductions
4. Statistics of an Image Region
5. Statistics of Image Regions
6. Statistics of Image Neighborhoods
7. Raster to Vector Conversion
8. Vector to Raster Conversion
9. Grouped Reductions and Zonal Statistics
10. Weighted Reductions
11. Linear Regression



## Connecting GEE with Google Services

- **Authenticate to Earth Engine**

In [0]:
!pip install earthengine-api #earth-engine Python API

In [0]:
!earthengine authenticate 

- **Authenticate to Google Drive (OPTIONAL)**

In [0]:
from google.colab import drive
drive.mount('/content/drive')

- **Authenticate to Google Cloud (OPTIONAL)**

In [0]:
from google.colab import auth
auth.authenticate_user()

## Testing the software setup

In [0]:
# Earth Engine Python API
import ee
ee.Initialize()

In [0]:
import folium

# Define the URL format used for Earth Engine generated map tiles.
EE_TILES = 'https://earthengine.googleapis.com/map/{mapid}/{{z}}/{{x}}/{{y}}?token={token}'

print('Folium version: ' + folium.__version__)

In [0]:
#@title Mapdisplay: Display GEE objects using folium.
def Mapdisplay(center, dicc, Tiles="OpenStreetMap", zoom_start=10):
    '''
    :param center: Center of the map (Latitude and Longitude).
    :param dicc: Earth Engine Geometries or Tiles dictionary
    :param Tiles: Mapbox Bright,Mapbox Control Room,Stamen Terrain,Stamen Toner,stamenwatercolor,cartodbpositron.
    :param zoom_start: Initial zoom level for the map.
    :return: A folium.Map object.
    '''
    mapViz = folium.Map(location=center,tiles=Tiles, zoom_start=zoom_start)
    for k,v in dicc.items():
      if ee.image.Image in [type(x) for x in v.values()]:
        folium.TileLayer(
            tiles = v["tile_fetcher"].url_format,
            attr  = 'Google Earth Engine',
            overlay =True,
            name  = k
          ).add_to(mapViz)
      else:
        folium.GeoJson(
        data = v,
        name = k
          ).add_to(mapViz)
    mapViz.add_child(folium.LayerControl())
    return mapViz

# 1. Reducer Overview

Reducers are the way to aggregate data over time, space, bands, arrays and other data structures in Earth Engine. The `ee.Reducer` class specifies how data is aggregated. The reducers in this class can specify a simple statistic to use for the aggregation (e.g. minimum, maximum, mean, median, standard deviation, etc.), or a more complex summary of the input data (e.g. histogram, linear regression, list). Reductions may occur over:

- **time**  = `imageCollection.reduce()`
- **space** = `image.reduceRegion()` and `image.reduceNeighborhood()`
- **bands** = `image.reduce()`

Or the attribute space of a `FeatureCollection` (`featureCollection.reduceColumns()` or `FeatureCollection` methods that start with `aggregate_`).


## Reducers have inputs and outputs

Reducers take an input dataset and produce a single output. When a single-input reducer is applied to a multi-band image, Earth Engine automatically replicates the reducer and applies it separately to each band. As a result, the output image has the same number of bands as the input image; each band in the output is the reduction of pixels from the corresponding band in the input data. Some reducers take tuples of input datasets. These reducers will not be automatically replicated for each band. For example, `ee.Reducer.linearRegression()` takes multiple predictor datasets (representing independent variables in the regression) in a particular order (see the Regression reducers section below).

Some reducers produce multiple outputs, for example `ee.Reducer.minMax()`, `ee.Reducer.histogram()` or `ee.Reducer.toList()`. For example:

In [0]:
# Load and filter the Sentinel-2 image collection.
collection = ee.ImageCollection('COPERNICUS/S2')\
               .filterDate('2016-01-01', '2016-12-31')\
               .filterBounds(ee.Geometry.Point([-81.31, 29.90]))

# Reduce the collection.
extrema = collection.reduce(ee.Reducer.minMax())

This will produce an output with twice the number of bands of the inputs, where band names in the output have ‘_min’ or ‘_max’ appended to the band name.
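
The naming rule can be illustrated without calling Earth Engine at all. The following is a minimal pure-Python sketch, assuming each image is reduced to a single pixel stored in a dictionary; the band values and the helper `min_max` are hypothetical and are not part of the Earth Engine API.

```python
# Pure-Python illustration (no Earth Engine calls): each "image" is a dict
# mapping a band name to the value of one pixel at the same location.
images = [
    {"B2": 1200, "B3": 1100},
    {"B2": 900, "B3": 1500},
    {"B2": 1000, "B3": 1300},
]


def min_max(images):
    """Reduce single-pixel images to per-band minima and maxima."""
    out = {}
    for band in images[0]:
        values = [img[band] for img in images]
        out[band + "_min"] = min(values)
        out[band + "_max"] = max(values)
    return out


extrema_pixel = min_max(images)
print(extrema_pixel)
# {'B2_min': 900, 'B2_max': 1200, 'B3_min': 1100, 'B3_max': 1500}
```

Two input bands become four output values, one `_min` and one `_max` per band.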

The output type should match the computation. For example, a reducer applied to an ImageCollection has an Image output. Because the output is interpreted as a pixel value, you must use reducers with a numeric output to reduce an ImageCollection (reducers like `toList()` or `histogram()` won’t work).

## Reducers use weighted inputs

By default, reductions over pixel values are weighted by their mask, though this behavior can be changed (see the Weighted Reductions section below). Pixels with mask equal to 0 will not be used in the reduction.

## Combining reducers

If you intend to apply multiple reducers to the same inputs, it's good practice to `combine()` the reducers for efficiency. Specifically, calling `combine()` on a reducer with `sharedInputs` set to `True` will result in only a single pass over the data. For example, to compute the mean and standard deviation of pixels in an image, you could use something like this:

In [0]:
from pprint import pprint 

# Load a Landsat 8 image.
image = ee.Image('LANDSAT/LC08/C01/T1/LC08_044034_20140318')

# Combine the mean and standard deviation reducers.
reducers = ee.Reducer.mean().combine(
  reducer2=ee.Reducer.stdDev(),
  sharedInputs=True
)

# Use the combined reducer to get the mean and SD of the image.
stats = image.reduceRegion(
  reducer=reducers,
  bestEffort=True,
)

# Display the dictionary of band means and SDs.
pprint(stats.getInfo())

In the output, note that the names of the reducers have been appended to the names of the inputs to distinguish the reducer outputs. This behavior also applies to image outputs, which will have the name of the reducer appended to output band names.
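
To see why a combined reducer needs only one pass over the data, here is a rough pure-Python sketch that accumulates the count, the sum and the sum of squares in a single loop. It only illustrates the idea and is not how Earth Engine implements `combine()`; the helper name, the sample values and the use of the population standard deviation are assumptions of this sketch.

```python
import math


def mean_and_std_dev(values):
    """Compute mean and population standard deviation in a single pass."""
    n = 0
    total = 0.0
    total_sq = 0.0
    for v in values:  # every value is visited exactly once
        n += 1
        total += v
        total_sq += v * v
    mean = total / n
    variance = total_sq / n - mean * mean  # population variance
    return mean, math.sqrt(max(variance, 0.0))


# Hypothetical pixel values for one band.
pixels = {"B4": [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]}

stats_sketch = {}
for band, values in pixels.items():
    mean, std_dev = mean_and_std_dev(values)
    # Mimic the naming convention: reducer name appended to the band name.
    stats_sketch[band + "_mean"] = mean
    stats_sketch[band + "_stdDev"] = std_dev

print(stats_sketch)  # {'B4_mean': 5.0, 'B4_stdDev': 2.0}
```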

# 2. ImageCollection Reductions

Consider the example of needing to take the median over a time series of images represented by an `ImageCollection`. To reduce an `ImageCollection`, use `imageCollection.reduce()`. This reduces the collection of images to an individual image as illustrated in Figure 1. Specifically, the output is computed pixel-wise, such that each pixel in the output is composed of the median value of all the images in the collection at that location. To get other statistics, such as mean, sum, variance, an arbitrary percentile, etc., the appropriate reducer should be selected and applied. For basic statistics like min, max, mean, etc., `ImageCollection` has shortcut methods like `min()`, `max()`, `mean()`, etc. They function in exactly the same way as calling reduce(), except the resultant band names will not have the name of the reducer appended.


<center>
<img src="https://developers.google.com/earth-engine/images/Reduce_ImageCollection.png">
</center>

<center>
  Figure 1. Illustration of an `ee.Reducer` applied to an `ImageCollection`.
</center>

In [0]:
# Load an image collection, filtered so it's not too much data.
collection = ee.ImageCollection('LANDSAT/LT05/C01/T1')\
               .filterDate('2008-01-01', '2008-12-31')\
               .filter(ee.Filter.eq('WRS_PATH', 44))\
               .filter(ee.Filter.eq('WRS_ROW', 34))

# Compute the median in each band, each pixel.
# Band names are B1_median, B2_median, etc.
median = collection.reduce(ee.Reducer.median())

# The output is an Image.  Add it to the map.
vis_param = {'bands': ['B4_median', 'B3_median', 'B2_median'], 'gamma': 1.6}
median_tk = median.getMapId(vis_param)

center = [37.7924, -122.3355]
Mapdisplay(center,{'Landsat 5':median_tk},zoom_start=9)

This returns a multi-band Image, each pixel of which is the median of all unmasked pixels in the ImageCollection at that pixel location. Specifically, the reducer has been repeated for each band of the input imagery. Note that the band names have the name of the reducer appended: ‘B1_median’, ‘B2_median’, etc.
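
The pixel-wise behaviour, including the skipping of masked pixels, can be sketched in plain Python. This is only an illustration under simplifying assumptions: the images are tiny 2x2 single-band grids, `None` stands in for a masked pixel, and an even number of values is averaged by `statistics.median`, which may differ from how Earth Engine resolves that case.

```python
from statistics import median

# Three tiny 2x2 single-band "images"; None marks a masked pixel.
stack = [
    [[10, 20], [30, None]],
    [[12, None], [28, 40]],
    [[11, 26], [None, 44]],
]


def median_composite(stack):
    """Return the per-pixel median of all unmasked values in the stack."""
    rows, cols = len(stack[0]), len(stack[0][0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            valid = [img[r][c] for img in stack if img[r][c] is not None]
            if valid:  # a pixel masked in every image stays masked
                out[r][c] = median(valid)
    return out


composite = median_composite(stack)
print(composite)  # [[11, 23.0], [29.0, 42.0]]
```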


# 3. Image Reductions

To reduce an `Image`, use `image.reduce()`. Reducing an image works analogously to `imageCollection.reduce()`, except that the bands of the image are the inputs to the reducer rather than the images in the collection. The output is also an image, with a number of bands equal to the number of reducer outputs. For example:




In [0]:
# Load an image and select some bands of interest.
image = ee.Image('LANDSAT/LC08/C01/T1/LC08_044034_20140318')\
          .select(['B4', 'B3', 'B2'])

# Reduce the image to get a one-band maximum value image.
maxValue = image.reduce(ee.Reducer.max())

# Display the result
vis_param = {'max': 13000, 'gamma': 1.6}
center = [37.7924, -122.3355]

maxValue_tk = maxValue.getMapId(vis_param)
Mapdisplay(center,{'Landsat 8':maxValue_tk},zoom_start=10)

# 4. Statistics of an Image Region

Suppose you need to calculate statistics over a region (or regions) of an `ee.Image`. To get statistics of pixel values in an image region, use `image.reduceRegion()`. This reduces all the pixels in the region(s) to a statistic or other compact representation of the pixel data in the region (e.g. histogram). The region is represented as a `Geometry`, which might be a polygon containing many pixels, or a single point, in which case there will only be one pixel in the region. In either case, as illustrated in Figure 2, the output is a statistic derived from the pixels in the region.

<center>
<img src="https://developers.google.com/earth-engine/images/Reduce_region_diagram.png">
</center>
  

<center>
  Figure 2. An illustration of an `ee.Reducer` applied to an image and a region.
</center>
  
  
For an example of getting pixel statistics in a region of an image using reduceRegion(), consider finding the mean spectral values of a 5-year Landsat composite within the boundaries of the Sierra Nevada Coniferous Forest:
  
  

In [0]:
# Load input imagery: Landsat 7 5-year composite.
image = ee.Image('LANDSAT/LE7_TOA_5YEAR/2008_2012')

# Load an input region: Sierra Nevada mixed conifer forest.
region = ee.Feature(ee.FeatureCollection(
  'ft:1Ec8IWsP8asxN-ywSqgXWMuBaxI6pPaeh6hC64lA')
  .filter(ee.Filter.eq('G200_REGIO', 'Sierra Nevada Coniferous Forests'))\
  .first())

# Reduce the region. The region parameter is the Feature geometry.
meanDictionary = image.reduceRegion(**{
  'reducer': ee.Reducer.mean(),
  'geometry': region.geometry(),
  'scale': 30,
  'maxPixels': 1e9
})

# The result is a Dictionary.  Print it.
pprint(meanDictionary.getInfo())

{'B1': 24.42471064449399,
 'B2': 22.57720943212569,
 'B3': 20.919795223268885,
 'B4': 53.67924806897354,
 'B5': 34.62817356591265,
 'B6_VCID_2': 198.17174001586343,
 'B7': 21.361143212564883}


Note that in this example the reduction is specified by providing the `reducer` (`ee.Reducer.mean()`), the `geometry` (`region.geometry()`), the `scale` (30 meters) and `maxPixels` for the maximum number of pixels to input to the reducer. A scale should always be specified in reduceRegion() calls. This is because in complex processing flows, which may involve data from different sources with different scales, the scale of the output will not be unambiguously determined from the inputs. In that case, the scale defaults to 1 degree, which generally produces unsatisfactory results. See [this page](https://developers.google.com/earth-engine/scale) for more information about how Earth Engine handles scale.

There are two ways to set the scale: by specifying the `scale` parameter, or by specifying a CRS and CRS transform (see the [glossary](https://developers.google.com/earth-engine/glossary) for more information about CRSs and CRS transforms). For example, the `meanDictionary` reduction above is equivalent to the following:

In [0]:
# As an alternative to specifying scale, specify a CRS and a CRS transform.
# Make this array by constructing a 4326 projection at 30 meters,
# then copying the bounds of the composite, from composite.projection().
affine = [0.00026949458523585647, 0, -180, 0, -0.00026949458523585647, 86.0000269494563]

# Perform the reduction, print the result.
pprint(image.reduceRegion(**{
  'reducer': ee.Reducer.mean(),
  'geometry': region.geometry(),
  'crs': 'EPSG:4326',
  'crsTransform': affine,
  'maxPixels': 1e9
}).getInfo()) 

{'B1': 24.42471064449399,
 'B2': 22.57720943212569,
 'B3': 20.919795223268885,
 'B4': 53.67924806897354,
 'B5': 34.62817356591265,
 'B6_VCID_2': 198.17174001586343,
 'B7': 21.361143212564883}


In general, specifying the scale is sufficient and results in more readable code. Earth Engine determines which pixels to input to the reducer by first rasterizing the region. If a scale is specified without a CRS, the region is rasterized in the image's native projection scaled to the specified resolution. If both a CRS and scale are specified, the region is rasterized based on them. Pixels are ‘in’ the region if their centroid is covered by the region at the specified scale and projection.
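
As a quick sanity check on the `affine` array used above, the pixel size in degrees can be recomputed from the 30 meter scale. The sketch below assumes the value was derived from the WGS84 equatorial radius (6378137 m), which is an assumption of this example rather than something stated in the source; it also assumes the usual `[xScale, xShearing, xTranslation, yShearing, yScale, yTranslation]` ordering of the transform.

```python
import math

# Assumption: one degree is measured along the WGS84 equator.
WGS84_EQUATORIAL_RADIUS_M = 6378137.0
metres_per_degree = 2 * math.pi * WGS84_EQUATORIAL_RADIUS_M / 360

# A 30 m pixel expressed in degrees.
pixel_size_deg = 30 / metres_per_degree

# The transform from the example above.
affine = [0.00026949458523585647, 0, -180,
          0, -0.00026949458523585647, 86.0000269494563]
x_scale, x_shear, x_origin, y_shear, y_scale, y_origin = affine

print(pixel_size_deg, x_scale)
```

The two printed values agree to within floating-point precision, and the negative `y_scale` reflects rows that run from north to south.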

The `maxPixels` parameter is needed to get the computation to succeed. If this parameter is left out of the example, an error is returned, which looks something like:



```
Dictionary (Error)
  Image.reduceRegion: Too many pixels in the region. Found 527001545, but only 10000000 allowed.
```


There are multiple options to get past these errors: increase maxPixels, as in the example, increase the scale, or set bestEffort to true, which automatically computes a new (larger) scale such that maxPixels is not exceeded. If you do not specify maxPixels, the default value is used.
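
To get a feel for how much the scale would have to grow, here is a back-of-the-envelope sketch using the numbers from the error message. It assumes the pixel count falls with the square of the scale; it is not Earth Engine's `bestEffort` algorithm, which is not described here.

```python
import math

found_pixels = 527_001_545  # pixel count reported in the error message
max_pixels = 10_000_000     # limit reported in the error message
scale_m = 30                # scale used in the example

# Assumption: halving the resolution quarters the number of pixels.
min_scale_m = scale_m * math.sqrt(found_pixels / max_pixels)
pixels_at_min_scale = found_pixels * (scale_m / min_scale_m) ** 2

print(round(min_scale_m, 1), "m")
```

Under that assumption, a scale of roughly 218 m would bring the region under the limit without changing `maxPixels`.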

# 4. Statistics of Image Regions

To get image statistics in multiple regions stored in a `FeatureCollection`, you can use `image.reduceRegions()` to reduce multiple regions at once. The input to `reduceRegions()` is an `Image` and a `FeatureCollection`. The output is another `FeatureCollection` with the `reduceRegions()` output set as properties on each `Feature`. In this example, means of the Landsat 7 5-year composite bands in each feature geometry will be added as properties to the input features:


In [0]:
# Load input imagery: Landsat 7 5-year composite.
image = ee.Image('LANDSAT/LE7_TOA_5YEAR/2008_2012')

# Load a FeatureCollection of counties in Maine.
maineCounties = ee.FeatureCollection('ft:1S4EB6319wWW2sWQDPhDvmSBIVrD3iEmCLYB7nMM')\
                  .filter(ee.Filter.eq('StateName', 'Maine'))

# Add reducer output to the Features in the collection.
maineMeansFeatures = image.reduceRegions(**{
  'collection': maineCounties,
  'reducer': ee.Reducer.mean(),
  'scale': 30,
})

# Print the first feature, to illustrate the result.
pprint(ee.Feature(maineMeansFeatures.first()).select(image.bandNames()).getInfo())

{'geometry': {'coordinates': [[[-70.02323189999998, 44.134471999999995],
                               [-70.021187, 44.158581],
                               [-70.01097100000001, 44.15873299999999],
                               [-70.006828, 44.17503],
                               [-69.998047, 44.172485],
                               [-69.993637, 44.182384],
                               [-70.075027, 44.208988],
                               [-70.074059, 44.291981],
                               [-70.121117, 44.374111],
                               [-70.10193599999998, 44.38496000000001],
                               [-70.1249239, 44.486477],
                               [-70.1996379, 44.479504],
                               [-70.2317659, 44.46626700000001],
                               [-70.273293, 44.445076],
                               [-70.260872, 44.36852999999999],
                               [-70.319565, 44.21826599999999],
                             

# 5. Statistics of Image Neighborhoods

Rather than specifying a region over which to perform a reduction, it is also possible to specify a neighborhood in which to apply a reducer. To reduce image neighborhoods, use `image.reduceNeighborhood()`. In this case, the reduction will occur in a sliding window over the input image, with the window size and shape specified by an `ee.Kernel`. The output of `reduceNeighborhood()` will be another image, with each pixel value representing the output of the reduction in a neighborhood around that pixel in the input image. Figure 3 illustrates this type of reduction.

<center>
<img src="https://developers.google.com/earth-engine/images/Reduce_Neighborhood.png">  
</center>  

<center>
Figure 3. Illustration of `reduceNeighborhood()`, where the reducer is applied in a kernel.
</center>  


For example, consider using National Agriculture Imagery Program (NAIP) imagery to quantify landscape differences resulting from logging in the California redwood forests. Specifically, use standard deviation (SD) in a neighborhood to represent the difference in texture between the logged area and the protected area. To get the texture of a NAIP Normalized Difference Vegetation Index (NDVI) image, use `reduceNeighborhood()` to compute SD in a neighborhood defined by a kernel:



In [0]:
# Define a region in the redwood forest.
redwoods = ee.Geometry.Rectangle(-124.0665, 41.0739, -123.934, 41.2029)

# Load input NAIP imagery and build a mosaic.
naipCollection = ee.ImageCollection('USDA/NAIP/DOQQ')\
                   .filterBounds(redwoods)\
                   .filterDate('2012-01-01', '2012-12-31')

naip = naipCollection.mosaic()

# Compute NDVI from the NAIP imagery.
naipNDVI = naip.normalizedDifference(['N', 'R'])

# Compute standard deviation (SD) as texture of the NDVI.
texture = naipNDVI.reduceNeighborhood(**{
  'reducer': ee.Reducer.stdDev(),
  'kernel': ee.Kernel.circle(7),
})

# Display the results.
center = redwoods.centroid().getInfo()['coordinates']
center.reverse()
dicc ={'NAIP input imagery': naip.getMapId(),
       'NDVI': naipNDVI.getMapId({'min': -1, 'max': 1, 'palette': ['FF0000', '00FF00']}),
       'SD of NDVI':texture.getMapId({'min': 0, 'max': 0.3})}
Mapdisplay(center,dicc,zoom_start=12)
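
The sliding-window idea behind `reduceNeighborhood()` can be sketched in pure Python. This is a simplified illustration, not the Earth Engine implementation: it swaps the circular kernel for a square window, truncates the window at the grid border, and uses a small hypothetical NDVI grid.

```python
from statistics import pstdev


def neighborhood_std_dev(grid, radius=1):
    """Population SD in a square window centred on each pixel."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the values inside the window, clipped to the grid.
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = pstdev(window)
    return out


# Hypothetical NDVI values: uniform on the left, patchy on the right.
ndvi_grid = [
    [0.8, 0.8, 0.8, 0.2, 0.8],
    [0.8, 0.8, 0.8, 0.8, 0.2],
    [0.8, 0.8, 0.8, 0.2, 0.8],
]

texture_grid = neighborhood_std_dev(ndvi_grid)
print(texture_grid[1][0], texture_grid[1][3])
```

The uniform area yields a texture of zero, while the patchy area yields a clearly positive value, which is the contrast the SD image is meant to expose.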

# 6. Statistics of FeatureCollection Columns

To reduce properties of features in a `FeatureCollection`, use `featureCollection.reduceColumns()`. Consider the following toy example:


In [0]:
# Make a toy FeatureCollection.
aFeatureCollection = ee.FeatureCollection([
  ee.Feature(None, {'foo': 1, 'weight': 1}),
  ee.Feature(None, {'foo': 2, 'weight': 2}),
  ee.Feature(None, {'foo': 3, 'weight': 3}),
])

# Compute a weighted mean and display it.
pprint(aFeatureCollection.reduceColumns(**{
  'reducer': ee.Reducer.mean(),
  'selectors': ['foo'],
  'weightSelectors': ['weight']
}).getInfo())

{'mean': 2.3333333333333335}
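
The result can be checked by hand: the weighted mean is the sum of value times weight, divided by the sum of the weights. A minimal sketch in plain Python, using the same toy values as dictionaries instead of Earth Engine features:

```python
# Same toy values as the FeatureCollection above, as plain dictionaries.
rows = [
    {"foo": 1, "weight": 1},
    {"foo": 2, "weight": 2},
    {"foo": 3, "weight": 3},
]

weighted_sum = sum(r["foo"] * r["weight"] for r in rows)  # 1 + 4 + 9 = 14
total_weight = sum(r["weight"] for r in rows)             # 1 + 2 + 3 = 6
weighted_mean = weighted_sum / total_weight

print(weighted_mean)  # 2.3333333333333335
```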


As a more complex example, consider a `FeatureCollection` of US counties with census data as attributes. The variables of interest are total population and total housing units. You can get their sum(s) by supplying a summing reducer argument to `reduceColumns()` and printing the result:

In [0]:
# Load a collection of US counties with census data properties and display it.
counties = ee.FeatureCollection('ft:1S4EB6319wWW2sWQDPhDvmSBIVrD3iEmCLYB7nMM')

# Compute sums of the specified properties and print the resultant Dictionary.
sums = counties.filter(ee.Filter.And(ee.Filter.neq('Census 2000 Population', None),
                                     ee.Filter.neq('Census 2000 Housing Units', None)))\
               .reduceColumns(**{'reducer': ee.Reducer.sum().repeat(2),
                               'selectors': ['Census 2000 Population', 'Census 2000 Housing Units']})
print(sums.getInfo())

# Display with folium!
center = counties.geometry().centroid().getInfo()['coordinates']
center.reverse()
Mapdisplay(center,{'Census':counties.getMapId()},zoom_start=5)

{'sum': [279162260.0, 115048233.0]}


Note that because feature collections may contain missing data (unlike images, which handle missing data with masks), the input needs to be pre-filtered to eliminate null values.

An error that looks something like the following may be thrown as a result of attributes with **None** values:

```
Dictionary (Error)
  Collection.reduceColumns: Can't set input 0 of Reducer(reducer=SUM, count=2) to .:
  Input must be a scalar number.
```  
Also note that unlike `imageCollection.reduce()`, in which reducers are automatically repeated for each band, reducers on a `FeatureCollection` must be explicitly repeated using `repeat()`. Specifically, repeat the reducer m times for m inputs. The following error may be thrown as a result of not repeating the reducer:

```
Dictionary (Error)
  Collection.reduceColumns: Need 1 inputs for <Reducer>, got 2.
```



# 7. Raster to Vector Conversion

To convert from an `Image` (raster) to a `FeatureCollection` (vector) data type, use `image.reduceToVectors()`. This is the primary mechanism for vectorization in Earth Engine, and can be useful for generating regions for input to other types of reducer. The `reduceToVectors()` method creates polygon edges (optionally centroids or bounding boxes instead) at the boundary of homogeneous groups of connected pixels.

For example, consider a 2012 nightlights image of Japan. Let the nightlights digital number serve as a proxy for development intensity. Define zones using arbitrary thresholds on the nightlights, combine the zones into a single-band image, and vectorize the zones using `reduceToVectors()`:

In [0]:
# Load a Japan boundary from the Large Scale International Boundary dataset.
japan = ee.FeatureCollection('USDOS/LSIB_SIMPLE/2017')\
          .filter(ee.Filter.eq('country_na', 'Japan'))

# Load a 2012 nightlights image, clipped to the Japan border.
nl2012 = ee.Image('NOAA/DMSP-OLS/NIGHTTIME_LIGHTS/F182012')\
           .select('stable_lights')\
           .clipToCollection(japan)

# Define arbitrary thresholds on the 6-bit nightlights image.
zones = nl2012.gt(30)\
              .add(nl2012.gt(55))\
              .add(nl2012.gt(62))
# Mask out pixels below the lowest threshold (zone 0).
zones = zones.updateMask(zones.neq(0))


# Convert the zones of the thresholded nightlights to vectors.
vectors = zones.addBands(nl2012).reduceToVectors(**{
  'geometry': japan,
  'crs': nl2012.projection(),
  'scale': 1000,
  'geometryType': 'polygon',
  'eightConnected': False,
  'labelProperty': 'zone',
  'reducer': ee.Reducer.mean()
})

display = ee.Image(0).updateMask(0).paint(vectors, '000000', 3)


# Display the thresholds.
center = [35.712, 139.6225]
zones_tk = zones.getMapId({'min': 1, 'max': 3, 'palette': ['0000FF', '00FF00', 'FF0000']})
display_tk = display.getMapId({'palette': '000000'})

Mapdisplay(center,{'raster':zones_tk,'vectors':display_tk},zoom_start=9)



Note that the first band in the input is used to identify homogeneous regions and the remaining bands are reduced according to the provided reducer, the output of which is added as a property to the resultant vectors. The `geometry` parameter specifies the extent over which the vectors should be created. In general, it is good practice to specify a minimal zone over which to create vectors. It is also good practice to specify the `scale` and `crs` to avoid ambiguity. The output type is `‘polygon’` where the polygons are formed from homogeneous zones of four-connected neighbors (i.e. `eightConnected` is false). The last two parameters, `labelProperty` and reducer, specify that the output polygons should receive a property with the zone label and the mean of the nightlights band(s), respectively.
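
The zone band that `reduceToVectors()` uses to find homogeneous regions comes from adding three boolean threshold tests. A small pure-Python sketch of that arithmetic, using a hypothetical helper `zone` and a handful of sample digital numbers:

```python
def zone(dn):
    """Sum of three boolean tests, mirroring gt(30) + gt(55) + gt(62)."""
    return int(dn > 30) + int(dn > 55) + int(dn > 62)


# Sample digital numbers around each threshold.
samples = [0, 30, 31, 55, 56, 62, 63]
zones_sketch = {dn: zone(dn) for dn in samples}

print(zones_sketch)
# {0: 0, 30: 0, 31: 1, 55: 1, 56: 2, 62: 2, 63: 3}
```

Pixels with zone 0 are the ones removed by the mask in the example, so only zones 1 to 3 are vectorized.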

# 8. Vector to Raster Conversion

Vector to raster conversion in Earth Engine is handled by the `featureCollection.reduceToImage()` method. This method assigns pixels under each feature the value of the specified property. This example uses the counties data to create an image representing the population of each county:

In [0]:
# Load a collection of US counties with census data properties.
counties = ee.FeatureCollection('ft:1S4EB6319wWW2sWQDPhDvmSBIVrD3iEmCLYB7nMM')

# Make an image out of the population attribute and display it.
popImage = counties.filter(ee.Filter.neq('Census 2000 Population', None))\
                   .reduceToImage(**{'properties': ['Census 2000 Population'],
                                     'reducer': ee.Reducer.first()})
center = [40.38, -99.976]
popImage_tk = popImage.getMapId({'min': 0, 'max': 1000000,
                                 'palette': ['0000FF', '00FF00', '00FFFF', 'FF0000']})
Mapdisplay(center, {'popImage':popImage_tk}, zoom_start=5)

If the features overlap, specify a reducer to indicate how to aggregate properties of overlapping features. In the previous example, since there is no overlap, an `ee.Reducer.first()` is sufficient. As in this example, pre-filter the data to eliminate nulls that cannot be turned into an image.

# 9. Grouped Reductions and Zonal Statistics

You can get statistics in each zone of an `Image` or `FeatureCollection` by using `reducer.group()` to group the output of a reducer by the value of a specified input. For example, to compute the total population and number of housing units in each state, this example groups the output of a reduction of a counties `FeatureCollection` as follows:




In [0]:
# Load a collection of US counties with census data properties.
counties = ee.FeatureCollection('ft:1S4EB6319wWW2sWQDPhDvmSBIVrD3iEmCLYB7nMM')

# Compute sums of the specified properties, grouped by state name.
sums = counties.filter(ee.Filter.And(ee.Filter.neq('Census 2000 Population', None),
                                     ee.Filter.neq('Census 2000 Housing Units', None)))\
               .reduceColumns(**{
                  'selectors': ['Census 2000 Population', 'Census 2000 Housing Units', 'StateName'],
                  'reducer': ee.Reducer.sum().repeat(2).group(**{
                      'groupField': 2,
                      'groupName': 'state'})})

# Print the resultant Dictionary.
pprint(sums.getInfo())

{'groups': [{'state': 'Alabama', 'sum': [4447100.0, 1963711.0]},
            {'state': 'Alaska', 'sum': [620795.0, 257020.0]},
            {'state': 'Arizona', 'sum': [5130632.0, 2189189.0]},
            {'state': 'Arkansas', 'sum': [2673400.0, 1173043.0]},
            {'state': 'California', 'sum': [33871648.0, 12214549.0]},
            {'state': 'Colorado', 'sum': [4301261.0, 1808037.0]},
            {'state': 'Connecticut', 'sum': [3405565.0, 1385975.0]},
            {'state': 'Delaware', 'sum': [783600.0, 343072.0]},
            {'state': 'District of Columbia', 'sum': [572059.0, 274845.0]},
            {'state': 'Florida', 'sum': [13729016.0, 6450669.0]},
            {'state': 'Georgia', 'sum': [8186453.0, 3281737.0]},
            {'state': 'Hawaii', 'sum': [1211390.0, 460370.0]},
            {'state': 'Idaho', 'sum': [1293953.0, 527824.0]},
            {'state': 'Illinois', 'sum': [12419293.0, 4885615.0]},
            {'state': 'Indiana', 'sum': [6080485.0, 2532319.0]},
         

The `groupField` argument is the index of the input in the selectors array that contains the codes by which to group, and the `groupName` argument specifies the name of the property that stores the value of the grouping variable. Since the reducer is not automatically repeated for each input, the `repeat(2)` call is needed.
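
The roles of `groupField` and `groupName` can be illustrated with a plain-Python stand-in. The helper `grouped_sums` and the rows below are hypothetical; they only mimic the shape of the output shown above and do not call Earth Engine.

```python
# Hypothetical rows in selector order: [population, housing units, state].
rows = [
    [100, 40, "A"],
    [250, 90, "B"],
    [50, 20, "A"],
]


def grouped_sums(rows, group_field, group_name):
    """Sum every non-grouping column, grouped by the column at group_field."""
    totals = {}
    for row in rows:
        key = row[group_field]
        inputs = [v for i, v in enumerate(row) if i != group_field]
        if key not in totals:
            totals[key] = [0.0] * len(inputs)  # one sum per remaining input
        for i, v in enumerate(inputs):
            totals[key][i] += v
    return {"groups": [{group_name: k, "sum": totals[k]} for k in sorted(totals)]}


result = grouped_sums(rows, group_field=2, group_name="state")
print(result)
# {'groups': [{'state': 'A', 'sum': [150.0, 60.0]}, {'state': 'B', 'sum': [250.0, 90.0]}]}
```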

To group output of `image.reduceRegions()` you can specify a grouping band that defines groups by integer pixel values. This type of computation is sometimes called "zonal statistics" where the zones are specified as the grouping band and the statistic is determined by the reducer. In the following example, change in nightlights in the United States is grouped by land cover category:

In [0]:
# Load a region representing the United States
region = ee.Feature(ee.FeatureCollection('ft:1tdSwUL7MVpOauSgRzqVTOwdfy17KDbw-1d9omPw')\
                      .filter(ee.Filter.eq('Country', 'United States'))\
                      .first())

# Load MODIS land cover categories in 2001.
landcover = ee.Image('MODIS/051/MCD12Q1/2001_01_01')\
              .select('Land_Cover_Type_1') # Select the IGBP classification band.

# Load nightlights image inputs.
nl2001 = ee.Image('NOAA/DMSP-OLS/NIGHTTIME_LIGHTS/F152001')\
           .select('stable_lights')

nl2012 = ee.Image('NOAA/DMSP-OLS/NIGHTTIME_LIGHTS/F182012')\
           .select('stable_lights')

# Compute the nightlights decadal difference, add land cover codes.
nlDiff = nl2012.subtract(nl2001).addBands(landcover)

# Group a mean reducer: change of nightlights by land cover category.
means = nlDiff.reduceRegion(**{
  'reducer': ee.Reducer.mean().group(**{
    'groupField': 1,
    'groupName': 'code',
  }),
  'geometry': region.geometry(),
  'scale': 1000,
  'maxPixels': 1e8
})

# Print the resultant Dictionary.
pprint(means.getInfo())

{'groups': [{'code': 0, 'mean': -0.11172072503842889},
            {'code': 1, 'mean': -0.10531694840105481},
            {'code': 2, 'mean': 0.5319690050938765},
            {'code': 3, 'mean': -0.2348240630931086},
            {'code': 4, 'mean': -0.23063576587635104},
            {'code': 5, 'mean': 0.014154122164175172},
            {'code': 6, 'mean': 0.4316131220363849},
            {'code': 7, 'mean': 0.1385422162199612},
            {'code': 8, 'mean': 0.40154603020642976},
            {'code': 9, 'mean': 0.19139864991972394},
            {'code': 10, 'mean': 0.1837396574848538},
            {'code': 11, 'mean': 0.20074970826357522},
            {'code': 12, 'mean': -0.2169800337243732},
            {'code': 13, 'mean': 0.28636567428499526},
            {'code': 14, 'mean': -0.06603301038183143},
            {'code': 15, 'mean': 0.025706054801271267},
            {'code': 16, 'mean': 0.1768030496705951}]}


Note that in this example, the `groupField` is the index of the band containing the zones by which to group the output. The first band is index 0, the second is index 1, etc.

# 10. Weighted Reductions

By default, reducers applied to imagery weight the inputs according to the mask value. This is relevant in the context of fractional pixels created through operations such as `clip()`. Adjust this behavior by calling `unweighted()` on the reducer. Using an unweighted reducer forces all pixels in the region to have the same weight. The following example illustrates how pixel weighting can affect the reducer output:



In [0]:
# Load a Landsat 8 input image.
image = ee.Image('LANDSAT/LC08/C01/T1/LC08_044034_20140318')

# Create an arbitrary region.
geometry = ee.Geometry.Rectangle(-122.496, 37.532, -121.554, 37.538)

# Make an NDWI image.  It will have one band named 'nd'.
ndwi = image.normalizedDifference(['B3', 'B5'])

# Compute the weighted sum of the NDWI image clipped to the region.
weighted = ndwi.clip(geometry)\
               .reduceRegion(**{'reducer': ee.Reducer.sum(),
                              'geometry': geometry,
                              'scale': 30})\
               .get('nd')

# Compute the UN-weighted sum of the NDWI image clipped to the region.
unweighted = ndwi.clip(geometry)\
                 .reduceRegion(**{'reducer': ee.Reducer.sum().unweighted(),
                                'geometry': geometry,
                                'scale': 30}).get('nd')

# Observe the difference between weighted and unweighted reductions.
print('weighted:', weighted.getInfo())
print('unweighted:', unweighted.getInfo())

weighted: -9081.917068950213
unweighted: -9086.503929115412


The difference in results is due to pixels at the edge of the region receiving a weight of one as a result of calling `unweighted()` on the reducer.
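The effect can be reproduced with a small plain-Python sketch. It uses a simplified model in which every pixel carries a weight between 0 and 1, and it ignores how Earth Engine decides which edge pixels to include. The `reduce_sum` helper and the sample values are hypothetical.

```python
def reduce_sum(values, weights, unweighted=False):
    """Sum pixel values, optionally ignoring fractional weights.

    In weighted mode each value is multiplied by its weight; in
    unweighted mode every pixel with a non-zero weight counts fully.
    """
    total = 0.0
    for value, weight in zip(values, weights):
        if weight <= 0:
            continue  # Fully masked pixels never contribute.
        total += value if unweighted else value * weight
    return total


# Three NDWI pixels; the last one lies on the region edge and is
# only half covered, so its weight is 0.5.
ndwi_values = [-0.5, -0.25, -0.5]
mask_weights = [1.0, 1.0, 0.5]

weighted_total = reduce_sum(ndwi_values, mask_weights)
unweighted_total = reduce_sum(ndwi_values, mask_weights, unweighted=True)

print('weighted:', weighted_total)
print('unweighted:', unweighted_total)
```

Because the edge pixel counts fully in the unweighted case, the unweighted sum has the larger magnitude, which matches the direction of the difference in the output above.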

To weight inputs explicitly, call `splitWeights()` on the reducer instead of relying on the mask. A reducer modified by `splitWeights()` takes two inputs, where the second input is the weight. The following example illustrates `splitWeights()` by computing the weighted mean Normalized Difference Vegetation Index (NDVI) in a region, with the weights given by cloud score (the cloudier the pixel, the lower its weight):

In [0]:
# Load an input Landsat 8 image.
image = ee.Image('LANDSAT/LC08/C01/T1_TOA/LC08_186059_20130419')

# Compute cloud score and reverse it such that the highest
# weight (100) is for the least cloudy pixels.
cloudWeight = ee.Image(100).subtract(
  ee.Algorithms.Landsat.simpleCloudScore(image).select(['cloud']))

# Compute NDVI and add the cloud weight band.
ndvi = image.normalizedDifference(['B5', 'B4']).addBands(cloudWeight)

# Define an arbitrary region in a cloudy area.
region = ee.Geometry.Rectangle(9.9069, 0.5981, 10.5, 0.9757)

# Use a mean reducer.
reducer = ee.Reducer.mean()

# Compute the unweighted mean.
unweighted = ndvi.select(['nd']).reduceRegion(reducer, region, 30)

# Compute the mean weighted by cloudiness.
weighted = ndvi.reduceRegion(reducer.splitWeights(), region, 30)

# Observe the difference as a result of weighting by cloudiness.
print('unweighted:', unweighted.getInfo())
print('weighted:', weighted.getInfo())

unweighted: {'nd': 0.49304970804714043}
weighted: {'mean': 0.587125215732146}


Observe that `cloudWeight` needs to be added as a band prior to calling `reduceRegion()`. The result indicates that the estimated mean NDVI is higher as a result of decreasing the weight of cloudy pixels.
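The arithmetic behind a weighted mean with a separate weight input can be sketched in plain Python. This is a minimal illustration, assuming the weighted mean is the weight-scaled sum divided by the total weight. The `weighted_mean` helper and the sample NDVI and cloud score values are hypothetical.

```python
def weighted_mean(values, weights):
    """Return sum(value * weight) / sum(weight)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError('At least one weight must be non-zero.')
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Two clear vegetated pixels and two cloudy pixels with depressed NDVI.
ndvi_values = [0.75, 0.75, 0.25, 0.25]
cloud_scores = [0, 0, 50, 100]

# Reverse the cloud score so that the least cloudy pixels weigh the most.
cloud_weights = [100 - score for score in cloud_scores]

plain_mean = weighted_mean(ndvi_values, [1] * len(ndvi_values))
cloud_weighted_mean = weighted_mean(ndvi_values, cloud_weights)

print('unweighted:', plain_mean)
print('weighted:', cloud_weighted_mean)
```

Down-weighting the cloudy pixels pulls the mean toward the clear-sky values, the same direction of change as in the Earth Engine result above.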



# 11. Linear Regression

Earth Engine contains a variety of methods for performing linear regression using reducers. The simplest linear regression reducer is `ee.Reducer.linearFit()`, which computes the least squares estimate of a linear function of one variable with a constant term. The data should be set up as a two-band input image, where the first band is the independent variable and the second band is the dependent variable. The following example estimates the linear trend of future precipitation (after 2006 in the [NEX data](https://developers.google.com/earth-engine/datasets/catalog/NASA_NEX-DCP30)) projected by climate models. The dependent variable is projected precipitation and the independent variable is time, which is added as a band prior to calling `linearFit()`:

In [0]:
# This function adds a time band to the image.
def createTimeBand(image):
  # Scale milliseconds by a large constant to avoid very small slopes
  # in the linear regression output.
  return image.addBands(image.metadata('system:time_start').divide(1e18))


# Load the input image collection: projected climate data.
collection = ee.ImageCollection('NASA/NEX-DCP30_ENSEMBLE_STATS')\
               .filter(ee.Filter.eq('scenario', 'rcp85'))\
               .filterDate(ee.Date('2006-01-01'), ee.Date('2050-01-01'))\
               .map(createTimeBand)  # Map the time band function over the collection.

# Reduce the collection with the linear fit reducer.
# The independent variable comes first, followed by the dependent variable.
linearFit = collection.select(['system:time_start', 'pr_mean'])\
                      .reduce(ee.Reducer.linearFit())

# Display the results.
center = [40.38,-100.11]
lineargm = linearFit.getMapId({'min': 0, 'max': [-0.9, 8e-5, 1], 'bands': ['scale', 'offset', 'scale']})
Mapdisplay(center,{'fit':lineargm},zoom_start=5)

Observe that the output contains two bands: `'offset'` (the intercept) and `'scale'` (the slope of the line, not to be confused with the `scale` parameter passed to many reducers, which is the spatial scale). The result, with areas of increasing trend in blue, decreasing trend in red, and no trend in green, should look something like the map above.
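For a single pixel, the two output bands correspond to the textbook least squares line. The sketch below computes them in plain Python and also shows why the time band is divided by a large constant: shrinking the independent variable by some factor enlarges the slope by the same factor. The `linear_fit` helper and the sample series are hypothetical, and this is not Earth Engine's own implementation.

```python
def linear_fit(x, y):
    """Least squares fit of y = offset + scale * x for one variable."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    scale = sxy / sxx                  # Slope of the line.
    offset = mean_y - scale * mean_x   # Intercept of the line.
    return {'scale': scale, 'offset': offset}


# A series that follows y = 2 * x + 1 exactly.
time = [0.0, 1.0, 2.0, 3.0]
precip = [1.0, 3.0, 5.0, 7.0]

fit = linear_fit(time, precip)
print(fit)

# Dividing the independent variable by 4 multiplies the slope by 4
# and leaves the intercept unchanged.
rescaled_fit = linear_fit([t / 4 for t in time], precip)
print(rescaled_fit)
```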



For a more flexible approach to linear modelling, use one of the linear regression reducers which allow for a variable number of independent and dependent variables. Specifically, `ee.Reducer.linearRegression()` implements ordinary least squares regression (OLS). Alternatively, `robustLinearRegression()` uses a cost function based on regression residuals to iteratively de-weight outliers in the data ([O’Leary 1990](https://epubs.siam.org/doi/abs/10.1137/0611032)).
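The idea of iteratively de-weighting outliers can be sketched with iteratively reweighted least squares in plain Python. The weighting rule below (Huber-style weights with a fixed threshold) is an assumption chosen for illustration. It is not claimed to be the cost function Earth Engine uses, and the helper names and sample data are hypothetical.

```python
def weighted_line_fit(x, y, weights):
    """Weighted least squares for y = intercept + slope * x."""
    total = sum(weights)
    mean_x = sum(w * xi for w, xi in zip(weights, x)) / total
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / total
    sxx = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - mean_x) * (yi - mean_y)
              for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def robust_line_fit(x, y, iterations=50, threshold=1.0):
    """Refit repeatedly, shrinking the weight of large residuals."""
    weights = [1.0] * len(x)
    for _ in range(iterations):
        slope, intercept = weighted_line_fit(x, y, weights)
        residuals = [abs(yi - (intercept + slope * xi))
                     for xi, yi in zip(x, y)]
        # Assumed rule: full weight inside the threshold, then
        # weight inversely proportional to the residual size.
        weights = [1.0 if r <= threshold else threshold / r
                   for r in residuals]
    return weighted_line_fit(x, y, weights)


# Points on y = 2 * x + 1, except for one outlier at x = 5.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0, 40.0]

ols_slope, _ = weighted_line_fit(x, y, [1.0] * len(x))
robust_slope, _ = robust_line_fit(x, y)

print('OLS slope:', ols_slope)
print('Robust slope:', robust_slope)
```

The single outlier drags the ordinary least squares slope far from 2, while the reweighted fit stays much closer to the slope of the remaining points.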

For example, suppose there are two dependent variables, precipitation and maximum temperature, and two independent variables, a constant and time. The collection comes from the same dataset as the previous example (here with the date filter extended to 2099-01-01 instead of 2050-01-01), but the constant band must be added manually before the reduction. The first two bands of the input are the 'X' (independent) variables and the next two bands are the 'Y' (dependent) variables. In this example, first get the regression coefficients, then flatten the array image to extract the bands of interest:

In [0]:
# This function adds a time band to the image.
def createTimeBand(image):
  # Scale milliseconds by a large constant.
  return image.addBands(image.metadata('system:time_start').divide(1e18))


# This function adds a constant band to the image.
def createConstantBand(image):
  return ee.Image(1).addBands(image)

# Load the input image collection: projected climate data.
# 1. Map the functions over the collection, to get constant and time bands.
# 2. Select the predictors and the responses.
collection = ee.ImageCollection('NASA/NEX-DCP30_ENSEMBLE_STATS')\
               .filterDate(ee.Date('2006-01-01'), ee.Date('2099-01-01'))\
               .filter(ee.Filter.eq('scenario', 'rcp85'))\
               .map(createTimeBand)\
               .map(createConstantBand)\
               .select(['constant', 'system:time_start', 'pr_mean', 'tasmax_mean'])

# Compute ordinary least squares regression coefficients.
linearRegression = collection.reduce(
  ee.Reducer.linearRegression(**{
    'numX': 2,
    'numY': 2
}))

# Compute robust linear regression coefficients.
robustLinearRegression = collection.reduce(
  ee.Reducer.robustLinearRegression(**{
    'numX': 2,
    'numY': 2
}))

# The results are array images that must be flattened for display.
# These lists label the information along each axis of the arrays.
bandNames = [['constant', 'time'], # 0-axis variation.
             ['precip', 'temp']] # 1-axis variation.

# Flatten the array images to get multi-band images according to the labels.
lrImage = linearRegression.select(['coefficients']).arrayFlatten(bandNames)
rlrImage = robustLinearRegression.select(['coefficients']).arrayFlatten(bandNames)

# Display the OLS results.
center = [40.38,-100.11]
lrImage_tk = lrImage.getMapId({'min': 0, 
                           'max': [-0.9, 8e-5, 1],
                           'bands': ['time_precip', 'constant_precip', 'time_precip']})

Mapdisplay(center,{'OLS':lrImage_tk},zoom_start=5)

In [0]:
from pprint import pprint  # In case it was not imported earlier in the notebook.

# Compare the results at a specific point:
print('OLS estimates:')
pprint(lrImage.reduceRegion(**{
  'reducer': ee.Reducer.first(),
  'geometry': ee.Geometry.Point([-96.0, 41.0]),
  'scale': 1000
}).getInfo())

print('Robust estimates:')
pprint(rlrImage.reduceRegion(**{
  'reducer': ee.Reducer.first(),
  'geometry': ee.Geometry.Point([-96.0, 41.0]),
  'scale': 1000
}).getInfo())

OLS estimates:
{'constant_precip': 2.4574870622018352e-05,
 'constant_temp': 288.0090637207031,
 'time_precip': 0.620400607585907,
 'time_temp': 2055808.75}
Robust estimates:
{'constant_precip': 2.742403739830479e-05,
 'constant_temp': 288.0704650878906,
 'time_precip': 2.982717752456665,
 'time_temp': 2032093.375}


Inspect the results: the `linearRegression()` coefficients for precipitation (`constant_precip` and `time_precip`) correspond to the `'offset'` and `'scale'` bands estimated by the `linearFit()` reducer when both are fit over the same date range, and the `linearRegression()` output additionally has coefficients for the other dependent variable, `tasmax_mean`. The robust linear regression coefficients differ from the OLS estimates, as the comparison at a specific point shows.
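The layout of the coefficient array and the flattened band names can be sketched for a single pixel in plain Python. The sketch assumes two predictors (`numX = 2`) and solves the 2x2 normal equations directly. It also assumes that flattening joins the axis labels with an underscore, as the band names in the output above suggest. Both helpers and the sample series are hypothetical.

```python
def ols_two_predictors(rows, num_y):
    """OLS for rows of [x0, x1, y0, ..., y_(num_y - 1)].

    Returns coefficients[i][j]: the coefficient of predictor i
    for response j, i.e. an array of shape (numX, numY).
    """
    a = sum(r[0] * r[0] for r in rows)
    b = sum(r[0] * r[1] for r in rows)
    d = sum(r[1] * r[1] for r in rows)
    det = a * d - b * b
    coefficients = [[0.0] * num_y for _ in range(2)]
    for j in range(num_y):
        g0 = sum(r[0] * r[2 + j] for r in rows)
        g1 = sum(r[1] * r[2 + j] for r in rows)
        # Apply the inverse of the 2x2 matrix [[a, b], [b, d]].
        coefficients[0][j] = (d * g0 - b * g1) / det
        coefficients[1][j] = (a * g1 - b * g0) / det
    return coefficients


def flatten(coefficients, labels):
    """Name each entry '<axis-0 label>_<axis-1 label>'."""
    return {f'{x_name}_{y_name}': coefficients[i][j]
            for i, x_name in enumerate(labels[0])
            for j, y_name in enumerate(labels[1])}


# One pixel through time: [constant, time, precip, temp],
# with precip = 2 * time + 1 and temp = 10 - time.
rows = [[1.0, t, 2.0 * t + 1.0, 10.0 - t] for t in (0.0, 1.0, 2.0, 3.0)]

coefficients = ols_two_predictors(rows, num_y=2)
band_names = [['constant', 'time'], ['precip', 'temp']]
flat = flatten(coefficients, band_names)
print(flat)
```

Here `constant_precip` and `time_precip` play the roles of the intercept and slope that a one-variable fit of precipitation against time would return.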