# Join Overview

Joins are used to combine elements from different collections (e.g. `ImageCollection` or `FeatureCollection`) based on a condition specified by an `ee.Filter`. The filter is constructed with arguments for the properties in each collection that are related to each other. Specifically, `leftField` specifies the property in the primary collection that is related to the `rightField` in the secondary collection. The type of filter (e.g. `equals`, `greaterThanOrEquals`, `lessThan`, etc.) indicates the relationship between the fields. The type of join indicates whether the relationship between the elements in the collections is one-to-many or one-to-one, and how many matches to retain. The output of a join is produced by `join.apply()` and varies according to the type of join.
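
As a rough mental model, a join filter can be thought of as a predicate over one element from each collection. The sketch below is plain Python, not the Earth Engine API: `make_filter`, `matches` and the sample records are hypothetical names introduced only to illustrate how `leftField`, `rightField` and the filter type fit together.

```python
import operator

# Hypothetical stand-ins for Earth Engine filter types: each name maps to the
# comparison applied between the left and right field values.
FILTER_TYPES = {
    'equals': operator.eq,
    'greaterThanOrEquals': operator.ge,
    'lessThan': operator.lt,
}

def make_filter(kind, left_field, right_field):
    """Return a predicate comparing left[left_field] with right[right_field]."""
    compare = FILTER_TYPES[kind]
    return lambda left, right: compare(left[left_field], right[right_field])

def matches(primary, secondary, condition):
    """List every (left, right) pair that satisfies the join condition."""
    return [(l, r) for l in primary for r in secondary if condition(l, r)]

# Illustrative records, not real Earth Engine elements.
primary = [{'id': 'p1', 'value': 1}, {'id': 'p2', 'value': 5}]
secondary = [{'id': 's1', 'threshold': 3}]

less_than = make_filter('lessThan', 'value', 'threshold')
pairs = matches(primary, secondary, less_than)
pair_ids = [(l['id'], r['id']) for l, r in pairs]
print(pair_ids)
```

The different join types then differ only in what they keep from this list of matching pairs.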


# Simple Joins

A simple join returns elements from the `primary` collection that match any element in the `secondary` collection according to the match condition in the filter. To perform a simple join, use `ee.Join.simple()`. This is useful for finding the elements common to different collections or for filtering one collection by another. For example, consider two image collections that may have some matching elements, where “matching” is defined by the condition specified in a filter; here, let matching mean that the image IDs are equal. Since the matching images in both collections are the same, use a simple join to discover this set of matching images:

In [None]:
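import ee
from pprint import pprint

# Initialize the Earth Engine client (assumes credentials are already set up).
ee.Initialize()
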
# Load a Landsat 8 image collection at a point of interest.
collection = ee.ImageCollection('LANDSAT/LC08/C01/T1_TOA')\
               .filterBounds(ee.Geometry.Point(-122.09, 37.42))

# Define start and end dates with which to filter the collections.
april = '2014-04-01'
may = '2014-05-01'
june = '2014-06-01'
july = '2014-07-01'

# The primary collection is Landsat images from April to June.
primary = collection.filterDate(april, june)

# The secondary collection is Landsat images from May to July.
secondary = collection.filterDate(may, july)

# Use an equals filter to define how the collections match.
Filter = ee.Filter.equals(leftField='system:index', rightField='system:index')

# Create the join.
simpleJoin = ee.Join.simple()

# Apply the join.
simpleJoined = simpleJoin.apply(primary, secondary, Filter)
images = simpleJoined.getInfo()

# Display the result.
print('Simple join: ')
pprint(images)

In the previous example, observe that the collections to join temporally overlap by about a month. Note that when this join is applied, the output will be an `ImageCollection` with only the matching images in the `primary` collection. The output should look something like:

In [None]:
for x in images['features']:
    print('Landsat8: ', x['id'])

This output shows that two images match (as specified in the filter) between the primary and secondary collections, images at day-of-year 125 and 141, or May 5 and May 21.
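
The day-of-year arithmetic and the simple-join semantics can be checked with a small plain-Python sketch. It does not call Earth Engine. The day-of-year lists are assumptions: the April and May values follow the dates quoted in this document, while the two June values are extrapolated from a 16-day repeat cycle.

```python
from datetime import date, timedelta

def doy_to_date(year, day_of_year):
    """Convert a 1-based day of year to a calendar date."""
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)

# Assumed 2014 acquisition days of year for the two collections.
primary_days = [93, 109, 125, 141]     # April to June
secondary_days = [125, 141, 157, 173]  # May to July (June values assumed)

# A simple join keeps each primary element that has at least one match.
simple_matches = [d for d in primary_days if d in secondary_days]
matched_dates = [doy_to_date(2014, d).isoformat() for d in simple_matches]
print(matched_dates)
```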



# Inverted Joins

Suppose that the purpose of the join is to retain all images in the `primary` collection that are not in the `secondary` collection. You can perform this type of inverted join using `ee.Join.inverted()`. Using the filter, `primary` and `secondary` collections as defined in the [simple join example](https://developers.google.com/earth-engine/joins_simple), specify the inverted join as:



In [None]:
# Define the join.
invertedJoin = ee.Join.inverted()

# Apply the join.
invertedJoined = invertedJoin.apply(primary, secondary, Filter)
images_rev = invertedJoined.getInfo()

The output should look something like:



In [None]:
for x in images_rev['features']:
    print('Landsat8: ', x['id'])

The inverted join contains the images from April 3 and April 19, indicating the images that are present in the primary collection but not in the secondary collection.
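
In set terms, an inverted join behaves like a difference: primary elements with no match in the secondary collection. A minimal plain-Python sketch, using assumed 2014 acquisition days of year (the June values are extrapolated from a 16-day repeat cycle and are not taken from the join output):

```python
from datetime import date, timedelta

# Assumed 2014 acquisition days of year for each collection.
primary_days = {93, 109, 125, 141}
secondary_days = {125, 141, 157, 173}

# An inverted join keeps the primary elements that have no match.
unmatched_days = sorted(primary_days - secondary_days)
unmatched_dates = [
    (date(2014, 1, 1) + timedelta(days=d - 1)).isoformat()
    for d in unmatched_days
]
print(unmatched_dates)
```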



# Inner Joins

To enumerate all matches between the elements of two collections, use an `ee.Join.inner()`. The output of an inner join is a `FeatureCollection` (even if joining one `ImageCollection` to another `ImageCollection`). Each feature in the output represents a match, where the matching elements are stored in two properties of the feature. For example, `feature.get('primary')` is the element in the primary collection that matches the element from the secondary collection stored in `feature.get('secondary')`. (Different names for these properties can be specified as arguments to `inner()`, but `'primary'` and `'secondary'` are the defaults.) One-to-many relationships are represented by multiple features in the output. If an element in either collection doesn’t have a match, it is not present in the output.

Join examples using `ImageCollection` inputs apply without modification to `FeatureCollection` inputs. It is also possible to join a `FeatureCollection` to an `ImageCollection` and vice versa. Consider the following toy example of an inner join:


In [None]:
# Create the primary collection.
primaryFeatures = ee.FeatureCollection([
  ee.Feature(None, {'foo': 0, 'label': 'a'}),
  ee.Feature(None, {'foo': 1, 'label': 'b'}),
  ee.Feature(None, {'foo': 1, 'label': 'c'}),
  ee.Feature(None, {'foo': 2, 'label': 'd'}),
])

# Create the secondary collection.
secondaryFeatures = ee.FeatureCollection([
  ee.Feature(None, {'bar': 1, 'label': 'e'}),
  ee.Feature(None, {'bar': 1, 'label': 'f'}),
  ee.Feature(None, {'bar': 2, 'label': 'g'}),
  ee.Feature(None, {'bar': 3, 'label': 'h'}),
])

# Use an equals filter to specify how the collections match.
toyFilter = ee.Filter.equals(leftField='foo', rightField='bar')

# Define the join.
innerJoin = ee.Join.inner('primary', 'secondary')

# Apply the join.
toyJoin = innerJoin.apply(primaryFeatures, secondaryFeatures, toyFilter)

# Print the result.
print('Inner join toy example:')    
pprint(toyJoin.getInfo())

In the previous example, notice that the relationship between the tables is defined in the filter, which indicates that fields `'foo'` and `'bar'` are the join fields. An inner join is then specified and applied to the collections. Inspect the output and observe that each possible match is represented as one `Feature`.
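
To see which matches to expect from the toy example, the same pairing can be reproduced in plain Python. This is only an emulation of the join logic on dictionaries, not Earth Engine code, and the order in which Earth Engine returns the matches may differ.

```python
# Plain-dictionary copies of the toy features (properties only).
primary_features = [
    {'foo': 0, 'label': 'a'},
    {'foo': 1, 'label': 'b'},
    {'foo': 1, 'label': 'c'},
    {'foo': 2, 'label': 'd'},
]
secondary_features = [
    {'bar': 1, 'label': 'e'},
    {'bar': 1, 'label': 'f'},
    {'bar': 2, 'label': 'g'},
    {'bar': 3, 'label': 'h'},
]

# An inner join emits one record per matching (primary, secondary) pair.
inner = [
    {'primary': p, 'secondary': s}
    for p in primary_features
    for s in secondary_features
    if p['foo'] == s['bar']
]

pairs = [(m['primary']['label'], m['secondary']['label']) for m in inner]
print(pairs)
```

Elements `a` and `h` have no partner, so they do not appear in any pair.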

For a motivated example, consider joining MODIS `ImageCollection` objects. MODIS quality data are sometimes stored in a separate collection from the image data, so an inner join is convenient for joining the two collections in order to apply the quality data. In this case, the image acquisition times are identical, so an equals filter handles the job of specifying this relationship between the two collections:

In [None]:
# Make a date filter to get images in this date range.
dateFilter = ee.Filter.date('2014-01-01', '2014-02-01')

# Load a MODIS collection with EVI data.
mcd43a4 = ee.ImageCollection('MODIS/MCD43A4_006_EVI')\
            .filter(dateFilter)

# Load a MODIS collection with quality data.
mcd43a2 = ee.ImageCollection('MODIS/006/MCD43A2')\
            .filter(dateFilter)

# Define an inner join.
innerJoin = ee.Join.inner()

# Specify an equals filter for image timestamps.
filterTimeEq = ee.Filter.equals(leftField='system:time_start',rightField='system:time_start')

# Apply the join.
innerJoinedMODIS = innerJoin.apply(mcd43a4, mcd43a2, filterTimeEq)

# Display the join result: a FeatureCollection.
print('Inner join output:')    
pprint(innerJoinedMODIS.getInfo())

innerJoinedMODIS.first().getInfo()

To make use of the joined images in the output `FeatureCollection`, `map()` a combining function over the output. For example, the matching images can be stacked together such that the quality bands are added to the image data:

In [None]:
# Map a function to merge the results in the output FeatureCollection.
joinedMODIS = innerJoinedMODIS.map(
    lambda feature: ee.Image.cat(feature.get('primary'), feature.get('secondary')))

# Print the result of merging.
print('Inner join, merged bands:')
pprint(joinedMODIS.getInfo())

Although this function is mapped over a `FeatureCollection`, the result is an `ImageCollection`. Each image in the resultant `ImageCollection` has all the bands of the images in the primary collection (in this example just `'EVI'`) and all the bands of the matching image in the secondary collection (the quality bands).
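
The shape of this mapping step can be illustrated without Earth Engine. In the sketch below each image is reduced to a list of band names; the records and the quality band names are hypothetical, and `stack_bands` is an illustrative stand-in for the combining function, not `ee.Image.cat` itself.

```python
# Hypothetical join output: each feature pairs one primary image with its
# matching secondary image. Band names other than 'EVI' are made up.
joined_features = [
    {'primary': {'time': 0, 'bands': ['EVI']},
     'secondary': {'time': 0, 'bands': ['quality_1', 'quality_2']}},
    {'primary': {'time': 1, 'bands': ['EVI']},
     'secondary': {'time': 1, 'bands': ['quality_1', 'quality_2']}},
]

def stack_bands(feature):
    """Combine a matched pair into one record holding both band lists."""
    primary, secondary = feature['primary'], feature['secondary']
    return {
        'time': primary['time'],
        'bands': primary['bands'] + secondary['bands'],
    }

# Mapping over the matches yields one stacked record per matched pair.
stacked = [stack_bands(f) for f in joined_features]
print(stacked[0]['bands'])
```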

# Save-All Joins

Saving joins are one way of representing one-to-many relationships in Earth Engine. Unlike an inner join, a saving join stores matches from the `secondary` collection as a named property of the features in the `primary` collection. To save all such matches, use an `ee.Join.saveAll()`. If there is a one-to-many relationship, a `saveAll()` join stores all matching features as an `ee.List`. Unmatched elements in the primary collection are dropped. For example, suppose there is a need to get all MODIS imagery acquired within two days of each Landsat image in a collection. This example uses a `saveAll()` join for that purpose:

In [None]:
# Load a primary collection: Landsat imagery.
primary = ee.ImageCollection('LANDSAT/LC08/C01/T1_TOA')\
            .filterDate('2014-04-01', '2014-06-01')\
            .filterBounds(ee.Geometry.Point(-122.092, 37.42))

# Load a secondary collection: MODIS imagery.
modSecondary = ee.ImageCollection('MODIS/006/MOD09GA')\
                 .filterDate('2014-03-01', '2014-07-01')

# Define an allowable time difference: two days in milliseconds.
twoDaysMillis = 2 * 24 * 60 * 60 * 1000

# Create a time filter to define a match as overlapping timestamps.
timeFilter = ee.Filter.Or(
  ee.Filter.maxDifference(
    difference=twoDaysMillis,
    leftField='system:time_start',
    rightField='system:time_end'
  ),
  ee.Filter.maxDifference(
    difference=twoDaysMillis,
    leftField='system:time_end',
    rightField='system:time_start'
  )
)

# Define the join.
saveAllJoin = ee.Join.saveAll(
  matchesKey='terra',
  ordering='system:time_start',
  ascending=True
)

# Apply the join.
landsatModis = saveAllJoin.apply(primary, modSecondary, timeFilter)

# Display the result.
print('Join.saveAll:')
landsatModis.getInfo()

In this example, note that the `secondary` MODIS collection is pre-filtered to be chronologically similar to the `primary` Landsat collection for efficiency. To compare the Landsat acquisition time to the MODIS composite time, which has a daily range, the filter compares the endpoints of the image timestamps. The join is defined with the name of the property used to store the list of matches for each Landsat image (`'terra'`) and an optional parameter to sort the list of matches by the `system:time_start` property.

Inspection of the result indicates that images within the primary collection have the added `terra` property which stores a list of the matching MODIS images.
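
The effect of the two-day window can be worked through in plain Python. The sketch below is an emulation under stated assumptions, not Earth Engine code: timestamps are milliseconds counted from an arbitrary origin, the scene is treated as an instant (start equals end), the secondary images cover whole days, and the comparison is assumed to be inclusive at the boundary. `within`, `time_filter` and `save_all` are hypothetical helper names.

```python
DAY_MS = 24 * 60 * 60 * 1000
TWO_DAYS_MS = 2 * DAY_MS

def within(limit, a, b):
    """True if a and b differ by at most limit (inclusive by assumption)."""
    return abs(a - b) <= limit

def time_filter(left, right):
    """Emulate the Or of the two endpoint comparisons used in the join."""
    return (within(TWO_DAYS_MS, left['time_start'], right['time_end']) or
            within(TWO_DAYS_MS, left['time_end'], right['time_start']))

def save_all(primary, secondary, condition, matches_key, ordering):
    """Attach the sorted list of matches to each primary element that has any."""
    joined = []
    for left in primary:
        found = [r for r in secondary if condition(left, r)]
        if found:
            found.sort(key=lambda r: r[ordering])
            joined.append({**left, matches_key: found})
    return joined

# One scene acquired at 18:00 on day 10, and twenty daily images.
scene_time = 10 * DAY_MS + 18 * 60 * 60 * 1000
landsat = [{'id': 'scene', 'time_start': scene_time, 'time_end': scene_time}]
modis = [
    {'id': f'day{d}', 'time_start': d * DAY_MS, 'time_end': (d + 1) * DAY_MS}
    for d in range(20)
]

result = save_all(landsat, modis, time_filter, 'terra', 'time_start')
terra_ids = [m['id'] for m in result[0]['terra']]
print(terra_ids)
```

Because both endpoints are compared, the match window extends slightly beyond two days on each side of the scene time, which is why five daily images are retained here.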

# Save-Best Joins

To save only the best match for each element in a collection, use an `ee.Join.saveBest()`. The `saveBest()` join functions in an equivalent way to the `saveAll()` join, except that for each element in the `primary` collection, it saves the element from the `secondary` collection with the best match. Unmatched elements in the primary collection are dropped. Suppose the intention is to find a meteorological image closest in time to each Landsat image in the `primary` collection. To perform this join, the `ee.Filter` must be redefined for a single join condition (combined filters will not work with `saveBest()` since it is ambiguous how to combine ranks from multiple sub-Filters):


In [None]:
# Load a primary collection: Landsat imagery.
primary = ee.ImageCollection('LANDSAT/LC08/C01/T1_TOA')\
            .filterDate('2014-04-01', '2014-06-01')\
            .filterBounds(ee.Geometry.Point(-122.092, 37.42))

# Load a secondary collection: GRIDMET meteorological data
gridmet = ee.ImageCollection('IDAHO_EPSCOR/GRIDMET')

# Define a max difference filter to compare timestamps.
maxDiffFilter = ee.Filter.maxDifference(
  difference=2 * 24 * 60 * 60 * 1000,
  leftField='system:time_start',
  rightField='system:time_start'
)

# Define the join.
saveBestJoin = ee.Join.saveBest(
  matchKey='bestImage',
  measureKey='timeDiff'
)

# Apply the join.
landsatMet = saveBestJoin.apply(primary, gridmet, maxDiffFilter)

# Print the result.
pprint(landsatMet.getInfo())

Note that a `saveBest()` join defines the name of the property with which to store the best match (`'bestImage'`) and the name of the property with which to store the goodness of the match metric (`'timeDiff'`). Inspection of the results indicates that a matching GRIDMET image has been added to the property `bestImage` for each Landsat scene in the `primary` collection. Each of these GRIDMET images has the property `timeDiff` indicating the time difference in milliseconds between the GRIDMET image and the Landsat image, which will be the minimum among the GRIDMET images passing the condition in the filter.
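
The selection rule can be sketched in plain Python. This is an emulation under assumptions, not the Earth Engine implementation: timestamps are milliseconds from an arbitrary origin, the daily images start at midnight, and the comparison is assumed to be inclusive at the boundary. `save_best` is a hypothetical helper name.

```python
DAY_MS = 24 * 60 * 60 * 1000
HOUR_MS = 60 * 60 * 1000

def save_best(primary, secondary, max_diff, match_key, measure_key):
    """Attach the closest-in-time secondary element to each primary element."""
    joined = []
    for left in primary:
        candidates = []
        for right in secondary:
            diff = abs(left['time_start'] - right['time_start'])
            if diff <= max_diff:
                candidates.append((diff, right))
        if candidates:
            diff, best = min(candidates, key=lambda c: c[0])
            # The measure is stored on the saved match, as described above.
            joined.append({**left, match_key: {**best, measure_key: diff}})
    return joined

scenes = [
    {'id': 'scene_a', 'time_start': 10 * DAY_MS + 18 * HOUR_MS},
    {'id': 'scene_b', 'time_start': 100 * DAY_MS},  # no image within two days
]
daily = [{'id': f'day{d}', 'time_start': d * DAY_MS} for d in range(20)]

best = save_best(scenes, daily, 2 * DAY_MS, 'bestImage', 'timeDiff')
summary = [(s['id'], s['bestImage']['id'], s['bestImage']['timeDiff']) for s in best]
print(summary)
```

Here `scene_a` is six hours from the start of day 11, so that image is saved; `scene_b` has no candidate within two days and is dropped.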



# Spatial Joins

Collections can be joined by spatial location as well as by property values. To join based on spatial location, use a `withinDistance()` filter with `.geo` join fields specified. The `.geo` field indicates that the item's geometry is to be used to compute the distance metric. For example, consider the task of finding all FLUXNET sites within 100 kilometers of each Landsat image in a collection. For that purpose, use a filter on the geometry fields, with the maximum distance set to 100 kilometers (100,000 meters) using the `distance` parameter:

In [None]:
# Load a primary collection: Landsat imagery.
primary = ee.ImageCollection('LANDSAT/LC08/C01/T1_TOA')\
                .filterDate('2014-04-01', '2014-06-01')\
                .filterBounds(ee.Geometry.Point(-122.09, 37.42))

# Load a secondary collection: FLUXNET points in a Fusion Table.
fluxnet = ee.FeatureCollection('ft:1f85fvccyKSlaZJiAta8ojlXGhgf-LPPNmICG9kQ')

# Define a spatial filter, with distance 100 km.
distFilter = ee.Filter.withinDistance(
  distance=100000,
  leftField='.geo',
  rightField='.geo',
  maxError=10
)

# Define a saveAll join.
distSaveAll = ee.Join.saveAll(
  matchesKey='points',
  measureKey='distance'
)

# Apply the join.
spatialJoined = distSaveAll.apply(primary, fluxnet, distFilter)

# Print the result.
pprint(spatialJoined.getInfo())  

Note that the previous example joins a `FeatureCollection` to an `ImageCollection`. The `saveAll()` join sets a property (`points`) on each image in the `primary` collection which stores a list of the points within 100 km of the image. The distance of each point to the image is stored in the `distance` property of each joined point.
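
The structure of this result can be illustrated with a simplified plain-Python sketch. It is an approximation only: each element is reduced to a single point, distances are great-circle distances on a sphere computed with the haversine formula, and the site names and coordinates are hypothetical. Earth Engine measures distance between full geometries (for example an image footprint and a point), so real results will differ.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius; a spherical approximation

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2 +
         math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def save_all_within(primary, secondary, distance, matches_key, measure_key):
    """Attach every secondary point within distance, with its measure."""
    joined = []
    for left in primary:
        found = []
        for right in secondary:
            d = haversine_m(*left['coords'], *right['coords'])
            if d <= distance:
                found.append({**right, measure_key: d})
        if found:
            joined.append({**left, matches_key: found})
    return joined

# Hypothetical data: one image location and two made-up sites (lon, lat).
images = [{'id': 'image', 'coords': (-122.09, 37.42)}]
sites = [
    {'id': 'site_near', 'coords': (-122.42, 37.77)},
    {'id': 'site_far', 'coords': (-118.24, 34.05)},
]

result = save_all_within(images, sites, 100000, 'points', 'distance')
near = result[0]['points']
print([(p['id'], round(p['distance'] / 1000)) for p in near])
```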

Spatial joins can also be applied to feature collections to find places where the features in one collection intersect those in another. For example, consider two feature collections: a `primary` collection containing one polygon representing the boundary of the state of California, and a `secondary` collection containing polygons representing Landsat image footprints according to the [Worldwide Reference System](https://landsat.gsfc.nasa.gov/the-worldwide-reference-system/). Suppose there is a need to find all the image footprints which intersect the California polygon. This can be accomplished with a spatial join as follows:

In [None]:
# Load the primary collection: a California polygon.
cali = ee.FeatureCollection('ft:1fRY18cjsHzDgGiJiS2nnpUU3v9JPDc2HNaR7Xk8')\
         .filter(ee.Filter.eq('Name', 'California'))

# Load the secondary collection: WRS-2 polygons.
wrs = ee.FeatureCollection('ft:1_RZgjlcqixp-L9hyS6NYGqLaKOlnhSC35AB5M5Ll')

# Define a spatial filter as geometries that intersect.
spatialFilter = ee.Filter.intersects(
  leftField='.geo',
  rightField='.geo',
  maxError=10
)

# Define a save all join.
saveAllJoin = ee.Join.saveAll(
  matchesKey='scenes',
)

# Apply the join.
intersectJoined = saveAllJoin.apply(cali, wrs, spatialFilter)

# Get the result and display it.
intersected = ee.FeatureCollection(ee.List(intersectJoined.first().get('scenes')))

In [None]:
# Display the results using folium.
import folium

intersected_gjson = intersected.getInfo()
cali_gjson = cali.getInfo()

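# Earth Engine returns [lon, lat]; reverse to the [lat, lon] order folium expects.
centerobject = cali.geometry().centroid().getInfo()['coordinates']
centerobject.reverse()

dicc = {'WRS-2 polygons': intersected_gjson,
        'cali': cali_gjson}

# Mapdisplay is not defined in this section; it is assumed to be a notebook
# helper that draws the GeoJSON layers in `dicc` on a folium map.
Mapdisplay(centerobject, dicc, zoom_start=6)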