<a href="https://colab.research.google.com/github/YoungHyunKoo/GEE_remote_sensing/blob/main/Week2/2_2_Image_Collection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### **[GEO 6083] Remote Sensing Image Processing - Spring 2024**
# **WEEK 2-2. Image Collection**

### OBJECTIVES
1. Explore how an image collection is constructed.
2. Use filter functions to get the image collection of interest.
3. Visualize image collections.

Credited by Younghyun Koo (kooala317@gmail.com)

## GEE Image collection
An `ImageCollection` is a stack or sequence of images. An `ImageCollection` can be loaded by pasting a GEE asset ID into the `ImageCollection` constructor. You can find the IDs of available image collections in the [data catalog](https://developers.google.com/earth-engine/datasets). In this tutorial, you will learn how to filter these image collections by temporal information (date and time), spatial information (latitude/longitude), and metadata (e.g., cloud cover).



## **1. Import an image collection**

First, let's import and initialize the `ee` library.


In [1]:
# Import ee library
import ee

# Authenticate
ee.Authenticate()

# Initialize with your own project.
ee.Initialize(project = "utsa-spring2024")

In [2]:
import geemap

We will import [USGS Landsat 8 Level 2, Collection 2, Tier 1 ](https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C02_T1_L2) image collection.

In [3]:
# Import image collection - Landsat 8 surface reflectance
collection = ee.ImageCollection('LANDSAT/LC08/C02/T1_L2')

In [4]:
# How many images are in the image collection?
print(collection.size().getInfo())

1825579


There are more than 1.8 million images in this image collection. However, we don't need all of them, because they cover locations all over the world and span a long time series (2013-present). Therefore, we need to filter the collection down to the dates and location we are interested in.

### **Filter by dates**

The `filterDate` function filters an image collection by a date range. You need to provide a start date and an end date. You can find the details of the `filterDate` function here: [filterDate](https://developers.google.com/earth-engine/apidocs/ee-imagecollection-filterdate)

In [5]:
# ImageCollection.filterDate(start, end)
collection = ee.ImageCollection('LANDSAT/LC08/C02/T1_L2').filterDate('2020-05-01', '2020-06-01')

**NOTE:** Start date is INCLUSIVE but end date is EXCLUSIVE.

In [6]:
# Number of images in the filtered image collection
print(collection.size().getInfo())

15168


The image collection is now limited to May 2020, and the number of images is reduced accordingly.
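To make the inclusive-start, exclusive-end rule concrete, here is a small plain-Python sketch that does not call Earth Engine at all. The helper `in_date_range` and the acquisition dates are hypothetical, and the sketch works at day granularity, whereas `filterDate` compares full timestamps.

```python
from datetime import date


def in_date_range(acquired, start, end):
    """Return True when start <= acquired < end (the end date is excluded)."""
    return start <= acquired < end


# Hypothetical acquisition dates, chosen only for illustration.
acquisitions = [date(2020, 4, 30), date(2020, 5, 1), date(2020, 5, 31), date(2020, 6, 1)]

# An end date of 2020-06-01 keeps every day of May, including May 31.
full_may = [d for d in acquisitions
            if in_date_range(d, date(2020, 5, 1), date(2020, 6, 1))]

# An end date of 2020-05-31 drops anything acquired on May 31.
short_may = [d for d in acquisitions
             if in_date_range(d, date(2020, 5, 1), date(2020, 5, 31))]

print(full_may)
print(short_may)
```

In practice this means that, to cover a whole month, the end date should be the first day of the following month.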

### **Filter by location**

In addition to temporal filtering, you can use the `filterBounds` function to narrow the collection down to your region of interest (ROI). The collection is filtered based on a GEE `geometry`, so before running `filterBounds` you need to define a `geometry` as the ROI. We will set the ROI using the `BBox` function. ([Link to BBox function](https://developers.google.com/earth-engine/apidocs/ee-geometry-bbox))

In [15]:
# Region of interest as a bounding box
# San Antonio region: ee.Geometry.BBox(west, south, east, north)
roi = ee.Geometry.BBox(-99, 29, -98, 30)

In [18]:
Map = geemap.Map()
Map.addLayer(roi, {}, "ROI")
Map.centerObject(roi, 8)
Map

Map(center=[29.499488782128225, -98.50000000000014], controls=(WidgetControl(options=['position', 'transparent…

In addition to the bounding box used above, there are several other geometry types you can use: point, line string, linear ring, rectangle, and polygon. Please visit the link below and practice creating other types of geometry. [GEE geometry](https://developers.google.com/earth-engine/guides/geometries)

In [None]:
# # Point
# point = ee.Geometry.Point([1.5, 1.5])

# # Rectangle
# rectangle = ee.Geometry.Rectangle([-40, -20, 40, 20])

# # Polygon
# polygon = ee.Geometry.Polygon([
#   [[-5, 40], [65, 40], [65, 60], [-5, 60], [-5, 60]]
# ])

# # Line string
# lines = ee.Geometry.LineString([[5, -10], [35, -10], [35, 10], [5, 10], [5, -10]])

***DO IT YOURSELF!!***
- Please create a line string, linear ring, rectangle, and polygon with any latitude and longitude. I also encourage you to try using these geometries in the location filters that follow.

Now we will filter the image collection based on this defined geometry using the `filterBounds` function. Please find more details about this function here: [filterBounds](https://developers.google.com/earth-engine/apidocs/ee-imagecollection-filterbounds)

In [23]:
collection = ee.ImageCollection('LANDSAT/LC08/C02/T1_L2') \
    .filterDate('2020-05-01', '2020-05-31') \
    .filterBounds(roi)

In [24]:
# Number of images in the filtered image collection
print(collection.size().getInfo())

8
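`filterBounds` keeps the images whose footprint intersects the given geometry. The plain-Python sketch below illustrates that idea with simple boxes only. The helper `bbox_intersects` and the scene footprints are hypothetical; real Landsat footprints are tilted quadrilaterals, and this sketch ignores boxes that cross the antimeridian.

```python
def bbox_intersects(a, b):
    """Return True when two (west, south, east, north) boxes overlap or touch."""
    a_west, a_south, a_east, a_north = a
    b_west, b_south, b_east, b_north = b
    return (a_west <= b_east and b_west <= a_east
            and a_south <= b_north and b_south <= a_north)


# Same corner order as ee.Geometry.BBox(west, south, east, north).
roi_box = (-99, 29, -98, 30)

# Hypothetical, axis-aligned scene footprints used only for illustration.
footprints = {
    "scene_a": (-99.5, 29.5, -97.5, 31.5),    # overlaps the ROI
    "scene_b": (-123.5, 36.5, -121.0, 38.5),  # far away from the ROI
}

kept = [name for name, box in footprints.items() if bbox_intersects(roi_box, box)]
print(kept)
```

Note that an image is kept even if it covers only part of the ROI, which is why several overlapping scenes can be returned for one small region.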


In [21]:
# Load a geemap
Map = geemap.Map()

# "First" function gets the first image in the collection
image = collection.first()

# image visualization factors
vis_param = {'min': 0,
             'max': 20000,
             'bands': ['SR_B5', 'SR_B4', 'SR_B3'],
             'gamma': 0.5}

Map.addLayer(image, vis_param, "First image")
Map.centerObject(image, 8)

Map

Map(center=[30.304528146939795, -97.8528857330202], controls=(WidgetControl(options=['position', 'transparent_…

In [22]:
# Check the image properties
geemap.image_props(image).getInfo()

{'ALGORITHM_SOURCE_SURFACE_REFLECTANCE': 'LaSRC_1.5.0',
 'ALGORITHM_SOURCE_SURFACE_TEMPERATURE': 'st_1.3.0',
 'CLOUD_COVER': 22.08,
 'CLOUD_COVER_LAND': 22.08,
 'COLLECTION_CATEGORY': 'T1',
 'COLLECTION_NUMBER': 2,
 'DATA_SOURCE_AIR_TEMPERATURE': 'MODIS',
 'DATA_SOURCE_ELEVATION': 'GLS2000',
 'DATA_SOURCE_OZONE': 'MODIS',
 'DATA_SOURCE_PRESSURE': 'Calculated',
 'DATA_SOURCE_REANALYSIS': 'GEOS-5 FP-IT',
 'DATA_SOURCE_TIRS_STRAY_LIGHT_CORRECTION': 'TIRS',
 'DATA_SOURCE_WATER_VAPOR': 'MODIS',
 'DATE_ACQUIRED': '2020-05-14',
 'DATE_PRODUCT_GENERATED': 1597952371000,
 'DATUM': 'WGS84',
 'EARTH_SUN_DISTANCE': 1.0108416,
 'ELLIPSOID': 'WGS84',
 'GEOMETRIC_RMSE_MODEL': 5.121,
 'GEOMETRIC_RMSE_MODEL_X': 3.546,
 'GEOMETRIC_RMSE_MODEL_Y': 3.694,
 'GEOMETRIC_RMSE_VERIFY': 3.036,
 'GRID_CELL_SIZE_REFLECTIVE': 30,
 'GRID_CELL_SIZE_THERMAL': 30,
 'GROUND_CONTROL_POINTS_MODEL': 676,
 'GROUND_CONTROL_POINTS_VERIFY': 152,
 'GROUND_CONTROL_POINTS_VERSION': 5,
 'IMAGE_DATE': '2020-05-14',
 'IMAGE_QUALITY_

### **Filter by metadata**

As you can see in the image properties above, this image was acquired on May 14, 2020 (see 'DATE_ACQUIRED'). However, the cloud cover of this image is about 22 % (i.e., about 22 % of the entire image is covered by cloud; see 'CLOUD_COVER'). Cloud-covered images are less useful for monitoring the Earth's surface. Therefore, we need to select images with low cloud cover. GEE provides a function named `filterMetadata` to filter an image collection based on its metadata. To filter out cloud-covered images, we will use the 'CLOUD_COVER' property from the metadata.

In [25]:
collection = ee.ImageCollection('LANDSAT/LC08/C02/T1_L2') \
    .filterDate('2020-01-01', '2020-12-31') \
    .filterBounds(ee.Geometry.Point(-122.4488, 37.7589)) \
    .filterMetadata('CLOUD_COVER', 'less_than', 10) \
    .sort("CLOUD_COVER")

# Filter by metadata: cloud cover is less than 10 %.
# In this case, we use the 'less_than' operator to keep only images with low cloud cover.
# However, there are some other operators as well: 'greater_than', 'equals', etc.
# The .sort function sorts the images in the image collection by cloud cover.
# That is, the images are sorted from low cloud cover to high cloud cover.


In [26]:
# Number of images in the filtered image collection
print(collection.size().getInfo())

6
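The filter-then-sort logic above can be mimicked on a plain Python list, which may help when reasoning about which image `first()` will return. This is only a sketch: the scene records below are hypothetical, and no Earth Engine call is made.

```python
# Hypothetical scene records; IDs and cloud-cover values are made up for illustration.
scenes = [
    {"id": "scene_1", "CLOUD_COVER": 42.5},
    {"id": "scene_2", "CLOUD_COVER": 6.58},
    {"id": "scene_3", "CLOUD_COVER": 0.07},
    {"id": "scene_4", "CLOUD_COVER": 10.0},
]

# Same idea as .filterMetadata('CLOUD_COVER', 'less_than', 10): a strict comparison,
# so a scene with exactly 10 % cloud cover is dropped.
low_cloud = [s for s in scenes if s["CLOUD_COVER"] < 10]

# Same idea as .sort('CLOUD_COVER'): ascending order, clearest scene first.
low_cloud.sort(key=lambda s: s["CLOUD_COVER"])

# Plays the role of collection.first() after sorting.
clearest = low_cloud[0]

print([s["id"] for s in low_cloud])
print(clearest["id"])
```

Because the collection is sorted by cloud cover before `first()` is called, the first image is the clearest one rather than the earliest one.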


In [31]:
# Load a geemap
Map = geemap.Map()

# "First" function gets the first image in the collection
image = collection.first()

# image visualization factors
vis_param = {'min': 0,
             'max': 15000,
             'bands': ['SR_B4', 'SR_B3', 'SR_B2'],
             'gamma': 0.5}

Map.addLayer(image, vis_param, "First image")
Map.centerObject(image, 8)

Map

Map(center=[37.4730201816219, -122.11971703434752], controls=(WidgetControl(options=['position', 'transparent_…

In [32]:
geemap.image_props(image).getInfo()

{'ALGORITHM_SOURCE_SURFACE_REFLECTANCE': 'LaSRC_1.5.0',
 'ALGORITHM_SOURCE_SURFACE_TEMPERATURE': 'st_1.3.0',
 'CLOUD_COVER': 0.07,
 'CLOUD_COVER_LAND': 0.1,
 'COLLECTION_CATEGORY': 'T1',
 'COLLECTION_NUMBER': 2,
 'DATA_SOURCE_AIR_TEMPERATURE': 'MODIS',
 'DATA_SOURCE_ELEVATION': 'GLS2000',
 'DATA_SOURCE_OZONE': 'MODIS',
 'DATA_SOURCE_PRESSURE': 'Calculated',
 'DATA_SOURCE_REANALYSIS': 'GEOS-5 FP-IT',
 'DATA_SOURCE_TIRS_STRAY_LIGHT_CORRECTION': 'TIRS',
 'DATA_SOURCE_WATER_VAPOR': 'MODIS',
 'DATE_ACQUIRED': '2020-10-12',
 'DATE_PRODUCT_GENERATED': 1604538043000,
 'DATUM': 'WGS84',
 'EARTH_SUN_DISTANCE': 0.9978226,
 'ELLIPSOID': 'WGS84',
 'GEOMETRIC_RMSE_MODEL': 4.842,
 'GEOMETRIC_RMSE_MODEL_X': 3.158,
 'GEOMETRIC_RMSE_MODEL_Y': 3.67,
 'GEOMETRIC_RMSE_VERIFY': 2.811,
 'GRID_CELL_SIZE_REFLECTIVE': 30,
 'GRID_CELL_SIZE_THERMAL': 30,
 'GROUND_CONTROL_POINTS_MODEL': 1108,
 'GROUND_CONTROL_POINTS_VERIFY': 441,
 'GROUND_CONTROL_POINTS_VERSION': 5,
 'IMAGE_DATE': '2020-10-12',
 'IMAGE_QUALITY_OLI

You can extract a property from every image in the image collection as an array by using the `aggregate_array` function.

In [33]:
# Cloud cover information as an array
collection.aggregate_array('CLOUD_COVER').getInfo()

[0.07, 0.47, 0.55, 0.57, 3, 6.58]

In [34]:
# Image ID information as an array
collection.aggregate_array('system:id').getInfo()

['LANDSAT/LC08/C02/T1_L2/LC08_044034_20201012',
 'LANDSAT/LC08/C02/T1_L2/LC08_044034_20200403',
 'LANDSAT/LC08/C02/T1_L2/LC08_044034_20200302',
 'LANDSAT/LC08/C02/T1_L2/LC08_044034_20201129',
 'LANDSAT/LC08/C02/T1_L2/LC08_044034_20201028',
 'LANDSAT/LC08/C02/T1_L2/LC08_044034_20200606']
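The image IDs listed above already carry useful information. Assuming the `LC08_PPPRRR_YYYYMMDD` pattern seen in this output (sensor, WRS path and row, acquisition date), a small helper can split an ID into its parts. The function name `parse_landsat_id` is hypothetical and is not part of `ee` or `geemap`.

```python
from datetime import datetime


def parse_landsat_id(asset_id):
    """Split an ID like '.../LC08_044034_20201012' into sensor, path, row, and date."""
    scene = asset_id.split("/")[-1]              # e.g. 'LC08_044034_20201012'
    sensor, path_row, date_text = scene.split("_")
    return {
        "sensor": sensor,
        "wrs_path": int(path_row[:3]),           # first three digits: WRS path
        "wrs_row": int(path_row[3:]),            # last three digits: WRS row
        "date": datetime.strptime(date_text, "%Y%m%d").date(),
    }


info = parse_landsat_id("LANDSAT/LC08/C02/T1_L2/LC08_044034_20201012")
print(info)
```

For this ID, the parsed date agrees with the 'DATE_ACQUIRED' property shown earlier for the same image.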

In [37]:
# From this image ID information, you can import an image you want from the image collection.

# Array of IDs is saved as a variable named "id_array"
id_array = collection.aggregate_array('system:id').getInfo()

# Import the 3rd image (index: 2) as an image
image = ee.Image(id_array[2])

# image visualization factors
vis_param = {'min': 0,
             'max': 15000,
             'bands': ['SR_B4', 'SR_B3', 'SR_B2'],
             'gamma': 0.5}

Map.addLayer(image, vis_param, "Third image")
Map.centerObject(image, 8)

Map

Map(bottom=405905.0, center=[37.47238420230516, -122.12227008401321], controls=(WidgetControl(options=['positi…

***DO IT YOURSELF!!***
- So far, we used Landsat 8 image collections. Please do the same process you did for Landsat 8 above, but now with Sentinel-2 data. If you want to get the band information about this data, please go to this link: [Sentinel-2 MSI: MultiSpectral Instrument, Level-2A](https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR)
- Also, please change the region of interest by using a different latitude and longitude.
- Please keep in mind that the keys of metadata for Sentinel-2 can be different from the keys of Landsat 8 data. For example, Sentinel-2 uses "CLOUDY_PIXEL_PERCENTAGE" for cloud covers, instead of "CLOUD_COVER" of the Landsat image.

## **2. Reduce image collections**

In the previous examples, we filtered image collections using dates, regions, and metadata, and extracted an image (e.g., the first image) from the image collection. However, sometimes you may need to calculate statistics (e.g., mean, median, standard deviation, max, min) from an image collection. To composite the images in an `ImageCollection`, we can use the `imageCollection.reduce()` function. This composites all the images in the collection into a single image representing, for example, the mean, median, standard deviation, max, or min of the images. You will learn how to use this `reduce()` function to extract the information you want.

### Median & Mean reducer

First, let's see how the reducer `median()` works for the Landsat 8 TOA image collection.

In [48]:
# Load a Landsat 8 TOA collection for a single path-row.
collection = ee.ImageCollection('LANDSAT/LC08/C02/T1_TOA')\
.filterDate('2014-01-01', '2015-01-01') \
.filter(ee.Filter.eq('WRS_PATH', 44)) \
.filter(ee.Filter.eq('WRS_ROW', 34))
# Filter the image collection by path and row (from metadata)

## The script above is equivalent to the script below, which uses filterMetadata
# collection = ee.ImageCollection('LANDSAT/LC08/C02/T1_TOA')\
# .filterDate('2014-01-01', '2015-01-01') \
# .filterMetadata('WRS_PATH', 'equals', 44) \
# .filterMetadata('WRS_ROW', 'equals', 34)

# Compute a median image and display.
median = collection.median()

# Draw a map
Map = geemap.Map()
Map.centerObject(collection, 8)
Map.addLayer(median, {'bands': ['B4', 'B3', 'B2'], 'max': 0.3}, 'Median')
Map

Map(center=[37.47184929405802, -122.11426033557976], controls=(WidgetControl(options=['position', 'transparent…

In [44]:
# See what is stored in the median variable
median

In [49]:
# Reduce the collection with a median reducer.
median = collection.reduce(ee.Reducer.median())

# Display the median image.
Map.addLayer(median, {'bands': ['B4_median', 'B3_median', 'B2_median'], 'max': 0.3}, 'Also median')
Map

Map(bottom=25701.0, center=[37.47184929405802, -122.11426033557976], controls=(WidgetControl(options=['positio…

Please note that the band names differ as a result of using `reduce()` instead of the convenience method `median()`. Specifically, the name of the reducer has been appended to each band name. Let's look at another reducer, `mean()`.

In [50]:
# Reduce the collection with a mean reducer.
mean = collection.reduce(ee.Reducer.mean())

# Display the mean image.
Map.addLayer(mean, {'bands': ['B4_mean', 'B3_mean', 'B2_mean'], 'max': 0.3}, 'Mean')
Map

Map(bottom=25701.0, center=[37.47184929405802, -122.11426033557976], controls=(WidgetControl(options=['positio…
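To see why the median and mean composites can look different, it may help to reduce a tiny stack by hand. The sketch below uses plain Python lists instead of Earth Engine images. The 2x2 "images" and the helper `reduce_stack` are hypothetical, and one pixel is made very bright to stand in for a cloud.

```python
from statistics import mean, median

# Three hypothetical 2x2 single-band "images" of the same area (reflectance-like values).
# The second image has one very bright pixel, standing in for a cloud.
stack = [
    [[0.10, 0.12], [0.20, 0.22]],
    [[0.11, 0.90], [0.21, 0.23]],
    [[0.12, 0.14], [0.22, 0.24]],
]


def reduce_stack(images, reducer):
    """Apply the reducer to the values at each pixel position across all images."""
    rows, cols = len(images[0]), len(images[0][0])
    return [
        [reducer([img[r][c] for img in images]) for c in range(cols)]
        for r in range(rows)
    ]


median_image = reduce_stack(stack, median)
mean_image = reduce_stack(stack, mean)

print(median_image)
print(mean_image)
```

At the "cloudy" pixel, the median ignores the single bright value while the mean is pulled upward by it, which is one reason median composites are often preferred for reducing cloud contamination.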

You can find the available reducers in the `ee.Reducer` documentation: [ee.Reducer](https://developers.google.com/earth-engine/apidocs/ee-reducer-count)

### Linear fitting reducer

Now, let's try a more complex reduction. For example, to compute the long-term linear trend over a collection, use one of the linear regression reducers. The following code computes the linear trend of the MODIS Enhanced Vegetation Index (EVI) ([LINK](https://developers.google.com/earth-engine/datasets/catalog/MODIS_061_MYD13A1)).



In [54]:
# This function adds a band representing the image timestamp.
# Milliseconds since the epoch are converted to years (365-day years) so that
# the fitted trend reads as change per year.
def addTime(image):
  return image.addBands(image.metadata('system:time_start').divide(1000 * 60 * 60 * 24 * 365))

# Load a MODIS collection, filter to several years of 16 day mosaics, and map the time band function over it.
collection = ee.ImageCollection('MODIS/006/MYD13A1').filterDate('2010-01-01', '2020-01-01').map(addTime)

# Select the bands to model with the independent variable first.
trend = collection.select(['system:time_start', 'EVI']).reduce(ee.Reducer.linearFit())

Map = geemap.Map()
Map.setCenter(-96.943, 39.436, 5)
Map.addLayer(trend, {'min': -100, 'max': 100, 'bands': ['scale'], 'palette': ['red', 'white', 'blue']}, 'EVI trend')
Map

Map(center=[39.436, -96.943], controls=(WidgetControl(options=['position', 'transparent_bg'], widget=SearchDat…

In [52]:
trend
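`ee.Reducer.linearFit()` returns two bands, `scale` (the slope) and `offset` (the intercept), computed per pixel by least squares. The plain-Python sketch below reproduces that calculation for a single hypothetical pixel, using the same milliseconds-to-years conversion as `addTime`. The helper `linear_fit` and the EVI values are made up for illustration.

```python
MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365  # same divisor as in addTime()


def linear_fit(xs, ys):
    """Ordinary least squares for y = offset + scale * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    scale = sxy / sxx
    offset = y_mean - scale * x_mean
    return scale, offset


# Hypothetical single-pixel series: timestamps one 365-day year apart,
# with EVI rising by 50 units per year.
start_ms = 1262304000000  # 2010-01-01T00:00:00Z in milliseconds since the epoch
times_ms = [start_ms + i * MS_PER_YEAR for i in range(5)]
evi = [3000 + 50 * i for i in range(5)]

# Convert the timestamps to years, then fit EVI against time.
years = [t / MS_PER_YEAR for t in times_ms]
scale, offset = linear_fit(years, evi)

print(round(scale, 6))
```

Because time is expressed in years, the `scale` value reads directly as EVI change per year, which is what the red-white-blue palette in the map above visualizes.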

## **References**
- https://geemap.org/tutorials/
- https://developers.google.com/earth-engine/guides/ic_reducing