# Image Collections & Useful Operations

> Note: This example notebook covers the basics of defining/initializing image collections and performing general data management operations using `RadGEEToolbox`
>
> ⚠⚠⚠ If you would like to learn how to visualize images processed with `RadGEEToolbox`, [please follow the `Complete_ReadMe_Example.ipynb`, `Palettes_and_Visualization.ipynb`, or `S1_SAR_Backscatter_Basic_Usage.ipynb` example notebooks on GitHub](https://github.com/radwinskis/RadGEEToolbox/tree/main/Example%20Notebooks) ⚠⚠⚠
>

**Initialization and Setup**

In [2]:
import ee
from RadGEEToolbox import LandsatCollection
from RadGEEToolbox import Sentinel2Collection

In [None]:
# Store name of Google Cloud Project associated with Earth Engine - replace with your project ID/name
PROJECT_ID = 'your-cloud-project-id'
# Attempt to initialize Earth Engine
try:
    ee.Initialize(project=PROJECT_ID)
    print("Earth Engine initialized successfully.")
except Exception as e:
    print("Initialization failed, attempting authentication...")
    try:
        ee.Authenticate()
        ee.Initialize(project=PROJECT_ID)
        print("Authentication and initialization successful.")
    except Exception as auth_error:
        print("Authentication failed. Error details:", auth_error)


Earth Engine initialized successfully.


________________

#### **Defining collections - required arguments are `start_date=` and `end_date=`.**

**Optional arguments are for specifying tile(s), boundary/geometry, or relative orbit(s) (orbits for Sentinel-2 only)**

Below are multiple examples of how to define/initialize image collections for Sentinel-2 MSI data using the `Sentinel2Collection` class

- specifying Military Grid Reference System (MGRS) tile(s) (see https://mappingsupport.com/p2/coordinates-mgrs-gissurfer-maps.html for more information and an interactive map illustrating the location(s) and size of MGRS tiles)

In [5]:
#General way to define a Sentinel-2 image collection using start date, end date, and a single MGRS tile
S2_col = Sentinel2Collection(start_date='2023-06-01', end_date='2023-06-30', tile='12TUL')

- specifying a list of MGRS tiles

In [15]:
#General way to define a Sentinel-2 image collection using start date, end date, and multiple MGRS tiles
S2_col = Sentinel2Collection(start_date='2023-06-01', end_date='2023-06-30', tile=['12TUL', '12TUM', '12TUN'])
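
Because a mistyped tile ID can silently produce an empty collection, it may help to sanity-check tile strings before defining the collection. The snippet below is a minimal sketch in plain Python: `invalid_tiles` and `MGRS_TILE_PATTERN` are hypothetical names (not part of `RadGEEToolbox`), and the check assumes the 5-character MGRS tile format used for Sentinel-2 (2-digit UTM zone, latitude band letter, two letters for the 100 km square).

```python
import re

# Pattern for a 5-character MGRS tile ID as used for Sentinel-2 tiles:
#   2-digit UTM zone (01-60), latitude band letter (C-X, no I or O),
#   100 km square column letter (A-Z, no I or O) and row letter (A-V, no I or O).
MGRS_TILE_PATTERN = re.compile(
    r"(0[1-9]|[1-5][0-9]|60)[C-HJ-NP-X][A-HJ-NP-Z][A-HJ-NP-V]"
)

def invalid_tiles(tiles):
    """Return the tile IDs that do not look like MGRS tile IDs (hypothetical helper)."""
    if isinstance(tiles, str):
        tiles = [tiles]  # accept a single tile as well as a list of tiles
    return [t for t in tiles if not MGRS_TILE_PATTERN.fullmatch(t)]

print(invalid_tiles(['12TUL', '12TUM', '12TUN']))  # tiles used in the cell above
print(invalid_tiles(['12TUL', '12TIO', '61ABC']))  # two deliberately malformed IDs
```

This only checks the format of the string; it does not confirm that imagery exists for the tile within the chosen date range.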

- using relative orbit number instead of tile(s) to isolate an entire swath

In [7]:
#Rather than using tiles, we can use a boundary (ee.Geometry) or if using Sentinel-2 we can use relative orbits - here we use a relative orbit number which provides a full swath of data
S2_col_orbit_filter = Sentinel2Collection(start_date='2023-06-01', end_date='2023-06-30', relative_orbit_number=127)

- using a region of interest / boundary / geometry to filter to images overlapping the geometry

In [8]:
#Here is an example of using a boundary to filter the collection - first defining a boundary, in this case Salt Lake County, Utah
counties = ee.FeatureCollection('TIGER/2018/Counties')
salt_lake_county = counties.filter(ee.Filter.And(
    ee.Filter.eq('NAME', 'Salt Lake'),
    ee.Filter.eq('STATEFP', '49')))
salt_lake_geometry = salt_lake_county.geometry()

S2_col_boundary_filter = Sentinel2Collection(start_date='2023-06-01', end_date='2023-06-15', boundary=salt_lake_geometry)

- using relative orbit numbers and a cloud percentage threshold (excludes images whose areal cloud cover exceeds the set percentage)

In [9]:
#You can filter for clouds by setting the cloud percentage threshold - here we set it to 15%
S2_col_low_clouds = Sentinel2Collection(start_date='2023-06-01', end_date='2023-06-30', relative_orbit_number=127, cloud_percentage_threshold=15)

- using relative orbit numbers and a NoData threshold (excludes images in which the percentage of NoData pixels exceeds the set threshold)


In [10]:
#You can also filter for images with a lot of NoData - which happens more than you'd think - here we set the threshold to 15% as well
S2_col_no_blank_images = Sentinel2Collection(start_date='2023-06-01', end_date='2023-06-30', relative_orbit_number=127, nodata_threshold=15)

> REMINDER:
>
> If you would like to learn how to visualize images processed with `RadGEEToolbox`, [please follow the `Complete_ReadMe_Example.ipynb`, `Palettes_and_Visualization.ipynb`, or `S1_SAR_Backscatter_Basic_Usage.ipynb` example notebooks on GitHub](https://github.com/radwinskis/RadGEEToolbox/tree/main/Example%20Notebooks)

_____________________________

Because GEE uses deferred execution, it does not compute results when you define a collection. Processing is deferred until you explicitly request an output.

Printing the dates of a defined collection is a quick way to request such an output and to verify that the collection is defined correctly and is not empty.

________________

**Below is an example of how to quickly print the dates for any `RadGEEToolbox` collection object using `.dates`**

In this case, printing all of the dates from the Sentinel-2 collection defined above (`S2_col`)

In [11]:
print(S2_col.dates)

['2023-06-01', '2023-06-01', '2023-06-02', '2023-06-02', '2023-06-04', '2023-06-04', '2023-06-04', '2023-06-06', '2023-06-06', '2023-06-07', '2023-06-07', '2023-06-09', '2023-06-09', '2023-06-09', '2023-06-11', '2023-06-11', '2023-06-12', '2023-06-12', '2023-06-12', '2023-06-14', '2023-06-14', '2023-06-14', '2023-06-16', '2023-06-16', '2023-06-17', '2023-06-17', '2023-06-19', '2023-06-19', '2023-06-19', '2023-06-21', '2023-06-21', '2023-06-22', '2023-06-22', '2023-06-22', '2023-06-24', '2023-06-24', '2023-06-24', '2023-06-26', '2023-06-26', '2023-06-27', '2023-06-27', '2023-06-29', '2023-06-29', '2023-06-29']
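
Several dates in the printout appear more than once, meaning multiple images share the same date. A quick summary of the list can make the "is this collection empty or as expected?" check more explicit. The snippet below is a minimal sketch in plain Python: `summarize_dates` is a hypothetical helper, not a `RadGEEToolbox` function, and it assumes the input is a list of `YYYY-MM-DD` strings like the one returned by `.dates`.

```python
from collections import Counter

def summarize_dates(dates):
    """Summarize a client-side list of 'YYYY-MM-DD' strings (hypothetical helper)."""
    if not dates:
        raise ValueError("Collection is empty: check the date range and filters.")
    per_date = Counter(dates)  # number of images for each date
    return {
        "n_images": len(dates),
        "n_unique_dates": len(per_date),
        "first": min(dates),  # ISO date strings sort chronologically
        "last": max(dates),
        "max_images_per_date": max(per_date.values()),
    }

# First ten dates from the printout above; in the notebook, pass S2_col.dates instead
example_dates = ['2023-06-01', '2023-06-01', '2023-06-02', '2023-06-02', '2023-06-04',
                 '2023-06-04', '2023-06-04', '2023-06-06', '2023-06-06', '2023-06-07']
print(summarize_dates(example_dates))
```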


_________

**Defining a LandsatCollection object is very similar to Sentinel-2, however, Landsat tiles use a different grid system, the Worldwide Reference System (WRS-2), and thus the arguments for filtering by tile(s) are slightly different**

See https://maps.eatlas.org.au/index.html?intro=false&z=7&ll=131.46626,-15.36800&l0=ea_ref%3AWorld_USGS_Landsat-WRS-2_Descending,ea_ea-be%3AWorld_Bright-Earth-e-Atlas-basemap,google_HYBRID,google_TERRAIN,google_SATELLITE,google_ROADMAP&o0=,0.3&v0=,,f,f,f,f for more information and an interactive map of WRS-2 tiles.

_________

**Below are some examples of how to define and filter a Landsat collection**

In [12]:
# Similar examples for Landsat collections - showing how to filter using tiles or boundaries
col = LandsatCollection(start_date='2023-06-01', end_date='2023-06-30', tile_row=32, tile_path=38)
tile_filtered_col = LandsatCollection(start_date='2023-06-01', end_date='2023-06-30', tile_row=32, tile_path=38, cloud_percentage_threshold=50)
SLC_filtered_col = LandsatCollection(start_date='2023-06-01', end_date='2023-06-30', boundary=salt_lake_geometry, cloud_percentage_threshold=15)

__________

**You may want to view the metadata of the defined collection to view band names and available properties - as shown below**

The `.collection` attribute converts the RadGEEToolbox collection object to an ee.ImageCollection object to perform native GEE operations

Below we print the metadata of the image collection as a demo

In [13]:
# Showing how to access the collection information by using the getInfo() method after converting the LandsatCollection object to an ee.ImageCollection object. 
# This will return a dictionary with metadata about the collection.

print(SLC_filtered_col.collection.getInfo())

{'type': 'ImageCollection', 'bands': [], 'features': [{'type': 'Image', 'bands': [{'id': 'SR_B1', 'data_type': {'type': 'PixelType', 'precision': 'int', 'min': 0, 'max': 65535}, 'dimensions': [7891, 8001], 'crs': 'EPSG:32612', 'crs_transform': [30, 0, 167985, 0, -30, 4587315]}, {'id': 'SR_B2', 'data_type': {'type': 'PixelType', 'precision': 'int', 'min': 0, 'max': 65535}, 'dimensions': [7891, 8001], 'crs': 'EPSG:32612', 'crs_transform': [30, 0, 167985, 0, -30, 4587315]}, {'id': 'SR_B3', 'data_type': {'type': 'PixelType', 'precision': 'int', 'min': 0, 'max': 65535}, 'dimensions': [7891, 8001], 'crs': 'EPSG:32612', 'crs_transform': [30, 0, 167985, 0, -30, 4587315]}, {'id': 'SR_B4', 'data_type': {'type': 'PixelType', 'precision': 'int', 'min': 0, 'max': 65535}, 'dimensions': [7891, 8001], 'crs': 'EPSG:32612', 'crs_transform': [30, 0, 167985, 0, -30, 4587315]}, {'id': 'SR_B5', 'data_type': {'type': 'PixelType', 'precision': 'int', 'min': 0, 'max': 65535}, 'dimensions': [7891, 8001], 'crs':

Below we print the crs projection type of the image collection as a demo

In [14]:
# Similar to the previous example, but here we access the CRS (Coordinate Reference System) of the first band of the first feature in the collection. 
# This approach can be useful for accessing specific metadata about the images in the collection, or accessing the properties of a single image in the collection.

print(SLC_filtered_col.collection.getInfo()['features'][0]['bands'][0]['crs'])

EPSG:32612


After calling `.getInfo()`, the server-side ee.ImageCollection is converted into a native Python dictionary representing the object’s structure. This allows you to inspect metadata like projection, band names, image dimensions, and more.
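
Since each `.getInfo()` call is a separate request, one option is to call it once, keep the dictionary, and read several pieces of metadata from it. The snippet below is a minimal sketch: `summarize_collection_info` is a hypothetical helper, and `example_info` is a trimmed stand-in that mirrors only the keys visible in the printout above (`features`, `bands`, `id`, `crs`, `dimensions`).

```python
def summarize_collection_info(info):
    """Return (image_count, band_ids, crs) from a getInfo()-style dictionary.

    Hypothetical helper; band IDs and CRS are read from the first image only.
    """
    features = info.get("features", [])
    if not features:
        return 0, [], None
    first_bands = features[0].get("bands", [])
    band_ids = [band["id"] for band in first_bands]
    crs = first_bands[0]["crs"] if first_bands else None
    return len(features), band_ids, crs

# Trimmed stand-in mirroring the structure of the printed metadata
example_info = {
    "type": "ImageCollection",
    "bands": [],
    "features": [
        {"type": "Image",
         "bands": [
             {"id": "SR_B1", "crs": "EPSG:32612", "dimensions": [7891, 8001]},
             {"id": "SR_B2", "crs": "EPSG:32612", "dimensions": [7891, 8001]},
         ]},
    ],
}
print(summarize_collection_info(example_info))
```

In the notebook, the input would be the stored result of `SLC_filtered_col.collection.getInfo()`.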

__________

**Data management is an important aspect of remote sensing workflows and RadGEEToolbox provides flexibility to convert image collections back and forth from RadGEEToolbox and GEE image collection objects, which allows inclusion of custom GEE API functions and image processing workflows while retaining the ability to use RadGEEToolbox functionality at any time**

If we print the variable of one of the collections we see it is a RadGEEToolbox object

In [12]:
print(S2_col)

<RadGEEToolbox.Sentinel2Collection.Sentinel2Collection object at 0x000002253CEBF8F0>


________

**Below is an example of how to convert to a GEE collection and then convert back to a RadGEEToolbox collection**

First we convert a Sentinel2Collection object to a GEE object using `.collection`, as verified by the printout below

In [13]:
# We can turn a Sentinel2Collection or LandsatCollection object into an Earth Engine image collection using the collection attribute
S2_gee_col = S2_col.collection
print('The collection is now a', type(S2_gee_col))

The collection is now a <class 'ee.imagecollection.ImageCollection'>


You may now perform native GEE operations on this collection. 

_________
Once you are ready to convert back, the following demonstrates the easy conversion from a GEE object to a Sentinel2Collection object - this is identical for the `LandsatCollection` and `Sentinel1Collection` classes. 

**When initializing the class object, just feed in the ee.ImageCollection using the argument `collection=`**

The printout below verifies the collection is once again a RadGEEToolbox object

In [14]:
# Say you have an Earth Engine image collection object but you want to turn it into a Sentinel2Collection or LandsatCollection object, 
# just feed it in as a collection!
S2_col = Sentinel2Collection(collection=S2_gee_col)
print('The collection is back to a', type(S2_col))

The collection is back to a <class 'RadGEEToolbox.Sentinel2Collection.Sentinel2Collection'>


_______

## Supplemental examples

**Attributes**

________
How to store and print the list of dates in the `S2_col` image collection using `.dates`

In [15]:
# We can easily print the dates of all of the images in the collection using the dates attribute - this is a client-side operation
S2_dates = S2_col.dates
print('Readable list of image dates (client-side)', S2_dates)

Readable list of image dates (client-side) ['2023-06-01', '2023-06-01', '2023-06-02', '2023-06-02', '2023-06-04', '2023-06-04', '2023-06-04', '2023-06-06', '2023-06-06', '2023-06-07', '2023-06-07', '2023-06-09', '2023-06-09', '2023-06-09', '2023-06-11', '2023-06-11', '2023-06-12', '2023-06-12', '2023-06-12', '2023-06-14', '2023-06-14', '2023-06-14', '2023-06-16', '2023-06-16', '2023-06-17', '2023-06-17', '2023-06-19', '2023-06-19', '2023-06-19', '2023-06-21', '2023-06-21', '2023-06-22', '2023-06-22', '2023-06-22', '2023-06-24', '2023-06-24', '2023-06-24', '2023-06-26', '2023-06-26', '2023-06-27', '2023-06-27', '2023-06-29', '2023-06-29', '2023-06-29']


`.dates` makes a client-side request using `.getInfo()` to convert the server-side list of image dates into a native Python list. The result is cached, so printing or reusing the list does not repeatedly trigger new requests.
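
To illustrate the caching behaviour described above, here is a minimal sketch of the general pattern using `functools.cached_property`. It is not the `RadGEEToolbox` implementation: `DatesDemo`, `_fetch_from_server`, and `request_count` are hypothetical names, and the "server" is just a Python list standing in for a `.getInfo()` round trip.

```python
from functools import cached_property

class DatesDemo:
    """Toy stand-in for a collection object; not the RadGEEToolbox implementation."""

    def __init__(self, server_dates):
        self._server_dates = server_dates
        self.request_count = 0  # counts simulated client-side requests

    def _fetch_from_server(self):
        # Stand-in for an expensive .getInfo() round trip
        self.request_count += 1
        return list(self._server_dates)

    @cached_property
    def dates(self):
        # Computed on first access, then stored on the instance and reused
        return self._fetch_from_server()

demo = DatesDemo(['2023-06-01', '2023-06-04'])
demo.dates  # first access triggers the simulated request
demo.dates  # second access reuses the cached list
print('Simulated requests made:', demo.request_count)
```

The same idea applies to your own code: store the result of a client-side request in a variable rather than repeating the request.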

However, in some cases you may want to retain the list as a server-side `ee.List`—for example, when performing iterative operations inside native Earth Engine functions. 

For this, use `.dates_list`.

---

Using `.dates_list` to retrieve a server-side list of dates from the `S2_col` image collection is useful for operations that must remain on the Earth Engine server. 

The printout below shows the object is an `ee.List`, not a Python list.


In [16]:
# Alternatively, we can make a list of server-side dates for iteration, if needed
S2_dates_server_side = S2_col.dates_list
print('Server side dates are of type:', type(S2_dates_server_side))

Server side dates are of type: <class 'ee.ee_list.List'>


__________

How to mask **clouds** out of a multispectral image collection using `.masked_clouds_collection`

In [17]:
# You can easily mask out clouds or water in the image collections
S2_masked_clouds = S2_col.masked_clouds_collection

___________

How to mask to **water** with a multispectral image collection using `.masked_to_water_collection`

In [18]:
S2_masked_to_water = S2_col.masked_to_water_collection

________

**Utilizing methods for general data management**
- Masking to polygon using `.mask_to_polygon()`


In [19]:
# Mask entire collection based on geometry
masked_S2_col = S2_col_orbit_filter.mask_to_polygon(salt_lake_geometry)

- Mosaicking images with the same date using `.MosaicByDate`


In [20]:
# Mosaic images in collection that share an image date
mosaiced_S2_col = S2_col_boundary_filter.MosaicByDate

- Masking water out of images automatically using `.masked_water_collection`

In [21]:
# Mask water pixels from each single image using quality bands
water_masked_S2_col = S2_col_boundary_filter.masked_water_collection

- Masking water out of images using NDWI with `.masked_water_collection_NDWI()`

In [22]:
# Mask water pixels from each image using NDWI - pixels with NDWI values above the specified threshold are treated as water and masked
water_masked_S2_col = S2_col_boundary_filter.masked_water_collection_NDWI(threshold=0)

___________
**Example chaining of methods - where we first mosaic the collection using `.MosaicByDate`, mask the collection to water pixels using `.masked_to_water_collection`, then calculate relative turbidity for each image using `.turbidity`**

In [23]:
# Example chaining of methods - where we first mosaic the collection, mask the collection to water pixels, then calculate relative turbidity for each image
turbidity_chain_example = S2_col.MosaicByDate.masked_to_water_collection.turbidity
print(turbidity_chain_example.dates)

['2023-06-01', '2023-06-02', '2023-06-04', '2023-06-06', '2023-06-07', '2023-06-09', '2023-06-11', '2023-06-12', '2023-06-14', '2023-06-16', '2023-06-17', '2023-06-19', '2023-06-21', '2023-06-22', '2023-06-24', '2023-06-26', '2023-06-27', '2023-06-29']


__________
**Arguably, two of the most useful methods when exploring an image collection are `image_grab()` and `image_pick()`, which allow you to select images from an image collection one at a time, by index or by date. This is helpful when visualizing the imagery.**

- Example using `.image_grab()`, grabbing the most recent image in the collection and printing the date

In [24]:
# Select image from collection based on index
image_from_S2_collection = mosaiced_S2_col.image_grab(-1)
print('Image date: ', image_from_S2_collection.getInfo()['properties']['Date_Filter'])

Image date:  2023-06-14


- Example using `.image_pick()`, where you specify the date of the image you want to pick from the collection.

First let's pick a date, `date_of_interest` and print the date to verify

In [25]:
date_of_interest = mosaiced_S2_col.dates[-1]
print('Date of interest: ', date_of_interest)

Date of interest:  2023-06-14


Then use `.image_pick()` to select the image from the collection

We print the date of the selected image to verify it matches the date we want

In [26]:
# Select image from collection based on date
image_from_S2_collection = mosaiced_S2_col.image_pick(date_of_interest)

#Verify the date of the image matches the date we selected
print('Date of selected image: ', image_from_S2_collection.getInfo()['properties']['Date_Filter'])

Date of selected image:  2023-06-14
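
When the exact acquisition date is not known in advance, the client-side `.dates` list can be searched for the closest available date. The snippet below is a minimal sketch in plain Python: `nearest_date_index` is a hypothetical helper (not a `RadGEEToolbox` function), and `mosaic_dates` is an example list standing in for `mosaiced_S2_col.dates`.

```python
from datetime import date

def nearest_date_index(dates, target):
    """Index of the entry in `dates` closest to `target` (hypothetical helper).

    Both `dates` and `target` are expected as 'YYYY-MM-DD' strings.
    """
    target_day = date.fromisoformat(target)
    gaps = [abs((date.fromisoformat(d) - target_day).days) for d in dates]
    return gaps.index(min(gaps))  # ties resolve to the earliest entry in the list

# Example list standing in for mosaiced_S2_col.dates
mosaic_dates = ['2023-06-01', '2023-06-04', '2023-06-06',
                '2023-06-09', '2023-06-11', '2023-06-14']
idx = nearest_date_index(mosaic_dates, '2023-06-08')
print(idx, mosaic_dates[idx])
```

The resulting index could then be passed to `.image_grab()`, or the matching date string to `.image_pick()`.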


_______________________________
**Using static functions**

The examples below show how you can use RadGEEToolbox functions on Earth Engine objects. Here we define the Earth Engine object by converting a RadGEEToolbox collection to an Earth Engine object. Then, to apply a function, we use `.map()` to iterate it across the collection (for functions which require an ee.Image as the input)

**Note: All functionality offered by static functions is built into the RadGEEToolbox class objects. Static functions are only useful when you want to work outside of RadGEEToolbox class objects and solely with Earth Engine objects**

- Using `.image_dater` to add the RadGEEToolbox recognized date to the properties of each image in the collection. 

This is mandatory if you anticipate using RadGEEToolbox functionality as static functions, as they will look for an image property called `Date_Filter` which is not present by default. 

In [27]:
# adding image date properties to images (very general example)
S2_col_date_props = S2_col.collection.map(Sentinel2Collection.image_dater)

- Using `MaskWaterS2` to automatically mask water from Sentinel-2 imagery

In [28]:
# masking water from images
S2_water_masked_col = S2_col.collection.map(Sentinel2Collection.MaskWaterS2)

- Using `PixelAreaSum()` to calculate the area of a class from a classified image

In [29]:
# calculating the surface area of pixels of interest in square meters (water pixels from NDWI, for example)
water_area = Sentinel2Collection.PixelAreaSum(image=mosaiced_S2_col.ndwi.image_grab(-1), band_name='ndwi', geometry=salt_lake_geometry, threshold=0, scale=10)

print('Square meters of water in image:', water_area.getInfo().get('properties')['ndwi'])

Square meters of water in image: 1647328630.1742606


_______________________
Finally, here is an example illustrating how simple it is to create a valuable time series dataset, in this case by calculating the area of water within Salt Lake County across an entire collection.

We stack `.ndwi` and `.PixelAreaSumCollection()` to process the original collection to NDWI and create the time series in one line of code

Then we show how to print the resulting area calculations using `.aggregate_array()`. This is necessary because Earth Engine doesn't allow outputs of a different type than the input data, so a list cannot be returned directly from an image collection. To work around this, the surface area of the class of interest is stored as a property of each image in the collection, under the band name of the class of interest. In this case the band name is 'ndwi', so we print the list of image properties named 'ndwi' and use `.getInfo()` to convert the server-side list to a client-side list.

For best data management, it is suggested to store the list as a dataframe and export as a csv or table type of preference for further analyses - or else processing will be slowed by repetitive client-side requests.

In [None]:
# Showing how to make an image collection with pixel area calculated for all images in the collection (using ndwi images as example), and how to assess 
# the area calculations using aggregate_array()

area_col = mosaiced_S2_col.ndwi.PixelAreaSumCollection(band_name='ndwi', geometry=salt_lake_geometry, threshold=0, scale=50)
print('Square meters of water in images are:', area_col.ExportProperties('ndwi'))

Square meters of water in images are: [1069594768.6520188, 251369607.87758926, 247235936.65393806, 496807659.3774899, 192290154.215422, 1646335856.4432714]


Dates of images corresponding to water area list

In [31]:
print('Dates of images:', mosaiced_S2_col.dates)

Dates of images: ['2023-06-01', '2023-06-04', '2023-06-06', '2023-06-09', '2023-06-11', '2023-06-14']
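
As suggested above, the area values are easier to reuse once they are stored in a table. The snippet below is a minimal sketch using only the standard library (`csv`) rather than a dataframe. The two lists are copied from the printouts above, `build_rows` and `rows_to_csv` are hypothetical helper names, and it assumes the area list and the date list are in the same order.

```python
import csv
import io

# Values copied from the two printouts above (assumed to be in matching order)
dates = ['2023-06-01', '2023-06-04', '2023-06-06',
         '2023-06-09', '2023-06-11', '2023-06-14']
areas_m2 = [1069594768.6520188, 251369607.87758926, 247235936.65393806,
            496807659.3774899, 192290154.215422, 1646335856.4432714]

FIELDNAMES = ["date", "water_area_m2", "water_area_km2"]

def build_rows(dates, areas_m2):
    """Pair each date with its area and add a km^2 column (hypothetical helper)."""
    if len(dates) != len(areas_m2):
        raise ValueError("dates and areas must have the same length")
    return [
        {"date": d, "water_area_m2": a, "water_area_km2": round(a / 1e6, 2)}
        for d, a in zip(dates, areas_m2)
    ]

def rows_to_csv(rows):
    """Serialize the rows to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

rows = build_rows(dates, areas_m2)
csv_text = rows_to_csv(rows)
print(csv_text)
```

To save the table to disk, the same text can be written to a file opened with `open('water_area.csv', 'w', newline='')`; the file name here is only an example.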


### *Please refer to the [RadGEEToolbox documentation](https://radgeetoolbox.readthedocs.io/en/latest/) for more information and a comprehensive list of available functionality*