# GlaSEE pipeline

Full classification pipeline for Sentinel-2 TOA, Sentinel-2 SR, and Landsat 8/9 images. 

## Requirements:
1. Google Earth Engine (GEE) account: used to query imagery and the DEM (if no DEM is provided). Sign up for a free account [here](https://earthengine.google.com/new_signup/). 

2. Google Drive folder: Create a folder where output snow cover statistics will be saved. Enter the name of this folder as the `out_folder` variable below. If you don't create the folder ahead of time, duplicates of the same folder will be created for each output file!

## Define image search settings and paths

In [1]:
import os
import ee
import geemap
import sys

# -----Define Google Drive folder for outputs
# Note: Make sure this folder already exists and is the only folder in your "My Drive" with that name. 
out_folder = 'glacier-snow-cover_exports'

# -----Import pipeline utilities
# Assumes glasee_pipeline_utils.py is in the same folder as this notebook
script_path = os.getcwd()
sys.path.append(script_path)
import glasee_pipeline_utils as utils

# -----Define image search settings
# Date and month ranges (inclusive)
date_start = '2014-05-01' #default 2014-05-01 start
date_end = '2025-08-15' #default 2025-08-15 end
month_start = 5 #May = 5
month_end = 10 #Oct = 10
# Minimum percentage of the AOI (0–100) that an image must cover; images below this are removed after mosaicking by day.
min_aoi_coverage = 70
# Whether to mask clouds using the respective cloud mask via the geedim package
mask_clouds = True



## Authenticate and/or Initialize Google Earth Engine (GEE)

Replace the project ID with your GEE project. Default = `ee-{GEE-username}`

In [2]:
project_id = "ee-ellynenderlin"

try:
    ee.Initialize(project=project_id)
except Exception:  # initialization failed (e.g., no stored credentials): authenticate, then retry
    ee.Authenticate()
    ee.Initialize(project=project_id)

## Run the pipeline for a single glacier

### Select the Area of Interest (AOI) from the Randolph Glacier Inventory (RGI) dataset

This cell plots the RGI dataset on a map. To find a glacier, click the wrench in the toolbox at the upper right of the map, then use the "Inspector" to click on a polygon and view its properties. Right-click the "rgi_id" property to highlight it, then copy it. Replace the `rgi_id` variable below with your selected site.

In [3]:
# Load the RGI v7 dataset
rgi = ee.FeatureCollection("projects/ee-raineyaberle/assets/glacier-snow-cover-mapping/RGI2000-v7-G")

# Select a glacier by the RGI v7 ID
rgi_id = 'RGI2000-v7.0-G-01-14773' #RGI2000-v7.0-G-01-11350 (Wolverine), RGI2000-v7.0-G-01-05299 (Gulkana), and RGI2000-v7.0-G-01-05740 (Kennicott)
                                 
# Grab the geometry
aoi = rgi.filter(ee.Filter.eq('rgi_id', rgi_id))
aoi = aoi.geometry()
aoi_area = aoi.area().getInfo() # save area [m^2] for splitting date ranges later
print(f"Glacier area = {int(aoi_area/1e6)} km2")

# Create a Map
Map = geemap.Map()
Map.addLayer(rgi, {'color': 'blue', 'opacity':  0.5}, 'RGI v7')
Map.addLayer(aoi, {'color': 'orange', 'opacity': 0.8}, 'AOI')
Map.centerObject(aoi)
Map

Glacier area = 16 km2



### Load the Digital Elevation Model (DEM)

Default: use the ArcticDEM Mosaic where it covers more than 90 % of the AOI. Otherwise, use NASADEM. For sites that use the ArcticDEM Mosaic, elevations are converted to heights above the EGM96 geoid to match the vertical datum of NASADEM.

In [4]:
# Query GEE for DEM
dem = utils.query_gee_for_dem(aoi)

# Add DEM to map
# grab min and max elevations for color limits
minMax = dem.reduceRegion(reducer=ee.Reducer.minMax(),
                          geometry=aoi, 
                          scale=30,
                          maxPixels=1e9,
                          bestEffort=True)
# fetch the values once so they can be reused for printing and for the color limits
elev_min = minMax.get('elevation_min').getInfo()
elev_max = minMax.get('elevation_max').getInfo()
print(f'Elevation range = {int(elev_min)} to {int(elev_max)} m')
# colors based on the "terrain" palette from matplotlib
palette = ['#333399', '#0d7fe5', '#00be90','#55dd77','#c6f48e','#e3db8a','#aa926b','#8e6e67','#c6b6b3','#ffffff']
Map.addLayer(dem, {'palette': palette, 'min': elev_min, 'max': elev_max}, 'DEM')


Querying GEE for DEM
ArcticDEM coverage = 100 %
Using ArcticDEM Mosaic
Elevation range = 958 to 2069 m
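
The DEM selection rule described above can be written as a small decision function. This is only a sketch of the stated rule (ArcticDEM Mosaic above 90 % coverage, NASADEM otherwise); the function name `choose_dem` and its parameters are hypothetical and are not part of `glasee_pipeline_utils`, whose internals are not shown here.

```python
def choose_dem(arcticdem_coverage_pct, threshold_pct=90.0):
    """Return the DEM name implied by the rule stated in this notebook.

    arcticdem_coverage_pct: percentage (0-100) of the AOI covered by ArcticDEM.
    threshold_pct: assumed cutoff, taken from the "> 90 %" wording above.
    """
    # Strictly greater than the threshold, matching the "> 90 %" wording
    if arcticdem_coverage_pct > threshold_pct:
        return "ArcticDEM Mosaic"
    return "NASADEM"


print(choose_dem(100.0))
print(choose_dem(35.0))
```

For the glacier above, the reported coverage of 100 % falls on the ArcticDEM side of the rule, which agrees with the "Using ArcticDEM Mosaic" message.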


### Run Sentinel-2 Top of Atmosphere (TOA): 2016 onwards

In [None]:
dataset = "Sentinel-2_TOA"
utils.run_classification_pipeline(aoi, aoi_area, dem, dataset, date_start, date_end, month_start, month_end, 
                                  min_aoi_coverage, mask_clouds, out_folder, rgi_id, scale=None, verbose=False)

### Run Sentinel-2 Surface Reflectance (SR): 2019 onwards

In [5]:
dataset = "Sentinel-2_SR"
utils.run_classification_pipeline(aoi, aoi_area, dem, dataset, date_start, date_end, month_start, month_end, 
                                  min_aoi_coverage, mask_clouds, out_folder, rgi_id, scale=None, verbose=False)

AOI area < 500 km2 — splitting date range by month.
Number of date ranges = 40
Exporting snow cover statistics to glacier-snow-cover_exports Google Drive folder with file naming convention: RGI2000-v7.0-G-01-14773_Sentinel-2_SR_snow_cover_stats_DATE-START_DATE-END.csv
To monitor export tasks, see your Google Cloud Console or GEE Task Manager: https://code.earthengine.google.com/tasks
('2019-05-01', '2019-05-31')
...current queue length 2900
('2019-06-01', '2019-06-30')
...current queue length 2901
('2019-07-01', '2019-07-31')
...current queue length 2902
('2019-08-01', '2019-08-31')
...current queue length 2903
('2019-09-01', '2019-09-30')
...current queue length 2904
('2019-10-01', '2019-10-31')
...current queue length 2905
('2020-05-01', '2020-05-31')
...current queue length 2906
('2020-06-01', '2020-06-30')
...current queue length 2907
('2020-07-01', '2020-07-31')
...current queue length 2908
('2020-08-01', '2020-08-31')
...current queue length 2909
('2020-09-01', '2020-09-30')
...c
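
The log above reports that the date range is split by month. A minimal sketch of how such monthly ranges could be generated from `date_start`, `date_end`, `month_start`, and `month_end` is given below. The function `monthly_date_ranges` is hypothetical and may differ from what `run_classification_pipeline` does internally; in particular, clipping the final range to `date_end` and passing the 2019 start date explicitly are assumptions.

```python
import calendar
from datetime import date


def monthly_date_ranges(date_start, date_end, month_start, month_end):
    """Return (start, end) ISO date strings, one per calendar month
    inside both the overall date range and the seasonal month window."""
    start = date.fromisoformat(date_start)
    end = date.fromisoformat(date_end)
    ranges = []
    for year in range(start.year, end.year + 1):
        for month in range(month_start, month_end + 1):
            first = date(year, month, 1)
            last = date(year, month, calendar.monthrange(year, month)[1])
            # Clip each month to the overall date range (assumed behavior)
            lo, hi = max(first, start), min(last, end)
            if lo <= hi:
                ranges.append((lo.isoformat(), hi.isoformat()))
    return ranges


# Sentinel-2 SR is run from 2019 onwards, so the start date is set to 2019-05-01 here
ranges = monthly_date_ranges('2019-05-01', '2025-08-15', 5, 10)
print(len(ranges))
print(ranges[0])
```

With these inputs the sketch yields 40 ranges beginning with `('2019-05-01', '2019-05-31')`, consistent with the "Number of date ranges = 40" line in the output above.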

### Run Landsat 8/9 SR: 2013 onwards

In [6]:
dataset = "Landsat"
utils.run_classification_pipeline(aoi, aoi_area, dem, dataset, date_start, date_end, month_start, month_end, 
                                  min_aoi_coverage, mask_clouds, out_folder, rgi_id, scale=None, verbose=False)

AOI area < 500 km2 — splitting date range by month.
Number of date ranges = 70
Exporting snow cover statistics to glacier-snow-cover_exports Google Drive folder with file naming convention: RGI2000-v7.0-G-01-14773_Landsat_snow_cover_stats_DATE-START_DATE-END.csv
To monitor export tasks, see your Google Cloud Console or GEE Task Manager: https://code.earthengine.google.com/tasks
('2014-05-01', '2014-05-31')
...current queue length 2934
('2014-06-01', '2014-06-30')
...current queue length 2935
('2014-07-01', '2014-07-31')
...current queue length 2936
('2014-08-01', '2014-08-31')
...current queue length 2937
('2014-09-01', '2014-09-30')
...current queue length 2938
('2014-10-01', '2014-10-31')
...current queue length 2939
('2015-05-01', '2015-05-31')
...current queue length 2940
('2015-06-01', '2015-06-30')
...current queue length 2941
('2015-07-01', '2015-07-31')
...current queue length 2942
('2015-08-01', '2015-08-31')
...current queue length 2943
('2015-09-01', '2015-09-30')
...current
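
Because each date range is exported to its own CSV file, it can help to build the expected file names and compare them against what has arrived in the Drive folder. The sketch below follows the naming convention printed in the log (`<rgi_id>_<dataset>_snow_cover_stats_DATE-START_DATE-END.csv`). The helper names are hypothetical, and the assumption that DATE-START and DATE-END appear as `YYYY-MM-DD` strings is not confirmed by the output shown here.

```python
def export_file_name(rgi_id, dataset, range_start, range_end):
    """Build one output file name following the convention printed in the log.
    Assumes dates are written as YYYY-MM-DD in the file name."""
    return f"{rgi_id}_{dataset}_snow_cover_stats_{range_start}_{range_end}.csv"


def missing_exports(expected_names, downloaded_names):
    """Return the expected file names that are not among the downloaded ones."""
    have = set(downloaded_names)
    return [name for name in expected_names if name not in have]


# Two of the date ranges listed in the Landsat log above
date_ranges = [('2014-05-01', '2014-05-31'), ('2014-06-01', '2014-06-30')]
expected = [export_file_name('RGI2000-v7.0-G-01-14773', 'Landsat', s, e)
            for s, e in date_ranges]

# Pretend only the first file has been downloaded so far
downloaded = [expected[0]]
print(missing_exports(expected, downloaded))
```

In practice, `downloaded` could come from `os.listdir()` on a local copy of the `out_folder` directory.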

## Run the pipeline for multiple glaciers

First, create a list of glacier IDs for analysis. 

Below is an example selection, in which the full RGI v7 collection is filtered by RGI second-order (O2) region (`o2region`) and glacier area (`area_km2`). For the full list of properties available for filtering, see the [RGI v7 documentation](https://www.glims.org/rgi_user_guide/products/glacier_product.html#full-list-of-attributes) or run the following command: 

`rgi.first().propertyNames().getInfo()`

In [8]:
# Load the RGI v7 dataset
rgi = ee.FeatureCollection("projects/ee-raineyaberle/assets/glacier-snow-cover-mapping/RGI2000-v7-G")

# Filter by second-order (O2) region
rgi_r1r2 = rgi.filter(ee.Filter.eq('o2region', '01-05')) # filter by RGI subregion (e.g., "01-05" = Alaska, St. Elias)

# Filter by glacier area
rgi_area_filt = rgi_r1r2.filter(ee.Filter.gt('area_km2', 10)) # area km2
rgi_area_filt = rgi_area_filt.filter(ee.Filter.lt('area_km2', 100)) # area km2 (pixel scaling applies for >3000km^2 and issues arise at >10000km^2)

# Get the list of RGI IDs
id_list = rgi_area_filt.aggregate_array('rgi_id')
id_list = id_list.getInfo()
# print(id_list[52]) #use this to double-check you are restarting on the right index
print('Number of glaciers selected:', len(id_list))

RGI2000-v7.0-G-01-14773
Number of glaciers selected: 96
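
When a batch run is interrupted, the index to restart from can be looked up from the last glacier ID that finished rather than counted by hand. The helper below is a hypothetical sketch, not part of the pipeline, and the short ID list is a toy stand-in for the real `id_list`, so the positions shown are not the real ones.

```python
def resume_index(id_list, last_completed_id):
    """Return the index of the glacier that follows the last completed one."""
    # list.index raises ValueError if the ID is not in the list,
    # which guards against restarting from a mistyped ID
    return id_list.index(last_completed_id) + 1


# Toy list standing in for the real id_list
toy_ids = ['RGI2000-v7.0-G-01-11350',
           'RGI2000-v7.0-G-01-14773',
           'RGI2000-v7.0-G-01-15706']

start = resume_index(toy_ids, 'RGI2000-v7.0-G-01-14773')
print(start, toy_ids[start])
```

The returned value can then be used as the lower bound of the slice in the loop over `id_list`.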


In [None]:
# Iterate over RGI IDs
for i in range(len(id_list))[53:55]: # slice selects which glaciers to run; change to [0:1] to test on a single glacier
    rgi_id = id_list[i]
    print("Glacier #",i)
    print("Glacier ID used for output file names:", rgi_id)

    # grab glacier area of interest
    aoi = rgi.filter(ee.Filter.eq('rgi_id', rgi_id))
    aoi = aoi.geometry()
    aoi_area = aoi.area().getInfo() # save area [m^2] for splitting date ranges later
    print(f"Glacier area = {aoi_area/1e6} km2")

    # Query GEE for DEM
    dem = utils.query_gee_for_dem(aoi) 

    # Run pipeline for each dataset
    for dataset in ['Sentinel-2_TOA','Sentinel-2_SR','Landsat']: # all three datasets; edit this list to run a subset
        utils.run_classification_pipeline(aoi, aoi_area, dem, dataset, date_start, date_end, month_start, month_end, 
                                          min_aoi_coverage, mask_clouds, out_folder, rgi_id, scale=None, verbose=False)

    print('...moving on... \n')
    

Glacier # 53
Glacier ID used for output file names: RGI2000-v7.0-G-01-15706
Glacier area = 56.75453740224053 km2

Querying GEE for DEM
ArcticDEM coverage = 100 %
Using ArcticDEM Mosaic
AOI area < 500 km2 — splitting date range by month.
Number of date ranges = 58
Exporting snow cover statistics to glacier-snow-cover_exports Google Drive folder with file naming convention: RGI2000-v7.0-G-01-15706_Sentinel-2_TOA_snow_cover_stats_DATE-START_DATE-END.csv
To monitor export tasks, see your Google Cloud Console or GEE Task Manager: https://code.earthengine.google.com/tasks
('2016-05-01', '2016-05-31')
...current queue length 2989
('2016-06-01', '2016-06-30')
...current queue length 2990
('2016-07-01', '2016-07-31')
...current queue length 2989
('2016-08-01', '2016-08-31')
...current queue length 2990
('2016-09-01', '2016-09-30')
...current queue length 2991
('2016-10-01', '2016-10-31')
...current queue length 2992
('2017-05-01', '2017-05-31')
...current queue length 2993
('2017-06-01', '2017-