# Classify snow-covered area (SCA): full pipeline

Rainey Aberle

Department of Geosciences, Boise State University

2022

### Requirements:
- Planet account with access to PlanetScope imagery through the NASA CSDA contract. Sign up __[here](https://www.planet.com/markets/nasa/)__.
- Area of Interest (AOI) shapefile: where snow will be classified in each image. 
- PlanetScope 4-band image collection over the AOI. Download images using `planetAPI_image_download.ipynb` or through [PlanetExplorer](https://www.planet.com/explorer/). 
- Google Earth Engine (GEE) account: used to pull DEM over the AOI. Sign up for a free account [here](). 


### Outline:
__0. Setup__ paths in directory, AOI file location - _modify this section!_

__1. Mosaic images__ captured in the same hour

__2. Adjust image radiometry__ using median surface reflectance at the top percentile of elevations

__3. Classify SCA__ and use the snow elevations distribution to estimate the seasonal snowline

----------

### 0. Setup

#### Define paths in directory, image file extensions, and desired settings. 
Modify lines located within the following:

`#### MODIFY HERE ####`  

`#####################`

In [None]:
##### MODIFY HERE #####
# -----Path to planet-snow
base_path = '/Users/raineyaberle/Research/PhD/Planet_snow_cover/planet-snow/'

# -----Paths in directory
site_name = 'Gulkana'
# path to images
im_path = base_path+'../study-sites/'+site_name+'/imagery/PlanetScope/2016-2021/'
# path to AOI including the name of the shapefile
AOI_fn = base_path+'../../GIS_data/RGI_outlines/'+site_name+'_RGI.shp'
# path for output images
out_path = im_path+'../'
# path for output figures
figures_out_path = im_path+'../../../figures/'

# -----Image file extensions (for mosaicing)
ext = 'SR_clip'

# -----Determine settings
plot_results = True # = True to plot figures of results for each image where applicable
skip_clipped = False # = True to skip images where bands appear "clipped", i.e. max blue SR < 0.8
crop_to_AOI = True # = True to crop images to AOI before calculating SCA
save_outputs = True # = True to save SCA images to file
save_figures = True # = True to save SCA output figures to file

#######################

# -----Import packages
import os
import numpy as np
import glob
import subprocess
from osgeo import gdal
import matplotlib.dates as mdates
from matplotlib.dates import DateFormatter
from matplotlib.patches import Rectangle
from matplotlib import pyplot as plt, dates
import rasterio as rio
import rasterio.features
from rasterio.mask import mask
from rasterio.plot import show
from shapely.geometry import Polygon, shape
import shapely.geometry
from scipy.interpolate import interp2d
from scipy import stats
import pandas as pd
import geopandas as gpd
import sys
import ee
import fiona
import pickle

# -----Add path to functions
sys.path.insert(1, base_path+'functions/')
import ps_pipeline_utils as f

# -----Load AOI as geopandas.GeoDataFrame
AOI = gpd.read_file(AOI_fn)

# -----Set paths for output files
im_mosaic_path = out_path+'mosaics/'
im_adj_path = out_path+'adjusted-filtered/'


#### Authenticate and initialize Google Earth Engine (GEE). 

__Note:__ The first time you run the following cell, you will be asked to authenticate your GEE account for use in this notebook. This will send you to an external web page, where you will walk through the GEE authentication workflow and then paste an authentication code back into this notebook when prompted. 

In [None]:
try:
    ee.Initialize()
except Exception: # no stored credentials yet: authenticate, then retry
    ee.Authenticate()
    ee.Initialize()

### 1. Mosaic images by date

Mosaic all images captured within the same hour to increase area coverage of each image over the AOI. Images captured in different hours are more likely to have drastic variations in illumination. Adapted from code developed by Jukes Liu. 

If you have `plot_results` set to `True`, I suggest using the output figures to filter out images, such as those that are completely saturated (some small regions of saturation are okay), barely cover the AOI, or have thick cloud cover over a large portion of the AOI. Place unusable images into a separate folder (e.g., `unusable_images`) or otherwise remove them from the `mosaics/` folder before proceeding to step __2.__ below. This takes extra time, but will help to improve the resulting SCA and snow line elevation time series. 

Note that images with no data over the AOI are skipped in this step. Issues with illumination or radiometry will be further filtered and adjusted in the next step.  
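The grouping behind this step can be sketched as follows. This is a minimal illustration, not the implementation in `f.mosaic_ims_by_date`: it assumes that file names begin with `YYYYMMDD_HHMMSS` (consistent with how datetimes are parsed from file names later in this notebook), and both `group_by_capture_hour` and the example file names are hypothetical.

```python
from collections import defaultdict

def group_by_capture_hour(filenames):
    """Group file names by their leading 'YYYYMMDD_HH' prefix (date and hour)."""
    groups = defaultdict(list)
    for fn in sorted(filenames):
        groups[fn[:11]].append(fn)  # first 11 characters: date, underscore, hour
    return dict(groups)

# Hypothetical file names, for illustration only
example_fns = [
    '20210801_203915_0f4a_3B_AnalyticMS_SR_clip.tif',
    '20210801_203917_0f4a_3B_AnalyticMS_SR_clip.tif',
    '20210801_213002_1040_3B_AnalyticMS_SR_clip.tif',
    '20210805_203811_1003_3B_AnalyticMS_SR_clip.tif',
]
hour_groups = group_by_capture_hour(example_fns)
for key, fns in hour_groups.items():
    print(key, len(fns), 'image(s)')
```

Each group would then be merged into a single mosaic, so two scenes captured seconds apart end up in one output file while scenes from different hours stay separate.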

In [None]:
# -----Create image mosaics output directory if it does not already exist
if not os.path.isdir(im_mosaic_path):
    os.mkdir(im_mosaic_path)
    print(im_mosaic_path+' directory made')

# -----Load file names with proper extension
os.chdir(im_path)
im_fns = glob.glob('*'+ext+'*')
im_fns.sort() # sort chronologically

# ----Mosaic images by date
f.mosaic_ims_by_date(im_path, im_fns, ext, im_mosaic_path, AOI, plot_results)

### 2. Adjust image radiometry

Here, we mitigate issues related to varying illumination and general radiometry. First, we create one or more polygons representing the area within the AOI that is likely covered with snow year-round, using the upper 30th percentile of elevations. The polygon(s) are then used to stretch the image, assuming that the median surface reflectance within the polygon(s) equals that predicted for snow and that the darkest point in the image has a surface reflectance of 0. Images with no real data values within the AOI or within the polygon(s) are skipped. 
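The stretch described above can be sketched for a single band as a linear rescaling. This is a rough illustration under stated assumptions, not the code in `f.adjust_image_radiometry`: it assumes the band and DEM share one grid, it reads "upper 30th percentile" as the highest 30% of elevations within the AOI, and the function `stretch_to_snow`, the synthetic arrays, and the snow reflectance of 0.9 are all hypothetical placeholders.

```python
import numpy as np

def stretch_to_snow(band, dem, aoi_mask, sr_snow, top_fraction=0.3):
    """Rescale one band so the darkest pixel maps to 0 and the median over
    the highest-elevation part of the AOI maps to sr_snow."""
    # Elevation threshold above which pixels fall in the top fraction of the AOI
    elev_thresh = np.nanpercentile(dem[aoi_mask], 100 * (1 - top_fraction))
    high_mask = aoi_mask & (dem >= elev_thresh) & np.isfinite(band)
    if not high_mask.any():
        return None  # no real data in the high-elevation area: skip this image
    dark = np.nanmin(band)               # darkest point in the image
    bright = np.median(band[high_mask])  # median reflectance over assumed snow
    if bright <= dark:
        return None  # degenerate case: nothing to stretch
    return (band - dark) * sr_snow / (bright - dark)

# Synthetic example: reflectance increases with elevation
dem = np.arange(100, dtype=float).reshape(10, 10)
aoi_mask = np.ones_like(dem, dtype=bool)
band = 0.1 + 0.005 * dem
band_adj = stretch_to_snow(band, dem, aoi_mask, sr_snow=0.9)  # 0.9 is a placeholder
```

Returning `None` mirrors the rule that images with no real data in the polygon(s) are skipped.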

In [None]:
# -----Query GEE for DEM
os.chdir(im_mosaic_path) 
im_mosaic_fns = glob.glob('*.tif') # read mosaicked image filenames
im_mosaic_fns.sort() # sort chronologically
DEM, DEM_x, DEM_y, AOI_UTM = f.query_GEE_for_DEM(AOI, im_mosaic_path, im_mosaic_fns)

# -----Create a polygon(s) of the top 30th percentile elevations within the AOI
polygon, im_fn, im, r, g, b, im_x, im_y = f.create_top_elev_AOI_poly(AOI_UTM, im_mosaic_path, im_mosaic_fns, DEM, DEM_x, DEM_y)
# plot
if plot_results:
    fig = plt.figure(figsize=(8,8))
    plt.imshow(np.dstack([r, g, b]), extent=(np.min(im_x), np.max(im_x), np.min(im_y), np.max(im_y)))
    plt.plot(*AOI_UTM.geometry[0].exterior.xy, color='white', linewidth=2, label='AOI')
    count=0 # count used to only display one polygon in legend
    for geom in polygon.geoms:
        xs, ys = geom.exterior.xy
        if count==0:
            plt.plot([x for x in xs], [y for y in ys], color='orange', label='polygon(s)')
        else:
            plt.plot([x for x in xs], [y for y in ys], color='orange', label='_nolegend_')
        count+=1            
    plt.xlabel('Easting [m]')
    plt.ylabel('Northing [m]')
    plt.title(im_fn)
    fig.legend(loc='upper right')
    fig.tight_layout()
    plt.show()
    
# -----Loop through images
im_adj_fns = [] # list of output adjusted images
for im_mosaic_fn in im_mosaic_fns:
    
    # load image
    print(im_mosaic_fn)
    im = rio.open(im_mosaic_fn)
    
    # adjust radiometry
    im_adj_fn = f.adjust_image_radiometry(im, im_mosaic_fn, im_mosaic_path, polygon, im_adj_path, skip_clipped, plot_results)
    im_adj_fns += [im_adj_fn] # append image file name to list
                     
    print('----------')
    print(' ')

### 3. Classify SCA
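Once an image has been classified, the elevations of the snow-covered pixels can be summarized to characterize the snow elevation distribution. The sketch below shows one way to compute such a summary. It assumes that the DEM and the boolean snow mask are already on the same grid, which the pipeline's own `f.determine_snow_elevs` may handle differently; `snow_elevation_stats` and the synthetic arrays are hypothetical.

```python
import numpy as np

def snow_elevation_stats(dem, snow_mask):
    """Summarize the elevations of pixels classified as snow."""
    snow_elev = dem[snow_mask & np.isfinite(dem)]
    if snow_elev.size == 0:
        return None  # no snow-covered pixels with valid elevations
    p10, med, p90 = np.percentile(snow_elev, [10, 50, 90])
    return {
        'snow_elev_min': float(snow_elev.min()),
        'snow_elev_max': float(snow_elev.max()),
        'snow_elev_median': float(med),
        'snow_elev_10th_perc': float(p10),
        'snow_elev_90th_perc': float(p90),
    }

# Synthetic example: snow covers every pixel at or above 1500 m
dem = np.arange(1000.0, 2000.0, 10.0).reshape(10, 10)
snow_mask = dem >= 1500
snow_stats = snow_elevation_stats(dem, snow_mask)
print(snow_stats)
```

The dictionary keys match the column names of the stats table built in the cell below, so a result like this could be written directly into one row.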

In [None]:
# -----Start timer
import time
t1 = time.time()

# -----Load image classifier and feature columns
clf_fn = base_path+'inputs-outputs/all_sites_classifier.sav'
with open(clf_fn, 'rb') as clf_file:
    clf = pickle.load(clf_file)
feature_cols_fn = base_path+'inputs-outputs/all_sites_classifier_feature_cols.pkl'
with open(feature_cols_fn, 'rb') as feature_cols_file:
    feature_cols = pickle.load(feature_cols_file)

# -----Initialize image dates
im_dts = [] # image datetimes

# -----Crop images if previously selected
if crop_to_AOI:
    
    # crop images to the AOI
    im_cropped_path = f.crop_images_to_AOI(im_path, im_fns, AOI_UTM)
    # grab cropped image names
    os.chdir(im_cropped_path) # change directory
    im_fns_crop = glob.glob('*_crop.tif')
    im_fns_crop.sort() # sort file names by date
    im_fns_loop = im_fns_crop # image file names to use in loop
    
else:
    
    os.chdir(im_adj_path)
    im_fns_loop = sorted(glob.glob('*.tif')) # adjusted image file names to use in loop

# -----Make directory for output figures (if it does not already exist)
if save_figures and not os.path.exists(figures_out_path):
    os.makedirs(figures_out_path) # also creates missing parent directories
    print('made directory for output figures:' + figures_out_path)
        
# -----Create figure for snow elevations box plot
fig2, ax = plt.subplots(figsize=(16,8))
ax.set_ylabel('Snow elevations [m a.s.l.]')
ax.xaxis.set_major_formatter(dates.DateFormatter('%Y'))

# -----Initialize DataFrame to hold stats summary
df = pd.DataFrame(columns=('site_name', 'datetime', 'im_elev_min', 'im_elev_max', 'snow_elev_min', 'snow_elev_max', 
                           'snow_elev_median', 'snow_elev_10th_perc', 'snow_elev_90th_perc'))

# -----Loop through images
im_classified_path = im_path+'classified/'
i=0 # loop counter
for im_fn in im_fns_loop:

    # extract datetime from image name
    im_dt = np.datetime64(im_fn[0:4] + '-' + im_fn[4:6] + '-' + im_fn[6:8]
                          + 'T' + im_fn[9:11] + ':00:00')
    im_dts = im_dts + [im_dt]

    # open image
    im = rio.open(im_fn)

    # classify snow
    im_x, im_y, im_classified = f.classify_image(im, im_fn, clf, feature_cols, im_classified_path)   
    
    # determine snow elevations
    plot_output = True
    im_elev_min, im_elev_max, snow_elev, fig = f.determine_snow_elevs(DEM, DEM_x, DEM_y, im, im_classified, im_dt, im_x, im_y, plot_output)
    
    # calculate and plot stats
    p10, p90 = np.percentile(snow_elev, [10, 90]) # 10th and 90th percentiles
    med = np.median(snow_elev)
    ax.add_patch(Rectangle((im_dt-np.timedelta64(1, 'D'), p10), 
                           width=2*np.timedelta64(1, 'D'), height=p90-p10, color='blue'))
    ax.scatter([im_dt, im_dt], [np.min(snow_elev), np.max(snow_elev)], color='blue', s=10)
    ax.scatter(im_dt, med, facecolor='white', edgecolor='black', s=20)

    # save stats in pandas DataFrame
    df_row = pd.DataFrame({'site_name':site_name, 'datetime':im_dt, 'im_elev_min':im_elev_min, 'im_elev_max':im_elev_max, 
                           'snow_elev_min':np.min(snow_elev), 'snow_elev_max':np.max(snow_elev), 'snow_elev_median':med,  
                           'snow_elev_10th_perc':p10, 'snow_elev_90th_perc':p90}, index=[0])
    df = pd.concat([df, df_row], ignore_index=True)
    
    # save figure
    if save_figures==True:
        fig.savefig(figures_out_path+im_fn[0:15]+'_PlanetScope_SCA.png', dpi=200, facecolor='white', edgecolor='none')
        print('figure saved to file')

    i+=1 # increase loop counter

# -----Save figure and data table
if plot_output and save_figures:
    fig2.savefig(figures_out_path+site_name+'_snow_elevs.png', dpi=200, facecolor='white', edgecolor='none')
    print('snow elevations figure saved to file')
if save_outputs:
    df.to_csv(path_or_buf=out_path+site_name+'_snow_elevs_stats.csv', sep=',', na_rep='', header=True)
    print('data table saved to file')

# -----Stop timer
print('Time elapsed: '+str(np.round((time.time()-t1)/60, 2))+' minutes')

# -----Display complete figure 2
fig2