# Classify snow-covered area (SCA) in PlanetScope imagery: full pipeline

Rainey Aberle

Department of Geosciences, Boise State University

2022

### Requirements:
- Planet account with access to PlanetScope imagery through the NASA CSDA contract. Sign up __[here](https://www.planet.com/markets/nasa/)__.
- Area of Interest (AOI) shapefile: where snow will be classified in each image. 
- PlanetScope 4-band image collection over the AOI. Download images using `planetAPI_image_download.ipynb` or through __[PlanetExplorer](https://www.planet.com/explorer/)__. 
- Google Earth Engine (GEE) account: used to pull DEM over the AOI. Sign up for a free account __[here](https://earthengine.google.com/new_signup/)__. 


### Outline:
__0. Setup__ paths in directory, AOI file location - _modify this section!_

__1. Mask image pixels__ containing clouds, shadows, and heavy haze

__2. Mosaic images__ captured in the same hour

__3. Adjust image radiometry__ using median surface reflectance at the top or bottom percentile of elevations

__4. Classify SCA__ using the pre-trained classifier

__5. Estimate snow line__ and snow line elevations, using the snow elevations distribution to estimate the seasonal snow line

__6. Fit Fourier series model__ to the snow line time series

----------

## 0. Setup

#### Define paths in directory, image file extensions, and desired settings. 
Modify lines located within the following:

`#### MODIFY HERE ####`  

`#####################`

In [None]:
##### MODIFY HERE #####

# -----Paths in directory
site_name = 'SouthCascade'
# path to snow-cover-mapping
base_path = '/Users/raineyaberle/Research/PhD/snow_cover_mapping/snow-cover-mapping/'
# path to images
im_path = base_path + '../study-sites/' + site_name + '/imagery/PlanetScope/2016-2022/'
# path to AOI including the name of the shapefile
AOI_fn = im_path + '../../../glacier_outlines/' + site_name + '_USGS_*.shp'
# path to DEM including the name of the tif file
# Note: set DEM_fn=None if you want to use the ASTER GDEM on Google Earth Engine
DEM_fn = im_path + '../../../DEMs/' + site_name + '*_DEM_filled.tif'
# path for output images
out_path = im_path + '../'
# path for output figures
figures_out_path = im_path + '../../../figures/'

# -----Determine settings
plot_results = True # = True to plot figures of results for each image where applicable
skip_clipped = False # = True to skip images where bands appear "clipped", i.e. max blue SR < 0.8
crop_to_AOI = True # = True to crop images to AOI before calculating SCA
save_outputs = True # = True to save SCA images to file
save_figures = True # = True to save SCA output figures to file

#######################

# -----Import packages
import os
import numpy as np
import glob
import subprocess
from matplotlib.patches import Rectangle
from matplotlib import pyplot as plt, dates
import rasterio as rio
import xarray as xr
import rioxarray as rxr
from scipy import stats
import pandas as pd
import geopandas as gpd
import sys
import time
import ee
import pickle
from time import mktime

# -----Add path to functions
sys.path.insert(1, base_path+'functions/')
import pipeline_utils_PlanetScope as pf

# -----Set paths for output files
im_mask_path = out_path + 'masked/'
im_mosaic_path = out_path + 'mosaics/'
im_adj_path = out_path + 'adjusted/'
im_classified_path = out_path + 'classified/'
snowlines_path = out_path + 'snowlines/'

# -----Load AOI as gpd.GeoDataFrame
AOI_fn = glob.glob(AOI_fn)[0]
AOI = gpd.read_file(AOI_fn)
    
# -----Load DEM as Xarray DataSet
if DEM_fn is None:
    
    # Authenticate and initialize Google Earth Engine
    # Note: The first time you run this, you will be asked to authenticate your GEE account 
    # for use in this notebook. This will send you to an external web page, where you will 
    # walk through the GEE authentication workflow and copy an authentication code back 
    # in this notebook when prompted. 
    try:
        ee.Initialize()
    except Exception: # initialization fails until the account has been authenticated
        ee.Authenticate()
        ee.Initialize()
    # query GEE for DEM
    DEM, AOI_UTM = pf.query_GEE_for_DEM(AOI)
    
else:
    
    # reproject the AOI to WGS to solve for the optimal UTM zone
    AOI_WGS = AOI.to_crs(4326)
    AOI_WGS_centroid = [AOI_WGS.geometry[0].centroid.xy[0][0],
                        AOI_WGS.geometry[0].centroid.xy[1][0]]
    epsg_UTM = pf.convert_wgs_to_utm(AOI_WGS_centroid[0], AOI_WGS_centroid[1])
    # reproject AOI to UTM
    AOI_UTM = AOI.to_crs('EPSG:' + str(epsg_UTM))
    # load DEM as xarray DataSet
    DEM_fn = glob.glob(DEM_fn)[0]
    DEM_rio = rio.open(DEM_fn) # open using rasterio to access the transform
    DEM = xr.open_dataset(DEM_fn)
    DEM = DEM.rename({'band_data': 'elevation'})
    # reproject the DEM to the optimal UTM zone
    DEM = DEM.rio.reproject('EPSG:' + str(epsg_UTM))
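
The UTM zone lookup used above can be sketched with the standard zone formula. This is only an illustration of the idea, not the actual body of `pf.convert_wgs_to_utm`: the function name `wgs_to_utm_epsg` is hypothetical, and the sketch ignores the special zone exceptions around Norway and Svalbard.

```python
import math

def wgs_to_utm_epsg(lon: float, lat: float) -> str:
    """Return the EPSG code (as a string) of the WGS84 UTM zone containing (lon, lat).

    Hypothetical helper for illustration; assumes lon/lat are in decimal degrees.
    """
    # UTM zones are 6 degrees wide, numbered 1-60 starting at 180 W
    zone = int(math.floor((lon + 180) / 6) % 60) + 1
    # EPSG 326xx = northern hemisphere, 327xx = southern hemisphere
    prefix = '326' if lat >= 0 else '327'
    return prefix + str(zone).zfill(2)

# approximate coordinates of South Cascade Glacier, Washington
print('EPSG:' + wgs_to_utm_epsg(-121.06, 48.36))
```

The returned string can be prefixed with `'EPSG:'` before being passed to a reprojection call, as done in the cell above.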

## 1. Mask image pixels with clouds, shadows, and heavy haze using associated Usable Data Mask (`udm`) files.  

In [None]:
# -----Read surface reflectance file names
os.chdir(im_path)
im_fns = glob.glob('*SR*.tif')
im_fns = sorted(im_fns) # sort chronologically

# ----Mask images
for im_fn in im_fns:
    
    print(im_fn)
    pf.mask_im_pixels(im_path, im_fn, im_mask_path, save_outputs, plot_results)
    print(' ')

## 2. Mosaic images by date

Mosaic all images captured within the same hour to increase area coverage of each image over the AOI. Images captured in different hours are more likely to have drastic variations in illumination. Adapted from code developed by [Jukes Liu](https://github.com/julialiu18). 

Note that images with no data over the AOI are skipped in this step. Issues with illumination or radiometry will be further filtered and adjusted in the next step.  
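
The grouping behind this step can be sketched as follows, assuming (as the later cells in this notebook do) that each file name starts with `YYYYMMDD_HH`. The helper `group_by_capture_hour` and the example file names are hypothetical; the actual mosaicking is done by `pf.mosaic_ims_by_date`.

```python
from collections import defaultdict

def group_by_capture_hour(fns):
    """Group image file names by their leading 'YYYYMMDD_HH' date-hour prefix.

    Hypothetical helper: assumes the first 11 characters of each name
    encode the capture date and hour.
    """
    groups = defaultdict(list)
    for fn in sorted(fns):
        groups[fn[0:11]].append(fn)
    return dict(groups)

# made-up file names for illustration only
example_fns = ['20210615_183012_1009_mask.tif',
               '20210615_183014_1009_mask.tif',
               '20210615_201530_0f2a_mask.tif']
for date_hour, fns in group_by_capture_hour(example_fns).items():
    print(date_hour, '->', len(fns), 'image(s) to mosaic')
```

Each group would then be merged into a single mosaic named after its date-hour prefix.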

In [None]:
# -----Read masked image file names
os.chdir(im_mask_path)
im_mask_fns = glob.glob('*_mask.tif')
im_mask_fns = sorted(im_mask_fns) # sort chronologically

# ----Mosaic images by date
pf.mosaic_ims_by_date(im_mask_path, im_mask_fns, im_mosaic_path, AOI_UTM, plot_results)

## 3. Adjust image radiometry

Mitigate issues related to varying illumination and general radiometry by first creating polygon(s) representing an area within the AOI that is likely covered with snow year-round, using the upper 30th percentile of elevations. The polygon(s) are then used to stretch each image, assuming that the median surface reflectance within the polygon(s) equals the value predicted for snow and that the darkest point in the image has a surface reflectance of 0. Images with no real data values within the AOI or within the polygon(s) are skipped. 
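
A minimal sketch of the stretch described above, for a single band. It is not the implementation in `pf.adjust_image_radiometry`: the function name `stretch_band`, the boolean mask `in_polygon`, and the example snow reflectance of 0.9 are all assumptions made for illustration.

```python
import numpy as np

def stretch_band(band, in_polygon, snow_sr):
    """Linearly stretch one band so that its darkest pixel maps to 0 and the
    median reflectance inside the polygon(s) maps to snow_sr.

    Hypothetical helper. Returns None when the image should be skipped.
    """
    finite = np.isfinite(band)
    poly_vals = band[in_polygon & finite]
    if poly_vals.size == 0:
        return None  # no real data in the polygon(s): skip this image
    dark = band[finite].min()      # darkest point in the image
    bright = np.median(poly_vals)  # median reflectance over the polygon(s)
    if bright <= dark:
        return None  # degenerate image: nothing to stretch
    return (band - dark) * (snow_sr / (bright - dark))

# toy 2x2 band; the bottom row stands in for the high-elevation polygon
band = np.array([[0.1, 0.2], [0.5, 0.7]])
in_polygon = np.array([[False, False], [True, True]])
adjusted = stretch_band(band, in_polygon, snow_sr=0.9)  # 0.9 is an assumed value
print(adjusted)
```

After the stretch, the darkest pixel is 0 and the polygon median equals the assumed snow reflectance, so images captured under different illumination become comparable.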

In [None]:
# -----Read mosaicked image file names
os.chdir(im_mosaic_path)
im_mosaic_fns = glob.glob('*.tif')
im_mosaic_fns = sorted(im_mosaic_fns)

# -----Create polygon(s) of the top and bottom elevation percentiles within the AOI
polygon_top, polygon_bottom, im_mosaic_fn, im_mosaic = pf.create_AOI_elev_polys(AOI_UTM, im_mosaic_path, im_mosaic_fns, DEM)
# plot
if plot_results:
    fig, ax = plt.subplots(figsize=(8,8))
    ax.imshow(np.dstack([im_mosaic.data[2], im_mosaic.data[1], im_mosaic.data[0]]), 
               extent=(np.min(im_mosaic.x), np.max(im_mosaic.x), 
                       np.min(im_mosaic.y), np.max(im_mosaic.y)))
    AOI_UTM.plot(ax=ax, facecolor='none', edgecolor='black', linewidth=2, label='AOI')
    for count, geom in enumerate(polygon_top.geoms):
        xs, ys = geom.exterior.xy
        if count==0:
            ax.plot([x for x in xs], [y for y in ys], color='c', label='top polygon(s)')
        else:
            ax.plot([x for x in xs], [y for y in ys], color='c', label='_nolegend_')
    for count, geom in enumerate(polygon_bottom.geoms):
        xs, ys = geom.exterior.xy
        if count==0:
            ax.plot([x for x in xs], [y for y in ys], color='orange', label='bottom polygon(s)')
        else:
            ax.plot([x for x in xs], [y for y in ys], color='orange', label='_nolegend_')
    ax.set_xlabel('Easting [m]')
    ax.set_ylabel('Northing [m]')
    ax.set_title(im_mosaic_fn)
    fig.legend(loc='upper right')
    fig.tight_layout()
    plt.show()
    
# -----Loop through images
for im_mosaic_fn in im_mosaic_fns:
    
    # load image
    print(im_mosaic_fn)
    # adjust radiometry
    im_adj_fn, im_adj_method = pf.adjust_image_radiometry(im_mosaic_fn, im_mosaic_path, polygon_top, polygon_bottom, AOI_UTM, im_adj_path, skip_clipped, plot_results)
    print('image adjustment method = ' + im_adj_method)
    print('----------')
    print(' ')

## 4. Classify images

All adjusted images will be classified using the pre-trained classifier into the following classes:
- 1 = Snow
- 2 = Shadowed snow
- 3 = Ice
- 4 = Bare ground
- 5 = Water

The resulting classified image collection will be cropped to the AOI if `crop_to_AOI = True` and saved to the `im_classified_path` folder if `save_outputs = True`. 
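
As an illustration of how the class codes above might be summarized, the sketch below computes the fraction of valid pixels in each class. The helper names are hypothetical, and counting both "Snow" and "Shadowed snow" toward the snow-covered fraction is an assumption, not something the pipeline is confirmed to do.

```python
import numpy as np

CLASS_NAMES = {1: 'Snow', 2: 'Shadowed snow', 3: 'Ice', 4: 'Bare ground', 5: 'Water'}

def class_fractions(classified):
    """Fraction of valid (non-NaN) pixels assigned to each class."""
    valid = classified[np.isfinite(classified)]
    if valid.size == 0:
        return {name: float('nan') for name in CLASS_NAMES.values()}
    return {name: float(np.mean(valid == code)) for code, name in CLASS_NAMES.items()}

def snow_covered_fraction(classified):
    """Snow-covered fraction, assuming classes 1 and 2 both count as snow."""
    fractions = class_fractions(classified)
    return fractions['Snow'] + fractions['Shadowed snow']

# toy classified image with one no-data pixel
classified = np.array([[1, 2, 3], [4, 5, np.nan]])
print(class_fractions(classified))
print('snow-covered fraction:', snow_covered_fraction(classified))
```

No-data pixels are expected to be NaN here, which matches how the plotting code below masks the `-9999` fill value before use.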

In [None]:
# -----Read adjusted image file names
os.chdir(im_adj_path)
im_adj_fns = glob.glob('*.tif')
im_adj_fns = sorted(im_adj_fns)

# start timer
t1 = time.time()

# -----Load image classifier and feature columns
clf_fn = base_path+'inputs-outputs/PS_classifier_all_sites.sav'
with open(clf_fn, 'rb') as f:
    clf = pickle.load(f)
feature_cols_fn = base_path+'inputs-outputs/PS_feature_cols.pkl'
with open(feature_cols_fn, 'rb') as f:
    feature_cols = pickle.load(f)

# -----Loop through images
# image datetimes
im_dts = [] 
# DataFrame to hold stats summary
df = pd.DataFrame(columns=('site_name', 'datetime', 'im_elev_min', 'im_elev_max', 'snow_elev_min', 'snow_elev_max', 
                           'snow_elev_median', 'snow_elev_10th_perc', 'snow_elev_90th_perc'))
for im_adj_fn in im_adj_fns:

    print(im_adj_fn)
    
    # extract datetime from image name
    im_dt = np.datetime64(im_adj_fn[0:4] + '-' + im_adj_fn[4:6] + '-' + im_adj_fn[6:8]
                          + 'T' + im_adj_fn[9:11] + ':00:00')
    im_dts.append(im_dt)

    # classify snow
    try:
        im_classified_fn, im_adj = pf.classify_image(im_adj_fn, im_adj_path, 
                                                     clf, feature_cols, crop_to_AOI, AOI_UTM, im_classified_path)  
    except Exception as e:
        print('error occurred during classification, skipping... (' + str(e) + ')')
        continue
        
    # plot
    if plot_results:
        im_classified = rxr.open_rasterio(im_classified_path + im_classified_fn)
        im_classified = im_classified.where(im_classified!=-9999)
        fig, ax, sl_points_AOI = pf.plot_im_classified_histogram_contour(im_adj, im_classified, DEM, AOI_UTM, contour=None)
        ax[1].legend(loc='best')
        plt.show()
        
    print(' ')
        
# -----Stop timer
print('Time elapsed: '+str(np.round((time.time()-t1)/60, 2))+' minutes')


## 5. Estimate seasonal snow line and snow line elevations

In [None]:
# -----Read classified image file names
os.chdir(im_classified_path)
im_classified_fns = glob.glob('*.tif')
im_classified_fns = sorted(im_classified_fns)

# -----Create directories for outputs if they do not exist
# snowlines folder
if save_outputs and not os.path.exists(snowlines_path):
    os.makedirs(snowlines_path)
    print('made directory for output snowlines:' + snowlines_path)
# figures folder
if save_figures and not os.path.exists(figures_out_path):
    os.makedirs(figures_out_path)
    print('made directory for output figures:' + figures_out_path)

# -----Initialize variables
results_df = pd.DataFrame(columns=['study_site', 'datetime', 'snowlines_coords', 'snowlines_elevs', 'snowlines_elevs_median'])

# -----Loop through classified image filenames
for im_classified_fn in im_classified_fns:
    
    # extract datetime from image file name
    im_date = im_classified_fn[0:11]
    im_dt = np.datetime64(im_classified_fn[0:4] + '-' + im_classified_fn[4:6] + '-' + im_classified_fn[6:8]
                          + 'T' + im_classified_fn[9:11] + ':00:00')
    
    print(im_date)
    
    # load adjusted image from the same date
    os.chdir(im_adj_path)
    im_adj_fn = glob.glob(im_date + '*.tif')[0]
    
    # estimate snow line
    # try:
    fig, ax, sl_est, sl_est_elev = pf.delineate_snow_line(im_adj_fn, im_adj_path, im_classified_fn, im_classified_path, AOI_UTM, DEM)
    plt.show()
    
    # calculate median snow line elevation
    sl_est_elev_median = np.nanmedian(sl_est_elev)

    # compile results in df
    result_df = pd.DataFrame({'study_site': site_name, 
                              'datetime': im_dt, 
                              'snowlines_coords': [sl_est], 
                              'snowlines_elevs': [sl_est_elev], 
                              'snowlines_elevs_median': sl_est_elev_median})
    # concatenate to results_df
    results_df = pd.concat([results_df, result_df])
    
    # save figure
    if save_figures:
        fig.savefig(figures_out_path+'PlanetScope_' + im_date + '_SCA.png', dpi=300, facecolor='white', edgecolor='none')
        print('figure saved to file')
    print(' ')
        
#     except:
        
#         print('error in snowline delineation, continuing...')
#         print('----------')
#         print(' ')
#         pass
    
# -----Plot median snow line elevations
fig2, ax2 = plt.subplots(figsize=(10,6))
ax2.plot(results_df['datetime'], results_df['snowlines_elevs_median'], '.b')
ax2.set_xlabel('Image capture date')
ax2.set_ylabel('Median snow line elevation [m]')
ax2.grid()
fig2.suptitle(site_name + ' Glacier snow line elevations')
plt.show()

# -----Save results
if save_outputs:
    snowlines_fn =  site_name + '_snowlines.pkl'
    results_df.to_pickle(snowlines_path + snowlines_fn)
    print('snowline data table saved to file:' + snowlines_path + snowlines_fn)
if save_figures:
    figure_sl_fn = site_name + '_sl_elevs_median.png'
    fig2.savefig(figures_out_path + figure_sl_fn, 
                 facecolor='white', edgecolor='none')
    print('figure saved to file:' + figures_out_path + figure_sl_fn)

### _Optional:_ Compile individual figures into a .gif and delete individual figures

In [None]:
from PIL import Image as PIL_Image
from IPython.display import Image as IPy_Image

# -----Make a .gif of output images
os.chdir(figures_out_path)
fig_fns = glob.glob('PlanetScope_*_SCA.png') # load all output figure file names
fig_fns = sorted(fig_fns) # sort chronologically
# grab figures date range for .gif file name
# file names follow 'PlanetScope_<YYYYMMDD_HH>_SCA.png': strip the 12-character prefix and 8-character suffix
fig_start_date = fig_fns[0][12:-8] # first figure date
fig_end_date = fig_fns[-1][12:-8] # final figure date
frames = [PIL_Image.open(im) for im in fig_fns]
frame_one = frames[0]
gif_fn = ('PlanetScope_' + fig_start_date[0:8] + '_' + fig_end_date[0:8] + '_SCA.gif' )
# append only the remaining frames so the first frame is not shown twice
frame_one.save(figures_out_path + gif_fn, format="GIF", append_images=frames[1:], save_all=True, duration=2000, loop=0)
print('GIF saved to file:' + figures_out_path + gif_fn)

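# -----Display .gif
# an expression is only rendered when it is the last line of a cell, so call display explicitly
from IPython.display import display
display(IPy_Image(filename = figures_out_path + gif_fn))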

# -----Clean up: delete individual figure files
for fig_fn in fig_fns:
    os.remove(fig_fn)
print('Individual figure files deleted.')

## 6. Fit Fourier series model to snowline time series

Adapted from code developed by [Jukes Liu](https://github.com/CryoGARS-Glaciology/Fourier-terminus-models)

This code fits a Fourier series to each time series of median snow line elevations. The optimal number of terms (approximately the number of years in the time series) is chosen using Monte Carlo simulations. 500 Fourier series models are generated for each time series, and the model IQR is calculated and exported to a new csv file.
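
The Monte Carlo idea can be sketched with plain NumPy. Note that this swaps in ordinary linear least squares with a fixed fundamental period, rather than the symbolic fit used by `pf.optimized_fourier_model`. The function names, the 365.25-day period, and the 80% training fraction are all assumptions made for illustration.

```python
import numpy as np

def fourier_design(x, n_terms, period):
    """Design matrix [1, cos(k w x), sin(k w x)] for k = 1..n_terms."""
    w = 2 * np.pi / period
    cols = [np.ones_like(x)]
    for k in range(1, n_terms + 1):
        cols += [np.cos(k * w * x), np.sin(k * w * x)]
    return np.column_stack(cols)

def monte_carlo_fourier(x, y, n_terms, period=365.25, n_mc=500, p_train=0.8, seed=0):
    """Fit n_mc Fourier models to random training subsets and summarize them.

    Returns the evaluation points, the 25th/50th/75th percentiles of the
    modeled values, and the mean absolute test error of each model.
    """
    rng = np.random.default_rng(seed)
    x_mod = np.linspace(x.min(), x.max(), 100)
    y_mod = np.zeros((n_mc, x_mod.size))
    err = np.zeros(n_mc)
    n_train = int(round(p_train * x.size))
    for i in range(n_mc):
        idx = rng.permutation(x.size)  # random train/test split
        train, test = idx[:n_train], idx[n_train:]
        coef, *_ = np.linalg.lstsq(fourier_design(x[train], n_terms, period),
                                   y[train], rcond=None)
        y_pred = fourier_design(x[test], n_terms, period) @ coef
        err[i] = np.mean(np.abs(y[test] - y_pred))
        y_mod[i] = fourier_design(x_mod, n_terms, period) @ coef
    q25, q50, q75 = np.percentile(y_mod, [25, 50, 75], axis=0)
    return x_mod, q25, q50, q75, err

# synthetic, noise-free seasonal signal (days, elevation in m) for illustration
x_demo = np.arange(0, 730, 5.0)
y_demo = 1800 + 100 * np.sin(2 * np.pi * x_demo / 365.25)
x_mod, q25, q50, q75, err = monte_carlo_fourier(x_demo, y_demo, n_terms=1, n_mc=20)
print('median test error [m]:', np.median(err))
```

The band between `q25` and `q75` is the model IQR. With real, noisy snow line elevations it widens wherever the fits disagree, typically in periods with few observations.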

In [None]:
# -----Set up X and Y data for model fitting
# read snowlines file name
snowlines_path = im_path + '../snowlines_old/'
os.chdir(snowlines_path)
snowlines_fn = glob.glob('*_snowlines*.pkl')[0]
snowlines = pd.read_pickle(snowlines_fn)

# grab X and Y data from snowline dates and median elevations
datetimes = pd.to_datetime(snowlines['datetime']).to_numpy()
snowlines_elevs_median = np.array(np.ravel(snowlines['snowlines_elevs_median']), dtype=float)
# remove NaNs
X = datetimes[~np.isnan(snowlines_elevs_median)]
Y = snowlines_elevs_median[~np.isnan(snowlines_elevs_median)]
# convert dates to days after the reference date (set day1 to the first image capture date)
day1 = np.datetime64('2016-05-01')
X = (X - day1) / np.timedelta64(1, 'D')
# grab number of years from snowline datetimes
# used to create the range of model terms for testing
nyears = snowlines['datetime'].iloc[-1].year - snowlines['datetime'].iloc[0].year 

# display data for model fitting
print('Number of years detected in dataset: ' + str(nyears))
fig = plt.figure(figsize=(12, 6))
plt.plot(X, Y, '.k')
plt.xlabel('Days since '+str(day1))
plt.ylabel('Snowline elevation median [m]')
plt.grid()
plt.show()

In [None]:
# -----Generate optimized fourier model for snowline timeseries
X_mod, Y_mod, Y_mod_err = pf.optimized_fourier_model(X, Y, nyears, plot_results=True)


In [None]:
X_mod = np.linspace(X[0], X[-1], num=100)
for i in np.arange(np.shape(Y_mod)[0]):
    plt.plot(X_mod, Y_mod[i,:])
plt.plot(X, Y, '.k')
# plt.ylim(np.nanmin(Y)-100, np.nanmax(Y)+100) 
plt.show()

In [None]:
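# NOTE: this cell relies on names that are not defined anywhere in this notebook
# (fit_best, pTrain, fourier_n, x, y, w, fourier_series_symb, fourier_model); define them before running.
from sklearn.model_selection import train_test_split
from symfit import Fit # assumption: Fit is symfit's fitting class, based on how it is called below

nmc = 500 # number of monte carlo simulations
# initialize coefficients data frame
cols = [val[0] for val in fit_best.params.items()]
X_mod = np.linspace(X[0], X[-1], num=100) # points at which to evaluate the model
Y_mod = np.zeros((nmc, len(X_mod))) # array to hold modeled Y values
Y_mod_err = np.zeros(nmc) # array to hold error associated with each model
print('Conducting Monte Carlo simulations to generate ' + str(nmc) + ' Fourier models...')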
# loop through Monte Carlo simulations
for i in np.arange(0,nmc):

    # split into training and testing data
    X_train, X_test, Y_train, Y_test = train_test_split(X, Y, train_size=pTrain, shuffle=True)

    # fit fourier model to training data
    fit = Fit({y: fourier_series_symb(x, f=w, n=fourier_n)},
                x=X_train, y=Y_train).execute()

    print(str(i)+ ' '+ str(len(fit.params)))

    # apply fourier model to testing data
    Y_pred = fit.model(x=X_test, **fit.params).y

    # calculate mean error
    Y_mod_err[i] = np.sum(np.abs(Y_test - Y_pred)) / len(Y_test)

    # apply the model to the full X data
    c = [c[1] for c in fit.params.items()] # coefficient values
    Y_mod[i,:] = fourier_model(c, X_mod)

# plot results