## **Zonal Stats: Exploring spectra by land cover**

#### Reflection exercises for each group:

1. At the top of the Colab notebook that you share with everyone, please include your responses to the following questions:
  - Choose at least 1 dataset to explore in more detail.
  - What is the projection for this dataset?
  - Where can you find more information on how the data were collected and how to interpret the metadata?
  - Think about what data type each variable is.
  - Is it vector or raster data? What properties exist for each dataset?
  - What resolution are your data?

2. At the top of the Colab notebook, write a short summary detailing the processing steps in the notebook and your results.
  - Although these topics may be far removed from your own interests, how could these steps and analyses help in your own work?

3. OPTIONAL - Expand your script by adding additional processing, analysis, or other data.

As you're working through your exercise, **add code chunks to further document your scripts. Add additional comments to the code itself to clarify complicated processes.**




---



### Grouping pixels by land cover type

In this section, we'll group pixels by land cover type and plot the average spectral signature for each type. We'll use the Earth Engine (GEE) function `reduceRegion` with a grouped mean reducer to get the average value of each band for each land cover type.

This project was inspired by the following NEON Tutorial: [Spectral signatures by NDVI threshold in Python](https://www.neonscience.org/resources/learning-hub/tutorials/calc-refl-ndvi-py). Check it out (or any of their other awesome tutorials!) to learn more!


In [None]:
import ee
import geemap
import geemap.colormaps as cm
ee.Authenticate()
# Replace the empty string with your own Google Cloud project ID
ee.Initialize(project='')
geemap.ee_initialize(project='')

In [None]:
# Import SOAP data
aopSDR = ee.ImageCollection('projects/neon-prod-earthengine/assets/DP3-30006-001')
SOAP_2019_sdr = aopSDR \
  .filterDate('2019-01-01', '2019-12-31') \
  .filterMetadata('NEON_SITE', 'equals', 'SOAP') \
  .first()

aopRGB = ee.ImageCollection('projects/neon-prod-earthengine/assets/DP3-30010-001')
SOAP_2019_rgb = aopRGB \
  .filterDate('2019-01-01', '2019-12-31') \
  .filterMetadata('NEON_SITE', 'equals', 'SOAP') \
  .first()

# Create our region of interest (ROI)
full_mask = ee.Image.constant(1).clip(SOAP_2019_sdr.geometry()) \
                .updateMask(SOAP_2019_sdr.select(['B001']).mask()) \
                .reduceToVectors(maxPixels=1e13, scale=100)
ROI = full_mask.geometry().buffer(-30)

# Import NLCD data, clip to SOAP
NLCD = ee.ImageCollection("USGS/NLCD_RELEASES/2021_REL/NLCD") \
          .filterBounds(ROI) \
          .first() \
          .clip(ROI)
# Set up visualization params
RGB_bands = ['B053', 'B035', 'B019'] # These are the band names for the red, green, blue bands in the SDR data
rgbVis = {'min': 0, 'max': 255, 'gamma': 0.8} # This sets a nice range of values for mapping the RGB data

In [None]:
# View NLCD data for SOAP
m = geemap.Map()
m.addLayer(SOAP_2019_rgb, rgbVis, 'SOAP 2019 RGB')
m.addLayer(NLCD.select('landcover'), None, 'landcover')
m.centerObject(ROI, 10)
m

In [None]:
import numpy as np    # import additional packages for dataframe manipulation, plotting in python
import pandas as pd
import seaborn as sns

# Select the WL_FWHM_B*** band properties (using a regex; the raw string keeps \d from being treated as a Python escape)
properties = SOAP_2019_sdr.toDictionary()
wl_fwhm_dict = properties.select([r'WL_FWHM_B\d{3}'])

# Pull out the wavelength, fwhm values to a list
wl_fwhm_list = wl_fwhm_dict.values()

# Function to pull out the wavelength values only and convert the string to float
def get_wavelengths(x):
  str_split = ee.String(x).split(',')
  first_elem = ee.Number.parse((str_split.get(0)))
  return first_elem

# apply the function to the wavelength full-width-half-max list
wavelengths = wl_fwhm_list.map(get_wavelengths)

# Set up some additional lists and dictionaries that will allow us to nicely print the wavelength values in nanometers, and use the NLCD color palette
wavelengthsL = np.array(wavelengths.getInfo()).astype('str')  # get a list of wavelengths measured in the SDR data

palette = ee.List(NLCD.get("landcover_class_palette")).getInfo()  # Get the color palette, landcover code values, landcover names from NLCD
values = ee.List(NLCD.get("landcover_class_values")).getInfo()
names =  ee.List(NLCD.get("landcover_class_names")).map(lambda s: ee.String(s).slice(0, ee.String(s).index(':'))).getInfo()


nlcd_dict = dict(zip(values, names))  # create a dictionary linking the landcover codes to the landcover names (covers every class, not just the first 16)
palette = ['#'+s.upper() for s in palette]  # convert palette values to hex format
colDict = dict(zip(names, palette))         # create dict of landcover_names to colors
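
The wavelength-parsing step above can be tried locally without Earth Engine. The sketch below is only an illustration: the property values are made up, and it assumes each `WL_FWHM_B***` property is a string of the form `"wavelength,FWHM"`, which is what the `split(',')` call in the notebook implies. The helper name `local_wavelengths` is hypothetical.

```python
import re

# Hypothetical image properties (values are invented for illustration).
# Assumption: each WL_FWHM_B*** value is a "wavelength,FWHM" string.
wl_fwhm_props = {
    "WL_FWHM_B002": "386.68,5.85",
    "WL_FWHM_B001": "381.67,5.85",
    "WL_FWHM_B003": "391.69,5.84",
    "DESCRIPTION": "not a band property",
}

def local_wavelengths(props):
    """Return the wavelength (first comma-separated field) for each band key, sorted by key."""
    pattern = re.compile(r"WL_FWHM_B\d{3}")
    band_keys = sorted(k for k in props if pattern.fullmatch(k))
    return [float(props[k].split(",")[0]) for k in band_keys]

wavelengths_local = local_wavelengths(wl_fwhm_props)
print(wavelengths_local)
```

Sorting by key matters here: the zero-padded band numbers keep the wavelengths in the same order as the image bands, which is what lets them line up with the per-band means later on.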

In [None]:
## Get the average reflectance value for each wavelength, for each land cover type

# Set up for input to reduceRegion
SOAP_2019_toReduce = SOAP_2019_sdr.select('B.*') # just select the wavelength bands
lcbandNum = SOAP_2019_toReduce.bandNames().length() # calculate the number of bands to reduce

# Apply reduceRegion to the SDR data with landcover
means = SOAP_2019_toReduce.addBands(NLCD.select('landcover')).reduceRegion(
  reducer=ee.Reducer.mean().repeat(lcbandNum).group(groupField=lcbandNum, groupName='landcover'),
  geometry=ROI,
  scale=30,
  maxPixels=1e13)
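
To make the grouped reducer more concrete, here is a minimal local sketch of what a mean reducer repeated over the bands and grouped by the land cover band computes. It runs on a handful of made-up pixels in plain Python rather than on Earth Engine; the pixel values and the helper name `grouped_band_means` are hypothetical, and the output only mimics the list of `{'landcover': ..., 'mean': [...]}` dictionaries that the notebook reads from `means.get('groups')`.

```python
from collections import defaultdict

# Hypothetical pixels: two reflectance "bands" followed by a land cover code,
# mirroring how the landcover band is appended after the spectral bands.
pixels = [
    (0.10, 0.30, 42),
    (0.20, 0.50, 42),
    (0.40, 0.60, 52),
    (0.60, 0.80, 52),
    (0.05, 0.02, 11),
]

def grouped_band_means(rows, group_index):
    """Average every non-group column separately for each group code."""
    sums = {}
    counts = defaultdict(int)
    for row in rows:
        code = row[group_index]
        bands = [v for i, v in enumerate(row) if i != group_index]
        if code not in sums:
            sums[code] = [0.0] * len(bands)
        sums[code] = [s + v for s, v in zip(sums[code], bands)]
        counts[code] += 1
    return [
        {"landcover": code, "mean": [s / counts[code] for s in sums[code]]}
        for code in sorted(sums)
    ]

# The group column sits at index 2, i.e. at an index equal to the number of bands
groups = grouped_band_means(pixels, group_index=2)
for g in groups:
    print(g)
```

Because input indices are zero-based, the appended land cover band sits at an index equal to the number of spectral bands, which is why the notebook passes `lcbandNum` as `groupField`.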

In [None]:
# Create a pandas dataframe from a FeatureCollection built from the grouped output of reduceRegion
reformatted_means = ee.FeatureCollection(ee.List(means.get('groups')).map(lambda obj: ee.Feature(None, obj)))
lc_wv_df = geemap.ee_to_df(reformatted_means)
lc_wv_df

In [None]:
# Clean up the pandas dataframe for plotting
lc_wv_df['landcover_name'] = lc_wv_df['landcover'].transform(lambda x: nlcd_dict[x])  # create a landcover name column
lc_wv_df[wavelengthsL] = pd.DataFrame(lc_wv_df['mean'].tolist()) # expand the "mean" column into 400+ columns (1 column per wavelength)

# Create a pandas dataframe with the columns landcover_name, wavelength, reflectance for plotting
df = pd.melt(lc_wv_df,
             id_vars=['landcover_name'],
             value_vars=wavelengthsL,
             var_name='wavelength (nm)',
             value_name='reflectance') \
        .astype(dtype= {"landcover_name":"str",
                        "wavelength (nm)":"float64",
                        "reflectance":"float64"})
df
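
The reshaping above (a list-valued `mean` column expanded to one column per wavelength, then melted to long format) can be checked on a tiny table first. This is a sketch under stated assumptions: the class names, wavelengths, and reflectance values are invented, and it uses `DataFrame.join` for the expansion step rather than the multi-column assignment in the notebook.

```python
import pandas as pd

# Hypothetical stand-in for the grouped output: one row per land cover class,
# with a list of per-band means in the "mean" column.
wide = pd.DataFrame({
    "landcover_name": ["Evergreen forest", "Shrub/scrub"],
    "mean": [[0.02, 0.05, 0.30], [0.04, 0.08, 0.25]],
})
band_labels = ["450.0", "550.0", "850.0"]  # made-up wavelengths (nm), stored as strings

# Expand the list column into one column per wavelength
expanded = pd.DataFrame(wide["mean"].tolist(), columns=band_labels, index=wide.index)
wide = wide.join(expanded)

# Reshape from wide (one column per wavelength) to long
# (one row per land cover class and wavelength)
long_df = pd.melt(
    wide,
    id_vars=["landcover_name"],
    value_vars=band_labels,
    var_name="wavelength (nm)",
    value_name="reflectance",
).astype({"wavelength (nm)": "float64", "reflectance": "float64"})

print(long_df)
```

The long format is what `sns.lineplot` expects when one column is mapped to `x`, one to `y`, and one to `hue`.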

In [None]:
# Plot spectra by landcover type
ax = sns.lineplot(data=df, x="wavelength (nm)", y="reflectance", hue="landcover_name", palette=colDict)
sns.move_legend(ax, "upper left", bbox_to_anchor=(1, 1))




---



## Additional Resources

* **End-to-End Google Earth Engine**: If you'd like to continue exploring Earth Engine processes and applications, the [SpatialThoughts Course - Ujaval Gandhi](https://courses.spatialthoughts.com/end-to-end-gee.html#automatic-conversion-of-javascript-code-to-python) has some nice examples you can follow.

