<a href="https://colab.research.google.com/github/jjmcnelis/VegMapper/blob/devel/gee/vegMapper.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# vegMapper

https://github.com/NaiaraSPinto/VegMapper

## Setup

After common packages are imported (e.g. *ee*, *numpy*, *pandas*, *matplotlib* -- all are available by default in Google Colab), the next cell will attempt to perform some important housekeeping steps.

### Requirements

This notebook was developed in Python 3.8; it has several dependencies:

* *ee* (Earth Engine Python API)
* *numpy*
* *pandas*
* *matplotlib*
* *geemap*

_geemap_ is the only package that is not available by default in Google Colab. If _geemap_ is not available, the next cell will try to install it with pip and import again.

In [None]:
from json import dumps
from io import StringIO
from IPython.display import HTML
#import matplotlib.pyplot as plt
#import numpy as np
import pandas as pd
import os.path
import ee

try:
    import geemap
except ImportError as e:
    !pip install -q geemap
    import geemap


### User configuration

Configure the workflow parameters in the cell below. These settings determine how the workflow selects and operates on the data/imagery.

In [None]:
# Temporal coverage for imagery used in the analysis:
startDate = '2017-04-01'  #@param {type: "date"}
endDate = '2017-09-30'    #@param {type: "date"}

# Target scale for zonal statistics and the output stack:
scale = 30

### Authenticate for GEE and Google Drive

>**Quickstart:**
>Run the next cell and follow the instructions to authenticate. Click the links displayed below the cell and log in (each one should open in a new browser tab); copy your temporary token, then paste it into the prompt and press enter.

The two authentication steps could probably be merged with some digging into the GEE and Colab APIs, but for now expect two prompts for your Google login:
1. for GEE access (*REQUIRED*)
2. for Google Drive access (*OPTIONAL*, only available in Colab)

If Google Drive is not accessible (e.g. no space remaining), you can still upload to and download from the Colab environment in at least two ways:
1. using the File Manager (on the left in the Colab interface), or 
2. using interactive prompts as you progress through the notebook.

However, if you're running this notebook *outside* of the Colab environment (i.e. in the common Jupyter notebook client) then you will need to call *pandas* manually to read/write the input & output (with `pd.read_csv` and `<df>.to_csv`, respectively).

In [None]:
####################################################################
# My GEE credentials are tied to a personal Google account. The 1st
# prompt provides a link opening a new tab to the familiar Google 
# login page. After logging in, copy the token into the prompt and
# hit enter. This step is required in order to proceed.
####################################################################
ee.Authenticate()
ee.Initialize()

####################################################################
# This next part checks if the notebook is running in Colab first. 
# If so, you will be prompted to log in AGAIN for Drive 
# access. All previous comments about auth/tokens apply. Execute 
# the next cell to skip mounting Drive into the Colab environment.
####################################################################
DRIVE = "/content/drive/MyDrive"
#DRIVE = "/content/drive/Shareddrives"

if 'google.colab' in str(get_ipython()):
    from google.colab import drive, files
    try:
        drive.mount("/content/drive")
    except Exception as e:
        print("The next cell executed. Will skip mounting Drive.")

####################################################################
# Housekeeping -- please ignore the remainder of this cell
####################################################################

def _validate_path_and_read_input_csv_data():
    # If the input csv path is None (or empty), prompt for an upload below.
    if not relative_path_to_input_csv_in_drive:
        csv = None
    elif not os.path.isdir(DRIVE):
        # Error out if a path was provided but Drive isn't mounted.
        raise Exception("ERROR: Cannot determine if Drive is mounted.")
    else:
        # Otherwise make sure the input csv path is valid.
        if os.path.isfile(relative_path_to_input_csv_in_drive):
            csv = relative_path_to_input_csv_in_drive
        else:
            # If the path is not a file as given, treat it as relative to the Drive root.
            csv = os.path.join(DRIVE, relative_path_to_input_csv_in_drive)
    if not csv:
        # Prompt the user to upload their csv if 'csv' is 'None'.
        uploads = files.upload()
        if len(uploads)==0:
            raise Exception("ERROR: Received no files. Try again.")
        elif len(uploads)>1:
            raise Exception("ERROR: Received multiple files. Try again.")
        else:
            csv = list(uploads)[0]
    # Finally, attempt to load the input csv to a pandas data frame:
    try:
        df = pd.read_csv(csv)
        display(df.info())
    except FileNotFoundError as e:
        raise Exception("POSSIBLE BUG: Please notify jmcnelis@jpl.nasa.gov")
    except Exception as e:
        raise e
    return df

To authorize access needed by Earth Engine, open the following URL in a web browser and follow the instructions. If the web browser does not start automatically, please manually browse the URL below.

    https://accounts.google.com/o/oauth2/auth?client_id=517222506229-vsmmajv00ul0bs7p89v5m89qs8eb9359.apps.googleusercontent.com&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fearthengine+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdevstorage.full_control&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&response_type=code&code_challenge=obFd1Vyu2u5814xJFxOz8sIhTtmOv5EAYogjL7Mx-W0&code_challenge_method=S256

The authorization workflow will generate a code, which you should paste in the box below. 
Enter verification code: 4/1AY0e-g7QWb1c_hQVaG0f6QrsjOMYW6htpBYty3Q2-5-9NSrtdNTEshcssug

Successfully saved authorization token.
Mounted at /content/drive


### Load xy data from an input csv

As mentioned before, this procedure builds a stack of images and calculates the zonal statistics within regions defined by an input feature dataset. 

*Remember: CSV is the only supported input file format at this time.* It should have these columns at a minimum:

* *latitude* (float)
* *longitude* (float)

The following cell reads your input table of XY positions (taken from the *latitude* and *longitude* columns) along with any additional data columns, then prints some high-level details (assuming your paths are configured properly).
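
Before handing a file to the notebook, it can help to confirm that it actually carries the two required columns. The following is a minimal sketch using only the standard library; `check_xy_rows` and `REQUIRED_COLUMNS` are hypothetical names that are not part of the notebook, and the range check on coordinates is an added assumption rather than something the workflow enforces.

```python
import csv
import io

REQUIRED_COLUMNS = ("latitude", "longitude")  # per the list above

def check_xy_rows(csv_text):
    """Parse CSV text; raise ValueError on missing columns or bad coordinates."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required column(s): {missing}")
    rows = []
    for n, row in enumerate(reader, start=1):
        lat, lon = float(row["latitude"]), float(row["longitude"])
        # Assumed sanity check: coordinates must be valid decimal degrees.
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            raise ValueError(f"row {n}: coordinates out of range ({lat}, {lon})")
        rows.append(row)
    return rows

# One row using the coordinates of the first site shown later in this notebook.
sample = "longitude,latitude\n-76.54613982,-8.220041421\n"
rows = check_xy_rows(sample)
print(len(rows), "valid row(s)")
```

Extra columns pass through untouched, which matches how the notebook carries additional data columns along with the XY positions.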

#### Option a: read files from Google Drive

You should see a folder "drive/" in the default workspace when you open the file browser panel (to the immediate left inside the Colab environment).

><u>Please read this note about paths configured in the next cell:</u>    
>Input and output paths should be set relative to the root of the _drive_ directory shown _OR_, alternatively, you can provide absolute paths to your input/output file(s).

Try to remember to unmount Drive once you're finished in Colaboratory. You can do that by calling this other function from the drive module: `google.colab.drive.flush_and_unmount()`
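
The path rule in the note above (relative to the Drive root, or absolute) can be illustrated with a small sketch. `resolve_drive_path` is a hypothetical helper, not the notebook's own function, which additionally checks whether the path exists as given before falling back to the Drive root. `posixpath` is used here so the example behaves the same on any platform; Colab itself runs on Linux.

```python
import posixpath

DRIVE = "/content/drive/MyDrive"  # same default as the notebook

def resolve_drive_path(path, drive_root=DRIVE):
    """Absolute paths pass through; anything else is joined to the Drive root."""
    # posixpath.join discards drive_root when `path` is already absolute.
    return posixpath.normpath(posixpath.join(drive_root, path))

relative = resolve_drive_path("tests/vegMapper/smartin.csv")
absolute = resolve_drive_path("/content/drive/MyDrive/tests/vegMapper/smartin.csv")
print(relative)
print(absolute)
```

Both calls resolve to the same file, so either style of path should be acceptable under this rule.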

#### Option b: upload/read files to Colaboratory

*The workflow requires input features to determine the areas in which to calculate zonal statistics.*

Make sure a suitable file exists in the colab workspace or in Google Drive. You can provide one in either of two ways:

1. Navigate to an input CSV in Google Drive and copy its path into the cell below (assuming Drive is mounted), or
2. Run the next cell as-is and upload a file to the workspace when prompted.

If you choose the second option, run the next cell and click *Choose Files* to upload a file. You may also click *Cancel Upload* to abort the cell and move on.

In [None]:
# Set this to None (not the string 'None') to be prompted to upload your input csv:
relative_path_to_input_csv_in_drive = "tests/vegMapper/smartin.csv"

# Set this to None to be prompted to download your outputs csv:
relative_path_to_output_csv_in_drive = "tests/vegMapper/out/outputs.csv"

### Error out here if any inputs are invalid >>>
pts = _validate_path_and_read_input_csv_data()

print("Success! Please proceed with the notebook.")

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 5 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   longitude   100 non-null    float64
 1   latitude    100 non-null    float64
 2   obs_year    50 non-null     float64
 3   class       50 non-null     object 
 4   class_2017  100 non-null    object 
dtypes: float64(3), object(2)
memory usage: 4.0+ KB


None

Success! Please proceed with the notebook.


#### Make xy geometries as a Feature Collection

Make geometries for each XY position in the input table so we can efficiently generate zonal statistics over our final image stack at the end of the procedure. See the API documentation for information about [*ee.FeatureCollection*](https://developers.google.com/earth-engine/guides/feature_collections)s.

In [None]:
def get_geom(x):
    return ee.Geometry.Point(x['longitude'], x['latitude'])

pfc = ee.FeatureCollection(pts.apply(get_geom, axis=1).tolist())

type(pfc)

ee.featurecollection.FeatureCollection


### Region of Interest

Get the minimum bounding extent of the points in the input CSV and build an *ee.Geometry.Polygon* from its four corners to represent the ROI. No buffer is applied around the extent in this version.

In [None]:
lon_min = pts['longitude'].min()
lon_max = pts['longitude'].max()
lat_min = pts['latitude'].min()
lat_max = pts['latitude'].max()

roi_poly = [[lon_min, lat_max],
            [lon_min, lat_min],
            [lon_max, lat_min],
            [lon_max, lat_max]]

roi = ee.Geometry.Polygon(coords=roi_poly)

type(roi)

ee.geometry.Geometry
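
If a margin around the sites is wanted, one option is to pad the extent by a fixed number of degrees before building the polygon. This is only a sketch: `padded_ring` and `pad_deg` are hypothetical names, the padding is in decimal degrees rather than metres, and the corner order simply mirrors `roi_poly` above.

```python
def padded_ring(lons, lats, pad_deg=0.0):
    """Return the four corners of the bounding box, padded by pad_deg degrees."""
    lon_min, lon_max = min(lons) - pad_deg, max(lons) + pad_deg
    lat_min, lat_max = min(lats) - pad_deg, max(lats) + pad_deg
    # Clamp so the padding never pushes the box past valid lat/lon limits.
    lon_min, lon_max = max(lon_min, -180.0), min(lon_max, 180.0)
    lat_min, lat_max = max(lat_min, -90.0), min(lat_max, 90.0)
    return [[lon_min, lat_max],
            [lon_min, lat_min],
            [lon_max, lat_min],
            [lon_max, lat_max]]

# Illustrative coordinates only, not taken from the input CSV.
ring = padded_ring([-76.0, -75.0], [-9.0, -8.0], pad_deg=0.5)
print(ring)
```

The returned list has the same shape as `roi_poly`, so it could be passed to `ee.Geometry.Polygon(coords=...)` in its place.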

Plot the region of interest polygon on a map to see the coverage.

In [None]:
roi_center = [pts['latitude'].mean(), pts['longitude'].mean()]

M = geemap.Map(center=roi_center, zoom=7, width="90%")
M.addLayer(roi, {'color': "red"}, name='ROI')
M.addLayer(pfc, name='Sites')
M

## Imagery

### Sentinel-1

We want to use preprocessed, analysis-ready data from the *S1_GRD* collection to calculate *radar volume index*.

>A previous version of this notebook used the *S1_GRD_FLOAT* collection (rather than the *S1_GRD* collection) because the data are in power units, and are thus immediately suitable to calculate *radar volume index* (as opposed to the *S1_GRD* collection, which gives the data in decibels (dB), i.e., on a logarithmic scale).
>
>See [this page](https://developers.google.com/earth-engine/guides/sentinel1) for more information about Sentinel-1 data accessible through GEE.

The next few cells take the following steps:

1. pull target data from *both* S1 collections (*S1_GRD* and *S1_GRD_FLOAT*)
2. apply the inverse transform to *S1_GRD* so the data are represented in raw power
3. calculate summary statistics and verify the transform by comparing with data from *S1_GRD_FLOAT*

Grab the data from the *S1_GRD_FLOAT* collection.

In [None]:
_s1 = (ee.ImageCollection("COPERNICUS/S1_GRD_FLOAT")
        .filter(ee.Filter.listContains('transmitterReceiverPolarisation', 'VV'))
        .filter(ee.Filter.eq("instrumentMode", "IW"))
        .filter(ee.Filter.eq("orbitProperties_pass", "DESCENDING"))
        .filterDate(startDate, endDate)
        .filterBounds(roi)
        .select(['VV', 'VH'])
        .mean())

type(_s1)  # this includes 113 images for our inputs on 2021-05-12

ee.image.Image

The result should be a multi-band *Image*. Images were selected for our time and place of interest and composited into one mean raster per band.

>**Converting decibels to raw power**
>
>Imagery in the Earth Engine 'COPERNICUS/S1_GRD' Sentinel-1 ImageCollection consists of Level-1 Ground Range Detected (GRD) scenes processed to backscatter coefficient (σ°) in decibels (dB). The backscatter coefficient represents target backscattering area (radar cross-section) per unit ground area. Because it can vary by several orders of magnitude, it is converted to dB as 10*log10(σ°). It measures whether the radiated terrain scatters the incident microwave radiation preferentially away from the SAR sensor (dB < 0) or towards the SAR sensor (dB > 0). This scattering behavior depends on the physical characteristics of the terrain, primarily the geometry of the terrain elements and their electromagnetic characteristics.
>
>More info about this process may be found [here](https://developers.google.com/earth-engine/guides/sentinel1#sentinel-1-preprocessing) in the GEE docs.
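
To make the conversion concrete, here is a small plain-Python sketch of the forward and inverse transforms. It also suggests why the notebook converts each image to power *before* compositing: averaging on the dB (log) scale and averaging in linear power generally give different results. The sample values are illustrative, not taken from the imagery.

```python
import math

def db_to_power(db):
    """Inverse of the dB transform: 10 ** (dB / 10)."""
    return 10.0 ** (db / 10.0)

def power_to_db(power):
    """Forward transform: 10 * log10(power)."""
    return 10.0 * math.log10(power)

samples_db = [-20.0, -10.0]  # illustrative backscatter values in dB
mean_of_db = sum(samples_db) / len(samples_db)
mean_power = sum(db_to_power(d) for d in samples_db) / len(samples_db)

print("mean taken in dB:        ", mean_of_db)
print("mean taken in power (dB):", power_to_db(mean_power))
```

The mean taken in power, expressed back in dB, sits above the mean taken directly in dB because the stronger return dominates on the linear scale.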

Define and apply a function to do the inverse transform back into raw power units. Then do the same thing as in the cell above to select data from the *S1_GRD* collection, but apply the transform to all bands before getting the composite image.

In [None]:
def xform_s1(x):
    return ee.Image(10).pow(x.divide(10))

s1 = (ee.ImageCollection("COPERNICUS/S1_GRD")
        .filter(ee.Filter.listContains('transmitterReceiverPolarisation', 'VV'))
        .filter(ee.Filter.eq("instrumentMode", "IW"))
        .filter(ee.Filter.eq("orbitProperties_pass", "DESCENDING"))
        .filterDate(startDate, endDate)
        .filterBounds(roi)
        .select(['VV', 'VH'])
        .map(xform_s1)
        .mean())

type(s1)  # also includes 113 images for our inputs on 2021-05-12

ee.image.Image

Quick comparison between the two using the `geemap.image_stats` convenience function.

In [None]:
def get_stats(img, region=roi, scale=30):
    # Pass the `region` argument through (the ROI is only the default).
    return geemap.image_stats(img=img, region=region, scale=scale).getInfo()

display(HTML("<h3>S1_GRD_FLOAT</h3>"))
display(pd.DataFrame(get_stats(_s1)))
display(HTML("<h3>S1_GRD</h3>"))
display(pd.DataFrame(get_stats(s1)))

Unnamed: 0,max,mean,min,std,sum
VH,31.514315,0.057534,0.00074,0.068617,4516671.0
VV,151.913361,0.273832,0.00261,0.370733,21496950.0


Unnamed: 0,max,mean,min,std,sum
VH,31.514315,0.057538,0.00074,0.06862,4516934.0
VV,151.91336,0.273848,0.00261,0.370749,21498200.0


>*Assume the transform function is working properly if the values in both tables above are similar.*     
>If that's the case, we are good to proceed with the data from *S1_GRD*.
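
"Similar" can be made explicit with a relative tolerance. The sketch below compares a few statistics copied from the two tables above; the helper name `stats_agree` and the `1e-3` threshold are assumptions chosen for illustration, not values used by the notebook.

```python
import math

# Values copied from the S1_GRD_FLOAT and S1_GRD summary tables above.
float_stats = {"VH": {"mean": 0.057534, "std": 0.068617, "sum": 4516671.0},
               "VV": {"mean": 0.273832, "std": 0.370733, "sum": 21496950.0}}
grd_stats = {"VH": {"mean": 0.057538, "std": 0.06862, "sum": 4516934.0},
             "VV": {"mean": 0.273848, "std": 0.370749, "sum": 21498200.0}}

def stats_agree(a, b, rel_tol=1e-3):
    """True if every statistic of every band matches within rel_tol."""
    return all(math.isclose(a[band][key], b[band][key], rel_tol=rel_tol)
               for band in a for key in a[band])

print(stats_agree(float_stats, grd_stats))  # True at the assumed 0.1% tolerance
```

The small residual differences are what one might expect from a round trip through dB, so an exact-equality check would be too strict here.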

Dereference the data from the *S1_GRD_FLOAT* collection as we no longer need it.

In [None]:
_s1 = None

#### Calculate radar volume index

Add a new band to the output image containing the radar volume index calculated from *VV* and *VH*: `4 * VH / (VH + VV)`

In [None]:
s1out = s1.addBands(s1.expression(
    expression="4 * VH / (VH + VV)",
    opt_map={'VV': s1.select('VV'),
             'VH': s1.select('VH')}
).rename('radar_volume_index'))

type(s1out)

ee.image.Image
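
The same expression can be checked by hand on plain numbers. This sketch is not Earth Engine code; `radar_volume_index` is a hypothetical helper, and returning NaN when both channels are zero is a choice made for the sketch rather than a statement about how GEE handles that case.

```python
def radar_volume_index(co_pol, cross_pol):
    """4 * cross / (cross + co), computed on linear power values (not dB)."""
    total = cross_pol + co_pol
    if total == 0:
        return float("nan")  # undefined where both channels are zero
    return 4.0 * cross_pol / total

# Band means from the S1_GRD summary table (VV is co-pol, VH is cross-pol).
rvi_of_means = radar_volume_index(co_pol=0.273848, cross_pol=0.057538)
print(round(rvi_of_means, 2))
```

For non-negative inputs the index runs from 0 (no cross-polarized return) to 4 (no co-polarized return), and equal channels give 2. The index of the band means is not expected to equal the mean of the per-pixel index, since a ratio of means differs from a mean of ratios.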

Plot the histograms for *VV* and *VH* (or just get simple summary stats like before).

In [None]:
# s1tmp = geemap.ee_to_numpy(s1comp, bands=["VV", "VH"], region=roi)

# # Plot histogram for VV:
# counts, bins = np.histogram(a=s1tmp[:,:,0].flatten())
# plt.hist(s1tmp[:,:,0], bins=bins)
# plt.ylim(0., 3.)
# plt.xlim(-18., -5.)
# plt.show()

# # Plot histogram for VH:
# counts, bins = np.histogram(a=s1tmp[:,:,1].flatten())
# plt.hist(s1tmp[:,:,1], bins=bins)
# plt.ylim(0., 3.)
# plt.xlim(-18., -5.)
# plt.show()

s1stats = get_stats(s1out)

display(pd.DataFrame(s1stats))

Unnamed: 0,max,mean,min,std,sum
VH,31.514315,0.057538,0.00074,0.06862,4516934.0
VV,151.91336,0.273848,0.00261,0.370749,21498200.0
radar_volume_index,3.588767,0.752902,0.003941,0.132861,59105900.0


Now plot all the bands (including the computed *radar_volume_index* image/band) for visual inspection:

In [None]:
def drawMap(image, style_func, width="90%", **kwargs):
    M = geemap.Map(width=width, **kwargs)  # pass the requested width through
    for band in image.bandNames().getInfo():
        M.addLayer(image.select(band), **style_func(band))
    M.addLayerControl()
    return M

def s1style(b: str, vis_params: dict={}):
    if b.endswith("radar_volume_index"):
        vis_params = {'min': s1stats['min'][b], 'max': s1stats['max'][b]}
    return {'vis_params': vis_params, 
            'shown': b.endswith("radar_volume_index"), 
            'name': b}

drawMap(image=s1out, style_func=s1style, center=roi_center, zoom=7, width="80%")

### ALOS2

https://developers.google.com/earth-engine/datasets/catalog/JAXA_ALOS_PALSAR_YEARLY_SAR

This dataset from ALOS2 only has one timestep per year, so modify the start and end dates before applying *filterDate* to the *ImageCollection*.

In [None]:
years = [f'{startDate.split("-")[0]}-01-01', 
         f'{endDate.split("-")[0]}-12-31']

alos2 = (ee.ImageCollection('JAXA/ALOS/PALSAR/YEARLY/SAR')
           .filterDate(*years)
           .filterBounds(roi)
           .select(['HV','HH'])
           .mean())

type(alos2)

ee.image.Image
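
The date widening in the cell above can be seen in isolation with plain strings. `year_window` is a hypothetical name for the same list-building logic; the sketch assumes ISO `YYYY-MM-DD` inputs and that each annual mosaic carries a timestamp inside its calendar year.

```python
def year_window(start_date, end_date):
    """Expand two 'YYYY-MM-DD' strings to the full calendar years they touch."""
    start_year = start_date.split("-")[0]
    end_year = end_date.split("-")[0]
    return [f"{start_year}-01-01", f"{end_year}-12-31"]

window = year_window("2017-04-01", "2017-09-30")
print(window)
```

With the notebook's default dates this yields a window covering all of 2017. Note that `filterDate` treats the end date as exclusive.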

#### Calculate radar volume index

Add a new band to the output image containing the ALOS2 radar volume index calculated as: `4 * HV / (HV + HH)`

In [None]:
alos2out = alos2.addBands(alos2.expression(
    expression="4 * HV / (HV + HH)", 
    opt_map={'HV': alos2.select('HV'),
             'HH': alos2.select('HH')}
).rename('radar_volume_index'))

type(alos2out)

ee.image.Image

Calculate and display a table of summary stats.

In [None]:
alos2stats = get_stats(alos2out)

display(pd.DataFrame(alos2stats))

Unnamed: 0,max,mean,min,std,sum
HH,65535,6690.43736,0,2823.646256,525226600000.0
HV,65535,3759.369117,0,1601.820148,295125800000.0
radar_volume_index,4,1.44122,0,0.288143,113141700.0


### Landsat 8 SR

https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C01_T1_SR

>This example describes exactly what we need to do to produce the Landsat 8 median composite:    
https://developers.google.com/earth-engine/guides/ic_composite_mosaic

Select Landsat 8 surface reflectance images and apply a quality mask by mapping a function over all the images in the collection that match our filtering criteria. Then, calculate NDVI and add it to the image as a new band.

In [None]:
def mask_l8sr(image):
    cloudShadowBitMask = 1<<3
    cloudBitMask = 1<<5
    qa = image.select('pixel_qa')
    mask = qa.bitwiseAnd(cloudShadowBitMask).eq(0).And(
           qa.bitwiseAnd(cloudBitMask).eq(0))
    return image.updateMask(mask)

# Select Landsat 8 surface reflectance scenes and make a median composite.
l8sr = (ee.ImageCollection('LANDSAT/LC08/C01/T1_SR')
          .filterDate(startDate, endDate)
          .filterBounds(roi)
          .map(mask_l8sr)
          .median())

# Use the composite bands 5 and 4 to produce an NDVI image.
l8out = l8sr.addBands(l8sr.normalizedDifference(['B5','B4']).rename('ndvi'))

type(l8out)

ee.image.Image
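
The bit test inside `mask_l8sr` can be tried on plain integers. This sketch mirrors only the two bits the notebook checks (bit 3 for cloud shadow, bit 5 for cloud). The values 322 and 324 appear as the `pixel_qa` minimum and maximum in the zonal statistics later in this notebook; 328 and 352 are constructed here by setting bit 3 and bit 5 respectively.

```python
CLOUD_SHADOW_BIT = 1 << 3  # 8
CLOUD_BIT = 1 << 5         # 32

def is_clear(pixel_qa):
    """True when neither the cloud-shadow bit nor the cloud bit is set."""
    return (pixel_qa & CLOUD_SHADOW_BIT) == 0 and (pixel_qa & CLOUD_BIT) == 0

for qa in (322, 324, 328, 352):
    # Show the binary form so the tested bits are visible.
    print(qa, format(qa, "010b"), is_clear(qa))
```

Pixels for which the test is false are the ones `updateMask` hides before the median composite is taken.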

Draw NDVI and the other surface reflectance bands on a map.

In [None]:
def l8style(b: str, vis_params: dict={}):
    if b.endswith("ndvi"):
        vis_params = {'min': -1.0, 'max': 1.0, 'palette': 'red,yellow,green'}
    return {'vis_params': vis_params, 'shown': b.endswith("ndvi"), 'name': b}

drawMap(image=l8out, style_func=l8style, center=roi_center, zoom=7, width="75%")
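
For reference, the NDVI band is the normalized difference of the near-infrared (B5) and red (B4) bands. The sketch below reproduces the arithmetic on plain numbers; `ndvi` is a hypothetical helper, and returning NaN for a zero denominator is a choice made for the sketch. The inputs are the median B5 and B4 values from the zonal-statistics summary later in this notebook, used purely for illustration since column medians do not describe a single pixel.

```python
def ndvi(nir, red):
    """Normalized difference: (nir - red) / (nir + red)."""
    total = nir + red
    if total == 0:
        return float("nan")  # undefined when both bands are zero
    return (nir - red) / total

value = ndvi(nir=3396.5, red=373.0)
print(round(value, 3))
```

The result lies between -1 and 1 whenever both inputs are non-negative, which is why the map style above uses that range for its color palette.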

### VCF from MODIS

In [None]:
modis = (ee.ImageCollection('MODIS/006/MOD44B')
           .filterDate(*years)
           .filterBounds(roi)
           .select('Percent_Tree_Cover')
           .first())

type(modis)

ee.image.Image

## Run the prediction

This step is not finalized, so it is disabled for now. The draft is kept below for reference:

```python
prior_mean = [0.06491638, -26.63132179, 0.05590800, -29.64091620]
prior_mean_int = ee.Number(6.07)
prediction = (ee.Image(prior_mean_int)
              #.add((Fmod_tc_aoi.multiply(prior_mean[0]))
              .add(s1.select('radar_volume_index').multiply(prior_mean[1]))
              .add(l8sr.select('ndvi').multiply(prior_mean[2]))
              #.add(smooth.select('constant').multiply(prior_mean[3]))))
              ).clip(roi)
predx = prediction.exp()
pred_final = ee.Image(predx.divide(predx.add(1)))
type(pred_final)
```
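
The last two lines of the draft have the form of a logistic (inverse-logit) transform, `exp(x) / (exp(x) + 1)`, which maps any linear score to a value between 0 and 1. A plain-Python sketch of just that step is shown here; it makes no claim about the coefficients, which the draft leaves unresolved. The only value borrowed from the draft is the intercept `6.07`, used as an example input.

```python
import math

def logistic(x):
    """exp(x) / (exp(x) + 1), the same form as the end of the draft."""
    ex = math.exp(x)  # note: math.exp overflows for very large x
    return ex / (ex + 1.0)

for score in (-6.07, 0.0, 6.07):
    print(score, round(logistic(score), 4))
```

A score of zero maps to 0.5, and scores of equal size but opposite sign map to values that sum to 1.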

### Build the stack

>The documentation for the image method *clipToBoundsAndScale* helps in understanding this step in GEE: https://developers.google.com/earth-engine/apidocs/ee-image-cliptoboundsandscale
>
>But I ended up using regular *clip* for now.
>
>Also see this information on compositing and image projections:      
>https://developers.google.com/earth-engine/guides/projections#the-default-projection

Configure preferences here to determine how the stack is created with a common grid. The `prepare_image` helper prefixes each band name with its source, sets the same default projection (`crs`) and `scale` on every image, and clips it to the ROI before the bands are combined with *ee.Image.addBands*.

In [None]:
def prepare_image(image, pre: str, crs: str="EPSG:4326", scale: int=30):
    renamed = image.rename([f'{pre}-{b}' for b in image.bandNames().getInfo()])
    return renamed.setDefaultProjection(crs=crs, scale=scale).clip(roi)

s1out = prepare_image(image=s1out, pre="S1")
alos2out = prepare_image(image=alos2out, pre="ALOS2")
l8out = prepare_image(image=l8out, pre="L8")
modis = prepare_image(image=modis, pre="MODIS")

stack = None  # Assemble the stack in a loop.
for i in [s1out, alos2out, l8out, modis]:
    if stack is not None:
        stack = stack.addBands(i)
    else:
        stack = i

type(stack)

ee.image.Image

In [None]:
#help(s1._apply_crs_and_affine)  #.select("VV"))
#help(s1._apply_selection_and_scale)
#help(s1._apply_spatial_transformations)
#help(s1._apply_visualization)
#help(s1.prepare_for_export)

In [None]:
# def prefix_bands(image, prefix: str):
#     return [f'{prefix}-{b}' for b in image.bandNames().getInfo()]
# # Get initial stack from the landsat imagery and rename bands.
# stack = l8out.clip(roi).rename(prefix_bands(l8out.clip(roi), "L8"))
# # Add the rest of the images to the stack.
# for img, pre in [(s1out, "S1"), (alos2out, "ALOS2"), (modis, "MODIS")]:
#     stack = stack.addBands(img.clip(roi).rename(prefix_bands(img, pre)))

#stack = ee.ImageCollection.fromImages(images)

# Generate rough stats about all bands in the stack.
stack_stats = pd.DataFrame(get_stats(stack, scale=500))

stack_stats

Unnamed: 0,max,mean,min,std,sum
ALOS2-HH,32349.0,6691.016643,844.0,1253.527563,1890375000.0
ALOS2-HV,16654.0,3760.244466,317.0,877.089124,1062360000.0
ALOS2-radar_volume_index,1.706744,1.429725,0.311985,0.141228,403932.1
L8-B1,8864.0,279.431814,-999.0,308.443599,67849700.0
L8-B10,3089.0,2923.127373,2606.0,31.296906,709773600.0
L8-B11,3060.0,2897.599636,2609.0,28.489068,703575100.0
L8-B2,8902.0,326.189632,-731.0,312.551332,79203110.0
L8-B3,8872.0,568.998756,-158.0,340.020739,138160300.0
L8-B4,8928.0,434.191482,-304.0,354.881632,105427400.0
L8-B5,9148.0,3049.482867,68.5,658.561315,740454400.0


Get some information about the spatial properties of the stack.

In [None]:
#stack.select("MODIS-Percent_Tree_Cover").projection().nominalScale().getInfo()

Draw a map of all the bands with *geemap*.

In [None]:
def allstyle(b: str, vis_params: dict={}, shown: bool=False):
    if b.endswith("radar_volume_index"):
        vis_params = stack_stats.loc[b][['min','max']].to_dict()
        shown = True
    if b.endswith("ndvi"):
        vis_params = {'min': -1.0, 'max':  1.0, 'palette': 'red,yellow,green'}
        shown = True
    return {'name': b, 'vis_params': vis_params, 'shown': shown}

drawMap(image=stack, style_func=allstyle, center=roi_center, zoom=7)

Verify spatial referencing information by displaying a dictionary of SRS information for each band.

In [None]:
srs = {}
for b in stack.bandNames().getInfo():
    p = stack.select(b).projection()
    srs[b] = p.getInfo()
    srs[b]['nominalScale'] = p.nominalScale().getInfo()
    del srs[b]['type']
# All images have identical projections if this test returns True:
len({str(p) for p in srs.values()}) == 1

True

## Zonal statistics

With the stack built, reduce it over the feature collection: `reduceRegions` computes the mean of every band at each point.

In [None]:
outputs = stack.reduceRegions(collection=pfc,
                              reducer=ee.Reducer.mean(),
                              #crs=stack.projection(),
                              scale=scale)  # target scale from the user configuration cell

type(outputs)

ee.featurecollection.FeatureCollection

Get the new `ee.FeatureCollection` as a dictionary then call the *pandas* convenience function `json_normalize` to translate to a `pandas.DataFrame`. 

In [None]:
outputs = pd.json_normalize(outputs.getInfo()['features'])

outputs.describe()

Unnamed: 0,properties.ALOS2-HH,properties.ALOS2-HV,properties.ALOS2-radar_volume_index,properties.L8-B1,properties.L8-B10,properties.L8-B11,properties.L8-B2,properties.L8-B3,properties.L8-B4,properties.L8-B5,properties.L8-B6,properties.L8-B7,properties.L8-ndvi,properties.L8-pixel_qa,properties.L8-radsat_qa,properties.L8-sr_aerosol,properties.MODIS-Percent_Tree_Cover,properties.S1-VH,properties.S1-VV,properties.S1-radar_volume_index
count,99.0,99.0,99.0,98.0,98.0,98.0,98.0,98.0,98.0,98.0,98.0,98.0,97.0,98.0,98.0,98.0,98.0,99.0,99.0,99.0
mean,6206.212121,2537.868687,1.181532,328.091837,2937.392857,2905.857143,370.872449,632.234694,500.785714,3258.994898,1694.520408,843.27551,0.71783,322.020408,0.0,148.204082,42.591837,0.036224,0.18903,0.680338
std,2771.08424,1114.151023,0.322871,205.914347,19.914929,16.127872,229.06571,302.660311,363.598496,899.380295,549.985407,468.016356,0.227795,0.202031,0.0,51.743212,24.901802,0.02077,0.102332,0.228091
min,894.0,487.0,0.279173,-310.0,2843.0,2852.0,-190.0,-65.0,-12.0,291.0,164.5,117.5,-0.282382,322.0,0.0,68.0,4.0,0.002358,0.008135,0.112101
25%,4734.5,1935.5,0.998658,197.5,2929.625,2897.625,216.5,400.375,233.875,2786.5,1382.625,518.375,0.594579,322.0,0.0,96.0,19.5,0.026829,0.120503,0.555444
50%,5947.0,2506.0,1.173162,301.75,2940.75,2909.0,313.75,575.75,373.0,3396.5,1623.5,691.25,0.826921,322.0,0.0,152.5,39.5,0.036689,0.189384,0.652357
75%,7340.0,3138.0,1.372365,423.75,2947.875,2916.5,450.5,819.875,667.0,3813.125,1989.125,1026.0,0.880791,322.0,0.0,192.0,66.0,0.042238,0.231536,0.783544
max,18923.0,6586.0,2.167319,1029.0,2988.0,2941.0,1206.0,1538.0,1651.0,5558.0,3322.0,2568.0,0.906323,324.0,0.0,228.0,83.0,0.177845,0.79575,2.247066
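
The `properties.` prefix on the column names comes from how `json_normalize` flattens nested dictionaries. The sketch below uses a hand-written stand-in for `outputs.getInfo()['features']` with a single GeoJSON-style feature; the coordinates and the two band values are taken from the first row shown later in this notebook, and everything else about the structure is an assumption for illustration.

```python
import pandas as pd

# Stand-in for one element of `outputs.getInfo()['features']`.
features = [{
    "type": "Feature",
    "id": "0",
    "geometry": {"type": "Point", "coordinates": [-76.54613982, -8.220041421]},
    "properties": {"ALOS2-HH": 15154, "ALOS2-HV": 1137},
}]

df = pd.json_normalize(features)
print(list(df.columns))  # nested keys are joined with "."

PREFIX = "properties."

def strip_prefix(name):
    """Drop the leading 'properties.' from a column name, if present."""
    return name[len(PREFIX):] if name.startswith(PREFIX) else name

renamed = df.rename(columns=strip_prefix)
print(list(renamed.columns))
```

Stripping the prefix this way does not depend on a separate table of band names, which may be convenient when only part of the stack statistics is available.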


Rename the columns to match the index of our *stack_stats* table from a few cells ago. (The `properties.` prefix comes from `json_normalize` flattening the nested `properties` object of each GeoJSON feature.)

>We could put some automated + hands-on validation routines at the bottom of the ipynb, e.g. a row-picker to render this table and a map widget next to it.

Here's the first row of data after renaming the columns:

In [None]:
outputs_names = {f"properties.{i}": i for i in stack_stats.index.tolist()}

outputs = outputs.rename(mapper=outputs_names, axis=1)

outputs.iloc[0].to_frame(name="ROW_0")

Unnamed: 0,ROW_0
type,Feature
id,0
geometry.type,Point
geometry.coordinates,"[-76.54613982, -8.220041421]"
ALOS2-HH,15154
ALOS2-HV,1137
ALOS2-radar_volume_index,0.279173
L8-B1,743.5
L8-B10,2960
L8-B11,2925.5


## Outputs

### Save to Google Drive

>*Important: Make sure to give a path that's inside the Drive directory.*

Write to Google Drive with the `to_csv` method.

In [None]:
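# Resolve the output path against the Drive root (absolute paths pass through).
outputs.to_csv(os.path.join(DRIVE, relative_path_to_output_csv_in_drive), index=False)

drive.flush_and_unmount()  # Don't forget to unmount Drive when you're done.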

### Download to local disk

Run this next cell to save to your local machine as a CSV.

In [None]:
# Write a CSV into the colaboratory workspace.
outputs.to_csv("outputs.csv", index=None)

# This function triggers a prompt for you to save the file to local disk.
files.download(filename="outputs.csv")


## References

* https://developers.google.com/earth-engine/guides/resample#resampling
* https://developers.google.com/earth-engine/tutorials/community/extract-raster-values-for-points#understanding_which_pixels_are_included_in_polygon_statistics
  * https://developers.google.com/earth-engine/tutorials/community/extract-raster-values-for-points#notes_on_crs_and_scale
* https://developers.google.com/earth-engine/tutorials/community/extract-raster-values-for-points#zonalstatsfc_params_%E2%87%92_eefeaturecollection
* https://developers.google.com/earth-engine/tutorials/community/beginners-cookbook#example_exporting_data

### Notes

Important concepts in GEE:

* Scale: https://developers.google.com/earth-engine/guides/scale
* Projections: https://developers.google.com/earth-engine/guides/projections
  * *The default projection*: https://developers.google.com/earth-engine/guides/projections#the-default-projection
  * *Composites have no projection*: https://developers.google.com/earth-engine/guides/ic_reducing#Composites-have-no-projection

I was wary of using `reproject` at first because of how it's described in [the GEE documentation](https://developers.google.com/earth-engine/guides/projections#reprojecting), but now I see that it's a must to achieve the common grid. (GEE does everything else for me in a sensible way _EXCEPT_ for this, IMO.)

* https://developers.google.com/earth-engine/guides/image_math#colab-python_1