# Instructions

This program has two main functions. First, it displays satellite data overlaid on an imported shapefile (imported from Google Earth Engine). Second, and more unusually, it downloads the aggregate data into a csv file (or several csv files if time-series analysis is desired).
    
As the user, you can change a few of the parameters to obtain the data that you want.
    
### Shapefile

The program can draw its regions from any shapefile uploaded to Google Earth Engine or from a csv file in the format described below. There are two ways to select the source.

#### Google Earth Engine Asset
1. Upload shapefile to Google Earth Engine under the Assets tab
2. Obtain the path to the shapefile. If you click on the asset, this is the TableID.
3. In the User Variables section, assign the variable `FEATURE_COLLECTION_PATH` to the path <br><br>
**THE PATH MUST BE SURROUNDED BY QUOTES**<br>
good example: `FEATURE_COLLECTION_PATH = 'users/ccbitt23/testDHS2'` <br>
bad example: `FEATURE_COLLECTION_PATH = users/ccbitt23/testDHS2`

#### Local CSV file
1. Make sure your csv file has exactly three columns such that the first one is the name, the second the latitude, and third the longitude.
2. Obtain the path to the csv. If it is located in your working directory, it is simply the name of the csv. Make sure to add the .csv to the end of it.
3. In the User Variables section, assign the variable `FEATURE_COLLECTION_PATH` to the path <br><br>
**THE PATH MUST BE SURROUNDED BY QUOTES**<br>
good example: `FEATURE_COLLECTION_PATH = 'indotest_latlon.csv'` <br>
bad example: `FEATURE_COLLECTION_PATH = indotest_latlon.csv`
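
Before pointing the program at a csv file, it can help to confirm that the file really has the three-column layout described above. The following is a minimal sketch, not part of the original notebook: the function name `check_points_csv` and the sample site names and coordinates are made up for illustration.

```python
import csv
import io

def check_points_csv(text):
    """Return a list of (name, lat, lon) tuples, or raise ValueError on a malformed row."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    if len(header) != 3:
        raise ValueError(f"expected 3 columns, found {len(header)}")
    points = []
    for row_number, row in enumerate(reader, start=2):
        if not row:  # skip blank lines
            continue
        if len(row) != 3:
            raise ValueError(f"row {row_number}: expected 3 values, found {len(row)}")
        name, lat, lon = row[0], float(row[1]), float(row[2])
        if not -90 <= lat <= 90 or not -180 <= lon <= 180:
            raise ValueError(f"row {row_number}: latitude/longitude out of range")
        points.append((name, lat, lon))
    return points

# Hypothetical sample data: name first, then latitude, then longitude
sample = "name,latitude,longitude\nsite_a,-6.2,106.8\nsite_b,-7.8,110.4\n"
print(check_points_csv(sample))
```

To check a real file, read it with `open(path).read()` and pass the text to the function.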

### Date Range

While this program can be used to gather data for a single month or a single year, its main strength is the ability to gather data for time-series analysis. Setting the beginning and end dates is also simple.
* If you want to analyze by month:
    1. In the User Variables section, assign the variable `START_DATE` to your desired start date as a string of MM-YYYY
    2. In the User Variables section, assign the variable `END_DATE` to your desired end date as a string of MM-YYYY <br>
    
good example:<br>`START_DATE = '04-2014'` <br> `END_DATE = '08-2014'`

* If you want to analyze by year:
    1. In the User Variables section, assign the variable `START_DATE` to your desired start date as a string of YYYY
    2. In the User Variables section, assign the variable `END_DATE` to your desired end date as a string of YYYY <br>
    
good example:<br>`START_DATE = '2014'` <br> `END_DATE = '2017'`
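
To see which periods a monthly date range covers, the months can be enumerated in plain Python. This is only an illustrative sketch: `month_labels` is a hypothetical helper, not a function from the notebook, and it assumes both dates are given as MM-YYYY strings.

```python
def month_labels(start, end):
    """List every month from start to end (inclusive) as zero-padded 'MM-YYYY' strings."""
    start_month, start_year = (int(part) for part in start.split('-'))
    end_month, end_year = (int(part) for part in end.split('-'))
    # Count months from year 0 so that ranges crossing a year boundary need no special case
    first = start_year * 12 + (start_month - 1)
    last = end_year * 12 + (end_month - 1)
    labels = []
    for index in range(first, last + 1):
        year, month = divmod(index, 12)
        labels.append(f"{month + 1:02d}-{year}")
    return labels

print(month_labels('10-2014', '04-2015'))
# ['10-2014', '11-2014', '12-2014', '01-2015', '02-2015', '03-2015', '04-2015']
```

Each label corresponds to one time period, and therefore to one csv file when downloading is enabled.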

### Changing Satellite Data

Changing the satellite dataset takes the most steps, but it is still straightforward. The [Google Earth Engine Data Catalog](https://developers.google.com/earth-engine/datasets/catalog) lists the datasets that are compatible with Google Earth Engine.
1. Acquire the path for the desired dataset. This is the part of the Earth Engine Snippet in quotes.
    * In the case of `ee.ImageCollection("AAFC/ACI")` the path would be `"AAFC/ACI"`
2. In the User Variables section, assign the variable `SATELLITE_PATH` to the path identified in Step 1.
3. Determine the band that you want the program to use for data analysis.
4. In the User Variables section, assign the variable `BAND_SELECT` to the desired band **making sure to keep capitalization consistent.**
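
If you copy the whole Earth Engine Snippet from the catalog page, the quoted path from Step 1 can be pulled out with a regular expression. This is a small sketch under the assumption that the snippet has the form `ee.Something("path")`; `dataset_path` is a hypothetical helper and not part of the notebook.

```python
import re

def dataset_path(snippet):
    """Return the quoted dataset path from an Earth Engine snippet string."""
    # Matches e.g. ee.ImageCollection("AAFC/ACI") or ee.Image('some/path')
    match = re.search(r"""ee\.\w+\(\s*["']([^"']+)["']\s*\)""", snippet)
    if match is None:
        raise ValueError(f"no dataset path found in {snippet!r}")
    return match.group(1)

SATELLITE_PATH = dataset_path('ee.ImageCollection("AAFC/ACI")')
print(SATELLITE_PATH)  # AAFC/ACI
```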

### A Last Note

This program is really aimed at creating aggregate statistics. If you want to find the sum of tree-loss pixels (for example), this is possible. You'll just have to delve into what I've written outside of the easy-to-use user variables.

## User Variables

In [39]:
FEATURE_COLLECTION_PATH = 'indotest_latlon.csv'

EXPAND_FILE = True
EXPAND_DISTANCE = 10_000     # in meters

START_DATE = '10-2014'
END_DATE = '04-2015'

SATELLITE_PATH = 'MODIS/006/MOD13Q1'
BAND_SELECT = 'EVI'

DOWNLOAD_DATA = True   # setting this to False will prevent the program from downloading csv files

## Defining Various Functions

In [40]:
#csv_path functions and variable

ATTRIBUTE_NAME = ''

def line_to_point(line):
    '''Translates a string of the form 'name,latitude,longitude' into an ee.Feature whose geometry is an
       ee.Geometry.Point at that location and whose ATTRIBUTE_NAME property holds the name.'''
    line_list = line.split(',')
    identity = line_list[0]
    # Earth Engine expects coordinates in [longitude, latitude] order
    latlong = [float(line_list[2]), float(line_list[1])]
    return ee.Feature(ee.Geometry.Point(latlong), {ATTRIBUTE_NAME: identity})

def csv_path():
    '''Reads a csv file with three columns (name, lat, long) and converts it to a FeatureCollection.
       Returns that collection.'''
    # gain access to global var
    global ATTRIBUTE_NAME
    # read the csv
    with open(FEATURE_COLLECTION_PATH, 'r') as fp:
        headers = fp.readline()
        ATTRIBUTE_NAME = headers.split(',')[0]
        csv_contents = fp.read()
    # break up by line, skipping blank lines (such as the one left by a trailing newline)
    list_contents = [line for line in csv_contents.splitlines() if line.strip()]
    # turn into features
    feature_list = list(map(line_to_point, list_contents))
    return ee.FeatureCollection(feature_list)
    
# ee.Filter.date creation functions
def indvMonthFilter(date):
    '''Function takes one string formatted as MM-YYYY to represent the desired month of the desired year.
       Returns a tuple of format (ee.Filter object, date string padded to MM-YYYY).'''
    month, year = date.split('-')
    # pad the month to two digits so that '4-2014' and '04-2014' behave identically
    month = month.rjust(2, '0')
    # create start date
    start_date = year + '-' + month + '-01'
    # determine last day of the month
    if month == '02':
        temp_year = int(year)
        is_leap = temp_year % 4 == 0 and (temp_year % 100 != 0 or temp_year % 400 == 0)
        end_day = 29 if is_leap else 28
    else:
        days_30 = ['04', '06', '09', '11']
        end_day = 30 if month in days_30 else 31
    # create end date
    # NOTE: ee.Filter.date treats the end date as exclusive, so imagery acquired on end_date itself is not matched
    end_date = year + '-' + month + '-' + str(end_day)

    date = '-'.join((month, year))

    return (ee.Filter.date(start_date, end_date), date)

def indvYearFilter(year):
    '''Function takes one string formatted as YYYY to represent the desired year.
       Returns a tuple of format (ee.Filter object, input date).'''
    # Create dates that are readable by ee.Filter.date
    start_date = year + '-01-01'
    end_date = year + '-12-31'
    
    return (ee.Filter.date(start_date, end_date), year)

def yearFilter():
    '''Function accepts no input and returns a list of ee.Filter.date objects that encompass the desired duration.
       Only works if the desired length of analysis is years.'''
    # create the range of years over which to scan
    start = int(START_DATE)
    end = int(END_DATE)
    year_range = [str(num) for num in range(start, end + 1)]
    
    # append a filter for each year to a list
    date_list = []
    for yr in year_range:
        date_list.append(indvYearFilter(yr))
    
    return date_list


def monthConverter(tup):
    '''Function accepts a (month, year) tuple and returns it with the month padded to two digits.'''
    month = tup[0]
    month = str(month).rjust(2, '0')
    return (month, tup[1])


def monthFilter():
    '''Function accepts no input and returns a list of ee.Filter.date objects that encompass the desired duration.
       Only works if the desired length of analysis is months.'''
    # reading inputs
    start = START_DATE
    end = END_DATE
    start_month, start_year = start.split('-')
    end_month, end_year = end.split('-')
    start_month = int(start_month)
    end_month = int(end_month)
    start_year = int(start_year)
    end_year = int(end_year)
    
    date_list = []
    # if the desired length of comparison is less than a year
    if start_year == end_year:
        month_list = range(start_month, end_month + 1)    # what months are actually desired
        for year, month in zip(itertools.repeat(start_year), month_list):
            target_month = str(month) + '-' + str(year)
            date_list.append(indvMonthFilter(target_month))     # make a filter for them
    else:
        # beginning year
        month_list = range(start_month, 13)
        for year, month in zip(itertools.repeat(start_year), month_list):
            target_month = str(month) + '-' + str(year)
            date_list.append(indvMonthFilter(target_month))
        # any middle years (iterate year by year so the filters stay in chronological order)
        middle_years = range(start_year + 1, end_year)
        months_list = range(1, 13)
        for year, month in itertools.product(middle_years, months_list):
            target_month = str(month) + '-' + str(year)
            date_list.append(indvMonthFilter(target_month))
        # final year
        for year, month in zip(itertools.repeat(end_year), range(1, end_month + 1)):
            target_month = str(month) + '-' + str(year)
            date_list.append(indvMonthFilter(target_month))

    return date_list

# Translate ee.Filter.date into ee.ImageCollection
# CHECK THIS FUNCTION IF THE PROGRAM THROWS ANY ERRORS OR EXCEPTIONS
def tupleToCollection(inp_tuple):
    '''Function accepts a tuple in the form (ee.Filter.date, 'date') and returns a tuple of the form
       (ee.ImageCollection, the same date string).'''
    # unpack
    temp_filter, temp_date = inp_tuple
    # make ee.ImageCollection
    temp = ee.ImageCollection(SATELLITE_PATH) \
        .filter(temp_filter) \
        .select(BAND_SELECT).map(lambda image: image.clip(regions)) \
        .reduce(ee.Reducer.mean()) \
        .multiply(0.0001)
    
    return (temp, temp_date)
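
As an alternative to the hand-written end-of-month logic in `indvMonthFilter`, the standard library's `calendar.monthrange` can supply the last day of any month, leap years included. The sketch below is illustrative only: `month_bounds` is a hypothetical helper that returns plain date strings and does not call Earth Engine.

```python
import calendar

def month_bounds(date):
    """Return (start, end) ISO date strings covering the month given as 'MM-YYYY'."""
    month, year = (int(part) for part in date.split('-'))
    # monthrange returns (weekday of the first day, number of days in the month)
    last_day = calendar.monthrange(year, month)[1]
    return (f"{year}-{month:02d}-01", f"{year}-{month:02d}-{last_day:02d}")

print(month_bounds('02-2016'))  # ('2016-02-01', '2016-02-29')
print(month_bounds('4-2014'))   # ('2014-04-01', '2014-04-30')
```

The two strings have the same shape as the `start_date` and `end_date` values that the notebook passes to `ee.Filter.date`.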

## Basic Set Up

In [41]:
# dependencies
import geemap
import ee
import itertools
import pandas as pd

# initialize key components
Map = geemap.Map()
ee.Initialize()

# import shapefile asset
if '.csv' not in FEATURE_COLLECTION_PATH:
    raw_regions = ee.FeatureCollection(FEATURE_COLLECTION_PATH)
else:
    raw_regions = csv_path()

# shapefile expansion
if EXPAND_FILE:
    regions = raw_regions.map(lambda point: point.buffer(EXPAND_DISTANCE))
else: regions = raw_regions
    
# ensure uniform formatting between START_DATE and END_DATE:
fail_1 = '-' in START_DATE and '-' not in END_DATE
fail_2 = '-' not in START_DATE and '-' in END_DATE
if fail_1 or fail_2: 
    raise Exception('Check the formatting of START_DATE and END_DATE')
    
# identify month/year analysis
TIME_ANALYSIS = 'month' if '-' in START_DATE else 'year'
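
The check above only looks for a hyphen. A stricter validation could match each date against the two documented formats (MM-YYYY and YYYY). The following is a sketch under that assumption; `detect_analysis` is a hypothetical helper and is not used by the notebook.

```python
import re

MONTH_PATTERN = re.compile(r"(0?[1-9]|1[0-2])-\d{4}")  # e.g. '04-2014' or '4-2014'
YEAR_PATTERN = re.compile(r"\d{4}")                     # e.g. '2014'

def detect_analysis(start, end):
    """Return 'month' or 'year' if both dates share a valid format, else raise ValueError."""
    if MONTH_PATTERN.fullmatch(start) and MONTH_PATTERN.fullmatch(end):
        return 'month'
    if YEAR_PATTERN.fullmatch(start) and YEAR_PATTERN.fullmatch(end):
        return 'year'
    raise ValueError('Check the formatting of START_DATE and END_DATE')

print(detect_analysis('10-2014', '04-2015'))  # month
print(detect_analysis('2014', '2017'))        # year
```

Unlike the hyphen test, this also rejects values such as `'13-2014'` or `'2014-10'`.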

## Create ImageCollection Map Object

In [42]:
if START_DATE == END_DATE:
    if TIME_ANALYSIS == 'month':
        filter_list = [indvMonthFilter(START_DATE)]
    else:
        filter_list = [indvYearFilter(START_DATE)]
elif TIME_ANALYSIS == 'month':
    filter_list = monthFilter()
else:
    filter_list = yearFilter()
    
collection_map = map(tupleToCollection, filter_list)

## Visualization

In [43]:
# View the average EVI for the first specified time period
# Note that the colored dataset is clipped to the region of interest
# In the bands tab, if you scroll to the bottom, there is a javascript dictionary that looks similar to colorizedVis
# replacing colorizedVis with whatever dictionary is provided will likely get the 'standard' colors for whatever dataset
colorizedVis = {
  'min': 0.0,
  'max': 1.0,
  'palette': [
    'FFFFFF', 'CE7E45', 'DF923D', 'F1B555', 'FCD163', '99B718', '74A901',
    '66A000', '529400', '3E8601', '207401', '056201', '004C00', '023B01',
    '012E01', '011D01', '011301'
  ],
}

# make a ee.Filter.date object in order to have something over which to map the satellite data
if TIME_ANALYSIS == 'month':
    temp_filter = indvMonthFilter(START_DATE)[0]
else: temp_filter = indvYearFilter(START_DATE)[0]
    
# Import satellite data
# CHECK THIS FUNCTION IF THE PROGRAM THROWS ANY ERRORS OR EXCEPTIONS
evi = ee.ImageCollection(SATELLITE_PATH) \
    .filter(temp_filter) \
    .select(BAND_SELECT).map(lambda image: image.clip(regions)) \
    .reduce(ee.Reducer.mean()) \
    .multiply(0.0001)

Map.addLayer(evi, colorizedVis, 'EVI')
Map.addLayer(regions, {}, 'Geometric Boundaries')
Map


## Downloading Statistics

In [44]:
if DOWNLOAD_DATA:
    for collection in collection_map:
        areas, date = collection
        csv_name = './' + date + '.csv'
        geemap.zonal_statistics(areas, regions, csv_name, statistics_type='MEAN', scale=1000)
        df = pd.read_csv(csv_name)
        if TIME_ANALYSIS == 'month':
            df['Month'] = date
        else: df['Year'] = date
        df.to_csv(csv_name)

Computing statistics ...
Generating URL ...
Downloading data from https://earthengine.googleapis.com/v1alpha/projects/earthengine-legacy/tables/c4700a92c827cc8b19c5fe0f91fe3a36-4b4d569908b49ffe3024de71f496416c:getFeatures
Please wait ...
Data downloaded to /Users/cbitting/Documents/GEE/10-2014.csv
Computing statistics ...
Generating URL ...
Downloading data from https://earthengine.googleapis.com/v1alpha/projects/earthengine-legacy/tables/d6293b2a46c03c4e2733d8d1a17810ea-f3ec96d6fba408c380e09f528d7e6f3b:getFeatures
Please wait ...
Data downloaded to /Users/cbitting/Documents/GEE/11-2014.csv
Computing statistics ...
Generating URL ...
Downloading data from https://earthengine.googleapis.com/v1alpha/projects/earthengine-legacy/tables/b22524b889f6a1a28e0ed8fef4421d39-56ada6d1e14a79782bd7eb0c51665091:getFeatures
Please wait ...
Data downloaded to /Users/cbitting/Documents/GEE/12-2014.csv
Computing statistics ...
Generating URL ...
Downloading data from https://earthengine.googleapis.com/v1
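
Because each time period is written to its own csv file, a time-series analysis usually starts by stacking those files into one table. Below is a minimal sketch using only the standard library. `combine_csvs` is a hypothetical helper, and the column names and values in the demonstration are made-up placeholders, not real output from the program.

```python
import csv
import glob
import os
import tempfile

def combine_csvs(folder, output_name='combined.csv'):
    """Stack every csv file in folder into one file and return the path of that file."""
    output_path = os.path.join(folder, output_name)
    rows, fieldnames = [], []
    for path in sorted(glob.glob(os.path.join(folder, '*.csv'))):
        if path == output_path:  # do not re-read an earlier combined file
            continue
        with open(path, newline='') as fp:
            reader = csv.DictReader(fp)
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            rows.extend(reader)
    with open(output_path, 'w', newline='') as fp:
        writer = csv.DictWriter(fp, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return output_path

# Demonstration with placeholder files in a temporary folder
with tempfile.TemporaryDirectory() as folder:
    for label, value in [('10-2014', '0.41'), ('11-2014', '0.45')]:
        with open(os.path.join(folder, label + '.csv'), 'w', newline='') as fp:
            fp.write('name,mean,Month\nsite_a,' + value + ',' + label + '\n')
    with open(combine_csvs(folder), newline='') as fp:
        print(fp.read())
```

Files are read in sorted filename order, so labels such as `01-2015` sort before `10-2014`. Sort on the `Month` or `Year` column afterwards if chronological order matters.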

# Credits

* Base script (JavaScript) written by Manuel Gimond
* Translated into Python/Jupyter Notebook by Caleb Bitting
* Updated by Caleb Bitting:
    * Allowed for local export of CSV file
    * Allowed for user to change parameters in a simple manner
    * Allowed for expansion of point-based shapefile
    * Allowed for generation of time-series data
    * Allowed for the input of CSV file and generation of a shapefile based upon that input
    * Created the instructions