<font face="Calibri" size="2"> <i>eSBAE - Notebook Series - Part 3, version 0.4, March 2023. Andreas Vollrath, UN Food and Agriculture Organization, Rome</i>
</font>

![title](images/header.png)

# III - eSBAE Time-Series Extraction
### Extract various time-series data for large sets of points from Google Earth Engine
-------

This notebook takes you through the process of extracting time-series for a set of points using [Google's Earth Engine](https://earthengine.google.com/). The script is optimized to deal with thousands of points and will use parallelization to efficiently extract the information from the platform.

**You will need**:
- a valid Earth Engine account ([sign up here](https://code.earthengine.google.com/register))
- an uploaded table of points (Feature Collection) 
- the table needs a unique point identifier (Point ID)

**You should be aware that:**

- As a SEPAL user: this notebook does **not need huge resources**, as processing is done on the platform. An **m2 instance** or lower is sufficient.  
- The extraction can take several days for large point sets (>100,000 points). If you are on SEPAL, make use of the **"keep instance running"** option within the user report dashboard. However, **do not forget** to shut down your machine once processing has finished. 
- A logfile is created within your tmp-folder. An interrupted connection to the SEPAL server may block the output of the Jupyter notebook. **This does not mean the processing stopped.** You can check in esbae_log_(time) whether processing is still ongoing. 
- You can restart the kernel and execute all cells, and extraction will **start where it stopped**. This also applies if your instance was shut down before processing had completely finished.
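
To check on a run whose notebook output has frozen, you can read the end of the newest logfile from a separate cell or terminal. The following is a minimal sketch, not part of `sampling_handler`: the helper `tail_latest_log` is hypothetical, the `esbae_log_*` pattern is inferred from the file name mentioned above, and `~/tmp` is an assumed location that you may need to adjust.

```python
from pathlib import Path


def tail_latest_log(tmp_dir, pattern="esbae_log_*", n_lines=5):
    """Return the newest matching logfile and its last n_lines lines."""
    tmp_dir = Path(tmp_dir)
    if not tmp_dir.is_dir():
        return None, []
    # sort by modification time so that the last entry is the most recent log
    logs = sorted(tmp_dir.glob(pattern), key=lambda p: p.stat().st_mtime)
    if not logs:
        return None, []
    latest = logs[-1]
    lines = latest.read_text(errors="replace").splitlines()
    return latest, lines[-n_lines:]


# assumed location of the tmp-folder; change it to match your setup
log_file, last_lines = tail_latest_log(Path.home() / "tmp")
print(log_file)
print("\n".join(last_lines))
```

If the last lines keep changing between two calls, the extraction is still running.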

### 1 - Import libs

**ONLY EXECUTE THIS CELL**

In [None]:
# initialize EE    
import ee
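try:
    # connect to the high-volume endpoint, which is intended for many automated requests
    ee.Initialize(opt_url='https://earthengine-highvolume.googleapis.com')
except Exception:
    # no valid credentials found: authenticate once, then initialize again
    ee.Authenticate()
    ee.Initialize(opt_url='https://earthengine-highvolume.googleapis.com')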
    
from sampling_handler import TimeSeriesExtraction

### 2 - Basic Input Variables

**FILL IN YOUR INPUTS**

In [None]:
esbae = TimeSeriesExtraction(
    # your project name that you use for all of the notebooks
    project_name  = 'Belize_MRV',
    
    # your start and end date. 
    # NOTE that the start date should go further back than the 
    # envisaged monitoring period, for calibration purposes
    ts_start      = '1995-01-01',      # YYYY-MM-DD format
    ts_end        = '2023-06-01',      # YYYY-MM-DD format
    
    # satellite platform (for now only Landsat is supported)
    satellite     = 'Landsat',
    
    # the resolution in metres at which you want to extract (should conform with the forest definition MMU)
    scale         = 70, # pixel size in metres
    
    # whether the TS is extracted over a bounding box of diameter `scale`, at the original resolution
    # of the underlying data (e.g. 30 m for Landsat) (True),
    # or whether the underlying data is rescaled to `scale` (False)
    # setting it to True might be more accurate, but tends to be slower
    bounds_reduce = True,
    
    # bands
    bands         =  [
        'green', 'red', 'nir', 'swir1', 'swir2',   # reflectance bands
        'ndfi', 'ndmi', 'ndvi',                    # indices
        'brightness', 'greenness', 'wetness'       # Tasseled Cap 
    ]    
)
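
Because a malformed or reversed date only surfaces once requests are sent, it can be worth checking the period before filling in the cell above. This is a small sketch using only the standard library; `check_period` is a hypothetical helper and not part of `TimeSeriesExtraction`.

```python
from datetime import date


def check_period(ts_start, ts_end):
    """Parse two YYYY-MM-DD strings and make sure the start precedes the end."""
    start = date.fromisoformat(ts_start)   # raises ValueError on a bad format
    end = date.fromisoformat(ts_end)
    if start >= end:
        raise ValueError(
            f"ts_start ({ts_start}) must be earlier than ts_end ({ts_end})"
        )
    return start, end


start, end = check_period('1995-01-01', '2023-06-01')
print((end - start).days // 365, "full years of data requested")
```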

### 3 - Advanced parameter settings

**Edit for advanced users only, otherwise just execute**

In [None]:
# landsat related parameters
lsat_params = {
    'l9': True,
    'l8': True,
    'l7': True,
    'l5': True,
    'l4': True,
    'brdf': True,
    'bands': esbae.bands,
    'max_cc': 75    # percent
} 

# apply the basic configuration set in the cell above
esbae.lsat_params = lsat_params
esbae.workers = 10                   # this defines how many parallel requests will be sent to Earth Engine at a time
esbae.max_points_per_chunk = 100     # this defines the maximum number of points sent to Earth Engine per request

# this defines the chunk sizes (in degrees) used to create the requests
#esbae.grid_size_levels = [0.1, 0.075, 0.05]   # optimized for 1km systematic grid
esbae.grid_size_levels = [0.2, 0.15, 0.1]    # optimized for 2km systematic grid
#esbae.grid_size_levels = [0.4, 0.3, 0.2]     # optimized for 4km systematic grid

# if you haven't created and uploaded the samples with notebook 2, you can select your own Feature Collection in this way
esbae.config_dict['design_params']['pid'] = 'sampleid'
esbae.config_dict['design_params']['ee_samples_fc'] = 'users/andreasvollrath/Ethiopia-premrv/Oromia_points'
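
To build intuition for how `workers`, `max_points_per_chunk` and `grid_size_levels` interact, the sketch below groups synthetic points into grid cells, refines crowded cells with the next smaller grid size, and dispatches the resulting chunks in parallel. It is an illustration under assumptions only: the refinement strategy, `chunk_points` and `fake_request` are hypothetical and do not reproduce the internals of `sampling_handler`.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
import math


def chunk_points(points, grid_size_levels, max_points_per_chunk):
    """Group (lon, lat) points into grid cells, refining crowded cells.

    Cells holding too many points are re-split with the next (smaller)
    grid size; at the last level they are cut into fixed-size slices.
    """
    size, finer = grid_size_levels[0], grid_size_levels[1:]
    cells = defaultdict(list)
    for lon, lat in points:
        key = (math.floor(lon / size), math.floor(lat / size))
        cells[key].append((lon, lat))

    chunks = []
    for cell_points in cells.values():
        if len(cell_points) <= max_points_per_chunk:
            chunks.append(cell_points)
        elif finer:
            chunks.extend(chunk_points(cell_points, finer, max_points_per_chunk))
        else:
            for i in range(0, len(cell_points), max_points_per_chunk):
                chunks.append(cell_points[i:i + max_points_per_chunk])
    return chunks


def fake_request(chunk):
    """Stand-in for one Earth Engine request; it only counts the points."""
    return len(chunk)


# 900 synthetic points on a regular 30 x 30 grid with 0.01 degree spacing
points = [(-88.5 + i * 0.01, 17.0 + j * 0.01) for i in range(30) for j in range(30)]
chunks = chunk_points(points, [0.2, 0.15, 0.1], max_points_per_chunk=100)

# the pool size plays the role of esbae.workers
with ThreadPoolExecutor(max_workers=10) as pool:
    counts = list(pool.map(fake_request, chunks))

print(len(chunks), "chunks, largest has", max(counts), "points")
```

Under this reading, a denser sampling grid calls for smaller grid sizes, since each cell should hold no more than `max_points_per_chunk` points.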

### 4 - Check for already processed data (optional)

This is useful for large point sets and when the connection to SEPAL gets interrupted. Processing will usually continue, but it is not straightforward to track progress. 
You can instead restart the kernel and check whether processing has finished with the following line of code.

In [None]:
esbae.check_if_completed()

### 5 - Run the time-series data extraction

**Execute only**

In [None]:
esbae.get_time_series_data()