# Geographical and Ecological Earth Observation (GEEO) - Introduction

`GEEO` is a processing pipeline and collection of algorithms for obtaining Analysis-Ready-Data (ARD) from Landsat and Sentinel-2 using the Google Earth Engine Python API. The modules are organized along different hierarchical levels, and processing instructions are either defined via 

1) a parameter file (.yml)

or 

2) a Python dictionary.

The dictionary is a more interactive alternative that allows you to include `GEEO` in your other image processing workflows. The keys of the dictionary must use the same variable names as the blueprint file. Keys that are not defined receive the default values from the blueprint.
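The default-filling behaviour described above can be pictured with plain Python. The following is only a minimal sketch and not `GEEO`'s actual implementation: `blueprint_defaults` and `fill_defaults` are hypothetical names, the default values are a small subset copied from the output shown later in this notebook, and rejecting unknown keys is an assumption made for illustration.

```python
# Hypothetical subset of the blueprint defaults (values as printed in this notebook).
blueprint_defaults = {
    "YEAR_MIN": 2023,
    "YEAR_MAX": 2023,
    "ROI": [12.9, 52.2, 13.9, 52.7],
    "MAX_CLOUD": 75,
}


def fill_defaults(user_params, defaults):
    """Return a full parameter dict: user values win, missing keys get defaults."""
    # Assumption for this sketch: keys that the blueprint does not know are rejected.
    unknown = set(user_params) - set(defaults)
    if unknown:
        raise KeyError(f"Unknown parameter(s): {sorted(unknown)}")
    return {**defaults, **user_params}


# Only two keys are defined by the user; the rest fall back to the defaults.
prm_sketch = fill_defaults({"YEAR_MIN": 2020, "MAX_CLOUD": 50}, blueprint_defaults)
print(prm_sketch)
```

In this sketch the user-supplied dictionary only needs to contain the settings that differ from the blueprint.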

## Creating a parameter file
`GEEO` contains the function `create_parameter_file`, which uses a blueprint .yml-file to create a new .yml-file that the user can edit.

In [2]:
from geeo import create_parameter_file

# create a parameter file
create_parameter_file('introduction', overwrite=False)



Check out the newly created yml-file. It contains all the variables needed to instruct level-2 and level-3 processing, as well as export settings. 
The LEVEL-2 section starts off with basic settings regarding study area and overall time window (SPACE AND TIME), the desired sensor and masking settings (SENSOR AND DATA QUALITY SETTINGS), as well as which features/bands to include (BANDS | INDICES | FEATURES). 

We stick to the default settings to illustrate basic level-2 processing. To run these defaults as they are, we would simply call `run_param()` on the yml-file. In practice, you will usually want to at least adjust the study area and time window, and probably also request advanced image features for export. 

To illustrate how `GEEO` reads the yml-file, we first load the parameters as a Python dictionary into our interactive environment. To do this, we use the `load_parameters` function.

In [1]:
from geeo import load_parameters

# load the parameter file
prm = load_parameters('introduction.yml')

prm

{'YEAR_MIN': 2023,
 'YEAR_MAX': 2023,
 'MONTH_MIN': 1,
 'MONTH_MAX': 12,
 'DOY_MIN': 1,
 'DOY_MAX': 366,
 'DATE_MIN': None,
 'DATE_MAX': None,
 'ROI': [12.9, 52.2, 13.9, 52.7],
 'ROI_SIMPLIFY_GEOM_TO_BBOX': False,
 'SENSORS': ['L9', 'L8', 'L7', 'L5', 'L4'],
 'MAX_CLOUD': 75,
 'EXCLUDE_SLCOFF': False,
 'GCP_MIN_LANDSAT': 1,
 'MASKS_LANDSAT': ['cloud', 'cshadow', 'snow', 'fill', 'dilated'],
 'MASKS_LANDSAT_CONF': 'Medium',
 'MASKS_S2': 'CPLUS',
 'MASKS_S2_CPLUS': 0.6,
 'MASKS_S2_PROB': 30,
 'MASKS_S2_NIR_THRESH_SHADOW': 0.2,
 'ERODE_DILATE': False,
 'ERODE_RADIUS': 60,
 'DILATE_RADIUS': 120,
 'ERODE_DILATE_SCALE': 90,
 'BLUE_MAX_MASKING': None,
 'FEATURES': ['BLU', 'GRN', 'RED', 'NIR', 'SW1', 'SW2'],
 'DEM': False,
 'UMX': None,
 'UMX_SUM_TO_ONE': True,
 'UMX_NON_NEGATIVE': True,
 'UMX_REMOVE_INPUT_FEATURES': True,
 'TSM': False,
 'FOLD_YEAR': False,
 'FOLD_MONTH': False,
 'FOLD_CUSTOM': {'year': None, 'month': None, 'doy': None, 'date': None},
 'TSI': None,
 'TSI_BASE_IMGCOL': 'TSS',
 '

The `load_parameters` function converts the yml-file into a python dictionary containing all defined variables. **The dictionary is the central data structure used to save input and output variables when interacting with `GEEO`.** All processing routines (`level2/level2.py`, `level3/level3.py`, and `misc/export.py`) rely on this structure containing the required instructions and also return the same dictionary with updated variables (more on this below).

As planned above we are first going to inspect the variables defined by default and then run the processing routine. 
For now we are only concerned with the LEVEL-2 and EXPORT section of the yml-file (and associated variables in the dictionary above). Specifically, we want to inspect the variables:

- **YEAR_MIN**: 2023
- **YEAR_MAX**: 2023
- **MONTH_MIN**: 1
- **MONTH_MAX**: 12
- **ROI**: [12.9, 52.2, 13.9, 52.7]
- **SENSORS**: ['L9', 'L8', 'L7', 'L5', 'L4']
- **MAX_CLOUD**: 75
- **MASKS_LANDSAT**: ['cloud', 'cshadow', 'snow', 'fill', 'dilated']
- **MASKS_LANDSAT_CONF**: 'Medium'
- **FEATURES**: ['BLU', 'GRN', 'RED', 'NIR', 'SW1', 'SW2']
- **EXPORT_IMAGE**: False



For a detailed description of the valid options for each variable, please inspect the comments in the yml-file. As you can see from the variables above, we are asking for Landsat-4, -5, -7, -8, and -9 data for all months in the year 2023. The region-of-interest (ROI) is a bounding box in lat/lon coordinates covering Berlin. We would like to use only scenes with less than 75% cloud cover and to mask clouds, cloud shadows, snow, fill values, and dilated cloud pixels with medium confidence (rather conservative masking). We only want to process the blue (BLU), green (GRN), red (RED), near-infrared (NIR), shortwave-infrared 1 (SW1), and shortwave-infrared 2 (SW2) bands. Exporting any image (more on the different products later) is not desired at this stage.
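Since the parameters are held in an ordinary dictionary, the variables discussed above can be pulled out with standard Python. The sketch below uses a stand-in dictionary, `prm_example`, with values copied from the list above, so that it runs without `GEEO`. In the notebook you would use the `prm` dictionary returned by `load_parameters` instead.

```python
# Stand-in for the dictionary returned by load_parameters (values copied from the list above).
prm_example = {
    "YEAR_MIN": 2023,
    "YEAR_MAX": 2023,
    "MONTH_MIN": 1,
    "MONTH_MAX": 12,
    "ROI": [12.9, 52.2, 13.9, 52.7],
    "SENSORS": ["L9", "L8", "L7", "L5", "L4"],
    "MAX_CLOUD": 75,
    "MASKS_LANDSAT": ["cloud", "cshadow", "snow", "fill", "dilated"],
    "MASKS_LANDSAT_CONF": "Medium",
    "FEATURES": ["BLU", "GRN", "RED", "NIR", "SW1", "SW2"],
    "EXPORT_IMAGE": False,
    "DOY_MIN": 1,  # one extra key to show that the subset really filters
}

keys_of_interest = [
    "YEAR_MIN", "YEAR_MAX", "MONTH_MIN", "MONTH_MAX", "ROI", "SENSORS",
    "MAX_CLOUD", "MASKS_LANDSAT", "MASKS_LANDSAT_CONF", "FEATURES", "EXPORT_IMAGE",
]

# Build a smaller dictionary holding only the variables we want to look at.
subset = {key: prm_example.get(key) for key in keys_of_interest}

for key, value in subset.items():
    print(f"{key}: {value}")
```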

## Running a parameter file or dictionary

Now that we have a parameter file we can run the process and inspect the results.

We can either run individual modules (`level2/level2.py`, `level3/level3.py`, and `misc/export.py`), which is less frequently desired, or the entire processing chain, which is the more common case. Running all modules simply requires the function `run_param`:

In [2]:
# import the required modules
from geeo import run_param

# run the parameter file
prm_processed = run_param(prm)

prm_processed

{'YEAR_MIN': 2023,
 'YEAR_MAX': 2023,
 'MONTH_MIN': 1,
 'MONTH_MAX': 12,
 'DOY_MIN': 1,
 'DOY_MAX': 366,
 'DATE_MIN': None,
 'DATE_MAX': None,
 'ROI': [12.9, 52.2, 13.9, 52.7],
 'ROI_SIMPLIFY_GEOM_TO_BBOX': False,
 'SENSORS': ['L9', 'L8', 'L7', 'L5', 'L4'],
 'MAX_CLOUD': 75,
 'EXCLUDE_SLCOFF': False,
 'GCP_MIN_LANDSAT': 1,
 'MASKS_LANDSAT': ['cloud', 'cshadow', 'snow', 'fill', 'dilated'],
 'MASKS_LANDSAT_CONF': 'Medium',
 'MASKS_S2': 'CPLUS',
 'MASKS_S2_CPLUS': 0.6,
 'MASKS_S2_PROB': 30,
 'MASKS_S2_NIR_THRESH_SHADOW': 0.2,
 'ERODE_DILATE': False,
 'ERODE_RADIUS': 60,
 'DILATE_RADIUS': 120,
 'ERODE_DILATE_SCALE': 90,
 'BLUE_MAX_MASKING': None,
 'FEATURES': ['BLU', 'GRN', 'RED', 'NIR', 'SW1', 'SW2'],
 'DEM': False,
 'UMX': None,
 'UMX_SUM_TO_ONE': True,
 'UMX_NON_NEGATIVE': True,
 'UMX_REMOVE_INPUT_FEATURES': True,
 'TSM': False,
 'FOLD_YEAR': False,
 'FOLD_MONTH': False,
 'FOLD_CUSTOM': {'year': None, 'month': None, 'doy': None, 'date': None},
 'TSI': None,
 'TSI_BASE_IMGCOL': 'TSS',
 '

As you can see, the output variable `prm_processed` contains the same dictionary structure with some updated and additional variables.
The variable of interest for now is the Time-Series-Stack (`TSS`), the basic `ee.ImageCollection` containing the desired data as specified above.
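One way to see which variables were added or updated is to compare the input and output dictionaries. The helper below, `diff_params`, is a hypothetical name and not part of `GEEO`. It is demonstrated on small stand-in dictionaries, with a string placeholder where the real output would hold an `ee.ImageCollection`.

```python
def diff_params(before, after):
    """Return (added, changed): keys new in `after`, and shared keys whose values differ."""
    added = sorted(set(after) - set(before))
    changed = sorted(k for k in before if k in after and before[k] != after[k])
    return added, changed


# Stand-in dictionaries; the real ones would be `prm` and `prm_processed`.
before = {"YEAR_MIN": 2023, "ROI": [12.9, 52.2, 13.9, 52.7]}
after = {
    "YEAR_MIN": 2023,
    "ROI": [12.9, 52.2, 13.9, 52.7],
    "TSS": "<ee.ImageCollection placeholder>",
}

added, changed = diff_params(before, after)
print(added, changed)  # prints: ['TSS'] []
```

In the notebook, `diff_params(prm, prm_processed)` would be the corresponding call. The sketch has only been reasoned through for plain Python values.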

In [3]:
TSS = prm_processed.get('TSS')
TSS

<ee.imagecollection.ImageCollection at 0x2a432ff1a90>

We can use the [eerepr](https://github.com/aazuspan/eerepr) python package to allow for rendered interactive exploration of server-side variables.

In [4]:
import eerepr
TSS

As you can see, our TSS variable is an `ee.ImageCollection` containing 172 `ee.Image` objects that satisfied our filter criteria above. Each image contains the six specified bands plus the mask as a separate band (internally required for some higher-level processing later on). 

The essence is that `GEEO` always returns `ee.Image` or `ee.ImageCollection` objects, which can then be treated and modified further to suit specific needs. 
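Without a rendered widget, the same information can be requested explicitly from the server. The helper below is a small sketch: `describe_collection` is a hypothetical name, while `size()`, `first()`, `bandNames()` and `getInfo()` are standard Earth Engine Python API methods. It assumes that the object passed in behaves like an `ee.ImageCollection` and that an Earth Engine session has already been initialized.

```python
def describe_collection(collection):
    """Return the image count and the band names of the first image.

    `collection` is expected to behave like an ee.ImageCollection.
    Every getInfo() call is a blocking request to the Earth Engine servers.
    """
    n_images = collection.size().getInfo()
    band_names = collection.first().bandNames().getInfo()
    return {"n_images": n_images, "band_names": band_names}


# Usage with the notebook's variable (requires an initialized Earth Engine session):
# describe_collection(TSS)
```

Because `getInfo()` transfers results to the client, it is best reserved for small summaries like these rather than for pixel data.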

## Visualizing Images

For illustration purposes let us visualize one of the images in a map view. `GEEO` has a built-in function for basic visualization purposes.

We can add `ee.Image` objects to our map object using the `add()` function of the `VisMap` class. I want to visualize the image at index 27 of the `ee.ImageCollection` (the 28th image, since indexing is zero-based). First, I have to get this specific image from the collection:

In [5]:
import ee
img = ee.Image(TSS.toList(TSS.size()).get(27))
img

In [6]:
from geeo import VisMap

# Create map
M = VisMap()
M.add(img.select(['NIR', 'SW1', 'RED']), roi=prm_processed.get('ROI'), name='TSS_image')
M.show()

Map(center=[0.0, 0.0], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title', 'zoom_out_t…

## Updating a parameter dictionary

We can also update a parameter dictionary or yml-file directly in the code/console.

Let's say we wanted to switch the study area and extend the time window to 2021-2024, neither of which is specified in the `introduction.yml` file. Instead of modifying the file, we can modify the dictionary.

Let's extract the bounding box of our new hypothetical study area using the map window. We draw a rectangle with the tools on the left and select and copy the coordinates in the bottom right corner.

In [None]:
VisMap().show()

Map(center=[0.0, 0.0], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title', 'zoom_out_t…
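Coordinates copied by hand are easy to mix up, so a quick sanity check before using them can help. The sketch below assumes the ROI is given as `[xmin, ymin, xmax, ymax]` in decimal degrees, which is the order suggested by the example values in this notebook. `validate_bbox` is a hypothetical helper and not part of `GEEO`.

```python
def validate_bbox(roi):
    """Check a [xmin, ymin, xmax, ymax] bounding box in decimal degrees and return it as floats."""
    if len(roi) != 4:
        raise ValueError("ROI must have four values: [xmin, ymin, xmax, ymax]")
    xmin, ymin, xmax, ymax = (float(v) for v in roi)
    # Longitudes must lie within [-180, 180] and be in increasing order.
    if not (-180 <= xmin < xmax <= 180):
        raise ValueError(f"Invalid longitudes: xmin={xmin}, xmax={xmax}")
    # Latitudes must lie within [-90, 90] and be in increasing order.
    if not (-90 <= ymin < ymax <= 90):
        raise ValueError(f"Invalid latitudes: ymin={ymin}, ymax={ymax}")
    return [xmin, ymin, xmax, ymax]


# Bounding box of the new study area used in this notebook.
checked_roi = validate_bbox([11.212921, 47.543627, 11.491699, 47.692663])
print(checked_roi)
```

Note that this simple check rejects boxes that cross the antimeridian, which would need separate handling.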

Generally, if we specify standard processing settings in a yml-file but want to interactively change certain variables we can use the `merge_parameters` function.

In [9]:
from geeo import merge_parameters

# new study area
new_roi = [11.212921, 47.543627, 11.491699, 47.692663]

# update the parameter file
prm = merge_parameters(load_parameters('introduction.yml'), {'ROI': new_roi, 'YEAR_MIN': 2021, 'YEAR_MAX': 2024})

prm_processed = run_param(prm)

prm_processed

{'YEAR_MIN': 2021,
 'YEAR_MAX': 2024,
 'MONTH_MIN': 1,
 'MONTH_MAX': 12,
 'DOY_MIN': 1,
 'DOY_MAX': 366,
 'DATE_MIN': None,
 'DATE_MAX': None,
 'ROI': [11.212921, 47.543627, 11.491699, 47.692663],
 'ROI_SIMPLIFY_GEOM_TO_BBOX': False,
 'SENSORS': ['L9', 'L8', 'L7', 'L5', 'L4'],
 'MAX_CLOUD': 75,
 'EXCLUDE_SLCOFF': False,
 'GCP_MIN_LANDSAT': 1,
 'MASKS_LANDSAT': ['cloud', 'cshadow', 'snow', 'fill', 'dilated'],
 'MASKS_LANDSAT_CONF': 'Medium',
 'MASKS_S2': 'CPLUS',
 'MASKS_S2_CPLUS': 0.6,
 'MASKS_S2_PROB': 30,
 'MASKS_S2_NIR_THRESH_SHADOW': 0.2,
 'ERODE_DILATE': False,
 'ERODE_RADIUS': 60,
 'DILATE_RADIUS': 120,
 'ERODE_DILATE_SCALE': 90,
 'BLUE_MAX_MASKING': None,
 'FEATURES': ['BLU', 'GRN', 'RED', 'NIR', 'SW1', 'SW2'],
 'DEM': False,
 'UMX': None,
 'UMX_SUM_TO_ONE': True,
 'UMX_NON_NEGATIVE': True,
 'UMX_REMOVE_INPUT_FEATURES': True,
 'TSM': False,
 'FOLD_YEAR': False,
 'FOLD_MONTH': False,
 'FOLD_CUSTOM': {'year': None, 'month': None, 'doy': None, 'date': None},
 'TSI': None,
 'TSI_BAS

In [12]:
TSS = prm_processed.get('TSS')

# Create map
M = VisMap()
M.add(ee.Image(TSS.toList(TSS.size()).get(10)).select(['NIR', 'SW1', 'RED']), roi=prm_processed.get('ROI'), name='TSS_image')
M.show()

Map(center=[0.0, 0.0], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title', 'zoom_out_t…