# Run script - Google Earth Engine 

In this script we gather catchment data from satellite products (e.g. tree cover, NDVI, elevation) using Google Earth Engine (GEE). GEE gives direct access to satellite data, so there is no need to download the products first. Before using it, you need to create an account: https://signup.earthengine.google.com/#!/

This script only works in the conda environment **sr_env**, which contains all required packages. If you have **not** installed and activated this environment before opening this script, check the installation section in the *README* file.


### 1. Getting started
First, import all the required packages.

In [1]:
# import packages
import ee
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import geopandas as gpd
import os
import glob
from pathlib import Path

Before using the Earth Engine API or the earthengine command line tool, you must perform a one-time authentication that authorizes access to Earth Engine on behalf of your Google account. The cell below runs the authentication command. It provides a URL where, after you grant access, an authorization code is generated. Copy the authorization code and enter it in the input box that appears below the cell.

In [2]:
# Trigger the authentication flow.
ee.Authenticate()

# Initialize the library.
ee.Initialize()

Enter verification code:  4/1AX4XfWhR8SmpZ90Kjmj347KKskeFUfNu_FKCtepFeWaBmZueOZCP31zn_d8



Successfully saved authorization token.


*** Earth Engine *** Authenticate calls from this Earth Engine Python client will fail after 2022-05-09: please upgrade. https://developers.google.com/earth-engine/guides/python_install
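
Because authentication is a one-time step, later sessions can try to initialize first and only fall back to the authentication flow when that fails. The sketch below illustrates this pattern. The helper `init_earth_engine` is hypothetical (it is not part of *f_earth_engine.py*), and it takes the `ee` module as an argument so that the logic can be read and tested on its own. Depending on the installed client version, `ee.Initialize` may need additional arguments (for example a Cloud project), which this sketch does not cover.

```python
def init_earth_engine(ee_module):
    """Initialize Earth Engine, running the authentication flow only if needed.

    `ee_module` is the imported `ee` package (hypothetical helper, for illustration).
    """
    try:
        # Works directly when valid credentials were saved in an earlier session
        ee_module.Initialize()
    except Exception:
        # No usable credentials yet: authenticate once, then initialize again
        ee_module.Authenticate()
        ee_module.Initialize()


# Usage in the notebook (assumes the `ee` package is installed):
# import ee
# init_earth_engine(ee)
```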


After authentication we can import all the Python functions defined in the script *f_earth_engine.py*.

In [3]:
from f_earth_engine import *

### 2. Define working directory
Here we define the working directory, where all the scripts and data are saved. Make sure that this working directory contains the following subdirectories, each holding the corresponding data:\
/work_dir/data/forcing/*netcdf forcing files*\
/work_dir/data/shapes/*catchment shapefiles*\
/work_dir/data/gsim_discharge/*gsim discharge timeseries*
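
The layout above can be checked before running the rest of the notebook. The following is a minimal sketch, assuming the three subdirectories listed above; the helper name `missing_subdirs` and the constant `EXPECTED_SUBDIRS` are hypothetical and not part of the repository.

```python
from pathlib import Path

# Subdirectories expected under the working directory (taken from the list above)
EXPECTED_SUBDIRS = ("data/forcing", "data/shapes", "data/gsim_discharge")


def missing_subdirs(work_dir):
    """Return the expected data subdirectories that do not exist under work_dir."""
    root = Path(work_dir)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


# Usage: an empty list means the layout is complete
# print(missing_subdirs("/path/to/work_dir"))
```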

In [4]:
# Check current working directory (helpful when filling in work_dir below)
os.getcwd()

'/home/fransjevanoors/global_sr_module/scripts'

In [5]:
# define your working directory (keep only one assignment active)
# work_dir=Path("/work/users/vanoorschot/fransje/scripts/GLOBAL_SR/global_sr_module")
work_dir=Path("/home/fransjevanoors/global_sr_module")

### 3. Load your list of catchment IDs
Here we load the list of catchment IDs that was generated in *run_script_main*.

In [6]:
catch_id_list = np.genfromtxt(f'{work_dir}/output/catch_id_list.txt',dtype='str')
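
One caveat with this approach: `np.genfromtxt` squeezes its result, so a file that contains a single ID yields a zero-dimensional array, and the loop over `catch_id_list` further down would then fail. A minimal alternative sketch, assuming the file holds one catchment ID per line (an assumption about the file format), reads the IDs with plain Python and always returns a list. The helper `read_catch_ids` is hypothetical.

```python
from pathlib import Path


def read_catch_ids(path):
    """Read one catchment ID per line, skipping blank lines; always returns a list."""
    text = Path(path).read_text()
    return [line.strip() for line in text.splitlines() if line.strip()]


# Usage in the notebook:
# catch_id_list = read_catch_ids(work_dir / "output" / "catch_id_list.txt")
```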

### 4. Earth Engine treecover
We are interested in the tree cover of each catchment. For this we use the MODIS MOD44B tree cover data (https://modis.gsfc.nasa.gov/data/dataprod/mod44.php). This product provides the percentage of tree cover, non-tree vegetation, and non-vegetated (bare) surface on a 250x250 m grid. Here we regrid the data to a 1x1 km grid (to reduce computational costs), average the values over the time period of interest, and extract the catchment statistics (mean, max, min and std).
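
To make the regridding and statistics step concrete, the sketch below coarsens a toy grid by averaging non-overlapping blocks and then computes the four statistics. It is a plain-Python illustration of the idea only: the real processing runs on the Earth Engine servers inside *f_earth_engine.py*, whose exact reducers are not shown here. The function names are hypothetical, and the use of the population standard deviation is an assumption. Going from 250 m to 1 km would correspond to a block factor of 4; the toy example uses a factor of 2 to keep the grid small.

```python
from statistics import mean, pstdev


def block_mean(grid, factor):
    """Coarsen a 2-D grid by averaging non-overlapping factor x factor blocks."""
    n_rows, n_cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, n_rows, factor):
        row = []
        for c in range(0, n_cols, factor):
            # Collect all fine cells that fall inside the current coarse cell
            block = [grid[i][j]
                     for i in range(r, min(r + factor, n_rows))
                     for j in range(c, min(c + factor, n_cols))]
            row.append(mean(block))
        coarse.append(row)
    return coarse


def grid_stats(grid):
    """Return mean, max, min and (population) std of all cells in a 2-D grid."""
    values = [v for row in grid for v in row]
    return {"mean": mean(values), "max": max(values),
            "min": min(values), "std": pstdev(values)}


# Toy tree cover grid (%) on the fine resolution
fine = [[10, 20, 30, 40],
        [10, 20, 30, 40],
        [50, 60, 70, 80],
        [50, 60, 70, 80]]

coarse = block_mean(fine, 2)   # 2x2 grid of block averages
stats = grid_stats(coarse)     # statistics over the coarse cells
```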

First we create the output directory:

In [7]:
# make output directory (no error if it already exists)
os.makedirs(f'{work_dir}/output/earth_engine_timeseries/treecover', exist_ok=True)

Now we run the *preprocess_treecover_data* and *catchment_treecover* functions from the *f_earth_engine.py* script. For each catchment, the tree cover statistics are stored as a dataframe in a CSV file in the output directory.

In [8]:
# define your time period
start_date = '2000-01-01'
end_date = '2020-12-31'

# define your directories
shape_dir = Path(f'{work_dir}/data/shapes/')
out_dir = Path(f'{work_dir}/output/earth_engine_timeseries/treecover')

# preprocess your modis satellite data for your time period (interpolation and averaging)
(MOD44B_tree_res, MOD44B_nontree_res) = preprocess_treecover_data(start_date,end_date)

# loop over catch ids
for catch_id in catch_id_list:
    # extract catchment values and store in dataframe
    catchment_treecover(MOD44B_tree_res, MOD44B_nontree_res, catch_id, shape_dir, out_dir)

In [17]:
# print treecover statistics for catchment [0] in catch_id_list
catch_id = catch_id_list[0]
c = pd.read_csv(f'{out_dir}/{catch_id}.csv',index_col=0)
c.head()

Unnamed: 0,max_tc,mean_tc,min_tc,std_tc,max_ntc,mean_ntc,min_ntc,std_ntc,mean_nonveg
br_0000495,40.438792,40.389229,40.329825,0.027467,47.847697,47.804183,47.755285,0.024777,11.806588
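
In the row shown above, the mean tree cover, mean non-tree cover and mean non-vegetated fraction add up to 100 %. This suggests that `mean_nonveg` is the remainder of the other two means, although that is an inference from this single output row and not something confirmed from *f_earth_engine.py*. A quick check with the printed values:

```python
# Values copied from the output row for catchment br_0000495
row = {"mean_tc": 40.389229, "mean_ntc": 47.804183, "mean_nonveg": 11.806588}

# Sum of the three cover fractions (expected to be close to 100 %)
total = row["mean_tc"] + row["mean_ntc"] + row["mean_nonveg"]

# Non-vegetated fraction computed as the remainder of the other two means
residual_nonveg = 100.0 - row["mean_tc"] - row["mean_ntc"]

print(f"total = {total:.6f}, residual non-vegetated = {residual_nonveg:.6f}")
```

A similar check on other catchments can serve as a simple sanity test of the output files.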


In [16]:
# update conda earth engine -> update environment.yml