# Example notebook to use the GloUrbEE extraction tool

First, create a Google Cloud project with the Earth Engine API enabled.
Then create a service account with the Earth Engine Viewer role, and download a JSON key for this service account.

## Login to Google Cloud

The next cell loads the Earth Engine package and logs in to the API using the service account.

In [None]:
import ee

credentials = ee.ServiceAccountCredentials('samuel@ee-glourb.iam.gserviceaccount.com', './earthengine-key.json')
ee.Initialize(credentials)

# Set your Google Cloud EarthEngine-enabled project name (for asset exports)
ee_project_name='ee-glourb'
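
The service account email is hard-coded in the cell above, but it is also stored in the JSON key file under the `client_email` field. The sketch below reads it from the key so the two cannot drift apart. The helper name `read_service_account_email` is hypothetical and not part of GloUrbEE or the Earth Engine API.

```python
import json


def read_service_account_email(key_path):
    """Return the client_email stored in a service-account JSON key file."""
    with open(key_path, encoding="utf-8") as f:
        key = json.load(f)
    # Service-account keys downloaded from Google Cloud carry this type marker
    if key.get("type") != "service_account":
        raise ValueError(f"{key_path} does not look like a service-account key")
    return key["client_email"]


# Usage in this notebook (commented out because it needs your real key file):
# email = read_service_account_email('./earthengine-key.json')
# credentials = ee.ServiceAccountCredentials(email, './earthengine-key.json')
```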

## Import the required packages

In [None]:
import geemap

from glourbee import (
    data_management,
    classification,
    visualization,
    dgo_metrics,
    assets_management,
    workflow,
    dgo_indicators
)

# The next lines are useful if you modify the module files and want to reload them without restarting the notebook kernel
import importlib
importlib.reload(data_management)
importlib.reload(classification)
importlib.reload(visualization)
importlib.reload(dgo_metrics)
importlib.reload(assets_management)
importlib.reload(workflow)
importlib.reload(dgo_indicators)

## 1. Classical usage workflow


### Upload the DGOs to GEE

In [None]:
# If this is the first time you process these DGOs in this Google Cloud project, you have to upload them
dgo_assetId, dgo_features = assets_management.uploadDGOs('./example_data/Yamuna_segm_2km_UTF8.shp', ee_project_name=ee_project_name, simplify_tolerance=15)

In [None]:
# On subsequent runs, you only need to load the already uploaded asset
#dgo_features = ee.FeatureCollection(dgo_assetId)
dgo_features = ee.FeatureCollection('projects/ee-glourb/assets/dgos/Yamuna_simplified_final_108009982a724786aeaf11cc1b6fe6df')

In [None]:
# You can also list available assets with this command
ee.data.listAssets({'parent': 'projects/ee-glourb/assets/dgos'})
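
To turn that listing into a plain list of asset IDs that can be passed to `ee.FeatureCollection`, something like the sketch below may help. It assumes the response is a dictionary with an `assets` list whose items carry a `name` field, and it runs on a hand-written mock response so it works offline. The asset names in the mock are invented for illustration, and `asset_names` is a hypothetical helper.

```python
def asset_names(response):
    """Extract asset names from a listAssets-style response dictionary."""
    return [asset["name"] for asset in response.get("assets", [])]


# Hand-written mock of a response (asset names are invented for illustration)
mock_response = {
    "assets": [
        {"type": "TABLE", "name": "projects/my-project/assets/dgos/river_a"},
        {"type": "TABLE", "name": "projects/my-project/assets/dgos/river_b"},
    ]
}

for name in asset_names(mock_response):
    print(name)
```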

### Set parameters

In [None]:
glourbmetrics_params = {
    'ee_project_name': ee_project_name,
    'dgo_asset': 'projects/ee-glourb/assets/dgos/Yamuna_segm_2km_UTF8_final_b718cac924cf4fc3871f9465b180f716',
    'start': '1980-01-01',
    'end': '2030-12-31',
    'cloud_filter': 80,
    'cloud_masking': True,
    'mosaic_same_day': True,
    'split_size': 25,
}
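
Because the workflow launches long-running tasks, it can be worth checking the parameters locally before starting it. The sketch below is one possible sanity check. `check_params` is a hypothetical helper, not part of GloUrbEE, and the rules it enforces (ISO dates, `cloud_filter` as a percentage, positive `split_size`) are assumptions drawn from how the parameters are used in this notebook.

```python
from datetime import date


def check_params(params):
    """Raise ValueError if the date range or thresholds look inconsistent."""
    start = date.fromisoformat(params["start"])
    end = date.fromisoformat(params["end"])
    if start >= end:
        raise ValueError("'start' must be earlier than 'end'")
    # cloud_filter is treated as a maximum cloud cover in percent
    if not 0 <= params["cloud_filter"] <= 100:
        raise ValueError("'cloud_filter' must be between 0 and 100")
    # Assumption: split_size is a positive count
    if params["split_size"] < 1:
        raise ValueError("'split_size' must be a positive integer")
    return True


example_params = {
    "start": "1980-01-01",
    "end": "2030-12-31",
    "cloud_filter": 80,
    "split_size": 25,
}
check_params(example_params)
```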

### Start workflow

Take careful note of the run_id returned by the `workflow.startWorkflow` function. You will need it to export the final result.

In [None]:
run_id = workflow.startWorkflow(**glourbmetrics_params)

### Monitor workflow tasks

If you restarted the notebook kernel and want to check the state of your previous computation tasks, use the following cell to retrieve your running tasks (replace the value of `run_id` with the corresponding ID, as a string).

In [None]:
run_id = 'fdbb11e270dc4a5db410c94f7c9e8533'
tasks = workflow.workflowState(run_id)

In [None]:
# Check all the details if needed
tasks
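
When a run is split into many tasks, a count per state is easier to read than the full details. The sketch below assumes `tasks` is a list of Earth Engine task status dictionaries, each with a `state` key such as `RUNNING` or `COMPLETED`. The return type of `workflow.workflowState` is not shown in this notebook, so adapt the code to what you actually get. `summarize_states` is a hypothetical helper.

```python
from collections import Counter


def summarize_states(task_statuses):
    """Count how many tasks are in each state."""
    return Counter(status.get("state", "UNKNOWN") for status in task_statuses)


# Mock statuses standing in for the real task list
example_statuses = [
    {"state": "COMPLETED"},
    {"state": "RUNNING"},
    {"state": "COMPLETED"},
]
summary = summarize_states(example_statuses)
for state, count in summary.items():
    print(state, count)
```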

In [None]:
# Cancel the workflow
workflow.cancelWorkflow(run_id)

### Export results
When all the computation tasks are complete, use the following cell to merge the results into one file and download it locally.

In [None]:
workflow.getResults(run_id=run_id, ee_project_name=ee_project_name, output_csv='./example_data/yamuna_2km.csv')

In [None]:
# Check your results, then please clean the computation results on GEE
workflow.cleanAssets(run_id, ee_project_name)

### Calculate some indicators
Independently of the raw metrics extraction workflow, it is possible to extract indicators based on the JRC Global Surface Water Mapping Layers:

In [None]:
#dgo_features = ee.FeatureCollection('projects/ee-glourb/assets/dgos/study_area_SA_cut2_final_111f57de97eb4876909538e1fefbf1a6')
workflow.indicatorsWorkflow(dgos_asset = dgo_features, output_csv = './example_data/indicateurs_fanny.csv')

## 2. Step-by-step workflow

This workflow only works for a small area of interest and a small number of DGOs, but it is very useful for finding the right parameters for your data.
### Select data to analyse

In [None]:
# Load DGOs
dgo_features = ee.FeatureCollection('projects/ee-glourb/assets/dgos/Yamuna_river_500m_segm_final_8a47038c4eff4013b2230b8ee0b5acfa')

In [None]:
# Select some DGOs
dgo_list = [57,35,42]
selected_dgo = dgo_features.filter(ee.Filter.inList('DGO_FID', dgo_list))

In [None]:
# Or select a range of DGOs
dgo_list = list(range(15,20))
selected_dgo = dgo_features.filter(ee.Filter.inList('DGO_FID', dgo_list))

### Classify images

In [None]:
# Create a region of interest (union of all the DGOs)
roi = selected_dgo.union(1)

# Get the landsat image collection for your ROI
collection = data_management.getLandsatCollection(start=ee.Date('1990-01-01'), 
                                                  end=ee.Date('1990-12-31'), 
                                                  cloud_filter=80, # Maximum cloud coverage accepted (%)
                                                  cloud_masking=True, # Set to False if you don't want to mask the clouds on accepted images
                                                  mosaic_same_day=True, # Set to False if you don't want to merge all images by day
                                                  roi=roi) 

# Calculate MNDWI, NDVI and NDWI
collection = classification.calculateIndicators(collection)

# Classify the objects using the indicators
collection = classification.classifyObjects(collection)
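
The three indices computed above are all normalized differences between two bands. The sketch below shows the arithmetic on single reflectance values, using common definitions: NDVI from NIR and red, MNDWI from green and SWIR1, and NDWI from green and NIR. Which band pairs `classification.calculateIndicators` actually uses is an assumption here (NDWI in particular has more than one definition in the literature), so check the module source before relying on it.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else float("nan")


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def mndwi(green, swir1):
    """Modified Normalized Difference Water Index."""
    return normalized_difference(green, swir1)


def ndwi(green, nir):
    """Normalized Difference Water Index (green/NIR definition, assumed)."""
    return normalized_difference(green, nir)


# Illustrative reflectance values for a vegetated pixel (made up)
print(ndvi(nir=0.5, red=0.1))
print(mndwi(green=0.1, swir1=0.3))
print(ndwi(green=0.1, nir=0.5))
```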

### Layers visualisation (optional)

At this point in the workflow, you can create an interactive map for an individual Landsat image to check all the previously calculated layers.
This feature is only available in a Jupyter notebook.

If you have more than 100 DGOs, this will not be functional (work in progress).

In [None]:
from ipywidgets import interact, fixed

# Get the acquisition timestamps (system:time_start) of the images in the collection
time_starts = collection.aggregate_array('system:time_start').getInfo()

# Named image_map to avoid shadowing the built-in map()
image_map = interact(visualization.imageVisualization, 
         collection=fixed(collection), 
         dgo_shp=fixed(selected_dgo), 
         time_starts=time_starts)

# Show the map in the notebook output
image_map

### Download layers (optional)

If your ROI is not too big, it's possible to download the layers to your local disk. See below the content of each output band.

| band number | content |   
|---|---|
| 1 | blue |
| 2 | green |
| 3 | red |
| 4 | nir |
| 5 | swir1 |
| 6 | swir2 |
| 7 | qa_pixel |
| 8 | mndwi |
| 9 | ndvi |
| 10 | ndwi |
| 11 | water |
| 12 | vegetation |
| 13 | active channel |
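
When reading the exported GeoTIFF with a raster library, bands are usually addressed by their 1-based number. A small lookup like the one below, transcribed from the table above, avoids hard-coding those numbers. `BAND_NAMES` and `band_number` are hypothetical names used only for this sketch.

```python
# Band order transcribed from the table above (band 1 is first)
BAND_NAMES = [
    "blue", "green", "red", "nir", "swir1", "swir2", "qa_pixel",
    "mndwi", "ndvi", "ndwi", "water", "vegetation", "active channel",
]


def band_number(name):
    """Return the 1-based band number for a band name."""
    return BAND_NAMES.index(name) + 1


print(band_number("mndwi"))
```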


In [None]:
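# TODO: Update needed
# Note: landsat_id is not defined earlier in this notebook; set it to the ID of the image to download before running this cell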
data_management.imageDownload(collection=collection, 
                              landsat_id=landsat_id, 
                              roi=roi, 
                              scale=90, # Downgrading is recommended to reduce the file size
                              output='./example_data/landsat_export.tif')

### Calculate metrics

In [None]:
# Metrics calculation
metrics = dgo_metrics.calculateDGOsMetrics(collection=collection,
                                           dgos=selected_dgo)

### Export data
#### Asset-managed method

In [None]:
assets_management.downloadMetrics(metrics, './example_data/properties.csv', ee_project_name=ee_project_name)

#### Other methods (deprecated)

##### Option 1 - Direct download

You can download the calculated metrics as a pandas DataFrame. Technically, the metrics are only computed at this point in the workflow, so this step can take a long time.

In [None]:
import pandas as pd

# Get metrics as pandas dataframe
local_data = metrics.getInfo()
metrics_df = pd.DataFrame([dgo['properties'] for dgo in local_data['features']])

# Save a csv file
metrics_df.to_csv('./example_data/properties.csv')
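
The cell above relies on the shape of the dictionary returned by `getInfo()` on a feature collection: a `features` list in which each item holds a `properties` dictionary. The sketch below performs the same flattening with only the standard library, on a hand-written dictionary that mimics that shape, which can help when debugging without calling Earth Engine. The property `example_metric` and its values are invented for illustration.

```python
import csv
import io


def features_to_csv(feature_collection_info):
    """Flatten the 'properties' of each feature into CSV text."""
    rows = [f["properties"] for f in feature_collection_info["features"]]
    # Union of all property names, sorted for a stable column order
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)  # missing properties are written as empty cells
    return buffer.getvalue()


# Mock of a getInfo() result (property names and values are invented)
example_info = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"DGO_FID": 15, "example_metric": 0.25}},
        {"type": "Feature", "properties": {"DGO_FID": 16}},
    ],
}
print(features_to_csv(example_info))
```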

##### Option 2 - Export to Google Drive

If you have too many DGOs and/or dates, the request will take too much memory and you may need to export to Google Drive instead.

In [None]:
# Set the output Google Drive directory name (your GEE Service Account needs to have write access to this directory)
output_dir = 'GEE_Exports_sdunesme'

# Set the output filename
output_name = 'lhassa_metrics_v2'

Start the computation and export task with the cell below.

In [None]:
import time

task = ee.batch.Export.table.toDrive(  
    collection=metrics,
    folder=output_dir,
    description=output_name,
    fileFormat='CSV')
task.start()

You can check the task status with the following loop.

In [None]:
while task.active():
    print('\r{}'.format(task.status()), end=" ")
    time.sleep(5)
print('\r{}'.format(task.status()), end=" ")
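
The loop above runs until the task stops being active, which can block the notebook for a long time. One possible variant, sketched below, adds a timeout. It relies only on the `active()` and `status()` methods already used above. `wait_for_task` and its default values are hypothetical choices for this sketch.

```python
import time


def wait_for_task(task, poll_seconds=5, timeout_seconds=3600):
    """Poll a task until it is no longer active, or raise after a timeout."""
    deadline = time.monotonic() + timeout_seconds
    while task.active():
        if time.monotonic() > deadline:
            raise TimeoutError("task is still active after the timeout")
        time.sleep(poll_seconds)
    # Return the final status once the task has stopped
    return task.status()


# Usage with the export task started above:
# final_status = wait_for_task(task, poll_seconds=5, timeout_seconds=1800)
```

The task keeps running on Earth Engine if the timeout is reached. Only the local wait is interrupted.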

Once the task is complete, download the file from your Google Drive to your local drive. You can then load it in this notebook, for example with `pandas.read_csv`.

##### Option 3 - EarthEngine Asset

In [None]:
task = ee.batch.Export.table.toAsset(
    collection=metrics,
    description='Lhassa metrics',
    assetId='projects/earthengine-371715/assets/lhassa_metrics'
)
task.start()

In [None]:
import time
while task.active():
    print('\r{}'.format(task.status()), end=" ")
    time.sleep(5)
print('\r{}'.format(task.status()), end=" ")