# earthengine-dask

> Scale up concurrent requests to Earth Engine interactive endpoints with Dask

# Prerequisites

- A [Google Earth Engine](https://earthengine.google.com/) account.
- Access to a Google Cloud Platform (GCP) [project with the Earth Engine API enabled](https://developers.google.com/earth-engine/cloud/earthengine_cloud_project_setup).
- A [Coiled](https://www.coiled.io/) account that is [set up to use the GCP project](https://docs.coiled.io/user_guide/setup/gcp/cli.html).

# Installation

```sh
TODO...
```

# How to use

## Import Python packages

In [1]:
import altair as alt
import ee
from earthengine_dask.core import ClusterGEE
import google.auth
import pandas as pd

## Authenticate & Initialize Earth Engine

Get credentials and the GCP project ID, authenticating if necessary.

In [2]:
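try:
    # Reuse Application Default Credentials if they already exist.
    credentials, project_id = google.auth.default()
except google.auth.exceptions.DefaultCredentialsError:
    # No default credentials were found: log in with gcloud
    # (a notebook shell escape), then try again.
    !gcloud auth application-default login
    credentials, project_id = google.auth.default()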

ee.Initialize(credentials=credentials, project=project_id)

## Start Dask Cluster

Start an Earth Engine-enabled cluster. This may take a few minutes to complete.

In [3]:
cluster = ClusterGEE(
    name='test-class-cluster',
    n_workers=2,
    worker_cpu=8,
    region='us-central1',
)

Google Application Default Credentials have been written to a file on your Coiled VM(s).
These credentials will potentially be valid until explicitly revoked by running
gcloud auth application-default revoke


Retrieve a client for the cluster, and display it.

In [4]:
client = cluster.get_client()
client

- Client: connection method: Cluster object; cluster type: earthengine_dask.ClusterGEE; dashboard: https://cluster-zoqmf.dask.host/mnc9NHlG1-Dp_CBQ/status
- Cluster: dashboard: https://cluster-zoqmf.dask.host/mnc9NHlG1-Dp_CBQ/status; workers: 2; total threads: 16; total memory: 61.17 GiB
- Scheduler: comm: tls://10.2.0.48:8786; dashboard: http://10.2.0.48:8787/status; started: just now; workers: 2; total threads: 16; total memory: 61.17 GiB
- Worker: comm: tls://10.2.0.47:42155; dashboard: http://10.2.0.47:8787/status; nanny: tls://10.2.0.47:45609; total threads: 8; memory: 30.58 GiB; local directory: /scratch/dask-scratch-space/worker-rk_7aw52
- Worker: comm: tls://10.2.0.46:39499; dashboard: http://10.2.0.46:8787/status; nanny: tls://10.2.0.46:39223; total threads: 8; memory: 30.58 GiB; local directory: /scratch/dask-scratch-space/worker-d0zam3_g
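
The thread counts in the client summary above follow from the cluster arguments: 2 workers with 8 CPUs each give 16 threads. Assuming each worker runs one thread per CPU (which matches the output above) and one task per thread, that product is an upper bound on how many Earth Engine requests the cluster can have in flight at once. The helper name `max_concurrent_tasks` is hypothetical and not part of `earthengine_dask`; Earth Engine's own request quotas may impose a lower limit.

```python
def max_concurrent_tasks(n_workers: int, worker_cpu: int) -> int:
    """Upper bound on simultaneously running tasks.

    Assumes one worker thread per CPU and one task per thread.
    """
    if n_workers < 0 or worker_cpu < 0:
        raise ValueError("n_workers and worker_cpu must be non-negative")
    return n_workers * worker_cpu


# Same sizing as the cluster started above: 2 workers x 8 CPUs.
print(max_concurrent_tasks(n_workers=2, worker_cpu=8))
```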


## Submit Jobs

Test it out by:
- Defining a function that can be distributed,
- Submitting jobs running the function to workers,
- Gathering the results locally, and
- Displaying the results.
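
The four steps above can be tried locally before involving a cluster or Earth Engine. The sketch below uses the standard library's `concurrent.futures` thread pool as a stand-in for the Dask client. This is an assumption made purely for illustration and is not how `earthengine_dask` distributes work; `describe_name` is a hypothetical toy function.

```python
from concurrent.futures import ThreadPoolExecutor


def describe_name(name: str) -> dict:
    """Toy stand-in for a per-item task that a worker would run."""
    return {"name": name, "length": len(name)}


names = ["Albania", "Yemen", "Zambia"]

# A thread pool plays the role of the Dask client in this local sketch.
with ThreadPoolExecutor(max_workers=2) as pool:
    # Submit one job per item; each call returns a future immediately.
    futures = [pool.submit(describe_name, name) for name in names]
    # Gather the results in submission order.
    results = [future.result() for future in futures]

# Display the results.
for row in results:
    print(row)
```

The shape is the same as with a Dask client: one future per submitted call, and a gather step that blocks until every result is available.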

In [5]:
# Get a list of countries to analyze.
country_fc = ee.FeatureCollection('USDOS/LSIB_SIMPLE/2017')
country_list = country_fc.aggregate_array('country_na').distinct().sort().getInfo()

# Write a function that can be run by the cluster workers. 
def get_country_stats(country_name):
    country = country_fc.filter(ee.Filter.eq('country_na', country_name))
    elev = ee.ImageCollection("COPERNICUS/DEM/GLO30").select('DEM').mosaic()
    return {
        'country': country_name, 
        'area_km2': country.geometry().area().multiply(1e-6).round().getInfo(), 
        'mean_elev': elev.reduceRegion(reducer=ee.Reducer.mean(),
                                       geometry=country.geometry(),
                                       scale=10000,
                                       ).get('DEM').getInfo(),
    }

# Create and submit jobs among the workers.
submitted_jobs = [
    client.submit(get_country_stats, country, retries=5)
    for country in country_list
]

# Gather up the results and display them.
results = client.gather(submitted_jobs)
df = pd.DataFrame(results)
df

|     | country        | area_km2 | mean_elev   |
| --- | -------------- | -------- | ----------- |
| 0   | Abyei Area     | 10460    | 402.592190  |
| 1   | Afghanistan    | 642093   | 1809.717311 |
| 2   | Akrotiri       | 127      | 60.796081   |
| 3   | Aksai Chin     | 30448    | 5324.949965 |
| 4   | Albania        | 28638    | 689.740847  |
| ... | ...            | ...      | ...         |
| 279 | West Bank      | 5813     | 341.715816  |
| 280 | Western Sahara | 269689   | 253.175100  |
| 281 | Yemen          | 454682   | 932.051351  |
| 282 | Zambia         | 754032   | 1118.995950 |
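
The `retries=5` argument passed to `client.submit` in the cell above allows a failed task to be re-run automatically, which helps when an individual Earth Engine request fails transiently. The following is a minimal local sketch of that behavior, not Dask's implementation: `run_with_retries` and `flaky_lookup` are hypothetical names introduced only for illustration.

```python
def run_with_retries(func, *args, retries=5):
    """Call func, re-running it up to `retries` extra times if it raises."""
    attempts = 0
    while True:
        try:
            return func(*args)
        except Exception:
            # Give up once the retry budget is spent and surface the error.
            if attempts >= retries:
                raise
            attempts += 1


calls = {"count": 0}


def flaky_lookup(name):
    """Hypothetical task that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return f"stats for {name}"


result = run_with_retries(flaky_lookup, "Zambia", retries=5)
print(result, "after", calls["count"], "calls")
```

With a budget of 5 retries, a task may run up to 6 times in total before its error is raised to the caller.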


## Shut down the cluster

In [6]:
cluster.shutdown()