<a href="https://colab.research.google.com/github/tylere/forest-data-partnership/blob/sustainable-sourcing-layers-cloud-function/PUBLIC_Sustainable_Sourcing_Layers_Cloud_Function.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sustainable Sourcing Layers in a Cloud Function

This notebook demonstrates deploying a Google Cloud Function that returns commodity and forest information in a format designed for integration into other workflows.

The 2025a release of the Forest Data Partnership sustainable sourcing layers (palm, rubber, cocoa and coffee) is used for commodity information.  For details on how these layers were produced, see [the technical documentation on GitHub](https://github.com/google/forest-data-partnership/tree/main/models).  In particular, see [the limitations](https://github.com/google/forest-data-partnership/tree/main/models#limitations).  See also the [Forest Data Partnership publisher catalog](https://developers.google.com/earth-engine/datasets/publisher/forestdatapartnership) for dataset descriptions.  See [this Earth Engine Code Editor script](https://goo.gle/fodapa-layers) for a demonstration of how the choice of thresholds affects the mapped results.

Note that users of commercial projects will need to request access through [this form](https://docs.google.com/forms/d/e/1FAIpQLSe7L3eh6t2JIPqEtAQwXwY7ZmW52v8W5vrIi4QN_XYgTNJZLw/viewform).

**WARNING**: These demos consume billable resources and may result in charges to your account!

In [None]:
# USE YOUR OWN PROJECT!
PROJECT = 'forest-data-partnership'
REGION = 'us-central1'

In [None]:
!gcloud auth login --project {PROJECT} --billing-project {PROJECT} --update-adc

# Create the Cloud Function and deploy it

See [this quickstart](https://cloud.google.com/run/docs/quickstarts/functions/deploy-functions-gcloud) for more details.

First, make a directory to hold the code needed for the function.

In [None]:
!mkdir suso_function

## Create functions to get the layers

The `%%writefile` magic command writes the contents of the cell to a file in the local directory created above.  This is the source code that powers the Cloud Function.  Be sure to change the project ID to use your own project!

In [None]:
%%writefile suso_function/suso_layers_2025a.py

import google.auth
import ee

# First, initialize.
credentials, _ = google.auth.default(
    scopes=['https://www.googleapis.com/auth/earthengine']
)
ee.Initialize(credentials, project='forest-data-partnership', opt_url='https://earthengine-highvolume.googleapis.com')

# See https://github.com/google/forest-data-partnership/tree/main/models.
COCOA_2025A = ee.ImageCollection("projects/forestdatapartnership/assets/cocoa/model_2025a")
COFFEE_2025A = ee.ImageCollection("projects/forestdatapartnership/assets/coffee/model_2025a")
PALM_2025A = ee.ImageCollection("projects/forestdatapartnership/assets/palm/model_2025a")
RUBBER_2025A = ee.ImageCollection("projects/forestdatapartnership/assets/rubber/model_2025a")

filter2020 = ee.Filter.calendarRange(2020, 2020, 'year')
filter2023 = ee.Filter.calendarRange(2023, 2023, 'year')

cocoa2020 = COCOA_2025A.filter(filter2020).mosaic().rename('cocoa_2020')
cocoa2023 = COCOA_2025A.filter(filter2023).mosaic().rename('cocoa_2023')
coffee2020 = COFFEE_2025A.filter(filter2020).mosaic().rename('coffee_2020')
coffee2023 = COFFEE_2025A.filter(filter2023).mosaic().rename('coffee_2023')
palm2020 = PALM_2025A.filter(filter2020).mosaic().rename('palm_2020')
palm2023 = PALM_2025A.filter(filter2023).mosaic().rename('palm_2023')
rubber2020 = RUBBER_2025A.filter(filter2020).mosaic().rename('rubber_2020')
rubber2023 = RUBBER_2025A.filter(filter2023).mosaic().rename('rubber_2023')

# See https://eartharxiv.org/repository/view/9085/.
nf = ee.ImageCollection(
  'projects/computing-engine-190414/assets/biosphere_models/public/forest_typology/natural_forest_2020_v1_0')
nf_image = nf.mosaic().divide(255).selfMask()

# THRESHOLDS FOR DEMONSTRATION ONLY! Tune these to your needs.
thresholds = [
    0.5,
    0.45,
    0.96,
    0.89,
    0.5
  ]

# A mini-ensemble of GDM and fodapa data products.
ensemble = ee.Image.cat(
  nf_image.rename('forest'),
  cocoa2020.rename('cocoa'),
  coffee2020.rename('coffee'),
  palm2020.rename('palm'),
  rubber2020.rename('rubber')
).unmask(0)

# Threshold the probabilities.  THRESHOLDS FOR DEMONSTRATION ONLY!
crop_names = ['forest', 'cocoa', 'coffee', 'palm', 'rubber']
thresholded = ensemble.select(crop_names).gt(ee.Image(thresholds))

# Unclassified means no predicted presence at the specified thresholds.
unclassified = thresholded.reduce('sum').eq(0)

# Confusion means two or more classes predicted presence.
confusion = thresholded.reduce('sum').gt(1).selfMask()

def get_suso_layers_2025a() -> ee.Image:
    """Returns the stack of probability images in separate bands."""
    return ee.Image.cat(
        nf_image.rename('natural_forest_2020'),
        cocoa2020.rename('cocoa_probability_2020'),
        cocoa2023.rename('cocoa_probability_2023'),
        coffee2020.rename('coffee_probability_2020'),
        coffee2023.rename('coffee_probability_2023'),
        palm2020.rename('palm_probability_2020'),
        palm2023.rename('palm_probability_2023'),
        rubber2020.rename('rubber_probability_2020'),
        rubber2023.rename('rubber_probability_2023'),
    )

def get_areas_image() -> ee.Image:
    """Returns data for area calculations in square meters."""
    return ee.Image.cat(
      thresholded,
      unclassified.rename('unclassified'),
      confusion.rename('confusion')
    ).multiply(ee.Image.pixelArea())
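
The thresholding logic in the cell above can be illustrated for a single pixel in plain Python. This is a minimal sketch, not the Earth Engine implementation: `classify_pixel` is a hypothetical helper and the probabilities are made-up values. The thresholds are the demonstration values from the cell above, in the same band order.

```python
# Demonstration thresholds from the cell above, keyed by band name.
THRESHOLDS = {
    'forest': 0.5,
    'cocoa': 0.45,
    'coffee': 0.96,
    'palm': 0.89,
    'rubber': 0.5,
}


def classify_pixel(probabilities, thresholds=THRESHOLDS):
    """Hypothetical single-pixel analogue of the thresholded image stack."""
    # Missing bands are treated as 0, mirroring unmask(0).
    present = {
        name: int(probabilities.get(name, 0.0) > threshold)
        for name, threshold in thresholds.items()
    }
    count = sum(present.values())
    # No class exceeds its threshold.
    present['unclassified'] = int(count == 0)
    # Two or more classes exceed their thresholds.
    present['confusion'] = int(count > 1)
    return present


# Made-up probabilities for illustration only.
print(classify_pixel({'forest': 0.2, 'palm': 0.95}))
print(classify_pixel({'forest': 0.7, 'rubber': 0.6}))
print(classify_pixel({}))
```

In the Earth Engine code the `confusion` band is masked rather than zero where fewer than two classes are present. For the summed areas this makes no difference, since masked pixels do not contribute to the sum.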

### Import the code locally for testing

Run a few simple sanity checks.

In [None]:
import sys
sys.path.append('/content/suso_function')
import ee
from suso_layers_2025a import get_suso_layers_2025a
from suso_layers_2025a import get_areas_image
# Check the image metadata and bands.
print(get_suso_layers_2025a().getInfo())
# Check a sample of the image.
# See https://code.earthengine.google.com/41a305c1d22aca1de630f5ea46c251a5.
test_point = ee.Geometry.Point(104.33, -3.41)
print(get_suso_layers_2025a().reduceRegion(
    ee.Reducer.mean(), test_point, 10).getInfo())
print(get_areas_image().reduceRegion(
    ee.Reducer.mean(), test_point, 10).getInfo())

In [None]:
%%writefile suso_function/main.py

import json
import ee
from flask import jsonify
import functions_framework
import logging
import requests

import google.auth
import google.cloud.logging
from google.api_core import retry

from suso_layers_2025a import get_areas_image

client = google.cloud.logging.Client()
client.setup_logging()


@retry.Retry()
def get_suso_stats(geojson):
    """Get area stats for the provided geojson polygon."""
    region = ee.Geometry(geojson)
    feature_area = ee.Number(region.area(10))
    suso_image = get_areas_image()
    # Sum of pixel areas in square meters.
    stats = suso_image.reduceRegion(
        reducer=ee.Reducer.sum(),
        geometry=region,
        scale=10
    )
    # Gini impurity: 1 - sum of squared class area fractions.
    # See https://en.wikipedia.org/wiki/Decision_tree_learning#Gini_impurity.
    crop_names = ['forest', 'cocoa', 'coffee', 'palm', 'rubber']
    gini = ee.Number(1).subtract(ee.List(
        [ee.Number(stats.get(c)).divide(feature_area).pow(2) for c in crop_names]
    ).reduce(ee.Reducer.sum()))
    # Update the EE dictionary.
    stats = stats.set('gini', gini).set('total_area', feature_area)
    # Evaluate the result on the server and return it to the client.
    return stats.getInfo()


@functions_framework.http
def main(request):
  """Handle requests in a format (geojson) suitable for BigQuery."""
  credentials, _ = google.auth.default(
      scopes=['https://www.googleapis.com/auth/earthengine']
  )
  ee.Initialize(credentials, project='forest-data-partnership')
  try:
    replies = []
    request_json = request.get_json(silent=True)
    calls = request_json['calls']
    for call in calls:
      geo_json = json.loads(call[0])
      try:
        logging.info([geo_json])
        response = get_suso_stats(geo_json)
        logging.info(response)
        replies.append(json.dumps(response))
      except Exception as e:
        logging.error(str(e))
        replies.append(json.dumps( { "errorMessage": str(e) } ))
    # BigQuery expects HTTP 200 with a JSON body containing `replies`.
    return jsonify(replies=replies)
  except Exception as e:
    error_string = str(e)
    logging.error(error_string)
    # A non-200 status with `errorMessage` signals failure to BigQuery.
    return jsonify(errorMessage=error_string), 400
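
For reference, the Gini impurity cited in `get_suso_stats` is one minus the sum of the squared class fractions, where each fraction is the share of the feature's area assigned to a class. A minimal plain-Python sketch with made-up areas follows; `gini_impurity` is a hypothetical helper and is not part of the deployed function.

```python
def gini_impurity(class_areas, total_area):
    """Hypothetical helper: 1 - sum of squared area fractions."""
    if total_area <= 0:
        raise ValueError('total_area must be positive')
    return 1.0 - sum(
        (area / total_area) ** 2 for area in class_areas.values()
    )


# Made-up areas in square meters for a 10,000 square meter plot.
pure_plot = {
    'forest': 10000.0, 'cocoa': 0.0, 'coffee': 0.0, 'palm': 0.0, 'rubber': 0.0}
mixed_plot = {
    'forest': 5000.0, 'cocoa': 5000.0, 'coffee': 0.0, 'palm': 0.0, 'rubber': 0.0}

print(gini_impurity(pure_plot, 10000.0))   # 0.0: one class covers the plot
print(gini_impurity(mixed_plot, 10000.0))  # 0.5: two classes split it evenly
```

Because the thresholded classes can overlap or leave area unclassified, the fractions need not sum to one, so the value is probably best read as a rough indicator of mixing rather than a strict impurity measure.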

In [None]:
%%writefile suso_function/requirements.txt
earthengine-api
flask
functions-framework
google-api-core
google-cloud-logging
requests

## Deploy the function

Once deployment finishes, the function is ready to test.  It's helpful to follow the links in the deployment output to monitor and/or debug your Cloud Function.  In particular, see the metrics and logs to help resolve performance issues.

In [None]:
!gcloud functions deploy 'suso_function' \
  --gen2 \
  --region={REGION} \
  --project={PROJECT} \
  --runtime=python312 \
  --source='suso_function' \
  --entry-point=main \
  --trigger-http \
  --no-allow-unauthenticated \
  --timeout=300s

## Load WHISP example data

Here we get the WHISP example data from GitHub, extract the feature geometries, and check that Earth Engine can parse them as `ee.Geometry` objects.

In [None]:
import json

fc_list = !curl https://raw.githubusercontent.com/forestdatapartnership/whisp/main/tests/fixtures/geojson_example.geojson
fc_obj = json.loads("\n".join(fc_list))
features = fc_obj['features']
feature = features[0]
feature

In [None]:
# Get the geometries.
geoms = [f['geometry'] for f in features]

In [None]:
geoms[0]

In [None]:
json.dumps(geoms[0], separators=(',', ':'))

In [None]:
import ee
ee.Initialize(project='forest-data-partnership')

In [None]:
print(ee.Geometry(geoms[0]).getInfo())

## Test the deployed Cloud Function

Note that the Compute Engine default service account is being used for authentication to Earth Engine.  For commercial access to the sustainable sourcing layers, ensure that the service account is approved for commercial access ([request form](https://docs.google.com/forms/d/e/1FAIpQLSe7L3eh6t2JIPqEtAQwXwY7ZmW52v8W5vrIi4QN_XYgTNJZLw/viewform)).

In [None]:
!gcloud auth print-identity-token

Make a test request out of the WHISP sample data.

In [None]:
test_calls = [[json.dumps(g), 'foo_string', 'bar_string'] for g in geoms]
# Wrap the JSON body in single quotes so the shell passes it to curl as one argument.
test_request = "'" + json.dumps({'calls': test_calls}, separators=(',', ':')) + "'"

In [None]:
test_request
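
The request body follows the BigQuery remote function format: a JSON object whose `calls` array holds one list of arguments per row, answered by a `replies` array of the same length. The sketch below builds such a body for a made-up geometry and quotes it for the shell with the standard library's `shlex.quote`, as an alternative to adding the quotes by hand. `build_request_body` is a hypothetical helper, not part of the notebook's code.

```python
import json
import shlex


def build_request_body(geometries):
    """Hypothetical helper: one call per geometry, GeoJSON string first."""
    calls = [[json.dumps(g, separators=(',', ':'))] for g in geometries]
    return json.dumps({'calls': calls}, separators=(',', ':'))


# Made-up geometry for illustration only.
example_geom = {
    'type': 'Polygon',
    'coordinates': [[[0, 0], [0, 1], [1, 1], [1, 0], [0, 0]]],
}
body = build_request_body([example_geom])
quoted = shlex.quote(body)  # Safe to pass as a single shell argument.

# Round trip: the geometry survives the double JSON encoding.
decoded = json.loads(json.loads(body)['calls'][0][0])
print(decoded == example_geom)
```

Each geometry is encoded twice: once as a GeoJSON string (the function argument) and once more as part of the request body. The function reverses this with `json.loads(call[0])`.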

Make the request (might take a while).

In [None]:
responses = !curl -X POST https://{REGION}-{PROJECT}.cloudfunctions.net/suso_function \
  -H "Authorization: bearer $(gcloud auth print-identity-token)" \
  -H "Content-Type: application/json" \
  -d {test_request}

### Inspect the output of the function

The keys are useful for making the SQL to use in BigQuery.

In [None]:
print(len(responses))
response = responses[0]
response_json = json.loads(response)
replies = response_json['replies']
print(len(replies))
reply_0 = replies[0]
reply_0_json = json.loads(reply_0)
reply_0_json.keys()
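
Each reply is itself a JSON string, and rows that failed inside the function carry an `errorMessage` key instead of statistics. A minimal sketch of separating the two cases and converting the summed areas from square meters to hectares is shown below. The response text is made up for illustration, and `parse_replies` is a hypothetical helper.

```python
import json

SQ_M_PER_HECTARE = 10_000


def parse_replies(response_text):
    """Hypothetical helper: split replies into stats (hectares) and errors."""
    stats, errors = [], []
    replies = json.loads(response_text)['replies']
    for index, reply in enumerate(replies):
        row = json.loads(reply)
        if 'errorMessage' in row:
            errors.append((index, row['errorMessage']))
            continue
        # gini is a unitless ratio; the other values are areas in square meters.
        stats.append({
            key: value if key == 'gini' else value / SQ_M_PER_HECTARE
            for key, value in row.items()
        })
    return stats, errors


# Made-up response for illustration only.
example_response = json.dumps({'replies': [
    json.dumps({'forest': 15000.0, 'palm': 15000.0,
                'gini': 0.5, 'total_area': 30000.0}),
    json.dumps({'errorMessage': 'Invalid GeoJSON geometry.'}),
]})
stats, errors = parse_replies(example_response)
print(stats)
print(errors)
```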

# Connect to the Cloud Function from BigQuery

Follow [this BigQuery guide](https://cloud.google.com/bigquery/docs/remote-functions#create_a_remote_function) to set up a connection to the Cloud Function deployed previously.  Once the connection is set up, create a function to use in queries.  Run the following SQL in BigQuery, replacing `forest-data-partnership` with your own project ID.

```
CREATE OR REPLACE FUNCTION `forest-data-partnership.WHISP_DEMO.suso_function`(geom STRING) RETURNS STRING
REMOTE WITH CONNECTION `forest-data-partnership.us-central1.suso_function`
OPTIONS (
  endpoint = 'https://us-central1-forest-data-partnership.cloudfunctions.net/suso_function',
  max_batching_rows = 1
)
```

Once that's done, you can use your `suso_function` function in queries!  The keys extracted from the test response are useful for building the `SQL` that represents this query.  Note that the input table must have a geometry column and that the geometries are passed to the function as GeoJSON strings:

In [None]:
SQL_TEMPLATE = [f"JSON_EXTRACT_SCALAR(json_data, '$.{key}') AS {key}," for key in reply_0_json.keys()]
SQL_TEMPLATE = ['SELECT', 'geometry,'] + SQL_TEMPLATE
SQL_TEMPLATE = SQL_TEMPLATE + [
    'FROM',
    '`forest-data-partnership.WHISP_DEMO.input_examples`,',
    'UNNEST([SAFE.PARSE_JSON(`forest-data-partnership.WHISP_DEMO`.suso_function(ST_ASGEOJSON(geometry)))]) AS json_data']

print('\n'.join(SQL_TEMPLATE))
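
The same query can be assembled by a small function so that the dataset, table and function names are parameters rather than hard-coded strings. This is a sketch under the assumptions of the cell above (an input table with a `geometry` column and a remote function that returns a JSON string). `build_query`, its parameters, and the key names in the example are hypothetical.

```python
def build_query(keys, dataset,
                table_name='input_examples', function_name='suso_function'):
    """Hypothetical helper: SELECT one column per reply key."""
    columns = [
        f"  JSON_EXTRACT_SCALAR(json_data, '$.{key}') AS {key}"
        for key in keys
    ]
    # Join with commas so the last column has no trailing comma.
    select_list = ',\n'.join(['  geometry'] + columns)
    call = f'`{dataset}`.{function_name}(ST_ASGEOJSON(geometry))'
    return '\n'.join([
        'SELECT',
        select_list,
        'FROM',
        f'  `{dataset}.{table_name}`,',
        f'  UNNEST([SAFE.PARSE_JSON({call})]) AS json_data',
    ])


# Made-up project and key names; use reply_0_json.keys() in practice.
print(build_query(['forest', 'gini'], 'my-project.WHISP_DEMO'))
```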

# Next Steps

- Take that `SQL` blob over to BigQuery and run it!
- Try the [WHISP Cloud Function demo notebook](https://colab.research.google.com/drive/1NCaPOoxqmAEWb8c8V0kEHVunbW1yHVkL?resourcekey=0-HJ3ou94AbjdKkkvaPW1Jtw&usp=sharing).