<a href="https://colab.research.google.com/github/Vizzuality/mangrove-atlas-data/blob/master/process_mangrove_soil.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Prepare data for the mangrove-atlas project

https://github.com/Vizzuality/mangrove-atlas-data

`Edward P. Morris (vizzuality.)`

## Description
This notebook downloads the mangrove soil carbon data, creates upload manifests, copies the data to Google Cloud Storage (GCS), and creates the corresponding Earth Engine assets.

```
MIT License

Copyright (c) 2020 Vizzuality

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```

# Setup

Instructions for setting up the computing environment.

In [None]:
# Remove sample_data
!rm -r sample_data

## Linux dependencies

Instructions for adding Linux system packages (including node, etc.).

In [None]:
# Fix for curl certificates (rasterio virtual connectors)
# RasterioIOError: CURL error: error setting certificate verify locations:   CAfile: /etc/pki/tls/certs/ca-bundle.crt   CApath: none
!apt install ca-certificates
#!export CURL_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
!mkdir -p /etc/pki/tls/certs
!cp /etc/ssl/certs/ca-certificates.crt /etc/pki/tls/certs/ca-bundle.crt
# `!export` only affects a short-lived subshell; %env sets the variable for the notebook process
%env CURL_CA_BUNDLE=/etc/ssl/certs/ca-certificates.crt
!ls /etc/pki/tls/certs

Reading package lists... Done
Building dependency tree       
Reading state information... Done
ca-certificates is already the newest version (20190110~18.04.1).
The following package was automatically installed and is no longer required:
  libnvidia-common-440
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 33 not upgraded.
ca-bundle.crt


In [None]:
%%bash
#apt install -q -y [package-name]
#npm install -g [package-name]

## Python packages

In [None]:
# Install python packages
!pip install -q iso8601 pystac python_jsonschema_objects rasterio rio-cogeo shapely zenodo-get tqdm uuid

  Building wheel for rio-cogeo (setup.py) ... done
  Building wheel for uuid (setup.py) ... done
  Building wheel for supermercado (setup.py) ... done
  Building wheel for wget (setup.py) ... done


In [None]:
# Check shapely speedups are available
from shapely import speedups
speedups.enabled

True

In [None]:
# Show python package versions
!pip list

Package                   Version        
------------------------- ---------------
absl-py                   0.9.0          
affine                    2.3.0          
alabaster                 0.7.12         
albumentations            0.1.12         
altair                    4.1.0          
asgiref                   3.2.10         
astor                     0.8.1          
astropy                   4.0.1.post1    
astunparse                1.6.3          
atari-py                  0.2.6          
atomicwrites              1.4.0          
attrs                     19.3.0         
audioread                 2.1.8          
autograd                  1.3            
Babel                     2.8.0          
backcall                  0.2.0          
beautifulsoup4            4.6.3          
bleach                    3.1.5          
blis                      0.4.1          
bokeh                     1.4.0          
boto                      2.49.0         
boto3                     1.14.9  

## Authorisation

Setting up connections and authorisation to cloud services.

### Google Cloud

Authenticate either interactively via a URL or by adding service account credentials.

If you do not share the notebook, you can mount your Drive and transfer the credentials to disk. Note that if the notebook is shared, you always need to authenticate via URL.

In [None]:
# Set the Google Cloud params
gc_project_id = "mangrove-atlas-246414"
gc_creds = "mangrove-atlas-246414-2f33cc439deb.json"
gc_username = "edward-morris-vizzuality-com-d@mangrove-atlas-246414.iam.gserviceaccount.com"
gcs_prefix = "gs://mangrove_atlas"
gcs_http_prefix = "https://storage.googleapis.com/mangrove_atlas"

#### without service account

In [None]:
# For auth WITHOUT service account
# https://cloud.google.com/resource-manager/docs/creating-managing-projects
#from google.colab import auth
#auth.authenticate_user()
#!gcloud config set project {gc_project_id}

#### with service account

In [None]:
# If the notebook is NOT shared, mount Drive to access the credentials file
#from google.colab import drive
#drive.mount('/content/drive')

In [None]:
# If Drive is mounted, copy GC credentials to home (place in your GDrive, and connect Drive)
!cp "/content/drive/My Drive/{gc_creds}" "/root/.{gc_creds}"

In [None]:
# Auth WITH service account
!gcloud auth activate-service-account {gc_username} --key-file=/root/.{gc_creds} --project={gc_project_id}

Activated service account credentials for: [edward-morris-vizzuality-com-d@mangrove-atlas-246414.iam.gserviceaccount.com]


In [None]:
# Test GC auth
!gsutil ls {gcs_prefix}

gs://mangrove_atlas/wdpa_geometry_types_.csv
gs://mangrove_atlas/./
gs://mangrove_atlas//
gs://mangrove_atlas/boundaries/
gs://mangrove_atlas/deforestation-alerts/
gs://mangrove_atlas/ee-export-tables/
gs://mangrove_atlas/ee-upload-manifests/
gs://mangrove_atlas/elevation/
gs://mangrove_atlas/environmental-pressures/
gs://mangrove_atlas/gadm-eez.zarr/
gs://mangrove_atlas/land-cover/
gs://mangrove_atlas/mangrove-properties/
gs://mangrove_atlas/orthoimagery/
gs://mangrove_atlas/physical-environment/
gs://mangrove_atlas/tilesets/
gs://mangrove_atlas/tmp/


# Utils

Generic helper functions used in the subsequent processing. For easy navigation, each function is separated into a section named after the function.

## mkdirs

In [None]:
from pathlib import Path

def mkdirs(dirs_list, exist_ok=True):
  """ Create nested directories
  """
  for p in dirs_list:
    Path(p).mkdir(parents=True, exist_ok=exist_ok)

## copy_gcs

In [None]:
import os
import subprocess

def copy_gcs(source_list, dest_list, opts=""):
  """
  Use gsutil to copy each corresponding item in source_list
  to dest_list.

  Example:
  copy_gcs(["gs://my-bucket/data-file.csv"], ["."])

  """
  for s, d  in zip(source_list, dest_list):
    cmd = f"gsutil -m cp -r {opts} {s} {d}"
    print(f"Processing: {cmd}")
    r = subprocess.call(cmd, shell=True)
    if r == 0:
        print("Task created")
    else:
        print("Task failed")
  print("Finished copy")
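
As a hedged alternative to building a shell string, the same copy can be expressed as an argument list, which avoids quoting problems when paths contain spaces. The helper names `build_gsutil_cp` and `run_copy` are hypothetical and not part of the original notebook; the `gsutil -m cp -r` invocation is the one used by `copy_gcs`, and `gs://my-bucket` is a placeholder.

```python
import shlex
import subprocess

def build_gsutil_cp(source, dest, opts=""):
    """Return the gsutil copy command as an argument list (hypothetical helper)."""
    # shlex.split turns an option string such as "-n" into separate arguments
    return ["gsutil", "-m", "cp", "-r", *shlex.split(opts), source, dest]

def run_copy(source, dest, opts=""):
    """Run the copy without a shell; return True if gsutil exited with 0."""
    result = subprocess.run(build_gsutil_cp(source, dest, opts))
    return result.returncode == 0

# Inspect the command without running it
print(build_gsutil_cp("my data.zip", "gs://my-bucket/data/", opts="-n"))
```

Because no shell is involved, a file name such as `my data.zip` is passed to `gsutil` as a single argument.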

## unpack

In [None]:
import os
import subprocess

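def unpack(source_list, dest_list):
  """
  Use tar, gzip or unzip to unpack paths in source_list
  to each corresponding path in dest_list

  For .tgz and .zip the destination is a directory;
  for .gz it is the path of the decompressed file.

  FIXME: add more package options!
  """
  for s, d  in zip(source_list, dest_list):
    fn, fe = os.path.splitext(s)
    if fe == ".tgz":
      cmd = f"tar -xvzf {s} -C {d}"
    elif fe == ".gz":
      # -d decompresses; without it gzip would compress the file again
      cmd = f"gzip -dc {s} > {d}"
    elif fe == ".zip":
      cmd = f"unzip {s} -d {d}"
    else:
      # Avoid running an undefined or stale command
      print(f"Skipping {s}: unsupported extension '{fe}'")
      continue
    print(f"Processing: {cmd}")
    r = subprocess.call(cmd, shell=True)
    if r == 0:
        print("Task created")
    else:
        print("Task failed")
  print("Finished unpacking")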
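
The same unpacking can be sketched with the Python standard library alone, which removes the dependency on the `tar`, `gzip` and `unzip` binaries. This is an illustrative alternative, not the notebook's method; `unpack_py` is a hypothetical name.

```python
import gzip
import shutil
from pathlib import Path

def unpack_py(source, dest):
    """Unpack source into dest using only the standard library (hypothetical helper)."""
    source, dest = Path(source), Path(dest)
    if source.suffix == ".gz" and not source.name.endswith(".tar.gz"):
        # A plain .gz holds a single file: dest is the decompressed file path
        dest.parent.mkdir(parents=True, exist_ok=True)
        with gzip.open(source, "rb") as f_in, open(dest, "wb") as f_out:
            shutil.copyfileobj(f_in, f_out)
    else:
        # Archives (.zip, .tgz, .tar.gz, ...): dest is the extraction directory
        shutil.unpack_archive(str(source), str(dest))
    return dest
```

`shutil.unpack_archive` picks the format from the file extension and raises `shutil.ReadError` for extensions it does not recognise.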

## gee_upload_images_to_collection

In [None]:
import glob
from tqdm.notebook import trange, tqdm

def gee_upload_images_to_collection(dir_path, asset_path, gcs_tmp_path, pyramiding_policy, bands, time_start, nodata_value, force=True, properties={}):
    '''Given a local directory path and a GEE asset path
       Create ee.ImageCollection, upload image to GCS, and add images + metadata to a SINGLE ee.ImageCollection asset
    '''
    
    # Format arguments
    f = ""
    if force:
      f = "--force"
    pp = f"--pyramiding_policy={pyramiding_policy}" 
    ts = f"--time_start={time_start}"
    n = f"--nodata_value={nodata_value}"
    b =  f"--bands={bands}"
    p = ""
    if len(properties) > 0:
      p = [f"--property={key}={value}" for key, value in properties.items()]
      p = " ".join(p) 
    args = f"{f} {pp} {ts} {n} {b} {p}"
    
    # Get file path array
    file_array = glob.glob(dir_path)
    #subprocess.check_output(cmd, shell=True).decode('utf8').split('\n')
    print(f"Found {len(file_array)} files")
    
    # Create collection
    cmd = f"earthengine --no-use_cloud_api create collection {asset_path}"
    print(cmd)
    r = subprocess.call(cmd, shell=True)
    if r == 0:
        print("Task created")
    else:
        print("Task failed")
    
    # Create upload task for each file
    with tqdm(total=len(file_array), desc="Uploading images") as pbar:
      for file in file_array:
      
        # Get asset item id
        asset_id = os.path.splitext(os.path.basename(file))[0]
        #print(f"Processing {asset_id}")

        # Upload to GCS tmp path
        cmd = f"gsutil -m cp -r {file} {gcs_tmp_path}" 
        #print(cmd)
        r = subprocess.call(cmd, shell=True)
        #if r == 0:
        #  print("Task created")
        #else:
        #  print("Task failed")

        # Upload to earthengine 
        gcs_file = f"{gcs_tmp_path}/{os.path.basename(file)}"
        cmd = f"earthengine --no-use_cloud_api upload image --asset_id={asset_path}/{asset_id} {args} {gcs_file}"
        #print(cmd)
        r = subprocess.call(cmd, shell=True)
        #if r == 0:
        #  print("Task created")
        #else:
        #  print("Task failed")
        pbar.update(1)
    
    print("Finished upload")
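
To make the per-file command easier to inspect before anything is uploaded, the argument formatting can be separated from the execution. The sketch below assumes the same flags that `gee_upload_images_to_collection` already passes to `earthengine upload image`; `build_upload_cmd` is a hypothetical helper name, and the paths in the example call are placeholders.

```python
import os

def build_upload_cmd(file_path, asset_path, gcs_tmp_path, pyramiding_policy,
                     bands, time_start, nodata_value, force=True, properties=None):
    """Return the earthengine upload command for one image as an argument list."""
    # The asset item id is the file name without its extension
    item_id = os.path.splitext(os.path.basename(file_path))[0]
    gcs_file = f"{gcs_tmp_path}/{os.path.basename(file_path)}"
    cmd = ["earthengine", "--no-use_cloud_api", "upload", "image",
           f"--asset_id={asset_path}/{item_id}"]
    if force:
        cmd.append("--force")
    cmd += [f"--pyramiding_policy={pyramiding_policy}",
            f"--time_start={time_start}",
            f"--nodata_value={nodata_value}",
            f"--bands={bands}"]
    # One --property flag per key/value pair
    cmd += [f"--property={k}={v}" for k, v in (properties or {}).items()]
    cmd.append(gcs_file)
    return cmd

print(" ".join(build_upload_cmd("./dataset/tile_A.tif", "users/me/collection",
                                "gs://my-bucket/tmp", "mean", "soc",
                                "2016-01-01", -32768)))
```

Using `properties=None` instead of a `{}` default also avoids sharing one mutable dictionary between calls.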

## gee_image_collection_to_image

In [None]:
import ee
import subprocess

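def gee_image_collection_to_image(ic_path, im_path, pyramidingPolicy, force = False):
  """ Mosaic an ee.ImageCollection into a single ee.Image asset

  Returns an unstarted export task; call task.start() to run it.
  If force is True, any existing asset at im_path is removed first.
  """
  if force: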
    cmd = f"earthengine --no-use_cloud_api rm -r {im_path}"
    #print(cmd)
    r = subprocess.call(cmd, shell=True)
    if r == 0:
        print(f"Removed {im_path}")
    else:
        print("Task failed")
  
  # Initialize the earthengine library.
  ee.Initialize()
  
  # Get imageCollection
  ic = ee.ImageCollection(ic_path)
  
  # Get bounds of collection
  geom = ic.geometry().bounds() 
  #print("Bbox:", geom.getInfo())

  # Get nominal resolution of first image
  nr = ic.first().projection().nominalScale().getInfo()
  #print("Nominal resolution (m):", nr)
  
  # Get band names of first image
  bns = ic.first().bandNames().getInfo()
  #print("Band names:", bns)
  
  # Create image mosaic
  im = ee.Image(ic.mosaic().reduce(ee.Reducer.firstNonNull()))
  # update bandnames
  im = im.rename(bns)
  
  # Define export args
  args = {
  'image': im
  , 'description': "convert_ic_to_im"
  , 'assetId': im_path
  , 'pyramidingPolicy': dict(zip(bns, pyramidingPolicy)) 
  , 'dimensions': None
  , 'region': geom
  , 'scale': nr
  , 'crs': None
  , 'crsTransform': None
  , 'maxPixels': 1e12}
  #print(args)

  # return task
  return ee.batch.Export.image.toAsset(**args)
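
Note that `dict(zip(bns, pyramidingPolicy))` silently drops bands when the policy list is shorter than the band list, because `zip` stops at the shorter input. A minimal sketch of a stricter mapping follows; `build_pyramiding_policy` is a hypothetical helper, and reusing a single policy for all bands is an assumption about the desired behaviour.

```python
def build_pyramiding_policy(band_names, policies):
    """Map each band name to a pyramiding policy (hypothetical helper)."""
    policies = list(policies)
    if len(policies) == 1:
        # Assumption: a single policy applies to every band
        policies = policies * len(band_names)
    if len(policies) != len(band_names):
        raise ValueError(
            f"Got {len(policies)} policies for {len(band_names)} bands")
    return dict(zip(band_names, policies))

print(build_pyramiding_policy(["soc"], ["mean"]))
```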

## get_ds_props

In [None]:
import os
import json
import requests

def get_ds_props(file_path, out_style="manifest"):
  """ Get dataset schema from file

  Optionally convert to an ImageManifest.Properties object

  @arg file_path File path or URL string of a JSON(-LD) file
  @arg out_style Object style to return; set to False for no change

  @return A dictionary of properties
  """

  # read the schema from a URL or a local file
  if file_path.startswith("http"):
    ds = json.loads(requests.get(file_path).text)
  else:
    with open(file_path, 'r') as f:
      ds = json.load(f)
  #print(ds)

  # by default return the schema unchanged
  out = ds

  # optionally format output
  if out_style == "manifest":
    # create properties template
    out = {
        "name" : ds.get("name"),
        "license" : ds.get("license"),
        "identifier" : ds.get("identifier"),
        "url" : ds.get("url"),
        "version" : ds.get("version"),
        "creator" : ds.get("creator")[0]["affiliation"],
        "keywords" : '; '.join(ds.get("keywords")),
        "description": json.dumps(ds.get("description"))
        }

  return out
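
To make the expected input shape concrete, the sketch below applies the same field mapping to an in-memory record. The record is an invented placeholder, not the content of the real `.jsonld` file; only its keys mirror those read by `get_ds_props`. `to_manifest_props` is a hypothetical name.

```python
import json

# Invented placeholder record; only the keys mirror those read by get_ds_props
sample = {
    "name": "Example dataset",
    "license": "https://example.org/license",
    "identifier": "https://example.org/id",
    "url": "https://example.org/record",
    "version": "0.1",
    "creator": [{"name": "A. Author", "affiliation": "Example Org"}],
    "keywords": ["mangroves", "soil carbon"],
    "description": "<p>Example description</p>",
}

def to_manifest_props(ds):
    """Flatten a dataset record into string-valued properties (hypothetical helper)."""
    return {
        "name": ds.get("name"),
        "license": ds.get("license"),
        "identifier": ds.get("identifier"),
        "url": ds.get("url"),
        "version": ds.get("version"),
        # Only the affiliation of the first creator is kept
        "creator": ds.get("creator")[0]["affiliation"],
        # Keyword list becomes one semicolon-separated string
        "keywords": "; ".join(ds.get("keywords")),
        # The description is JSON-encoded, so it gains surrounding quotes
        "description": json.dumps(ds.get("description")),
    }

props = to_manifest_props(sample)
print(props["creator"], "|", props["keywords"])
```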


## gee_update_asset_properties

In [None]:
import subprocess
import json

def gee_update_asset_properties(asset_path, properties = {}, time_start=None, time_end=None, dry_run=False):
  
  # Format arguments
  ts = ""
  if time_start:
    ts = f"--time_start={time_start}"
  te = ""
  if time_end:
    te = f"--time_end={time_end}"  
  p = ""
  if len(properties) > 0:
    p = [f"--property={key}={json.dumps(value)}" for key, value in properties.items()]
    p = " ".join(p) 
  args = f"{ts} {te} {p}"

  # Update asset
  cmd = f"earthengine --no-use_cloud_api asset set {args} {asset_path}"
  if dry_run:
    print(cmd)
  else:
    r = subprocess.call(cmd, shell=True)
    if r == 0:
      print(f"\nUpdated properties for asset: {asset_path}\n")
      cmd = f"earthengine --no-use_cloud_api asset info {asset_path}"
      out = subprocess.check_output(cmd, shell=True).decode('utf8')
      print(out)
    else:
      print("Task failed")
      print(cmd)
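
Property values such as citations or HTML descriptions contain spaces, quotes and other characters the shell interprets. One hedged way to keep the command string safe is to quote every argument with `shlex.quote`. The helper `build_asset_set_cmd` is hypothetical, and it passes the raw value instead of `json.dumps(value)`, which is an assumption about the desired stored value.

```python
import shlex

def build_asset_set_cmd(asset_path, properties=None, time_start=None, time_end=None):
    """Return a shell-safe `earthengine asset set` command string (hypothetical helper)."""
    parts = ["earthengine", "--no-use_cloud_api", "asset", "set"]
    if time_start:
        parts.append(f"--time_start={time_start}")
    if time_end:
        parts.append(f"--time_end={time_end}")
    for key, value in (properties or {}).items():
        parts.append(f"--property={key}={value}")
    parts.append(asset_path)
    # shlex.quote protects spaces, quotes and other shell metacharacters
    return " ".join(shlex.quote(p) for p in parts)

print(build_asset_set_cmd("users/me/asset", {"units": "t OC / ha"},
                          time_start="2016-01-01"))
```

Splitting the result with `shlex.split` returns the original arguments unchanged, which is a quick way to check the quoting.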


## gee_create_image_collection

In [None]:
import iso8601

def gee_create_image_collection(ee_asset_path, image_start_times, collection_properties = {}, dry_run=False):

  # Check if collection exists, potentially filter file array or create collection
  print("\nChecking if collection exists...")
  cmd = f"earthengine --no-use_cloud_api asset info {ee_asset_path}"

  try:
    out = subprocess.check_output(cmd, shell=True).decode('utf8')
    print(f"\nee.ImageCollection {ee_asset_path} exists, with properties\n:")
    print(out)
  except subprocess.CalledProcessError as ex: 
    print ("\nImageCollection not found\n")
    print (ex.output)
    # Create collection
    cmd = f"earthengine --no-use_cloud_api create collection {ee_asset_path}"
    print("\nCreating ee.ImageCollection\n")
    if dry_run:
      print(cmd)
    else:
      r = subprocess.call(cmd, shell=True)
      if r == 0:
        print(f"\nee.ImageCollection {ee_asset_path} created\n")
      else:
        print("\nTask failed")
        print(cmd)
        print("\n")
    
  # Update the collection properties
  print("\nUpdating ImageCollection properties...")
  ts = [iso8601.parse_date(t) for t in image_start_times]
  collection_time_start = min(ts).strftime("%Y-%m-%d")
  collection_time_end = max(ts).strftime("%Y-%m-%d")  
  gee_update_asset_properties(
      ee_asset_path,
      properties = collection_properties,
      time_start = collection_time_start,
      time_end = collection_time_end,
      dry_run = dry_run
      )
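
The collection time range is simply the earliest and latest image start time. A minimal sketch of that step using only the standard library (Python 3.7 or later) instead of `iso8601`; `collection_time_range` is a hypothetical helper and the dates are example values.

```python
from datetime import date

def collection_time_range(image_start_times):
    """Return (time_start, time_end) as YYYY-MM-DD strings (hypothetical helper)."""
    dates = [date.fromisoformat(t) for t in image_start_times]
    return min(dates).isoformat(), max(dates).isoformat()

print(collection_time_range(["2016-01-01", "1996-01-01", "2007-01-01"]))
# → ('1996-01-01', '2016-01-01')
```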
  

## folium_add_ee_layer

In [None]:
# Import libraries.
import ee
import folium

# Define a method for displaying Earth Engine image tiles to folium map.
def add_ee_layer(self, ee_image_object, vis_params, name):
  map_id_dict = ee.Image(ee_image_object).getMapId(vis_params)
  folium.raster_layers.TileLayer(
    tiles = map_id_dict['tile_fetcher'].url_format,
    attr = "Map Data © Google Earth Engine",
    name = name,
    overlay = True,
    control = True
  ).add_to(self)

# Add EE drawing method to folium.
folium.Map.add_ee_layer = add_ee_layer

# Processing

Data processing organised into sections.

In [None]:
# Set working directories
ds_dir = "./dataset"

In [None]:
# Make directory structure
mkdirs([ds_dir], exist_ok=True)
!ls

cogs	       mangroves_SOC30m_0_100cm.vrt	       stac
dataset        mangroves_SOC30m_0_100cm.zip	       thumbs
drive	       mangroves_SOC_points.gpkg	       tmp
ee-manifest    md5sums.txt			       wsg84
file_list.txt  preview_mangroves_soil_carbon_QGIS.png


## Get data package

In [None]:
%%bash
ds_doi="10.5281/zenodo.2536803"
zenodo_get $ds_doi

Title: Predicted soil organic carbon stock at 30 m in t/ha for 0-100 cm depth global / update of the map of mangrove forest soil carbon
Keywords: mangroves, soil carbon, machine learning, superlearner package
Publication date: 2018-10-23
DOI: 10.5281/zenodo.2536803
Total size: 364.5 MB

Link: https://zenodo.org/api/files/826b6287-7c3d-4a49-a1d5-cbecb5d0496d/mangroves_SOC30m_0_100cm.zip   size: 360.8 MB

Checksum is correct. (bfb5236878b0b60dee11ca9157e39c38)

Link: https://zenodo.org/api/files/826b6287-7c3d-4a49-a1d5-cbecb5d0496d/mangroves_SOC_points.gpkg   size: 2.3 MB

Checksum is correct. (32e9ca9c6736b2b5fe63435a24073f2b)

Link: https://zenodo.org/api/files/826b6287-7c3d-4a49-a1d5-cbecb5d0496d/preview_mangroves_soil_carbon_QGIS.png   size: 1.4 MB

Checksum is correct. (aad56eac7d787a00f2a8c119650641d3)
All files have been downloaded.


In [None]:
# copy data package to GCS
ds_name = "mangroves_SOC30m_0_100cm"
copy_gcs([f"{ds_name}.zip"], [f"{gcs_prefix}/mangrove-properties/"])

Processing: gsutil -m cp -r  mangroves_SOC30m_0_100cm.zip gs://mangrove_atlas/mangrove-properties/
Task created
Finished copy


In [None]:
# Make public
#set_acl_to_public(f"{gcs_prefix}/mangrove-properties/")

In [None]:
# Unpack the dataset
unpack(["mangroves_SOC30m_0_100cm.zip"], [ds_dir])

Processing: unzip mangroves_SOC30m_0_100cm.zip -d ./dataset
Task created
Finished unpacking


In [None]:
# View info for single tile
!gdalinfo /content/dataset/mangroves_SOC30m_0_100cm/dSOCS_0_100cm_year2000_30m_T10037.tif

Driver: GTiff/GeoTIFF
Files: /content/dataset/mangroves_SOC30m_0_100cm/dSOCS_0_100cm_year2000_30m_T10037.tif
Size is 4000, 4000
Coordinate System is:
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0],
    UNIT["degree",0.0174532925199433],
    AUTHORITY["EPSG","4326"]]
Origin = (136.000000000000000,-34.000809433999997)
Pixel Size = (0.000250000000000,-0.000250000000000)
Metadata:
  AREA_OR_POINT=Area
Image Structure Metadata:
  COMPRESSION=DEFLATE
  INTERLEAVE=BAND
Corner Coordinates:
Upper Left  ( 136.0000000, -34.0008094) (136d 0' 0.00"E, 34d 0' 2.91"S)
Lower Left  ( 136.0000000, -35.0008094) (136d 0' 0.00"E, 35d 0' 2.91"S)
Upper Right ( 137.0000000, -34.0008094) (137d 0' 0.00"E, 34d 0' 2.91"S)
Lower Right ( 137.0000000, -35.0008094) (137d 0' 0.00"E, 35d 0' 2.91"S)
Center      ( 136.5000000, -34.5008094) (136d30' 0.00"E, 34d30' 2.91"S)
Band 1 Block

## Upload to earthengine

The dataset has a single timestamp and consists of multiple images sparsely distributed within the bounding box. The upload steps are:

+ Upload to GCS tmp directory 
+ Upload as imageCollection
+ Export as single image
+ Update image metadata
+ Delete tmp files and imageCollection

In [None]:
# Authenticate earthengine
!earthengine authenticate

In [None]:
#!earthengine upload image -h

Instructions for updating:
non-resource variables are not supported in the long term
usage: earthengine upload image [-h] [--wait [WAIT]] [--force]
                                [--asset_id ASSET_ID] [--last_band_alpha]
                                [--nodata_value NODATA_VALUE]
                                [--pyramiding_policy PYRAMIDING_POLICY]
                                [--bands BANDS] [--crs CRS]
                                [--manifest MANIFEST] [--property PROPERTY]
                                [--time_start TIME_START]
                                [--time_end TIME_END]
                                [src_files [src_files ...]]

Uploads an image from Cloud Storage to Earth Engine. See docs for "asset set"
for additional details on how to specify asset metadata properties.

positional arguments:
  src_files             Cloud Storage URL(s) of the file(s) to upload. Must
                        have the prefix 'gs://'.

optional arguments:
  -h, --help        

### Upload to GCS and add to imageCollection

In [None]:
# Define parameters
dir_path = "./dataset/mangroves_SOC30m_0_100cm/*.tif"
asset_path = "projects/global-mangrove-watch/mangrove-properties/mangroves_SOC30m_0_100cm_collection"
gcs_tmp_path = "gs://mangrove_atlas/tmp"
pyramiding_policy = "mean"
bands = "soc"
time_start = "2016-01-01"
nodata_value = -32768

In [None]:
# Upload to imageCollection
gee_upload_images_to_collection(dir_path, asset_path, gcs_tmp_path, pyramiding_policy, bands, time_start, nodata_value, force=True, properties={})

Streaming output truncated to the last 5000 lines.
Processing dSOCS_0_100cm_year2000_30m_T21525
gsutil -m cp -r ./dataset/mangroves_SOC30m_0_100cm/dSOCS_0_100cm_year2000_30m_T21525.tif gs://mangrove_atlas/tmp
Task created
earthengine --no-use_cloud_api upload image --asset_id=projects/global-mangrove-watch/mangrove-properties/mangroves_SOC30m_0_100cm_collection/dSOCS_0_100cm_year2000_30m_T21525 --force --pyramiding_policy=mean --time_start=2000-01-01 --nodata_value=-32768 --bands=soc  gs://mangrove_atlas/tmp/dSOCS_0_100cm_year2000_30m_T21525.tif
Task created
Processing dSOCS_0_100cm_year2000_30m_T30348
gsutil -m cp -r ./dataset/mangroves_SOC30m_0_100cm/dSOCS_0_100cm_year2000_30m_T30348.tif gs://mangrove_atlas/tmp
Task created
earthengine --no-use_cloud_api upload image --asset_id=projects/global-mangrove-watch/mangrove-properties/mangroves_SOC30m_0_100cm_collection/dSOCS_0_100cm_year2000_30m_T30348 --force --pyramiding_policy=mean --time_start=2000-01-01 --nodata_value=-3

### Convert imageCollection to image

In [None]:
import ee

In [None]:
# Trigger the authentication flow.
ee.Authenticate()

# Initialize the library.
ee.Initialize()

To authorize access needed by Earth Engine, open the following URL in a web browser and follow the instructions. If the web browser does not start automatically, please manually browse the URL below.

    https://accounts.google.com/o/oauth2/auth?client_id=517222506229-vsmmajv00ul0bs7p89v5m89qs8eb9359.apps.googleusercontent.com&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fearthengine+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdevstorage.full_control&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&response_type=code&code_challenge=fr5onOxor_TJbU-Ylz0z5qqdp4_JxiQXDenlpSYk6OA&code_challenge_method=S256

The authorization workflow will generate a code, which you should paste in the box below. 
Enter verification code: 4/0gE8g1zR7u257hb3HPHjMRTE9ptIo93O4XB59VlN3WIzkG6tNUq8nZA

Successfully saved authorization token.


In [None]:
# Convert ic to im
task = gee_image_collection_to_image(
    ic_path = 'projects/global-mangrove-watch/mangrove-properties/mangroves_SOC30m_0_100cm_collection',
    im_path = 'projects/global-mangrove-watch/mangrove-properties/mangroves_SOC30m_0_100cm',
    pyramidingPolicy = ['mean'],
    force = True)  
task.start()

earthengine --no-use_cloud_api rm -r projects/global-mangrove-watch/mangrove-properties/mangroves_SOC30m_0_100cm
Removed projects/global-mangrove-watch/mangrove-properties/mangroves_SOC30m_0_100cm
Bbox: {'geodesic': False, 'type': 'Polygon', 'coordinates': [[[-119.00017703618076, -39.00095157225003], [205.0001398310314, -39.00095157225003], [205.0001398310314, 33.9993318246689], [-119.00017703618076, 33.9993318246689], [-119.00017703618076, -39.00095157225003]]]}
Nominal resolution (m): 27.829872698318393
Band names: ['soc']


In [None]:
# Check task status
task.status()

{'creation_timestamp_ms': 1591885501544,
 'description': 'convert_ic_to_im',
 'destination_uris': ['https://code.earthengine.google.com/?asset=projects/earthengine-legacy/assets/projects/global-mangrove-watch/mangrove-properties/mangroves_SOC30m_0_100cm'],
 'id': 'A3UKVXPNUXXDLXPPSBUYGRF4',
 'name': 'projects/earthengine-legacy/operations/A3UKVXPNUXXDLXPPSBUYGRF4',
 'start_timestamp_ms': 1591885512856,
 'state': 'COMPLETED',
 'task_type': 'EXPORT_IMAGE',
 'update_timestamp_ms': 1591890333352}
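
Rather than re-running the status cell by hand, the task can be polled until it finishes. The sketch below takes any callable that returns a status dictionary with a `state` key, so it can be tried without Earth Engine. `wait_for_task` is a hypothetical helper, and treating `COMPLETED`, `FAILED` and `CANCELLED` as the terminal states is an assumption.

```python
import time

# Assumed terminal states of an export task
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED"}

def wait_for_task(get_status, poll_seconds=30, timeout_seconds=3600):
    """Poll get_status() until the task reaches a terminal state (hypothetical helper)."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status.get("state") in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task still in state {status.get('state')}")
        time.sleep(poll_seconds)

# Example with a stand-in status function instead of a real task
fake_states = iter([{"state": "READY"}, {"state": "RUNNING"}, {"state": "COMPLETED"}])
final = wait_for_task(lambda: next(fake_states), poll_seconds=0)
print(final["state"])
```

With a real export task, the call would be `wait_for_task(task.status)`.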

### Update image properties

In [None]:
#!earthengine --no-use_cloud_api asset info "projects/global-mangrove-watch/mangrove-properties/mangroves_SOC30m_0_100cm"

In [None]:
import pprint
# Scrape dataset properties from JSON-LD file
props = get_ds_props("https://storage.googleapis.com/mangrove_atlas/mangrove-properties/mangroves_SOC30m_0_100cm.jsonld")
#pprint.pprint(props, indent=4)

In [None]:
# Add more properties using Skydipper API Metadata schema names
time_start = "2016-01-01"
time_end = "2016-01-01"
props.update({
    'language': 'en', 
    'altName': "mangroves_SOC30m_0_100cm",
    'citation': "Tomislav Hengl. (2018). Predicted soil organic carbon stock at 30 m in t/ha for 0-100 cm depth global / update of the map of mangrove forest soil carbon (Version 0.2) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.2536803",
    'distribution': "https://zenodo.org/api/files/826b6287-7c3d-4a49-a1d5-cbecb5d0496d/mangroves_SOC30m_0_100cm.zip",
    'variableMeasured': "Predicted soil organic carbon in the depth layer 0 to 100cm",
    'units': "t OC / ha",
    'spatialCoverage': "Global tropics",
    'temporalCoverage': "2016",
    'dataLineage': "Data downloaded from doi, unpacked, and added to Google earth engine."
})
#pprint.pprint(props, indent=4)    

In [None]:
# Update asset properties
gee_update_asset_properties(
    "projects/global-mangrove-watch/mangrove-properties/mangroves_SOC30m_0_100cm",
    properties = props,
    time_start=time_start,
    time_end=time_end
    )

earthengine --no-use_cloud_api asset set --time_start=2018-01-01 --time_end=2018-01-01 --property=name="Predicted soil organic carbon stock at 30 m in t/ha for 0-100 cm depth global / update of the map of mangrove forest soil carbon" --property=license="http://creativecommons.org/licenses/by-sa/4.0/legalcode" --property=identifier="https://doi.org/10.5281/zenodo.2536803" --property=url="https://zenodo.org/record/2536803" --property=version="0.2" --property=creator="Envirometrix Ltd" --property=keywords="mangroves; soil carbon; machine learning; superlearner package" --property=description="\"<p>This is an update of maps produced by&nbsp;<a href=\\\"https://doi.org/10.1088/1748-9326/aabe1c\\\">Sanderman et al (2018)</a>. The improvements to the 3D spatial prediction include:</p>\\n\\n<ul>\\n\\t<li>\\n\\t<p>new updated global mangrove coverage map (contact Thomas Worthington),</p>\\n\\t</li>\\n\\t<li>\\n\\t<p>new ALOS-based DEM of the world AW3D30 v18.04,</p>\\n\\t</li>\\n\\t<li>\\n\\t<p

In [None]:
# View properties
!earthengine --no-use_cloud_api asset info "projects/global-mangrove-watch/mangrove-properties/mangroves_SOC30m_0_100cm"

Instructions for updating:
non-resource variables are not supported in the long term
{
  "bands": [
    {
      "crs": "EPSG:4326",
      "crs_transform": [
        0.00025,
        0.0,
        -119.00025000000001,
        0.0,
        -0.00025,
        33.9995
      ],
      "data_type": {
        "max": 32767,
        "min": -32768,
        "precision": "int",
        "type": "PixelType"
      },
      "dimensions": [
        1296002,
        292002
      ],
      "id": "soc"
    }
  ],
  "id": "projects/global-mangrove-watch/mangrove-properties/mangroves_SOC30m_0_100cm",
  "properties": {
    "altName": "mangroves_SOC30m_0_100cm",
    "citation": "Tomislav Hengl. (2018). Predicted soil organic carbon stock at 30 m in t/ha for 0-100 cm depth global / update of the map of mangrove forest soil carbon (Version 0.2) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.2536803",
    "creator": "Envirometrix Ltd",
    "dataLineage": "Data downloaded from doi, unpacked, and added to Google ea

### View

In [None]:
# Import libraries.
import ee
import folium

# Trigger the authentication flow.
#ee.Authenticate()

# Initialize the library.
ee.Initialize()

# Define a method for displaying Earth Engine image tiles to folium map.
def add_ee_layer(self, ee_image_object, vis_params, name):
  map_id_dict = ee.Image(ee_image_object).getMapId(vis_params)
  folium.raster_layers.TileLayer(
    tiles = map_id_dict['tile_fetcher'].url_format,
    attr = "Map Data © Google Earth Engine",
    name = name,
    overlay = True,
    control = True
  ).add_to(self)

# Add EE drawing method to folium.
folium.Map.add_ee_layer = add_ee_layer

# Fetch an image.
im = ee.Image("projects/global-mangrove-watch/mangrove-properties/mangroves_SOC30m_0_100cm")

# Set visualization parameters.
vis_params = {
  'min': 0,
  'max': 400,
  'palette': ['006633', 'E5FFCC', '662A00', 'D8D8D8', 'F5F5F5']}

# Create a folium map object.
my_map = folium.Map(location=[-7.998, 39.4767], zoom_start=9, height=500)

# Add the soil organic carbon layer to the map object.
my_map.add_ee_layer(im.updateMask(im.gt(0)), vis_params, 'SOC (t/ha, 0-100 cm)')

# Add a layer control panel to the map.
my_map.add_child(folium.LayerControl())

# Display the map.
display(my_map)

### Clean up

+ remove data from tmp dir
+ remove the temporary ee.ImageCollection

In [None]:
!gsutil -m rm -r -q {gcs_tmp_path}/*.tif

Removing gs://mangrove_atlas/tmp/dSOCS_0_100cm_year2000_30m_T13545.tif#1591812736066831...
Removing gs://mangrove_atlas/tmp/dSOCS_0_100cm_year2000_30m_T13454.tif#1591814693812612...
Removing gs://mangrove_atlas/tmp/dSOCS_0_100cm_year2000_30m_T13652.tif#1591811168476160...
Removing gs://mangrove_atlas/tmp/dSOCS_0_100cm_year2000_30m_T13614.tif#1591814042551671...
Removing gs://mangrove_atlas/tmp/dSOCS_0_100cm_year2000_30m_T13654.tif#1591815034302233...
Removing gs://mangrove_atlas/tmp/dSOCS_0_100cm_year2000_30m_T13653.tif#1591816029353355...
Removing gs://mangrove_atlas/tmp/dSOCS_0_100cm_year2000_30m_T13814.tif#1591814309368123...
Removing gs://mangrove_atlas/tmp/dSOCS_0_100cm_year2000_30m_T13815.tif#1591815119447352...
Removing gs://mangrove_atlas/tmp/dSOCS_0_100cm_year2000_30m_T13816.tif#1591819251527927...
Removing gs://mangrove_atlas/tmp/dSOCS_0_100cm_year2000_30m_T13817.tif#1591816500433862...
Removing gs://mangrove_atlas/tmp/dSOCS_0_100cm_year2000_30m_T13896.tif#1591811174657716...

In [None]:
!earthengine --no-use_cloud_api rm -r {asset_path}

Instructions for updating:
non-resource variables are not supported in the long term
Running command using Cloud API.  Set --no-use_cloud_api to go back to using the API

W0611 16:37:06.331031 139674353043328 __init__.py:46] file_cache is unavailable when using oauth2client >= 4.0.0 or google-auth
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/googleapiclient/discovery_cache/__init__.py", line 36, in autodetect
    from google.appengine.api import memcache
ModuleNotFoundError: No module named 'google.appengine'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/googleapiclient/discovery_cache/file_cache.py", line 33, in <module>
    from oauth2client.contrib.locked_file import LockedFile
ModuleNotFoundError: No module named 'oauth2client.contrib.locked_file'

During handling of the above exception, another exception occurred:

Traceback (most recent ca

## Create SOC time-series

+ extend SOC values to match mangrove extent of other years
+ assumes values in 2016 represent other years

### Create imageCollection

In [None]:
import ee

# Trigger the authentication flow.
ee.Authenticate()

# Initialize the library.
ee.Initialize()

To authorize access needed by Earth Engine, open the following URL in a web browser and follow the instructions. If the web browser does not start automatically, please manually browse the URL below.

    https://accounts.google.com/o/oauth2/auth?client_id=517222506229-vsmmajv00ul0bs7p89v5m89qs8eb9359.apps.googleusercontent.com&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fearthengine+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdevstorage.full_control&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&response_type=code&code_challenge=yIBc3L1HKP2fnADToiCo7oJROVMImD_4HBmftD3948c&code_challenge_method=S256

The authorization workflow will generate a code, which you should paste in the box below. 
Enter verification code: 4/1gHsoGVhUJZX47gN93rQCluaHzGbuEtuunIwY4F4FZNeAFDZ64NWC10

Successfully saved authorization token.


In [None]:
# Define parameters

# asset path
ee_asset_path = "projects/global-mangrove-watch/mangrove-properties/mangrove-soil-organic-carbon_version-0-2_1996--2016"

# get description from file
copy_gcs(["gs://mangrove_atlas/mangrove-properties/mangrove-soil-organic-carbon_version-0-2_1996--2016_description.md"], ["."])
with open("mangrove-soil-organic-carbon_version-0-2_1996--2016_description.md", "r") as f:
  description = f.read()

# set collection properties (these are compatible with Skydipper.Dataset.Metadata)
collection_properties = {
    'name': "Global predicted mangrove forest soil carbon",
    'version': "0.2",
    'creator': "vizzuality/Envirometrix Ltd",
    'system:description': description,
    'description': description,
    'identifier': "https://doi.org/10.5281/zenodo.2536803",
    'keywords': "Erosion; Coasts; Natural Infrastructure; Biodiversity; Blue Carbon; Forests; Mangroves; Climate change",
    'citation': "Tomislav Hengl. (2018). Predicted soil organic carbon stock at 30 m in t/ha for 0-100 cm depth global / update of the map of mangrove forest soil carbon (Version 0.2) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.2536803",
    'license': "https://creativecommons.org/licenses/by/4.0/",
    'url': "https://zenodo.org/record/2536803",
    'language': 'en', 
    'altName': "Global Mangrove Watch, Version 2.0",
    'distribution': "https://zenodo.org/api/files/826b6287-7c3d-4a49-a1d5-cbecb5d0496d/mangroves_SOC30m_0_100cm.zip",
    'variableMeasured': "Predicted soil organic carbon in the depth layer 0 to 100cm",
    'units': "t OC / ha",
    'spatialCoverage': "Global tropics",
    'temporalCoverage': "1996--2016",
    'dataLineage': "SOC data for 2016 downloaded from doi, unpacked, and added to Google Earth Engine."
}

# set individual image properties (minimal)
image_properties = {
    "band_nodata_values": 'nan',
    "band_pyramiding_policies": "mean",
    "band_names": "soc"
}

# set image start times
image_start_times = ['1996-01-01', '2007-01-01', '2008-01-01', '2009-01-01', '2010-01-01', '2015-01-01', '2016-01-01']

Processing: gsutil -m cp -r  gs://mangrove_atlas/mangrove-properties/mangrove-soil-organic-carbon_version-0-2_1996--2016_description.md .
Task created
Finished copy
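
Earth Engine stores an image's `system:time_start` as milliseconds since the Unix epoch, so date strings such as those in `image_start_times` have to be converted at some point. How `gee_create_image_collection` handles this internally is not shown here; the helper `to_epoch_millis` below is a hypothetical sketch that assumes each date means midnight UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(date_string):
    """Convert 'YYYY-MM-DD' (assumed midnight UTC) to milliseconds since the Unix epoch."""
    dt = datetime.strptime(date_string, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    # integer division of two timedeltas gives an exact int
    return (dt - EPOCH) // timedelta(milliseconds=1)


image_start_times = ['1996-01-01', '2007-01-01', '2008-01-01', '2009-01-01',
                     '2010-01-01', '2015-01-01', '2016-01-01']
start_millis = {d: to_epoch_millis(d) for d in image_start_times}
print(start_millis['1996-01-01'])  # 820454400000
```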


In [None]:
# Create image collection
gee_create_image_collection(ee_asset_path, image_start_times, collection_properties = collection_properties, dry_run=False)


Checking if collection exists...

ImageCollection not found

b'Asset does not exist or is not accessible: projects/global-mangrove-watch/mangrove-properties/mangrove-soil-organic-carbon_version-0-2_1996--2016\n'

Creating ee.ImageCollection


ee.ImageCollection projects/global-mangrove-watch/mangrove-properties/mangrove-soil-organic-carbon_version-0-2_1996--2016 created


Updating ImageCollection properties...

Updated properties for asset: projects/global-mangrove-watch/mangrove-properties/mangrove-soil-organic-carbon_version-0-2_1996--2016

{
  "bands": [],
  "id": "projects/global-mangrove-watch/mangrove-properties/mangrove-soil-organic-carbon_version-0-2_1996--2016",
  "properties": {
    "altName": "Global Mangrove Watch, Version 2.0",
    "citation": "Tomislav Hengl. (2018). Predicted soil organic carbon stock at 30 m in t/ha for 0-100 cm depth global / update of the map of mangrove forest soil carbon (Version 0.2) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.2536803",
    

In [None]:
import json
import pprint

# Create SOC spread layer

def app(debug, view):
  
  # GET DATA LAYERS
  
  # mangrove properties
  # Soil organic carbon (2016)
  # soc : soil organic carbon [t OC / ha]
  soc = ee.Image("projects/global-mangrove-watch/mangrove-properties/mangroves_SOC30m_0_100cm")
  
  # land-cover
  # Distance from mangrove presence (1996--2016)
  # distance_m : linear distance from mangrove presence [m]
  dist = ee.ImageCollection("projects/global-mangrove-watch/land-cover/mangrove-extent-distance_version-2-0_1996--2016")
  
  # land-cover
  # Mangrove extent (1996--2016)
  # lc : mangrove presence extent [boolean]
  extent = ee.ImageCollection("projects/global-mangrove-watch/land-cover/mangrove-extent_version-2-0_1996--2016")

  # SET NOMINAL SCALE
  ns = ee.Number(30)
  
  # SET EXPORT REGION
  region = ee.Geometry.Polygon([[[-180, 33],[-180, -34],[180, -34],[180, 33]]], None, False)

  # CREATE ECOTYPE PRESENCE MASK
  # un-masked if ecotype was within 1000 m of mangroves EVER present in time series
  presence = dist.map(lambda i: ee.Image(1.0).updateMask(i.lte(1000)).unmask()) \
  .sum().gt(0).selfMask()
  
  # INTERPOLATE SOC TO ECOTYPE MASK
  # spread the edge of the SOC values so as to fill the presence mask  
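  # Dilates a binary image: true where a pixel lies within `radius` pixels
  # of the nearest non-zero pixel
  def focalMax(image, radius):
    return image.fastDistanceTransform().sqrt().lte(radius)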
  
  # Gets edge pixels of binary image
  def focalEdge(image, radius_list):
    outer = focalMax(image, radius_list[0])
    inner = focalMax(ee.Image(1).updateMask(image.unmask().eq(0)), radius_list[1])
    return ee.Image(1).updateMask(outer.eq(1).And(inner.eq(1)))

  # Spreads values of edge pixels to fill mask
  def spread_to_mask(image, mask, iterations=100):
    edge = image.updateMask(focalEdge(image, [0, 2])) \
    .focal_mode(1.5, 'circle', 'pixels', iterations) \
    .updateMask(mask)
    return ee.Image(edge.blend(image))
  
  # Apply to SOC image using presence mask
  soc_spread = ee.Image(spread_to_mask(soc, presence, iterations=200)) \
  .rename(image_properties.get('band_names')) \
  .copyProperties(soc)\
  .set({'system:time_start': soc.get("system:time_start")})
  
  # Export params
  nm = ee.String("mangroves_SOC30m_0_100cm").cat("_").cat('spread')
  
  # EXPORT TO IMAGE
  
  # update data lineage
  data_lineage = ee.String(soc.get('dataLineage'))\
  .cat(" Edge pixels of the SOC image were extended to fill a mask representing a distance of 1km from any observation of mangroves.")
  soc_spread = ee.Image(soc_spread).set({'dataLineage':data_lineage})
  
  params = {
      'image': soc_spread,
      'description': "export_" + nm.getInfo(),
      'assetId': 'projects/global-mangrove-watch/mangrove-properties/' + nm.getInfo(),
      'pyramidingPolicy': {image_properties.get('band_names'): image_properties.get('band_pyramiding_policies')},
      'scale': ns.getInfo(),
      'crs': 'EPSG:4326',
      'region': region,
      'maxPixels': 1e13
  }
  task = ee.batch.Export.image.toAsset(**params)
  if not debug:
    task.start()
    print(task.status())

  if debug:
    print('\n#######')
    #print('\nName:', ee.String(nm).getInfo())
    print('\nNominal scale:', ns.getInfo())
    print('\nSOC image:', json.dumps(soc.getInfo(), indent=4))
    print('\nSOC spread image:', json.dumps(soc_spread.getInfo(), indent=4))
    print("\nExport params:\n")
    pprint.pprint(params, indent=4)
      
  if view:
    print("\n Map:\n")
    # Create a folium map object.
    my_map = folium.Map(location=[20.3614, 93.1696], zoom_start=9, height=500)
    # Set visualization parameters.
    vis_params = {'palette': ['#5c4a3d','#933a06','#b84e17','#e68518','#eeb66b'], 'min': 400, 'max': 2000}
    # Add map layer
    my_map.add_ee_layer(presence, {'palette':['green']}, "presence mask (within 1 km, 1996--2016)")
    my_map.add_ee_layer(soc, vis_params, "SOC (t OC / ha)")
    my_map.add_ee_layer(soc_spread, vis_params, "SOC interpolated (t OC / ha)")
    
    #my_map.add_ee_layer(extent.filterDate('1996-01-01').first(), {'palette':['green']}, "extent 2016")
    # Add a layer control panel to the map.
    my_map.add_child(folium.LayerControl())
    # Display the map.
    display(my_map)

debug = False
view = False
app(debug, view)

{'state': 'READY', 'description': 'export_mangroves_SOC30m_0_100cm_spread', 'creation_timestamp_ms': 1594136712545, 'update_timestamp_ms': 1594136712545, 'start_timestamp_ms': 0, 'task_type': 'EXPORT_IMAGE', 'id': 'OZWASP3OVFEIFD2WLUZKEQKZ', 'name': 'projects/earthengine-legacy/operations/OZWASP3OVFEIFD2WLUZKEQKZ'}
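
The status above only shows that the export was queued (`READY`). One way to follow it to completion is to poll the status until it reaches a terminal state. The sketch below is hypothetical: `wait_for_task` is not part of this notebook or of the Earth Engine API, and it takes any callable that returns a status dictionary with a `state` key, so it can be tried without an Earth Engine session. Using it with the export above would additionally require `app` to return its `task` object, which it currently does not.

```python
import time

# States after which an export task no longer changes
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED"}


def wait_for_task(get_status, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll get_status() until a terminal state is reached or max_polls is exhausted.

    get_status: callable returning a dict with a 'state' key (e.g. task.status)
    Returns the last status dictionary that was seen.
    """
    status = get_status()
    for _ in range(max_polls):
        if status.get("state") in TERMINAL_STATES:
            break
        sleep(poll_seconds)
        status = get_status()
    return status


# Hypothetical usage, assuming `task` is an ee.batch.Task that has been started:
# final_status = wait_for_task(task.status, poll_seconds=60)
# print(final_status["state"])
```

Passing `sleep` as a parameter keeps the helper easy to exercise with a fake status sequence and no real waiting.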
