# Upload mangrove canopy height to GCS and as an asset to GEE

This notebook gets mangrove canopy height data, uploads it to GCS, and creates an Earth Engine image collection asset.  
These are the steps:  
* We first need the canopy height tiffs for each year (the source archive holds one folder of tiles per year).
* We then upload the tiffs to GCS.
* We then create an Earth Engine image collection asset.
* We then upload each image as an asset to GEE.
* Finally, we make the data publicly available on Zenodo.

## Setup

In [1]:
import os
from pathlib import Path
import urllib.parse
import geemap
import ee
import tarfile

%run utils.ipynb

In [2]:
# Trigger the authentication flow.
ee.Authenticate()

# Initialize the library.
ee.Initialize()


Successfully saved authorization token.


### Set Cloud credentials (could be done through an env file)

In [4]:
# WARNING: Don't forget to auth to google cloud platform
# gcloud auth application-default login --no-launch-browser --project=mangrove-atlas-246414
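As the heading notes, these settings could also come from an env file. Below is a minimal sketch, assuming a hypothetical `.env` file made of `KEY=VALUE` lines. The helper name `load_env_file` and the file layout are assumptions and are not part of `utils.ipynb`.

```python
import os
from pathlib import Path


def load_env_file(env_path):
    """Parse KEY=VALUE lines into os.environ (hypothetical helper)."""
    loaded = {}
    for raw_line in Path(env_path).read_text().splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith('#') or '=' not in line:
            continue
        key, _, value = line.partition('=')
        # Drop surrounding whitespace and optional quotes around the value.
        loaded[key.strip()] = value.strip().strip('"').strip("'")
    os.environ.update(loaded)
    return loaded
```

A file containing, for example, a `GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json` line would then be picked up by the Google client libraries, which read that variable to locate a service-account key.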

### Set Variables for GCS and GEE

In [3]:
# FIXME: This depends on where the notebook kernel is running, so be careful
WORK_DIR = Path(os.getcwd())
BASE_DIR = f'{WORK_DIR.parents[3]}/datasets'

# @TODO: Add expected data files source as an environment variable.
assert BASE_DIR == '/home/jovyan/work/datasets', f'{BASE_DIR} is not the correct directory'

# variables
data_version = 'v3'
dataset = 'mangrove_canopy_height'

# Set the Google Cloud params
gc_project_id = "mangrove-atlas-246414"
gcs_bucket = 'mangrove_atlas'
gcs_prefix = f"gs://{gcs_bucket}"

# raw data source
raw_dataset_name = 'hchm_mng_gmw_v3_tif.tar.gz'
cgs_parent_folder = 'mangrove-properties'
base_source_data_url = 'https://www.dropbox.com/sh/ruk6m9btrkqb73b/AAC8lWSe7nOfDGzrapMR_uqRa?dl=0'
gcs_source_data_url = f'{gcs_prefix}/{cgs_parent_folder}/{raw_dataset_name}'

raw_local_folder = Path(f'{BASE_DIR}/raw/{dataset}')
raw_local_folder.mkdir(parents=True, exist_ok=True)

# Image Collection Information
data_year_range = [1996, 2007, 2008, 2009, 2010, 2015, 2016, 2017, 2018, 2019, 2020]
ee_image_collection = f'projects/global-mangrove-watch/mangrove-properties/{dataset}-{data_version}'

no_data_values = [0]
pyramiding = 'MEAN'
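The TODO in the cell above suggests taking the data location from an environment variable. Here is a sketch of one way to do that, assuming a hypothetical variable name `MANGROVE_DATASETS_DIR` and falling back to the path the notebook currently asserts. The helper `resolve_base_dir` is illustrative only.

```python
import os
from pathlib import Path

# Fallback is the directory this notebook asserts; the variable name is hypothetical.
DEFAULT_DATASETS_DIR = '/home/jovyan/work/datasets'


def resolve_base_dir(env_var='MANGROVE_DATASETS_DIR'):
    """Return the datasets directory from the environment, or the default."""
    return Path(os.environ.get(env_var, DEFAULT_DATASETS_DIR))
```

With this in place the result no longer depends on the kernel's working directory, which is the concern raised in the FIXME.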


In [4]:
description = f"""
## Methodology

This dataset characterizes the global distribution, biomass, and canopy height of mangrove-forested wetlands based on remotely sensed and in situ field measurement data. 
Estimates of maximum canopy height (height of the tallest tree) for the nominal year 2000 were derived across a 30-meter resolution global mangrove ecotype extent map using remotely-sensed 
canopy height measurements and region-specific allometric models. Also provided are in situ field measurement data for selected sites across a wide variety of forest 
structures (e.g., scrub, fringe, riverine and basin) in mangrove ecotypes of the global equatorial region. Within designated plots, selected trees were identified to 
species and diameter at breast height (DBH) and tree height was measured using a laser rangefinder or clinometer. Tree density (the number of stems) can be estimated for 
each plot and expressed per unit area. These data were used to derive plot-level allometry among AGB, basal area weighted height (Hba), and maximum canopy height (Hmax) 
and to validate the remotely sensed estimates.  

Spatially explicit maps of mangrove canopy height and AGB derived from space-borne remote sensing data and in situ measurements can be used to assess local-scale 
geophysical and environmental conditions that may regulate forest structure and carbon cycle dynamics. Maps revealed a wide range of canopy heights, including maximum 
values (> 62 m) that surpass maximum heights of other forest types.  

There are 348 data files in GeoTIFF format (.tif) with this dataset representing three data products for each of 116 countries. The in situ tree measurements are 
provided in a single .csv file.  

### {', '.join(map(str, data_year_range[:-1]))} and {data_year_range[-1]} maps of mangrove canopy height

For the full documentation, please see the source [methodology](https://daac.ornl.gov/CMS/guides/CMS_Global_Map_Mangrove_Canopy.html).
"""

In [5]:
collection_properties = ImageCollectionProperties(
    name = 'Maximum canopy height',
    version = data_version,
    creator = "Global Mangrove Watch (GMW): Aberystwyth University/soloEO/Wetlands International/UNEP-WCMC/JAXA/DOB Ecology",
    description = description,
    identifier = "",
    keywords = "Erosion; Coasts; Natural Infrastructure; Biodiversity; Blue Carbon; Forests; Mangroves; Landcover",
    citation = "Simard, M., T. Fatoyinbo, C. Smetanka, V.H. Rivera-monroy, E. Castaneda, N. Thomas, and T. Van der stocken. 2019. Global Mangrove Distribution, Aboveground Biomass, and Canopy Height. ORNL DAAC, Oak Ridge, Tennessee, USA. https://doi.org/10.3334/ORNLDAAC/1665",
    license = "https://creativecommons.org/licenses/by/4.0/",
    url = "https://daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=1665",
    language = "en", 
    altName = "Maximum canopy height, Version 3.0",
    distribution = "",
    variableMeasured = "maximum canopy height",
    units = "meters",
    spatialCoverage = "Global tropics",
    temporalCoverage = ','.join(map(str, data_year_range)),
    dataLineage = "Raster data supplied as tilesets per year; each tileset was combined and added to Google Earth Engine as a multi-temporal ImageCollection."
)

In [14]:
def generate_manifests(year):
    # Keep only the GeoTIFFs: the uploaded folders also hold .tif.aux.xml sidecar files
    files = [
        blob.name
        for blob in list_gcs(
            bucket_name = gcs_bucket,
            dir_path = f'ee_import_data/{dataset}/hchm_mng_mjr_{year}_tif',
            file_pattern = '*.tif',
        )
        if blob.name.endswith('.tif')
    ]
    manifest = GEEManifest(
            path = Path(f'{BASE_DIR}/processed/manifest/hchm_mng_mjr_{data_version}_{year}'),
            name = f"projects/earthengine-legacy/assets/{ee_image_collection}/{dataset}_{year}",
            tilesets = [ Tilesets(
                sources = [Sources(uris = [f"{gcs_prefix}/{file}"]) for file in files],
                    )
                ],
            start_time = f'{year}-01-01T00:00:00Z',
            end_time = f'{year}-12-31T00:00:00Z',
            uri_prefix = "",
            properties = ImageProperties(
                band_nodata_values = no_data_values[0],
                band_pyramiding_policies = pyramiding,
                band_names = 'height',
                year = year,
            ),
            bands = [{"id": "height",
                    "tileset_band_index": 0}],
            pyramiding_policy = pyramiding,
            missing_data = {'values': no_data_values}

        )
    return manifest
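For orientation, here is a sketch of the kind of JSON manifest an Earth Engine image upload expects, built as a plain dictionary. It assumes that `GEEManifest` from `utils.ipynb` serializes to something similar, which this notebook does not show. The helper `build_manifest_dict`, the asset name, and the bucket URI are hypothetical.

```python
import json


def build_manifest_dict(asset_name, uris, year, nodata=0, pyramiding='MEAN'):
    """Assemble a manifest-like dict for one yearly image (hypothetical helper)."""
    return {
        'name': asset_name,
        # One tileset whose sources are the individual tiles of that year.
        'tilesets': [{'sources': [{'uris': [uri]} for uri in uris]}],
        'bands': [{'id': 'height', 'tileset_band_index': 0}],
        'pyramiding_policy': pyramiding,
        'missing_data': {'values': [nodata]},
        'start_time': f'{year}-01-01T00:00:00Z',
        'end_time': f'{year}-12-31T00:00:00Z',
        'properties': {'year': year},
    }


example = build_manifest_dict(
    'projects/earthengine-legacy/assets/projects/example/collection/image_1996',
    ['gs://example-bucket/tiles/tile_a.tif'],
    1996,
)
print(json.dumps(example, indent=2))
```

Listing every tile as a source of a single tileset is what lets many tiles end up as one image per year.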

In [7]:
def extract_path(path: Path):
    # Extract every .tar.gz archive found in `path`, then delete the archive
    for item in path.iterdir():
        if item.name.endswith('.tar.gz'):
            # Mode 'r' lets tarfile detect the gzip compression transparently
            with tarfile.open(item.as_posix(), 'r') as compress_obj:
                compress_obj.extractall(path)

            item.unlink()
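`extractall` trusts the member paths stored inside the archive. A minimal sketch of a guard that refuses members resolving outside the target directory follows. `safe_extract` is a hypothetical helper, and on recent Python versions the `filter='data'` argument of `extractall` covers similar ground.

```python
import tarfile
from pathlib import Path


def safe_extract(archive, target_dir):
    """Extract a tar archive, rejecting members that escape target_dir."""
    target = Path(target_dir).resolve()
    with tarfile.open(archive, 'r:*') as tar:
        for member in tar.getmembers():
            destination = (target / member.name).resolve()
            # A member such as '../evil' would resolve outside the target.
            if target != destination and target not in destination.parents:
                raise ValueError(f'Unsafe path in archive: {member.name}')
        tar.extractall(target)
```

The check runs over all members before anything is written, so a rejected archive leaves the target directory unchanged.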

In [8]:
def rm_tree(pth: Path):
    for child in pth.iterdir():
        if child.is_file():
            child.unlink()
        else:
            rm_tree(child)
    pth.rmdir()
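The standard library offers the same recursive delete through `shutil.rmtree`. Here is a short sketch on a throwaway temporary directory, so nothing in the notebook's data folders is touched. The file name is illustrative.

```python
import shutil
import tempfile
from pathlib import Path

# Build a small nested tree in a temporary location.
root = Path(tempfile.mkdtemp())
(root / 'nested').mkdir()
(root / 'nested' / 'tile.tif').write_bytes(b'')

# Remove the directory and everything under it in one call.
shutil.rmtree(root)
print(root.exists())
```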

In [9]:
files = list(list_gcs(bucket_name = gcs_bucket, dir_path = cgs_parent_folder, file_pattern = f'{raw_dataset_name}'))
files

INFO:root:Searching mangrove-properties/hchm_mng_gmw_v3_tif.tar.gz


[<Blob: mangrove_atlas, mangrove-properties/, 1563369444139420>,
 <Blob: mangrove_atlas, mangrove-properties/agb_mng_gmw_v3_tif.tar.gz, 1660149522699737>,
 <Blob: mangrove_atlas, mangrove-properties/hchm_mng_gmw_v3_tif.tar.gz, 1660149526790250>,
 <Blob: mangrove_atlas, mangrove-properties/mangrove-soil-organic-carbon_version-0-2_1996--2016_description.md, 1594128267403391>,
 <Blob: mangrove_atlas, mangrove-properties/mangroves_SOC30m_0_100cm.jsonld, 1591393382823623>,
 <Blob: mangrove_atlas, mangrove-properties/mangroves_SOC30m_0_100cm.zip, 1591779593619314>]
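The listing above contains every object under the prefix, not only the requested archive. Below is a sketch of narrowing such a listing by name with `fnmatch`. It works on plain strings, so it makes no assumption about how `list_gcs` applies `file_pattern`. The helper `filter_names` is hypothetical.

```python
from fnmatch import fnmatch


def filter_names(names, pattern):
    """Keep names whose final path component matches a glob pattern."""
    return [name for name in names if fnmatch(name.rsplit('/', 1)[-1], pattern)]


names = [
    'mangrove-properties/',
    'mangrove-properties/agb_mng_gmw_v3_tif.tar.gz',
    'mangrove-properties/hchm_mng_gmw_v3_tif.tar.gz',
]
print(filter_names(names, 'hchm_mng_gmw_v3_tif.tar.gz'))
```

With real blobs, the same filter could be applied to `blob.name`, as `generate_manifests` already does with `endswith('.tif')`.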

### Get the raw files locally

In [11]:
copy_folder_gcs(f'{gcs_prefix}/{cgs_parent_folder}/{raw_dataset_name}', raw_local_folder)

Copying gs://mangrove_atlas/mangrove-properties/hchm_mng_gmw_v3_tif.tar.gz...
==> NOTE: You are downloading one or more large file(s), which would
run significantly faster if you enabled sliced object downloads. This
feature is enabled by default but requires that compiled crcmod be
installed (see "gsutil help crcmod").

- [1/1 files][  2.8 GiB/  2.8 GiB] 100% Done  10.9 MiB/s ETA 00:00:00           
Operation completed over 1 objects/2.8 GiB.                                      
INFO:root:Task created


### Extract them and reupload to GCS in the ee_import_data folder

In [13]:
extract_path(raw_local_folder)

folder_unziped = [x for x in raw_local_folder.iterdir() if x.is_dir()]

In [16]:
for folder in folder_unziped:
    copy_folder_gcs(folder, f'{gcs_prefix}/ee_import_data/{dataset}/')

Copying file:///home/jovyan/work/datasets/raw/mangrove_canopy_height/hchm_mng_mjr_2008_tif/S06E012_hchm_gmw_v314_mng_mjr_2008.tif.aux.xml [Content-Type=application/xml]...
Copying file:///home/jovyan/work/datasets/raw/mangrove_canopy_height/hchm_mng_mjr_2008_tif/S27E032_hchm_gmw_v314_mng_mjr_2008.tif [Content-Type=image/tiff]...
Copying file:///home/jovyan/work/datasets/raw/mangrove_canopy_height/hchm_mng_mjr_2008_tif/N03E125_hchm_gmw_v314_mng_mjr_2008.tif.aux.xml [Content-Type=application/xml]...
Copying file:///home/jovyan/work/datasets/raw/mangrove_canopy_height/hchm_mng_mjr_2008_tif/S06E038_hchm_gmw_v314_mng_mjr_2008.tif [Content-Type=image/tiff]...
Copying file:///home/jovyan/work/datasets/raw/mangrove_canopy_height/hchm_mng_mjr_2008_tif/N13W061_hchm_gmw_v314_mng_mjr_2008.tif.aux.xml [Content-Type=application/xml]...
Copying file:///home/jovyan/work/datasets/raw/mangrove_canopy_height/hchm_mng_mjr_2008_tif/N22W083_hchm_gmw_v314_mng_mjr_2008.tif [Content-Type=image/tiff]...
Copying

#### Clean up the raw data locally as this is expensive in terms of space.

In [17]:
rm_tree(raw_local_folder)

#### Generate manifests for each year that represent an image in the image collection

In [18]:
list_of_manifests = [generate_manifests(year) for year in data_year_range]

INFO:root:Searching ee_import_data/mangrove_canopy_height/hchm_mng_mjr_1996_tif/*.tif
INFO:root:Searching ee_import_data/mangrove_canopy_height/hchm_mng_mjr_2007_tif/*.tif
INFO:root:Searching ee_import_data/mangrove_canopy_height/hchm_mng_mjr_2008_tif/*.tif
INFO:root:Searching ee_import_data/mangrove_canopy_height/hchm_mng_mjr_2009_tif/*.tif
INFO:root:Searching ee_import_data/mangrove_canopy_height/hchm_mng_mjr_2010_tif/*.tif
INFO:root:Searching ee_import_data/mangrove_canopy_height/hchm_mng_mjr_2015_tif/*.tif
INFO:root:Searching ee_import_data/mangrove_canopy_height/hchm_mng_mjr_2016_tif/*.tif
INFO:root:Searching ee_import_data/mangrove_canopy_height/hchm_mng_mjr_2017_tif/*.tif
INFO:root:Searching ee_import_data/mangrove_canopy_height/hchm_mng_mjr_2018_tif/*.tif
INFO:root:Searching ee_import_data/mangrove_canopy_height/hchm_mng_mjr_2019_tif/*.tif
INFO:root:Searching ee_import_data/mangrove_canopy_height/hchm_mng_mjr_2020_tif/*.tif


#### Create the image collection asset and upload the images to GEE

In [20]:
createImageCollection(ee_asset_path = ee_image_collection,
                        properties = collection_properties, 
                        image_list = list_of_manifests)

INFO:root:Created image collection projects/global-mangrove-watch/mangrove-properties/mangrove_canopy_height-v3
INFO:root:Task created
INFO:root:Upload image projects/earthengine-legacy/assets/projects/global-mangrove-watch/mangrove-properties/mangrove_canopy_height-v3/mangrove_canopy_height_1996


Started upload task with ID: OIJZSMTLVKLRFK6TBDXZVDVV


INFO:root:Task created
INFO:root:Upload image projects/earthengine-legacy/assets/projects/global-mangrove-watch/mangrove-properties/mangrove_canopy_height-v3/mangrove_canopy_height_2007


Started upload task with ID: JNDCNYA66C5GXGUINBSGEF2O


INFO:root:Task created
INFO:root:Upload image projects/earthengine-legacy/assets/projects/global-mangrove-watch/mangrove-properties/mangrove_canopy_height-v3/mangrove_canopy_height_2008


Started upload task with ID: CMSKRWAGVESRN2MJEPOEVWKQ


INFO:root:Task created
INFO:root:Upload image projects/earthengine-legacy/assets/projects/global-mangrove-watch/mangrove-properties/mangrove_canopy_height-v3/mangrove_canopy_height_2009


Started upload task with ID: TWFJZXESN67R6J2D7RY6XTC2


INFO:root:Task created
INFO:root:Upload image projects/earthengine-legacy/assets/projects/global-mangrove-watch/mangrove-properties/mangrove_canopy_height-v3/mangrove_canopy_height_2010


Started upload task with ID: DS3CW32KYKSAVJ3IDSERD4CY


INFO:root:Task created
INFO:root:Upload image projects/earthengine-legacy/assets/projects/global-mangrove-watch/mangrove-properties/mangrove_canopy_height-v3/mangrove_canopy_height_2015


Started upload task with ID: FKZ5JNQWXBD7JD5XDFUGVB7B


INFO:root:Task created
INFO:root:Upload image projects/earthengine-legacy/assets/projects/global-mangrove-watch/mangrove-properties/mangrove_canopy_height-v3/mangrove_canopy_height_2016


Started upload task with ID: 4IIPKYE4T65Q2ABFW626K5O6


INFO:root:Task created
INFO:root:Upload image projects/earthengine-legacy/assets/projects/global-mangrove-watch/mangrove-properties/mangrove_canopy_height-v3/mangrove_canopy_height_2017


Started upload task with ID: 5CG27MPVSHB2WZVCRODTWLFQ


INFO:root:Task created
INFO:root:Upload image projects/earthengine-legacy/assets/projects/global-mangrove-watch/mangrove-properties/mangrove_canopy_height-v3/mangrove_canopy_height_2018


Started upload task with ID: B4O6E7ODVB3AFSYUVERKFLDP


INFO:root:Task created
INFO:root:Upload image projects/earthengine-legacy/assets/projects/global-mangrove-watch/mangrove-properties/mangrove_canopy_height-v3/mangrove_canopy_height_2019


Started upload task with ID: ALNEJSKIDNFSOTWVQPQN6TE6


INFO:root:Task created
INFO:root:Upload image projects/earthengine-legacy/assets/projects/global-mangrove-watch/mangrove-properties/mangrove_canopy_height-v3/mangrove_canopy_height_2020


Started upload task with ID: 4QRCWR6PJQUINAB52PFY5FOX


If we need to add new years to the image collection, we can do that by adding new manifests to the image collection asset.

In [108]:
addImagesToCollection(ee_image_collection, list_of_manifests[0:1])

INFO:root:Task created
INFO:root:Upload image projects/earthengine-legacy/assets/projects/global-mangrove-watch/land-cover/mangrove-extent-test/gmw_v3_1996


Started upload task with ID: CJXPIRIMZJL6MGZ2US6RD3CH
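To decide which manifests still need uploading, one option is to compare the years already present in the collection with `data_year_range`. The sketch below works on plain asset IDs, assuming images follow the `{dataset}_{year}` naming used in `generate_manifests`. The helper `missing_years` is hypothetical and the IDs are illustrative.

```python
def missing_years(existing_ids, years, dataset='mangrove_canopy_height'):
    """Return the years that have no '<dataset>_<year>' image yet."""
    present = set()
    prefix = f'{dataset}_'
    for asset_id in existing_ids:
        # Only the last path component carries the image name.
        image_name = asset_id.rsplit('/', 1)[-1]
        suffix = image_name[len(prefix):]
        if image_name.startswith(prefix) and suffix.isdigit():
            present.add(int(suffix))
    return [year for year in years if year not in present]


existing = [
    'projects/example/collection/mangrove_canopy_height_1996',
    'projects/example/collection/mangrove_canopy_height_2007',
]
print(missing_years(existing, [1996, 2007, 2008]))
```

The returned years could then be used to select the matching entries of `list_of_manifests` before calling `addImagesToCollection`.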


In [26]:
# Check task status
for operation in ee.data.listOperations():
    if operation.get('metadata',{}).get('state') == 'RUNNING':
        print('______________________')
        
        print(operation.get('metadata',{}).get('type'))
        print(operation.get('name'))

______________________
INGEST_IMAGE
projects/earthengine-legacy/operations/4QRCWR6PJQUINAB52PFY5FOX
______________________
INGEST_IMAGE
projects/earthengine-legacy/operations/ALNEJSKIDNFSOTWVQPQN6TE6
______________________
INGEST_IMAGE
projects/earthengine-legacy/operations/B4O6E7ODVB3AFSYUVERKFLDP
______________________
INGEST_IMAGE
projects/earthengine-legacy/operations/5CG27MPVSHB2WZVCRODTWLFQ
______________________
INGEST_IMAGE
projects/earthengine-legacy/operations/4IIPKYE4T65Q2ABFW626K5O6
______________________
INGEST_IMAGE
projects/earthengine-legacy/operations/FKZ5JNQWXBD7JD5XDFUGVB7B
______________________
INGEST_IMAGE
projects/earthengine-legacy/operations/DS3CW32KYKSAVJ3IDSERD4CY
______________________
INGEST_IMAGE
projects/earthengine-legacy/operations/TWFJZXESN67R6J2D7RY6XTC2
______________________
INGEST_IMAGE
projects/earthengine-legacy/operations/CMSKRWAGVESRN2MJEPOEVWKQ
______________________
INGEST_IMAGE
projects/earthengine-legacy/operations/JNDCNYA66C5GXGUINBSGEF2O
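Beyond the running tasks, a count per state gives a quick overview of the whole batch. This sketch works over dictionaries shaped like the entries inspected above (a `metadata` mapping with a `state` key). The helper `count_states` is hypothetical, and the sample entries are illustrative rather than real operations.

```python
from collections import Counter


def count_states(operations):
    """Count operations by metadata state, using 'UNKNOWN' when absent."""
    return Counter(
        op.get('metadata', {}).get('state', 'UNKNOWN') for op in operations
    )


sample = [
    {'name': 'operations/a', 'metadata': {'state': 'RUNNING'}},
    {'name': 'operations/b', 'metadata': {'state': 'RUNNING'}},
    {'name': 'operations/c', 'metadata': {'state': 'SUCCEEDED'}},
    {'name': 'operations/d'},
]
print(count_states(sample))
```

In the notebook, the same function could be fed the list returned by `ee.data.listOperations()`.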


In [23]:
# Check ImageCollection properties
ee.data.getAsset(ee_image_collection)

{'type': 'IMAGE_COLLECTION',
 'name': 'projects/earthengine-legacy/assets/projects/global-mangrove-watch/mangrove-properties/mangrove_canopy_height-v3',
 'id': 'projects/global-mangrove-watch/mangrove-properties/mangrove_canopy_height-v3',
 'properties': {'altName': 'Maximum canopy height, Version 3.0',
  'citation': 'Simard, M., T. Fatoyinbo, C. Smetanka, V.H. Rivera-monroy, E. Castaneda, N. Thomas, and T. Van der stocken. 2019. Global Mangrove Distribution, Aboveground Biomass, and Canopy Height. ORNL DAAC, Oak Ridge, Tennessee, USA. https://doi.org/10.3334/ORNLDAAC/1665',
  'creator': 'Global Mangrove Watch (GMW): Aberystwyth University/soloEO/Wetlands International/UNEP-WCMC/JAXA/DOB Ecology',
  'dataLineage': 'Raster data supplied as tilesets per year, each tilset was combined, and added to Google earth engine as multi-temporal ImageCollection.',
  'description': '\n## Methodology\n\nThis dataset characterizes the global distribution, biomass, and canopy height of mangrove-foreste