# Upload fishing pressure map to GCS and as an asset to GEE

This notebook takes the mangrove fishing pressure (fishing intensity) GeoTIFF, uploads it to GCS, and creates an Earth Engine image collection asset.  
These are the steps:  
* We upload the TIFF to GCS.
* We then create an Earth Engine image collection asset.
* We then upload the image as an asset to GEE.

## Setup

In [1]:
import os
from pathlib import Path
import urllib.parse
import geemap
import ee
import zipfile

%run utils.ipynb



In [20]:
# Trigger the authentication flow.
ee.Authenticate()

# Initialize the library.
ee.Initialize()


Successfully saved authorization token.


### Set Cloud credentials (could be done through an env file)

In [3]:
# WARNING: Don't forget to authenticate with Google Cloud Platform first, for example:
# gcloud auth application-default login --no-launch-browser --project=mangrove-atlas-246414
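
As the heading notes, these settings could come from an env file instead of being typed into the notebook. The following is a minimal sketch of that idea using only the standard library. The helper `load_env_file` and the variable names `GCP_PROJECT_ID` and `GCS_BUCKET` are hypothetical and are not part of `utils.ipynb`.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path: Path) -> dict:
    """Parse KEY=VALUE lines from a simple env file (hypothetical helper)."""
    values = {}
    for raw_line in path.read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Demonstration with a throwaway file; a real env file would be kept out of version control.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text(
        "# Hypothetical variable names\n"
        "GCP_PROJECT_ID=mangrove-atlas-246414\n"
        "GCS_BUCKET=mangrove_atlas\n"
    )
    settings = load_env_file(env_path)

# setdefault keeps any value that is already exported in the shell.
for key, value in settings.items():
    os.environ.setdefault(key, value)

print(settings["GCP_PROJECT_ID"])
```

Using `setdefault` rather than plain assignment means a value exported in the shell that launched the kernel takes precedence over the file.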

### Set Variables for GCS and GEE

In [12]:
# FIXME: This depends on where the notebook kernel is running, so be careful.
WORK_DIR = Path(os.getcwd())
BASE_DIR = f'{WORK_DIR.parents[3]}/datasets'

# @TODO: Add expected data files source as an environment variable.
assert BASE_DIR == '/home/jovyan/work/datasets', f'{BASE_DIR} is not the correct directory'

# variables
data_version = 'v3'
dataset = 'mangrove_fishing_activity'

# Set the Google Cloud params
gc_project_id = "mangrove-atlas-246414"
gcs_prefix = "gs://mangrove_atlas"
gcs_bucket = 'mangrove_atlas'
cgs_parent_folder = 'GMW v3.14 geotiffs'
gcs_http_prefix = f"https://storage.googleapis.com/{gcs_bucket}"
base_source_data_url = f'{gcs_http_prefix}/GMW%20v3.14%20geotiffs'

raw_local_folder = Path(f'{BASE_DIR}/raw/{dataset}')
raw_local_folder.mkdir(parents=True, exist_ok=True)

# Image Collection Information
#data_year_range = [1996, 2007, 2008, 2009, 2010, 2015, 2016, 2017, 2018, 2019, 2020]
ee_image_collection = f'projects/global-mangrove-watch/land-cover/{dataset}'

no_data_values = [-3.40282e+38]
pyramiding = 'MEAN'
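
The cell above writes the folder name twice, once with spaces and once percent-encoded, and asserts a fixed `BASE_DIR`. As a sketch only, the encoded URL could be derived with `urllib.parse.quote` (the module is already imported in the setup cell), and the datasets directory could be read from an environment variable, as the `@TODO` suggests. The variable name `MANGROVE_DATASETS_DIR` and the spelling `gcs_parent_folder` are assumptions made for this example.

```python
import os
import urllib.parse
from pathlib import Path

gcs_bucket = "mangrove_atlas"
gcs_parent_folder = "GMW v3.14 geotiffs"

# quote() percent-encodes the spaces, so the folder name is written only once.
gcs_http_prefix = f"https://storage.googleapis.com/{gcs_bucket}"
base_source_data_url = f"{gcs_http_prefix}/{urllib.parse.quote(gcs_parent_folder)}"
print(base_source_data_url)

# MANGROVE_DATASETS_DIR is a hypothetical variable name; the fallback is the
# path that the notebook asserts on.
base_dir = Path(os.environ.get("MANGROVE_DATASETS_DIR", "/home/jovyan/work/datasets"))
raw_local_folder = base_dir / "raw" / "mangrove_fishing_activity"
print(raw_local_folder)
```

The sketch does not create any directory; the notebook's own `mkdir(parents=True, exist_ok=True)` call would still be needed.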


In [5]:
description = f"""
## Summary

Mangroves are critical nursery habitats for fish and invertebrates, providing livelihoods for many coastal communities. Despite their importance, there is currently no estimate of the number of fishers engaged in mangrove associated fisheries, nor of the fishing intensity associated with mangroves at a global scale. 
We address these gaps by developing a global model of mangrove associated fisher numbers and mangrove fishing intensity. To develop the model, we undertook a three-round Delphi process with mangrove fisheries experts to identify the key drivers of mangrove fishing intensity. We then developed a conceptual model of intensity of mangrove fishing using those factors identified both as being important and for which appropriate global data could be found or developed. These factors were non-urban population, distance to market, distance to mangroves and other fishing grounds, and storm events. 
By projecting this conceptual model using geospatial datasets, we were able to estimate the number and distribution of mangrove associated fishers and the intensity of fishing in mangroves. We estimate there are 4.1 million mangrove associated fishers globally, with the highest number of mangrove fishers found in Indonesia, India, Bangladesh, Myanmar, and Brazil. Mangrove fishing intensity was greatest throughout Asia, and to a lesser extent West and Central Africa, and Central and South America.
"""

In [6]:
extent_collection_properties = ImageCollectionProperties(
    name = 'Fishing intensity associated with mangroves',
    version = data_version,
    creator = "Philine et al. (2020) Fishers who rely on mangroves: Modelling and mapping the global intensity of mangrove-associated fisheries.",
    description = description,
    identifier = "",
    keywords = "Coasts; Natural Infrastructure; Biodiversity; Forests; Mangroves; Fishing; Fishers; Ecosystem Services",
    citation = 'Philine S.E. zu Ermgassen, Nibedita Mukherjee, Thomas A. Worthington, Alejandro Acosta, Ana Rosa da Rocha Araujo, Christine M. Beitl, Gustavo A. Castellanos-Galindo, Marília Cunha-Lignon, Farid Dahdouh-Guebas, Karen Diele, Cara L. Parrett, Patrick G. Dwyer, Jonathan R. Gair, Andrew Frederick Johnson, Baraka Kuguru, Aaron Savio Lobo, Neil R. Loneragan, Kate Longley-Wood, Jocemar Tomasino Mendonça, Jan-Olaf Meynecke, Roland Nathan Mandal, Cosmas Nzaka Munga, Borja G. Reguero, Patrik Rönnbäck, Julia Thorley, Matthias Wolff, Mark Spalding "Fishers who rely on mangroves: Modelling and mapping the global intensity of mangrove-associated fisheries", Estuarine, Coastal and Shelf Science, (2020), Volume 247 (106975), doi:10.1016/j.ecss.2020.106975.',
    license = "https://creativecommons.org/licenses/by/4.0/",
    url = "",
    language = "en", 
    altName = "Fishing Intensity Mangroves",
    distribution = "",
    variableMeasured = "Fishing intensity",
    units = "fishing days / square km / year",
    spatialCoverage = "Global tropics",
    temporalCoverage = "2020",
    dataLineage = ""
)

In [15]:
def generate_manifests(year):
    """Build the GEE manifest for the fishing intensity image of the given year."""
    # List the GeoTIFFs already uploaded to the ee_import_data folder in GCS.
    files = [
        blob.name
        for blob in list_gcs(bucket_name = gcs_bucket,
                             dir_path = f'ee_import_data/{dataset}/fish_pres',
                             file_pattern = '*.tif')
        if blob.name.endswith('.tif')
    ]
    manifest = GEEManifest(
        path = Path(f'{BASE_DIR}/processed/manifest/fish_pres'),
        name = f"projects/earthengine-legacy/assets/{ee_image_collection}/fishing_intensity_mangorves",
        # One tileset whose sources are the GCS URIs of the listed files.
        tilesets = [
            Tilesets(sources = [Sources(uris = [f"{gcs_prefix}/{file}"]) for file in files])
        ],
        # The image covers the whole calendar year.
        start_time = f'{year}-01-01T00:00:00Z',
        end_time = f'{year}-12-31T00:00:00Z',
        uri_prefix = "",
        properties = ImageProperties(
            band_nodata_values = no_data_values[0],
            band_pyramiding_policies = pyramiding,
            band_names = 'Band 1',
            year = year,
        ),
        bands = [{"id": "Fishing intensity", "tileset_band_index": 0}],
        pyramiding_policy = pyramiding,
        missing_data = {'values': no_data_values},
    )
    return manifest
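
`GEEManifest`, `Tilesets`, and `Sources` are defined in `utils.ipynb`, which is not shown here. For orientation, the sketch below builds a comparable manifest as a plain dictionary, with field names taken from the arguments used above. The helper `build_image_manifest`, the asset name ending in `example_image`, and the file `example.tif` are hypothetical. The check for an empty source list is an addition of this sketch, not something the notebook's helper is known to do; it would flag the case seen in the manifest output further down, where `sources` is empty.

```python
import json


def build_image_manifest(asset_id, uris, year, nodata):
    """Build an image manifest as a plain dict (illustrative helper)."""
    if not uris:
        # An empty source list would produce a manifest with nothing to ingest.
        raise ValueError("no source files found for the manifest")
    return {
        "name": asset_id,
        "tilesets": [{"sources": [{"uris": [uri]} for uri in uris]}],
        "bands": [{"id": "Fishing intensity", "tileset_band_index": 0}],
        "pyramiding_policy": "MEAN",
        "missing_data": {"values": [nodata]},
        "start_time": f"{year}-01-01T00:00:00Z",
        "end_time": f"{year}-12-31T00:00:00Z",
        "properties": {"year": year},
    }


manifest = build_image_manifest(
    # Hypothetical asset and file names, used only for illustration.
    asset_id=(
        "projects/earthengine-legacy/assets/projects/global-mangrove-watch/"
        "land-cover/mangrove_fishing_activity/example_image"
    ),
    uris=["gs://mangrove_atlas/ee_import_data/mangrove_fishing_activity/example.tif"],
    year=2020,
    nodata=-3.40282e+38,
)
print(json.dumps(manifest, indent=2))
```

Failing early on an empty file list makes a missing upload visible at manifest time rather than at ingestion time.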

In [126]:
def extract_path_from_zip(path_zip: Path):
    """Extract every zip archive found in `path_zip` into that folder, then delete the archive."""
    for item in path_zip.iterdir():
        if item.suffix == '.zip':
            with zipfile.ZipFile(item.as_posix(), 'r') as zipObj:
                zipObj.extractall(path_zip)

            # Remove the archive once extracted to save disk space.
            item.unlink()

In [127]:
def rm_tree(pth: Path):
    for child in pth.iterdir():
        if child.is_file():
            child.unlink()
        else:
            rm_tree(child)
    pth.rmdir()
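
The two helpers above extract the downloaded archives and later remove the local copy. The sketch below walks through the same extract-then-clean-up sequence in a temporary directory, using the standard library's `shutil.rmtree` for the recursive delete that `rm_tree` performs. The archive and file names are made up for the example.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

work_dir = Path(tempfile.mkdtemp())

# Build a small archive to stand in for a downloaded zip (hypothetical name).
archive = work_dir / "example_gtiff.zip"
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("example/tile_1.tif", b"not a real raster")

# Extract next to the archive, then delete the archive to save space.
with zipfile.ZipFile(archive) as zf:
    zf.extractall(work_dir)
archive.unlink()

extracted = sorted(p.relative_to(work_dir).as_posix() for p in work_dir.rglob("*.tif"))
print(extracted)

# shutil.rmtree removes the directory and everything under it in one call.
shutil.rmtree(work_dir)
print(work_dir.exists())
```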

In [8]:
#!gcloud config set project mangrove-atlas-246414

Updated property [core/project].


In [7]:
files = list(list_gcs(bucket_name = gcs_bucket, dir_path = cgs_parent_folder, file_pattern = '*.zip'))
files

INFO:root:Searching GMW v3.14 geotiffs/*.zip


[<Blob: mangrove_atlas, GMW v3.14 geotiffs/, 1658502357947930>,
 <Blob: mangrove_atlas, GMW v3.14 geotiffs/gmw_v3_1996_gtiff.zip, 1658502502870899>,
 <Blob: mangrove_atlas, GMW v3.14 geotiffs/gmw_v3_2007_gtiff.zip, 1658502517196761>,
 <Blob: mangrove_atlas, GMW v3.14 geotiffs/gmw_v3_2008_gtiff.zip, 1658502557420942>,
 <Blob: mangrove_atlas, GMW v3.14 geotiffs/gmw_v3_2009_gtiff.zip, 1658502628130467>,
 <Blob: mangrove_atlas, GMW v3.14 geotiffs/gmw_v3_2010_gtiff.zip, 1658502634901691>,
 <Blob: mangrove_atlas, GMW v3.14 geotiffs/gmw_v3_2015_gtiff.zip, 1658502665713112>,
 <Blob: mangrove_atlas, GMW v3.14 geotiffs/gmw_v3_2016_gtiff.zip, 1658502699001779>,
 <Blob: mangrove_atlas, GMW v3.14 geotiffs/gmw_v3_2017_gtiff.zip, 1658502728032109>,
 <Blob: mangrove_atlas, GMW v3.14 geotiffs/gmw_v3_2018_gtiff.zip, 1658502769412890>,
 <Blob: mangrove_atlas, GMW v3.14 geotiffs/gmw_v3_2019_gtiff.zip, 1658502783087871>,
 <Blob: mangrove_atlas, GMW v3.14 geotiffs/gmw_v3_2020_gtiff.zip, 1658502810948153>]

### Get the raw files locally

In [37]:
#copy_folder_gcs(f'{gcs_prefix}/{cgs_parent_folder}/*.zip', raw_local_folder)

Copying gs://mangrove_atlas/GMW v3.14 geotiffs/gmw_v3_1996_gtiff.zip...
Resuming download for /home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_1996_gtiff.zip
Copying gs://mangrove_atlas/GMW v3.14 geotiffs/gmw_v3_2008_gtiff.zip...
Copying gs://mangrove_atlas/GMW v3.14 geotiffs/gmw_v3_2007_gtiff.zip...
Resuming download for /home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_2008_gtiff.zip
Copying gs://mangrove_atlas/GMW v3.14 geotiffs/gmw_v3_2009_gtiff.zip...
Resuming download for /home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_2007_gtiff.zip
Resuming download for /home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_2009_gtiff.zip
Copying gs://mangrove_atlas/GMW v3.14 geotiffs/gmw_v3_2010_gtiff.zip...
Resuming download for /home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_2010_gtiff.zip
Copying gs://mangrove_atlas/GMW v3.14 geotiffs/gmw_v3_2015_gtiff.zip...
Copying gs://mangrove_atlas/GMW v3.14 geotiffs/gmw_v3_2016_gtiff.zip...
Copying gs://mangrove_atlas/GMW v3.14 geo

### Extract them and reupload to GCS in the ee_import_data folder

In [96]:
#extract_path_from_zip(raw_local_folder)

#folder_unziped = [x for x in raw_local_folder.iterdir() if x.is_dir()]

#for folder in folder_unziped:
#    copy_folder_gcs(folder, f'{gcs_prefix}/ee_import_data/{dataset}/')

[PosixPath('/home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_2008'),
 PosixPath('/home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_2007'),
 PosixPath('/home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_2009'),
 PosixPath('/home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_2015'),
 PosixPath('/home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_2016'),
 PosixPath('/home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_2017'),
 PosixPath('/home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_1996'),
 PosixPath('/home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_2019'),
 PosixPath('/home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_2010'),
 PosixPath('/home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_2020'),
 PosixPath('/home/jovyan/work/datasets/raw/mangrove_extent/gmw_v3_2018')]

In [9]:
raw_local_folder

PosixPath('/home/jovyan/work/datasets/raw/mangrove_fishing_activity')

In [10]:
copy_folder_gcs(raw_local_folder, f'{gcs_prefix}/ee_import_data/{dataset}/')

Copying file:///home/jovyan/work/datasets/raw/mangrove_fishing_activity/fish_pres2.tif [Content-Type=image/tiff]...
==> NOTE: You are uploading one or more large file(s), which would run          
significantly faster if you enable parallel composite uploads. This
feature can be enabled by editing the
"parallel_composite_upload_threshold" value in your .boto
configuration file. However, note that if you do this large files will
be uploaded as `composite objects
<https://cloud.google.com/storage/docs/composite-objects>`_,which
means that any user who downloads such objects will need to have a
compiled crcmod installed (see "gsutil help crcmod"). This is because
without a compiled crcmod, computing checksums on composite objects is
so slow that gsutil disables downloads of composite objects.

ResumableUploadAbortException: 401 Anonymous caller does not have storage.objects.create access to the Google Cloud Storage object. Permission 'storage.objects.create' denied on resource (or it may 

#### Clean up the raw data locally as this is expensive in terms of space.

In [121]:
#rm_tree(raw_local_folder)

#### Generate a manifest for each year; each manifest represents one image in the image collection

In [128]:
list_of_manifests = [generate_manifests(year) for year in data_year_range]

INFO:root:Searching ee_import_data/gmw_v3_1996/*.tif
INFO:root:Searching ee_import_data/gmw_v3_2007/*.tif
INFO:root:Searching ee_import_data/gmw_v3_2008/*.tif
INFO:root:Searching ee_import_data/gmw_v3_2009/*.tif
INFO:root:Searching ee_import_data/gmw_v3_2010/*.tif
INFO:root:Searching ee_import_data/gmw_v3_2015/*.tif
INFO:root:Searching ee_import_data/gmw_v3_2016/*.tif
INFO:root:Searching ee_import_data/gmw_v3_2017/*.tif
INFO:root:Searching ee_import_data/gmw_v3_2018/*.tif
INFO:root:Searching ee_import_data/gmw_v3_2019/*.tif
INFO:root:Searching ee_import_data/gmw_v3_2020/*.tif


#### Create the image collection asset and upload the images to GEE

In [14]:
ee_image_collection

'projects/global-mangrove-watch/land-cover/mangrove_fishing_activity'

In [16]:
gcs_bucket

'mangrove_atlas'

In [11]:
extent_collection_properties

ImageCollectionProperties(name='Fishing intensity associated with mangroves', version='v3', creator='Philine et al. (2020) Fishers who rely on mangroves: Modelling and mapping the global intensity of mangrove-associated fisheries.', description='\n## Summary\n\nMangroves are critical nursery habitats for fish and invertebrates, providing livelihoods for many coastal communities. Despite their importance, there is currently no estimate of the number of fishers engaged in mangrove associated fisheries, nor of the fishing intensity associated with mangroves at a global scale. \nWe address these gaps by developing a global model of mangrove associated fisher numbers and mangrove fishing intensity. To develop the model, we undertook a three-round Delphi process with mangrove fisheries experts to identify the key drivers of mangrove fishing intensity. We then developed a conceptual model of intensity of mangrove fishing using those factors identified both as being important and for which app

In [17]:
list_of_manifests = generate_manifests(2020)

INFO:root:Searching ee_import_data/mangrove_fishing_activity/fish_pres/*.tif


In [18]:
list_of_manifests

GEEManifest(path=PosixPath('/home/jovyan/work/datasets/processed/manifest/fish_pres'), name='projects/earthengine-legacy/assets/projects/global-mangrove-watch/land-cover/mangrove_fishing_activity/fishing_intensity_mangorves', tilesets=[Tilesets(data_type=None, id=None, crs=None, sources=[])], bands=[{'id': 'Fishing intensity', 'tileset_band_index': 0}], mask_bands=None, footprint=None, missing_data={'values': [-3.40282e+38]}, pyramiding_policy='MEAN', uri_prefix='', start_time='2020-01-01T00:00:00Z', end_time='2020-12-31T00:00:00Z', properties=ImageProperties(band_nodata_values='-3.40282e+38', band_pyramiding_policies='MEAN', band_names='Band 1', year=2020))

In [21]:
createImageCollection(ee_asset_path = ee_image_collection,
                        properties = extent_collection_properties, 
                        image_list = list_of_manifests)

ERROR:root:Error creating collection projects/global-mangrove-watch/land-cover/mangrove_fishing_activity
ERROR:root:Insufficient permissions to create asset 'projects/earthengine-legacy/assets/projects/global-mangrove-watch/land-cover/mangrove_fishing_activity'.


EEException: Insufficient permissions to create asset 'projects/earthengine-legacy/assets/projects/global-mangrove-watch/land-cover/mangrove_fishing_activity'.

If we need to add new years to the image collection, we can do that by adding new manifests to the image collection asset.

In [108]:
addImagesToCollection(ee_image_collection, list_of_manifests[0:1])

INFO:root:Task created
INFO:root:Upload image projects/earthengine-legacy/assets/projects/global-mangrove-watch/land-cover/mangrove-extent-test/gmw_v3_1996


Started upload task with ID: CJXPIRIMZJL6MGZ2US6RD3CH


In [2]:
# Check task status
!earthengine task list --status RUNNING
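
The command above lists running tasks from the CLI. A similar check can be sketched in Python. This assumes that `ee.data.getTaskList()` returns a list of dictionaries with keys such as `id`, `state`, and `description`. The Earth Engine call is left as a comment because it needs an authenticated session, and the task records below are made-up stand-ins.

```python
from collections import Counter


def summarize_tasks(tasks):
    """Count tasks per state from a list of task dicts."""
    return Counter(task.get("state", "UNKNOWN") for task in tasks)


# With an authenticated session, the list could come from:
#   import ee; ee.Initialize(); tasks = ee.data.getTaskList()
# The records below are made-up stand-ins with the same assumed shape.
tasks = [
    {"id": "TASK_A", "state": "RUNNING", "description": "Upload image A"},
    {"id": "TASK_B", "state": "COMPLETED", "description": "Upload image B"},
    {"id": "TASK_C", "state": "FAILED", "description": "Upload image C"},
    {"id": "TASK_D", "state": "RUNNING", "description": "Upload image D"},
]

summary = summarize_tasks(tasks)
running_ids = [t["id"] for t in tasks if t["state"] == "RUNNING"]
print(dict(summary))
print(running_ids)
```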