# 03. Upload polygon masks from Cloud bucket to GEE Assets

This notebook uploads the polygon masks created for each Sentinel-2 and Landsat tile from a Google Cloud Platform (GCP) bucket to Google Earth Engine (GEE) Assets.

The masks are the output of the notebook [01_Create_polygon_mask.ipynb](https://github.com/ShiruiH/time-series-OFS/blob/main/01_Create_polygon_mask.ipynb), which writes them to `/outputs/Sentinel2_tiles_mask` and `/outputs/Landsat_tiles_mask`. They must already be in the cloud bucket before this notebook is run.

In [11]:
# import libraries
import os
import glob
from pathlib import Path
import shutil
import json
from google.cloud import storage
import ee

In [12]:
# Authenticate and Initialize ee
ee.Authenticate()
ee.Initialize(project='nsw-dpe-gee-tst')

Successfully saved authorization token.


In [13]:
! gcloud auth login
! gcloud config set project nsw-dpe-gee-tst

You are now logged in as [kilian.vos@dpie.nsw.gov.au].
Your current project is [nsw-dpe-gee-tst].  You can change this setting by running:
  $ gcloud config set project PROJECT_ID
Updated property [core/project].


Auxiliary functions

In [14]:
# list the files (not the folder placeholder) under a folder in the bucket
def list_blobs(bucket_name, folder_path):
    storage_client = storage.Client()
    blobs = storage_client.list_blobs(bucket_name, prefix=folder_path)
    # skip the folder itself and any object whose name ends with '/'
    blobs = [blob for blob in blobs if blob.name != folder_path and not blob.name.endswith('/')]
    return blobs

# set the asset name (the 'name' field) in a json manifest
def write_json(new_data, filename):
    with open(filename, 'r+') as file:
        # load the existing manifest into a dict
        file_data = json.load(file)
        # overwrite the asset name
        file_data['name'] = new_data
        # rewind to the start of the file before writing
        file.seek(0)
        # convert back to json
        json.dump(file_data, file, indent=4)
        # drop leftover bytes if the new content is shorter than the old
        file.truncate()

# set the source file(s) and the tile property in a json manifest
def write_json_manifest_multi_tiles(sources, tile, filename):
    with open(filename, 'r+') as file:
        # load the existing manifest into a dict
        file_data = json.load(file)
        # overwrite the sources of the first tileset and the image properties
        file_data['tilesets'][0]['sources'] = [sources]
        file_data['properties'] = tile
        # rewind to the start of the file before writing
        file.seek(0)
        # convert back to json
        json.dump(file_data, file, indent=4)
        # drop leftover bytes if the new content is shorter than the old
        file.truncate()
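
The template manifest itself is not shown in this notebook. As a minimal sketch, assuming it follows the usual Earth Engine image manifest layout (a `name`, one entry in `tilesets` holding a `sources` list, and a `properties` dict), the two helpers above amount to the in-memory update below. The function `fill_manifest`, the placeholder values in `template`, and the project and collection names in the example call are hypothetical and only for illustration.

```python
import copy
import json

# Hypothetical template; the real Template_*_tileset.json may hold more fields.
template = {
    "name": "placeholder",
    "tilesets": [{"sources": [{"uris": ["placeholder"]}]}],
    "properties": {},
}

def fill_manifest(template, asset_name, gcs_uri, tile):
    """Return a filled copy of the template, leaving the template untouched."""
    manifest = copy.deepcopy(template)
    # same field that write_json updates
    manifest["name"] = asset_name
    # same fields that write_json_manifest_multi_tiles updates
    manifest["tilesets"][0]["sources"] = [{"uris": [gcs_uri]}]
    manifest["properties"] = {"Tile": tile}
    return manifest

manifest = fill_manifest(
    template,
    asset_name="projects/my-project/assets/OFS/my_collection/T55JGH_20231213T001111_B02",
    gcs_uri="gs://label-tiles/Sentinel2_tiles_mask/T55JGH_20231213T001111_B02.tif",
    tile="55JGH",
)
print(json.dumps(manifest, indent=4))
```

Working on a copy of a dict avoids the copy-then-edit-in-place round trip through the file system, although the notebook's file-based approach has the advantage of leaving one manifest per tile on disk for later inspection.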

Locate the cloud bucket (make sure the bucket name and folder path are correct)

In [15]:
# Bucket settings
BUCKET_NAME = 'label-tiles'
FOLDER_PATH = 'Sentinel2_tiles_mask/'
# the satellite name is the part of the folder name before the first underscore
sat_name = FOLDER_PATH.split('_')[0]
print('Polygon masks from satellite:', sat_name)

# GEE Assets settings
PROJ_NAME = 'nsw-dpe-gee-tst'
ASSETS_SUB_FOLDER = 'OFS' # can be a nested folder, e.g., OFS/sub_folder
IMAGE_COLLECTION_NAME = f'exp_baseOFS_{sat_name}_tiles_1'

Polygon masks from satellite: Sentinel2
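
Because `sat_name` is taken from the folder name, pointing `FOLDER_PATH` at the Landsat folder also changes the image collection name. The sketch below shows how the settings combine into a collection ID for both folders. `collection_id` is a hypothetical helper, not part of the notebook, and the Landsat folder name is assumed to match the output folder named at the top of this notebook.

```python
def collection_id(proj_name, sub_folder, folder_path):
    """Build the image collection ID the same way the settings cell does."""
    # satellite name is the part of the folder name before the first underscore
    sat_name = folder_path.split('_')[0]
    return f'projects/{proj_name}/assets/{sub_folder}/exp_baseOFS_{sat_name}_tiles_1'

for folder in ('Sentinel2_tiles_mask/', 'Landsat_tiles_mask/'):
    print(folder, '->', collection_id('nsw-dpe-gee-tst', 'OFS', folder))
```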


Load the template JSON manifest (it is copied and filled in for each tile in the next cell)

In [16]:
fp_outputs = os.path.join(os.getcwd(), 'outputs')
fp_json = os.path.join(fp_outputs, 'json_manifest')
os.makedirs(fp_json, exist_ok=True)
# JSON manifest template for this satellite
src_file = os.path.join(fp_outputs, f'Template_{sat_name}_tileset.json')
if not os.path.exists(src_file):
    raise FileNotFoundError(f'Template manifest not found: {src_file}')
# list the GeoTIFF masks stored in the cloud bucket
blobs_lst = list_blobs(BUCKET_NAME, FOLDER_PATH)
upload_lst = [blob for blob in blobs_lst if '.tif' in blob.name]
print('%d polygon masks found in cloud bucket'%len(upload_lst))

18 polygon masks found in cloud bucket


Ingest the files into GEE Assets using a JSON manifest

In [17]:
# loop through blobs (elements in cloud bucket)
for blob in upload_lst:
    blob_name = blob.name
    # file name without folder or extension, e.g. T55JGH_20231213T001111_B02
    file_name = Path(blob_name).stem
    # the tile ID is the first part of the file name, e.g. T55JGH
    # (the last part is the band, e.g. B02, which has no leading 'T' to remove)
    tile = file_name.split('_')[0]
    # copy the template to a manifest for this tile
    dst_file = f'{fp_json}/{sat_name}_EE_upload_{tile}.json'
    shutil.copy(src_file, dst_file)
    assets_name = f'projects/{PROJ_NAME}/assets/{ASSETS_SUB_FOLDER}/{IMAGE_COLLECTION_NAME}/{file_name}'
    source_arg = {
        "uris": [
            f"gs://{BUCKET_NAME}/{blob_name}"
            ]
        }
    # add tile property to the image object
    tile_arg = {
        "Tile": f"{tile[1:]}" # 'T' needs to be removed
    }
    # Change the asset name in the json file
    write_json(assets_name, dst_file)
    # Set the source file and the tile property in the json file
    write_json_manifest_multi_tiles(source_arg, tile_arg, filename=dst_file)
    # Submit an EE task to ingest the tile (the task runs asynchronously)
    reqID = ee.data.newTaskId()[0]
    with open(dst_file) as f:
        params = json.load(f)
    ee.data.startIngestion(request_id=reqID, params=params)
print('%d polygon masks uploaded to GEE Assets under %s'%(len(upload_lst),assets_name))

18 polygon masks uploaded to GEE Assets under projects/nsw-dpe-gee-tst/assets/OFS/exp_baseOFS_Sentinel2_tiles_1/T55JGH_20231213T001111_B02
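
The name handling in the loop can be checked without touching the bucket or Earth Engine. The sketch below assumes the mask files are named `<tile>_<timestamp>_<band>.tif` with the tile ID as the leading token, as the asset name printed above suggests. `parse_blob_name` is a hypothetical helper, not part of the notebook.

```python
from pathlib import PurePosixPath

def parse_blob_name(blob_name, bucket_name):
    """Split a bucket object name into the pieces used to build a manifest."""
    # drop the folder and the extension
    file_name = PurePosixPath(blob_name).stem
    # assumed naming pattern: the tile ID is the first token, e.g. 'T55JGH'
    tile = file_name.split('_')[0]
    if not tile.startswith('T'):
        raise ValueError(f'Unexpected tile ID in {blob_name!r}')
    return {
        'file_name': file_name,
        'tile': tile[1:],  # leading 'T' removed
        'uri': f'gs://{bucket_name}/{blob_name}',
    }

info = parse_blob_name('Sentinel2_tiles_mask/T55JGH_20231213T001111_B02.tif', 'label-tiles')
print(info)
```

Raising an error on an unexpected name is a design choice for this sketch: it stops a wrongly named file from being ingested with a meaningless `Tile` property.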
