(https://developers.google.com/earth-engine/Earth_Engine_asset_from_cloud_geotiff)

The latest version of the S2 seasonal composites (2019-2023) is v2024-03-11.

GEE script: [https://code.earthengine.google.com/f7ccb6d08f9f201c2f09d6433719433e] (snapshot)

[https://code.earthengine.google.com/?scriptPath=users%2Fmmacander%2Fakveg_map%3Asentinel_2%2Fs2_medians_v20240311] (script path in GEE)

After creating the Cloud GeoTiff Backed Earth Engine Assets, visualize the results with this viz script:

[https://code.earthengine.google.com/?scriptPath=users%2Fmmacander%2Fakveg_map%3Asentinel_2%2Fs2_viz]

v2024-03-11 has much reduced snow contamination in the early-season composites. Currently, we use fire polygons for 2019-2023 to mask input imagery that was acquired in the year of a fire or earlier. For example, where a 2020 fire is mapped in the fire perimeter polygon data, only 2021-2023 data are included in the composites, so the composites represent the current (post-fire) condition. This approach has two problems. First, for training we may prefer the pre-fire conditions. Second, areas burned in 2023 have no seasonal imagery because every year is masked. The script currently falls back to a non-seasonal, full snow-free season composite when insufficient data are available, but this is not desirable behavior for 2023 fire scars.

A better approach would be to save pre- and post-fire composite imagery for the 2019-2023 fire scar footprints.
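
The masking rule described above can be sketched in plain Python. This is only an illustration of the year selection, not the Earth Engine script itself; the function name `usable_years` and the single-fire-per-pixel simplification are assumptions made for the example.

```python
# Years covered by the composites (2019-2023 inclusive).
COMPOSITE_YEARS = range(2019, 2024)


def usable_years(fire_year=None):
    """Return the composite years kept for a pixel.

    fire_year is the year of a mapped fire, or None for an unburned pixel.
    Imagery acquired in the fire year or earlier is masked.
    (Hypothetical helper; assumes at most one mapped fire per pixel.)
    """
    if fire_year is None:
        return list(COMPOSITE_YEARS)
    return [year for year in COMPOSITE_YEARS if year > fire_year]


print(usable_years())      # [2019, 2020, 2021, 2022, 2023]
print(usable_years(2020))  # [2021, 2022, 2023]
print(usable_years(2023))  # [] (no seasonal imagery; the fallback composite is used)
```

The empty list for a 2023 fire is the case where the script falls back to the non-seasonal composite.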

# Cloud GeoTiff Backed Earth Engine Assets

***Note:*** *The REST API contains new and advanced features that may not be suitable for all users.  If you are new to Earth Engine, please get started with the [JavaScript guide](https://developers.google.com/earth-engine/guides/getstarted).*

Earth Engine can load images from Cloud Optimized GeoTiffs (COGs) in Google Cloud Storage ([learn more](https://developers.google.com/earth-engine/guides/image_overview#images-from-cloud-geotiffs)).  This notebook demonstrates how to create Earth Engine assets backed by COGs.  An advantage of COG-backed assets is that the spatial and metadata fields of the image will be indexed at asset creation time, making the image more performant in collections.  (In contrast, an image created through `ee.Image.loadGeoTIFF` and put into a collection will require a read of the GeoTiff for filtering operations on the collection.)  A disadvantage of COG-backed assets is that they may be several times slower than standard assets when used in computations.

To create a COG-backed asset, make a `POST` request to the Earth Engine [`CreateAsset` endpoint](https://developers.google.com/earth-engine/reference/rest/v1alpha/projects.assets/create).  As shown in the following, this request must be authorized to create an asset in your user folder.

## Start an authorized session

To be able to make an Earth Engine asset in your user folder, you need to be able to authenticate as yourself when you make the request.  You can use credentials from the Earth Engine authenticator to start an [`AuthorizedSession`](https://google-auth.readthedocs.io/en/master/reference/google.auth.transport.requests.html#google.auth.transport.requests.AuthorizedSession).  You can then use the `AuthorizedSession` to send requests to Earth Engine.

In [65]:
import ee
from google.auth.transport.requests import AuthorizedSession

ee.Authenticate()  #  or !earthengine authenticate --auth_mode=gcloud
session = AuthorizedSession(ee.data.get_persistent_credentials())

Successfully saved authorization token.


## Cleanup existing imageCollection, if needed

imageCollections cannot be deleted until all images inside them are deleted. For a cloud-backed image collection with hundreds or thousands of tiles, this can take a while.

Bash, using the earthengine CLI:
```
for i in `earthengine ls projects/akveg-map/assets/s2_2019_2023_gMedian_v20240311`; do earthengine rm $i; done
earthengine rm projects/akveg-map/assets/s2_2019_2023_gMedian_v20240311
```
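
The same cleanup could probably be done from Python. The sketch below takes the listing and deleting functions as arguments, so the logic can be tried without touching real assets. It assumes the listing call returns a dict with an `assets` list whose entries carry a `name` key, which is how `ee.data.listAssets` is understood to behave; `delete_collection` is a hypothetical helper, not part of the Earth Engine API.

```python
def delete_collection(collection_id, list_assets, delete_asset):
    """Delete every child of collection_id, then the collection itself.

    list_assets and delete_asset are injected callables. In a notebook they
    would likely be ee.data.listAssets and ee.data.deleteAsset (assumption:
    listAssets returns {'assets': [{'name': ...}, ...]}).
    Returns the number of child assets deleted.
    """
    deleted = 0
    # Re-list until the collection is empty, in case only one page of
    # children is returned per call.
    while True:
        children = list_assets({'parent': collection_id}).get('assets', [])
        if not children:
            break
        for child in children:
            delete_asset(child['name'])
            deleted += 1
    delete_asset(collection_id)
    return deleted


# Possible usage in the notebook (not run here):
# import ee
# ee.Initialize()
# delete_collection(
#     'projects/akveg-map/assets/s2_2019_2023_gMedian_v20240311',
#     ee.data.listAssets, ee.data.deleteAsset)
```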

## Create list of cloud geotiffs
Create a list of the COGs in a bucket folder to load into an imageCollection.

Run in Bash, in a conda env with the gsutil and earthengine command-line tools:

```
cd /data/gis/gis_projects/2024/24-224_Land_Cover_Metrics_Susitna_Wolf/sentinel2_gMedian

gsutil ls gs://akveg-data/s2_sr_2019_2023_gMedian_v20240311/*.tif > s2_cogs_v20240311.txt
```

## Open list of geotiffs to ingest

In [66]:
import pandas
cogs = pandas.read_csv('/data/gis/gis_projects/2024/24-224_Land_Cover_Metrics_Susitna_Wolf/sentinel2_gMedian/s2_cogs_v20240311.txt', header=None,names=['tif'])
print(cogs[0:2])
print(len(cogs.index))


                                                 tif
0  gs://akveg-data/s2_sr_2019_2023_gMedian_v20240...
1  gs://akveg-data/s2_sr_2019_2023_gMedian_v20240...
35


## Setup parameters

In [67]:
import json
# from urllib.parse import urlparse
import os
from pprint import pprint

# Earth Engine enabled Cloud Project.
project_folder = 'akveg-map'
# Must match the name of the empty image collection created below.
collection = 's2_sr_2019_2023_gMedian_v20240311'

## Create empty image collection as target
TODO: Automate creation of the empty image collection.

For now, manually create the empty image collection with the earthengine CLI:

```
earthengine create collection projects/akveg-map/assets/s2_sr_2019_2023_gMedian_v20240311
```
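
One possible way to address the TODO is to reuse the `CreateAsset` endpoint with an asset type of `IMAGE_COLLECTION`. The sketch below assumes the endpoint accepts that type, as listed among the asset types in the REST reference; this notebook has not confirmed it. `create_image_collection` is a hypothetical helper, and `session` is expected to be an `AuthorizedSession` like the one created earlier.

```python
import json

# Same CreateAsset URL pattern used elsewhere in this notebook.
ASSETS_URL = 'https://earthengine.googleapis.com/v1alpha/projects/{}/assets?assetId={}'


def create_image_collection(session, project_folder, collection):
    """POST an empty IMAGE_COLLECTION asset and return the HTTP response.

    Hypothetical helper; assumes 'IMAGE_COLLECTION' is an accepted asset type.
    """
    return session.post(
        url=ASSETS_URL.format(project_folder, collection),
        data=json.dumps({'type': 'IMAGE_COLLECTION'}),
    )


# Possible usage in the notebook (not run here):
# response = create_image_collection(
#     session, 'akveg-map', 's2_sr_2019_2023_gMedian_v20240311')
# print(response.status_code)
```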

## View list of cogs to ingest (skip when list is long)

In [68]:
for cog in cogs['tif']:
    print(cog)

gs://akveg-data/s2_sr_2019_2023_gMedian_v20240311/s2_sr_2019_2023_gMedian_AK050H01V31_all_v20240311.tif
gs://akveg-data/s2_sr_2019_2023_gMedian_v20240311/s2_sr_2019_2023_gMedian_AK050H01V32_all_v20240311.tif
gs://akveg-data/s2_sr_2019_2023_gMedian_v20240311/s2_sr_2019_2023_gMedian_AK050H01V33_all_v20240311.tif
gs://akveg-data/s2_sr_2019_2023_gMedian_v20240311/s2_sr_2019_2023_gMedian_AK050H02V31_all_v20240311.tif
gs://akveg-data/s2_sr_2019_2023_gMedian_v20240311/s2_sr_2019_2023_gMedian_AK050H02V32_all_v20240311.tif
gs://akveg-data/s2_sr_2019_2023_gMedian_v20240311/s2_sr_2019_2023_gMedian_AK050H02V33_all_v20240311.tif
gs://akveg-data/s2_sr_2019_2023_gMedian_v20240311/s2_sr_2019_2023_gMedian_AK050H03V32_all_v20240311.tif
gs://akveg-data/s2_sr_2019_2023_gMedian_v20240311/s2_sr_2019_2023_gMedian_AK050H03V33_all_v20240311.tif
gs://akveg-data/s2_sr_2019_2023_gMedian_v20240311/s2_sr_2019_2023_gMedian_AK050H04V34_all_v20240311.tif
gs://akveg-data/s2_sr_2019_2023_gMedian_v20240311/s2_sr_2019_202

## Function to load list of gcs cogs to GEE imageCollection
Comment out pprint and most print calls except when troubleshooting.

In [69]:
def load_gcs_cogs_to_collection(cogs, project_folder, collection):
    """Create one COG-backed image asset in `collection` per URI in cogs['tif'].

    Uses the global AuthorizedSession `session` created above.
    """
    url = 'https://earthengine.googleapis.com/v1alpha/projects/{}/assets?assetId={}'

    for cog in cogs['tif']:
        fileOnly = os.path.split(cog)[1]
        # print(fileOnly)

        # The asset name is the file name without the '.tif' extension.
        cogName = fileOnly[:-4]
        print(cogName)

        # File names look like s2_sr_2019_2023_gMedian_<tile>_<season>_<version>.tif,
        # so the season is the seventh underscore-separated part.
        parts = fileOnly.split('_')
        season = parts[6]
        # print(season)

        # Request body as a dictionary.
        request = {
            'type': 'IMAGE',
            'gcs_location': {
                # 'uris' is a list of URIs, so wrap the single COG path in a list.
                'uris': [cog]
            },
            'properties': {
                # 'source': 'https://code.earthengine.google.com/067b10ee56537817756a3177a9138aee',
                'seasonName': season
            },
            'startTime': '2023-01-01T00:00:00.000000000Z',
            'endTime': '2024-01-01T00:00:00.000000000Z',
        }
        # pprint(json.dumps(request))

        # A folder (or ImageCollection) name and the new asset name.
        asset_id = collection + '/' + cogName

        response = session.post(
            url=url.format(project_folder, asset_id),
            data=json.dumps(request)
        )

        # Report failed requests so they are not silently skipped.
        if not response.ok:
            print('FAILED', cogName, response.status_code)
            # pprint(json.loads(response.content))
    print('done')
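
The file-name parsing inside the function can be checked on its own before any request is sent. The following is a minimal sketch under the assumption that every file follows the `s2_sr_2019_2023_gMedian_<tile>_<season>_<version>.tif` pattern seen in the listing above; `parse_cog_uri` is a hypothetical helper name.

```python
import os


def parse_cog_uri(uri):
    """Split a COG URI into (asset name, season).

    Assumes the tile naming used in this notebook, where the season is the
    seventh underscore-separated part of the file name.
    """
    file_name = os.path.basename(uri)
    asset_name, _extension = os.path.splitext(file_name)
    parts = asset_name.split('_')
    if len(parts) < 8:
        raise ValueError(f'Unexpected file name: {file_name}')
    return asset_name, parts[6]


uri = ('gs://akveg-data/s2_sr_2019_2023_gMedian_v20240311/'
       's2_sr_2019_2023_gMedian_AK050H01V31_all_v20240311.tif')
print(parse_cog_uri(uri))
# ('s2_sr_2019_2023_gMedian_AK050H01V31_all_v20240311', 'all')
```

Raising on an unexpected name is a design choice for the sketch: it stops a batch before an asset is created with the wrong `seasonName`.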


## Run it

In [70]:
# load_gcs_cogs_to_collection(cogs, project_folder, 's1_2022_v20230326')
load_gcs_cogs_to_collection(cogs, project_folder, 's2_sr_2019_2023_gMedian_v20240311')


s2_sr_2019_2023_gMedian_AK050H01V31_all_v20240311
s2_sr_2019_2023_gMedian_AK050H01V32_all_v20240311
s2_sr_2019_2023_gMedian_AK050H01V33_all_v20240311
s2_sr_2019_2023_gMedian_AK050H02V31_all_v20240311
s2_sr_2019_2023_gMedian_AK050H02V32_all_v20240311
s2_sr_2019_2023_gMedian_AK050H02V33_all_v20240311
s2_sr_2019_2023_gMedian_AK050H03V32_all_v20240311
s2_sr_2019_2023_gMedian_AK050H03V33_all_v20240311
s2_sr_2019_2023_gMedian_AK050H04V34_all_v20240311
s2_sr_2019_2023_gMedian_AK050H05V34_all_v20240311
s2_sr_2019_2023_gMedian_AK050H50V03_all_v20240311
s2_sr_2019_2023_gMedian_AK050H50V04_all_v20240311
s2_sr_2019_2023_gMedian_AK050H50V05_all_v20240311
s2_sr_2019_2023_gMedian_AK050H50V06_all_v20240311
s2_sr_2019_2023_gMedian_AK050H50V07_all_v20240311
s2_sr_2019_2023_gMedian_AK050H50V08_all_v20240311
s2_sr_2019_2023_gMedian_AK050H50V09_all_v20240311
s2_sr_2019_2023_gMedian_AK050H50V10_all_v20240311
s2_sr_2019_2023_gMedian_AK050H50V11_all_v20240311
s2_sr_2019_2023_gMedian_AK050H50V12_all_v20240311


# OLD or Single Image Example below here

In [36]:
# # Request body as a dictionary.
# for cog in cogs['tif']:
#   request = {
#     'type': 'IMAGE',
#     'gcs_location': {
#       'uris': cog
#     },
#     # 'properties': {
#     #   'source': 'https://code.earthengine.google.com/067b10ee56537817756a3177a9138aee'
#     # },
#     'startTime': '2022-01-01T00:00:00.000000000Z',
#     'endTime': '2023-01-01T00:00:00.000000000Z',
#   }

#   # pprint(json.dumps(request))

#   cogName = cog[34:-4]
#   print(cogName)

#   # A folder (or ImageCollection) name and the new asset name.
#   asset_id = collection+'/'+cogName

#   url = 'https://earthengine.googleapis.com/v1alpha/projects/{}/assets?assetId={}'

#   response = session.post(
#     url = url.format(project_folder, asset_id),
#     data = json.dumps(request)
#   )

#   # pprint(json.loads(response.content))

In [None]:
# cogs = [
# 'gs://akveg-data/s1_2022_v20230320/s1_flat_seasonal_composite_2022_06VUL_v20230320c.tif',
# 'gs://akveg-data/s1_2022_v20230320/s1_flat_seasonal_composite_2022_06VUM_v20230320c0000000000-0000000000.tif',
# 'gs://akveg-data/s1_2022_v20230320/s1_flat_seasonal_composite_2022_06VUM_v20230320c0000009472-0000000000.tif',
# 'gs://akveg-data/s1_2022_v20230320/s1_flat_seasonal_composite_2022_06VUN_v20230320c0000000000-0000000000.tif',
# 'gs://akveg-data/s1_2022_v20230320/s1_flat_seasonal_composite_2022_06VUN_v20230320c0000009472-0000000000.tif']
# cogs[0:4]

In [None]:
# from google.cloud import storage


# def list_blobs(bucket_name):
#     """Lists all the blobs in the bucket."""
#     # bucket_name = "your-bucket-name"

#     storage_client = storage.Client()

#     # Note: Client.list_blobs requires at least package version 1.17.0.
#     blobs = storage_client.list_blobs(bucket_name)

#     # Note: The call returns a response only when the iterator is consumed.
#     for blob in blobs:
#         print(blob.name)

# list_blobs('akveg-data')

In [None]:
# from google.cloud import storage


# def list_blobs_with_prefix(bucket_name, prefix, delimiter=None):
#     """Lists all the blobs in the bucket that begin with the prefix.

#     This can be used to list all blobs in a "folder", e.g. "public/".

#     The delimiter argument can be used to restrict the results to only the
#     "files" in the given "folder". Without the delimiter, the entire tree under
#     the prefix is returned. For example, given these blobs:

#         a/1.txt
#         a/b/2.txt

#     If you specify prefix ='a/', without a delimiter, you'll get back:

#         a/1.txt
#         a/b/2.txt

#     However, if you specify prefix='a/' and delimiter='/', you'll get back
#     only the file directly under 'a/':

#         a/1.txt

#     As part of the response, you'll also get back a blobs.prefixes entity
#     that lists the "subfolders" under `a/`:

#         a/b/
#     """

#     storage_client = storage.Client()

#     # Note: Client.list_blobs requires at least package version 1.17.0.
#     blobs = storage_client.list_blobs(bucket_name, prefix=prefix, delimiter=delimiter)

#     # Note: The call returns a response only when the iterator is consumed.
#     print("Blobs:")
#     for blob in blobs:
#         print(blob.name)

#     if delimiter:
#         print("Prefixes:")
#         for prefix in blobs.prefixes:
#             print(prefix)

# list_blobs_with_prefix('akveg-map', 's1_2022_v20230320/')

## Send the request

Make the POST request to the Earth Engine [`projects.assets.create`](https://developers.google.com/earth-engine/reference/rest/v1alpha/projects.assets/create) endpoint.

In [None]:
# Earth Engine enabled Cloud Project.
project_folder = 'akveg-map'
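# A folder (or ImageCollection) name and the new asset name.
asset_id = 's1_v20230321c'
# NOTE: `request` must already hold the request body dict for a single image,
# built as in the commented-out example above.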

url = 'https://earthengine.googleapis.com/v1alpha/projects/{}/assets?assetId={}'

response = session.post(
  url = url.format(project_folder, asset_id),
  data = json.dumps(request)
)

pprint(json.loads(response.content))
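
When many requests are sent, a one-line summary per response may be easier to scan than the full `pprint` output. The sketch below assumes failures come back in the common Google API shape, `{"error": {"message": ...}}`, and that a successful response contains the new asset's `name`; both are assumptions rather than something this notebook verifies. `summarize_response` is a hypothetical helper.

```python
import json


def summarize_response(status_code, content):
    """Return a one-line summary of a CreateAsset response.

    content is the raw response body (str or bytes).
    """
    try:
        body = json.loads(content)
    except ValueError:
        return f'{status_code}: unparseable response body'
    if not isinstance(body, dict):
        return f'{status_code}: unexpected response body'
    if 'error' in body:
        message = body['error'].get('message', 'unknown error')
        return f'{status_code}: {message}'
    return f"{status_code}: created {body.get('name', '(no name returned)')}"


# Possible usage in the notebook (not run here):
# print(summarize_response(response.status_code, response.content))
```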

## Details on COG-backed assets

### Permissions
The ACLs of COG-backed Earth Engine assets and the underlying data are managed separately. If a COG-backed asset is shared in Earth Engine, it is the owner's responsibility to ensure that the data in GCS is shared with the same parties. If the data is not visible, Earth Engine will return an error of the form "Failed to load the GeoTIFF at `gs://my-bucket/my-object#123456`" (123456 is the generation of the object).

### Generations
When a COG-backed asset is created, Earth Engine reads the metadata of the TIFF in Cloud Storage and creates an asset store entry. The URI associated with that entry must have a generation. See the [object versioning docs](https://cloud.google.com/storage/docs/object-versioning) for details on generations. If a generation is specified (e.g., `gs://foo/bar#123`), Earth Engine will use it. If a generation is not specified, Earth Engine will use the latest generation of the object.

That means that if the object in GCS is updated, Earth Engine will return a "Failed to load the GeoTIFF at `gs://my-bucket/my-object#123456`" error because the expected object no longer exists (unless the bucket enables multiple object versions). This policy is designed to keep the metadata of the asset in sync with the metadata of the object.
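
To make the `#generation` suffix concrete, here is a small sketch that separates a URI into its object path and generation. It only illustrates the URI form quoted above; `split_generation` is a hypothetical helper, not an Earth Engine or Cloud Storage function.

```python
def split_generation(uri):
    """Split 'gs://bucket/object#123' into ('gs://bucket/object', 123).

    Returns (uri, None) when no numeric generation suffix is present.
    """
    base, separator, generation = uri.rpartition('#')
    if separator and generation.isdigit():
        return base, int(generation)
    return uri, None


print(split_generation('gs://foo/bar#123'))  # ('gs://foo/bar', 123)
print(split_generation('gs://foo/bar'))      # ('gs://foo/bar', None)
```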

### Configuration
In terms of how a COG should be configured, the TIFF MUST be:

- Tiled, where the tile dimensions are either:
  - 16x16
  - 32x32
  - 64x64
  - 128x128
  - 256x256
  - 512x512
  - 1024x1024

- Arranged so that all IFDs are at the beginning.

For best performance:

- Use tile dimensions of 128x128 or 256x256.
- Include power of 2 overviews.

See [this page](https://cogeotiff.github.io/rio-cogeo/Advanced/#web-optimized-cog) for more details on an optimized configuration.
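
The tiling and overview rules above can be turned into a quick pre-flight check. The sketch below works on plain numbers, so it makes no claim about how they are read from a file; they could come from a raster library such as rasterio (for example its `block_shapes` and `overviews()`), which this notebook does not use. `check_cog_layout` is a hypothetical helper, and it does not check that the IFDs are at the beginning of the file.

```python
ALLOWED_TILE_SIZES = {16, 32, 64, 128, 256, 512, 1024}
RECOMMENDED_TILE_SIZES = {128, 256}


def is_power_of_two(n):
    """True for 1, 2, 4, 8, ..."""
    return n >= 1 and (n & (n - 1)) == 0


def check_cog_layout(tile_width, tile_height, overview_factors):
    """Return a list of notes about the tiling and overviews (empty if fine)."""
    notes = []
    if tile_width != tile_height or tile_width not in ALLOWED_TILE_SIZES:
        notes.append(f'error: unsupported tile size {tile_width}x{tile_height}')
    elif tile_width not in RECOMMENDED_TILE_SIZES:
        notes.append(
            f'warning: tile size {tile_width}x{tile_height} is allowed '
            'but not recommended')
    if not overview_factors:
        notes.append('warning: no overviews found')
    elif not all(is_power_of_two(f) for f in overview_factors):
        notes.append('warning: overview factors are not all powers of 2')
    return notes


print(check_cog_layout(256, 256, [2, 4, 8]))  # []
print(check_cog_layout(512, 512, [2, 4, 8]))  # one warning about tile size
print(check_cog_layout(100, 100, []))         # one error and one warning
```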