<a href="https://colab.research.google.com/github/wayne-xyz/GoogleEarthEngineTask/blob/main/Gee_Export_tif_.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Exporting TIF files from GEE: NICFI and Sentinel

- NICFI
 - Monthly: per [planet.com](https://developers.planet.com/docs/integrations/gee/nicfi/faq/), the Analytic Monthly series basemaps are stored in the image collections as one image per calendar month. A date filter for a period that falls strictly inside a month (for example, between 2021-08-02 and 2021-08-19) returns no results.
 - The collection on GEE is updated monthly.
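
One way to avoid the empty result described above is to widen any requested date to its enclosing calendar month before filtering. The sketch below is plain Python and only computes the bounds. It assumes they would then be passed to `filterDate`, and that each basemap is stamped at the start of its month. The helper name `month_bounds` is hypothetical.

```python
from datetime import date

def month_bounds(day_str):
    """Return (first day of the month, first day of the next month) as ISO strings."""
    d = date.fromisoformat(day_str)
    start = d.replace(day=1)
    # Roll over to January of the next year after December.
    if start.month == 12:
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    return start.isoformat(), end.isoformat()

# A mid-month request is widened to the whole month.
print(month_bounds("2021-08-02"))
```

The returned pair is a half-open range (start inclusive, end exclusive), which matches how `filterDate` treats its two arguments.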



## 1. Shared config before running the export

Run this part first, then run the NICFI or Sentinel export:
- Set the export parameters
- Run the export
- Test the export function

### Config Setting


In [None]:
# Google Drive folder name
from google.colab import drive
import time

drive.mount('/content/drive')

# the export function accepts only a top-level Drive folder name, not a path
sentinel_folder_name="sentinel_tif_2024June"
nicfi_folder_name="nicfi_tif_2024June"

# minimum dumpsite shape size (compared against the AREA_HA field)
shapefile_size=0.1

# time range
start_date='2023-07-01'
end_date='2024-07-01'

# image zoom



print("Current Time:",time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))


Mounted at /content/drive
Current Time: 2024-07-23 01:54:09


### Initial EE

In [None]:
# pre-setting

# Trigger the authentication flow.
import ee
import time

ee.Authenticate()

# Initialize the library.
ee.Initialize(project='ee-qinheyi')
print("Current Time:",time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))


Current Time: 2024-07-23 01:54:30


### Define exporting function

In [None]:
# works well when `image` is a single image, e.g. the first() of a collection
# the export region is exactly the geometry of the given shape
def export_tif_image(ind,feature,image,date_str='YYYYMM',source_name="Source",folder_name="tif_file"):
    """
    Export a TIF image from GEE (both NICFI and Sentinel) to Google Drive

    Args:
        ind(int): index of the shape in the shapefile
        feature(ee.Feature): shape of the place to export
        image(ee.Image): image to export
        date_str(str): image date, formatted YYYYMMDD or YYYYMM
        source_name(str): source name.
        folder_name(str): folder name.
    """
    # to_export=image.clip(feature.geometry())
    # export_id=feature.get('Index').getInfo()
    # print(image.bandNames().getInfo())
    task=ee.batch.Export.image.toDrive(
        image=image,
        description=f"export_{ind}",
        folder=folder_name,
        region=feature.geometry(),
        scale=10,
        crs='EPSG:4326',
        maxPixels=1e13,
        fileNamePrefix=f"{ind}-{date_str}-{source_name}"
    )
    task.start()
    print(f"Export task start:{ind} ",date_str)

print("Current Time:",time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))


Current Time: 2024-07-22 13:21:15
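
The `fileNamePrefix` pattern above (`{ind}-{date_str}-{source_name}`) makes it possible to recover the shape index and date from a downloaded file name. A minimal sketch, assuming a single-file export named exactly `<prefix>.tif`. Both helper names are hypothetical and not part of the notebook.

```python
def build_prefix(ind, date_str, source_name):
    """Mirror the fileNamePrefix pattern used by export_tif_image."""
    return f"{ind}-{date_str}-{source_name}"

def parse_prefix(file_name):
    """Split '<index>-<date>-<source>.tif' back into its three parts."""
    stem = file_name.rsplit(".", 1)[0]
    ind, date_str, source_name = stem.split("-", 2)
    return int(ind), date_str, source_name

name = build_prefix(1740, "20240401", "sentinel") + ".tif"
print(name)
print(parse_prefix(name))
```

This only works because `date_str` has its dashes removed before export. A date such as `2024-04-01` inside the prefix would break the split.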


In [None]:
# the export window depends on the shape size: shapes under 4 ha get a fixed 5 ha window, larger shapes get a window twice their area
def export_tif_image_fixedsize(ind,feature,image,date_str='YYYYMM',source_name="Source",folder_name="tif_file",shape_size=1):
    """
    Export a TIF image from GEE (both NICFI and Sentinel) to Google Drive

    Args:
        ind(int): index of the shape in the shapefile
        feature(ee.Feature): shape of the place to export
        image(ee.Image): image to export
        date_str(str): image date, formatted YYYYMMDD or YYYYMM
        source_name(str): source name.
        folder_name(str): folder name.
        shape_size(float): area of the shape in hectares, default 1
    """
    # Convert hectares to square meters (1 hectare = 10,000 square meters).
    if shape_size < 4:
        exportSizeSqMeters = 5 * 10000
    else:
        exportSizeSqMeters = shape_size * 10000 *2


    # Get the centroid of the feature.
    centroid = feature.geometry().centroid()

    # Create a square bounding box of the chosen area, centered on the centroid.
    halfSideLength = (exportSizeSqMeters ** 0.5) / 2
    exportRegion = centroid.buffer(halfSideLength).bounds()

    # Start the export task.
    task = ee.batch.Export.image.toDrive(
        image=image.clip(exportRegion),
        description=f"export_{ind}",
        folder=folder_name,
        region=exportRegion,
        scale=10,
        crs='EPSG:4326',
        maxPixels=1e13,
        fileNamePrefix=f"{ind}-{date_str}-{source_name}"
    )
    task.start()
    print(f"Export task started for index {ind} with date {date_str}")

print("Current Time:", time.strftime("%Y-%m-%d %H:%M:%S", time.localtime()))

Current Time: 2024-07-23 01:58:06
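
The sizing rule in `export_tif_image_fixedsize` can be checked with plain arithmetic before submitting any task. The sketch below repeats the same rule outside Earth Engine. The helper name is hypothetical, and the pixel counts are only rough because the export uses `EPSG:4326` with a nominal 10 m scale.

```python
import math

def export_side_length_m(shape_size_ha):
    """Side length (m) of the square export region for a shape of the given area."""
    # Shapes under 4 ha get a fixed 5 ha window; larger ones get twice their area.
    if shape_size_ha < 4:
        area_m2 = 5 * 10_000
    else:
        area_m2 = shape_size_ha * 10_000 * 2
    return math.sqrt(area_m2)

for ha in (1, 4, 8):
    side = export_side_length_m(ha)
    print(f"{ha} ha -> {side:.1f} m side, ~{side / 10:.0f} px at 10 m scale")
```

Buffering the centroid by half of this side length and taking `.bounds()` gives a square with this side, which is what the function above does.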


### Load the shape file and filter by size of shape

In [None]:
# load the shape file check the shapefile

shapefile_table=ee.FeatureCollection('projects/ee-qinheyi/assets/1823_ADRSM')
print('Count of table item:',shapefile_table.size().getInfo())

# filter function 1 based on the size of the shape

size_field_name='AREA_HA'
size_filter=ee.Filter.greaterThan(size_field_name,shapefile_size)
filtered_shape=shapefile_table.filter(size_filter)
print("size of filtering", size_filter.getInfo())
export_shape_count=filtered_shape.size().getInfo()
print('Count of table item after filtering',export_shape_count)  # reuse the count instead of a second getInfo() call

# get the index list of filtered shapes
index_list=filtered_shape.aggregate_array('Index').getInfo()

Count of table item: 1823
size of filtering {'type': 'Filter.lt', 'leftValue': 0.1, 'rightField': 'AREA_HA'}
Count of table item after filtering 1206


## 2. GEE: export TIFs from the NICFI source

1.   This notebook exports TIF image files from NICFI.
2.   Export rule: area size > 10 ha, based on the NICFI natural color imagery; the format is GeoTIFF.



### MAIN

In [None]:
# config for exporting

count_per_loc=50 # number of most recent monthly images to export for each location shape
scale_ratio=1.5 # scale factor applied to the shape extent
is_outline=False


drive_folder_name='nicfi_tif_4b'


In [None]:
# Load the NICFI collection and check the date of the newest image

nicfi = ee.ImageCollection('projects/planet-nicfi/assets/basemaps/americas')

def getDateRange(imageCollection:ee.ImageCollection):
    dateRange=imageCollection.aggregate_array('system:time_end')
    # format every timestamp server-side so a single getInfo() call is enough
    date_list=dateRange.map(lambda date: ee.Date(date).format()).getInfo()
    return date_list

nicfi_date_list=getDateRange(nicfi)
print("Leng",len(nicfi_date_list),nicfi_date_list)

Leng 54 ['2016-06-01T00:00:00', '2016-12-01T00:00:00', '2017-06-01T00:00:00', '2017-12-01T00:00:00', '2018-06-01T00:00:00', '2018-12-01T00:00:00', '2019-06-01T00:00:00', '2019-12-01T00:00:00', '2020-06-01T00:00:00', '2020-09-01T00:00:00', '2020-10-01T00:00:00', '2020-11-01T00:00:00', '2020-12-01T00:00:00', '2021-01-01T00:00:00', '2021-02-01T00:00:00', '2021-03-01T00:00:00', '2021-04-01T00:00:00', '2021-05-01T00:00:00', '2021-06-01T00:00:00', '2021-07-01T00:00:00', '2021-08-01T00:00:00', '2021-09-01T00:00:00', '2021-10-01T00:00:00', '2021-11-01T00:00:00', '2021-12-01T00:00:00', '2022-01-01T00:00:00', '2022-02-01T00:00:00', '2022-03-01T00:00:00', '2022-04-01T00:00:00', '2022-05-01T00:00:00', '2022-06-01T00:00:00', '2022-07-01T00:00:00', '2022-08-01T00:00:00', '2022-09-01T00:00:00', '2022-10-01T00:00:00', '2022-11-01T00:00:00', '2022-12-01T00:00:00', '2023-01-01T00:00:00', '2023-02-01T00:00:00', '2023-03-01T00:00:00', '2023-04-01T00:00:00', '2023-05-01T00:00:00', '2023-06-01T00:00:00', '2

In [None]:
## iterate the filtered features
# get the geometry collection and the corresponding index

def extract_geometry(feature):
    return ee.Feature(None, {'geometry': feature.geometry()})

geometry_collection = filtered_shape.map(extract_geometry)
print(geometry_collection.getInfo())

In [None]:
# define an export function that is used repeatedly

# default image
nicfi_collection_image=nicfi.filterDate(nicfi_date_list[0],nicfi_date_list[-1]).sort('system:time_start', False).first()
print(ee.Date( nicfi_collection_image.date()).format().getInfo())

# print the config information
print(nicfi_collection_image.bandNames().getInfo())
print("count_per_loc:",count_per_loc,"scale_ratio:",scale_ratio,"is_outline:",is_outline)
# shapefile_size is the size threshold defined in the shared config
print("filter_size:",shapefile_size,"size_field_name:",size_field_name,"drive_folder_name:",drive_folder_name)

# TODO: scaling is still pending; `ratio` is not applied yet (the buffer is hard-coded to 10000 m)
def scale_polygon_area(geome,ratio):
    buffer_distance=geome.area().multiply(2).sqrt().divide(ratio)
    result_geometry=geome.buffer( 10000)
    print(result_geometry.area().getInfo())
    print(geome.area().getInfo())
    print(type(geome))
    print(type(result_geometry))
    return result_geometry




In [None]:
# iterate the time list of the nicfi_date_list
# and iterate the shape



# TODO: the export call inside the loop is still pending
processing_count=0
for t in range(len(nicfi_date_list)-1,len(nicfi_date_list)-count_per_loc-1,-1):
    processing_count+=1
    print("Processing:",processing_count,nicfi_date_list[t-1][0:7],"--",nicfi_date_list[t][0:7])
    export_nicfi_basemap=nicfi.filterDate(nicfi_date_list[t-1],nicfi_date_list[t]).sort('system:time_start', False).first()
    for index in index_list:
        select_shape=shapefile_table.filter(ee.Filter.eq('Index',index))
        # TODO export_nicfi_image(index,select_shape.first(),export_nicfi_basemap,nicfi_date_list[t][0:7] )  pending




### TEST

In [None]:
# filter function 2: select a shape by a specific index
# this is for testing only and selects a single location

# Run this after loading the shapefile
test_field_name='Index'
test_field_value=1740
test_select=ee.Filter.eq(test_field_name,test_field_value)
test_select_shape=shapefile_table.filter(test_select)
print("index",test_select_shape.first().get('Index').getInfo())



print('Count of table item:',test_select_shape.size().getInfo())
# TODO export_nicfi_image(test_select_shape.first()) pending



# get the geometry of this selected shape
test_geometry=test_select_shape.geometry()
print(test_geometry.getInfo())
print(test_geometry.area().getInfo())

index 1740
Count of table item: 1
Export task start:1740 2024-05-03 08:34:03
{'type': 'Polygon', 'coordinates': [[[-80.84707833151006, -5.746746782807356], [-80.84676616394673, -5.747246197879137], [-80.8464050075316, -5.746933993167123], [-80.84625339905642, -5.746679824048988], [-80.84584316788882, -5.7466753836964575], [-80.84558006814474, -5.746720028631912], [-80.84563800928616, -5.746960774541778], [-80.84568260535481, -5.746974125001984], [-80.84597687121577, -5.747009839682905], [-80.84583421491011, -5.747228295317482], [-80.84551317394687, -5.747433477682885], [-80.84525902616313, -5.747384421024927], [-80.84499144796432, -5.747736673204591], [-80.84482201509482, -5.747446853788952], [-80.8446123773197, -5.747558310619363], [-80.844434074977, -5.7476430150083075], [-80.8441977456328, -5.74733980539183], [-80.84382762443725, -5.747281813830131], [-80.8436760110852, -5.747433455974853], [-80.8438632683146, -5.747750022181812], [-80.84347532355517, -5.748062198781131], [-80.84330

In [None]:
# check the exporting function
ee.data.listOperations()

## 3. GEE: export TIFs from Sentinel

Usage:
- Set the parameters
- Load the data
- Run the export

Supports exporting three time slots per month, split at the 1st, 10th, and 20th (for example 1/1-1/10, 1/10-1/20, and 1/20 to the end of the month).

- Handling the export limit: the Google Earth Engine export task queue holds at most 3000 tasks.

`EEException: Too many tasks already in the queue (3000). Please wait for some of them to complete.`
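
One way to stay under this limit is to poll the number of active tasks and pause submission while it is at or above a threshold. The sketch below shows only the waiting logic, with simulated counts so it runs without Earth Engine. `wait_for_capacity` is a hypothetical helper. In this notebook, `count_active` could be a function such as `check_task_status`, defined further down.

```python
import time

def wait_for_capacity(count_active, max_active, poll_seconds=600, sleep=time.sleep):
    """Block until count_active() drops below max_active; return the number of waits."""
    waits = 0
    while count_active() >= max_active:
        sleep(poll_seconds)
        waits += 1
    return waits

# Simulated queue sizes standing in for real task counts.
fake_counts = iter([3000, 2800, 2400])
waits = wait_for_capacity(lambda: next(fake_counts), max_active=2500,
                          poll_seconds=0, sleep=lambda s: None)
print(waits)
```

Keeping the threshold below 3000 (for example 2500) leaves headroom for tasks submitted between two checks.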

In [None]:

# band info: https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR_HARMONIZED#bands
# B2 B3 B4 : B G R
# B8: NIR
# B11 : SWIR 1, shortwave infrared, wavelength 1613.7 nm (S2A) / 1610.4 nm (S2B)
# B12 : SWIR 2, shortwave infrared, wavelength 2202.4 nm (S2A) / 2185.7 nm (S2B)

### Load sentinel data source

In [None]:
# load sentinel dataset ready


s2_dataset=ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED')

one_image=ee.Image( s2_dataset.filterDate(start_date,end_date).first())
print(one_image.bandNames().getInfo())
print(one_image.date().format("YYYY-MM-dd").getInfo())


['B1', 'B2', 'B3', 'B4', 'B5', 'B6', 'B7', 'B8', 'B8A', 'B9', 'B11', 'B12', 'AOT', 'WVP', 'SCL', 'TCI_R', 'TCI_G', 'TCI_B', 'MSK_CLDPRB', 'MSK_SNWPRB', 'QA10', 'QA20', 'QA60']
2023-08-01


### Generate the date list

In [None]:
# Date list ready

from datetime import datetime, timedelta

# Define a function that returns the date list for three time slots per month.
# The list holds three boundary dates per month (1st, 10th, 20th), from the month of start_date up to end_date.
def generate_dates(start_date, end_date):
    # Convert string dates to datetime objects
    start = datetime.strptime(start_date, "%Y-%m-%d")
    end = datetime.strptime(end_date, "%Y-%m-%d")

    # Generate a list to store the dates
    dates = []

    # Loop through each month from start to end
    current = start
    while current <= end:
        # Append the 1st, 10th, and 20th of each month to the list
        for day in [1, 10, 20]:
            date = datetime(current.year, current.month, day)
            if date <= end:
                dates.append(date.strftime("%Y-%m-%d"))

        # Move to the next month
        if current.month == 12:
            current = datetime(current.year + 1, 1, 1)
        else:
            current = datetime(current.year, current.month + 1, 1)

    return dates


export_dates=generate_dates(start_date,end_date)
print(export_dates)
print(len(export_dates))

['2023-08-01', '2023-08-10', '2023-08-20', '2023-09-01']
4
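
The boundary dates printed above are consumed pairwise: each date is a slot start and the next date is that slot's end, so N boundaries give N - 1 export intervals. A minimal sketch of the pairing, using the printed list as hard-coded input. The variable names here are illustrative only.

```python
boundaries = ['2023-08-01', '2023-08-10', '2023-08-20', '2023-09-01']

# Pair each boundary with the next one; N boundaries give N - 1 intervals.
intervals = list(zip(boundaries, boundaries[1:]))
for start, end in intervals:
    print(f"{start} <= t < {end}")

shape_count = 1206  # filtered shape count reported earlier in this notebook
print("export tasks:", shape_count * len(intervals))
```

`filterDate` treats the start as inclusive and the end as exclusive, so consecutive intervals do not overlap.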


### Dealing with the limitation of the GEE task queue

In [None]:
# check the task list
max_runing_tasks=2500
task_check_interval = 600  # Time to wait (in seconds) between task checks, sleep 10 mins to check next loop

def check_task_status():
    tasks = ee.batch.Task.list()
    running_tasks = [task for task in tasks if task.state in ['READY', 'RUNNING']]
    return len(running_tasks)

def check_complete_task():
    tasks = ee.batch.Task.list()
    completed_tasks = [task for task in tasks if task.state in ['COMPLETED']]
    return len(completed_tasks)

### Execute the export

In [None]:
# iterate over consecutive (slot start, slot end) pairs
import sys
import time
time_index=0
export_count=0
total_count=export_shape_count*(len(export_dates)-1) # one task per shape per interval, e.g. 3 intervals * 1206 dumpsites = 3,618
for i in range(len(export_dates)-1):
    time_index+=1
    sys.stdout.write(f"\r{time_index}, {export_dates[i]},{export_dates[i+1]} ")
    sys.stdout.flush()
    for index in index_list:
        select_shape=shapefile_table.filter(ee.Filter.eq('Index',index))
        #print(export_dates[i],export_dates[i+1])
        export_s2_image=s2_dataset.filterDate(export_dates[i],export_dates[i+1]).filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE',20)).mean()  # only use mean rather than first()

        # wait while the number of queued or running tasks is at or above the limit

        while check_task_status() >= max_runing_tasks:
            print(f"Waiting for tasks to complete... running tasks: {check_task_status()}")
            time.sleep(task_check_interval)


        export_tif_image(index,select_shape.first(),export_s2_image,export_dates[i].replace('-',''),"sentinel",sentinel_folder_name)
        export_count+=1
        print(f"exporting task submission progress:{export_count} / {total_count}")



In [None]:

tasks = ee.batch.Task.list()
print(check_task_status())
print(check_complete_task())

COMPLETED
0
7988
0


### Test one image from the sentinel



In [None]:

test_feature_sentinel=shapefile_table.filter(ee.Filter.eq('Index',1740)).first()
print(test_feature_sentinel.getInfo())

test_sentinel_image=s2_dataset.filterDate("2024-04-01","2024-04-30" ).filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE',20)).mean()  # only use mean rather than first()

print(test_sentinel_image.bandTypes().getInfo())



{'type': 'Feature', 'geometry': {'type': 'Polygon', 'coordinates': [[[-80.84707833151006, -5.746746782807356], [-80.84676616394673, -5.747246197879137], [-80.8464050075316, -5.746933993167123], [-80.84625339905642, -5.746679824048988], [-80.84584316788882, -5.7466753836964575], [-80.84558006814474, -5.746720028631912], [-80.84563800928616, -5.746960774541778], [-80.84568260535481, -5.746974125001984], [-80.84597687121577, -5.747009839682905], [-80.84583421491011, -5.747228295317482], [-80.84551317394687, -5.747433477682885], [-80.84525902616313, -5.747384421024927], [-80.84499144796432, -5.747736673204591], [-80.84482201509482, -5.747446853788952], [-80.8446123773197, -5.747558310619363], [-80.844434074977, -5.7476430150083075], [-80.8441977456328, -5.74733980539183], [-80.84382762443725, -5.747281813830131], [-80.8436760110852, -5.747433455974853], [-80.8438632683146, -5.747750022181812], [-80.84347532355517, -5.748062198781131], [-80.84330141084888, -5.747879342492197], [-80.84308298

In [None]:
## export the sentinel test image
# use the test feature and test image defined above instead of leftover loop variables

export_tif_image(1740,test_feature_sentinel,test_sentinel_image,"202404","sentinel",sentinel_folder_name)


['B2', 'B3', 'B4', 'B8', 'B11', 'B12']
Export task start:2  202405


In [None]:
# check the exporting status
ee.data.listOperations()

In [None]:
tasks = ee.batch.Task.list()
print(len(tasks))
print(tasks)

7988
[<Task XCRGSYQ7FXO7YBTBZZVCXDSG EXPORT_IMAGE: export_1267 (COMPLETED)>, <Task UFONNKBHMN3C5GWVWCAN346H EXPORT_IMAGE: export_1094 (COMPLETED)>, <Task PY5DXVHCMQ274EO6K7M5JLLZ EXPORT_IMAGE: export_1345 (COMPLETED)>, <Task A6ZLCM7YJ4IQTIMRJUDTPGGY EXPORT_IMAGE: export_1096 (COMPLETED)>, <Task ILAVHB7XTE43M5ZKRFAD2GWE EXPORT_IMAGE: export_30 (COMPLETED)>, <Task VSSQ7VZOW3PP7CRX4UPGSVMI EXPORT_IMAGE: export_1284 (COMPLETED)>, <Task K63KL2OYALUFVNQNZVNGMHJZ EXPORT_IMAGE: export_1152 (COMPLETED)>, <Task ESCXKNT7IDCM53EHIIH2QFAQ EXPORT_IMAGE: export_841 (COMPLETED)>, <Task JH6GKHUQNQBK4NLKFGEFT7SY EXPORT_IMAGE: export_1308 (COMPLETED)>, <Task RQN3SJXAAMPDA6ATEUJW2DY5 EXPORT_IMAGE: export_1293 (COMPLETED)>, <Task ZYZ4VBOA4GD5GPQ2D47G2RVE EXPORT_IMAGE: export_1275 (COMPLETED)>, <Task E3S6VNGFXWKUFPKU3LVYSYBO EXPORT_IMAGE: export_1248 (COMPLETED)>, <Task WJISLEON645VYDZOJNJMZEVL EXPORT_IMAGE: export_1277 (COMPLETED)>, <Task LKHH3OBYTN7MGGMKVVKZFOD2 EXPORT_IMAGE: export_1112 (COMPLETED)>, <Ta

In [None]:
# test: display the image on a map

import geemap
m=geemap.Map()  # named `m` to avoid shadowing the built-in `map`
vis_params = {
    'min': 0,
    'max': 3000,  # why do the samples use 0.3?
    'bands': ['B4', 'B3', 'B2']
}

m.setCenter(83.277, 17.7009, 12)
m.addLayer(test_sentinel_image, vis_params, 'Sentinel-2')
m  # display the map as the cell output
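
On the `max` value of 3000 in the visualization parameters: the Earth Engine catalog lists a scale factor of 0.0001 for the optical bands of `COPERNICUS/S2_SR_HARMONIZED`, so a stored value of 3000 corresponds to a reflectance of about 0.3. Samples that use `max: 0.3` have presumably rescaled the image first. A small arithmetic sketch; the helper name is hypothetical.

```python
S2_SR_SCALE = 0.0001  # catalog scale factor for the optical bands

def to_reflectance(stored_value):
    """Convert a stored Sentinel-2 SR band value to unitless reflectance."""
    return stored_value * S2_SR_SCALE

print(round(to_reflectance(3000), 4))
```

Since the image here is not rescaled, `max: 3000` is the matching choice for the visualization.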

