<a href="https://colab.research.google.com/github/aitorvv96/cajon_desastre/blob/master/Python/get_GEE_images.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **DOWNLOAD MDT, SENTINEL-1 AND SENTINEL-2 IMAGES FROM GEE**

This script has 4 main parts:

- The first part imports the libraries, sets the variables and loads (from Drive) the coordinates of the trees whose images will be downloaded. This **first part must always be run**.
- The following parts (MDT, Sentinel-1 and Sentinel-2) can be run independently, depending on which product the user wants to download.

**PART 1**

Import libraries, set variables and obtain position tree data.

In [None]:
# Uncomment and run this line if the Earth Engine library is not installed yet
#!pip install earthengine-api

In [None]:
## Import libraries
import time
import pandas as pd
import ee
import math
import random

In [None]:
# Authenticate to the Earth Engine servers; you will be asked to grant
# Earth Engine access to your Google account:
ee.Authenticate()

# Initialize the API:
ee.Initialize()

In [None]:
# Set your drive account
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Collect the paths of the csv files stored in a Drive folder

from os import listdir
from os.path import isfile, join

mypath = '/content/drive/MyDrive/SGE-tree_data/'  # set main path
onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]  

csv_list = list()
for file in onlyfiles:  # extract all paths
  csv_list.append(mypath + file)
  
df_list = list(csv_list)  # real copy of the list, so the original is preserved

#for file in csv_list:  # read all files
#  df = pd.read_csv(file)
#  print(df)
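
The listing above collects every file in the folder, whatever its extension. A minimal sketch of a stricter variant that keeps only `.csv` files is given below; `list_csv_files` is a hypothetical helper that is not part of the original notebook, and the demo uses a temporary folder instead of the Drive path.

```python
from pathlib import Path


def list_csv_files(folder):
    """Return the sorted paths of the .csv files directly inside `folder`."""
    folder = Path(folder)
    return sorted(
        str(p) for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".csv"
    )


if __name__ == "__main__":
    import tempfile

    # Build a throwaway folder with two CSV files and one unrelated file
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("plot_b.csv", "plot_a.csv", "notes.txt"):
            (Path(tmp) / name).write_text("TREE_ID,X_WGS84,Y_WGS84\n")
        print([Path(p).name for p in list_csv_files(tmp)])
```

In the notebook, the call would presumably be `csv_list = list_csv_files(mypath)`.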

In [None]:
### Output folder on Drive for the downloaded images
# The folder will be created automatically if it does not exist
# GEE only allows saving to a top-level folder, for example: /folder1 works, /folder1/folder2 does not work
Prefix_original_points_folder_name = 'Original_'
Prefix_random_points_folder_name = 'Random_'

### Random points properties
number_of_creations = 4
output_csv_base_name = 'Random_points_'
maximum_distance = 75 # meters
buffer_distance = 100 # buffer radius in meters, so the exported area is 200 meters across
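
As a rough guide to what these settings produce, the sketch below estimates how many pixels span the 200 m buffer at each export scale used later in the notebook. `approx_pixels_across` is a hypothetical helper, and the figures are nominal: the exported GeoTIFF may differ by a pixel or so because of grid alignment and the EPSG:4326 output projection.

```python
import math


def approx_pixels_across(buffer_m, scale_m):
    """Approximate number of pixels spanning the buffer diameter."""
    return math.ceil(2 * buffer_m / scale_m)


buffer_distance = 100  # metres, same value as in the settings cell
for product, scale in (("SRTM at 30 m", 30),
                       ("Sentinel-1/2 at 10 m", 10),
                       ("Sentinel-2 at 20 m", 20)):
    n = approx_pixels_across(buffer_distance, scale)
    print(f"{product}: about {n} x {n} pixels")
```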

Now, let's run a common function.

In [None]:
#### (***Random points***)
# This function takes a dataframe with a list of coordinates, moves each
# original point in a random direction by a random distance below a
# pre-defined maximum (distance, in meters) and returns a new dataframe
def generate_random_point_from_csv(df, distance):

    new_df = df.copy(deep=True)

    # Drop the original coordinate columns; they are written again below with the displaced values
    new_df = new_df.drop(columns=['Y_WGS84', 'X_WGS84'])

    R = 6378.1 # Radius of the Earth in km

    for i in range(len(df['X_WGS84'])):
        brng = math.radians(random.randrange(360))
        d = float(random.randrange(distance))/1000.

        lat1 = math.radians(df['Y_WGS84'][i]) #Current lat point converted to radians
        lon1 = math.radians(df['X_WGS84'][i]) #Current long point converted to radians

        lat2 = math.asin( math.sin(lat1)*math.cos(d/R) + math.cos(lat1)*math.sin(d/R)*math.cos(brng))
        lon2 = lon1 + math.atan2(math.sin(brng)*math.sin(d/R)*math.cos(lat1),
                math.cos(d/R)-math.sin(lat1)*math.sin(lat2))

        lat2 = math.degrees(lat2)
        lon2 = math.degrees(lon2)

        new_df.loc[i,'Y_WGS84'] = lat2
        new_df.loc[i,'X_WGS84'] = lon2
    
    return new_df
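
The displacement above uses the standard great-circle "destination point" formula. One way to sanity-check it is to measure the distance back to the original point with the haversine formula. The sketch below does that in plain Python; `destination_point` and `haversine_m` are hypothetical helper names, and the tree position is a made-up example.

```python
import math
import random

EARTH_RADIUS_KM = 6378.1  # same value as R in generate_random_point_from_csv


def destination_point(lat_deg, lon_deg, bearing_deg, distance_m):
    """Move a point along a great circle (same formula as the function above)."""
    delta = (distance_m / 1000.0) / EARTH_RADIUS_KM  # angular distance in radians
    lat1, lon1 = math.radians(lat_deg), math.radians(lon_deg)
    brng = math.radians(bearing_deg)
    lat2 = math.asin(math.sin(lat1) * math.cos(delta)
                     + math.cos(lat1) * math.sin(delta) * math.cos(brng))
    lon2 = lon1 + math.atan2(math.sin(brng) * math.sin(delta) * math.cos(lat1),
                             math.cos(delta) - math.sin(lat1) * math.sin(lat2))
    return math.degrees(lat2), math.degrees(lon2)


def haversine_m(lat_a, lon_a, lat_b, lon_b):
    """Great-circle distance in metres between two lat/lon points."""
    phi_a, phi_b = math.radians(lat_a), math.radians(lat_b)
    dphi = phi_b - phi_a
    dlmb = math.radians(lon_b - lon_a)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi_a) * math.cos(phi_b) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * 1000.0 * math.asin(math.sqrt(h))


# hypothetical tree position, for illustration only
lat0, lon0 = 41.65, -4.72
maximum_distance = 75  # metres
for _ in range(5):
    bearing = random.randrange(360)
    dist = random.randrange(maximum_distance)
    lat, lon = destination_point(lat0, lon0, bearing, dist)
    # requested distance next to the distance measured back to the origin
    print(bearing, dist, round(haversine_m(lat0, lon0, lat, lon), 2))
```

Because `random.randrange(distance)` draws whole meters uniformly along the radius, the displaced points are denser near the original tree than a draw that is uniform over the circle's area would give.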

**PART 2**

Code to download MDT images.
This code downloads the images for every dataframe in the list built in the first part of the script.

In [None]:
'''SCRIPT FOCUSED ONLY ON MDT FILES - A buffer of 100 m is used to obtain an image of about 200 m x 200 m,
and random points are displaced by less than 75 m'''

#### (***Elevation data***)
# This function loads the dataset and exports the buffer around the given coordinates
def get_MDT_data(u_lon, u_lat, folder_name,  num):
  elv = ee.Image('USGS/SRTMGL1_003') #NASA SRTM Digital Elevation 30m - Earth Engine Snippet
  u_poi = ee.Geometry.Point(u_lon, u_lat) #Singular tree point is defined here
  roi = u_poi.buffer(buffer_distance) # 100 m radius buffer, i.e. 200 m in diameter
  task = ee.batch.Export.image.toDrive(image=elv,#The image is exported to drive
                                      folder=folder_name,
                                      description=str(num)+ '_' + folder_name,
                                      scale=30,
                                      region=roi,
                                      fileNamePrefix=str(num)+ '_' + folder_name,
                                      crs='EPSG:4326',
                                      fileFormat='GeoTIFF')

  print('MDT images from tree', num, 'will be downloaded.')

  task.start()

In [None]:
# Function to download MDT images
def download_MDT_images(df, folder_name):
    X_WGS84 = df['X_WGS84']
    Y_WGS84 = df['Y_WGS84']
    num = df['TREE_ID']
    iteration = len(X_WGS84)
    ele_folder = 'MDT_30m_' + folder_name
    for i in range(iteration):
      get_MDT_data(X_WGS84[i], Y_WGS84[i], ele_folder, num[i])

In [None]:
# This code downloads every df in the list without exceeding the
# maximum number of tasks in the GEE queue (3000)

for df in df_list:  # for each df on the list...

  my_df = pd.read_csv(df)

  # download original image
  download_MDT_images(my_df, Prefix_original_points_folder_name)  

  # download random images
  for n in range(number_of_creations):
      new_df = generate_random_point_from_csv(my_df, maximum_distance)
      csv_name = output_csv_base_name + str(n + 1) + '.csv'
      new_df.to_csv(csv_name) #new csv files will be saved in working directory
      download_MDT_images(new_df, Prefix_random_points_folder_name + str(n + 1))

  # Wait until the most recent task reaches a terminal state. Checking only for
  # 'COMPLETED' would loop forever if that task fails or is cancelled.
  task_list = ee.data.getTaskList()  # tasks list must be updated with each df

  while task_list[0]['state'] not in ('COMPLETED', 'FAILED', 'CANCELLED'):

    time.sleep(60)  # wait some time to download the images
    task_list = ee.data.getTaskList()  # tasks list must be updated on each try

  print('-------------------')
  print('df finished, last task state:', task_list[0]['state'])
  print('-------------------')

print('ALL THE MDT DOWNLOADS HAVE FINISHED')
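
The waiting logic above inspects only the first entry of the task list. A slightly more defensive sketch is shown below: it polls until every task is in a terminal state and reports how many ended in each state. `wait_for_tasks` is a hypothetical helper; it takes the task-listing function as an argument (in the notebook that would be `ee.data.getTaskList`), so the demo runs with a fake task list and no Earth Engine connection.

```python
import time

TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED"}


def wait_for_tasks(fetch_tasks, poll_seconds=60, sleep=time.sleep):
    """Poll until every task returned by `fetch_tasks` is in a terminal state.

    `fetch_tasks` is any callable returning a list of dicts with a 'state' key.
    Returns a dict mapping each state to the number of tasks in that state.
    """
    while True:
        tasks = fetch_tasks()
        if all(t["state"] in TERMINAL_STATES for t in tasks):
            break
        sleep(poll_seconds)
    counts = {}
    for t in tasks:
        counts[t["state"]] = counts.get(t["state"], 0) + 1
    return counts


# Demo with a fake sequence of task lists (no Earth Engine needed)
_snapshots = iter([
    [{"state": "RUNNING"}, {"state": "READY"}],
    [{"state": "COMPLETED"}, {"state": "RUNNING"}],
    [{"state": "COMPLETED"}, {"state": "FAILED"}],
])
print(wait_for_tasks(lambda: next(_snapshots), poll_seconds=0))
```

Returning the per-state counts makes failed exports visible instead of silently treating the batch as downloaded.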

**PART 3**

Code to download Sentinel-1 images.
This code downloads the images for every dataframe in the list built in the first part of the script.

In [None]:
'''SCRIPT FOCUSED ONLY ON SENTINEL-1 FILES - A buffer of 100 m is used to obtain an image of about 200 m x 200 m,
and random points are displaced by less than 75 m'''

#### (***Sentinel 1***)
# This function loads the dataset and exports the buffer around the given coordinates

def get_Sentinel1_data(u_lon,u_lat, folder_name, num):
  sen1_folder = 'SGE-SingularTrees/Sen1_' + folder_name

  ffa_db = ee.Image(ee.ImageCollection('COPERNICUS/S1_GRD') 
                       .filterBounds(ee.Geometry.Point(u_lon,u_lat).buffer(buffer_distance)) 
                       .filterDate(ee.Date('2020-06-01'), ee.Date('2020-08-30')) 
                       .first())
  
  u_poi = ee.Geometry.Point(u_lon, u_lat)  # the singular tree point is defined here
  roi = u_poi.buffer(buffer_distance)  # 100 m radius buffer, i.e. 200 m in diameter
  
  # The following steps export the VV and VH images and an RGB composite built from them

  # Make an RGB color composite image (VV, VH, VV/VH).
  RGB = ee.Image.rgb(ffa_db.select('VV'),
                    ffa_db.select('VH'),
                    ffa_db.select('VV').divide(ffa_db.select('VH')))

  #Selection of the other two images
  VV = ee.Image(ffa_db.select('VV'))
  VH = ee.Image(ffa_db.select('VH'))

  #rgb
  task = ee.batch.Export.image.toDrive(image=RGB,
                                        folder=sen1_folder,
                                        description=str(num)+'_Sen1_RGB_10m_'+folder_name,
                                        scale=10,
                                        region=roi,
                                        fileNamePrefix=str(num)+'_Sen1_RGB_10m_'+folder_name,
                                        crs='EPSG:4326',
                                        fileFormat='GeoTIFF')
  
  print('RGB Sentinel-1 images from tree', num, 'will be downloaded.')
  
  task.start()

  #vv
  task = ee.batch.Export.image.toDrive(image=VV,
                                        folder=sen1_folder,
                                        description=str(num)+'_Sen1_VV_10m_'+folder_name,
                                        scale=10,
                                        region=roi,
                                        fileNamePrefix=str(num)+'_Sen1_VV_10m_'+folder_name,
                                        crs='EPSG:4326',
                                        fileFormat='GeoTIFF')
  print('VV Sentinel-1 images from tree', num, 'will be downloaded.')
  
  task.start()

  #vh
  task = ee.batch.Export.image.toDrive(image=VH,
                                        folder=sen1_folder,
                                        description=str(num)+'_Sen1_VH_10m_'+folder_name,
                                        scale=10,
                                        region=roi,
                                        fileNamePrefix=str(num)+'_Sen1_VH_10m_'+folder_name,
                                        crs='EPSG:4326',
                                        fileFormat='GeoTIFF')
  
  print('VH Sentinel-1 images from tree', num, 'will be downloaded.')
  
  task.start()
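
One point that may be worth checking in the RGB composite: in the Earth Engine catalogue the `COPERNICUS/S1_GRD` collection stores backscatter in decibels (linear values are in `COPERNICUS/S1_GRD_FLOAT`). Under that assumption, dividing the VV band by the VH band does not give the usual linear power ratio, which in dB corresponds to a difference. The sketch below illustrates this with made-up backscatter values; `db_to_linear` and `linear_to_db` are hypothetical helper names.

```python
import math


def db_to_linear(db):
    """Convert decibels to a linear power ratio."""
    return 10 ** (db / 10)


def linear_to_db(power):
    """Convert a linear power ratio to decibels."""
    return 10 * math.log10(power)


# made-up backscatter values in dB, for illustration only
vv_db, vh_db = -8.0, -15.0

ratio_linear = db_to_linear(vv_db) / db_to_linear(vh_db)
print("linear VV/VH ratio:", round(ratio_linear, 3))
print("same ratio in dB:  ", round(linear_to_db(ratio_linear), 3))
print("VV_dB - VH_dB:     ", vv_db - vh_db)
print("VV_dB / VH_dB:     ", round(vv_db / vh_db, 3))
```

If the linear ratio is what is intended, one option would be to build the third channel with `ffa_db.select('VV').subtract(ffa_db.select('VH'))`; whether that suits the downstream use of the images is a design decision for the author.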

In [None]:
# Function to download Sentinel-1 images
def download_S1_images(df, folder_name):
    X_WGS84 = df['X_WGS84']
    Y_WGS84 = df['Y_WGS84']
    num = df['TREE_ID']
    iteration = len(X_WGS84)
    for i in range(iteration):
      get_Sentinel1_data(X_WGS84[i], Y_WGS84[i], folder_name, num[i])

In [None]:
# This code downloads every df in the list without exceeding the
# maximum number of tasks in the GEE queue (3000)

for df in df_list:  # for each df on the list...

  my_df = pd.read_csv(df)

  # download original image
  download_S1_images(my_df, Prefix_original_points_folder_name)  

  # download random images
  for n in range(number_of_creations):
      new_df = generate_random_point_from_csv(my_df, maximum_distance)
      csv_name = output_csv_base_name + str(n + 1) + '.csv'
      new_df.to_csv(csv_name) #new csv files will be saved in working directory
      download_S1_images(new_df, Prefix_random_points_folder_name + str(n + 1))

  # Wait until the most recent task reaches a terminal state. Checking only for
  # 'COMPLETED' would loop forever if that task fails or is cancelled.
  task_list = ee.data.getTaskList()  # tasks list must be updated with each df

  while task_list[0]['state'] not in ('COMPLETED', 'FAILED', 'CANCELLED'):

    time.sleep(30)  # wait some time to download the images
    task_list = ee.data.getTaskList()  # tasks list must be updated on each try

  print('-------------------')
  print('df finished, last task state:', task_list[0]['state'])
  print('-------------------')

print('ALL THE SENTINEL-1 DOWNLOADS HAVE FINISHED')

RGB Sentinel-1 images from tree 10_2206_A_1_3 will be downloaded.
VV Sentinel-1 images from tree 10_2206_A_1_3 will be downloaded.
VH Sentinel-1 images from tree 10_2206_A_1_3 will be downloaded.
RGB Sentinel-1 images from tree 10_2797_A_1_2 will be downloaded.
VV Sentinel-1 images from tree 10_2797_A_1_2 will be downloaded.
VH Sentinel-1 images from tree 10_2797_A_1_2 will be downloaded.
RGB Sentinel-1 images from tree 11_270_A_1_7 will be downloaded.
VV Sentinel-1 images from tree 11_270_A_1_7 will be downloaded.
VH Sentinel-1 images from tree 11_270_A_1_7 will be downloaded.
RGB Sentinel-1 images from tree 11_270_A_1_8 will be downloaded.
VV Sentinel-1 images from tree 11_270_A_1_8 will be downloaded.
VH Sentinel-1 images from tree 11_270_A_1_8 will be downloaded.
RGB Sentinel-1 images from tree 11_4_A_1_2 will be downloaded.
VV Sentinel-1 images from tree 11_4_A_1_2 will be downloaded.
VH Sentinel-1 images from tree 11_4_A_1_2 will be downloaded.
RGB Sentinel-1 images from tree 11_

**PART 4**

Code to download Sentinel-2 images.
This code downloads the images for every dataframe in the list built in the first part of the script.

In [None]:
'''SCRIPT FOCUSED ONLY ON SENTINEL-2 FILES - A buffer of 100 m is used to obtain an image of about 200 m x 200 m,
and random points are displaced by less than 75 m'''

#### (***Sentinel 2***)
# This function loads the dataset and exports the buffer around the given coordinates
def get_Sentinel2_data(u_lon,u_lat, folder_name, num):
  list_10m=['B2','B3','B4','B8','B11','B12']  # exported at scale=10 (B11 and B12 are natively 20 m bands)
  list_20m=['B5','B6','B7','B8A']  # exported at scale=20
  sen2_folder = 'SGE-SingularTrees/Sen2_' + folder_name


  for i in list_10m:
    sen2 = ee.ImageCollection('COPERNICUS/S2_SR').filterDate("2020-06-01","2020-08-30").filterBounds(ee.Geometry.Point(u_lon,u_lat).buffer(buffer_distance))\
    .select(i).sort('CLOUDY_PIXEL_PERCENTAGE', False).mosaic() # descending sort + mosaic() is meant to put the least cloudy scene on top
    u_poi = ee.Geometry.Point(u_lon, u_lat) # the singular tree point is defined here
    roi = u_poi.buffer(buffer_distance) # 100 m radius buffer, i.e. 200 m in diameter
    task = ee.batch.Export.image.toDrive(image=sen2,
                                          folder=sen2_folder,
                                          description=str(num)+'_Sen2_'+i+'_10m_'+folder_name,
                                          scale=10,
                                          region=roi,
                                          fileNamePrefix=str(num)+'_Sen2_'+i+'_10m_'+folder_name,
                                          crs='EPSG:4326',
                                          fileFormat='GeoTIFF')
    
    print('10m Sentinel-2 images from tree', num, 'will be downloaded.')
    
    task.start()

  for i in list_20m:
    sen2 = ee.ImageCollection('COPERNICUS/S2_SR').filterDate("2020-06-01","2020-08-30").filterBounds(ee.Geometry.Point(u_lon,u_lat).buffer(buffer_distance))\
    .select(i).sort('CLOUDY_PIXEL_PERCENTAGE', False).mosaic() # descending sort + mosaic() is meant to put the least cloudy scene on top
    u_poi = ee.Geometry.Point(u_lon, u_lat) # the singular tree point is defined here
    roi = u_poi.buffer(buffer_distance) # 100 m radius buffer, i.e. 200 m in diameter
    task = ee.batch.Export.image.toDrive(image=sen2,
                                          folder=sen2_folder,
                                          description=str(num)+'_Sen2_'+i+'_20m_'+folder_name,
                                          scale=20,
                                          region=roi,
                                          fileNamePrefix=str(num)+'_Sen2_'+i+'_20m_'+folder_name,
                                          crs='EPSG:4326',
                                          fileFormat='GeoTIFF')
    
    print('20m Sentinel-2 images from tree', num, 'will be downloaded.')
    
    task.start()
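
The `sort(...).mosaic()` chain relies on `mosaic()` compositing the images in order, with later images drawn on top of earlier ones wherever they are not masked. The plain-Python sketch below imitates that behaviour on made-up three-pixel "images" (with `None` standing for a masked pixel) to show why a descending sort by cloud percentage leaves the least cloudy scene on top. `mosaic_last_on_top` is a hypothetical helper, not an Earth Engine call.

```python
def mosaic_last_on_top(images):
    """Per pixel, keep the value of the last image that is not masked (None)."""
    n_pixels = len(images[0]["pixels"])
    result = [None] * n_pixels
    for image in images:  # later images overwrite earlier ones
        for idx, value in enumerate(image["pixels"]):
            if value is not None:
                result[idx] = value
    return result


# made-up three-pixel images; None marks a masked pixel
collection = [
    {"cloud_pct": 5, "pixels": [0.10, None, 0.12]},
    {"cloud_pct": 60, "pixels": [0.90, 0.85, 0.95]},
    {"cloud_pct": 20, "pixels": [None, 0.30, 0.31]},
]

# descending sort, like .sort('CLOUDY_PIXEL_PERCENTAGE', False)
ordered = sorted(collection, key=lambda im: im["cloud_pct"], reverse=True)
print(mosaic_last_on_top(ordered))
```

Note that `CLOUDY_PIXEL_PERCENTAGE` is a per-scene property, so a cloudy pixel inside the least cloudy scene still ends up on top unless a cloud mask is applied first.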

In [None]:
# Function to download Sentinel-2 images
def download_S2_images(df, folder_name):
    X_WGS84 = df['X_WGS84']
    Y_WGS84 = df['Y_WGS84']
    num = df['TREE_ID']
    iteration = len(X_WGS84)
    for i in range(iteration):
      get_Sentinel2_data(X_WGS84[i], Y_WGS84[i], folder_name, num[i])
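
Each CSV is submitted in full before the script waits, so its size determines how many tasks are queued at once. Taking the 3000-task queue limit quoted in the comments of this notebook, the sketch below estimates how many trees one CSV can hold for each product. The helper names are hypothetical; the per-tree task counts are read off the `get_*_data` functions (1 for MDT, 3 for Sentinel-1, 6 + 4 bands for Sentinel-2).

```python
def tasks_per_csv(n_trees, tasks_per_tree, number_of_creations=4):
    """Export tasks started for one CSV: original points plus each random set."""
    datasets = 1 + number_of_creations
    return n_trees * tasks_per_tree * datasets


def max_trees_per_csv(tasks_per_tree, number_of_creations=4, queue_limit=3000):
    """Largest tree count for which one CSV stays within the queue limit."""
    return queue_limit // (tasks_per_tree * (1 + number_of_creations))


# tasks started per tree by each get_*_data function in this notebook
TASKS_PER_TREE = {"MDT": 1, "Sentinel-1": 3, "Sentinel-2": 10}

for product, per_tree in TASKS_PER_TREE.items():
    print(product, "-> at most", max_trees_per_csv(per_tree), "trees per CSV")
```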

In [None]:
# This code downloads every df in the list without exceeding the
# maximum number of tasks in the GEE queue (3000)

for df in df_list:  # for each df on the list...

  my_df = pd.read_csv(df)

  # download original image
  download_S2_images(my_df, Prefix_original_points_folder_name)  

  # download random images
  for n in range(number_of_creations):
      new_df = generate_random_point_from_csv(my_df, maximum_distance)
      csv_name = output_csv_base_name + str(n + 1) + '.csv'
      new_df.to_csv(csv_name) #new csv files will be saved in working directory
      download_S2_images(new_df, Prefix_random_points_folder_name + str(n + 1))

  # Wait until the most recent task reaches a terminal state. Checking only for
  # 'COMPLETED' would loop forever if that task fails or is cancelled.
  task_list = ee.data.getTaskList()  # tasks list must be updated with each df

  while task_list[0]['state'] not in ('COMPLETED', 'FAILED', 'CANCELLED'):

    time.sleep(30)  # wait some time to download the images
    task_list = ee.data.getTaskList()  # tasks list must be updated on each try

  print('-------------------')
  print('df finished, last task state:', task_list[0]['state'])
  print('-------------------')

print('ALL THE SENTINEL-2 DOWNLOADS HAVE FINISHED')

10m Sentinel-2 images from tree 10_2206_A_1_3 will be downloaded.
10m Sentinel-2 images from tree 10_2206_A_1_3 will be downloaded.
10m Sentinel-2 images from tree 10_2206_A_1_3 will be downloaded.
10m Sentinel-2 images from tree 10_2206_A_1_3 will be downloaded.
10m Sentinel-2 images from tree 10_2206_A_1_3 will be downloaded.
10m Sentinel-2 images from tree 10_2206_A_1_3 will be downloaded.
20m Sentinel-2 images from tree 10_2206_A_1_3 will be downloaded.
20m Sentinel-2 images from tree 10_2206_A_1_3 will be downloaded.
20m Sentinel-2 images from tree 10_2206_A_1_3 will be downloaded.
20m Sentinel-2 images from tree 10_2206_A_1_3 will be downloaded.
10m Sentinel-2 images from tree 10_2797_A_1_2 will be downloaded.
10m Sentinel-2 images from tree 10_2797_A_1_2 will be downloaded.
10m Sentinel-2 images from tree 10_2797_A_1_2 will be downloaded.
10m Sentinel-2 images from tree 10_2797_A_1_2 will be downloaded.
10m Sentinel-2 images from tree 10_2797_A_1_2 will be downloaded.
10m Sentin

EEException: ignored

SOME AUXILIARY FUNCTIONS AND RESOURCES:
* https://developers.google.com/earth-engine/guides


In [None]:
!earthengine task cancel all # cancel all the running downloads
ee.data.getTaskList()  # check task list