# <img src="https://github.com/fjmeyer/HydroSAR/raw/master/HydroSARbanner.jpg" width="100%" />

### <font size="7"> <b> Download HAND from GEE</b></font>

<font size="5">  Download raster tiles from HAND dataset on Google Earth Engine </font>

<br>
<font size="4"> <b> Part of NASA A.37 Project:</b> Integrating SAR Data for Improved Resilience and Response to Weather-Related Disasters   <br>
<font size="4"> <b> PI:</b>Franz J. Meyer <br>
<font size="3"> Version 0.1.1 - 2021/01/24 <br>
<b>Change Log</b><br>
See bottom of the notebook.<br>
</font> 
<font color='rgba(0,0,200,0.2)'> <b>Contact: </b> batuhan.osmanoglu@nasa.gov </font>


<hr>
<font face="Calibri">

<font size="5"> <b> 0. Importing Relevant Python Packages </b> </font>

<font size="3"> The first step in any notebook is to import the required Python libraries into the Jupyter environment. In this notebook we use the following libraries:
<ol type="1">
    <li> <b><a href="https://www.gdal.org/" target="_blank">GDAL</a></b> is a software library for reading and writing raster and vector geospatial data formats. It includes a collection of programs tailored for geospatial data processing. Most modern GIS systems (such as ArcGIS or QGIS) use GDAL in the background.</li>
    <li> <b><a href="http://www.numpy.org/" target="_blank">NumPy</a></b> is one of the principal packages for scientific applications of Python. It is intended for processing large multidimensional arrays and matrices, and an extensive collection of high-level mathematical functions and implemented methods makes it possible to perform various operations with these objects. </li> 
    <li> <b><a href="https://docs.python.org/3/library/urllib.html" target="_blank">urllib</a></b> is a standard-library package that collects several modules for working with URLs.</li>
    <li> <b><a href="https://docs.python.org/3/library/zipfile.html" target="_blank">zipfile</a></b> is a standard-library module that provides tools to create, read, write, append, and list ZIP files.</li>
    <li> <b><a href="https://docs.python.org/3/library/tempfile.html" target="_blank">tempfile</a></b> is a standard-library module that creates temporary files and directories.</li>
    <li> <b><a href="https://github.com/tqdm/tqdm" target="_blank"> tqdm </a></b> is a smart progress meter that allows easy addition of a loop counter.</li>
    <li> <b><a href="https://pypi.org/project/google-api-python-client/" target="_blank">googleapiclient</a></b> is the Python client library for Google's discovery based APIs. These client libraries are officially supported by Google. </li>
    <li> <b><a href="https://pypi.org/project/oauth2client/" target="_blank">oauth2client</a></b> is a client library for OAuth 2.0, which is used to access the user's Google Earth Engine account.</li>
    <li> <b><a href="https://pypi.org/project/earthengine-api/" target="_blank">earthengine-api</a></b> allows developers to interact with Google Earth Engine using the Python programming language.</li>
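
Before running the import cell, it can help to see which of these packages are absent from the environment. The sketch below is one possible check using only the standard library. `missing_packages` is a hypothetical helper that is not part of this notebook, and the module names in `required` are the import names assumed for the packages listed above.

```python
import importlib.util

def missing_packages(module_names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]

# Import names assumed for the packages listed above (illustrative list).
required = ['osgeo', 'numpy', 'tqdm', 'googleapiclient', 'oauth2client', 'ee']
print('Missing packages:', missing_packages(required))
```

An empty list means every module was found. Anything reported as missing would need to be installed before the notebook can run.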
    

In [None]:
#Setup Environment
import os
import numpy as np
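import urllib.request #urlretrieve lives in the urllib.request submodule, which must be imported explicitly
import zipfile
from osgeo import gdal #recent GDAL releases no longer support the bare `import gdal`
from osgeo import osr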
import pyproj
import glob
import tempfile
from tqdm.auto import tqdm
#The two lines below are for visually browsing and selecting the DEM. 
import ipywidgets as ui

from IPython.display import display
from IPython.core.debugger import set_trace #Enable if you like to debug and add set_trace() where you want debugger

try:
    import googleapiclient
except ImportError:
    #!pip install google-api-python-client
    os.system('pip install google-api-python-client')
    import googleapiclient

try:
    from oauth2client import crypt
except ImportError:
    #!pip install --upgrade oauth2client
    os.system('pip install --upgrade oauth2client')
    from oauth2client import crypt

try:
    import ee
except ImportError:
    #!pip install earthengine-api
    os.system('pip install earthengine-api')
    import ee


<font size="5"> <b> 1. Define convenience functions </b> </font>

<font size="3"> Here we define some functions for later convenience.
    
<ol type="1">
    <li> <b>build_vrt</b> generates a virtual raster file using GDAL. </li>
    <li> <b>bounding_box_to_string</b> given west, east, south, north bounds, return a string e.g. 'E084_N025_E085_N024' </li>
    <li> <b>coordinate_to_string</b> return string from given lat/lon coordinates. e.g. 'E084_N025' </li>
    <li> <b>reproject</b> reprojects a given vector file to another coordinate reference system (CRS). </li>
    <li> <b>download_tile_from_ge</b> download a single tile from Google Earth Engine, not to exceed 32MB in size.</li>
    <li> <b>download_from_ge</b> calculates number of tiles needed given an area, downloads and stitches.</li>    
    <li> <b>estimate_size</b> estimates tile size with SRTM-1arcsec format and sampling assumptions </li>     
    <li> <b>gdal_get_geotransform</b> returns the geotransform of the dataset using GDAL. </li>
    <li> <b>gdal_get_projection</b> returns the spatial reference system in wkt, proj4, or epsg formats.</li>
    <li> <b>gdal_get_WESN</b> returns rectangle bounding box coordinates: west, east, south and north </li>
    <li> <b>numel</b> returns number of elements for a wide range of data types. </li>  
    <li> <b>PathSelector</b> displays a file tree to easily browse and select a file. </li>
    <li> <b>transform_point</b> transforms a point coordinate to another coordinate reference system (CRS). </li>
    <li> <b>TqdmUpTo</b> tqdm subclass whose callback reports progress from the warp function. </li>
    <li> <b>warp</b> warps a raster file into a new coordinate system using GDAL. Can also be used to combine multiple tiles into a single file. </li>
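
The naming scheme behind `coordinate_to_string` and `bounding_box_to_string` can be sketched as follows. This is a standalone re-implementation for illustration only; the names `coordinate_name` and `bounding_box_name` are hypothetical and are not defined in this notebook. Each angle is scaled by `10**factor`, truncated to an integer, and zero-padded to `factor+3` digits, so the example strings in the list above correspond to `factor=0`.

```python
def coordinate_name(lat, lon, factor=0):
    """Name a point as '<E|W><lon>_<N|S><lat>', scaling angles by 10**factor."""
    width = factor + 3  # three integer digits plus `factor` decimal digits
    parts = []
    for angle, (positive, negative) in zip((lon, lat), ('EW', 'NS')):
        prefix = positive if angle >= 0 else negative
        value = int(abs(angle) * 10 ** factor)
        parts.append(f'{prefix}{value:0{width}d}')
    return '_'.join(parts)

def bounding_box_name(w, e, s, n, factor=0):
    """Join the names of the north-west and south-east corners."""
    return '_'.join([coordinate_name(n, w, factor), coordinate_name(s, e, factor)])

print(bounding_box_name(84, 85, 24, 25))            # E084_N025_E085_N024
print(bounding_box_name(84, 85, 24, 25, factor=3))  # E084000_N025000_E085000_N024000
```

A larger `factor` keeps more decimal places in the name, which helps keep tile file names distinct when tile edges fall on fractional degrees.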
    

In [None]:
#Define Constants and Functions

google_api_download_limit=33554432
def build_vrt(filename, input_file_list, targetAlignedPixels=True, separate=False, resampleAlg='near', resolution='highest'):
    vrt_options = gdal.BuildVRTOptions(resampleAlg=resampleAlg, resolution=resolution, separate=separate, targetAlignedPixels=targetAlignedPixels)
    ds=gdal.BuildVRT(filename,input_file_list,options=vrt_options)
    ds.FlushCache()

def bounding_box_to_string(w,e,s,n, factor=3):
    """Format a bounding box as a string built from its north-west and south-east corners:
    string = bounding_box_to_string(w, e, s, n)
    """
    s1=coordinate_to_string(n,w, factor=factor)
    s2=coordinate_to_string(s,e, factor=factor)
    return '_'.join([s1,s2])
    
def coordinate_to_string(lat,lon, factor=4):
    """Format a pair of angles lat, lon as a string:
    string = coordinate_to_string(lat, lon)
    """
    def format_parts():
        fmt = '{prefix}{value:0>'+str(factor+3)+'d}'
        for angle, directions in zip((lon,lat), ['EW', 'NS']):
            if angle >= 0:
                yield fmt.format(prefix=directions[0], value=int(angle*10**factor))
            else:
                yield fmt.format(prefix=directions[1], value=int(-angle*10**factor))
    return '_'.join(format_parts())

def download_tile_from_ge(W,E,S,N, collection_path='users/gena/global-hand/hand-100', prefix=None, fname=None, download_path=None, debug=False):
    if download_path is None:
        download_path = os.getcwd()
    if prefix is None:
        if fname is None:
            prefix='ge'            
    else: #prefix is not None
        if fname is not None: # and fname is not None
            print('When fname is specified prefix is ignored.')

    geom = ee.Geometry.Polygon( [[E, S], [W, S], [W, N], [E, N], [E, S]] )    
    try:  
        #GENA HAND dataset is an image collection
        hand = ee.ImageCollection(collection_path)
        hand_clip=ee.ImageCollection(hand.filterBounds(geom))
        hand_mosaic=hand_clip.reduce(ee.Reducer.mean());
        ge_path = hand_mosaic.getDownloadUrl({
            'scale': 30,
            'crs': 'EPSG:4326',
            'region': geom,
            'maxPixels': 1e10
        })        
    except:
        #This allows for future expansion to other datasets like NASADEM
        #e.g.
        #download_from_ge(W,E,S,N, collection_path="NASA/NASADEM_HGT/001",download_path='/home/jovyan/nasadem_test/ge_nasadem.vrt', debug=True, keep_downloads=False)
        hand = ee.Image(collection_path)
        ge_path = hand.getDownloadUrl({
            'scale': 30,
            'crs': 'EPSG:4326',
            'region': geom,
            'maxPixels': 1e10
        })

    if fname is None:
        fname=prefix+f'_{W}W_{E}E_{S}S_{N}N.zip'
    if debug: print(f'Downloading {fname} from: {ge_path}')
    output_path=os.path.join(download_path,fname)
    urllib.request.urlretrieve(ge_path, output_path)
    return output_path    

def download_from_ge(W,E,S,N, collection_path='users/gena/global-hand/hand-100', prefix=None, download_path=None, debug=False, keep_downloads=False, dstSRS=None):    
    #deal with input
    if prefix is None:
        prefix='ge'    
    if download_path is not None:
        download_folder=os.path.dirname(download_path) or os.getcwd() #a bare file name has an empty dirname
        output_file=download_path
    else:
        download_folder=os.getcwd()
        output_file='ge_download.tif'
    if not os.path.exists(download_folder):
        os.makedirs(download_folder) #also creates missing parent folders
    else:
        if not os.path.isdir(download_folder):
            print(f'Can not create download_folder: {download_folder}')
            raise ValueError 
    #start download
    estimated_size=estimate_size(W,E,S,N)*1.2 #overestimate a little. 
    tile_count=estimated_size/google_api_download_limit    
    if tile_count > 1: #32MB=32*1024*1024
        print(f'Area too large by a factor of: {tile_count}')        
    divider = int(np.ceil(np.sqrt(tile_count))+1) #np.int was removed in NumPy 1.24; use the built-in int
    SN = np.linspace(S,N,divider)    
    WE = np.linspace(W,E,divider)
    if debug: print(f'SN: {SN}')
    if debug: print(f'WE: {WE}')
    s  = SN[0]
    w  = WE[0]
    zip_files = []
    print('Downloading...')
    for n in tqdm(SN[1:]):
        for e in tqdm(WE[1:]):
            if debug: print(f'w/e/s/n: {w}/{e}/{s}/{n}')
            #fname=prefix+f'_{w}W_{e}E_{s}S_{n}N.zip'
            fname='_'.join([prefix, bounding_box_to_string(w,e,s,n)])+'.zip'
            if debug: print(f'download: {os.path.join(download_folder,fname)}')
            if os.path.isfile(os.path.join(download_folder,fname)):
                zip_files.append(os.path.join(download_folder,fname))
            else:
                zip_files.append(download_tile_from_ge(w,e,s,n, collection_path=collection_path, fname=fname, download_path=download_folder, debug=debug))
            w=e.copy()
        s=n.copy()
        w=WE[0]
    if debug: print(zip_files)
    #start_splicing  
    print('Unzipping...')
    zip_contents=[]
    extract_folders=[os.path.splitext(f)[0] for f in zip_files]
    for f,extract_folder in zip(zip_files, extract_folders):        
        with zipfile.ZipFile(f, 'r') as zip_ref:  
            zip_contents.append(zip_ref.namelist())
            zip_ref.extractall(path=extract_folder)
    if debug: print(zip_contents)
    #convert zip_contents to list of full paths.
    vrt_contents=[ os.path.join(extract_folders[k],zip_contents[k][0]) for k in range(len(zip_files))]
    if debug: print(f'contents:{vrt_contents}')
    #combine with gdal
    #build_vrt(output_file, vrt_contents, targetAlignedPixels=False) #targetAlignedPixels has to be False. Otherwise no file is created for some reason. 
    warp(vrt_contents, output_file, pixel_spacing=None,dstSRS=dstSRS) #vrt can leave too many files behind. Switching to warp. 

    #cleanup
    if debug or keep_downloads:
        print('Skipping cleanup in debug mode or when keep_downloads is set.')
        print(f'Files NOT deleted: {zip_files}')
        print(f'Folders NOT deleted: {extract_folders}')
    else:
        for f in zip_files:
            os.remove(f)
        for f in vrt_contents:
            os.remove(f)
            os.rmdir(os.path.dirname(f))
    print(f'Successfully generated:{output_file}')    
        
def estimate_size(W,E,S,N):
    pixels_per_deg=3601
    bytes_per_pixel=4
    return (E-W)*pixels_per_deg*(N-S)*pixels_per_deg*bytes_per_pixel    
    
def gdal_get_geotransform(filename):
    '''
    [top left x, w-e pixel resolution, rotation, top left y, rotation, n-s pixel resolution]=gdal_get_geotransform('/path/to/file')
    '''
    #http://stackoverflow.com/questions/2922532/obtain-latitude-and-longitude-from-a-geotiff-file
    ds = gdal.Open(filename)
    return ds.GetGeoTransform()

def gdal_get_projection(filename, out_format='proj4'):
    """
    projection=gdal_get_projection(filename, out_format='proj4')
    out_format is one of 'proj4', 'wkt', or 'epsg'.
    """
    try:
      ds=gdal.Open(filename, gdal.GA_ReadOnly)
      srs=osr.SpatialReference()
      srs.ImportFromWkt(ds.GetProjectionRef())
    except: #I am not sure if this is working for datasets without a layer. The first try block should work mostly.
      ds=gdal.Open(filename, gdal.GA_ReadOnly)
      ly=ds.GetLayer()
      if ly is None:
        print(f"Can not read projection from file:{filename}")
        return None
      else:
        srs=ly.GetSpatialRef()
    if out_format.lower()=='proj4':
      return srs.ExportToProj4()
    elif out_format.lower()=='wkt':
      return srs.ExportToWkt()
    elif out_format.lower()=='epsg':
      crs=pyproj.crs.CRS.from_proj4(srs.ExportToProj4())
      return crs.to_epsg()

def gdal_get_WESN(filename):
    '''
    (west,east,south,north)=gdal_get_WESN('/path/to/file')
    '''
    #http://stackoverflow.com/questions/2922532/obtain-latitude-and-longitude-from-a-geotiff-file
    ds = gdal.Open(filename)
    width = ds.RasterXSize
    height = ds.RasterYSize
    gt = ds.GetGeoTransform()
    minx = gt[0]
    miny = gt[3] + width*gt[4] + height*gt[5] 
    maxx = gt[0] + width*gt[1] + height*gt[2]
    maxy = gt[3] 
    return (minx,maxx,miny,maxy) #order is west, east, south, north

def numel(x):
    #np.int and np.float were removed in NumPy 1.24; test against the built-in and abstract NumPy scalar types instead.
    if isinstance(x, (int, float, np.integer, np.floating, str)):
      return 1
    elif isinstance(x, (list, tuple)):
      return len(x)
    elif isinstance(x, np.ndarray):
      return x.size
    else: 
      print('Unknown type {}.'.format(type(x)))
      return None
    
class PathSelector():
    """
    Displays a file selection tree. Any file can be selected. 
    Selected path can be obtained by: PathSelector.accord.get_title(0)
    """
    def __init__(self,start_dir,select_file=True):
        self.file        = None 
        self.select_file = select_file
        self.cwd         = start_dir
        self.select      = ui.SelectMultiple(options=['init'],value=(),rows=10,description='') 
        self.accord      = ui.Accordion(children=[self.select]) 

        self.accord.selected_index = None # Start closed (showing path only)
        self.refresh(self.cwd)
        self.select.observe(self.on_update,'value')

    def on_update(self,change):
        if len(change['new']) > 0:
            self.refresh(change['new'][0])

    def refresh(self,item):
        path = os.path.abspath(os.path.join(self.cwd,item))

        if os.path.isfile(path):
            if self.select_file:
                self.accord.set_title(0,path)  
                self.file = path
                self.accord.selected_index = None
            else:
                self.select.value = ()

        else: # os.path.isdir(path)
            self.file = None 
            self.cwd  = path

            # Build list of files and dirs
            keys = ['[..]']; 
            for item in os.listdir(path):
                if item[0] == '.':
                    continue
                elif os.path.isdir(os.path.join(path,item)):
                    keys.append('['+item+']'); 
                else:
                    keys.append(item); 

            # Sort and create list of output values
            keys.sort(key=str.lower)
            vals = []
            for k in keys:
                if k[0] == '[':
                    vals.append(k[1:-1]) # strip off brackets
                else:
                    vals.append(k)

            # Update widget
            self.accord.set_title(0,path)  
            self.select.options = list(zip(keys,vals)) 
            with self.select.hold_trait_notifications():
                self.select.value = ()
    
def transform_point(x,y,z,s_srs='+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs', t_srs='+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs'):
    '''
    transform_point(x,y,z,s_srs='+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs', t_srs='+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs')
    
    Known Bugs: gdal transform may fail if a proj4 string can not be found for the EPSG or WKT formats. 
    '''    
    srs_cs=osr.SpatialReference()    
    if "EPSG" == s_srs[0:4]:    
      srs_cs.ImportFromEPSG(int(s_srs.split(':')[1]));
    elif "GEOCCS" == s_srs[0:6]:
      srs_cs.ImportFromWkt(s_srs);
    else:
      srs_cs.ImportFromProj4(s_srs);

    trs_cs=osr.SpatialReference()    
    if "EPSG" == t_srs[0:4]:    
      trs_cs.ImportFromEPSG(int(t_srs.split(':')[1]));
    elif "GEOCCS" == t_srs[0:6]:
      trs_cs.ImportFromWkt(t_srs);
    else:
      trs_cs.ImportFromProj4(t_srs);
    transform = osr.CoordinateTransformation(srs_cs,trs_cs) 
    
    if numel(x)>1:
      return [  transform.TransformPoint(x[k], y[k], z[k]) for k in range(numel(x))]
    else:
      try:
        return transform.TransformPoint((x,y,z));
      except: 
        return transform.TransformPoint(x,y,z)

class TqdmUpTo(tqdm): #Used in warp()
    """Provides `update_to(n)` which uses `tqdm.update(delta_n)`."""
    def update_to(self, b=1, bsize=1, tsize=None):
        """
        b  : int, optional
            Number of blocks transferred so far [default: 1].
        bsize  : int, optional
            Size of each block (in tqdm units) [default: 1].
        tsize  : int, optional
            Total size (in tqdm units). If [default: None] remains unchanged.
        """
        if tsize is not None:
            self.total = tsize
        return self.update(b * bsize - self.n)  # also sets self.n = b * bsize
    def callback(self,complete, message, data):
        percent = int(complete * 100)  # round to integer percent
        self.update_to(percent, tsize=100)
        
def warp(src_filename, dst_filename, pixel_spacing=0.00008333, xRes=None, yRes=None, resampleAlg='nearest', dstSRS="EPSG:4326", tps=False, rpc=False):
    if xRes is None and pixel_spacing:
      xRes=pixel_spacing
    if yRes is None and pixel_spacing:
      yRes=pixel_spacing
    t=TqdmUpTo()
    gwo=gdal.WarpOptions(xRes=xRes, yRes=yRes, resampleAlg=resampleAlg, dstSRS=dstSRS, callback=t.callback)
    gdal.Warp(dst_filename, src_filename, options=gwo)
    del t
    



<font size="5"> <b> 2. Define input parameters</b> </font>

<font size="3"> Please set coordinates, output file and other options.
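
A quick sanity check on the coordinates can catch swapped or out-of-range values before any download starts. The sketch below is an illustration only: `check_bounding_box` is a hypothetical helper that this notebook does not define or call, and it assumes the bounds are longitude/latitude in degrees. It follows the notebook's convention that equal values (W equal to E, or N equal to S) mean the bounding box will be read from a file.

```python
def check_bounding_box(W, E, S, N):
    """Return 'from_file' or 'ok'; raise ValueError for inconsistent bounds."""
    if W == E or N == S:
        return 'from_file'  # notebook convention: the box is taken from a file later
    if not (-180 <= W < E <= 180):
        raise ValueError(f'Expected -180 <= W < E <= 180, got W={W}, E={E}')
    if not (-90 <= S < N <= 90):
        raise ValueError(f'Expected -90 <= S < N <= 90, got S={S}, N={N}')
    return 'ok'

print(check_bounding_box(0, 0, 0, 0))      # from_file
print(check_bounding_box(92, 93, 24, 25))  # ok
```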

In [None]:
# Define Input Parameters

## 1. Bounding Box for the Area of Interest ##
# Set same value (e.g. all zeros) to select bounding box using file. 

W=0 #92.0#-96.5001618 #upper left -96.5 / -95.1 / 39.9 / 41.5
N=0 #25.0#41.5002284
E=0 #93.0#-95.1003219 #lower right
S=0 #24.0#39.9001804

## 2. Output file name ##
output_file='~/hand/GEE_HAND_Bangladesh_CALVAL.tif' # Tiles will be downloaded next to the output file.
#output_file='~/hand/GEE_HAND_' + bounding_box_to_string(W,E,S,N,factor=1) + '.tif'
debug=False  # display verbose messages. Will also set keep_downloads True. 
keep_downloads=False # By default downloaded files are deleted. Set to True if you like to keep them for troubleshooting. 

<font size="5"> <b> 3. Login to Google Earth Engine </b> </font>

<font size="3"> If this is your first time logging in, you will need to run a command in a terminal.

In [None]:
#Login to Google Earth Engine
try:
    ee.Initialize()
except:
    print('In a terminal run: earthengine authenticate')

<font size="5"> <b> 4. Select file if coordinates are set to zero </b> </font>

<font size="3"> The file is used to define the bounding box. 
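
The bounding box is derived from the file's GDAL geotransform and raster size. The arithmetic can be sketched without GDAL as shown below. `wesn_from_geotransform` is a hypothetical standalone helper, and the west/east/south/north labels assume a north-up raster (negative `gt[5]`, zero rotation terms).

```python
def wesn_from_geotransform(gt, width, height):
    """Return (west, east, south, north) from a six-element GDAL geotransform.

    gt = (top-left x, w-e pixel size, rotation, top-left y, rotation, n-s pixel size)
    """
    west = gt[0]
    north = gt[3]
    east = gt[0] + width * gt[1] + height * gt[2]
    south = gt[3] + width * gt[4] + height * gt[5]
    return west, east, south, north

# A 2 x 2 pixel raster with 0.5 degree pixels whose top-left corner is at 92E, 25N.
gt = (92.0, 0.5, 0.0, 25.0, 0.0, -0.5)
print(wesn_from_geotransform(gt, width=2, height=2))  # (92.0, 93.0, 24.0, 25.0)
```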

In [None]:
# Select gdal file if WESN is all zeros. 
if W==E or N==S:
    print("Choose your GDAL compatible file using the file browser below:")
    f = PathSelector('.')
    display(f.accord)    
else:
    pass

In [None]:
#Get W,E,S,N from file if needed. 
if W==E or N==S:
    gdal_file=f.accord.get_title(0)    
    if os.path.exists(gdal_file):
        print(f'Selected file: {gdal_file}')
    else:
        print(f'Can not find file: {gdal_file}')
        raise ValueError
    W,E,S,N=gdal_get_WESN(gdal_file)
    epsg=gdal_get_projection(gdal_file, out_format='epsg')
    if str(epsg)=="4326": #gdal_get_projection returns the EPSG code as an integer
        pass
    else:
        srs=gdal_get_projection(gdal_file, out_format='proj4')
        W,N,h=transform_point(W,N,0,s_srs=srs)
        E,S,h=transform_point(E,S,0,s_srs=srs)
        W=np.round(W,2)
        E=np.round(E,2)
        S=np.round(S,2)
        N=np.round(N,2)
        del h # we don't use height
    print(f'Bounding Box W/E/S/N: {W} / {E} / {S} / {N}')
else:
    epsg='4326'

<font size="5"> <b> 5. Download from Google Earth Engine </b> </font>

<font size="3"> Depending on the area requested this could take a while.
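
How long the download takes depends mostly on how many tiles the area is split into. The sketch below mirrors the sizing logic of `estimate_size` and `download_from_ge` using only the standard library: 3601 pixels per degree, 4 bytes per pixel, a 20% safety margin, and a 32 MB limit per request. `plan_tiles` is a hypothetical name used for illustration, and unlike the notebook it always returns at least one tile.

```python
import math

DOWNLOAD_LIMIT = 32 * 1024 * 1024  # 33554432 bytes, the per-request limit used in this notebook

def plan_tiles(W, E, S, N, pixels_per_deg=3601, bytes_per_pixel=4, margin=1.2):
    """Return a list of (w, e, s, n) tiles that should each stay under the limit."""
    estimated = (E - W) * pixels_per_deg * (N - S) * pixels_per_deg * bytes_per_pixel * margin
    # Split both axes into the same number of intervals (assumption: at least one).
    per_side = max(1, math.ceil(math.sqrt(estimated / DOWNLOAD_LIMIT)))
    lon_edges = [W + (E - W) * i / per_side for i in range(per_side + 1)]
    lat_edges = [S + (N - S) * i / per_side for i in range(per_side + 1)]
    return [(w, e, s, n)
            for s, n in zip(lat_edges, lat_edges[1:])
            for w, e in zip(lon_edges, lon_edges[1:])]

tiles = plan_tiles(92, 93, 24, 25)
print(len(tiles))  # 4
print(tiles[0])    # (92.0, 92.5, 24.0, 24.5)
```

Under these assumptions a 1 x 1 degree area is estimated at about 62 MB, so it is split into a 2 x 2 grid, while a 0.5 x 0.5 degree area fits in a single request.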

In [None]:
# Download and stitch tiles 
output_file=os.path.expanduser(output_file) #expand ~ to user home
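if os.path.exists(output_file):
    #yesno() is not defined in this notebook; prompt with the built-in input() instead.
    answer=input(f"Overwrite file: {output_file}? [y/N] ")
    if answer.strip().lower() not in ('y','yes'):
        raise RuntimeError(f'Not overwriting existing file: {output_file}')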

download_from_ge(W,E,S,N, download_path=output_file, debug=debug, keep_downloads=keep_downloads, dstSRS="EPSG:" + str(epsg))

<font face="Calibri" size="2" color="gray"> <i> Version 0.1.1- Batu Osmanoglu
    
<b>Change Log</b> <br>
2021/01/24: v0.1.1 <br>
-Minor organization change, and added comments.<br>
2021/01/13: v0.1.0 <br>
-Initial version.<br>