# API development for download of Sentinel-2 and Landsat-8 data
### User defined mosaicing on harmonised products

## Introduction

Two multi-spectral optical satellite missions provide open data for earth observation.
The Sentinel-2 mission consists of two identical satellites, Sentinel-2A and Sentinel-2B, which share one orbit and are phased 180° apart. Each satellite has a repeat cycle of 10 days, which gives a combined revisit time of 5 days. Launch: 2015 (A), 2017 (B) [ESA].
<br>
The second source is the Operational Land Imager (OLI) on board Landsat 8, ...
with a repeat cycle of 16 days. Launch: 2013. [res. etc.] ...
<br>


This extension to the Python-based nasa_hls package [email/github] adds the following functionality:
1. Download of tiles of the harmonised product for a user-supplied geometry
2. Spatial mosaicing of the products, separately for Landsat and Sentinel and for each acquisition date

Additionally, we want to show that spectral indexes can be calculated from the downloaded products (3). 
<br>
The results are plotted with `ipyleaflet` and `folium` on the OpenStreetMap WMS service.


## Download HLS files with user input
First, the module nasa_hls is loaded.

In [1]:
%matplotlib inline

import nasa_hls
import sys
import pandas as pd
import os

pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

# and for later processing in the notebook

For testing purposes, try downloading the KML file:

In [None]:
nasa_hls.download_kml()

Make a tiles dataset from the user-supplied:
1. spatial geometry
2. date or time span
3. product type (S30 or L30, or both)

The input features should be in the WGS84 coordinate system.
Dates have to be passed as yyyy-mm-dd strings.
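
The date requirement can be checked before any request is sent. The following is a minimal sketch that uses only the standard library; `parse_hls_date` and `expand_time_span` are hypothetical helpers for illustration and are not part of nasa_hls.

```python
from datetime import datetime, timedelta


def parse_hls_date(text):
    """Parse a year-month-day string; raises ValueError if it is not a valid date in that order."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def expand_time_span(start, end):
    """Return every date from start to end (both inclusive) as yyyy-mm-dd strings."""
    first, last = parse_hls_date(start), parse_hls_date(end)
    if last < first:
        raise ValueError("end date lies before start date")
    n_days = (last - first).days + 1
    return [(first + timedelta(days=i)).isoformat() for i in range(n_days)]


print(parse_hls_date("2018-05-05"))
print(expand_time_span("2018-05-05", "2018-05-07"))
```

Validating the strings up front gives a clear error message for inputs such as `05.05.2018` instead of a failure somewhere inside the download step.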

In [None]:
ds = nasa_hls.make_tiles_dataset(shape="/home/aleko-kon/Dokumente/nasa_hls/data/amazon.shp",
                                products=["S30", "L30"],
                                date="2018-05-05")
ds

Taking this list, the data sources can be downloaded via `download_tiles`. <br> This function calls `download_batch` and other methods internally in order to resolve the right URLs for download.

In [None]:
nasa_hls.download_tiles(dstdir=os.path.join(os.path.expanduser('~'), 'Dokumente', 'nasa_hls', 'data', 'hdf'),
                        datasets=ds)

***
## Mosaicing tiles
To simplify the handling of the downloaded files ... we mosaic the data according to 
`make_mosaic.py`.
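
The first step of the mosaicing is to group the downloaded tiles by acquisition date. The sketch below shows that grouping on its own, under the assumption that file names follow the pattern `HLS.<product>.<tile>.<yyyyddd>.<version>.hdf`, where `ddd` is the day of year. The file names in the example are made up for illustration, and `acquisition_date` and `group_by_date` are hypothetical helpers.

```python
import os
from collections import defaultdict
from datetime import datetime


def acquisition_date(hdf_path):
    """Read the yyyyddd field (year + day of year) from an HLS file name."""
    yyyyddd = os.path.basename(hdf_path).split(".")[3]
    return datetime.strptime(yyyyddd, "%Y%j").date()


def group_by_date(hdf_paths):
    """Map each acquisition date to the list of tiles recorded on that day."""
    groups = defaultdict(list)
    for path in hdf_paths:
        groups[acquisition_date(path)].append(path)
    return dict(groups)


# hypothetical file names, used for illustration only
files = [
    "hdf/HLS.S30.T19LDJ.2018125.v1.4.hdf",
    "hdf/HLS.S30.T19LEK.2018125.v1.4.hdf",
    "hdf/HLS.S30.T19LDJ.2018130.v1.4.hdf",
]
for day, tiles in group_by_date(files).items():
    print(day, len(tiles))
```

Working on the base name rather than the full path keeps the parsing independent of dots that may occur in directory names.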

In [None]:
path_in = os.path.join(os.path.expanduser('~'), 'Dokumente', 'nasa_hls', 'data', 'hdf' + os.sep)
path_out =  os.path.join(os.path.expanduser('~'), 'Dokumente', 'nasa_hls', 'data', 'tifs' + os.sep)

In [None]:
import glob
import os

from osgeo import gdal

from nasa_hls.utils import BAND_NAMES


def make_mosaic_tif(srcdir=path_in, dstdir=path_out, bands=None, product="S30"):
    """Mosaic all HDF tiles of one product per acquisition date into one GeoTIFF per date."""

    def doy_of(hdf_path):
        # day of year, taken from a name like HLS.<product>.<tile>.<yyyyddd>.<version>.hdf
        return os.path.basename(hdf_path).split(".")[3][4:]

    # group the HDF files of the requested product by day of year (doy)
    hdf_by_doy = {}
    for hdf_file in sorted(glob.glob(os.path.join(srcdir, "*.hdf"))):
        # skip the other product, whose band names differ
        # (assumes the product is the second field of the file name)
        if os.path.basename(hdf_file).split(".")[1] != product:
            continue
        hdf_by_doy.setdefault(doy_of(hdf_file), []).append(hdf_file)

    # without a band selection, use every band id defined for the product
    if bands is None:
        bands = list(BAND_NAMES[product].values())

    for doy, hdf_files in hdf_by_doy.items():
        print(doy)
        band_vrts = []
        for band in bands:
            # GDAL subdataset names that address this band in each HDF file
            subdatasets = ['HDF4_EOS:EOS_GRID:"{0}":Grid:{1}'.format(f, band) for f in hdf_files]
            # one mosaic per band and date
            vrt_path = os.path.join(dstdir, doy + band + ".vrt")
            band_vrt = gdal.BuildVRT(vrt_path, subdatasets)
            if band_vrt is None:
                raise RuntimeError("could not build " + vrt_path)
            band_vrt = None  # close the dataset so the VRT is written to disk
            band_vrts.append(vrt_path)

        # stack the band mosaics of this date and write them as one multi-band GeoTIFF
        stack_path = os.path.join(dstdir, doy + "final.vrt")
        stack_vrt = gdal.BuildVRT(stack_path, band_vrts, separate=True)
        if stack_vrt is None:
            raise RuntimeError("could not build " + stack_path)
        final_tif = gdal.Translate(os.path.join(dstdir, doy + ".tiff"), stack_vrt)
        final_tif = None  # flush the GeoTIFF to disk
        stack_vrt = None


In [None]:
make_mosaic_tif()

The output extent can now be clipped to the actual outline of the input geometry.
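
One way to do the clipping is `gdal.Warp` with a cutline. The following is a minimal sketch, assuming the GDAL Python bindings are installed; `cutline_options` and `clip_to_shape` are hypothetical helpers, and the paths in the usage comment are placeholders.

```python
def cutline_options(shape_path):
    """Keyword arguments for gdal.Warp that crop a raster to a vector outline."""
    return {
        "cutlineDSName": shape_path,  # vector file holding the outline
        "cropToCutline": True,        # shrink the output extent to the outline
    }


def clip_to_shape(src_tif, dst_tif, shape_path):
    """Clip src_tif to the outline in shape_path and write the result to dst_tif."""
    from osgeo import gdal  # imported here so the options helper works without GDAL

    result = gdal.Warp(dst_tif, src_tif, **cutline_options(shape_path))
    if result is None:
        raise RuntimeError("gdal.Warp failed for " + src_tif)
    result = None  # close the dataset so the file is flushed to disk


# usage with placeholder paths:
# clip_to_shape("tifs/125.tiff", "tifs/125_clipped.tiff", "data/amazon.shp")
```

Pixels outside the outline are left at whatever no-data handling the source raster defines; setting an explicit no-data value would be a further choice that is not made here.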

## Calculate Indexes
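
As one example of a spectral index, the normalised difference vegetation index (NDVI) can be computed from a red and a near-infrared band. The sketch below works on plain NumPy arrays; how the two bands are read from the mosaic depends on the band order of the GeoTIFF and is not shown. The fill value of -1000 is an assumption, and the input numbers are toy values, not real data.

```python
import numpy as np


def ndvi(red, nir, fill_value=-1000):
    """Normalised difference vegetation index, (NIR - red) / (NIR + red)."""
    red = np.asarray(red, dtype="float64")
    nir = np.asarray(nir, dtype="float64")
    total = nir + red
    # mask fill pixels and pixels where the denominator is zero
    invalid = (red == fill_value) | (nir == fill_value) | (total == 0)
    out = np.full(red.shape, np.nan)
    np.divide(nir - red, total, out=out, where=~invalid)
    return out


# toy reflectance values (scaled integers), not real data
red = np.array([[500, 1000], [-1000, 0]])
nir = np.array([[3500, 1000], [2000, 0]])
print(ndvi(red, nir))
```

Because NDVI is a ratio, a common scale factor applied to both bands cancels out, so the scaled integer values can be used directly.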

## Visualise Data
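
A possible way to show a result with `folium` is an image overlay on the default OpenStreetMap base map. This is a sketch under two assumptions: folium is installed, and the image array has already been reprojected to WGS84 so that its extent can be given in longitude and latitude. `overlay_bounds`, `map_centre` and `show_on_map` are hypothetical helpers.

```python
def overlay_bounds(west, south, east, north):
    """Convert a WGS84 bounding box to the [[south, west], [north, east]] order folium expects."""
    return [[south, west], [north, east]]


def map_centre(west, south, east, north):
    """Centre of the bounding box as [lat, lon]."""
    return [(south + north) / 2, (west + east) / 2]


def show_on_map(image, west, south, east, north):
    """Overlay an image array on an OpenStreetMap base map and return the map."""
    import folium  # imported here so the helpers above work without folium

    fmap = folium.Map(location=map_centre(west, south, east, north), zoom_start=8)
    folium.raster_layers.ImageOverlay(
        image=image,
        bounds=overlay_bounds(west, south, east, north),
        opacity=0.7,
    ).add_to(fmap)
    return fmap
```

In a notebook, the returned map object renders when it is the last expression of a cell.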

In [2]:
nasa_hls.download_kml()

UTM tiles already successfully downloaded to:
 /home/aleko-kon/.nasa_hls/.auxdata/utm.kml 



'/home/aleko-kon/.nasa_hls/.auxdata/utm.kml'

Make a tiles dataset from the user-supplied:
1. spatial geometry
2. date or time span
3. product type (S30 or L30, or both)

The input features should be in the WGS84 coordinate system.
Dates have to be passed as yyyy-mm-dd strings.

In [43]:
ds = nasa_hls.make_tiles_dataset(shape="/home/aleko-kon/Dokumente/nasa_hls/data/amazon.shp",
                                products=["S30", "L30"],
                                date="2018-05-05")
ds

valid shape, process continues

single date: 2018-05-05
 
valid shape, process continues

UTM tiles already successfully downloaded to:
 /home/aleko-kon/.nasa_hls/.auxdata/utm.kml 



  0%|          | 0/10 [00:00<?, ?it/s]


getting available datasets . . .


100%|██████████| 10/10 [00:14<00:00,  1.46s/it]


[    product   tile       date                                                url
 40      S30  19LDJ 2018-05-05  https://hls.gsfc.nasa.gov/data/v1.4/S30/2018/1...
 237     S30  19LEK 2018-05-05  https://hls.gsfc.nasa.gov/data/v1.4/S30/2018/1...
 367     S30  19LDH 2018-05-05  https://hls.gsfc.nasa.gov/data/v1.4/S30/2018/1...]

Taking this list, the data sources can be downloaded via `download_tiles`. <br> This function calls `download_batch` and other methods internally in order to resolve the right URLs for download.

In [None]:
nasa_hls.download_tiles(dstdir=os.path.join(os.path.expanduser('~'), 'Dokumente', 'nasa_hls', 'data', 'hdf'),
                        datasets=ds)

 33%|███▎      | 1/3 [01:29<02:58, 89.28s/it]

***
## Mosaicing tiles
To simplify the handling of the downloaded files ... we mosaic the data according to 
`make_mosaic.py`.

In [19]:
path_in = os.path.join(os.path.expanduser('~'), 'Dokumente', 'nasa_hls', 'data', 'hdf' + os.sep)
path_out =  os.path.join(os.path.expanduser('~'), 'Dokumente', 'nasa_hls', 'data', 'tifs' + os.sep)

In [39]:
make_mosaic_tif()

122


SystemError: <built-in function TranslateInternal> returned a result with an error set

The output extent can now be clipped to the actual outline of the input geometry.

## Calculate Indexes

## Visualise Data