# Description 

This script downloads remote sensing data from the CCI
database for a user-defined period and regional extent.
It is useful for downloading data for multiple lakes in a
region whose latitude and longitude boundaries are known.

INPUT:
<ul>
    <li>latitude/longitude min and max values</li>
    <li>first/last date</li>
    <li>version of the dataset to be downloaded (default value 2.1.0)</li>
    <li>output directory in which to store the extracted data</li>
    <li>prefix (optional): to be added to the output file names</li>
</ul>

 Reference: Carrea, L.; Crétaux, J.-F.; Liu, X.; Wu, Y.; Bergé-Nguyen,
 M.; Calmettes, B.; Duguay, C.; Jiang, D.; Merchant, C.J.; Mueller, D.;
 Selmes, N.; Simis, S.; Spyrakos, E.; Stelzer, K.; Warren, M.; Yesou,
 H.; Zhang, D. (2022): ESA Lakes Climate Change Initiative (Lakes_cci):
 Lake products, Version 2.0.1. NERC EDS Centre for Environmental Data
 Analysis, date of citation.
 
 https://dx.doi.org/10.5285/7fc9df8070d34cacab8092e45ef276f1
    
 Acknowledgements: thanks to Sebastiano Piccolroaz, who made available
 a script to extract data from the first version of the dataset
 (https://github.com/spiccolroaz/CCI_extractor) and inspired this one.

 WARNING: This is a beta version. Not all checks on the input parameters
 are available yet. If you find a bug or have a question or a
 suggestion, don't hesitate to contact us; it will be much appreciated:
 cci_lakes.contact@groupcls.com

 To be executed with Python version >= 3.9.

In [28]:
import os
import numpy as np
import xarray as xr
import datetime

In [123]:
###########################################################################################
# input parameters
###########################################################################################

# defining the zone
# latitude values must be between -90 and 90
# longitude values must be between -180 and 180

minlat = 38.8
maxlat = 39.3
minlon = -120.35
maxlon = -119.8

# defining the period of time in string format: YYYY-MM-DD
# dates values must be between 1992-09-26 and 2020-12-31
mindate = '2019-08-19'
maxdate = '2019-08-30'

# dataset version (2.1.0 is the version published in April 2024);
# give only '2.1' here: the download code appends '.0' to form the remote file name
version = '2.1'

# output
outdir = '/home/sar_hydro/Projets/CCI-LAKES/output/Tahoe/zone'
outprefix = 'Tahoe_zone_'
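
The comments above state the valid latitude and longitude ranges, but the script does not enforce them. A minimal sketch of such a check is given below; `check_bounds` is a hypothetical helper that is not part of the original script, and the extra min/max ordering test is an assumption about what a sensible zone looks like.

```python
def check_bounds(minlat, maxlat, minlon, maxlon):
    """Raise ValueError if the bounding box is not usable.

    Hypothetical helper, not part of the original script.
    """
    if not (-90 <= minlat <= 90 and -90 <= maxlat <= 90):
        raise ValueError("latitude values must be between -90 and 90")
    if not (-180 <= minlon <= 180 and -180 <= maxlon <= 180):
        raise ValueError("longitude values must be between -180 and 180")
    # Assumption: an empty or inverted box is treated as a user error
    if minlat >= maxlat:
        raise ValueError("minlat must be smaller than maxlat")
    if minlon >= maxlon:
        raise ValueError("minlon must be smaller than maxlon")


# The Lake Tahoe zone used in this notebook passes silently
check_bounds(38.8, 39.3, -120.35, -119.8)
```

Calling the helper right after the input parameters are set would stop the notebook before any download starts if a bound is mistyped.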


In [124]:
# parse the dates and clamp them to the temporal coverage of the dataset

mindate = datetime.datetime.strptime(mindate, '%Y-%m-%d')
maxdate = datetime.datetime.strptime(maxdate, '%Y-%m-%d')
mindate = max([mindate, datetime.datetime(1992,9,26)])
maxdate = min([maxdate, datetime.datetime(2020,12,31)])
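
Note that clamping alone gives no feedback when the requested period lies entirely outside the coverage: the first date then ends up later than the last one and the download loop simply does nothing. The sketch below wraps the same logic in a hypothetical helper, `clamp_period` (not part of the original script), and raises an error in that case. The coverage dates are the ones quoted in this notebook.

```python
import datetime

# Temporal coverage quoted in this notebook
COVERAGE_START = datetime.datetime(1992, 9, 26)
COVERAGE_END = datetime.datetime(2020, 12, 31)


def clamp_period(mindate, maxdate):
    """Parse two 'YYYY-MM-DD' strings and clamp them to the coverage.

    Hypothetical helper, not part of the original script.
    """
    start = max(datetime.datetime.strptime(mindate, "%Y-%m-%d"), COVERAGE_START)
    end = min(datetime.datetime.strptime(maxdate, "%Y-%m-%d"), COVERAGE_END)
    if start > end:
        raise ValueError("requested period does not overlap the dataset coverage")
    return start, end


start, end = clamp_period("2019-08-19", "2019-08-30")
print(start.date(), end.date())
```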

In [125]:
# create the output directory if it does not exist
os.makedirs(outdir, exist_ok=True)

In [None]:
# The download process

for data_date in np.arange(mindate.toordinal(), maxdate.toordinal()+1):
    current_date = datetime.datetime.fromordinal(data_date)
    date_str = current_date.strftime("%Y%m%d")

    # the message shows the remote file name, including the '.0' suffix used in the URL below
    print(f'Downloading data from ESACCI-LAKES-L3S-LK_PRODUCTS-MERGED-{date_str}-fv{version}.0.nc')
    path = 'https://data.cci.ceda.ac.uk/thredds/dodsC/esacci/lakes/data/lake_products/L3S/v2.1/merged_product/'
    path += f'{current_date.year}/{current_date.month:02}/'
    path += f'ESACCI-LAKES-L3S-LK_PRODUCTS-MERGED-{date_str}-fv{version}.0.nc'
    
    
    dataset = xr.open_dataset(path, engine="pydap")
    
    # extract data in the defined zone
    dataset = dataset.sel(lat=slice(minlat, maxlat), lon=slice(minlon, maxlon))
    
    # create a netcdf file
    outfile = f'{outdir}/{outprefix}ESACCI-LAKES-L3S-LK_PRODUCTS-MERGED-{date_str}-fv{version}.nc'
    dataset.to_netcdf(outfile)
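
To see which files the loop will request without opening any remote dataset, the URL construction can be separated from the download. The sketch below mirrors the path and file name pattern used above. `build_url` and `iter_days` are hypothetical helpers that are not part of the original script, and the base path assumes version '2.1', as in the loop.

```python
import datetime

# Base path copied from the download loop (hard-coded for v2.1)
BASE_URL = ('https://data.cci.ceda.ac.uk/thredds/dodsC/esacci/lakes/data/'
            'lake_products/L3S/v2.1/merged_product/')


def build_url(day, version='2.1'):
    """Return the OPeNDAP URL of the merged product for one day."""
    date_str = day.strftime('%Y%m%d')
    return (f'{BASE_URL}{day.year}/{day.month:02}/'
            f'ESACCI-LAKES-L3S-LK_PRODUCTS-MERGED-{date_str}-fv{version}.0.nc')


def iter_days(first, last):
    """Yield every day from first to last, both included."""
    day = first
    while day <= last:
        yield day
        day += datetime.timedelta(days=1)


# List the URLs for a three-day period without downloading anything
for day in iter_days(datetime.date(2019, 8, 19), datetime.date(2019, 8, 21)):
    print(build_url(day))
```

Each URL can then be passed to `xr.open_dataset(url, engine="pydap")`, as in the loop above.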
