We begin by setting up the Data Access Component (DAC), which retrieves the data we need as input for the MULTIPLY platform. During setup, it loads several data stores. A data store manages access to data that is held locally or remotely. The DAC can list the stores it provides access to.

In [1]:
from multiply_data_access import DataAccessComponent
data_access_component = DataAccessComponent()
data_access_component.show_stores()

INFO:root:Read data store aws_s2
INFO:root:Read data store cams
INFO:root:Read data store emulators
INFO:root:Read data store Aster DEM
INFO:root:Read data store MODIS Data
INFO:root:Scanning local file system, not remote
INFO:root:Scanning local file system, not remote
INFO:root:Scanning local file system, not remote
INFO:root:Scanning local file system, not remote
INFO:root:Scanning local file system, not remote


Data store aws_s2
Data store cams
Data store emulators
Data store Aster DEM
Data store MODIS Data


We can then ask the Data Access Component which data types are available.

In [2]:
data_access_component.get_provided_data_types()

['AWS_S2_L1C',
 'CAMS',
 'ISO_MSI_A_EMU',
 'ISO_MSI_B_EMU',
 'WV_EMU',
 'Aster DEM',
 'MCD43A1.006']
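
Before querying, it can be useful to confirm that every data type a workflow needs is among those listed. The following is a minimal sketch and not part of the MULTIPLY API: `missing_data_types` is a hypothetical helper, the `provided` list is copied from the output above rather than fetched live, and `SOME_OTHER_TYPE` is a made-up name used only to show a miss.

```python
def missing_data_types(required, provided):
    """Return the required data types that are absent from `provided`, in order."""
    provided_set = set(provided)
    return [data_type for data_type in required if data_type not in provided_set]


# Copied from the output of get_provided_data_types() above.
provided = ['AWS_S2_L1C', 'CAMS', 'ISO_MSI_A_EMU', 'ISO_MSI_B_EMU',
            'WV_EMU', 'Aster DEM', 'MCD43A1.006']

# Example requirement list; 'SOME_OTHER_TYPE' is invented for illustration.
required = ['AWS_S2_L1C', 'MCD43A1.006', 'SOME_OTHER_TYPE']

print(missing_data_types(required, provided))
```

In a live session, `provided` could instead be the return value of `data_access_component.get_provided_data_types()`.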

Suppose we want to perform high-resolution pre-processing. For this we need S2 L1C data in the format used by AWS (AWS_S2_L1C). We also need MODIS data (data type MCD43A1.006) from three days before to three days after the S2 observations, a global DEM file (Aster DEM), and several emulators for S2 data.
We also need to specify a spatial region of interest and a start and end time. Suppose we are interested in the area around Barrax, Spain. We define the area by passing its coordinates in Well-Known Text (WKT) format (you can use https://arthur-e.github.io/Wicket/sandbox-gmaps3.html to get WKT representations of other regions of interest).
Once these are defined, we can ask the Data Access Component for available S2 data.
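
For a rectangular region of interest, the WKT string can also be assembled by hand. The sketch below assumes a simple longitude/latitude bounding box; `bbox_to_wkt` is a hypothetical helper, not a MULTIPLY function. It lists the corners in the same order as the Barrax polygon used in this notebook (upper left, upper right, lower right, lower left) and repeats the first corner to close the ring.

```python
def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Build a WKT POLYGON (lon lat order) with a closed ring for a bounding box."""
    corners = [
        (min_lon, max_lat),  # upper left
        (max_lon, max_lat),  # upper right
        (max_lon, min_lat),  # lower right
        (min_lon, min_lat),  # lower left
        (min_lon, max_lat),  # first corner again, closing the ring
    ]
    ring = ",".join("{} {}".format(lon, lat) for lon, lat in corners)
    return "POLYGON(({}))".format(ring)


# Illustrative coordinates only, not the Barrax region.
print(bbox_to_wkt(0.0, 1.0, 2.0, 3.0))
```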

In [3]:
BARRAX_ROI = "POLYGON((-2.20397502663252 39.09868106889479,-1.9142106223355313 39.09868106889479," \
             "-1.9142106223355313 38.94504502508093,-2.20397502663252 38.94504502508093," \
             "-2.20397502663252 39.09868106889479))"
start_time = '2017-01-01'
end_time = '2017-01-20'
s2_data_infos = data_access_component.query(BARRAX_ROI, start_time, end_time, 'AWS_S2_L1C')
s2_data_infos

[Data Set:
   Id: E:/Projects/Multiply/Data/aws_s2_l1c//AWS_S2_L1C/2017/01/16/30/S/WJ/2017/1/16/0, 
   Type: AWS_S2_L1C, 
   Start Time: 2017-01-16 10:53:55, 
   End Time: 2017-01-16 10:53:55, 
   Coverage: POLYGON((-3.000233454377241 39.75026792656397, -1.7187196513335372 39.74319619168243, -1.7365967808116474 38.754036047778804, -3.0002301960295696 38.760864456727795, -3.000233454377241 39.75026792656397)),
 Data Set:
   Id: E:/Projects/Multiply/Data/aws_s2_l1c//AWS_S2_L1C/2017/01/19/30/S/WJ/2017/1/19/0, 
   Type: AWS_S2_L1C, 
   Start Time: 2017-01-19 11:05:33, 
   End Time: 2017-01-19 11:05:33, 
   Coverage: POLYGON((-3.000233454377241 39.75026792656397, -1.7187196513335372 39.74319619168243, -1.7365967808116474 38.754036047778804, -3.0002301960295696 38.760864456727795, -3.000233454377241 39.75026792656397))]

This retrieves meta information about the available S2 data, including the coverage and the sensing time of each data set. Once we are happy with the selection, we can get the data:

In [7]:
s2_urls = data_access_component.get_data_urls_from_data_set_meta_infos(s2_data_infos)
s2_urls

['E:/Projects/Multiply/Data/aws_s2_l1c//AWS_S2_L1C/2017/01/16/30/S/WJ/2017/1/16/0',
 'E:/Projects/Multiply/Data/aws_s2_l1c//AWS_S2_L1C/2017/01/19/30/S/WJ/2017/1/19/0']

Now, let's get all the other data we need. First, all the emulators. We need to download them only once and can re-use them later.

In [8]:
emu_urls = data_access_component.get_data_urls(BARRAX_ROI, start_time, end_time, 'ISO_MSI_A_EMU,ISO_MSI_B_EMU,WV_EMU')
emu_urls

['E:/Projects/Multiply/Data/Emulators//ISO_MSI_A_EMU/isotropic_MSI_emulators_correction_xap_S2A.pkl',
 'E:/Projects/Multiply/Data/Emulators//WV_EMU/wv_MSI_retrieval_S2A.pkl',
 'E:/Projects/Multiply/Data/Emulators/ISO_MSI_B_EMU/isotropic_MSI_emulators_correction_xap_S2B.pkl',
 'E:/Projects/Multiply/Data/Emulators/ISO_MSI_B_EMU/isotropic_MSI_emulators_correction_xbp_S2B.pkl',
 'E:/Projects/Multiply/Data/Emulators/ISO_MSI_B_EMU/isotropic_MSI_emulators_correction_xcp_S2B.pkl',
 'E:/Projects/Multiply/Data/Emulators/ISO_MSI_B_EMU/isotropic_MSI_emulators_optimization_xap_S2B.pkl',
 'E:/Projects/Multiply/Data/Emulators/ISO_MSI_B_EMU/isotropic_MSI_emulators_optimization_xbp_S2B.pkl',
 'E:/Projects/Multiply/Data/Emulators/ISO_MSI_B_EMU/isotropic_MSI_emulators_optimization_xcp_S2B.pkl']

Let's get the ASTER DEM URL next ...

In [11]:
aster_dem_url = data_access_component.get_data_urls(BARRAX_ROI, start_time, end_time, 'Aster DEM')
aster_dem_url

['E:\\Projects\\Multiply\\Data/DEM/aster_dem.vrt']

For CAMS, we only need the data for the days for which we have S2 observations.
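
With more than a handful of observation days, the per-day requests can be written as a loop. This is a sketch under assumptions: `collect_urls_per_day` is a hypothetical helper, and `fake_get_data_urls` is a stand-in with the same argument order as the `get_data_urls` calls in this notebook, so that the example runs without a configured Data Access Component.

```python
def collect_urls_per_day(get_data_urls, roi, days, data_type):
    """Query each day separately (start == end) and concatenate the results."""
    urls = []
    for day in days:
        urls.extend(get_data_urls(roi, day, day, data_type))
    return urls


def fake_get_data_urls(roi, start_time, end_time, data_type):
    """Stand-in for data_access_component.get_data_urls; returns a made-up path."""
    return ['{}/{}.nc'.format(data_type, start_time)]


s2_days = ['2017-01-16', '2017-01-19']
print(collect_urls_per_day(fake_get_data_urls, 'POLYGON(())', s2_days, 'CAMS'))
```

In a live session, `data_access_component.get_data_urls` would be passed in place of the stand-in.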

In [17]:
cams_urls = data_access_component.get_data_urls(BARRAX_ROI, '2017-1-16', '2017-1-16', 'CAMS')
cams_urls.extend(data_access_component.get_data_urls(BARRAX_ROI, '2017-1-19', '2017-1-19', 'CAMS'))
cams_urls

['E:/Projects/Multiply/Data/CAMS/CAMS/2017/2017-01-16.nc',
 'E:/Projects/Multiply/Data/CAMS/CAMS/2017/2017-01-19.nc']

Finally, we need MODIS data from three days before the first available S2 observation up to three days after the last S2 observation.
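
The padded time range can be derived from the S2 sensing dates rather than typed by hand. A minimal sketch, assuming the dates are available as 'YYYY-MM-DD' strings; `padded_time_range` is a hypothetical helper, not part of the MULTIPLY packages.

```python
from datetime import datetime, timedelta


def padded_time_range(observation_days, padding_days=3, fmt='%Y-%m-%d'):
    """Return (start, end) strings padded by `padding_days` around the observations."""
    days = [datetime.strptime(day, fmt) for day in observation_days]
    start = min(days) - timedelta(days=padding_days)
    end = max(days) + timedelta(days=padding_days)
    return start.strftime(fmt), end.strftime(fmt)


# The two S2 sensing dates found above.
print(padded_time_range(['2017-01-16', '2017-01-19']))
# → ('2017-01-13', '2017-01-22')
```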

In [23]:
modis_start_time = '2017-01-13'
modis_end_time = '2017-01-22'
modis_urls = data_access_component.get_data_urls(BARRAX_ROI, modis_start_time, modis_end_time, 'MCD43A1.006')
modis_urls

INFO:root:Downloading MCD43A1.A2017013.h17v05.006.2017024081921.hdf


100 %

INFO:root:Downloaded MCD43A1.A2017013.h17v05.006.2017024081921.hdf
INFO:root:Downloading MCD43A1.A2017020.h17v05.006.2017029103253.hdf


99 %

INFO:root:Downloaded MCD43A1.A2017020.h17v05.006.2017029103253.hdf
INFO:root:Downloading MCD43A1.A2017021.h17v05.006.2017032034322.hdf


99 %

INFO:root:Downloaded MCD43A1.A2017021.h17v05.006.2017032034322.hdf
INFO:root:Downloading MCD43A1.A2017022.h17v05.006.2017032100642.hdf


100 %

INFO:root:Downloaded MCD43A1.A2017022.h17v05.006.2017032100642.hdf


['E:/Projects/Multiply/Data/MODIS/MCD43A1.006/2017/01/13/MCD43A1.A2017013.h17v05.006.2017024081921.hdf',
 'E:/Projects/Multiply/Data/MODIS/MCD43A1.006/2017/01/14/MCD43A1.A2017014.h17v05.006.2017024085825.hdf',
 'E:/Projects/Multiply/Data/MODIS/MCD43A1.006/2017/01/15/MCD43A1.A2017015.h17v05.006.2017024094450.hdf',
 'E:/Projects/Multiply/Data/MODIS/MCD43A1.006/2017/01/16/MCD43A1.A2017016.h17v05.006.2017025102642.hdf',
 'E:/Projects/Multiply/Data/MODIS/MCD43A1.006/2017/01/17/MCD43A1.A2017017.h17v05.006.2017026133512.hdf',
 'E:/Projects/Multiply/Data/MODIS/MCD43A1.006/2017/01/18/MCD43A1.A2017018.h17v05.006.2017027113353.hdf',
 'E:/Projects/Multiply/Data/MODIS/MCD43A1.006/2017/01/19/MCD43A1.A2017019.h17v05.006.2017028150223.hdf',
 'E:/Projects/Multiply/Data/MODIS/MCD43A1.006/2017/01/20/MCD43A1.A2017020.h17v05.006.2017029103253.hdf',
 'E:/Projects/Multiply/Data/MODIS/MCD43A1.006/2017/01/21/MCD43A1.A2017021.h17v05.006.2017032034322.hdf',
 'E:/Projects/Multiply/Data/MODIS/MCD43A1.006/2017/01/2

We now have all the data we need to execute the atmospheric correction. To run it, all files of a type need to be in the same folder. Let's first define a working directory as the parent folder for the directories of the various data types. This is meant to be a temporary directory that can be deleted later.
For convenience, we define a function create_dir here to ensure that the directories exist.

In [24]:
import os
working_dir = 'E:/Produkte/multiply/working_dir/'
def create_dir(path):
    # Create the directory (and any missing parents) if it does not exist yet.
    os.makedirs(path, exist_ok=True)
create_dir(working_dir)
# os.path.join avoids a doubled separator after the trailing slash of working_dir.
s2_l1c_dir = os.path.join(working_dir, 's2_l1c')
create_dir(s2_l1c_dir)
emus_dir = os.path.join(working_dir, 'emus')
create_dir(emus_dir)
cams_dir = os.path.join(working_dir, 'cams')
create_dir(cams_dir)
modis_dir = os.path.join(working_dir, 'modis')
create_dir(modis_dir)

Because we do not want to move or copy the data, we place symbolic links in these folders instead, using the symbolic-linking functionality of the multiply_orchestration package.
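
To illustrate the idea of linking instead of copying, here is a sketch built only on the Python standard library. It is not the implementation of the orchestration package: `link_into_dir` is a hypothetical helper, and the file names are made up. Note that on Windows, creating symbolic links may require elevated privileges or developer mode.

```python
import os
import tempfile


def link_into_dir(source_paths, target_dir):
    """Create one symbolic link per source path inside target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    links = []
    for source in source_paths:
        # Name each link after the last path component of its source.
        name = os.path.basename(os.path.normpath(source))
        link = os.path.join(target_dir, name)
        if not os.path.lexists(link):
            os.symlink(os.path.abspath(source), link)
        links.append(link)
    return links


# Demonstration in a throwaway directory with an empty, made-up file.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, 'example.nc')
    open(source, 'w').close()
    links = link_into_dir([source], os.path.join(tmp, 'cams'))
    print(os.path.islink(links[0]))
```

Because the sketch names links after the last path component, sources that share that component (such as the two S2 directories above, which both end in `0`) would need a different naming scheme.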

In [22]:
from multiply_orchestration import create_sym_links
