# 3 Setup
All the code for the individual modules is located at https://github.com/multiply-org/. It can be used to set up the MULTIPLY framework on your own computing infrastructure. At present, however, no deployment setup (in the form of Windows setup executables or Anaconda packages) exists. While this is planned for later in the project, the focus at this stage is on testing the individual components themselves. Please let us know if you would prefer to install the software yourself on dedicated computing infrastructure, so that we can investigate how to facilitate this for you.
To facilitate testing of the framework itself, we have set up this Virtual Machine on Google Compute Engine.


## 3.1 Parameters
Here you set the parameters for the run.


### Define working_directory

In [None]:
working_directory_name = 'OVP_test_pynb39_20220717'

### 3.1.1 Define region of Interest
* **roi**: A region of interest, given as a Polygon in WKT format. You can use this tool ( https://arthur-e.github.io/Wicket/sandbox-gmaps3.html ) to obtain the WKT definition of a region in WGS84 coordinates.
* **roi_grid**: The EPSG-code of the spatial reference system in which the roi is given. If it is set to 'none', it is assumed that the roi is given in WGS84 coordinates.
* **destination_grid**: The EPSG-code of the spatial reference system in which the output shall be given. If it is set to 'none', the platform will attempt to derive it from the roi_grid.
* **spatial_resolution**: The resolution of the output data. It must be a positive integer, is given in meters, and is the same for both dimensions.

In [2]:
roi = 'POLYGON(( 5.695 52.26,  5.695 52.25,  5.680 52.25,  5.680 52.26,  5.695 52.26 ))' #Speulderbos. SIAC not working on this area possibly due to tiling
roi = 'POLYGON ((5.163574 52.382529, 5.163574 52.529813, 5.493164 52.529813, 5.493164 52.382529, 5.163574 52.382529))' #OVP. Correct in MUI version
roi_grid = 'EPSG:4326'                            # WGS84
destination_grid = 'EPSG:4326'                   # WGS84
spatial_resolution = 20 # in m
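
To get a feeling for how large a region is before starting a run, the extent of the roi can be estimated from the WKT string alone. The following is a minimal sketch, not part of the MULTIPLY packages: the helper names `wkt_polygon_bounds` and `approx_pixel_count` are hypothetical, the parser only handles simple polygons without holes, and the conversion from degrees to meters is a rough approximation that assumes WGS84 input.

```python
import math
import re


def wkt_polygon_bounds(wkt: str):
    """Return (min_lon, min_lat, max_lon, max_lat) of a simple WKT polygon."""
    # Every "x y" pair in the string is treated as one vertex.
    pairs = re.findall(r'(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)', wkt)
    if not pairs:
        raise ValueError('No coordinate pairs found in WKT string')
    lons = [float(x) for x, _ in pairs]
    lats = [float(y) for _, y in pairs]
    return min(lons), min(lats), max(lons), max(lats)


def approx_pixel_count(wkt: str, resolution_m: float) -> int:
    """Rough number of output pixels covering the bounding box of the roi."""
    min_lon, min_lat, max_lon, max_lat = wkt_polygon_bounds(wkt)
    metres_per_degree = 111_320.0  # approximate length of one degree of latitude
    mean_lat = math.radians((min_lat + max_lat) / 2)
    width_m = (max_lon - min_lon) * metres_per_degree * math.cos(mean_lat)
    height_m = (max_lat - min_lat) * metres_per_degree
    return math.ceil(width_m / resolution_m) * math.ceil(height_m / resolution_m)


example_roi = ('POLYGON ((5.163574 52.382529, 5.163574 52.529813, 5.493164 52.529813, '
               '5.493164 52.382529, 5.163574 52.382529))')
bounds = wkt_polygon_bounds(example_roi)
print(bounds)
print(approx_pixel_count(example_roi, 20))
```

Doubling the resolution value (for example from 20 m to 40 m) reduces the estimated pixel count to roughly a quarter.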

### 3.1.2 Define temporal frequency and period
* **start_time**: The start date of the period you are interested in (set as `start_time_as_string` below). It must be given in the format 'Year-Month-Day'.
* **end_time**: The end date of the period you are interested in (set as `stop_time_as_string` below). It must be given in the format 'Year-Month-Day'.
* **time_step**: The temporal resolution of the output. Data will be aggregated over the period denoted by this parameter. It must be a positive integer. The unit is days.

In [3]:
start_time_as_string = '2019-04-16'
stop_time_as_string = '2019-04-17'

start_time_as_string = '2018-04-16'
stop_time_as_string = '2018-04-20'

time_step = 5 # in days


### 3.1.3 Variables
* **variables**: The list of the biophysical variables that shall be derived. Please do not change this list, as the underlying forward model requires all of them. The parameters are as follows:
  * **n**: Structural parameter
  * **cab**: Leaf Chlorophyll Content, given in ug/cm²
  * **car**: Leaf Carotenoid Content, given in ug/cm²
  * **cb**: Leaf senescent material
  * **cw**: Leaf Water Content, given in cm
  * **cdm**: Leaf Dry Mass, given in g/cm²
  * **lai**: Effective Leaf Area Index, given in m²/m²
  * **ala**: Average Leaf Angle, given in degrees
  * **bsoil**: Soil Brightness Parameter
  * **psoil**: Soil Wetness Parameter
* **file_mask**: A file that can be used to explicitly state the region you are interested in. You can also use it to mask out single pixels within this region. If this is not 'none', the aforementioned parameters roi_grid, spatial_resolution, and destination_grid are not used.

**HINT**: The platform runs faster the smaller your roi and the coarser the spatial resolution (i.e. the larger the value in meters).

In [4]:
variables = {'n', 'cab', 'car', 'cb', 'cw', 'cdm', 'lai', 'ala', 'bsoil', 'psoil'}
file_mask = None

## 3.2 Load internal packages and auxiliary methods

In [5]:
from multiply_data_access import DataAccessComponent
from vm_support.utils import create_config_file, set_permissions
from vm_support.sym_linker import create_sym_links
import datetime
import glob

## 3.3 Defining the additional interfaces

In [6]:
def get_static_data(data_access_component: DataAccessComponent, roi: str, roi_grid: str, start_time: str,
                    stop_time: str, emulation_directory: str, dem_directory: str):
    create_dir(emulation_directory)
    create_dir(dem_directory)

    rg = roi_grid
    if roi_grid == 'none':
        rg = None

    print('Retrieving emulators ...')
    emu_urls = data_access_component.get_data_urls(roi, start_time, stop_time, 'ISO_MSI_A_EMU,ISO_MSI_B_EMU', rg)
    set_permissions(emu_urls)
    create_sym_links(emu_urls, emulation_directory)
    print('Retrieving DEM ...')
    dem_urls = data_access_component.get_data_urls(roi, start_time, stop_time, 'Aster_DEM', rg)
    set_permissions(dem_urls)
    create_sym_links(dem_urls, dem_directory)
    print('Done retrieving static data')


In [7]:
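import os


def create_dir(dir):
    # Create the directory (and any missing parents); do nothing if it already exists.
    try:
        os.makedirs(dir, exist_ok=True)
    except OSError as error:
        # Report the failing path together with the reason instead of hiding the error.
        print('Could not create directory {}: {}'.format(dir, error))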

In [8]:
def get_dynamic_data(data_access_component: DataAccessComponent, roi: str, roi_grid: str, start_time: str,
                     stop_time: str, modis_directory: str, cams_tiff_directory: str, s2_l1c_directory: str):
    create_dir(modis_directory)
    create_dir(cams_tiff_directory)
    create_dir(s2_l1c_directory)

    modis_delta = datetime.timedelta(days=16)
    start = datetime.datetime.strptime(start_time, '%Y-%m-%d')
    modis_start = start - modis_delta
    modis_start_time = datetime.datetime.strftime(modis_start, '%Y-%m-%d')
    end = datetime.datetime.strptime(stop_time, '%Y-%m-%d')
    modis_end = end + modis_delta
    modis_end_time = datetime.datetime.strftime(modis_end, '%Y-%m-%d')
    
    rg = roi_grid
    if roi_grid == 'none':
        rg = None

    print('Retrieving MODIS BRDF descriptors ...')
    modis_urls = data_access_component.get_data_urls(roi, modis_start_time, modis_end_time, 'MCD43A1.006', rg)
    set_permissions(modis_urls)
    create_sym_links(modis_urls, modis_directory)
    print('Retrieving CAMS data ...')
    cams_urls = data_access_component.get_data_urls(roi, start_time, stop_time, 'CAMS_TIFF', rg)
    set_permissions(cams_urls)
    create_sym_links(cams_urls, cams_tiff_directory)
    print('Retrieving S2 L1C data ...')
    s2_urls = data_access_component.get_data_urls(roi, start_time, stop_time, 'AWS_S2_L1C, S2_L1C', rg)
    set_permissions(s2_urls)
    create_sym_links(s2_urls, s2_l1c_directory)
    print('Done retrieving dynamic data')
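
Note that `get_dynamic_data` requests MODIS data for a period that is extended by 16 days on either side of the requested time step. This padding can be sketched in isolation as follows; the helper name `pad_period` is hypothetical and only mirrors the date arithmetic of the function above.

```python
import datetime

DATE_FORMAT = '%Y-%m-%d'


def pad_period(start_time: str, stop_time: str, days: int = 16):
    """Extend a period given as 'Year-Month-Day' strings by `days` on both sides."""
    delta = datetime.timedelta(days=days)
    start = datetime.datetime.strptime(start_time, DATE_FORMAT) - delta
    stop = datetime.datetime.strptime(stop_time, DATE_FORMAT) + delta
    return start.strftime(DATE_FORMAT), stop.strftime(DATE_FORMAT)


print(pad_period('2018-04-16', '2018-04-20'))
# ('2018-03-31', '2018-05-06')
```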

In [9]:
import os
import shutil


def get_working_dir(dir_name: str) -> str:
    # Returns an empty working directory.
    # WARNING: an existing directory with the same name is deleted first.
    working_dir = f'/datastore/working_dirs/{dir_name}'
    if os.path.exists(working_dir):
        shutil.rmtree(working_dir)
    os.makedirs(working_dir)
    return working_dir
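
Because `get_working_dir` removes any existing directory of the same name, results of earlier runs are lost when the cell is executed again. If you would rather keep them, a variant with an explicit switch could look like the sketch below. The function name `prepare_working_dir` and its `base_dir` and `clean` parameters are hypothetical and not part of the MULTIPLY packages; the demonstration runs in a temporary directory so that it does not touch `/datastore`.

```python
import os
import shutil
import tempfile


def prepare_working_dir(base_dir: str, dir_name: str, clean: bool = True) -> str:
    """Create the working directory; wipe an existing one only if `clean` is True."""
    working_dir = os.path.join(base_dir, dir_name)
    if clean and os.path.exists(working_dir):
        shutil.rmtree(working_dir)
    os.makedirs(working_dir, exist_ok=True)
    return working_dir


# Demonstration in a throw-away directory.
with tempfile.TemporaryDirectory() as base:
    first = prepare_working_dir(base, 'demo_run')
    open(os.path.join(first, 'marker.txt'), 'w').close()
    reused = prepare_working_dir(base, 'demo_run', clean=False)
    print(os.listdir(reused))   # ['marker.txt']
    cleaned = prepare_working_dir(base, 'demo_run', clean=True)
    print(os.listdir(cleaned))  # []
```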

# 4 Running MULTIPLY
The code for running the MULTIPLY framework is provided below.
We start by setting the Earthdata authentication. This is required to download the MODIS BRDF descriptors, which are needed for the atmospheric correction of the Sentinel-2 data. You can get credentials when you register at https://urs.earthdata.nasa.gov/profile . Registration and use are free of charge. If you do not register, you can only use the MODIS data that have been downloaded in previous runs of the notebook by other users.
You will also need to set up the data stores so that the data access component works correctly and finds the pre-configured data stores. Both steps need to be performed only once.

## 4.1 Creating working directory
For this notebook, you will operate in your own working directory. All data you use will be copied here, all output will be written here.

In [10]:
start_time_as_datetime = datetime.datetime.strptime(start_time_as_string, '%Y-%m-%d')
stop_time_as_datetime = datetime.datetime.strptime(stop_time_as_string, '%Y-%m-%d')

time_step_as_time_delta = datetime.timedelta(days=time_step)

In [11]:
# Setup clean working directory
name = working_directory_name
working_dir = get_working_dir(name)

# use previous (non-empty) working directory
# working_dir = '/Data/test_user_16/' + name
# working_dir = '/datastore/working_dirs/' + name

In [12]:
print('Working directory is {}'.format(working_dir))

priors_directory = '{}/priors'.format(working_dir)
hres_state_dir = '{}/hresstate'.format(working_dir)
modis_directory = '{}/modis'.format(working_dir)
state_directory = '{}/state'.format(working_dir)
cams_directory = '{}/cams'.format(working_dir)
s2_l1c_directory = '{}/s2'.format(working_dir)
sdrs_directory = '{}/sdrs'.format(working_dir)
biophys_output = '{}/biophys'.format(working_dir)
emulators_directory = '{}/emulators'.format(working_dir)
dem_directory = '{}/dem'.format(working_dir)

print(working_dir)

Working directory is /datastore/working_dirs/OVP_test_pynb39_20220717
/datastore/working_dirs/OVP_test_pynb39_20220717


## 4.2 Acquire Data
We differentiate between two types of data here, dynamic and static: data which are valid for a certain period of time and data which are valid permanently. The latter are the elevation data of the Digital Elevation Model and the emulators required for the atmospheric correction. We put them in their designated folders before we start our loop through time.

In [13]:
data_access_component = DataAccessComponent()

INFO:root:Read data store cams
INFO:root:Read data store cams_tiff
INFO:root:Read data store emulators
INFO:root:Read data store wv_emulator
INFO:root:Read data store aster_dem
INFO:root:Read data store modis_mcd43a1
INFO:root:Read data store S2L2
INFO:root:Read data store aws_s2


### 4.2.1 Download Static Data

In [14]:
get_static_data(data_access_component=data_access_component, roi=roi, roi_grid=roi_grid,
                start_time=start_time_as_string, stop_time=stop_time_as_string,
                emulation_directory=emulators_directory, dem_directory=dem_directory)

Retrieving emulators ...


  if polygon_1.almost_equals(polygon_2):
INFO:ComponentProgress:0
INFO:ComponentProgress:8
INFO:ComponentProgress:16
INFO:ComponentProgress:25
INFO:ComponentProgress:33
INFO:ComponentProgress:41
INFO:ComponentProgress:50
INFO:ComponentProgress:58
INFO:ComponentProgress:66
INFO:ComponentProgress:75
INFO:ComponentProgress:83
INFO:ComponentProgress:91
  coverage = cascaded_union(coverages)


Retrieving DEM ...


  coverage = cascaded_union(coverages)
  coverage = cascaded_union(coverages)
INFO:ComponentProgress:0


Done retrieving static data


The stepping through time works as follows: Dedicated directories are set up for the MODIS, CAMS, and S2 data. These data are retrieved and put into these directories. After that, pre-processing takes place either on the whole S2 image or only on the region of interest. If it has been performed on the whole image, the result is saved permanently. Next, priors are derived for every variable and every day within the current period. Once all of these have been gathered, the inference can finally begin. The state of the inference engine is saved and taken into account during the next iteration.

### 4.2.2 Download Dynamic Data
First, for explanation, we perform the inference over a single time step. Along the way, we show all intermediate results to illustrate the processing flow.

In [15]:
# Setting up
cursor = start_time_as_datetime
previous_inference_state = None #'none'
updated_inference_state = 'none'
one_day_step = datetime.timedelta(days=1)
preprocess_only_region_of_interest = True

date_as_string = datetime.datetime.strftime(cursor, '%Y-%m-%d')

cursor += time_step_as_time_delta
cursor -= one_day_step
if cursor > stop_time_as_datetime:
    cursor = stop_time_as_datetime
next_date_as_string = datetime.datetime.strftime(cursor, '%Y-%m-%d')
cursor += one_day_step
cursor_as_string = datetime.datetime.strftime(cursor, '%Y-%m-%d')

modis_directory_for_date = '{}/{}'.format(modis_directory, date_as_string)
cams_directory_for_date = '{}/{}'.format(cams_directory, date_as_string)
s2_l1c_directory_for_date = '{}/{}'.format(s2_l1c_directory, date_as_string)
sdrs_directory_for_date = '{}/{}'.format(sdrs_directory, date_as_string)
priors_directory_for_date = '{}/{}/'.format(priors_directory, date_as_string)
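
The cursor arithmetic in the cell above covers a single time step. Generalised to the whole period, it can be sketched as a loop that yields one (start, end) pair per step. This is an illustration only: the function name `time_windows` is hypothetical, and the loop condition (continue while the cursor has not passed the stop date) is an assumption, since the notebook itself only performs the first step.

```python
import datetime

DATE_FORMAT = '%Y-%m-%d'


def time_windows(start_time: str, stop_time: str, time_step: int):
    """Split [start_time, stop_time] into consecutive windows of `time_step` days."""
    if time_step < 1:
        raise ValueError('time_step must be at least one day')
    one_day = datetime.timedelta(days=1)
    step = datetime.timedelta(days=time_step)
    stop = datetime.datetime.strptime(stop_time, DATE_FORMAT)
    cursor = datetime.datetime.strptime(start_time, DATE_FORMAT)
    windows = []
    while cursor <= stop:
        # Both ends of a window are inclusive; the last window is clipped to stop_time.
        window_end = min(cursor + step - one_day, stop)
        windows.append((cursor.strftime(DATE_FORMAT), window_end.strftime(DATE_FORMAT)))
        cursor = window_end + one_day
    return windows


print(time_windows('2018-04-16', '2018-04-20', 5))
# [('2018-04-16', '2018-04-20')]
print(time_windows('2018-04-16', '2018-04-20', 2))
# [('2018-04-16', '2018-04-17'), ('2018-04-18', '2018-04-19'), ('2018-04-20', '2018-04-20')]
```

With the parameters chosen in this notebook (five days from 2018-04-16 to 2018-04-20 and a time step of 5), the whole period falls into a single window.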


In [16]:
get_dynamic_data(data_access_component, roi, roi_grid, date_as_string, next_date_as_string,
                 modis_directory_for_date, cams_directory_for_date, s2_l1c_directory_for_date)

Retrieving MODIS BRDF descriptors ...


INFO:ComponentProgress:0
INFO:ComponentProgress:2
INFO:ComponentProgress:5
INFO:ComponentProgress:8
INFO:ComponentProgress:10
INFO:ComponentProgress:13
INFO:ComponentProgress:16
INFO:ComponentProgress:18
INFO:ComponentProgress:21
INFO:ComponentProgress:24
INFO:ComponentProgress:27
INFO:ComponentProgress:29
INFO:ComponentProgress:32
INFO:ComponentProgress:35
INFO:ComponentProgress:37
INFO:ComponentProgress:40
INFO:ComponentProgress:43
INFO:ComponentProgress:45
INFO:ComponentProgress:48
INFO:ComponentProgress:51
INFO:ComponentProgress:54
INFO:ComponentProgress:56
INFO:ComponentProgress:59
INFO:ComponentProgress:62
INFO:ComponentProgress:64
INFO:ComponentProgress:67
INFO:ComponentProgress:70
INFO:ComponentProgress:72
INFO:ComponentProgress:75
INFO:ComponentProgress:78
INFO:ComponentProgress:81
INFO:ComponentProgress:83
INFO:ComponentProgress:86
INFO:ComponentProgress:89
INFO:ComponentProgress:91
INFO:ComponentProgress:94
INFO:ComponentProgress:97


Retrieving CAMS data ...


INFO:ComponentProgress:0
INFO:ComponentProgress:20
INFO:ComponentProgress:40
INFO:ComponentProgress:60
INFO:ComponentProgress:80


Retrieving S2 L1C data ...


INFO:ComponentProgress:0
INFO:ComponentProgress:25
INFO:ComponentProgress:50
INFO:ComponentProgress:75


Done retrieving dynamic data


# The Contributing Team
<img src="pics/Logos.png">
