# MULTIPLY SAR Data Access and Pre-Processing

This Jupyter Notebook shows how to use the MULTIPLY platform to retrieve Sentinel-1 (S1) SLC data from the Data Access Component and how to process it into S1 GRD data with the SAR Pre-Processing functionality.

First, let's define working directories.

In [None]:
from vm_support import get_working_dir
name = 'm1'

# create and/or clear working directory
working_dir = get_working_dir(name)
print('Working directory is {}'.format(working_dir))
s1_slc_directory = '{}/s1_slc'.format(working_dir)
s1_grd_directory = '{}/s1_grd'.format(working_dir)

Next, make sure that all data stores are set up (skip this step if you have already done so). To run it, uncomment the lines below and fill in your Earth Data credentials.

In [None]:
#from vm_support import set_earth_data_authentication, set_up_data_stores
#set_up_data_stores()
#username = ''
#password = ''
#set_earth_data_authentication(username, password) # needed to download MODIS data; only has to be done once

We need to define start and end times and a region of interest.

In [None]:
start_time_as_string = '2018-06-01'
stop_time_as_string = '2018-06-10'
roi = 'POLYGON((9.99 53.51,10.01 53.51,10.01 53.49, 9.99 53.49, 9.99 53.51))'
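Before passing these values on, it can help to check that the dates parse and that the polygon ring is closed. The following is a minimal sketch using only the standard library; `parse_wkt_polygon` and `check_period` are hypothetical helpers, not part of the MULTIPLY platform, and the parser only handles single-ring polygons like the one above.

```python
import datetime
import re


def parse_wkt_polygon(wkt):
    """Return the ring of a single-ring WKT polygon as (lon, lat) tuples."""
    match = re.fullmatch(r'\s*POLYGON\s*\(\((.*)\)\)\s*', wkt)
    if match is None:
        raise ValueError('Not a single-ring WKT polygon: {}'.format(wkt))
    ring = [tuple(float(value) for value in point.split())
            for point in match.group(1).split(',')]
    # a valid ring repeats its first point at the end
    if len(ring) < 4 or ring[0] != ring[-1]:
        raise ValueError('Polygon ring must be closed and have at least four points.')
    return ring


def check_period(start, stop):
    """Parse two 'YYYY-MM-DD' strings and make sure they are in order."""
    start_date = datetime.datetime.strptime(start, '%Y-%m-%d')
    stop_date = datetime.datetime.strptime(stop, '%Y-%m-%d')
    if stop_date < start_date:
        raise ValueError('Stop time lies before start time.')
    return start_date, stop_date


ring = parse_wkt_polygon(
    'POLYGON((9.99 53.51,10.01 53.51,10.01 53.49, 9.99 53.49, 9.99 53.51))')
period = check_period('2018-06-01', '2018-06-10')
```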

For the SAR Pre-Processing we require a config file. Let's create it.

In [None]:
from vm_support import create_sar_config_file
create_sar_config_file(temp_dir=working_dir, roi=roi, start_time=start_time_as_string, end_time=stop_time_as_string,
                       s1_slc_directory=s1_slc_directory, s1_grd_directory=s1_grd_directory, temporal_filter='5')
config_file = f'{working_dir}/sar_config.yaml'

Next, set up the Data Access Component.

In [None]:
from multiply_data_access import DataAccessComponent
dac = DataAccessComponent()

The SAR Pre-Processing needs at least 15 products: the product in question plus 7 before and 7 after it. However, it does not count every product as a full product: products located close to a border are counted as half products. Because the SAR Pre-Processor decides whether a product counts as a full or a half product, we need it to determine which products are required. It can only do so with access to the products, so we may already have to download some of them at this stage.
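The counting rule can be sketched as follows. This is an illustration only: it assumes, as the cells below do, that the first entry of the file lists holds the full products and the second entry the products counted as half. `weighted_product_count` and the file names are hypothetical.

```python
def weighted_product_count(file_lists):
    """Count full products as 1 and border products as 0.5."""
    full_products, border_products = file_lists[0], file_lists[1]
    return len(full_products) + len(border_products) / 2.0


# Hypothetical file names, for illustration only.
example_lists = (['full_1.zip', 'full_2.zip', 'full_3.zip'],
                 ['border_1.zip', 'border_2.zip'])
count = weighted_product_count(example_lists)
print(count)  # five files, but a weighted count of 4.0
```

This is why the number of files alone does not tell us whether enough products have been collected.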

Let's start with determining the actual start date:

In [None]:
import datetime
import logging
import os
from vm_support import create_sym_links
from sar_pre_processing import SARPreProcessor
logging.basicConfig(level=logging.INFO)  # make the logging.info messages below visible
one_day = datetime.timedelta(days=1)

before_sar_dir = f'{s1_slc_directory}/before'
os.makedirs(before_sar_dir, exist_ok=True)

start = datetime.datetime.strptime(start_time_as_string, '%Y-%m-%d')
before = start
num_before = 0
while num_before < 7:
    # widen the search period by one day into the past
    before -= one_day
    before_date = before.strftime('%Y-%m-%d')
    data_urls_before = dac.get_data_urls(roi, before_date, start_time_as_string, 'S1_SLC')
    create_sym_links(data_urls_before, before_sar_dir)
    processor = SARPreProcessor(config=config_file, input=before_sar_dir, output=before_sar_dir)
    file_lists = processor.create_processing_file_list()
    # the first list is counted in full, the second list (border products) as halves
    num_before = len(file_lists[0]) + (len(file_lists[1]) / 2.)
logging.info(f'Set start date for collecting S1 SLC products to {before_date}.')

Now we determine the actual end date, taking care not to set it in the future.

In [None]:
after_sar_dir = f'{s1_slc_directory}/after'
os.makedirs(after_sar_dir, exist_ok=True)

end = datetime.datetime.strptime(stop_time_as_string, '%Y-%m-%d')
after = end
num_after = 0
while num_after < 7 and after < datetime.datetime.today():
    # widen the search period by one day into the future, but not beyond today
    after += one_day
    after_date = after.strftime('%Y-%m-%d')
    data_urls_after = dac.get_data_urls(roi, stop_time_as_string, after_date, 'S1_SLC')
    create_sym_links(data_urls_after, after_sar_dir)
    processor = SARPreProcessor(config=config_file, input=after_sar_dir, output=after_sar_dir)
    file_lists = processor.create_processing_file_list()
    # the first list is counted in full, the second list (border products) as halves
    num_after = len(file_lists[0]) + (len(file_lists[1]) / 2.)
logging.info(f'Set end date for collecting S1 SLC products to {after_date}.')
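Both searches follow the same pattern: move a date one day at a time until enough products are found. The sketch below isolates that pattern and adds an upper bound on the number of steps, so that the search stops if no data is available at all. `find_boundary` and `count_for` are hypothetical names. In the notebook, `count_for` would stand for the combination of `get_data_urls`, `create_sym_links` and `create_processing_file_list`; here it is replaced by a stub that pretends there is one product every six days.

```python
import datetime


def find_boundary(count_for, anchor, step, required=7.0, max_steps=120):
    """Shift `anchor` by `step` until `count_for(date)` reaches `required`.

    Returns the boundary date and the count found there. Raises a
    RuntimeError if the count is not reached within `max_steps` steps.
    """
    current = anchor
    for _ in range(max_steps):
        current += step
        count = count_for(current)
        if count >= required:
            return current, count
    raise RuntimeError('Not enough products found within {} steps.'.format(max_steps))


anchor = datetime.datetime(2018, 6, 1)


def stub_count(date):
    # stand-in for the real query: one product every six days before the anchor
    return (anchor - date).days // 6


boundary, found = find_boundary(stub_count, anchor, -datetime.timedelta(days=1))
print(boundary.strftime('%Y-%m-%d'), found)
```

With the stub, seven products require going back 42 days from the anchor date.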

We created extra directories for collecting the products. Let's clean up here.

In [None]:
import shutil

shutil.rmtree(before_sar_dir)
shutil.rmtree(after_sar_dir)

Now, we are finally set to collect the data:

In [None]:
sar_data_urls = dac.get_data_urls(roi, before_date, after_date, 'S1_SLC')
create_sym_links(sar_data_urls, s1_slc_directory)

Now that the data has been collected, we can run the actual SAR Pre-Processing. The processing consists of three steps. The first two steps create one output product for each input product, while the third step merges information from multiple products. We can therefore safely run steps 1 and 2 on all collected input data now.

In [None]:
processor = SARPreProcessor(config=config_file, input=s1_slc_directory, output=s1_grd_directory)
processor.create_processing_file_list()
logging.info('Start Pre-processing step 1')
processor.pre_process_step1()
logging.info('Finished Pre-processing step 1')
logging.info('Start Pre-processing step 2')
processor.pre_process_step2()
logging.info('Finished Pre-processing step 2')

Step 3 needs to be performed for each product separately. To do this, we need to make sure we hand in the correct products only. The output of the second step is located in an intermediate folder. First, we collect all these files and sort them temporally.

In [None]:
import glob

output_step2_dir = f'{s1_grd_directory}/step2'
sorted_input_files = glob.glob(f'{output_step2_dir}/*.dim')
# sort by the 15-character time stamp embedded in each file name
sorted_input_files.sort(key=lambda x: x[len(output_step2_dir) + 18:len(output_step2_dir) + 33])
sorted_input_files
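The fixed offsets used for sorting depend on the exact length of the file name prefix. As an alternative, the time stamp can be looked up with a regular expression. This is a sketch only: it assumes that the first `YYYYMMDDTHHMMSS` pattern in a file name is the time to sort by, and the file names are made up for illustration.

```python
import os
import re

TIME_STAMP = re.compile(r'\d{8}T\d{6}')


def first_time_stamp(path):
    """Return the first YYYYMMDDTHHMMSS time stamp in the file name."""
    match = TIME_STAMP.search(os.path.basename(path))
    if match is None:
        raise ValueError('No time stamp found in {}'.format(path))
    return match.group(0)


# Hypothetical file names, for illustration only.
example_files = [
    '/data/step2/S1B_IW_SLC__1SDV_20180607T053349_20180607T053416_011260_014AB1_0000_GC_RC_No_Su_Co.dim',
    '/data/step2/S1A_IW_SLC__1SDV_20180601T053421_20180601T053448_022156_02657B_0000_GC_RC_No_Su_Co.dim',
]
example_files.sort(key=first_time_stamp)
```

Because the time stamps are zero-padded, sorting them as strings also sorts them chronologically.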

Now we can run the third step of the SAR Pre-Processing for every product that has at least 7 products before and 7 products after it. To do so, we first create the file list and then remove all files from it that shall not be considered in the respective run.

In [None]:
output_step3_dir = f'{s1_grd_directory}/step3'

for end in range(14, len(sorted_input_files)):
    file_list = processor.create_processing_file_list()
    start = end - 14
    sub_list = sorted_input_files[start:end]
    for sub_file_list in file_list:
        # iterate over a copy: removing from a list while looping over it skips entries
        for file_name in sub_file_list[:]:
            processed_name = file_name.replace('.zip', '_GC_RC_No_Su_Co.dim')
            processed_name = processed_name.replace(s1_slc_directory, output_step2_dir)
            if processed_name not in sub_list:
                sub_file_list.remove(file_name)
    processor.set_file_list(file_list)
    logging.info(f'Start Pre-processing step 3, run {start}')
    processor.pre_process_step3()
    logging.info(f'Finished Pre-processing step 3, run {start}')
    # move the results of this run to the output directory
    for file_name in os.listdir(output_step3_dir):
        shutil.move(os.path.join(output_step3_dir, file_name), s1_grd_directory)
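Pruning the file list is the delicate part of this loop, because removing entries from a list while iterating over the same list causes entries to be skipped. The sketch below shows one way to filter the lists in place. `keep_only` and the paths are hypothetical; the name mapping from `.zip` to `_GC_RC_No_Su_Co.dim` is the one used in the cell above.

```python
def keep_only(file_lists, allowed, slc_dir, step2_dir):
    """Keep only those inputs whose step 2 output is listed in `allowed`."""
    for file_list in file_lists:
        # slice assignment replaces the content but keeps the same list object
        file_list[:] = [
            name for name in file_list
            if name.replace('.zip', '_GC_RC_No_Su_Co.dim').replace(slc_dir, step2_dir) in allowed
        ]
    return file_lists


# Hypothetical paths, for illustration only.
example_lists = [['/slc/a.zip', '/slc/b.zip', '/slc/c.zip'], ['/slc/d.zip']]
allowed = ['/grd/step2/c_GC_RC_No_Su_Co.dim']
keep_only(example_lists, allowed, '/slc', '/grd/step2')
print(example_lists)
```

Only `/slc/c.zip` remains in the first list, and the second list is empty. A loop that calls `remove` on the list it iterates over would leave `/slc/b.zip` in place.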