<img src="NotebookAddons/blackboard-banner.png" width="100%" />
<font face="Calibri">
<br>
<font size="5"> <b>Prepare a SAR Data Stack</b><img style="padding: 7px" src="NotebookAddons/UAFLogo_A_647.png" width="170" align="right"/></font>

<br>
<font size="4"> <b> Joseph H Kennedy and Alex Lewandowski; Alaska Satellite Facility </b> <br>
</font>

<font size="3"> This notebook downloads an ASF-HyP3 RTC project and prepares a deep multi-temporal SAR image data stack for use in other notebooks.</font></font>

<hr>
<font face="Calibri" size="5" color="darkred"> <b>Important Note about JupyterHub</b> </font>
<br><br>
<font face="Calibri" size="3"> <b>Your JupyterHub server will automatically shut down when left idle for more than 1 hour. Your notebooks will not be lost, but you will have to restart their kernels and re-run them from the beginning. You will not be able to seamlessly continue running a partially run notebook.</b> </font>


<hr>
<font face="Calibri">

<font size="5"> <b> 0. Importing Relevant Python Packages </b> </font>

<font size="3">In this notebook we will use the following scientific libraries:
<ol type="1">
    <li> <b><a href="https://www.gdal.org/" target="_blank">GDAL</a></b> is a software library for reading and writing raster and vector geospatial data formats. It includes a collection of programs tailored for geospatial data processing. Most modern GIS systems (such as ArcGIS or QGIS) use GDAL in the background.</li>
    <li> <b><a href="http://www.numpy.org/" target="_blank">NumPy</a></b> is one of the principal packages for scientific computing in Python. It provides efficient large multidimensional arrays and matrices, along with an extensive collection of high-level mathematical functions for operating on them. </li>
</ol>
</font>
<br>
<font face="Calibri" size="3"><b>Our first step is to import them:</b> </font>

In [None]:
%%capture
import copy
import os
import glob
import json # for loads
import shutil
import re
from osgeo import gdal  # the bare "import gdal" form is deprecated
import numpy as np

from IPython.display import display, clear_output, Markdown

import asf_notebook as asfn

try:
    from hyp3_sdk import Batch
except ImportError:
    # Install the pinned hyp3-sdk version only if it is missing
    !python -m pip install hyp3-sdk==0.5 --user
    from hyp3_sdk import Batch

<hr>
<font face="Calibri">

<font size="5"> <b> 1. Load Your Own Data Stack Into the Notebook </b> </font>

<font size="3"> This notebook assumes that you've created your own data stack over your personal area of interest using the <a href="https://www.asf.alaska.edu/" target="_blank">Alaska Satellite Facility's</a> value-added product system HyP3, available via <a href="https://search.asf.alaska.edu/#/" target="_blank">ASF Data Search (Vertex)</a>. HyP3 is an ASF service used to prototype value-added products and provide them to users to collect feedback.

This lab expects <a href="https://media.asf.alaska.edu/uploads/RTC/rtc_atbd_v1.2_final.pdf" target="_blank">Radiometric Terrain Corrected</a> (RTC) image products as input, so be sure to select an RTC process when creating the project for your input data within HyP3. Use a single orbit geometry **(choose ascending or descending, not both)** to keep geometric differences between images low.

We will retrieve HyP3 data via the HyP3 API. As both HyP3 and the Notebook environment sit in the <a href="https://aws.amazon.com/" target="_blank">Amazon Web Services (AWS)</a> cloud, data transfer is quick and cost effective.</font>
</font>

<hr>
<font face="Calibri" size="3"> Before we download anything, create a working directory for this analysis and change into it.
<br><br>
<b>Select or create a working directory for the analysis:</b></font>

In [None]:
while True:
    data_dir = asfn.input_path(f"\nPlease enter the name of a directory in which to store your data for this analysis.")
    if os.path.exists(data_dir):
        contents = glob.glob(f'{data_dir}/*')
        if len(contents) > 0:
            choice = asfn.handle_old_data(data_dir, contents)
            if choice == 1:
                shutil.rmtree(data_dir)
                os.mkdir(data_dir)
                break
            elif choice == 2:
                break
            else:
                clear_output()
                continue
        else:
            break
    else:
        os.mkdir(data_dir)
        break

<font face="Calibri" size="3"><b>Change into the analysis directory:</b></font>

In [None]:
analysis_directory = f"{os.getcwd()}/{data_dir}"
os.chdir(analysis_directory)
print(f"Current working directory: {os.getcwd()}")

<font face="Calibri" size="3"><b>Create a folder in which to download your RTC products.</b> </font>

In [None]:
rtc_path = "rtc_products"
asfn.new_directory(rtc_path)
products_path = f"{analysis_directory}/{rtc_path}"

<font face="Calibri" size="3"><b>Create a HyP3 object and authenticate using your Earthdata Credentials</b> </font>

In [None]:
hyp3 = asfn.hyp3_auth()

<font face="Calibri" size="3"><b>List your projects and select one:</b></font>

In [None]:
projects = asfn.get_RTC_projects(hyp3)

if len(projects) > 0:
    display(Markdown("<text style='color:darkred;'>Note: After selecting a project, you must select the next cell before hitting the 'Run' button or typing Shift/Enter.</text>"))
    display(Markdown("<text style='color:darkred;'>Otherwise, you will simply rerun this code cell.</text>"))
    print('\nSelect a Project:')
    project_select = asfn.select_parameter(projects)

project_select

<font face="Calibri" size="3"><b>Select a date range of products to download:</b> </font>

In [None]:
project = project_select.value
jobs = hyp3.find_jobs(name=project)
jobs = jobs.filter_jobs(running=False, include_expired=False)
jobs = Batch([job for job in jobs if job.job_type.startswith('RTC')])

if len(jobs) < 1:
    raise ValueError("There are no unexpired RTC products for this project.\nSelect a different project or rerun your jobs in Vertex.")

display(Markdown("<text style='color:darkred;'>Note: After selecting a date range, you should select the next cell before hitting the 'Run' button or typing Shift/Enter.</text>"))
display(Markdown("<text style='color:darkred;'>Otherwise, you may simply rerun this code cell.</text>"))
print('\nSelect a Date Range:')
dates = asfn.get_job_dates(jobs)
date_picker = asfn.gui_date_picker(dates)
date_picker

<font face="Calibri" size="3"><b>Save the selected date range:</b> </font>

In [None]:
date_range = asfn.get_slider_vals(date_picker)
date_range[0] = date_range[0].date()
date_range[1] = date_range[1].date()
print(f"Date Range: {str(date_range[0])} to {str(date_range[1])}")
project = asfn.filter_jobs_by_date(jobs, date_range)

<font face="Calibri" size="3"><b>Gather the available paths and orbit directions for the remaining products:</b></font>

In [None]:
display(Markdown("<text style='color:darkred;'><text style='font-size:150%;'>This may take some time for projects containing many jobs...</text></text>"))
project = asfn.get_paths_orbits(project)
paths = set()
orbit_directions = set()
for p in project:
    paths.add(p.path)
    orbit_directions.add(p.orbit_direction)
paths.add('All Paths')
display(Markdown(f"<text style=color:blue><text style='font-size:175%;'>Done.</text></text>"))

<hr>
<font face="Calibri" size="3"><b>Select a path or paths (use shift or ctrl to select multiple paths):</b></font>

In [None]:
display(Markdown("<text style='color:darkred;'>Note: After selecting a path, you must select the next cell before hitting the 'Run' button or typing Shift/Enter.</text>"))
display(Markdown("<text style='color:darkred;'>Otherwise, you will simply rerun this code cell.</text>"))
print('\nSelect a Path:')
path_choice = asfn.select_mult_parameters(paths)
path_choice

<font face="Calibri" size="3"><b>Save the selected flight path/s:</b></font>

In [None]:
flight_path = path_choice.value
if flight_path:
    # The selection is a collection of paths; 'All Paths' is the catch-all option
    if 'All Paths' in flight_path:
        print('Flight Path: All Paths')
    else:
        print(f"Flight Path: {flight_path}")
else:
    print("WARNING: You must select a flight path in the previous cell, then rerun this cell.")

<font face="Calibri" size="3"><b>Select an orbit direction:</b></font>

In [None]:
if len(orbit_directions) > 1:
    display(Markdown("<text style='color:red;'>Note: After selecting a flight direction, you must select the next cell before hitting the 'Run' button or typing Shift/Enter.</text>"))
    display(Markdown("<text style='color:red;'>Otherwise, you will simply rerun this code cell.</text>"))
print('\nSelect a Flight Direction:')
direction_choice = asfn.select_parameter(orbit_directions, 'Direction:')
direction_choice

<font face="Calibri" size="3"><b>Save the selected orbit direction:</b></font>

In [None]:
direction = direction_choice.value
print(f"Orbit Direction: {direction}")

<font face="Calibri" size="3"><b>Filter jobs by path and orbit direction:</b></font>

In [None]:
project = asfn.filter_jobs_by_path(project, flight_path)
project = asfn.filter_jobs_by_orbit(project, direction)
print(f"There are {len(project)} products to download.")

<font face="Calibri" size="3"><b>Download the products, unzip them into the rtc_products directory, and delete the zip files:</b> </font>

In [None]:
print(f"\nProject: {project.jobs[0].name}")
project_zips = project.download_files()
for z in project_zips:
    asfn.asf_unzip(products_path, str(z))
    z.unlink()

<font face="Calibri" size="3"><b>Determine the available polarizations:</b></font>

In [None]:
polarizations = asfn.get_RTC_polarizations(rtc_path)
polarization_power_set = asfn.get_power_set(polarizations)
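
<font face="Calibri" size="3">The internals of <i>asfn.get_power_set</i> are not shown in this notebook. As a rough sketch of the idea, the non-empty combinations of the available polarizations could be built as follows. The function name <i>polarization_options</i> and the " and " separator are assumptions, inferred only from how the selected value is split later on.</font>

```python
from itertools import combinations

def polarization_options(polarizations):
    """Hypothetical helper: list every non-empty combination of polarizations."""
    ordered = sorted(polarizations)
    options = []
    for size in range(1, len(ordered) + 1):
        for combo in combinations(ordered, size):
            # Join multi-polarization choices into one selectable string
            options.append(" and ".join(combo))
    return options

options = polarization_options({"VV", "VH"})
print(options)
```

<font face="Calibri" size="3">Each string in the result is one selectable entry: a single polarization or a pair of polarizations.</font>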

<font face="Calibri" size="3"><b>Select a polarization:</b></font>

In [None]:
polarization_choice = asfn.select_parameter(sorted(polarization_power_set), 'Polarizations:')
polarization_choice

<font face="Calibri" size="3"><b>Create a paths variable, holding the relative path to the tiffs in the selected polarization/s:</b></font>

In [None]:
polarization = polarization_choice.value
print(polarization)
if len(polarization) == 2:
    regex = r"\w[\--~]{{5,300}}(_|-){}\.(tif|tiff)$".format(polarization)
    dbl_polar = False
else:
    regex = r"\w[\--~]{{5,300}}(_|-){}(v|V|h|H)\.(tif|tiff)$".format(polarization[0])
    dbl_polar = True
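
<font face="Calibri" size="3">To see which file names these patterns accept, here is a small self-contained sketch. The product names are made up for illustration and <i>build_tiff_regex</i> is a hypothetical wrapper around the two patterns used above.</font>

```python
import re

def build_tiff_regex(polarization):
    """Hypothetical wrapper: one pattern for a single polarization, another for a pair."""
    if len(polarization) == 2:
        return r"\w[\--~]{{5,300}}(_|-){}\.(tif|tiff)$".format(polarization)
    return r"\w[\--~]{{5,300}}(_|-){}(v|V|h|H)\.(tif|tiff)$".format(polarization[0])

# Made-up file names, for illustration only
names = [
    "rtc_products/S1A_example_20200101T000000/S1A_example_20200101T000000_VV.tif",
    "rtc_products/S1A_example_20200101T000000/S1A_example_20200101T000000_VH.tif",
    "rtc_products/S1A_example_20200101T000000/S1A_example_20200101T000000_dem.tif",
]

single = [n for n in names if re.search(build_tiff_regex("VV"), n)]
dual = [n for n in names if re.search(build_tiff_regex("VV and VH"), n)]
print(len(single), len(dual))
```

<font face="Calibri" size="3">The single-polarization pattern keeps only the VV tiff, while the dual-polarization pattern keeps both the VV and VH tiffs and still skips auxiliary files such as the DEM.</font>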

<hr>
<font face="Calibri" size="3"> You may notice duplicates in your acquisition dates. Because HyP3 processes SAR data frame by frame, duplicates may occur if your area of interest is covered by two consecutive image frames. In this case, two separate images are generated that need to be merged before time series processing can commence.
<br><br>
<b>Write functions to collect and print the paths of the tiffs:</b></font>

In [None]:
def get_tiff_paths(regex, polarization, pths):
    tiff_paths = []
    for pth in glob.glob(pths):
        tiff_path = re.search(regex, pth)
        if tiff_path:
            tiff_paths.append(pth)
    return tiff_paths

def print_tiff_paths(tiff_paths):
    print("Tiff paths:")
    for p in tiff_paths:
        print(f"{p}\n")

<font face="Calibri" size="3"><b>Write a function to collect the product acquisition dates:</b></font>

In [None]:
def get_dates(tiff_paths):
    dates = []
    for pth in tiff_paths:
        dates.append(asfn.date_from_product_name(pth).split('T')[0])
    return dates
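
<font face="Calibri" size="3">The helper <i>asfn.date_from_product_name</i> is not defined in this notebook. As a minimal sketch of what such a lookup might do, assuming the product name contains a timestamp of the form YYYYMMDDTHHMMSS, a regular expression can pull out the date. The function <i>date_from_name</i> and the sample paths are hypothetical.</font>

```python
import re

def date_from_name(path):
    """Hypothetical stand-in: return the first YYYYMMDDTHHMMSS timestamp in a path, or None."""
    match = re.search(r"\d{8}T\d{6}", path)
    return match.group(0) if match else None

# Made-up paths, for illustration only
sample_paths = [
    "rtc_products/S1A_example_20200101T151500/S1A_example_20200101T151500_VV.tif",
    "rtc_products/S1A_example_20200101T151525/S1A_example_20200101T151525_VV.tif",
    "rtc_products/S1A_example_20200113T151500/S1A_example_20200113T151500_VV.tif",
]

# Keep only the date portion, as get_dates does with split('T')[0]
sample_dates = [date_from_name(p).split("T")[0] for p in sample_paths]
print(sample_dates)
```

<font face="Calibri" size="3">Two of the sample paths share a date, which is the situation the frame-merging step later in the notebook handles.</font>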

<font face="Calibri" size="3"><b>Collect and print the paths of the tiffs:</b></font>

In [None]:
tiff_pth = f"{rtc_path}/*/*{polarization[0]}*.tif*"
tiff_paths = get_tiff_paths(regex, polarization, tiff_pth)
print_tiff_paths(tiff_paths)

<hr>
<font face="Calibri" size="4"> <b>1.2 Fix multiple UTM Zone-related issues</b></font> <br>
<br>
<font face="Calibri" size="3">Fix multiple UTM Zone-related issues should they exist in your data set. If multiple UTM zones are found, the following code cells will identify the predominant UTM zone and reproject the rest into that zone. This step must be completed prior to merging frames or performing any analysis.</font>
<br><br>
<font face="Calibri" size="3"><b>Use gdal.Info to determine the UTM definition types and zones in each product:</b></font>

In [None]:
coord_choice = asfn.select_parameter(["UTM", "Lat/Long"], description='Coord Systems:')
coord_choice

In [None]:
utm_zones = []
utm_types = []
print('Checking UTM Zones in the data stack ...\n')
for tiff_path in tiff_paths:
    # format='json' makes gdal.Info return a dict, so no JSON round trip is needed
    info = gdal.Info(tiff_path, format='json')
    wkt = info['coordinateSystem']['wkt']
    # The last ID[...] entry of the WKT holds the authority and code, e.g. ID["EPSG",32606]
    zone = wkt.split('ID')[-1].split(',')[1][0:-2]
    utm_zones.append(zone)
    typ = wkt.split('ID')[-1].split('"')[1]
    utm_types.append(typ)
print(f"UTM Zones:\n {utm_zones}\n")
print(f"UTM Types:\n {utm_types}")
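
<font face="Calibri" size="3">The string splitting above relies on the WKT ending with an authority entry such as <i>ID["EPSG",32606]]</i>. The sketch below applies the same splits to an abbreviated, hand-written WKT string. Real WKT from gdal.Info is much longer, and <i>parse_authority</i> is a hypothetical name.</font>

```python
def parse_authority(wkt):
    """Hypothetical helper: read the authority name and code from the last ID[...] entry."""
    tail = wkt.split('ID')[-1]           # e.g. '["EPSG",32606]]'
    code = tail.split(',')[1][0:-2]      # drop the two closing brackets
    authority = tail.split('"')[1]       # text between the first pair of quotes
    return authority, code

# Abbreviated, hand-written WKT for illustration only
sample_wkt = 'PROJCRS["WGS 84 / UTM zone 6N",BASEGEOGCRS["WGS 84",ID["EPSG",4326]],ID["EPSG",32606]]'

authority, code = parse_authority(sample_wkt)
print(authority, code)
```

<font face="Calibri" size="3">Only the final ID entry is used, so nested entries such as the base geographic CRS do not affect the result.</font>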

<font face="Calibri" size="3"><b>Identify the most commonly used UTM Zone in the data:</b></font>

In [None]:
if coord_choice.value == 'UTM':
    utm_unique, counts = np.unique(utm_zones, return_counts=True)
    a = np.where(counts == np.max(counts))
    predominant_utm = utm_unique[a][0]
    print(f"Predominant UTM Zone: {predominant_utm}")
else:
    predominant_utm = '4326'
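
<font face="Calibri" size="3">The selection of the predominant zone can also be illustrated with the standard library alone. This sketch swaps <i>np.unique</i> for <i>collections.Counter</i> and uses made-up EPSG codes. Note that the two approaches may break ties differently.</font>

```python
from collections import Counter

# Made-up EPSG codes, one per tiff, for illustration only
sample_zones = ["32606", "32606", "32605", "32606"]

# most_common(1) returns the most frequent value with its count
predominant, n_files = Counter(sample_zones).most_common(1)[0]

# Every file in a different zone would need reprojecting
to_reproject = [i for i, zone in enumerate(sample_zones) if zone != predominant]
print(predominant, n_files, to_reproject)
```

<font face="Calibri" size="3">Here three of the four files share a zone, so only the file at index 2 would be reprojected.</font>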

<font face="Calibri" size="3"><b>Reproject all tiffs to the predominant UTM:</b></font>

In [None]:
reproject_indices = [i for i, zone in enumerate(utm_zones) if zone != predominant_utm]
print('--------------------------------------------')
print(f"Reprojecting {len(reproject_indices)} files")
print('--------------------------------------------')
for k in reproject_indices:
    temppath = tiff_paths[k].strip()
    _, product_name, tiff_name = temppath.split('/')
    # Write the reprojected copy next to the original, prefixed with "r"
    cmd = f"gdalwarp -overwrite rtc_products/{product_name}/{tiff_name}"\
          f" rtc_products/{product_name}/r{tiff_name} -s_srs {utm_types[k]}:"\
          f"{utm_zones[k]} -t_srs EPSG:{predominant_utm}"
    !$cmd
    # Remove the original so only the reprojected tiff remains
    os.remove(temppath)

<font face="Calibri" size="3"><b>Update tiff_paths with any new filenames created during reprojection:</b></font>

In [None]:
tiff_paths = get_tiff_paths(regex, polarization, tiff_pth)
print_tiff_paths(tiff_paths)

<hr>
<font face="Calibri" size="4"> <b>1.3 Merge multiple frames from the same date.</b></font>
<br><br>
<font face="Calibri" size="3"><b>Create a list of acquisition dates:</b></font>

In [None]:
dates = get_dates(tiff_paths)
print(dates)

<font face="Calibri" size="3"><b>Create a set from the date list, removing any duplicates:</b></font>

In [None]:
unique_dates = set(dates)
print(unique_dates)

<font face="Calibri" size="3"><b>Determine which dates have multiple frames. Create a dictionary with each date as a key linked to a value set as an empty string:</b></font>

In [None]:
dup_date_batches = [{}]
for date in unique_dates:
    count = 0
    for d in dates:
        if date == d:
            count +=1
    if count > 1:
        dup_date_batches[0].update({date : ""})
if dbl_polar:
    dup_date_batches.append(copy.deepcopy(dup_date_batches[0]))
print(dup_date_batches)

<font face="Calibri" size="3"><b>Update the values in dup_date_batches with a space-separated string of the paths to all the tiffs for each date:</b></font>

In [None]:
if dbl_polar:
    polar_list = [polarization.split(' ')[0], polarization.split(' ')[2]]
else:
    polar_list = [polarization]

for i, polar in enumerate(polar_list):
    polar_regex = rf"(\w|/)*_{polar}\.(tif|tiff)$"
    polar_paths = get_tiff_paths(polar_regex, polar, tiff_pth)
    for pth in polar_paths:
        date = asfn.date_from_product_name(pth).split('T')[0]
        if date in dup_date_batches[i]:
            dup_date_batches[i][date] = f"{dup_date_batches[i][date]} {pth}"

for d in dup_date_batches:
    print(d)
    print("\n")
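
<font face="Calibri" size="3">The two cells above first find the dates that occur more than once and then attach the matching paths to them. As an alternative sketch of the same grouping idea, not the notebook's own data structure, the paths can be collected per date in a dictionary of lists and filtered afterwards. The dates and paths are made up.</font>

```python
from collections import defaultdict

# Made-up (date, path) pairs, for illustration only
dated_paths = [
    ("20200101", "rtc_products/prodA/prodA_VV.tif"),
    ("20200101", "rtc_products/prodB/prodB_VV.tif"),
    ("20200113", "rtc_products/prodC/prodC_VV.tif"),
]

# Group every path under its acquisition date
by_date = defaultdict(list)
for date, path in dated_paths:
    by_date[date].append(path)

# Keep only the dates covered by more than one frame
duplicates = {date: paths for date, paths in by_date.items() if len(paths) > 1}
print(duplicates)
```

<font face="Calibri" size="3">Storing lists rather than space-separated strings avoids having to split the paths again before merging.</font>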

<font face="Calibri" size="3"><b>Merge all the frames for each date.</b></font>

In [None]:
for i, dup_dates in enumerate(dup_date_batches):
    for dup_date in dup_dates:
        output = f"{dup_dates[dup_date].split('/')[0]}/{dup_dates[dup_date].split('/')[1]}/new{dup_dates[dup_date].split('/')[2].split(' ')[0]}"
        gdal_command = f"gdal_merge.py -o {output} {dup_dates[dup_date]}"
        print(f"\n\nCalling the command: {gdal_command}\n")
        !$gdal_command
        for pth in dup_dates[dup_date].split(' '):
            if pth and asfn.path_exists(pth):
                os.remove(pth)
                print(f"Deleting: {pth}")
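
<font face="Calibri" size="3">The output name in the cell above is assembled from several chained splits: the merged file is placed in the folder of the first frame and named after it with a "new" prefix. The sketch below expresses the same naming rule with path helpers. The function <i>merge_command</i> and the sample paths are hypothetical, and the command is only built as a string, not executed.</font>

```python
import posixpath
import shlex

def merge_command(paths):
    """Hypothetical helper: build the merged output path and a gdal_merge.py command string."""
    first = paths[0]
    # Place the merged tiff beside the first frame, prefixed with "new"
    output = posixpath.join(posixpath.dirname(first), "new" + posixpath.basename(first))
    # Quote each argument so unusual characters in paths stay intact
    args = " ".join(shlex.quote(p) for p in [output] + paths)
    return output, f"gdal_merge.py -o {args}"

# Made-up frame paths for one date, for illustration only
frames = ["rtc_products/prodA/prodA_VV.tif", "rtc_products/prodB/prodB_VV.tif"]
output, command = merge_command(frames)
print(command)
```

<font face="Calibri" size="3">Because the merged file keeps the original name and polarization suffix, it still matches the tiff regular expression when the paths are collected again.</font>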

<hr>
<font face="Calibri" size="3"> <b>Verify that all duplicate dates were resolved:</b> </font>

In [None]:
tiff_paths = get_tiff_paths(regex, polarization, tiff_pth)
for polar in polar_list:
    polar_tiff_pth = tiff_pth.replace('V*', polar)
    polar_tiff_paths = get_tiff_paths(regex, polar, polar_tiff_pth)
    dates = get_dates(polar_tiff_paths)
    if len(dates) != len(set(dates)):
        print(f"Duplicate dates are still present for {polar} polarization!")
    else:
        print(f"No duplicate dates are associated with {polar} polarization.")

<font face="Calibri" size="3"><b>Print the updated paths to the tiffs:</b></font>

In [None]:
print_tiff_paths(tiff_paths)

<font face="Calibri" size="3"><b>Create a tiffs folder, move the tiffs into it, and delete the rtc_products folder:</b></font>

In [None]:
asfn.new_directory("tiffs")
for tiff in tiff_paths:
    os.rename(tiff, f"{analysis_directory}/tiffs/{tiff.split('/')[-1]}")
shutil.rmtree(rtc_path)

<font face="Calibri" size="3"><b>Print the path where you saved your tiffs.</b></font>

In [None]:
print(f"{analysis_directory}/tiffs")

<font face="Calibri" size="2"> <i>Prepare_RTC_Stack_HyP3_v2.ipynb - Version 1.0.1 - February 2021</i></font>