<img src="NotebookAddons/blackboard-banner.jpg" width="100%" />
<font face="Calibri">
<br>
<font size="7"> <b> GEOS 657: Microwave Remote Sensing</b> </font>

<font size="5"> <b>Lab 9: InSAR Time Series Analysis using GIAnT within Jupyter Notebooks<br>Part 1: Data Download & Preprocessing from a SARVIEWS Import <font color='rgba(200,0,0,0.2)'> -- [## Points] </font> </b> </font>

<br>
<font size="4"> <b> Franz J Meyer & Joshua J C Knicely; University of Alaska Fairbanks</b> <br>
<img src="NotebookAddons/UAFLogo_A_647.png" width="170" align="right" /><font color='rgba(200,0,0,0.2)'> <b>Due Date: </b>NONE</font>
</font>

<font size="3"> This Lab is part of the UAF course <a href="https://radar.community.uaf.edu/" target="_blank">GEOS 657: Microwave Remote Sensing</a>. This lab is divided into 3 parts: 1) data download and preprocessing, 2) GIAnT time series, and 3) a simple Mogi source inversion. The primary goal of this lab is to demonstrate how to download the requisite data, specifically interferograms, and preprocess them for use with the Generic InSAR Analysis Toolbox (<a href="http://earthdef.caltech.edu/projects/giant/wiki" target="_blank">GIAnT</a>) in the framework of *Jupyter Notebooks*.<br>

<b>Our specific objectives for this lab are to:</b>

- Download data using ASF tools: 
    - From a SARVIEWS subscription. 
- Pre-process data: 
    - Subset (crop) the data to an Area of Interest (AOI). 
    - Verify the quality of the data.
    - Cull data selection based on a timeframe and orbital characteristics. 
    - Reproject interferograms to a uniform UTM zone. 
    - Use TRAIN to remove static atmospheric effects related to surface elevation. (<i><b>TENTATIVE</b></i>)
</font>

<br>
<font face="Calibri">

<font size="5"> <b> Target Description </b> </font>

<font size="3"> In this lab, we will download interferograms covering a SARVIEWS area of interest.  </font>

<font size="4"> <font color='rgba(200,0,0,0.2)'> <b>THIS NOTEBOOK INCLUDES NO HOMEWORK ASSIGNMENTS.</b></font> <br>

Contact me at fjmeyer@alaska.edu should you run into any problems.
</font>

<font face='Calibri'><font size='5'><b>Overview</b></font>
<br>
<font size='3'><b>About TRAIN</b>
<br>
The tropospheric correction is one of the most significant challenges in InSAR. Without this correction, surface deformation signals can go completely unnoticed, or, perhaps worse, a false signal caused by the atmosphere can be taken as an accurate representation of surface deformation. This often occurs at volcanoes because of the characteristics of the atmosphere and the surface elevation (i.e., the InSAR signal for a higher point on the volcano passes through less atmosphere and is affected differently than the signal for a lower point). <br>We will use the Toolbox for Reducing Atmospheric InSAR Noise (TRAIN) today. The purpose of TRAIN is to add state-of-the-art tropospheric correction methods to the InSAR processing chain. It supports corrections that are phase-based or that use spectrometers, weather models, or balloon-sounding data. 
<br><br>
<b>Limitations</b>
<br>
The particular version we are using was created by the Alaska Satellite Facility (ASF). ASF took the original MATLAB code developed by David Bekaert and converted it into Python 2.7. Currently, it only allows processing using MERRA2 data. Other data types can be added relatively easily, though doing so requires modifying the existing Python code. 
<br>
Each of the correction methods included in TRAIN is ideal for different locations and conditions. Spectrometers provide the best correction, but can only be used in cloud-free and daylight conditions. Phase-based and weather model correction methods capture regional signals well, but fail to capture turbulent tropospheric signals. 
<br><br>
More information about TRAIN, its capabilities, and its limitations can be found in <a href="http://www.sciencedirect.com/science/article/pii/S0034425715301231">Bekaert et al. [2015]</a>, at David Bekaert's <a href="http://davidbekaert.com/#links">webpage</a>, or at the <a href="https://github.com/asfadmin/hyp3-TRAIN" target="_blank">ASF Github</a>. 
<br><br>
<b>Steps to use TRAIN</b><br>

- System Setup
    - Import Python Packages
    - Set User Inputs
- Download and Preprocess Data
    - Access SARVIEWS Subscriptions
    - Download and unzip the Data
    - Project all Geotiffs to the Same UTM Zone
    - Mosaic Geotiffs with Partial Coverage
- Identify Area of Interest
- Subset (Crop) Data to Area of Interest
    - Subset the Data
    - Check that Subsetted Geotiffs have Pixels
    - Check the Dimensions of Subsetted Geotiffs
- Create Input Files and Code for TRAIN
    - Create <font face='Courier New'>parms_aps.txt</font> file
    - Create <font face='Courier New'>ifgday.mat</font> file
    - Convert Subsetted Tiffs to GCS Coordinates
    - Adjust file names
- Run TRAIN
    - Minor Set Up
    - Steps 0-3
    - Step 4
    - Comparison of Corrected and Uncorrected Unwrapped Phase
    - Convert back to the original coordinate system

<br><br>
When you use TRAIN, cite the creator's work using:<br>
Bekaert et al., RSE 2015, "Statistical comparison of InSAR tropospheric correction techniques." <br>&emsp;Open access: http://www.sciencedirect.com/science/article/pii/S0034425715301231
<!-- <br><br><b><i>DO WE NEED TO ALSO GIVE ASF A CITATION???</i></b> -->
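
To make the elevation dependence described under "About TRAIN" more concrete, here is a minimal, synthetic sketch of the simplest phase-based idea: fit a straight line to unwrapped phase versus elevation and subtract it. This is not the MERRA2 weather-model correction that this notebook runs, and it is not TRAIN's implementation. The function name `estimate_linear_phase_elevation` and all of the numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def estimate_linear_phase_elevation(phase, elevation):
    """Least-squares fit of phase = k * elevation + c; returns (k, c)."""
    valid = np.isfinite(phase) & np.isfinite(elevation)
    k, c = np.polyfit(elevation[valid], phase[valid], 1)
    return k, c

# Synthetic data (assumed values, not real measurements)
rng = np.random.default_rng(0)
elevation = rng.uniform(0.0, 3000.0, size=5000)   # metres
true_k = -0.002                                   # radians per metre (assumed)
phase = true_k * elevation + 1.5 + rng.normal(0.0, 0.1, size=elevation.size)

# Estimate the elevation-correlated signal and remove it
k, c = estimate_linear_phase_elevation(phase, elevation)
corrected = phase - (k * elevation + c)
print(f"estimated slope: {k:.5f} rad/m")
```

A linear fit like this can only capture a stratified, elevation-correlated delay. It would also remove any real deformation that happens to correlate with topography, which is one reason weather-model approaches are attractive at volcanoes.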

<font face='Calibri'><font size='5'><b>0. Move into Desired Directory</b></font><br>
    <font size='3'>Before we start running code, we are going to move into our desired directory. This can be any folder. Typically, we do this to maintain better folder readability and organization. </font></font>

<font face='Calibri'><font size='5'><b>1. System Setup</b></font><br>
    <font size='3'>We will first do some system setup. This involves importing requisite Python libraries, activating Bokeh plotting, and defining all of our user inputs. </font></font>

<font face='Calibri'>
<font size="4"> <b> 1.1 Import Python Packages and Enable Extensions</b></font>    <br>
    <font size='3'>Let's <b>import the Python libraries</b> and packages we will need to run this lab, and then activate the Bokeh plotting. </font></font>

In [None]:
import os
import shutil
import re
import sys
import glob
import json
import pickle
from datetime import date
import tempfile
from urllib.parse import urlparse, parse_qs

from osgeo import gdal
from osgeo import osr
import pyproj
import numpy as np

import matplotlib.pyplot as plt
import matplotlib.animation
from matplotlib import animation
from matplotlib import rc
from matplotlib.widgets import RectangleSelector

from IPython.display import clear_output
from IPython.display import Markdown
plt.rcParams.update({'font.size': 12})

from asf_notebook import new_directory
from asf_notebook import path_exists
from asf_notebook import remove_nan_filled_tifs
from asf_notebook import AOI
from asf_notebook import select_parameter
from asf_notebook import EarthdataLogin
from asf_notebook import get_wget_cmd
from asf_notebook import asf_unzip
from asf_notebook import get_hyp3_subscriptions
from asf_notebook import gui_date_picker
from asf_notebook import get_subscription_products_info
from asf_notebook import get_products_dates_insar
from asf_notebook import get_slider_vals
from asf_notebook import input_path
from asf_notebook import handle_old_data

In [None]:
# FOR DEVELOPMENT PURPOSES ONLY
lst = ['MERRA2','ingrams','ingram_subsets','ingram_subsets_converted', \
       'GIAnT','GIAnT_Data','Geotiffs','DEM']
for item in lst:
    try:
        shutil.rmtree(item)
    except:
        pass

<font face='Calibri' size="4"> <b> 1.2 Define an Analysis Directory</b></font> 

In [None]:
while True:
    sub_dir = input_path(
        f"\nPlease enter the name of a directory in which to store your data for this analysis.")
    if os.path.exists(sub_dir):
        contents = glob.glob(f'{sub_dir}/*')
        if len(contents) > 0:
            choice = handle_old_data(sub_dir, contents)
            if choice == 1:
                shutil.rmtree(sub_dir)
                os.mkdir(sub_dir)
                break
            if choice == 2:
                break
            else:
                clear_output()
                continue
        else:
            break
    else:
        os.mkdir(sub_dir)
        break
os.chdir(sub_dir)
home = os.getcwd()

<font face='Calibri'>
<font size="4"> <b> 1.3 Set User Inputs </b></font>    <br>
  

<font face='Calibri'>
    <font size='3'><b>1.3.2 Directory Arrangement</b><br></font>
    <font size='3'>Set some variables that affect how the folders are arranged.
    </font></font>

In [None]:
# Set the name of the folder that will hold the full 
# interferograms and their associated files. 
ingram_folder = 'ingrams' 
replace_ingram = True # If True, any folder with the same 
                      # name will be deleted and recreated. 

# Designate the folder in which we wish to store our interferogram subsets. 
subset_folder = 'ingram_subsets' 
replace_subset = True 
delete_subsets = False # if True, this will delete the subset_folder at the end
                       # of the lab. 'delete_subsets' should be set to False when
                       # running this in class as we will use the uncorrected 
                       # subsets in Part 2. 
            
# Designate the folder in which we wish to store our converted interferogram subsets. 
# This is important later in the program when we convert our subsets from a 
# local geographic coordinate system to decimal degrees. 
corrected_folder = 'ingram_subsets_converted'
replace_corrected = True

<font face='Calibri'><font size='3'><b>1.3.3 HyP3 Login Information</b><br></font>
<font size='3'>To download data from ASF, we need to provide our <a href="https://www.asf.alaska.edu/get-data/get-started/free-earthdata-account/" target="_blank">NASA Earth Data</a> username to the system. Set up an EarthData account if you do not yet have one. <font color='rgba(200,0,0,0.2)'><b>Note that EarthData's End User License Agreement (EULA) applies when accessing the HyP3 API from this notebook. If you have not acknowledged the EULA in EarthData, you will need to navigate to <a href="https://earthdata.nasa.gov/" target="_blank">EarthData's home page</a> and complete that process.</b></font><br><br>
    For some data processing later, we will also need to add <b>NASA GESDISC DATA ARCHIVE</b> to our list of approved applications. This provides access to the MERRA2 data files that TRAIN needs to perform the atmospheric correction. To do this, go to your <a href="https://urs.earthdata.nasa.gov/profile" target="_blank">EarthData profile page</a>, click <b>Applications</b> and select <b>Approved Applications</b> from the drop-down menu, select <b>Approve More Applications</b> at the bottom left, search for <b>NASA GESDISC DATA ARCHIVE</b>, select it, and agree to the terms and conditions. Once that is complete, you will have access to the MERRA2 data and TRAIN will be able to automatically download whichever files it requires. 
<br><br>
<b>Login to Earthdata:</b> </font> 

In [None]:
login = EarthdataLogin()

<font face='Calibri'>
    <font size='3'><b>1.3.4 Designate TRAIN Input Parameters</b><br></font>
    <font size='3'><i></i></font></font>
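
Two of the parameters set in the next cell, the radar wavelength and the incidence angle, are what connect unwrapped phase to ground motion. The sketch below shows that relationship under stated assumptions: the sign convention is an assumption (it differs between processors), the vertical projection assumes purely vertical motion, and the function names `phase_to_los_displacement` and `los_to_vertical` are hypothetical helpers, not part of TRAIN or GIAnT.

```python
import numpy as np

def phase_to_los_displacement(phase_rad, wavelength_m):
    """Convert unwrapped phase (radians) to line-of-sight displacement (metres).

    One full 2*pi cycle corresponds to half a wavelength, because the radar
    signal travels the path twice. The negative sign is an assumed convention.
    """
    return -wavelength_m * phase_rad / (4.0 * np.pi)

def los_to_vertical(los_m, incidence_rad):
    """Project line-of-sight displacement to vertical, assuming vertical-only motion."""
    return los_m / np.cos(incidence_rad)

wavelength = 0.055465763          # metres, same value as lambdaInMeters
incidence = np.deg2rad(38.5)      # radians, same value as incidenceAngle

los = phase_to_los_displacement(2.0 * np.pi, wavelength)
vertical = los_to_vertical(los, incidence)
print(f"one fringe = {los * 100:.2f} cm LOS, {vertical * 100:.2f} cm vertical")
```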

In [None]:
trainDir = '' # temporary blank directory
era_data_type = 'ECMWF'
ifgday_file = 'ifgday.mat'
merra2_datapath = './MERRA2'
demPathNName = os.path.join(trainDir,'myDEM.tif')
lambdaInMeters = 0.055465763 
incidenceAngle = 38.5/180*np.pi # This needs to be in radians. 
extra = 0.1

<hr>
<font face="Calibri">

<font size="5"> <b> 2. Download and Preprocess Data</b> </font>

<font size="3"> We will begin by acquiring the interferograms selected in SARVIEWS.
</font>

<font face='Calibri'>
    <font size='4'><b>2.1 Access HyP3 Subscriptions</b><br></font>
<font size='3'>We will now access our subscriptions for download. We will demonstrate 2 methods: 1. using a HyP3 subscription, and 2. using a SARVIEWS list. 

<font face='Calibri'><font size='4'><b>2.2 Access SARVIEWS Subscription</b><br></font>
<font size='3'>The code below demonstrates how to download data via a SARVIEWS subscription. The selected HyP3 product IDs are stored in this notebook's URL. <br>The first cell below acquires user credentials and creates a .netrc file in order to access Earthdata. The second cell executes some JavaScript that loads this URL into the notebook's Python kernel. The URL is then parsed and the product IDs and event ID are extracted. Using the event ID, the final cell compares each of the IDs to all of the products in this event group and builds a list of download URLs, <font face='Courier New'>download_urls</font>, for the products whose ID appears in the URL. The user name and password are those of your Earthdata login. </font></font>
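
As a small, self-contained illustration of the parsing step, the sketch below pulls a comma-separated `ids` parameter and an `event` parameter out of a query string. The URL is made up for this example (the real one comes from your browser via the JavaScript cell), and `extract_ids_and_event` is a hypothetical helper name.

```python
from urllib.parse import urlparse, parse_qs

def extract_ids_and_event(url):
    """Return (list of product IDs, event ID or None) from a URL's query string."""
    params = parse_qs(urlparse(url).query)
    # parse_qs maps each key to a list of values; take the first one.
    raw_ids = params.get('ids', [''])[0]
    product_ids = [i for i in raw_ids.split(',') if i]
    events = params.get('event', [])
    event = events[0] if events else None
    return product_ids, event

# Hypothetical URL, for illustration only
example_url = "https://example.org/notebooks/lab.ipynb?ids=101,102,103&event=42"
example_ids, example_event = extract_ids_and_event(example_url)
print(example_ids, example_event)
```

Returning an empty list and `None` when the parameters are missing lets the caller decide how to report the problem instead of relying on an exception.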

In [None]:
# Make sure you are logged in
try:
    if login:
        pass
except Exception as e:
    login = EarthdataLogin()

<font face='Calibri' size='4'><b> Use Javascript to extract this notebook's URL </b></font>

In [None]:
%%javascript 
var kernel = Jupyter.notebook.kernel; 
var command = ["notebookUrl = ",
               "'", window.location, "'" ].join('')
// alert(command)
kernel.execute(command)

In [None]:
print(f'Notebook URL: {notebookUrl}')


In [None]:
# Parse URL and retrieve product IDs and Event

parsed_url = urlparse(notebookUrl)
params = parse_qs(parsed_url.query)
try:
    ids = params['ids'][0].split(',')
    event_id = params['event'][0]
    
    if len(ids) == 0:
        print('No products were found from the url')
    else:
        print(f'Found {len(ids)} product IDs in the URL to prepare: ')
        for product in ids:
            print(product)

    print(f'Using eventId: {event_id}')    
except:
    display(Markdown(f'<text style=color:red> ERROR: Missing Data</text>'))
    display(Markdown(f'<text style=color:red> Go to <a href="http://sarviews-hazards.alaska.edu" target="_blank"> sarviews-hazards.alaska.edu </a> to find SARVIEWS data.</text>'))
    

In [None]:
# Get download URLs using the event ID 
groups = login.api.get_groups_public(id=event_id)
groupName = groups[0]['name']
print(f'Loading Products from {groupName}...')

# Load every page of products
event_prods = []
page = 1
emptyQuery = False
while not emptyQuery:
    new_prods =  login.api.get_products_public(group_id=event_id, page=(page - 1))
    event_prods += new_prods
    if len(new_prods) == 0:
        emptyQuery = True
    else:
        print(f'Loaded page {page} of products')
        page += 1
        
download_urls = []
for product in event_prods:
    if str(product['id']) in ids:
        download_urls.append(product['url'])
        
if (len(download_urls) == len(ids)):
    print('All IDs are accounted for with download URLS! Ready to download.')
else: 
    print(f'Only {len(download_urls)} products match those selected from the URL and are ready to download. ')




<font face='Calibri' size='4'><b>2.3 Download and unzip the data. </b></font>
<br>
<font size='3' face='Calibri'>We should now have a list of the products we wish to download. The code below will download the requisite zip files, unpack them into our designated folder, and then delete any remaining zip files. Removing the zip files helps to reduce space usage.</font>
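
The notebook relies on the helper <font face='Courier New'>asf_unzip</font> for the unpacking step. If you want to see what "unpack, then delete the zip" amounts to, here is a rough standard-library sketch. It is not the implementation of `asf_unzip`; the function name `unzip_and_remove` and the file names are hypothetical, and everything happens in a temporary directory.

```python
import os
import tempfile
import zipfile

def unzip_and_remove(zip_path, output_dir):
    """Extract zip_path into output_dir, then delete the archive to save space."""
    os.makedirs(output_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(output_dir)
    os.remove(zip_path)

# Demonstration on a throwaway archive
with tempfile.TemporaryDirectory() as tmp:
    zip_path = os.path.join(tmp, "demo_product.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("demo_product/readme.txt", "placeholder")

    out_dir = os.path.join(tmp, "ingrams")
    unzip_and_remove(zip_path, out_dir)

    extracted = os.path.exists(os.path.join(out_dir, "demo_product", "readme.txt"))
    zip_removed = not os.path.exists(zip_path)
    print(f"extracted: {extracted}, zip removed: {zip_removed}")
```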
<br><br>
<font size='3' face='Calibri'><b>Create the folder in which we wish to store our downloaded interferograms.</b> This deletes the current ingram directory if you set <font face='Courier New'>replace_ingram</font> to <font face='Courier New'>True</font> earlier in the notebook.</font>

In [None]:
if replace_ingram:
    try:
        # Try to remove the folder tree to ensure no other data exists in the folder. 
        shutil.rmtree(ingram_folder)
    except: 
        pass
!mkdir -p {ingram_folder} # create the directory

<font size='3'><b>2.3.3 Download the Products</b><br></font>

In [None]:
download_urls.sort()
print(f"There are {len(download_urls)} products to download.")
from asf_notebook import asf_unzip
full_ingram_path = f"{home}/{ingram_folder}"
if path_exists(full_ingram_path):
    product_count = 1
    print(f"\nEvent : {groupName}")
    for url in download_urls:
        print(f"\nProduct Number {product_count} of {len(download_urls)}:")
        product_count += 1
        
        parsed = urlparse(url)
        file_name = os.path.basename(parsed.path) 
        # print(f'Filename: {file_name}')
        
        # if not already present, we need to download and unzip products
        newProductFolder = file_name.split('.zip')[0]
        # print(f"Location to Unzip to: {newProductFolder}")
        if not os.path.exists(os.path.join(full_ingram_path, newProductFolder)):
            print(
                f"\n{newProductFolder} is not present.\nDownloading from {url}")
            cmd = get_wget_cmd(url, login)
            !$cmd
            if os.path.exists(file_name):
                zippedProduct = f"{home}/{file_name}"
                zippedProduct = zippedProduct.split('\n')[0]
                print(f"zipped Product: {zippedProduct}")

                asf_unzip(full_ingram_path, zippedProduct)

                try:
                    os.remove(file_name)
                except OSError:
                    pass
                print(f"\nDone.")
            else:
                print('Download failed, does this HyP3 product still exist?')
        else:
            print(f"{newProductFolder} already exists.")

<font face='Calibri' size='3'><b>Write functions to grab and print the path information for the amplitude, unwrapped phase, and the coherence files.</b> This information is useful later.</font>

In [None]:
def get_tiff_paths(paths):
    tiff_paths = !ls $paths | sort -t_ -k5,5
    return tiff_paths

def print_tiff_paths(tiff_paths):
    print("Tiff paths:")
    for p in tiff_paths:
        print(f"{p}\n")

<font face='Calibri' size='3'><b>Call the functions we just wrote to gather the path information.</b></font>

In [None]:
# Grab the paths of the amplitude, unwrapped phase, and coherence imagery
paths_amp    = f"{ingram_folder}/*/*_amp.tif"
paths_ingram = f"{ingram_folder}/*/*_unw_phase.tif"
paths_cohr   = f"{ingram_folder}/*/*_corr.tif"
amp_paths    = get_tiff_paths(paths_amp)
ingram_paths = get_tiff_paths(paths_ingram)
cohr_paths    = get_tiff_paths(paths_cohr)
print(f"amp_path[0]:    {amp_paths[0]}")
print(f"ingram_path[0]: {ingram_paths[0]}")
print(f"cohr_path[0]:   {cohr_paths[0]}")

<font face='Calibri'><font size='4'><b>2.5 Orbit Direction</b><br></font>
<font size='3'><font face='Calibri'><font size='3'>We will acquire orbit information about our interferograms, retain only those that match our desired orbit direction, and then pickle the variable <font face='Courier New'>heading_avg</font> for use later in Part 2 of this lab. </font></font>
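
The functions in this section decide orbit direction from the platform heading: a heading within 90 degrees of north is treated as ascending, anything else as descending. That test assumes headings are reported between -180 and 180 degrees. The sketch below shows the same rule with the heading wrapped first, so that a value such as 348 is handled like -12. The function name `classify_orbit` is hypothetical and the heading values are illustrative only.

```python
def classify_orbit(heading_deg):
    """Return 'ascending' or 'descending' for a platform heading in degrees."""
    # Wrap into [-180, 180) so that 348 and -12 are treated alike.
    wrapped = (heading_deg + 180.0) % 360.0 - 180.0
    return 'ascending' if abs(wrapped) < 90.0 else 'descending'

for h in (-12.3, -167.0, 348.0, 193.0):
    print(h, classify_orbit(h))
```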

<font face='Calibri' size='3'><b>Write a function that returns a list of only the paths to products in our selected orbit direction.</b></font>

In [None]:
def get_orbit(paths, txtToReplace):
    # Classify each product as ascending or descending from the heading
    # stored in its metadata file, then return the more common direction.
    orbits = []
    for tiff_path in paths:
        path, file = os.path.split(tiff_path)
        # The metadata .txt file shares its base name with the geotiff.
        file = file.replace(txtToReplace,'.txt')
        with open(os.path.join(path,file),"r") as metadata:
            for line in metadata:
                t = line.split(':')
                if 'Heading' in t[0]:
                    heading = float(t[1])
                    if abs(heading) >= 90.0:
                        orbits.append('descending')
                    else:
                        orbits.append('ascending')
    ascending = orbits.count('ascending')
    descending = orbits.count('descending')
    # Ties are resolved in favor of 'ascending'.
    if ascending >= descending:
        return 'ascending'
    return 'descending'


def sort_orbits(paths, orbit, txtToReplace):
    filesToSort = paths.copy()
    headings = []  # headings of the products we keep
    # Loop through 'filesToSort' in reverse order so that deleting
    # an entry does not cause later entries to be skipped.
    for i in sorted(range(0,len(filesToSort)), reverse=True):
        path, file = os.path.split(filesToSort[i])
        # The heading is stored in the metadata .txt file that
        # shares its base name with the geotiff.
        file = file.replace(txtToReplace,'.txt')
        with open(os.path.join(path,file),"r") as metadata:
            for line in metadata:
                t = line.split(':')
                if 'Heading' in t[0]:
                    heading = float(t[1])
        # Remove entries based on 'orbit'
        if orbit.lower() == 'ascending':
            # Remove any entries with a heading greater than
            # or equal to 90.0 degrees from north
            if abs(heading) >= 90.0:
                del filesToSort[i]
            else:
                headings.append(heading)
        elif orbit.lower() == 'descending':
            # Remove any entries with a heading less than 90.0 degrees from north
            if abs(heading) < 90.0:
                del filesToSort[i]
            else:
                headings.append(heading)
        else:
            print("Improper orbit designation.")
            print(f"orbit: {orbit}")
            print("Accepted Designations: 'ascending', 'descending'")
            break
    # Average only the headings of the products we kept, so that
    # products from the other orbit direction do not skew the result.
    heading_avg = np.mean(headings)
    return filesToSort, heading_avg

<font face='Calibri' size='3'><b>Call sort_orbits to edit the paths to include only those that meet our orbit direction criterion.</b></font>

In [None]:
orbit = get_orbit(amp_paths,'_amp.tif') 
print(f'Primary orbit: {orbit}')

amps,heading_avg = sort_orbits(amp_paths,orbit,'_amp.tif') 
ingrams, _ = sort_orbits(ingram_paths,orbit,'_unw_phase.tif')
cohrs, _ = sort_orbits(cohr_paths,orbit,'_corr.tif')
# By convention in Python, an underscore names an output we intend to ignore. 
# As the 'heading_avg' will be the same for all of these, 
# we can skip getting that information again. Alternatively, 
# if something seems to be going wrong, we could get 
# the average heading information all 3 times and place it 
# into a different variable each time for comparison. 

print('Average Heading: ', heading_avg)

print(len(amps)) # if this is zero, then something is wrong.

<font face='Calibri'><font size='3'><b>2.5.2 Pickle <font face='Courier New'>heading_avg</font> for GIAnT</b><br></font>
<font size='3'><font face='Calibri'><font size='3'>We will need the variable <font face='Courier New'>heading_avg</font> in Part 2 of the lab with GIAnT, so we save it to a file that is also named <font face='Courier New'>heading_avg</font>. As this is only a single value, it could be passed manually, but it is good to know how to pass variables like this; we may need to transfer something much larger than a float at some point. </font></font>
<br><br>
<font face='Calibri' size='3'><b>Create an output file, pickle <font face='Courier New'>heading_avg</font>, and store it for use in another lab.</b> We will unpickle this in Part 2.</font> 
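
For reference, here is a minimal round trip showing both halves of the process: writing a value with `pickle.dump` and reading it back with `pickle.load`. The value and the temporary location are placeholders. In the lab itself, the file is written to the analysis directory and read back in Part 2.

```python
import os
import pickle
import tempfile

heading_example = -12.7  # placeholder value, not a real heading

with tempfile.TemporaryDirectory() as tmp:
    pickle_path = os.path.join(tmp, "heading_avg")

    # Write: pickle files must be opened in binary mode.
    with open(pickle_path, "wb") as outfile:
        pickle.dump(heading_example, outfile)

    # Read: this is the step another notebook would perform.
    with open(pickle_path, "rb") as infile:
        restored = pickle.load(infile)

print(restored)
```

Only unpickle files you created yourself or otherwise trust, since loading a pickle can execute arbitrary code.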

In [None]:
filename = 'heading_avg'
# The 'with' block closes the file for us, even if pickling fails.
with open(filename,'wb') as outfile:
    pickle.dump(heading_avg, outfile)

<font face='Calibri'><font size='4'><b>2.6 Project all 'tiff's to the Same UTM Zone</b><br></font>
<font size='3'><font face='Calibri'><font size='3'>Some of the geotiffs may use different UTM zones. This can cause errors in processing the data. In the code below, we will identify the predominant UTM zone and reproject the rest into that zone. </font></font>
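
To see why neighboring frames can land in different zones, it helps to remember how UTM zones are defined: each is 6 degrees of longitude wide, and the WGS 84 / UTM EPSG codes are 326xx for the northern hemisphere and 327xx for the southern. The sketch below computes the code from a longitude and latitude. It ignores the special zone boundaries around Norway and Svalbard, and `utm_epsg_from_lonlat` is a hypothetical helper, not something the notebook uses.

```python
import math

def utm_epsg_from_lonlat(lon, lat):
    """Return the WGS 84 / UTM EPSG code for a longitude/latitude in degrees.

    Simplified: does not handle the irregular zones near Norway and Svalbard.
    """
    zone = int(math.floor((lon + 180.0) / 6.0)) % 60 + 1
    base = 32600 if lat >= 0 else 32700
    return base + zone

# A point near Fairbanks, Alaska, and one in the southern hemisphere
print(utm_epsg_from_lonlat(-147.7, 64.8))
print(utm_epsg_from_lonlat(-70.0, -33.0))
```

A scene that straddles a multiple of 6 degrees of longitude can therefore be delivered in either of two zones, which is the situation the following cells detect and repair.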

<font size='3' face='Calibri'><b>Create a list of all of the geotiff files.</b></font>

In [None]:
tiff_paths = amps + ingrams + cohrs
print(f"Example tiff and path: {tiff_paths[0]}")

<font face="Calibri" size="3"><b>Write a function that uses gdal.Info to determine the UTM definition types and zones in each product:</b></font>

In [None]:
def getUTM_znt(tiff_paths):
    utm_zones = []
    utm_types = []
    print('Checking UTM Zones in the data stack ...\n')
    for k in range(0, len(tiff_paths)):
        info = (gdal.Info(tiff_paths[k], options = ['-json']))
        info = (json.loads(info))['coordinateSystem']['wkt']
        zone = info.split('ID')[-1].split(',')[1][0:-2]
        utm_zones.append(zone)
        typ = info.split('ID')[-1].split('"')[1]
        utm_types.append(typ)
    return utm_zones, utm_types

<font face="Calibri" size="3"><b>Call getUTM_znt to determine the UTM definition types and zones in each product:</b></font>

In [None]:
utm_zones, utm_types = getUTM_znt(tiff_paths)
print(f"Unique UTM Zones: {list(set(utm_zones))}")
print(f"Unique UTM Types: {list(set(utm_types))}")

<font face="Calibri" size="3"><b>Identify the most commonly used UTM Zone in the data.</b></font>

In [None]:
utm_unique, counts = np.unique(utm_zones, return_counts=True)
a = np.where(counts == np.max(counts))
predominate_utm = utm_unique[a][0]
print(f"Predominate UTM Zone: {predominate_utm}")
print(f"Number of UTM Zones:  {len(utm_unique)}")

<font face="Calibri" size="3"><b>Make a list of indices in utm_zones that need to be reprojected.</b></font>

In [None]:
reproject_indicies = [i for i, j in enumerate(utm_zones) if j != predominate_utm]
print(f"Reprojecting {len(reproject_indicies)} files")
print(reproject_indicies)

<font face="Calibri" size="3"><b>Call gdalwarp to reproject any geotiffs not in the predominant UTM zone.</b> These will be stored in a new file with a leading 'r' for identification. The originals are then deleted and the new geotiffs renamed. These new geotiffs are placed in an entirely new file as GDAL may overwrite parts of the original file before accessing them. </font>

In [None]:
for k in reproject_indicies:
    folder, tiff_name = os.path.split(tiff_paths[k])
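    # Note the trailing space: without it, the output file name and
    # '-s_srs' would be joined into a single argument.
    cmd = (f"gdalwarp -overwrite {folder}/{tiff_name} {folder}/r{tiff_name} "
           f"-s_srs {utm_types[k]}:{utm_zones[k]} -t_srs EPSG:{predominate_utm}")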
    print(f"Calling the command: {cmd}")
    !{cmd}
    rm_command = f"rm {tiff_paths[k].strip()}"
    #print(f"Calling the command: {rm_command}")
    #!{rm_command}
    # remove the leading 'r' from the new file. 
    print(f"Old = {os.path.join(folder,'r'+tiff_name)}")
    print(f"New = {os.path.join(folder,tiff_name)}")
    os.rename(os.path.join(folder,'r'+tiff_name),os.path.join(folder,tiff_name))

<font face="Calibri" size="3"><b>Double check that all of the 'tiff's now have the same UTM Zone and type.</b></font>

In [None]:
utm_zones, utm_types = getUTM_znt(tiff_paths)
print(f"Unique UTM Zones: {list(set(utm_zones))}")
print(f"Unique UTM Types: {list(set(utm_types))}")

<font face="Calibri" size="3"><b>Assign the UTM zone to the 'utm' variable.</b></font>

In [None]:
utm = utm_zones[0][:]
print(f"UTM Zone: {utm}")

<font face='Calibri'><font size='4'><b>2.7 Mosaic Geotiffs with Partial Coverage</b><br></font>
    <font size='3'>In this subsection, we will merge multiple frames from the same date into a single geotiff. This code assumes that images acquired on the same day are frames that do not overlap the same areas. For Sentinel-1 imagery, this holds true; for other data sources, it may not. </font></font>
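
One way to picture the "same date" test is to group file paths by the pair of dates in their names. The sketch below does this with a dictionary. It assumes file names that begin with two timestamps separated by an underscore, which mirrors how the date-parsing function in this section reads them; the example paths and the helper name `group_by_date_pair` are made up for illustration.

```python
import os
from collections import defaultdict

def group_by_date_pair(paths):
    """Map (date1, date2) to its paths, keeping only pairs that occur more than once."""
    groups = defaultdict(list)
    for p in paths:
        parts = os.path.basename(p).split("_")
        # Keep only the YYYYMMDD portion of each timestamp.
        groups[(parts[0][:8], parts[1][:8])].append(p)
    return {pair: members for pair, members in groups.items() if len(members) > 1}

# Hypothetical paths: two frames share a date pair, the third does not
example_paths = [
    "ingrams/frameA/20190101T030405_20190113T030406_unw_phase.tif",
    "ingrams/frameB/20190101T030430_20190113T030431_unw_phase.tif",
    "ingrams/frameC/20190113T030406_20190125T030406_unw_phase.tif",
]
duplicates = group_by_date_pair(example_paths)
print(duplicates)
```

Grouping in a dictionary also copes with three or more frames on the same dates, a case in which a pairwise comparison would report the same date pair several times.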

<font face='Calibri'><font size='3'><b>Get the paths for the files to be merged.</b></font>

In [None]:
paths_full_amp  = f"{ingram_folder}/*/*_amp.tif"
paths_full_corr = f"{ingram_folder}/*/*_corr.tif"
paths_full_unw  = f"{ingram_folder}/*/*_unw_phase.tif"
amp_full_paths  = get_tiff_paths(paths_full_amp)
corr_full_paths = get_tiff_paths(paths_full_corr)
unw_full_paths  = get_tiff_paths(paths_full_unw)
print_tiff_paths(amp_full_paths+corr_full_paths+unw_full_paths)

<font face='Calibri'><font size='3'><b>Write a function to get date info from filenames</b></font>

In [None]:
def get_dates(paths):
    dates = []
    pths = glob.glob(paths)
    for p in pths:
        part = p.split("/")[2]
        date1 = part.split("_")[0][0:8]
        date2 = part.split("_")[1][0:8]
        dates.append([date1,date2])
    dates.sort()
    return dates

<font face="Calibri" size="3"><b>Create a list containing each date pair</b></font>

In [None]:
dates = get_dates(paths_full_unw)
for datepair in dates: print(datepair)

<font face="Calibri" size="3"><b>Create a list of the date pairs that are shared by more than one product</b></font>

In [None]:
dup_date_batches = []
dup_dates = []
for i in range(0,len(dates)-1):
    dte = dates[i]
    for j in range(i+1,len(dates)):
        if dates[j] == dte:
            dup_dates.append(dte)
print(f"dup_dates       : {dup_dates}")

<font face="Calibri" size="3"><b>Write a function using gdal_merge to merge products with duplicate dates</b></font>

In [None]:
def merge_dup_pairs(dups, paths):
    for date_pair in dups:
        matching = [s for s in paths if all(xs in s for xs in date_pair)]
        if len(matching) == 2:
            # create the output file name and merge the two files
            # with matching sets of dates. 
            path,file = os.path.split(matching[0])
            outputFile = os.path.join(path, 'MERGED'+file)
            cmd = f"gdal_merge.py -o {outputFile} {matching[0]} {matching[1]}"
            !{cmd}
            
            # The below code does some clean up. 
            # First, delete the unmerged original
            os.remove(matching[0])
            # Second, rename merged file to be that of the original file
            os.rename(outputFile,matching[0])
            # Third, remove matching[1] from the list of paths. 
            paths.remove(matching[1])
        else:
            print(f"Error: there is not a pair of matching entries.")
            print(f"Number of matches: {len(matching)}")
    return None 

<font face="Calibri" size="3"><b>Call merge_dup_pairs to merge amp, unw, and corr products with duplicated dates</b></font>

In [None]:
merge_dup_pairs(dup_dates, amp_full_paths)
merge_dup_pairs(dup_dates, unw_full_paths)
merge_dup_pairs(dup_dates, corr_full_paths)
tiff_paths = amp_full_paths + unw_full_paths + corr_full_paths
print_tiff_paths(tiff_paths)

<font face='Calibri'>
    <font size='5'> <b> 3. Identify Area of Interest</b> </font>
    <br>
    <font size='3'> Here we identify our area of interest (AOI). Our AOI must contain all of the expected deformation and a surrounding region of little to no deformation. Following our selection of this region, we will subset our data to this region. This helps reduce computation time. </font>
    </font>
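
Subsetting ultimately means turning map coordinates into row and column indices. The sketch below shows that arithmetic for a north-up raster with no rotation, using a GDAL-style geotransform tuple `(x_origin, pixel_width, 0, y_origin, 0, -pixel_height)`. The numbers are invented, and `aoi_to_pixel_window` is a hypothetical helper rather than part of the AOI selector used in this lab.

```python
import numpy as np

def aoi_to_pixel_window(geotransform, ulx, uly, lrx, lry):
    """Convert AOI corner coordinates to (row_min, row_max, col_min, col_max).

    Assumes a north-up image without rotation, so the pixel height is negative.
    """
    x0, dx, _, y0, _, dy = geotransform
    col_min = int(np.floor((ulx - x0) / dx))
    col_max = int(np.ceil((lrx - x0) / dx))
    row_min = int(np.floor((uly - y0) / dy))
    row_max = int(np.ceil((lry - y0) / dy))
    return row_min, row_max, col_min, col_max

# Invented example: 80 m pixels, 100 x 100 image
gt = (500000.0, 80.0, 0.0, 7200000.0, 0.0, -80.0)
image = np.zeros((100, 100))

window = aoi_to_pixel_window(gt, ulx=500800.0, uly=7199200.0,
                             lrx=502400.0, lry=7197600.0)
row_min, row_max, col_min, col_max = window
subset = image[row_min:row_max, col_min:col_max]
print(window, subset.shape)
```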

In [None]:
amp_tiff_paths = get_tiff_paths(paths_full_amp)
print_tiff_paths(amp_tiff_paths)

<font size="3"> <b>Create a string containing paths to one image for each area represented in the stack:</b> </font> 

In [None]:
to_merge = {}
for pth in amp_tiff_paths:
    info = (gdal.Info(pth, options = ['-json']))
    info = (json.loads(info))['wgs84Extent']['coordinates']
    
    coords = [info[0][0], info[0][3]]
    for i in range(0, 2):
        for j in range(0, 2):
            coords[i][j] = round(coords[i][j])
    str_coords = f"{str(coords[0])}{str(coords[1])}"
    if str_coords not in to_merge:
        to_merge.update({str_coords: pth})
merge_paths = ""
for pth in to_merge:
    merge_paths = f"{merge_paths} {to_merge[pth]}"
    
print(merge_paths)

<font face="Calibri" size="3"><b>Merge the images for display in the Area-Of-Interest selector:</b></font>

In [None]:
full_scene = f"{home}/full_scene.tif"
if os.path.exists(full_scene):
    os.remove(full_scene)
gdal_command = f"gdal_merge.py -o {full_scene} {merge_paths}"
!{gdal_command}

<font face="Calibri" size="3"><b>Create a Virtual Raster Stack:</b> </font>

In [None]:
image_file = f"{home}/raster_stack.vrt"
!gdalbuildvrt -separate $image_file -overwrite $full_scene

<font face="Calibri" size="3"><b>Convert the VRT into an array:</b> </font>

In [None]:
img = gdal.Open(image_file)
rasterstack = img.ReadAsArray()

<font face="Calibri" size="3"><b>Print the number of bands, pixels, and lines:</b> </font>

In [None]:
print(img.RasterCount) # Number of Bands
print(img.RasterXSize) # Number of Pixels
print(img.RasterYSize) # Number of Lines

<font face="Calibri" size="3"><b>Write an AOI selector class:</b> </font>

In [None]:
class AOI_Selector:
    def __init__(self, 
                 image,
                 fig_xsize=None, fig_ysize=None,
                 cmap=plt.cm.gist_gray,
                 vmin=None, vmax=None
                ):
        display(Markdown(f"<text style=color:blue><b>Area of Interest Selector Tips:\n</b></text>"))
        display(Markdown(f'<text style=color:blue>- This plot uses "matplotlib notebook", whereas the other plots in this notebook use "matplotlib inline".</text>'))
        display(Markdown(f'<text style=color:blue>-  If you run this cell out of sequence and the plot is not interactive, rerun the "%matplotlib notebook" code cell.</text>'))
        display(Markdown(f'<text style=color:blue>- Use the pan tool to pan with the left mouse button.</text>'))
        display(Markdown(f'<text style=color:blue>- Use the pan tool to zoom with the right mouse button.</text>'))
        display(Markdown(f'<text style=color:blue>- You can also zoom with a selection box using the zoom to rectangle tool.</text>'))
        display(Markdown(f'<text style=color:blue>- To turn off the pan or zoom to rectangle tool so you can select an AOI, click the selected tool button again.</text>'))
        
        display(Markdown(f'<text style=color:red><b>IMPORTANT!</b></text>'))
        display(Markdown(f'<text style=color:red>- Upon loading the AOI selector, the selection tool is already active.</text>'))
        display(Markdown(f'<text style=color:red>- Click, drag, and release the left mouse button to select an area.</text>'))
        display(Markdown(f'<text style=color:red>- The square tool icon in the menu is <b>NOT</b> the selection tool. It is the zoom tool.</text>'))
        display(Markdown(f'<text style=color:red>- If you select any tool, you must toggle it off before you can select an AOI</text>'))
        
        self.image = image
        self.x1 = None
        self.y1 = None
        self.x2 = None
        self.y2 = None
        if vmin is None:
            self.vmin = np.nanpercentile(self.image, 1)
        else:
            self.vmin = vmin
        if vmax is None:
            self.vmax = np.nanpercentile(self.image, 99)
        else:
            self.vmax = vmax
        if fig_xsize and fig_ysize:
            self.fig, self.current_ax = plt.subplots(figsize=(fig_xsize, fig_ysize)) 
        else:
            self.fig, self.current_ax = plt.subplots() 
        self.fig.suptitle('Area-Of-Interest Selector', fontsize=16)
        self.current_ax.imshow(self.image, cmap=cmap, vmin=self.vmin, vmax=self.vmax)


        def toggle_selector(event):
            print(' Key pressed.')
            if event.key in ['Q', 'q'] and toggle_selector.RS.active:
                print(' RectangleSelector deactivated.')
                toggle_selector.RS.set_active(False)
            if event.key in ['A', 'a'] and not toggle_selector.RS.active:
                print(' RectangleSelector activated.')
                toggle_selector.RS.set_active(True)
                
        toggle_selector.RS = RectangleSelector(self.current_ax, self.line_select_callback,
                                               drawtype='box', useblit=True,
                                               button=[1, 3],  # don't use middle button
                                               minspanx=5, minspany=5,
                                               spancoords='pixels',
                                               rectprops = dict(facecolor='red', edgecolor = 'yellow', 
                                                                alpha=0.3, fill=True),
                                               interactive=True)
        plt.connect('key_press_event', toggle_selector)

    def line_select_callback(self, eclick, erelease):
        'eclick and erelease are the press and release events'
        self.x1, self.y1 = eclick.xdata, eclick.ydata
        self.x2, self.y2 = erelease.xdata, erelease.ydata
        print("(%3.2f, %3.2f) --> (%3.2f, %3.2f)" % (self.x1, self.y1, self.x2, self.y2))
        print(" The buttons you used were: %s %s" % (eclick.button, erelease.button))

<font face="Calibri" size="3"><b>Create an AOI selector from your raster stack:</b> </font>

In [None]:
%matplotlib notebook

In [None]:
fig_xsize = 7.5
fig_ysize = 7.5
aoi = AOI_Selector(rasterstack, fig_xsize, fig_ysize)

<font face="Calibri" size="3"><b>Gather and define projection details:</b> </font>

In [None]:
geotrans = img.GetGeoTransform()
projlatlon = pyproj.Proj('EPSG:4326') # WGS84
projimg = pyproj.Proj(f'EPSG:{utm}')

<font face="Calibri" size="3"><b>Write a function to convert the pixel, line coordinates from the AOI selector into geographic coordinates in the stack's EPSG projection:</b> </font>

In [None]:
def geolocation(x, y, geotrans,latlon=True):
    ref_x = geotrans[0]+x*geotrans[1]
    ref_y = geotrans[3]+y*geotrans[5]
    if latlon:
        ref_y, ref_x = pyproj.transform(projimg, projlatlon, ref_x, ref_y)
    return [ref_x, ref_y]
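
<font face="Calibri" size="3">As a minimal sketch of the affine arithmetic inside <font face='Courier New'>geolocation</font>, the example below applies a geotransform to pixel/line pairs without GDAL or pyproj. The geotransform values and the helper name <font face='Courier New'>pixel_to_map</font> are made up for illustration. The sketch assumes a north-up raster (rotation terms <font face='Courier New'>geotrans[2]</font> and <font face='Courier New'>geotrans[4]</font> equal to zero), which is the same assumption the function above makes.</font>

```python
def pixel_to_map(x, y, geotrans):
    """Convert pixel (x) and line (y) indices to map coordinates.

    geotrans follows GDAL's six-element order:
    (origin_x, pixel_width, row_rotation, origin_y, column_rotation, pixel_height).
    """
    map_x = geotrans[0] + x * geotrans[1]
    map_y = geotrans[3] + y * geotrans[5]
    return [map_x, map_y]


# Hypothetical geotransform: upper-left corner at (500000 m, 9900000 m), 80 m pixels.
# The pixel height is negative because line numbers increase southward.
example_geotrans = (500000.0, 80.0, 0.0, 9900000.0, 0.0, -80.0)
upper_left = pixel_to_map(10, 20, example_geotrans)
lower_right = pixel_to_map(110, 220, example_geotrans)
print(upper_left, lower_right)
```

<font face="Calibri" size="3">Because the pixel height is negative, a larger line number gives a smaller northing, which is why the second corner of a dragged AOI usually has the smaller y value.</font>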

<font face="Calibri" size="3"><b>Call geolocation to gather the aoi_coords:</b> </font>

In [None]:
print(aoi.x2 - aoi.x1)
print(aoi.y2 - aoi.y1)

try:
    aoi_coords = [geolocation(aoi.x1, aoi.y1, geotrans, latlon=False), geolocation(aoi.x2, aoi.y2, geotrans, latlon=False)]
    print(f"aoi_coords in EPSG {utm}: {aoi_coords}")
except TypeError:
    print('TypeError')
    display(Markdown(f'<text style=color:red>This error occurs if an AOI was not selected.</text>'))
    display(Markdown(f'<text style=color:red>Note that the square tool icon in the AOI selector menu is <b>NOT</b> the selection tool. It is the zoom tool.</text>'))
    display(Markdown(f'<text style=color:red>Read the tips above the AOI selector carefully.</text>'))

<font face='Calibri' size='5'><b> 4. Subset (Crop) Data to Area of Interest </b> </font>
<br>
<font face='Calibri' size='3'>We now subset our data to our AOI. We must do this for both the interferograms and the coherence files. In this lab, we will also subset the amplitude image files for later display purposes, though this is not necessary.</font>

<font face="Calibri" size="3"><b>Create a subset folder</b></font> 

In [None]:
if replace_subset:
    try:
        # Try to remove the folder tree to ensure no other data exists in the folder. 
        shutil.rmtree(subset_folder)
    except: 
        pass
# Make the directory
!mkdir -p {subset_folder} # create the directory


<font face="Calibri" size="3"><b>Crop the interferograms</b></font> 

In [None]:
tiff_paths = amp_full_paths + unw_full_paths + corr_full_paths
# Loop through each file in the list 'tiff_paths'. 
print("Subsetting amplitude, coherence, and interferogram files.")
for tiff_path in tiff_paths:
    path,tiff = os.path.split(tiff_path)

    gdal_command = (f"gdal_translate -epo -eco -projwin {aoi_coords[0][0]} "
                    f"{aoi_coords[0][1]} {aoi_coords[1][0]} {aoi_coords[1][1]} "
                    f"-projwin_srs 'EPSG:{utm}' -co \"COMPRESS=DEFLATE\" "
                    f"-a_nodata 0 {tiff_path} {subset_folder}/{tiff} > /dev/null")
    # Uncomment the line below to see exactly which GDAL command is being called
    # print(f"\nCalling the command: {gdal_command}") 
    !{gdal_command} # Call the GDAL command. 
print("Subsetting complete.")

<font size='3'><font face='Calibri'><font size='3'>GDAL may have returned an error for one or more of the interferogram, coherence, and amplitude files. This results from our use of the "-epo" and "-eco" options, which raise an error when the requested window lies partially or completely outside a raster. The rasters that return an error therefore do not fully cover our AOI. Including the resulting empty files would cause errors in TRAIN and GIAnT. It is possible to include rasters that only partially cover our AOI, but doing so would require significantly more advanced processing, which we exclude for simplicity. </font>
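
<font face='Calibri' size='3'>To make the "partially" versus "completely" outside distinction concrete, here is a small sketch that compares bounding boxes directly. It does not call GDAL. The function names and the footprint values are hypothetical, and both boxes are assumed to be in the same projected coordinate system.</font>

```python
def aoi_inside_extent(aoi, extent):
    """Return True when the AOI box lies completely inside the raster extent.

    Both boxes are (min_x, min_y, max_x, max_y) in the same projected coordinates.
    """
    return (extent[0] <= aoi[0] and extent[1] <= aoi[1]
            and aoi[2] <= extent[2] and aoi[3] <= extent[3])


def boxes_overlap(aoi, extent):
    """Return True when the two boxes share any area at all."""
    return (aoi[0] < extent[2] and extent[0] < aoi[2]
            and aoi[1] < extent[3] and extent[1] < aoi[3])


# Hypothetical AOI and raster footprints in UTM meters.
aoi_box = (500000, 9890000, 520000, 9910000)
footprints = {
    "covers_aoi": (480000, 9880000, 540000, 9920000),
    "partial": (510000, 9880000, 540000, 9920000),
    "outside": (600000, 9880000, 640000, 9920000),
}
for name, extent in footprints.items():
    print(name, aoi_inside_extent(aoi_box, extent), boxes_overlap(aoi_box, extent))
```

<font face='Calibri' size='3'>Only the first footprint would be kept under the rule used in this lab. The second overlaps the AOI without covering it, and the third does not touch it.</font>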

<font face='Calibri'><font size='4'><b>4.2 Check that the Subsetted Tiffs have Pixels</b></font>
<br>
<font size='3'>Some of the subsetted geotiffs may contain no pixels in our AOI, even though the "-epo" and "-eco" options should have raised an error for them and caused them to be skipped. Below, we will <b>check which geotiffs actually have pixels in our AOI and remove those that don't.</b> This can be done with a simple NaN search or by checking the band statistics of the file. 

In [None]:
paths_subsets = f"{subset_folder}/*.tif"
subset_paths = get_tiff_paths(paths_subsets)
remove_nan_filled_tifs(subset_folder,subset_paths) 

<font face='Calibri' size='4'><b>4.3 Check the Dimensions of the Subsetted Tiffs</b> </font>
<br>
<font face='Calibri' size='3'>In some instances, the <font face='Courier New'><b>gdal_translate</b></font> function will return subsetted imagery with slightly different extents; for example, one subset may be 1000 x 1000 pixels while another is 1001 x 1000. This is usually more of a problem when data from different sensors are used, as these sensors often have different pixel sizes and/or pixel locations that are slightly offset from each other. Since all of our data comes from Sentinel-1, this is generally not a problem, but it is still good to double check.</font>
<br><br>
<font face='Calibri' size='3'>Loop through each of the files, find all of the unique dimensions, and compare them. We will do this by defining and calling a function named <font face='Courier New'>pixel_check</font>. We have chosen to define this as a function as we will use it to check the geotiff dimensions later in this lab and it is good programming practice to create a function for any task that is done multiple times.</font>
<font face='Calibri'><font size='3'>Identify all of the '.tif' files in the subset folder and store their paths in the list <font face='Courier New'>subset_paths</font>.</font></font>

In [None]:
# find the '.tif' files in directory 'subset_folder' and 
# add their paths to the sorted list 'subset_paths'
subset_paths = get_tiff_paths(paths_subsets)
subset_paths.sort()
print(f"Number of '.tif' files: {len(subset_paths)}")
for path in subset_paths:
    print(path)

<font face='Calibri'><font size='3'>Define <font face='Courier New'>pixel_check</font>, which collects the pixel and line counts of every '.tif' file and reports whether they all match.</font></font>

In [None]:
def pixel_check(tiff_paths):
    # get the pixel and line size for each tiff file 
    # and add it to lists 'Pixels' and 'Lines'
    Pixels,Lines = [],[]
    for file in tiff_paths:
        im = gdal.Open(file)
        # The sizes come from the file header, so no pixel data needs to be read. 
        XSize, YSize = im.RasterXSize, im.RasterYSize
        Pixels.append(XSize)
        Lines.append(YSize)
        
    # Get unique values of 'Pixels' and 'Lines'
    Pixel_set, Line_set = set(Pixels), set(Lines)
    # Check the number of unique values. 
    if len(Pixel_set) >1 or len(Line_set) > 1:
        print(f"Problem: More than 1 pixel or line value. This indicates "
              f"two or more subsetted '.tif' files have different sizes.")
        print(f"Number of unique Pixel Counts: {len(Pixel_set)}")
        print(f"Number of unique Line Counts:  {len(Line_set)}")
    else: 
        print("All subsetted '.tif' files are the same size. Hurray!")
        print(f"Pixels, Lines = {Pixels[0]}, {Lines[0]}")
    return Pixels,Lines

In [None]:
Pixels, Lines = pixel_check(subset_paths)
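
<font face='Calibri' size='3'><font face='Courier New'>pixel_check</font> reports how many distinct sizes exist, but not which files have them. The sketch below groups file names by size so that the odd ones are easy to spot. The helper name <font face='Courier New'>group_by_size</font> and the example file names and sizes are hypothetical; in practice the sizes would come from <font face='Courier New'>RasterXSize</font> and <font face='Courier New'>RasterYSize</font>.</font>

```python
from collections import defaultdict


def group_by_size(sizes):
    """Map each (pixels, lines) size to the list of file names that have it."""
    groups = defaultdict(list)
    for name, size in sizes.items():
        groups[size].append(name)
    return dict(groups)


# Hypothetical sizes for three subsetted files.
example_sizes = {
    "20180601_20180613_unw_phase.tif": (1000, 1000),
    "20180613_20180625_unw_phase.tif": (1001, 1000),
    "20180625_20180707_unw_phase.tif": (1000, 1000),
}
groups = group_by_size(example_sizes)
for size, names in groups.items():
    print(size, names)
```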

<font size='3'> If the '.tif' files are different sizes, use GDAL to make them uniform. We will do this by defining and calling a function named <font face='Courier New'>pixel_correction</font>.</font>

In [None]:
def pixel_correction(tiff_files, Pixels, Lines, coords):
    # Nothing to do if every '.tif' file already has the same dimensions. 
    if len(set(Pixels)) == 1 and len(set(Lines)) == 1:
        print("Nothing to do.")
        return
    # Use the smallest pixel and line counts as the common size for all rasters. 
    PSize, LSize = min(Pixels), min(Lines)
    for file, pixels, lines in zip(tiff_files, Pixels, Lines):
        if (pixels, lines) == (PSize, LSize):
            continue # this file already has the common size
        # GDAL cannot safely overwrite its own input, so write to a temporary file first. 
        tmp_file = f"{file}.tmp.tif"
        # -outsize forces the common size; note that this resamples the clipped window. 
        cmd = (f"gdal_translate -of GTiff -projwin {coords[0][0]} "
               f"{coords[0][1]} {coords[1][0]} {coords[1][1]} "
               f"-outsize {PSize} {LSize} {file} {tmp_file}")
        print(f"Correcting file: {file}")
        !{cmd}
        # The '!' shell call does not raise on failure, so check for the output instead. 
        if os.path.exists(tmp_file):
            os.replace(tmp_file, file)
        else:
            print(f"Drat, GDAL failed to correct this file: {file}!")

In [None]:
pixel_correction(subset_paths, Pixels, Lines, aoi_coords)

<font face='Calibri'>
<font size='5'> <b> 5. Create Input Files and Code for TRAIN </b> </font>
<br>
<font size='3'> In this section, we will use TRAIN to remove static atmospheric effects that add spurious phase to the interferograms. This primarily corrects for effects caused by elevation differences between different locations. <br>If we think of the propagating radar wave as a set of discrete rays, each ray will follow a different path. Rays that reflect from elevated locations pass through less of the atmosphere and are therefore less altered than rays that reflect from lower elevations. Without this correction, interferograms often show exaggerated deformation. This is especially important in hazard monitoring of active volcanoes, as a false alarm can be extremely costly and cause the general public to ignore future warnings. 
</font>
    
</font>
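
<font face='Calibri' size='3'>To give a feel for the size of this effect, the sketch below converts a difference in zenith path delay into interferometric phase. It uses a simple mapping that projects the zenith delay onto the line of sight with the cosine of the incidence angle and scales by 4&pi; over the wavelength for the two-way path. This is an illustration under stated assumptions, not TRAIN's implementation. The function name and all numeric values are assumed, and sign conventions differ between processors.</font>

```python
import math


def delay_to_phase(zenith_delay_m, incidence_angle_deg, wavelength_m):
    """Convert a one-way zenith path delay into two-way interferometric phase (radians)."""
    # Project the zenith delay onto the slanted line of sight.
    slant_delay = zenith_delay_m / math.cos(math.radians(incidence_angle_deg))
    # The signal travels the path twice, hence 4*pi rather than 2*pi.
    return 4.0 * math.pi * slant_delay / wavelength_m


# Assumed values for illustration only: a C-band wavelength of about 5.55 cm,
# a 39 degree incidence angle, and a 1 cm difference in zenith delay between
# a low-elevation and a high-elevation pixel.
phase = delay_to_phase(0.01, 39.0, 0.0555)
print(f"{phase:.2f} rad, or {phase / (2 * math.pi):.2f} fringes")
```

<font face='Calibri' size='3'>Under these assumptions, a delay difference of only one centimeter already amounts to a sizeable fraction of a fringe, which is why an uncorrected elevation-dependent delay can look like deformation.</font>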

<font face='Calibri'>
    <font size='3'>Let's create the input files and modify the files as required by TRAIN. The necessary items and actions are listed below. <br>
        
- parms_aps.txt
    - List of parameters defining how TRAIN will run.
- ifgday.txt
    - Text file listing the master and slave date pairs. 
    - 2 column vector [master slave]
    - Format: YYYYMMDD
- Convert subsetted '.tif' Files to GCS Coordinates
    - TRAIN requires the input '.tif' files to be in a geographic coordinate system (GCS). 
    - We will convert them to EPSG:4326. 
- Adjust File Names
    - Many SAR codes expect the input files to have a particular name format. 
    - For TRAIN, this is <font face='Courier New'>&lt;master\_date&gt;\_&lt;slave\_date&gt;\_&lt;unwrapped, amplitude, or coherence designation&gt;.tif</font>.
<br></font>
    </font>

<font face='Calibri'>
    <font size='4'> <b>5.1 Create parms_aps.txt file</b> </font>
    <br>
    <font size='3'>We will need to create the file 'parms_aps.txt'. TRAIN will read parameters from this file. In order to do this, we will first need to extract some information from the satellite metadata files. We'll start with getting the UTC time of the satellite pass over our study area. <br>
        
</font>
    </font>

<font face='Calibri'>
    <font size='3'><b>5.1.1 Extract the UTC Time</b> </font>
    <br>
    <font size='3'>This information is contained in the file name. Alternatively, it can be found in the metadata file (titled <font face='Courier New'>&lt;master timestamp&gt;\_&lt;slave timestamp&gt;.txt</font>) that comes with the interferogram. The function <font face='Courier New'>getUTC_sat</font> extracts and returns the median UTC time both as a string in HH:MM format and as the number of seconds since the start of the day.</font>
    </font>

In [None]:
# Extract UTC time of satellite pass
def getUTC_sat(files):
    UTC_hr,UTC_min,UTC_sec = [],[],[]
    for file in files:
        vals = file.split('_')
        tstamp = vals[0][9:16]
        UTC_hr.append(int(tstamp[0:2]))
        UTC_min.append(int(tstamp[2:4]))
        UTC_sec.append(int(tstamp[4:6]))    
    
    # UTC time as HH:MM; we extract the median values and zero-pad each to 2 digits. 
    UTC_sat = (f"{str(int(np.median(UTC_hr))).zfill(2)}:"
               f"{str(int(np.median(UTC_min))).zfill(2)}")
    # UTC time as an integer; Method from Tom Logan's prepGIAnT code
    # Can also be found inside <date>_<date>.txt file and hard coded/extracted
    c_l_utc = np.median(UTC_hr)*3600 + np.median(UTC_min)*60 + np.median(UTC_sec) 
    return UTC_sat, c_l_utc
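
<font face='Calibri' size='3'>As a worked example of the timestamp arithmetic, the sketch below parses the acquisition time with <font face='Courier New'>datetime.strptime</font> and takes the median of the total seconds. The file names are hypothetical and assume the stamp format <font face='Courier New'>YYYYMMDDTHHMMSS</font> at the start of the name. Unlike <font face='Courier New'>getUTC_sat</font>, which takes the median of the hour, minute, and second fields separately, this variant takes a single median so that the three fields always come from the same acquisition.</font>

```python
from datetime import datetime
from statistics import median


def seconds_since_midnight(file_name):
    """Parse the leading YYYYMMDDTHHMMSS stamp of a file name into seconds after 00:00 UTC."""
    stamp = file_name.split('_')[0]
    acquired = datetime.strptime(stamp, "%Y%m%dT%H%M%S")
    return acquired.hour * 3600 + acquired.minute * 60 + acquired.second


# Hypothetical file names following the <master>_<slave>_unw_phase.tif pattern.
names = [
    "20180601T112233_20180613T112233_unw_phase.tif",
    "20180613T112235_20180625T112234_unw_phase.tif",
    "20180625T112231_20180707T112233_unw_phase.tif",
]
seconds = median(seconds_since_midnight(n) for n in names)
# Format the median as HH:MM for the UTC_sat parameter.
print(seconds, f"{int(seconds // 3600):02d}:{int(seconds % 3600 // 60):02d}")
```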

<font face='Calibri' size='3'>Create a list of the unwrapped phase tiffs.</font>

In [None]:
files = [f for f in os.listdir(subset_folder) if f.endswith('_unw_phase.tif')] 
for file in files: print(file)

<font face='Calibri' size='3'>Find the median UTC time of the satellite pass across all of the \_unw\_phase.tif files.</font>

In [None]:
UTC_sat, c_l_utc = getUTC_sat(files)
print(c_l_utc)
print(UTC_sat)

<font face='Calibri'>
    <font size='3'><b>5.1.2 Create parms_aps.txt file</b> </font>
    <br>
    <font size='3'>Make the parms_aps.txt file. This gives TRAIN information on how to process the data. </font>
    </font>

In [None]:
inProj = pyproj.Proj(f'EPSG:{utm}')
outProj = pyproj.Proj('EPSG:4326')
llx, lly = pyproj.transform(inProj,outProj,aoi_coords[0][0],aoi_coords[0][1])
urx, ury = pyproj.transform(inProj,outProj,aoi_coords[1][0],aoi_coords[1][1])
region_lats = abs(np.diff([lly,ury])[0]) + extra
region_lons = abs(np.diff([llx,urx])[0]) + extra

!mkdir -p {merra2_datapath} # create the directory

parms_aps_Template = '''
# Input parameters for TRAIN

crop_flag: n
date_origin: file
dem_null: -32768
DEM_origin: asf
DEM_file: {7}
era_data_type: {1}
ifgday_file: {2}
incidence_angle: {9}
lambda: {8}
look_angle: 21
meric_perc_coverage: 80
merra2_datapath: {4}
non_defo_flag: n
region_lat_range: {5}
region_lon_range: {6}
region_res: 0.008333000000000
save_folder_name: aps_estimation
small_baseline_flag: n
stamps_processed: n
UTC_sat: {3}

'''

with open(os.path.join(trainDir,'parms_aps.txt'), 'w') as fid:
    fid.write(parms_aps_Template.format(trainDir, era_data_type,
                                        ifgday_file, UTC_sat,
                                        merra2_datapath, region_lats,
                                        region_lons, demPathNName,
                                        lambdaInMeters,incidenceAngle))

<font face='Calibri'><font size='3'>You may notice that the <font face='Courier New'>parms_aps.txt</font> file we created has very little geographic information; we've only included the width and height of the subsetted interferograms in decimal degrees (the variables <font face='Courier New'>region_lats</font> and <font face='Courier New'>region_lons</font>). This is because we will use one of our subsetted and converted tiffs as a geographic reference file. Otherwise, we would need to create a text file that contains the latitude and longitude of each pixel in the interferogram.</font>

<font face='Calibri'>
    <font size='4'><b>5.2 Create ifgday.txt file</b> </font>
    <br>
    <font size='3'>Make the ifgday.txt file. This gives TRAIN the master and slave dates of each interferogram in two columns, separated by a single space.<br>
        
- Interferogram dates stored as a matrix with name ifgday and size [n_ifgs 2]. 
- Master image is in the first and slave in the second column. 
- Specify dates as a numeric value in YYYYMMDD format.

    </font>
    </font>

In [None]:
files = [f for f in os.listdir(subset_folder) if \
         f.endswith('_unw_phase.tif')] # Get file names. 
print(len(files))

In [None]:
# Get all of the master and slave dates. 
masterDates,slaveDates = [],[]
for file in files:
    tstamps = file.split('_')
    masterDates.append(tstamps[0][0:8])
    slaveDates.append(tstamps[1][0:8])
# Sort the dates according to the master dates. 
mDates,sDates = (list(t) for t in zip(*sorted(zip(masterDates,slaveDates))))

with open( os.path.join(trainDir, ifgday_file), 'w') as fid:
    for i in range(len(mDates)):
        masterDate = mDates[i] # pull out master Date (first set of numbers)
        slaveDate = sDates[i] # pull out slave Date (second set of numbers)
        
        # write values to the 'ifgday_file'; make sure there is only 1 space between the dates. 
        fid.write(f'{masterDate} {slaveDate}\n') 
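
<font face='Calibri' size='3'>The same file contents can be built and validated without touching the disk. The sketch below is one possible variant, not the lab's method: it assumes file names of the form <font face='Courier New'>&lt;master&gt;\_&lt;slave&gt;\_\*.tif</font>, uses hypothetical example names, and relies on <font face='Courier New'>datetime.strptime</font> to reject fields that are not real calendar dates.</font>

```python
from datetime import datetime


def ifgday_lines(file_names):
    """Build sorted 'YYYYMMDD YYYYMMDD' lines from <master>_<slave>_*.tif file names."""
    pairs = []
    for name in file_names:
        master, slave = (part[:8] for part in name.split('_')[:2])
        # strptime raises ValueError if either field is not a real calendar date.
        datetime.strptime(master, "%Y%m%d")
        datetime.strptime(slave, "%Y%m%d")
        pairs.append((master, slave))
    # Sorting the tuples orders by master date first, then by slave date.
    return [f"{master} {slave}" for master, slave in sorted(pairs)]


# Hypothetical file names for illustration.
example_names = [
    "20180613T112235_20180625T112234_unw_phase.tif",
    "20180601T112233_20180613T112233_unw_phase.tif",
]
lines = ifgday_lines(example_names)
print("\n".join(lines))
```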

In [None]:
# Print the contents of 'ifgday_file'
with open(os.path.join(trainDir, ifgday_file), 'r') as ifg: # the file is closed automatically
    print(ifg.read())

<font face='Calibri'>
    <font size='4'><b>5.3 Convert Subsetted '.tif' Files to GCS Coordinates</b> </font>
    <br>
    <font size='3'>The ASF version of TRAIN requires our subsets to be in GCS coordinates (i.e., for every pixel location in the subset to be designated by latitude and longitude). Currently, they are in a UTM coordinate system, which gives pixel location in meters based on a local coordinate system. We will convert our subsets. <br>We've already extracted and stored the original coordinate system in the variable <font face='Courier New'>utm</font>. </font>
    </font>

In [None]:
print(f"Current Coordinate System - EPSG:{utm}")

<font face='Calibri' size='3'>We convert each subsetted '.tif' to GCS coordinates (WGS84, designated EPSG:4326) using GDAL's <font face='Courier New'>gdalwarp</font> utility. </font>
 

In [None]:
# set desired TRAIN coordinate system; this will be used later
coord_TRAIN = '4326'

<font face='Calibri' size='3'>Create a directory for converted files. </font>

In [None]:
if replace_corrected:
    try:
        shutil.rmtree(corrected_folder)
    except: 
        pass
# Make the directory
!mkdir -p {corrected_folder} # create the directory

<font face='Calibri' size='3'>Create the converted subsets.</font>

In [None]:
for file in files:
    # Designate the input file and its path
    inFile = os.path.join(subset_folder,file)
    # Designate the output file and its path; ideally these are the same. 
    # GDAL can't do that (it'll overwrite data sometimes), 
    # so we're going to create entirely new files in a new folder. 
    outFile = os.path.join(corrected_folder,file)
    cmd = f"gdalwarp -t_srs EPSG:{coord_TRAIN} {inFile} {outFile}"
    #print(cmd)
    !{cmd}

<font face='Calibri'><font size='3'>Now that we have created our new files in the <font face='Courier New'>corrected_folder</font> directory, the original subsets in the <font face='Courier New'>subset_folder</font> are superfluous and could be removed. <b>This is an optional step.</b> We will keep the original files for the purpose of comparing the corrected and uncorrected time series in Part 2. </font></font>

In [None]:
if delete_subsets:
    shutil.rmtree(subset_folder)
    print(f"Folder {subset_folder} removed.")
else:
    print("This step has been skipped.")

<font face='Calibri'>
    <font size='4'><b>5.4 Adjust File Names</b></font>
    <font size='3'><br>Many SAR codes expect the input files to have a particular name format. For TRAIN, this is <font face='Courier New'>&lt;master_date&gt;_&lt;slave_date&gt;_&lt;unwrapped, amplitude, or coherence designation&gt;.tif</font>. We will adjust the files to match this name convention. The code below assumes that the files all come from Sentinel-1 and that every interferogram has a unique master and slave date pair. <br><b>This is not always true; some interferograms will have identical master/slave date pairs, but have been taken at different times.</b> This is a relatively rare occurrence, but it is good to keep in mind. To keep this exercise relatively simple, we assume each interferogram has a unique master/slave date pair.</font> </font>

In [None]:
def renameFiles(datadirectory,files):
    # Rename the interferogram, coherence, and amplitude (for plotting later) files.  
    for file in files:
        if "T" in file: # only change files needing to be renamed (containing a 'T') 
            oldname, oldExt = os.path.splitext(file)
            # Uncomment print statement below to see parsed file name and extension. 
            #print(f"\nCurrent Name: {oldname}\nCurrent Extension: {oldExt}") 
            tstamps = oldname.split('_')
            master,slave = tstamps[0][0:8],tstamps[1][0:8]
            if "_unw" in file:
                newname = master + '_' + slave + '_unw_phase' + oldExt
            elif "_corr" in file:
                newname = master + '_' + slave + '_corr' + oldExt
            elif "_amp" in file:
                newname = master + '_' + slave + '_amp' + oldExt
            else:
                # Skip files that are not unwrapped phase, coherence, or amplitude products. 
                continue
            exists = os.path.isfile(os.path.join(datadirectory,newname))
            if exists:
                print("This one already exists: "+newname)
            else:
                os.rename(os.path.join(datadirectory, file),
                          os.path.join(datadirectory, newname))
    print("Files renamed.")
    return
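
<font face='Calibri' size='3'>The name mapping can also be written as a single regular expression, which makes the expected input format explicit. This is only a sketch: the pattern assumes names that start with two <font face='Courier New'>YYYYMMDDTHHMMSS</font> stamps and end in one of the three product designations, and the example names (including the middle field <font face='Courier New'>VVP012</font>) are hypothetical.</font>

```python
import re

# Matches names such as 20180601T112233_20180613T112233_<anything>unw_phase.tif
PAIR_PATTERN = re.compile(r"^(\d{8})T\d{6}_(\d{8})T\d{6}_.*?(unw_phase|corr|amp)(\.tif)$")


def train_name(file_name):
    """Return the <master>_<slave>_<type>.tif name, or None if the name does not match."""
    match = PAIR_PATTERN.match(file_name)
    if match is None:
        return None
    master, slave, product, ext = match.groups()
    return f"{master}_{slave}_{product}{ext}"


# Hypothetical names: two that need renaming and one that is already in TRAIN's format.
for name in ["20180601T112233_20180613T112233_unw_phase.tif",
             "20180601T112233_20180613T112233_VVP012_corr.tif",
             "20180601_20180613_amp.tif"]:
    print(name, "->", train_name(name))
```

<font face='Calibri' size='3'>Returning <font face='Courier New'>None</font> for names that do not match lets the caller skip files that were already renamed or that are not one of the three product types.</font>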

<font face='Calibri'><font size='3'>We will rename both the GCS converted files and the original subsets. First, the original subsets.</font>

In [None]:
fExt = '.tif'
files = [f for f in os.listdir(subset_folder) if f.endswith(fExt)] 
files.sort()
print(len(files))

# print every entry of list 'files' separated by the newline character, "\n"
print(*files, sep="\n") 

In [None]:
renameFiles(subset_folder, files)

<font face='Calibri'><font size='3'>Now, the GCS converted files. </font>

In [None]:
fExt = '.tif'
files = [f for f in os.listdir(corrected_folder) if f.endswith(fExt)] 
files.sort()
print(len(files))

# print every entry of list 'files' separated by the newline character, "\n"
print(*files, sep="\n")

In [None]:
renameFiles(corrected_folder,files)

<font face='Calibri'><font size='3'>Now that we've renamed our files, we get a list of the converted files. We later use one of these with TRAIN as our geographic reference file. </font>

In [None]:
fExt = '.tif'
files_converted = sorted(f for f in os.listdir(corrected_folder) if f.endswith(fExt))

<font face='Calibri'><font size='5'><b>6. Run TRAIN</b></font>
    <br>
    <font size='3'>We have now created all of the necessary files to run TRAIN, so let's do it. 

<font face='Calibri'><font size='4'><b>6.1 Minor Set Up</b></font>
    <br>
    <font size='3'>We have to set up some path information in order to run TRAIN. <i>Eventually, this will be modified so we don't have to include the full path to TRAIN.</i> Additionally, we show multiple ways in which to call TRAIN.</font></font>

In [None]:
# General path to the TRAIN code. This is a temporary necessity.
# In the future, the path to TRAIN will be unnecessary. 
train_path = "/usr/local/TRAIN/src" # only for the code that we will not modify. 
cmd = os.path.join(train_path,'aps_weather_model.py')
print(cmd)
georef_path = os.path.join(home, corrected_folder, files_converted[0])
print(f"Path to Georeference File: {georef_path}")

In [None]:
# Display some help information
!python2.7 $train_path/aps_weather_model.py -h

In [None]:
# Display the same help information via a slightly different call method
!python2.7 $cmd -h

<font face='Calibri'><font size='4'><b>6.2 Steps 0-3</b></font>
    <br>
    <font size='3'>Now we run steps 0 through 3. Step 4 requires a few extra actions. </font></font>

In [None]:
netrc_path = '/home/jovyan/.netrc'
with open(netrc_path, 'w+') as netrc:
    netrc.write(f'machine urs.earthdata.nasa.gov login {login.username} password {login.password}')

# Step 0 - Identify weather data files to download. 
!python2.7 $train_path/aps_weather_model.py -g {georef_path} 0 0

os.remove(netrc_path)
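
<font face='Calibri' size='3'>The write-run-remove pattern in the cell above leaves the credentials on disk if the step in the middle is interrupted. One possible alternative, sketched below, is a context manager that always deletes the file and creates it with owner-only permissions. The name <font face='Courier New'>temporary_netrc</font> is hypothetical, and the demonstration uses a scratch directory and placeholder credentials rather than your real <font face='Courier New'>.netrc</font>.</font>

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_netrc(path, username, password):
    """Write Earthdata credentials to path and always delete the file afterwards."""
    entry = f"machine urs.earthdata.nasa.gov login {username} password {password}\n"
    # Create the file readable and writable by the owner only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, 'w') as netrc:
        netrc.write(entry)
    try:
        yield path
    finally:
        # Runs even if the body of the 'with' block raises.
        os.remove(path)


# Demonstration in a scratch directory with placeholder credentials.
with tempfile.TemporaryDirectory() as scratch:
    demo_path = os.path.join(scratch, ".netrc")
    with temporary_netrc(demo_path, "example_user", "example_password"):
        existed_inside = os.path.exists(demo_path)
    exists_after = os.path.exists(demo_path)
print(existed_inside, exists_after)
```

<font face='Calibri' size='3'>In this notebook, the TRAIN call for each step would go inside the <font face='Courier New'>with</font> block, with <font face='Courier New'>netrc_path</font>, <font face='Courier New'>login.username</font>, and <font face='Courier New'>login.password</font> as the arguments.</font>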

In [None]:
# Delete any existing MERRA2 downloads. 
# This is to prevent unrelated MERRA2 files from being used. 
try:
    shutil.rmtree(merra2_datapath)
except:
    pass

In [None]:
netrc_path = '/home/jovyan/.netrc'
with open(netrc_path, 'w+') as netrc:
    netrc.write(f'machine urs.earthdata.nasa.gov login {login.username} password {login.password}')

# Step 1 - Download weather data. 
# This will download a series of '*.nc4' files for each master. 
!python2.7 $train_path/aps_weather_model.py -g {georef_path} 1 1


os.remove(netrc_path)

In [None]:
netrc_path = '/home/jovyan/.netrc'
with open(netrc_path, 'w+') as netrc:
    netrc.write(f'machine urs.earthdata.nasa.gov login {login.username} password {login.password}')


# Step 2 - Calculate wet and hydrostatic zenith delays. 
# This will download a set of '*.xyz' files and use those 
# to calculate the necessary delays. 
!python2.7 $train_path/aps_weather_model.py -g {georef_path} 2 2

os.remove(netrc_path)

In [None]:
netrc_path = '/home/jovyan/.netrc'
with open(netrc_path, 'w+') as netrc:
    netrc.write(f'machine urs.earthdata.nasa.gov login {login.username} password {login.password}')


# Step 3 - Calculate the SAR delays.
# This produces *_*_{hydro_correction, wet_correction, and correction}.bin files
!python2.7 $train_path/aps_weather_model.py -g {georef_path} 3 3

os.remove(netrc_path)

<font face='Calibri'><font size='3'>Step 3 may give a strange message: "No correction for &lt;insert date&gt;". If this occurs, it is most likely because TRAIN wasn't able to access the MERRA2 files due to missing permissions in your Earthdata user account. To grant access, go to your <a href="https://urs.earthdata.nasa.gov/profile" target="_blank">Earthdata profile page</a>, click <b>Applications</b>, and select <b>Approved Applications</b> from the drop-down menu. Select <b>Approve More Applications</b> at the bottom left, search for <b>NASA GESDISC DATA ARCHIVE</b>, select it, and agree to the terms and conditions. Once that is complete, restart from Step 0. <br><br>Note that you may have to reset the current working directory to the folder in which this notebook resides. </font></font>
    
    

<font face='Calibri'><font size='4'><b>6.3 Step 4 of TRAIN</b></font>
    <br>
    <font size='3'>In step 4, we apply the correction to our <font face='Courier New'>&lt;\*\_\*\_unw\_phase.tif&gt;</font> files. For this step, we need to do 2 things first: 
1. Move all of the '*.bin' files into the same directory as our converted geotiffs. 
2. Make our current working directory the directory that holds our converted geotiffs. 

In [None]:
# find files. 


print(os.listdir())

fExt = '.bin'
files = [f for f in os.listdir('.') if f.endswith(fExt)] 
files.sort()
print(len(files))

# print every entry of list 'files' separated by the newline character, "\n"
print(*files, sep="\n") 

In [None]:
# Move the files into the desired location
for file in files:
    shutil.move(file,os.path.join(corrected_folder,file))

In [None]:
# Change working directory
os.chdir(corrected_folder)

In [None]:
# Step 4 - Subtract calculated delay from interferograms. 
!python2.7 $train_path/aps_weather_model.py -g {georef_path} 4 4

In [None]:
# Return to the home directory
os.chdir(home)
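
<font face='Calibri' size='3'>The move, change-directory, and return steps above can be wrapped so that the working directory is restored even if a step fails in between. The sketch below shows one way to do this. The helper names <font face='Courier New'>working_directory</font> and <font face='Courier New'>move_by_extension</font> are hypothetical, and the demonstration uses empty placeholder files in a scratch directory instead of real TRAIN output.</font>

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily change the working directory, restoring the previous one on exit."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


def move_by_extension(src_dir, dst_dir, ext):
    """Move every file in src_dir whose name ends with ext into dst_dir."""
    moved = sorted(f for f in os.listdir(src_dir) if f.endswith(ext))
    for name in moved:
        shutil.move(os.path.join(src_dir, name), os.path.join(dst_dir, name))
    return moved


# Demonstration with empty placeholder files in a scratch directory.
with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, "corrected")
    os.mkdir(target)
    for name in ("a_correction.bin", "b_correction.bin", "notes.txt"):
        open(os.path.join(scratch, name), 'w').close()
    moved_files = move_by_extension(scratch, target, ".bin")
    start_dir = os.getcwd()
    with working_directory(target):
        inside = sorted(os.listdir('.'))
    restored = os.getcwd() == start_dir
print(moved_files, inside, restored)
```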

<font face='Calibri'><font size='3'>If we check our <font face='Courier New'>corrected_folder</font> directory, we will find new files with the naming convention <font face='Courier New'>&lt;\*\_\*\_unw\_phase\_corrected.tif&gt;</font>. The uncorrected files, <font face='Courier New'>&lt;\*\_\*\_unw\_phase.tif&gt;</font>, are now technically superfluous and can be deleted. However, we will keep these files for the purpose of comparing the corrected and uncorrected time series in Part 2.
</font></font>

<font face='Calibri'><font size='4'><b>6.4 Comparison of Corrected and Uncorrected Unwrapped Phase</b></font>
    <br>
    <font size='3'>We will make a quick and simple comparison between the corrected and uncorrected unwrapped phase geotiffs. This is meant to highlight the importance of these atmospheric corrections.</font></font>

In [None]:
paths_cor = f"{home}/{corrected_folder}/*_unw_phase_corrected.tif"
paths_unc = f"{home}/{corrected_folder}/*_unw_phase.tif"
cor_paths = get_tiff_paths(paths_cor)
unc_paths = get_tiff_paths(paths_unc)

In [None]:
print(cor_paths[0])
print(unc_paths[0])

In [None]:
corrected = gdal.Open(cor_paths[0])
uncorrected = gdal.Open(unc_paths[0])
im_c = corrected.GetRasterBand(1).ReadAsArray()
im_u = uncorrected.GetRasterBand(1).ReadAsArray()

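fig = plt.figure(figsize=(18,10))
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)
ax1.imshow(im_c,cmap='gray')
ax2.imshow(im_u,cmap='gray')
# Label each panel; plt.title alone would only title the last axes.
ax1.set_title('Corrected')
ax2.set_title('Uncorrected')
fig.suptitle('Sierra Negra - Corrected and Uncorrected Unwrapped Phase')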

In [None]:
difference = np.subtract(im_c,im_u)
fig = plt.figure(figsize=(18,10))
ax1 = fig.add_subplot(111)
fig_plot = ax1.imshow(difference,cmap='RdBu')
fig.colorbar(fig_plot, fraction=0.24, pad=0.02)
ax1.set(title='Sierra Negra - TRAIN Correction Difference Map [mm]')
plt.grid()

<font face='Calibri'><font size='3'>Looking at the Correction Difference Map, we can see that the size of the correction correlates with surface elevation. Without this correction, a volcanologist could misinterpret the elevation-dependent atmospheric signal as surface deformation, and therefore as an indication of magma injection and possible eruptive activity.</font></font>
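
<font face='Calibri'><font size='3'>The visual impression that the correction follows topography can be checked with a number. The sketch below computes the Pearson correlation and a linear slope between a correction map and a co-registered elevation grid. It runs on synthetic arrays only: the elevation model, the noise level, and the helper name <font face='Courier New'>elevation_correlation</font> are all assumptions made for illustration, and no DEM is loaded in this part of the notebook.</font></font>

```python
import numpy as np


def elevation_correlation(correction, dem):
    """Return (Pearson r, linear slope) of correction versus elevation,
    using only pixels that are finite in both arrays."""
    valid = np.isfinite(correction) & np.isfinite(dem)
    x = dem[valid]
    y = correction[valid]
    r = np.corrcoef(x, y)[0, 1]
    slope, _intercept = np.polyfit(x, y, 1)
    return r, slope


# Synthetic stand-ins: a Gaussian "volcano" and a delay that is linear in height.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:100, 0:100]
dem = 1000.0 * np.exp(-((xx - 50) ** 2 + (yy - 50) ** 2) / (2 * 20.0 ** 2))
correction = -0.002 * dem + rng.normal(0.0, 0.1, dem.shape)  # arbitrary units
correction[0, 0] = np.nan  # mimic a no-data pixel

r, slope = elevation_correlation(correction, dem)
print(f"Pearson r = {r:.3f}, slope = {slope:.5f} per metre")
```

<font face='Calibri'><font size='3'>With real data, <font face='Courier New'>correction</font> would be the <font face='Courier New'>difference</font> array from the cell above and <font face='Courier New'>dem</font> an elevation raster resampled to the same grid.</font></font>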

<font face='Calibri'><font size='4'><b>6.5 Convert back to original coordinate system</b></font>
    <br>
    <font size='3'>GIAnT requires the interferograms to be in a particular coordinate system. The original coordinate system is one of those accepted, so we will convert our interferograms back to that. </font></font>

In [None]:
#coord_TRAIN = '4326'
#utm = '32715' # the original coordinate system identified when reprojecting the interferograms to the same UTM

In [None]:
print(f"original coordinate system = EPSG:{utm}")
print(f"TRAIN coordinate system =    EPSG:{coord_TRAIN}")

In [None]:
paths = f"{corrected_folder}/*.tif"
tiff_paths = get_tiff_paths(paths)
print(tiff_paths[0])

In [None]:
# double check that the current coordinate system of the 
# files is different from the desired. 
utm_zones, utm_types = getUTM_znt(tiff_paths)
print(f"Current UTM Types & Zones = EPSG:{list(set(utm_zones))}")
print(f"Expected current system   = EPSG:{coord_TRAIN}")
print(f"Desired coordinate system = EPSG:{utm}")

In [None]:
## Create the converted subsets. 
for file in tiff_paths:
    # Designate the input file and its path
    inFile = file
    # Designate the output file and its path; ideally these are the same. 
    # gdalwarp should not write to its own input, so we write to a temporary 
    # file in the same folder and then replace the original with it. 
    path,fl= os.path.split(file)
    desig = 'TEMP_'
    new_name = desig+fl
    outFile = os.path.join(path,new_name)
    # create the convert command
    cmd = f"gdalwarp -t_srs EPSG:{utm} {inFile} {outFile}"
    #print(cmd)
    !{cmd} # convert the file
    # Replace the EPSG:4326 file with the reprojected one, but only if 
    # gdalwarp actually produced an output. This avoids deleting the 
    # original when the conversion fails. 
    if os.path.exists(outFile):
        os.replace(outFile,inFile)
    else:
        print(f"gdalwarp produced no output for {inFile}; file left unchanged")
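
<font face='Calibri'><font size='3'>The loop above relies on IPython's <font face='Courier New'>!</font> shell escape, which is only available inside a notebook. As a rough sketch of the same step in plain Python, the <font face='Courier New'>gdalwarp</font> call can be issued through <font face='Courier New'>subprocess</font>, which raises an error on failure so that the original file is replaced only after a successful conversion. <font face='Courier New'>build_warp_command</font> and <font face='Courier New'>reproject_in_place</font> are hypothetical helper names, and the file names and EPSG code in the demonstration are placeholders.</font></font>

```python
import os
import shutil
import subprocess


def build_warp_command(src, dst, epsg):
    """Argument list equivalent to: gdalwarp -t_srs EPSG:<epsg> <src> <dst>"""
    return ["gdalwarp", "-t_srs", f"EPSG:{epsg}", src, dst]


def reproject_in_place(src, epsg, prefix="TEMP_"):
    """Warp src to a temporary file next to it, then replace src with the result.

    subprocess.run(..., check=True) raises CalledProcessError when gdalwarp
    exits with an error, so os.replace is never reached on failure.
    """
    folder, name = os.path.split(src)
    tmp = os.path.join(folder, prefix + name)
    subprocess.run(build_warp_command(src, tmp, epsg), check=True)
    os.replace(tmp, src)
    return src


# Show the command that would be run; no raster is touched here.
demo_cmd = build_warp_command("a_unw_phase.tif", "TEMP_a_unw_phase.tif", "32715")
print(" ".join(demo_cmd))

# gdalwarp must be on the PATH for reproject_in_place to work.
gdalwarp_available = shutil.which("gdalwarp") is not None
print(f"gdalwarp found on PATH: {gdalwarp_available}")
```

<font face='Calibri'><font size='3'>In the notebook, this would be called as <font face='Courier New'>reproject_in_place(file, utm)</font> for each entry of <font face='Courier New'>tiff_paths</font>.</font></font>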

In [None]:
# Check that the coordinate conversion worked. 
utm_zones, utm_types = getUTM_znt(tiff_paths)
print(f"Current UTM Types & Zones = EPSG:{list(set(utm_zones))}")
print(f"Expected current system   = EPSG:{utm}")
print(f"Desired coordinate system = EPSG:{utm}")

<font face='Calibri'><font size='4'><b>6.6 Do another pixel check</b></font>
    <br>
    <font size='3'>Check the pixel sizes again, and then do the pixel correction if necessary.<br><br><i>This could be a student assignment</i>
    </font></font>
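
<font face='Calibri'><font size='3'>Reprojection can change the raster dimensions of individual files by a pixel or so, which is why the check is repeated here. As a minimal sketch of what such a check looks for, the snippet below takes per-file pixel and line counts, finds the most common raster size, and lists the files that deviate from it. <font face='Courier New'>find_size_outliers</font> is a hypothetical helper and is not the notebook's <font face='Courier New'>pixel_check</font> function; it assumes that the counts are available as two lists in the same order as the file paths, and the values in the demonstration are made up.</font></font>

```python
from collections import Counter


def find_size_outliers(paths, pixels, lines):
    """Return the most common (pixels, lines) size and the paths that differ from it."""
    sizes = list(zip(pixels, lines))
    common_size, _count = Counter(sizes).most_common(1)[0]
    outliers = [p for p, s in zip(paths, sizes) if s != common_size]
    return common_size, outliers


# Made-up example: the third file is one pixel wider than the others.
demo_paths = ["a_unw_phase.tif", "b_unw_phase.tif", "c_unw_phase.tif"]
demo_pixels = [500, 500, 501]
demo_lines = [400, 400, 400]

common_size, outliers = find_size_outliers(demo_paths, demo_pixels, demo_lines)
print(f"Most common size: {common_size}")
print(f"Files needing correction: {outliers}")
```

<font face='Calibri'><font size='3'>An empty outlier list means all rasters already share one size and no pixel correction is needed.</font></font>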

In [None]:
# get the tiff paths
paths = f"{corrected_folder}/*.tif"
tiff_paths = get_tiff_paths(paths)
for tiff in tiff_paths:
    print(tiff)

In [None]:
Pixels, Lines = pixel_check(tiff_paths)

In [None]:
pixel_correction(tiff_paths,Pixels,Lines,coords)

<font face='Calibri'><font size='3'>
You have now corrected the interferograms for atmospheric conditions and can proceed to Part 2: GIAnT. 
</font></font>

<font face="Calibri" size="2"> <i>GEOS 657 Microwave Remote Sensing - Version 1.0 - May 2020 </i>
</font>