<img src="NotebookAddons/blackboard-banner.jpg" width="100%" />
<font face="Calibri">
<br>
<font size="5"><b>Subset Data Stack</b><img style="padding: 7px" src="NotebookAddons/UAFLogo_A_647.png" width="170" align="right"/></font>

<br>
<font size="4"> <b>Alex Lewandowski; University of Alaska Fairbanks</b> <br>
</font>

<font size="3"> This notebook crops a directory of tiffs to a subset area of interest using an interactive Bokeh map.
</font>
</font>

<hr>
<font face="Calibri" size="5" color="red"> <b>Important Note about JupyterHub</b> </font>
<br><br>
<font face="Calibri" size="3"> <b>Your JupyterHub server will automatically shut down when left idle for more than 1 hour. Your notebooks will not be lost, but you will have to restart their kernels and re-run them from the beginning. You will not be able to seamlessly continue running a partially run notebook.</b> </font>


<hr>
<font face="Calibri">

<font size="5"> <b> 0. Importing Relevant Python Packages </b> </font>

<font size="3">In this notebook we will use the following scientific library:
<ol type="1">
    <li> <b><a href="https://www.gdal.org/" target="_blank">GDAL</a></b> is a software library for reading and writing raster and vector geospatial data formats. It includes a collection of programs tailored for geospatial data processing. Most modern GIS packages (such as ArcGIS or QGIS) use GDAL in the background.</li>
</ol>
</font>


<font face="Calibri" size="4" color="red"><b>IMPORTANT</b></font>
<br><br>
<font face="Calibri" size="3"><b>The first time you run a notebook containing an interactive Bokeh plot, you must first enable the jupyter serverextension by running the code cell below. It won't be enabled until the server is restarted. Before completing the rest of this notebook, run the cell below, click the "Control Panel" button at the top-right of the screen, click the "Stop My Server" button that appears, and then the "Start My Server" button. Finally, restart the notebook and run as normal.</b>
<br><br>
The code cell below may be commented out after performing the steps described above once.</font>

In [None]:
!jupyter serverextension enable --py nbserverproxy

<font face="Calibri" size="3"><b>Import the necessary libraries and modules:</b> </font>

In [None]:
import os
import glob
import json # for loads
import shutil

from osgeo import gdal  # "import gdal" is deprecated; the osgeo namespace works across GDAL versions

from pyproj import Proj, transform

from bokeh.io import output_notebook

from asf_notebook import new_directory
from asf_notebook import path_exists
from asf_notebook import remove_nan_filled_tifs
from asf_notebook import remote_jupyter_proxy_url
from asf_notebook import AOI
from asf_notebook import select_parameter

<font face="Calibri" size="3"><b>Set up interactive Bokeh plotting</b> inside the notebook:</font>

In [None]:
output_notebook()

<hr>
<font face="Calibri" size="3"><b>Write functions to gather and print individual tiff paths:</b> </font>

In [None]:
def get_tiff_paths(paths):
    tiff_paths = !ls $paths | sort -t_ -k5,5
    return tiff_paths

def print_tiff_paths(tiff_paths):
    print("Tiff paths:")
    for p in tiff_paths:
        print(f"{p}\n")
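
<font face="Calibri" size="3">The <code>!ls ... | sort -t_ -k5,5</code> line above relies on IPython shell magic, so it only runs inside a notebook. As a minimal sketch, the same idea can be written in plain Python. The function name <code>get_tiff_paths_sorted</code> and its <code>field</code> parameter are hypothetical, and the sketch assumes the sort field should be counted within the file name, whereas the shell version counts underscores across the whole path:</font>

```python
import glob
import os


def get_tiff_paths_sorted(pattern, field=4):
    """Return paths matching `pattern`, ordered by one underscore-delimited
    field of the file name (field=4 is the fifth field, as in `sort -k5,5`)."""
    def sort_key(path):
        parts = os.path.basename(path).split("_")
        # Names with too few fields get an empty key and sort first.
        value = parts[field] if len(parts) > field else ""
        # The full path breaks ties so the order is deterministic.
        return (value, path)

    return sorted(glob.glob(pattern), key=sort_key)
```

<font face="Calibri" size="3">Calling <code>get_tiff_paths_sorted(f"{tiff_dir}/*.tif")</code> would return a regular Python list rather than the IPython <code>SList</code> produced by the shell call.</font>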

<font face="Calibri" size="3"><b>Enter the path to the directory holding your tiffs:</b> </font>

In [None]:
while True:
    print("Enter the absolute path to the directory holding your tiffs.")
    tiff_dir = input()
    paths = f"{tiff_dir}/*.tif"
    if os.path.exists(tiff_dir):
        tiff_paths = get_tiff_paths(paths)
        if len(tiff_paths) < 1:
            print(f"{tiff_dir} exists but contains no tifs.")
            print("You will not be able to proceed until tifs are prepared.")
        break
    else:
        print(f"\n{tiff_dir} does not exist.")
        continue

<font face="Calibri" size="3"><b>Determine the path to the analysis directory containing the tiff directory:</b> </font>

In [None]:
analysis_dir = os.path.dirname(os.path.normpath(tiff_dir))  # normpath drops any trailing slash first
print(analysis_dir)
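
<font face="Calibri" size="3">One caveat when deriving a parent directory: <code>os.path.dirname</code> only strips the text after the last separator, so a path entered with a trailing slash comes back as the same directory rather than its parent. The short sketch below illustrates this on a POSIX system. The helper name <code>parent_of</code> and the example path are hypothetical:</font>

```python
import os


def parent_of(directory):
    """Return the parent directory, ignoring any trailing separator."""
    return os.path.dirname(os.path.normpath(directory))


# Hypothetical path typed with a trailing slash.
example = "/home/user/stack/tiffs/"
print(os.path.dirname(example))  # the tiff directory itself, not its parent
print(parent_of(example))        # the directory that contains the tiff directory
```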

<font face="Calibri" size="3"><b>Determine the UTM zone for your images.</b> This assumes you have already reprojected any tiffs with errant UTM zones to a single predominant UTM zone, using the Prepare_Data_Stack_Hyp3 notebook.</font>

In [None]:
info = (gdal.Info(tiff_paths[0], options = ['-json']))
info = (json.loads(info))['coordinateSystem']['wkt']
utm = info.split('"EPSG","')[-1].split('"')[0]
print(f"UTM Zone: {utm}")
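
<font face="Calibri" size="3">The cell above takes the last EPSG code that appears in the WKT string, which is normally the code of the projected coordinate system as a whole. A minimal sketch of the same extraction with a regular expression follows. The function name <code>epsg_from_wkt</code> and the abbreviated WKT strings are hypothetical, and accepting the unquoted <code>ID["EPSG",32606]</code> form is an assumption about how other GDAL versions may format the string:</font>

```python
import re


def epsg_from_wkt(wkt):
    """Return the last EPSG code found in a WKT string, or None if absent."""
    # Matches AUTHORITY["EPSG","32606"] as well as ID["EPSG",32606].
    matches = re.findall(r'(?:AUTHORITY|ID)\["EPSG",\s*"?(\d+)"?\]', wkt)
    return matches[-1] if matches else None


# Abbreviated, hypothetical WKT for illustration only.
sample_wkt = (
    'PROJCS["WGS 84 / UTM zone 6N",'
    'GEOGCS["WGS 84",AUTHORITY["EPSG","4326"]],'
    'UNIT["metre",1,AUTHORITY["EPSG","9001"]],'
    'AUTHORITY["EPSG","32606"]]'
)
print(f"UTM Zone: {epsg_from_wkt(sample_wkt)}")
```

<font face="Calibri" size="3">Returning <code>None</code> when no code is found makes a missing projection easier to detect than a silently wrong string.</font>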

<hr>
<font face="Calibri">

<font size="5"> <b>Subset The Tiffs</b> </font> 

<font size="3"><b>As a first step, retrieve the maximum extent coordinates for the image stack so we can zoom into the coverage area on the Bokeh map:</b>
</font> 
</font>

In [None]:
lower_left = [30000000, 30000000]
upper_right = [-30000000, -30000000]
for p in tiff_paths:
    info = (gdal.Info(p, options = ['-json']))
    l_l = (json.loads(info))['cornerCoordinates']['lowerLeft']
    u_r = (json.loads(info))['cornerCoordinates']['upperRight']
    if l_l[0] < lower_left[0]:
        lower_left[0] = l_l[0]
    if l_l[1] < lower_left[1]:
        lower_left[1] = l_l[1]
    if u_r[0] > upper_right[0]:
        upper_right[0] = u_r[0]
    if u_r[1] > upper_right[1]:
        upper_right[1] = u_r[1]
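
<font face="Calibri" size="3">The loop above computes the union of all image footprints: the smallest lower-left and the largest upper-right coordinates across the stack. As a sketch, the same reduction can be written with <code>min</code> and <code>max</code>, which avoids the sentinel starting values. The function name <code>union_extent</code> and the sample corner values are hypothetical:</font>

```python
def union_extent(corners):
    """Return (lower_left, upper_right) covering every footprint.

    `corners` is a sequence of (lower_left, upper_right) pairs,
    where each corner is an [x, y] list in the same projection.
    """
    corners = list(corners)
    if not corners:
        raise ValueError("at least one footprint is required")
    lower_left = [min(ll[0] for ll, _ in corners),
                  min(ll[1] for ll, _ in corners)]
    upper_right = [max(ur[0] for _, ur in corners),
                   max(ur[1] for _, ur in corners)]
    return lower_left, upper_right


# Two overlapping, made-up footprints in projected metres.
footprints = [
    ([400000.0, 7100000.0], [500000.0, 7200000.0]),
    ([450000.0, 7050000.0], [550000.0, 7150000.0]),
]
print(union_extent(footprints))
```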

<font face="Calibri" size="3"> <b>Convert the coordinates to EPSG:3857 (web-mercator), which is the projection used by Bokeh:</b> </font> 

In [None]:
out_proj = Proj(init="epsg:3857") #web mercator
in_proj = Proj(init=f"epsg:{utm}")
lower_left[0], lower_left[1] = transform(in_proj, out_proj, lower_left[0], lower_left[1])
upper_right[0], upper_right[1] = transform(in_proj, out_proj, upper_right[0], upper_right[1])
print(f"Lower Left Coord: {lower_left}")
print(f"Upper Right Coord: {upper_right}")
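
<font face="Calibri" size="3">Recent pyproj releases deprecate both the <code>Proj(init=...)</code> syntax and the module-level <code>transform</code> function in favor of the <code>Transformer</code> class. The sketch below shows the equivalent conversion under that API, assuming pyproj 2.2 or later is installed. The helper name <code>to_web_mercator</code> is hypothetical:</font>

```python
from pyproj import Transformer


def to_web_mercator(epsg, x, y):
    """Convert one (x, y) point from the given EPSG code to EPSG:3857."""
    # always_xy=True keeps the (easting, northing) / (lon, lat) axis order.
    transformer = Transformer.from_crs(f"EPSG:{epsg}", "EPSG:3857", always_xy=True)
    return transformer.transform(x, y)


# A point on the central meridian of UTM zone 6N, at the equator.
print(to_web_mercator(32606, 500000.0, 0.0))
```

<font face="Calibri" size="3">When converting many points, building the <code>Transformer</code> once and reusing it is cheaper than creating it on every call.</font>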

<font face="Calibri" size="3"><b>Create and display an interactive Area-of-Interest selector:</b></font> 

In [None]:
aoi = AOI(lower_left, upper_right)
aoi.display_AOI()

<font face="Calibri" size="3"> <b>Convert the EPSG:3857 coords back to the predominant EPSG in the data stack:</b></font> 

In [None]:
if not aoi.subset_coords[0][0]:
    print(f"WARNING: You must select a subset area of interest in the previous cell before continuing.")
    print(f"\nPlease make a selection and rerun this code cell.")
else:
    in_proj = Proj(init="epsg:3857")
    out_proj = Proj(init=f"epsg:{utm}")
    coords = [[None, None], [None, None]]
    coords[0][0], coords[0][1] = transform(in_proj, out_proj, 
                                           aoi.subset_coords[0][0], 
                                           aoi.subset_coords[1][1])
    coords[1][0], coords[1][1] = transform(in_proj, out_proj, 
                                           aoi.subset_coords[1][0], 
                                           aoi.subset_coords[0][1])
    print(coords)
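
<font face="Calibri" size="3">The cell above pairs the x value of one selected corner with the y value of the other because <code>gdal_translate -projwin</code> expects the window as upper-left x, upper-left y, lower-right x, lower-right y. The sketch below makes that ordering explicit for any two opposite corners. The function name <code>to_projwin</code> is hypothetical, and it assumes a north-up projected system in which y increases northward:</font>

```python
def to_projwin(corner_a, corner_b):
    """Return [[ulx, uly], [lrx, lry]] from any two opposite [x, y] corners."""
    x_min, x_max = sorted([corner_a[0], corner_b[0]])
    y_min, y_max = sorted([corner_a[1], corner_b[1]])
    # Upper-left is (smallest x, largest y); lower-right is the opposite.
    return [[x_min, y_max], [x_max, y_min]]


# Made-up lower-left and upper-right corners in projected metres.
print(to_projwin([450000.0, 7100000.0], [480000.0, 7130000.0]))
```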

<font size="3"> <b>Update the list of all the absolute paths of the tiffs:</b> </font> 

In [None]:
tiff_paths = get_tiff_paths(paths)
#print_tiff_paths(tiff_paths)

In [None]:
print("Choose a directory name in which to store the subset geotiffs.")
print("Note: this will sit alongside the directory containing your pre-subset geotiffs.")
while True:
    sub_name = input()
    if sub_name == "":
        print("Please enter a valid directory name")
        continue
    else:
        break

<font size="3"><b>Subset the tiffs and write them to the new subset directory, naming each output by acquisition date and polarization:</b></font> 

In [None]:
subset_dir = f"{analysis_dir}/{sub_name}/"
new_directory(subset_dir)
for i, tiff_path in enumerate(tiff_paths):
    date = tiff_path.split('/')[-1].split('_')[3].split('T')[0]
    polarization = tiff_path.split('/')[-1].split('_')[6][0:2]
    print(f"\nProduct #{i+1}:")
    gdal_command = f"gdal_translate -projwin {coords[0][0]} {coords[0][1]} {coords[1][0]} {coords[1][1]} -projwin_srs 'EPSG:{utm}' -co \"COMPRESS=DEFLATE\" -a_nodata 0 {tiff_path} {subset_dir}{date}_{polarization}.tiff"
    print(f"Calling the command: {gdal_command}")
    !{gdal_command}
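
<font face="Calibri" size="3">The loop above builds one shell string per product, which can break if a path contains spaces. As a sketch, the same command can be assembled as an argument list using the same flags. The function names and the sample file name are hypothetical. The sample only mirrors the field positions the cell above relies on: date in the fourth underscore-delimited field, polarization at the start of the seventh:</font>

```python
import os


def parse_product_name(tiff_path):
    """Return (date, polarization) parsed from a product file name."""
    fields = os.path.basename(tiff_path).split("_")
    date = fields[3].split("T")[0]   # keep YYYYMMDD, drop the time of day
    polarization = fields[6][0:2]    # e.g. "VV" from "VV.tif"
    return date, polarization


def build_translate_args(coords, utm, src, dst):
    """Return the gdal_translate call as a list, one argument per item."""
    (ulx, uly), (lrx, lry) = coords
    return [
        "gdal_translate",
        "-projwin", str(ulx), str(uly), str(lrx), str(lry),
        "-projwin_srs", f"EPSG:{utm}",
        "-co", "COMPRESS=DEFLATE",
        "-a_nodata", "0",
        src, dst,
    ]


# Hypothetical file name used only to show the field layout.
sample = "/data/tiffs/S1A_IW_RT30_20180727T025943_G_gpn_VV.tif"
date, pol = parse_product_name(sample)
args = build_translate_args([[450000.0, 7130000.0], [480000.0, 7100000.0]],
                            "32606", sample, f"/data/subset/{date}_{pol}.tiff")
print(args)
```

<font face="Calibri" size="3">The resulting list could be passed to <code>subprocess.run(args, check=True)</code>, which skips the shell and raises an error if <code>gdal_translate</code> fails.</font>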

<font size="3"><b>Delete any subset tifs that are filled with NaNs and contain no data.</b></font>

In [None]:
subset_paths = f"{subset_dir}*.tiff"
tiff_paths = get_tiff_paths(subset_paths)
remove_nan_filled_tifs(subset_dir, tiff_paths)

<font size="3"><b>Decide whether or not to clean up the original tiffs:</b></font> 

In [None]:
cleanup = select_parameter('', ["Save original tiffs", "Delete original tiffs"])
cleanup

In [None]:
if cleanup.value == 'Delete original tiffs':
    shutil.rmtree(tiff_dir)

<font size="3"><b>Print the path to your subset directory:</b></font> 

In [None]:
print(subset_dir[:-1])

<font face="Calibri" size="2"> <i>GEOS 657 Microwave Remote Sensing - Version 1.0 - April 2019 </i>
</font>