## Topographic Complexity/Variability: Mexican Hat Wavelet Analysis  
* The original MATLAB code was developed by Dr. Adam M. Booth (Portland State University).  
    * Citations:  
        * Booth, A.M., Roering, J.J., Perron, J.T., 2009. Automated landslide mapping using spectral analysis and high-resolution topographic data: Puget Sound lowlands, Washington, and Portland Hills, Oregon. Geomorphology 109, 132-147. https://doi.org/10.1016/j.geomorph.2009.02.027  
        * Booth, A.M., LaHusen, S.R., Duvall, A.R., Montgomery, D.R., 2017. Holocene history of deep-seated landsliding in the North Fork Stillaguamish River valley from surface roughness analysis, radiocarbon dating, and numerical landscape evolution modeling. Journal of Geophysical Research: Earth Surface 122, 456-472. https://doi.org/10.1002/2016JF003934  
* This MATLAB code was later adapted by Dr. Sean R. LaHusen (University of Washington) and revised by Erich N. Herzig (University of Washington).
    * Citations:  
       * LaHusen, S.R., Duvall, A.R., Booth, A.M., Montgomery, D.R., 2016. Surface roughness dating of long-runout landslides near Oso, Washington (USA), reveals persistent postglacial hillslope instability. Geology 44, 111-114. https://doi.org/10.1130/G37267.1  
       * LaHusen, S.R., Duvall, A.R., Booth, A.M., Grant, A., Mishkin, B.A., Montgomery, D.R., Struble, W., Roering, J.J., Wartman, J., 2020. Rainfall triggers more deep-seated landslides than Cascadia earthquakes in the Oregon Coast Range, USA. Science Advances 6, eaba6790. https://doi.org/10.1126/sciadv.aba6790  
       * Herzig, E.N., Duvall, A.R., Booth, A.M., Stone, I., Wirth, E., LaHusen, S.R., Wartman, J., Grant, A., 2023. Evidence of Seattle Fault earthquakes from patterns in deep-seated landslides. Bulletin of the Seismological Society of America. https://doi.org/10.1785/0120230079  
* In November 2023, the code was translated into this Python version and optimized by Dr. Larry Syu-Heng Lai (University of Washington).

In [None]:
import os
import numpy as np
from scipy.signal import fftconvolve
import rasterio
import dask.array as da
from dask.diagnostics import ProgressBar

### Mexican hat function using FFT convolution
"""
    Computes the 2D continuous wavelet transform (CWT) of dem using the Mexican hat wavelet.
    Optimized using FFT-based convolution.
    
    dem = digital elevation model
    a = wavelet scale
    dx = grid spacing (same in x- and y-directions)

    Returns:
    C = array of wavelet coefficients, indexed to dem
    frq = bandpass frequency of wavelet at scale a
    wave = wavelength (inverse of frq)
"""
In this version, edge artifacts are masked during chunk processing.
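
For reference, the quantities described above can be written out as follows. This is a transcription of the expressions in the function below, not an independent derivation. Here $X$ and $Y$ are offsets in grid cells, $z$ is the elevation in meters, and $*$ denotes 2D convolution:

$$
\psi(X, Y) = -\frac{1}{\pi\,(a\,dx)^4}\left(1 - \frac{X^2 + Y^2}{2a^2}\right)\exp\left(-\frac{X^2 + Y^2}{2a^2}\right)
$$

$$
C = dx^2\,(z * \psi), \qquad \mathrm{wave} = \frac{2\pi\,a\,dx}{\sqrt{5/2}}, \qquad \mathrm{frq} = \frac{1}{\mathrm{wave}}
$$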

In [None]:
def conv2_mexh_fft(dem, a, dx):
    # Generate the mexican hat wavelet kernel at wavelet scale a.
    # The kernel must be large enough for the wavelet to decay to ~0 at the edges.
    X, Y = np.meshgrid(np.arange(-8*a, 8*a+1), np.arange(-8*a, 8*a+1))

    # This psi has been scaled, so C is equal to curvature:
    psi = (-1/(np.pi*(a*dx)**4)) * (1 - (X**2 + Y**2)/(2*a**2)) * np.exp(-(X**2 + Y**2)/(2*a**2))  # units of [1/(m^4)]

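    # Convolve dem with psi using scipy's FFT-based fftconvolve, multiplying by dx^2
    # to approximate the double integral. 'same' mode crops C to the same size as dem.
    # The factor 0.3048 converts elevations from feet to meters (this assumes the
    # input DEM stores elevation in feet; drop it if the DEM is already in meters).
    C = (dx**2) * fftconvolve(dem*0.3048, psi, mode='same')  # units of [(m^2) x (m) x (1/(m^4)) = (1/m)]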

    # Frequency and wavelength vectors:
    wave = 2*np.pi*dx*a/(5/2)**(1/2)  # Torrence and Compo [1998]
    frq = 1./wave

    return C, frq, wave
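
The comment that `psi` is scaled so that `C` equals curvature can be checked on a synthetic surface. The sketch below is a minimal sanity check, not part of the original workflow. It rebuilds the same kernel expression, samples a paraboloid `z = x^2 + y^2` (whose Laplacian is 4 everywhere), and evaluates the convolution at the center by direct summation instead of `fftconvolve`. The values of `a` and `dx` are hypothetical, and the feet-to-meters factor `0.3048` is left out because the synthetic surface is already in meters.

```python
import numpy as np

a, dx = 4, 2.0  # hypothetical wavelet scale (cells) and grid spacing (m)

# Same kernel expression as in conv2_mexh_fft
offsets = np.arange(-8 * a, 8 * a + 1)
X, Y = np.meshgrid(offsets, offsets)
r2 = X**2 + Y**2
psi = (-1 / (np.pi * (a * dx)**4)) * (1 - r2 / (2 * a**2)) * np.exp(-r2 / (2 * a**2))

# Synthetic paraboloid z = x^2 + y^2 (meters), sampled on the kernel grid
z = (X * dx)**2 + (Y * dx)**2

# Convolution value at the paraboloid's center. psi is symmetric, so the
# convolution there reduces to an element-wise product followed by a sum.
C_center = dx**2 * np.sum(psi * z)

# The kernel should also sum to (nearly) zero, so flat terrain gives C = 0.
kernel_sum = dx**2 * np.sum(psi)

print(f"C at center: {C_center:.6f} (Laplacian of the paraboloid is 4)")
print(f"Kernel sum:  {kernel_sum:.2e}")
```

With an integer `a` the kernel grid is centered on zero. A non-integer `a` such as 4.1 shifts the grid slightly, so this check is only indicative for that case.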

### Define a function for chunk processing (avoids RAM issues with large GeoTIFFs)
With this method, the maximum size of the imported GeoTIFF is limited by your RAM capacity. Files smaller than about 3.7 GB usually work best with this version. To process much larger GeoTIFFs, consider the 'chunkIC' version, which writes output incrementally; it can handle such files, but at the cost of lower performance (longer processing time).
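
To judge whether a raster is likely to fit in RAM, it can help to estimate the in-memory size of one band from its dimensions. The sketch below is a rough estimate only. The helper `estimate_array_gib` and the raster dimensions are hypothetical, and it assumes a float32 input plus a float64 intermediate result. The size of a compressed GeoTIFF on disk can be much smaller than its size in memory.

```python
def estimate_array_gib(rows, cols, bytes_per_cell=4):
    """Approximate in-memory size of one raster band, in GiB."""
    return rows * cols * bytes_per_cell / 2**30

# Hypothetical raster dimensions, for illustration only
rows, cols = 30000, 30000

dem_gib = estimate_array_gib(rows, cols, bytes_per_cell=4)     # float32 input band
result_gib = estimate_array_gib(rows, cols, bytes_per_cell=8)  # float64 result (assumed)

print(f"DEM: {dem_gib:.2f} GiB, result: {result_gib:.2f} GiB")
```

In a real run, the dimensions could be taken from `src.height` and `src.width` of the opened rasterio dataset.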

In [None]:
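def process_with_dask(input_dir, output_dir, a, dx, overlap, chunksize=(1024, 1024)):
    # input_dir and output_dir are paths to the input and output GeoTIFF files.
    # overlap is the number of cells each chunk borrows from its neighbors.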
    with rasterio.open(input_dir) as src:
        meta = src.meta.copy()
        
        # Create a Dask array from the raster data
        dem = da.from_array(src.read(1), chunksize)  # Adjust chunk size as needed
        
        # Use map_overlap to apply processing with overlap
        processed_data = dem.map_overlap(
            lambda block: np.abs(conv2_mexh_fft(block, a, dx)[0]),
            depth=overlap,  # Specifies the overlap
            boundary='reflect',
            trim=True,
            dtype=np.float32
        )

    # Setup output metadata
    meta.update(dtype=rasterio.float32, count=1, compress='lzw', bigtiff='IF_SAFER')
    
    with ProgressBar():
        with rasterio.open(output_dir, 'w', **meta) as dst:
            # Compute the processed data
            result = processed_data.compute()
            
            # Mask edge effects with NaN (no data) values
            fringeval = int(np.ceil(a*4))
            result[:fringeval, :] = np.nan
            result[:, :fringeval] = np.nan
            result[-fringeval:, :] = np.nan
            result[:, -fringeval:] = np.nan
            
            # Write result to file, taking into account the dtype
            if meta['dtype'] == rasterio.float32:
                dst.write(result, 1)
            else:
                # For non-float data types, need to ensure NaNs are handled properly
                # This example assumes float32 output; adjust as needed for other types
                raise ValueError("Output data type must be float32 to support NaN values.")
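
The overlap-and-trim idea behind `map_overlap` can be illustrated without Dask. The sketch below is a simplified 1D analogue in plain NumPy, not the Dask implementation. Each chunk is extended by `depth` cells on both sides, filtered, and then trimmed back to its own extent. The function `filter_chunked`, the moving-average kernel, and the chunk sizes are hypothetical, and the outer edges are zero-padded rather than reflected as with `boundary='reflect'`.

```python
import numpy as np

def filter_chunked(x, kernel, chunk, depth):
    """Filter x chunk by chunk, borrowing `depth` cells from each neighbor."""
    out = np.empty(len(x), dtype=float)
    n = len(x)
    for start in range(0, n, chunk):
        stop = min(start + chunk, n)
        lo = max(start - depth, 0)  # extend the chunk to the left
        hi = min(stop + depth, n)   # extend the chunk to the right
        block = np.convolve(x[lo:hi], kernel, mode='same')
        # Trim the borrowed cells so only the chunk's own cells are kept
        out[start:stop] = block[start - lo:start - lo + (stop - start)]
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
kernel = np.ones(9) / 9  # moving average with a half-width of 4 cells

full = np.convolve(x, kernel, mode='same')                  # whole-array reference
deep = filter_chunked(x, kernel, chunk=100, depth=4)        # depth covers the kernel
shallow = filter_chunked(x, kernel, chunk=100, depth=2)     # depth is too small

print("depth=4 matches whole-array result:", np.allclose(full, deep))
print("depth=2 matches whole-array result:", np.allclose(full, shallow))
```

In this toy case the chunked result reproduces the whole-array result only when `depth` is at least the kernel half-width. A smaller `depth` leaves differences near the chunk boundaries.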


### Set up file paths

In [None]:
# Main execution
base_dir = '.../'  # replace with your own directory
#input_dir = os.path.join(base_dir, 'Test_DEM.tif')
#output_dir = os.path.join(base_dir, 'Test_pymexhat_chunk.tif')
input_dir = os.path.join(base_dir, 'Testbig_DEM.tif')
output_dir = os.path.join(base_dir, 'Testbig_pymexhat_chunk.tif')

In [None]:
# Read the DEM file
with rasterio.open(input_dir) as src:
    dem = src.read(1)
    transform = src.transform
    crs = src.crs

    # Print the CRS information
    print(f"CRS: {crs}")
    print(f"CRS as WKT: {crs.wkt}")
    print(f"CRS as PROJ string: {crs.to_proj4()}")
    print(f"CRS as EPSG code: {crs.to_epsg()}")
    print(f"CRS as dictionary: {crs.to_dict()}")

In [None]:
dx = 1.8288  # Grid spacing in meters (1.8288 m = 6 ft)
a = 4.1  # Scale of the wavelet, in grid cells

# Define overlap size
overlap = int(a * 4)  # Example overlap based on wavelet scale 'a'
#overlap = 50  # define arbitrary size of overlap
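
To see what these settings mean in physical units, the arithmetic can be worked through directly. The sketch below reuses the formulas from `conv2_mexh_fft` and `process_with_dask` with the values of `dx` and `a` set above. The variable names are introduced here for illustration, and the kernel half-width is an approximation of the `8*a` extent used to build the kernel grid.

```python
import math

dx = 1.8288  # grid spacing used in this notebook (m)
a = 4.1      # wavelet scale used in this notebook (grid cells)

# Same wavelength formula as in conv2_mexh_fft
wavelength_m = 2 * math.pi * dx * a / math.sqrt(5 / 2)

overlap_cells = int(a * 4)             # overlap passed to map_overlap
fringe_cells = int(math.ceil(a * 4))   # edge strip set to NaN after processing
kernel_half_width_cells = int(8 * a)   # approximate half-width of the kernel grid

print(f"wavelength:        {wavelength_m:.1f} m")
print(f"overlap:           {overlap_cells} cells ({overlap_cells * dx:.1f} m)")
print(f"masked fringe:     {fringe_cells} cells ({fringe_cells * dx:.1f} m)")
print(f"kernel half-width: {kernel_half_width_cells} cells ({kernel_half_width_cells * dx:.1f} m)")
```

With these values the overlap is about half the kernel half-width. If seams appear along chunk boundaries in the output, one option worth testing is a larger `overlap`, up to the kernel half-width.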

In [None]:
import time
start_time = time.time()  # Start timer

chunksize = (1024, 1024) # Adjust the chunk size to best utilize the memory.
process_with_dask(input_dir, output_dir, a, dx, overlap, chunksize)

end_time = time.time()  # Stop timer
print(f"Execution time: {end_time - start_time:.6f} seconds")