
# FFT Basics on Detrended SNR (GNSS‑IR)

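In this notebook, you'll compute a **Fast Fourier Transform (FFT)** of detrended SNR
in the natural variable \(x = \sin E\) (with \(E\) the satellite elevation angle) to identify the
**dominant frequency** \(f\), in cycles per unit \(\sin E\), and convert it to
**reflector height** \( h \approx (\lambda/2) f \), where \(\lambda\) is the carrier wavelength.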

**You will:**
- Load cleaned arcs from `cleaned_arcs.csv` (from the previous notebook).
- Resample SNR onto a **uniform** grid in \(x = \sin E\) (needed for FFT).
- Apply a taper window, compute the amplitude spectrum, and find the peak.
- Convert peak frequency to reflector height \(h\), with a simple peak‑width diagnostic.
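
The height relation can be motivated with a short derivation. This is a sketch that assumes a single flat, horizontal reflector at height \(h\) below the antenna and a surface that does not move during the arc; real sites only approximate this. The reflected signal travels an extra path of \(2h\sin E\), so its phase relative to the direct signal is

\[
\Delta\phi(E) = \frac{2\pi}{\lambda}\,2h\sin E = 2\pi\left(\frac{2h}{\lambda}\right)x,
\qquad x = \sin E .
\]

The detrended SNR therefore oscillates as \(\cos(\Delta\phi + \phi_0)\), a sinusoid in \(x\) with frequency

\[
f = \frac{2h}{\lambda}
\quad\Longrightarrow\quad
h = \frac{\lambda}{2}\,f .
\]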


In [None]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from pathlib import Path



## Resampling to a Uniform \(\sin E\) Grid

FFT assumes uniformly sampled data. We therefore **resample** the detrended SNR to a uniform grid
over the observed \(\sin E\) range.
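
To see why resampling is needed, the sketch below assumes samples taken at equal steps in elevation angle (an illustrative assumption; receivers actually log at equal time steps) and shows that their spacing in \(\sin E\) is not constant. It then uses `np.interp` to place the values on an evenly spaced grid. The toy signal and grid size are made up for the demonstration.

```python
import numpy as np

# Assumption for illustration: 200 samples at equal steps in elevation (5 to 30 deg).
elev_deg = np.linspace(5.0, 30.0, 200)
x = np.sin(np.deg2rad(elev_deg))      # x = sin(E): increasing, but unevenly spaced
y = np.cos(2 * np.pi * 20.0 * x)      # toy oscillation at 20 cycles per unit sin(E)

steps = np.diff(x)
print(f"spacing in sin(E): min={steps.min():.5f}, max={steps.max():.5f}")

# Linear interpolation onto a uniform grid spanning the same range.
x_u = np.linspace(x.min(), x.max(), 512)
y_u = np.interp(x_u, x, y)
dx = x_u[1] - x_u[0]
print(f"uniform spacing dx={dx:.5f}")
```

The spacing shrinks toward higher elevations because \(d(\sin E) = \cos E \, dE\), which is why an FFT applied directly to the original samples would misplace the frequency content.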


In [None]:

def resample_uniform_sinE(sinE, y, n=1024):
    """Resample y(sinE) to a uniform grid in sinE using linear interpolation.
    Returns x_u (uniform sinE), y_u, and grid spacing dx.
    """
    sinE = np.asarray(sinE, float)
    y = np.asarray(y, float)
    m = np.isfinite(sinE) & np.isfinite(y)
    sinE, y = sinE[m], y[m]
    if sinE.size < 8:
        raise ValueError('Not enough points to resample')
    smin, smax = sinE.min(), sinE.max()
    if not np.isfinite(smin) or not np.isfinite(smax) or smax <= smin:
        raise ValueError('Invalid sinE range')
    x_u = np.linspace(smin, smax, int(n))
    order = np.argsort(sinE, kind='stable')  # np.interp expects increasing sample points
    y_u = np.interp(x_u, sinE[order], y[order])
    dx = x_u[1] - x_u[0]
    return x_u, y_u, dx



## Windowing and FFT

We apply a Hann window to mitigate spectral leakage, then compute the **one‑sided** FFT amplitude spectrum.
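
The effect of the taper can be checked on a toy signal. The sketch below assumes a unit-length record holding 20.5 cycles of a cosine (deliberately not an integer number of cycles) and compares how much amplitude leaks into bins far from the peak with and without a periodic Hann taper. The record length and the "far from the peak" cutoff at bin 60 are arbitrary choices for illustration.

```python
import numpy as np

n = 512
x = np.arange(n) / n                   # unit-length record
y = np.cos(2 * np.pi * 20.5 * x)       # 20.5 cycles: the tone falls between bins 20 and 21

# Periodic Hann taper.
w = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(n) / n)

amp_rect = (2.0 / n) * np.abs(np.fft.rfft(y))       # no taper
amp_hann = (2.0 / n) * np.abs(np.fft.rfft(y * w))   # Hann taper

# Largest amplitude found well away from the tone (bins 60 and above).
far = slice(60, None)
leak_rect = amp_rect[far].max()
leak_hann = amp_hann[far].max()
print(f"far-from-peak leakage: no taper={leak_rect:.2e}, Hann={leak_hann:.2e}")
```

The Hann result should be smaller by orders of magnitude. The price is a wider main lobe and a peak amplitude reduced by about half, since the taper's mean value is 0.5.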


In [None]:

def hann_window(n):
    return 0.5 - 0.5*np.cos(2*np.pi*np.arange(n)/n)

def rfft_amplitude(y, dx):
    y = np.asarray(y, float)
    n = len(y)
    Y = np.fft.rfft(y)
    # Amplitude (not power) spectrum. The 2/n factor recovers a sinusoid's amplitude
    # for an untapered input; a Hann taper (mean 0.5) roughly halves the peak value.
    amp = (2.0 / n) * np.abs(Y)
    # frequency axis in cycles per unit x, where x=sinE
    freqs = np.fft.rfftfreq(n, d=dx)
    return freqs, amp



## Peak Picking (with Quadratic Refinement)

We locate the maximum (excluding the DC bin), then refine its position using a simple **parabolic fit**
to the peak and its neighbors.
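
A small worked example makes the three-point formula concrete. The sketch below samples an exact parabola whose maximum is placed, by construction, at bin 10.3 with height 1.0, then recovers both values from the samples at bins 9, 10, and 11. The numbers are invented for illustration; a real spectral peak is only approximately parabolic, so the refinement is approximate there.

```python
import numpy as np

# Three samples of an exact parabola with its maximum at bin 10.3 and height 1.0.
bins = np.array([9.0, 10.0, 11.0])
y1, y2, y3 = 1.0 - 0.1 * (bins - 10.3) ** 2

denom = y1 - 2.0 * y2 + y3                  # curvature term (negative at a maximum)
delta = 0.5 * (y1 - y3) / denom             # vertex offset from the middle bin, in bins
peak_bin = bins[1] + delta
peak_val = y2 - 0.25 * (y1 - y3) * delta    # parabola value at the vertex

print(f"offset={delta:.3f} bins, refined bin={peak_bin:.3f}, peak value={peak_val:.3f}")
# prints: offset=0.300 bins, refined bin=10.300, peak value=1.000
```

Multiplying the offset by the bin spacing converts it to frequency units.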


In [None]:

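def quadratic_peak_refine(freqs, amp):
    """Return (f0, apeak): the largest non-DC peak, refined by 3-point parabolic interpolation."""
    if len(amp) < 4:
        # too short to refine; NaN (not None) keeps downstream np.isfinite checks working
        return np.nan, np.nan
    # exclude DC bin
    k = int(np.argmax(amp[1:])) + 1
    if k >= len(amp)-1:
        # peak sits in the last bin: no right-hand neighbor to fit
        return freqs[k], amp[k]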
    # parabolic interpolation using three points (k-1, k, k+1)
    y1, y2, y3 = amp[k-1], amp[k], amp[k+1]
    denom = (y1 - 2*y2 + y3)
    if denom == 0:
        f0 = freqs[k]
        apeak = y2
    else:
        delta = 0.5*(y1 - y3)/denom  # peak offset in bins
        f0 = freqs[k] + delta*(freqs[1]-freqs[0])
        apeak = y2 - 0.25*(y1 - y3)*delta
    return f0, apeak



## From Frequency to Reflector Height

Given a dominant frequency \(f\) (cycles per unit \(\sin E\)), the **reflector height** estimate is
\( h \approx (\lambda/2) f \).
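
As a quick numerical check of the scale involved, the sketch below converts a hypothetical peak at 20 cycles per unit \(\sin E\) to a height using the GPS L1 carrier. The peak value is an assumption chosen for illustration, not a measurement.

```python
# GPS L1 wavelength from the speed of light and the carrier frequency.
c = 299_792_458.0            # m/s
f_carrier_L1 = 1575.42e6     # Hz
wavelength_L1 = c / f_carrier_L1

f_example = 20.0             # hypothetical spectral peak, cycles per unit sin(E)
h_example = 0.5 * wavelength_L1 * f_example

print(f"lambda_L1 = {wavelength_L1:.4f} m, h = {h_example:.3f} m")
# prints: lambda_L1 = 0.1903 m, h = 1.903 m
```

Each cycle per unit \(\sin E\) therefore corresponds to roughly 9.5 cm of reflector height on L1.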


In [None]:

# Speed of light (m/s), GPS carrier frequencies (Hz), and derived wavelengths (m)
c = 299792458.0
f_L1 = 1575.42e6
f_L2 = 1227.60e6
f_L5 = 1176.45e6
lambda_L1 = c / f_L1
lambda_L2 = c / f_L2
lambda_L5 = c / f_L5

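def height_from_freq(f_peak, band='L1'):
    """Reflector height (m) from a peak frequency in cycles per unit sinE."""
    wavelengths = {'L1': lambda_L1, 'L2': lambda_L2, 'L5': lambda_L5}
    if band not in wavelengths:
        # fail loudly instead of silently falling back to L1
        raise ValueError(f'Unknown band {band!r}; expected one of {sorted(wavelengths)}')
    return 0.5 * wavelengths[band] * f_peak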



## Load Cleaned Arcs

Either load the CSV written by the previous notebook (`/mnt/data/cleaned_arcs.csv`)
or fall back to a small synthetic example.


In [None]:

def synth_clean():
    # simple synthetic arc in sinE with a cosine at f ~ 20 cycles/unit
    rng = np.random.default_rng(0)
    sinE = np.linspace(np.sin(np.deg2rad(5)), np.sin(np.deg2rad(30)), 400)
    f_true = 20.0  # cycles per unit sinE
    y = np.cos(2*np.pi*f_true*sinE + 0.3) + 0.2*rng.standard_normal(len(sinE))
    df = pd.DataFrame({'sinE': sinE, 'snr_detrended_z': y, 'segment_id': 1,
                       'prn':'G12','elev_deg':np.rad2deg(np.arcsin(sinE))})
    return df

path = Path('/mnt/data/cleaned_arcs.csv')
if path.exists():
    data = pd.read_csv(path)
    print('Loaded:', path)
else:
    print('Using synthetic clean data (create cleaned_arcs.csv to use real data).')
    data = synth_clean()

# Choose a segment to analyze
if 'segment_id' in data.columns:
    seg_id = int(data['segment_id'].iloc[0])
    arc = data[data['segment_id']==seg_id].copy()
else:
    arc = data.copy()

arc = arc.dropna(subset=['sinE','snr_detrended_z'])
arc.head()



## Compute FFT and Pick the Peak


In [None]:

# Resample to uniform grid
x_u, y_u, dx = resample_uniform_sinE(arc['sinE'].values, arc['snr_detrended_z'].values, n=2048)

# Apply Hann window
w = hann_window(len(y_u))
y_w = (y_u - np.mean(y_u)) * w

# FFT
freqs, amp = rfft_amplitude(y_w, dx)

# Peak detection (skip DC)
f_peak, a_peak = quadratic_peak_refine(freqs, amp)
f_peak, a_peak, dx



## Estimate Reflector Height and a Simple Peak‑Width Diagnostic


In [None]:

# Choose band for wavelength (adjust as needed)
BAND = 'L1'
h_est = height_from_freq(f_peak, band=BAND)

# Simple width diagnostic: approximate full width at half maximum (FWHM) around the peak bin
if np.isfinite(f_peak):
    k = np.argmin(np.abs(freqs - f_peak))
    half = amp[k] / 2.0
    # search left
    iL = k
    while iL>1 and amp[iL] > half:
        iL -= 1
    # search right
    iR = k
    while iR < len(amp)-1 and amp[iR] > half:
        iR += 1
    fwhm = freqs[iR] - freqs[iL] if (iR>iL) else np.nan
else:
    fwhm = np.nan

h_est, f_peak, fwhm



## Plots
1) Detrended z‑score vs \(\sin E\) (raw and uniform‑resampled).  
2) Amplitude spectrum with peak marker and inferred \(h\).


In [None]:

fig, ax = plt.subplots(2, 1, figsize=(8,8))

# Top: signal vs sinE
ax[0].plot(arc['sinE'].values, arc['snr_detrended_z'].values, '.', ms=3, label='original')
ax[0].plot(x_u, y_u, '-', lw=1, label='uniform resample')
ax[0].set_xlabel('sin(E)')
ax[0].set_ylabel('Detrended Z-score')
ax[0].set_title('Signal in sin(E)')
ax[0].grid(True)
ax[0].legend()

# Bottom: spectrum
ax[1].plot(freqs, amp, lw=1)
if np.isfinite(f_peak):
    ax[1].axvline(f_peak, ls='--', label=f'Peak f={f_peak:.2f} cyc/(sinE)')
    ax[1].text(f_peak, max(amp)*0.9, f'h≈{h_est:.2f} m ({BAND})', ha='center')
    ax[1].legend()  # only drawn when a labeled artist exists (avoids an empty-legend warning)
ax[1].set_xlabel('Frequency [cycles per unit sin(E)]')
ax[1].set_ylabel('Amplitude (a.u.)')
ax[1].set_title('One-sided FFT Amplitude Spectrum')
ax[1].grid(True)

plt.tight_layout()
plt.show()



## Notes & Guidance

- **Uniform grid:** Lomb–Scargle avoids resampling; we'll cover it in the next notebook. Here we use FFT to build intuition.
- **Windowing:** Hann taper reduces leakage when the arc doesn't contain an integer number of cycles.
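- **Peak width (FWHM):** Narrower peaks generally indicate better coherence and a more stable height estimate, but the width cannot fall below the limit set by the arc's span in \(\sin E\) and the taper.
- **Elevation window:** If your arc spans a small range in \(E\), the frequency resolution worsens: the record in \(x=\sin E\) is shorter, and the FFT bin spacing is roughly \(1/(x_{\max} - x_{\min})\).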
- **Multiple peaks:** Can indicate mixed reflections, vegetation dynamics, or imperfect detrending.
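
To put numbers on the elevation-window note, the sketch below compares the approximate FFT bin spacing for two elevation windows and expresses it as a height on L1. `freq_resolution` is a hypothetical helper written for this example, and the two windows are arbitrary choices.

```python
import numpy as np

def freq_resolution(e_min_deg, e_max_deg):
    """Approximate FFT bin spacing, 1 / (span in sin E), in cycles per unit sin(E)."""
    span = np.sin(np.deg2rad(e_max_deg)) - np.sin(np.deg2rad(e_min_deg))
    return 1.0 / span

wavelength_L1 = 299_792_458.0 / 1575.42e6    # m

for e_min, e_max in [(5, 30), (5, 15)]:
    df = freq_resolution(e_min, e_max)
    dh = 0.5 * wavelength_L1 * df            # the same bin spacing expressed as height
    print(f"{e_min}-{e_max} deg: df = {df:.2f} cycles per unit sin(E), dh = {dh:.2f} m")
```

Narrowing the window from 5–30° to 5–15° more than doubles the bin spacing, from about 0.23 m to about 0.55 m in height terms. Peak interpolation can locate a single clean peak more finely than this, but two reflectors closer together than the bin spacing will not be separated.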



## Exercises
1. Vary the resample size (e.g., 512, 1024, 4096). How do the **frequency resolution** and peak sharpness change?
2. Try different tapers (Hann vs no taper). How much **leakage** do you observe?
3. Change the **elevation window** in the previous notebook and re‑export `cleaned_arcs.csv`. Compare spectra.
4. Estimate \(h\) using L2 and L5 wavelengths; discuss differences vs L1.



## Next Notebook
**`07_LombScargle_Spectral_Analysis.ipynb`** — We’ll compute Lomb–Scargle spectra (no uniform resampling required), compare with FFT, and improve peak picking and confidence metrics.
