
# Lomb–Scargle Spectral Analysis (GNSS‑IR)

This notebook estimates the dominant oscillation in detrended SNR **without** resampling
by using a **Lomb–Scargle periodogram** in the natural variable \(x = \sin E\).

**You will:**
- Load cleaned arcs from `cleaned_arcs.csv`.
- Compute a Lomb–Scargle periodogram on irregular \(x=\sin E\) samples.
- Pick the dominant frequency and convert it to reflector height \( h \approx (\lambda/2) f \).
- Compare with the FFT result qualitatively.



> **Note on dependencies:**  
> We implement a **self‑contained** Lomb–Scargle (classic form) on top of NumPy, so no dedicated spectral‑analysis library is needed.
> At each frequency \(f\) on a grid it fits \(y(x) \approx a\cos(2\pi f x) + b\sin(2\pi f x)\) by least squares and reports the
> power explained by that fit. This is sufficient for GNSS‑IR teaching and most demos.
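
The fit described in the note can be sketched at a single frequency with an ordinary least-squares solve. This is only an illustration: `sinusoid_power_at`, `x_demo`, and `y_demo` are hypothetical names and synthetic data, not part of the notebook's pipeline. For zero-mean data, the classic Lomb–Scargle power (before dividing by the variance) is half the explained sum of squares returned here.

```python
import numpy as np

def sinusoid_power_at(x, y, f):
    """Least-squares fit of a*cos(2*pi*f*x) + b*sin(2*pi*f*x) to mean-removed y.

    Returns (a, b, explained_sum_of_squares).
    """
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    y = y - y.mean()
    # Design matrix with one cosine and one sine column at frequency f
    A = np.column_stack([np.cos(2 * np.pi * f * x), np.sin(2 * np.pi * f * x)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    model = A @ coef
    return coef[0], coef[1], float(np.sum(model**2))

# Synthetic, irregularly sampled test signal with 20 cycles per unit x (assumed values)
rng = np.random.default_rng(1)
x_demo = np.sort(rng.uniform(0.1, 0.5, 200))
y_demo = np.cos(2 * np.pi * 20.0 * x_demo + 0.3)

a, b, ess = sinusoid_power_at(x_demo, y_demo, 20.0)
amp = np.hypot(a, b)  # fitted amplitude; should be close to the true amplitude of 1
print(f"a={a:.3f}, b={b:.3f}, amplitude={amp:.3f}, explained SS={ess:.1f}")
```

Repeating this solve over a grid of `f` values and recording the explained sum of squares traces out the same spectrum shape as the periodogram.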


In [None]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from pathlib import Path



## Lightweight Lomb–Scargle Implementation

Classic (normalized) Lomb–Scargle periodogram for uneven samples \(x\) with observations \(y\).
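
For reference, the quantity evaluated at each angular frequency \(\omega = 2\pi f\) can be written as follows, where \(y_j\) are the mean-subtracted observations and \(\sigma^2\) is their variance. This is the standard Scargle form, given here as a reading aid for the code:

\[
\tan(2\omega\tau) = \frac{\sum_j \sin(2\omega x_j)}{\sum_j \cos(2\omega x_j)}
\]

\[
P(\omega) = \frac{1}{2\sigma^2}\left[
\frac{\left(\sum_j y_j \cos\omega(x_j-\tau)\right)^2}{\sum_j \cos^2\omega(x_j-\tau)}
+ \frac{\left(\sum_j y_j \sin\omega(x_j-\tau)\right)^2}{\sum_j \sin^2\omega(x_j-\tau)}
\right]
\]

The offset \(\tau\) makes the cosine and sine terms orthogonal over the sample positions, which is why their contributions can simply be added.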


In [None]:

def lomb_scargle_classic(x, y, freqs):
    """Classic Lomb–Scargle periodogram.
    x: sample positions (e.g., sinE), shape (N,)
    y: observations (detrended, zero-mean recommended), shape (N,)
    freqs: frequencies (cycles per unit x), shape (M,)
    Returns: power array shape (M,)
    """
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    freqs = np.asarray(freqs, float)

    # keep finite pairs only, then remove the mean of the retained samples
    m = np.isfinite(x) & np.isfinite(y)
    x = x[m]; y = y[m]
    if x.size < 8:
        raise ValueError("Not enough points for Lomb–Scargle")
    y = y - y.mean()

    w = 2.0 * np.pi * freqs[:, None]  # angular frequency grid, shape (M, 1)
    # phase offset w*tau per frequency, from tan(2*w*tau) = sum(sin 2wx) / sum(cos 2wx)
    two_omega_x = 2.0 * w * x[None, :]
    sin2_sum = np.sum(np.sin(two_omega_x), axis=1)
    cos2_sum = np.sum(np.cos(two_omega_x), axis=1)
    w_tau = 0.5 * np.arctan2(sin2_sum, cos2_sum)  # radians; using w*tau avoids dividing by w at f = 0
    # shifted phase w*(x - tau), then the C,S terms
    wt = w * x[None, :] - w_tau[:, None]
    cos_wt = np.cos(wt)
    sin_wt = np.sin(wt)

    C = np.sum(y[None, :] * cos_wt, axis=1)
    S = np.sum(y[None, :] * sin_wt, axis=1)
    CC = np.sum(cos_wt**2, axis=1)
    SS = np.sum(sin_wt**2, axis=1)

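    # guard 0/0 terms (e.g. SS = 0 at f = 0, where the sine term vanishes)
    with np.errstate(divide='ignore', invalid='ignore'):
        power = 0.5 * (np.where(CC > 0, C**2 / CC, 0.0) + np.where(SS > 0, S**2 / SS, 0.0))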
    # normalize by variance of y (optional)
    var_y = np.nanvar(y)
    if var_y > 0:
        power = power / var_y
    return power



## Frequency → Height
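
For a horizontal reflector, the detrended SNR oscillates as \(\cos(4\pi h \sin E/\lambda + \phi)\), so in the variable \(x=\sin E\) the frequency is \(f = 2h/\lambda\) and therefore \(h = (\lambda/2) f\). A quick numeric check of that conversion, using hypothetical names (`C_LIGHT`, `CARRIER_HZ`, `demo_height`) chosen so they do not clash with the cell below, and an assumed peak at 20 cycles per unit \(\sin E\):

```python
C_LIGHT = 299792458.0  # speed of light, m/s
CARRIER_HZ = {'L1': 1575.42e6, 'L2': 1227.60e6, 'L5': 1176.45e6}  # GPS carrier frequencies

def demo_height(f_peak, band):
    """Convert a peak frequency (cycles per unit sinE) to reflector height in meters."""
    wavelength = C_LIGHT / CARRIER_HZ[band]
    return 0.5 * wavelength * f_peak

# The same spectral peak maps to a different height on each band
demo_heights = {band: demo_height(20.0, band) for band in CARRIER_HZ}
for band, h in demo_heights.items():
    print(f"{band}: h = {h:.3f} m")
```

Because the wavelength grows from L1 to L5, a fixed frequency corresponds to a larger height on the longer-wavelength bands; on L1, 20 cycles correspond to roughly 1.90 m.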


In [None]:

# Wavelengths (m)
c = 299792458.0
f_L1 = 1575.42e6
f_L2 = 1227.60e6
f_L5 = 1176.45e6
lambda_L1 = c / f_L1
lambda_L2 = c / f_L2
lambda_L5 = c / f_L5

def height_from_freq(f_peak, band='L1'):
    wl = {'L1': lambda_L1, 'L2': lambda_L2, 'L5': lambda_L5}.get(band, lambda_L1)
    return 0.5 * wl * f_peak



## Load Cleaned Data (or Use Synthetic)


In [None]:

def synth_clean(f_true=20.0, noise=0.2, phase=0.3):
    rng = np.random.default_rng(0)
    sinE = np.sort(np.sin(np.deg2rad(np.linspace(5, 30, 400))))
    y = np.cos(2*np.pi*f_true*sinE + phase) + noise*rng.standard_normal(len(sinE))
    return pd.DataFrame({'sinE': sinE, 'snr_detrended_z': y, 'segment_id': 1})

path = Path('/mnt/data/cleaned_arcs.csv')
if path.exists():
    df = pd.read_csv(path)
    if 'segment_id' in df.columns:
        seg_id = int(df['segment_id'].iloc[0])
        arc = df[df['segment_id']==seg_id].dropna(subset=['sinE','snr_detrended_z']).copy()
    else:
        arc = df.dropna(subset=['sinE','snr_detrended_z']).copy()
    print('Loaded:', path, f'(N={len(arc)})')
else:
    print('Using synthetic clean data (create cleaned_arcs.csv to use real data).')
    arc = synth_clean()



## Compute Lomb–Scargle and Pick the Peak


In [None]:

# Frequency search grid (cycles per unit sinE)
# Since f = 2h/λ, the range [0, 60] covers h up to about 5.7 m on L1 (about 7.6 m on L5)
fmin, fmax, M = 0.0, 60.0, 4000
freqs = np.linspace(fmin, fmax, M)
power = lomb_scargle_classic(arc['sinE'].values, arc['snr_detrended_z'].values, freqs)

# Peak search, skipping the first k0 grid points next to f = 0 (DC)
k0 = 5
k_peak = k0 + int(np.argmax(power[k0:]))
f_peak = freqs[k_peak]
P_peak = power[k_peak]

# Band selection for height conversion
BAND = 'L1'
h_est = height_from_freq(f_peak, band=BAND)
f_peak, P_peak, h_est



## Plots
- Detrended z‑score vs \(\sin E\) (irregular sampling).  
- Lomb–Scargle power spectrum with peak/height annotation.


In [None]:

plt.figure(figsize=(7,4.5))
plt.plot(arc['sinE'].values, arc['snr_detrended_z'].values, '.', ms=3)
plt.xlabel('sin(E)'); plt.ylabel('Detrended Z-score')
plt.title('Signal in sin(E) (irregular sampling)')
plt.grid(True)
plt.show()

plt.figure(figsize=(7,4.5))
plt.plot(freqs, power, lw=1)
plt.axvline(f_peak, ls='--', label=f'Peak f={f_peak:.2f} cyc/(sinE)')
plt.text(f_peak, np.nanmax(power)*0.9, f'h≈{h_est:.2f} m ({BAND})', ha='center')  # nanmax: power at f=0 may be NaN
plt.xlabel('Frequency [cycles per unit sin(E)]')
plt.ylabel('Normalized Power')
plt.title('Lomb–Scargle Periodogram')
plt.grid(True); plt.legend()
plt.show()



## Notes
- Lomb–Scargle avoids the **uniform resampling** required by FFT and works directly on irregular \(x\).
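- Frequency grid spacing sets how finely the peak is sampled; the ability to separate nearby peaks is limited by the span of \(\sin E\), roughly \(1/(x_{\max}-x_{\min})\). Make sure \(f_{\max}\) exceeds your expected \(2h/\lambda\).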
- For higher rigor, you can implement generalized LS with offsets/weights and false‑alarm probabilities.
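
One way to turn the grid advice above into code is to derive both limits from the data: the upper frequency from the largest height you expect, and the spacing from the span of \(\sin E\). The helper below is a sketch under those assumptions; `design_freq_grid` and its `oversample` factor are hypothetical, and an oversampling of about 5 to 10 is a common rule of thumb, not a requirement.

```python
import numpy as np

def design_freq_grid(sinE, h_max, wavelength, oversample=10):
    """Hypothetical helper: frequency grid in cycles per unit sinE.

    Upper limit follows f = 2*h_max/wavelength; spacing is the natural
    resolution 1/span divided by `oversample`.
    """
    sinE = np.asarray(sinE, float)
    span = np.nanmax(sinE) - np.nanmin(sinE)
    f_max = 2.0 * h_max / wavelength
    df = 1.0 / (oversample * span)
    n = int(np.ceil(f_max / df))
    return df * np.arange(1, n + 1)  # starts at df, so f = 0 is excluded

# Example: an arc from 5 to 30 degrees elevation, heights up to 6 m on L1 (assumed values)
sinE_demo = np.sin(np.deg2rad(np.linspace(5, 30, 400)))
grid = design_freq_grid(sinE_demo, h_max=6.0, wavelength=0.1903)
print(f"{grid.size} frequencies from {grid[0]:.3f} to {grid[-1]:.2f}")
```

Starting the grid at `df` rather than zero also sidesteps the degenerate sine term at \(f = 0\).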



## Next Notebook
**`08_Multi_Frequency_Comparison.ipynb`** — Estimate heights from multiple bands (L1/L2/L5) and compare.
