# GNSS Signal & Ionospheric Analysis
## Station MORP – 20 January 2026

This notebook analyses GPS-only RINEX 2.11 observation data from
station MORP (Day 020, 2026), obtained from NASA CDDIS.

The objectives are:

1. Understand GNSS observation types (C1, C2, L1, L2, S1, S2)
2. Analyse satellite visibility and signal strength
3. Investigate dual-frequency behaviour
4. Estimate ionospheric delay effects
5. Prepare structured features for AI-based GNSS modelling

This work integrates GNSS physics with data-driven methods,
forming a foundation for Geospatial Artificial Intelligence research.



In [None]:
!pip install georinex

In [None]:
import georinex as gr
import xarray as xr
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [None]:
obs = gr.load("/content/morp0200.26o")   # Replace with the path to your RINEX file
obs

In [None]:
print("RINEX Version:", obs.attrs.get("version"))
print("Available Observables:", list(obs.data_vars))
print("Number of Epochs:", obs.sizes["time"])
print("Number of Satellites:", obs.sizes["sv"])

## GNSS Observation Types (RINEX 2.11)

The RINEX file contains the following key observables:

### C1 – L1 C/A Code Pseudorange
- Code-based distance measurement on L1 frequency
- Units: metres
- Used for positioning

### C2 – L2 Code Pseudorange
- Code-based distance measurement on L2 frequency
- Used for dual-frequency corrections

### L1 – Carrier Phase (L1)
- High precision phase measurement
- Units: cycles
- Used in precise positioning

### L2 – Carrier Phase (L2)
- Carrier phase on second frequency

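### S1 – Signal-to-Noise Ratio (L1)
- Signal quality indicator as reported by the receiver
- Units: typically dB-Hz (RINEX 2.11 leaves the unit receiver-dependent)

### S2 – Signal-to-Noise Ratio (L2)
- Same indicator on the L2 frequency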

Dual-frequency measurements (C1 & C2) allow estimation of
ionospheric delay, since the first-order ionospheric error
scales with 1/f² and therefore differs between L1 and L2.

In [None]:
# List satellites
satellites = obs.sv.values
satellites

## Station Coordinates

The RINEX header provides approximate receiver coordinates
in ECEF (Earth-Centered Earth-Fixed) system.
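
ECEF coordinates are hard to read directly, so it can help to convert them to latitude, longitude and ellipsoidal height. The following is a minimal sketch of the standard iterative conversion on the WGS84 ellipsoid; `ecef_to_geodetic` is a hypothetical helper written for this notebook, and it is not intended for points at or very near the poles.

```python
import math

WGS84_A = 6378137.0              # semi-major axis (m)
WGS84_F = 1 / 298.257223563      # flattening


def ecef_to_geodetic(x, y, z, iterations=10):
    """Return (latitude_deg, longitude_deg, height_m) for an ECEF position."""
    e2 = WGS84_F * (2 - WGS84_F)             # first eccentricity squared
    lon = math.atan2(y, x)
    p = math.hypot(x, y)                     # distance from the rotation axis
    lat = math.atan2(z, p * (1 - e2))        # initial guess assumes height = 0
    h = 0.0
    for _ in range(iterations):
        n = WGS84_A / math.sqrt(1 - e2 * math.sin(lat) ** 2)  # prime vertical radius
        h = p / math.cos(lat) - n
        lat = math.atan2(z, p * (1 - e2 * n / (n + h)))
    return math.degrees(lat), math.degrees(lon), h


# Illustrative input only: a point on the equator at the prime meridian
print(ecef_to_geodetic(WGS84_A, 0.0, 0.0))
```

Assuming the header position is exposed as a three-element sequence, it could be passed in as `ecef_to_geodetic(*obs.attrs["position"])`.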


In [None]:
print("Station ECEF Position (m):")
print(obs.attrs["position"])

## Satellite Availability

Satellite availability influences positioning accuracy.
More visible satellites generally improve solution reliability
and reduce Dilution of Precision (DOP).
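
To make the link between satellite count and DOP concrete, here is a small sketch that computes GDOP and PDOP from line-of-sight unit vectors. The azimuth/elevation values are invented for illustration; the observation file alone does not contain satellite positions, so real values would need a navigation file or precise orbits.

```python
import numpy as np


def los_from_az_el(az_deg, el_deg):
    """Unit line-of-sight vector (East, North, Up) from azimuth and elevation."""
    az, el = np.radians(az_deg), np.radians(el_deg)
    return [np.cos(el) * np.sin(az), np.cos(el) * np.cos(az), np.sin(el)]


def dop_from_los(los_unit_vectors):
    """Return GDOP and PDOP for a set of line-of-sight unit vectors."""
    los = np.asarray(los_unit_vectors, dtype=float)
    if los.shape[0] < 4:
        raise ValueError("at least four satellites are needed")
    # Design matrix: three direction cosines plus a receiver-clock column
    G = np.hstack([los, np.ones((los.shape[0], 1))])
    Q = np.linalg.inv(G.T @ G)
    return {
        "GDOP": float(np.sqrt(np.trace(Q))),
        "PDOP": float(np.sqrt(np.trace(Q[:3, :3]))),
    }


# Hypothetical sky: one satellite at zenith, three at 30 degrees elevation
four = [los_from_az_el(0, 90)] + [los_from_az_el(az, 30) for az in (0, 120, 240)]
five = four + [los_from_az_el(60, 10)]   # add one low-elevation satellite

print("4 satellites:", dop_from_los(four))
print("5 satellites:", dop_from_los(five))
```

Adding a satellite can only keep the DOP the same or lower it, which is the sense in which more visible satellites improve geometry.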


In [None]:
# Use C1 pseudorange (RINEX 2 observable)
pseudorange = obs["C1"]

satellite_counts = pseudorange.notnull().sum(dim="sv")

plt.figure()
satellite_counts.plot()
plt.title("Number of GPS Satellites Observed Over Time")
plt.xlabel("Time")
plt.ylabel("Satellite Count")
plt.show()

print("Mean Satellite Count:", float(satellite_counts.mean()))
print("Min Satellite Count:", int(satellite_counts.min()))
print("Max Satellite Count:", int(satellite_counts.max()))

## Pseudorange Observation Behaviour

Pseudorange measurements represent the apparent distance
between satellite and receiver: the geometric range plus
receiver and satellite clock offsets, atmospheric delays, and noise.

We inspect one satellite as an example.


In [None]:
first_sat = obs.sv.values[0]
print("Analysing satellite:", first_sat)

plt.figure()
obs["C1"].sel(sv=first_sat).plot()
plt.title(f"Pseudorange (C1) – {first_sat}")
plt.ylabel("Range (m)")
plt.show()

## Signal Strength Analysis (S1)

SNR reflects signal quality.
Lower SNR is typical of low-elevation satellites and may also
indicate multipath, obstruction, or interference.
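
One detail worth keeping in mind when averaging: dB-Hz is a logarithmic unit, so the arithmetic mean of dB-Hz values is not the same as the mean of the underlying power ratios. The sketch below compares the two; `mean_cn0_dbhz` is an illustrative helper, and which average is appropriate depends on the use case.

```python
import math


def mean_cn0_dbhz(values_dbhz, linear=True):
    """Average signal strength values given in dB-Hz.

    linear=True averages the power ratios and converts back to dB-Hz;
    linear=False takes the plain arithmetic mean of the dB-Hz values.
    NaN entries (satellites not tracked) are skipped.
    """
    vals = [v for v in values_dbhz if not math.isnan(v)]
    if not vals:
        return float("nan")
    if not linear:
        return sum(vals) / len(vals)
    ratios = [10 ** (v / 10) for v in vals]
    return 10 * math.log10(sum(ratios) / len(ratios))


sample = [30.0, 50.0, float("nan")]   # made-up values for illustration
print("Arithmetic mean (dB-Hz):", mean_cn0_dbhz(sample, linear=False))
print("Linear-domain mean (dB-Hz):", round(mean_cn0_dbhz(sample), 2))
```

The linear-domain mean is pulled towards the strongest signals, whereas the arithmetic mean of dB-Hz values weights weak and strong satellites equally.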

In [None]:
snr = obs["S1"]

mean_snr = snr.mean(dim="sv")

plt.figure()
mean_snr.plot()
plt.title("Average Signal-to-Noise Ratio Over Time")
plt.ylabel("SNR (dB-Hz)")
plt.show()

print("Average SNR:", float(mean_snr.mean()))

## Dual-Frequency Comparison (C1 vs C2)

The geometric range and clock terms cancel in the difference
C1 − C2, leaving mainly the differential ionospheric delay
together with hardware biases and measurement noise.

This is foundational for advanced GNSS error modelling.
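
As a sketch of how this difference maps to a physical quantity, the code below converts a pair of pseudoranges into a first-order ionospheric delay on L1 and an approximate slant TEC. It ignores differential code biases and noise, so the result is indicative only; the function names are illustrative.

```python
F_L1 = 1575.42e6    # GPS L1 frequency (Hz)
F_L2 = 1227.60e6    # GPS L2 frequency (Hz)
K_IONO = 40.3       # constant of the first-order ionospheric delay model (m^3/s^2)


def iono_delay_l1(p1_m, p2_m):
    """First-order ionospheric delay on L1 (m) from P1 and P2 in metres.

    Note the sign: the delay is larger on L2, so P2 - P1 is positive
    when the ionosphere is the only difference between the two.
    """
    return (p2_m - p1_m) * F_L2**2 / (F_L1**2 - F_L2**2)


def slant_tec_tecu(i1_m):
    """Slant TEC in TEC units (1 TECU = 1e16 electrons/m^2) from the L1 delay."""
    return i1_m * F_L1**2 / K_IONO / 1e16


# Synthetic example: 5 m of L1 delay on top of a 21,000 km range
gamma = (F_L1 / F_L2) ** 2
rho, i1_true = 2.1e7, 5.0
p1, p2 = rho + i1_true, rho + gamma * i1_true

i1_est = iono_delay_l1(p1, p2)
print("Estimated L1 delay (m):", round(i1_est, 3))
print("Slant TEC (TECU):", round(slant_tec_tecu(i1_est), 1))
```

With the notebook's sign convention (C1 − C2), the same quantity is obtained by negating the difference before scaling.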

In [None]:
# Compute mean difference between C1 and C2
c1 = obs["C1"]
c2 = obs["C2"]

diff = (c1 - c2).mean(dim="sv")

plt.figure()
diff.plot()
plt.title("Mean C1 - C2 Difference Over Time")
plt.ylabel("Range Difference (m)")
plt.show()

print("Mean Dual-Frequency Difference:", float(diff.mean()))

In [None]:
summary = {
    "Mean Satellite Count": float(satellite_counts.mean()),
    "Average SNR": float(mean_snr.mean()),
    "Mean C1-C2 Difference": float(diff.mean())
}

summary

## Preparing Dataset for AI-Based Modelling

We create a structured dataframe containing:

- Time (seconds from start)
- Satellite count
- Mean SNR
- Dual-frequency difference

This dataset can later be used for regression or filtering models.

In [None]:
df = pd.DataFrame({
    "time": satellite_counts.time.values,
    "satellite_count": satellite_counts.values,
    "mean_snr": mean_snr.values,
    "c1_c2_diff": diff.values
})

df["time_sec"] = (
    pd.to_datetime(df["time"]) - pd.to_datetime(df["time"]).min()
).dt.total_seconds()

df.head()

# Ionosphere-Free Pseudorange Combination

The ionosphere introduces frequency-dependent delay in GNSS signals.

By combining L1 and L2 pseudorange measurements,
we can eliminate first-order ionospheric effects.

The ionosphere-free (IF) pseudorange combination is:

P_IF = (f1² * P1 − f2² * P2) / (f1² − f2²)

Where:
- P1 = C1 pseudorange
- P2 = C2 pseudorange
- f1 = 1575.42 MHz
- f2 = 1227.60 MHz

This combination is fundamental in precise positioning
and forms the basis of PPP (Precise Point Positioning).
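
The formula can be checked numerically. The sketch below computes the two combination coefficients, verifies on synthetic values that the first-order ionospheric term cancels, and estimates how much the combination amplifies code noise, assuming equal and uncorrelated noise on P1 and P2. Function names are illustrative.

```python
import math

F_L1 = 1575.42e6    # Hz
F_L2 = 1227.60e6    # Hz


def if_coefficients(f1=F_L1, f2=F_L2):
    """Coefficients (a, b) such that P_IF = a * P1 - b * P2."""
    denom = f1**2 - f2**2
    return f1**2 / denom, f2**2 / denom


def iono_free(p1_m, p2_m, f1=F_L1, f2=F_L2):
    """Ionosphere-free pseudorange combination in metres."""
    a, b = if_coefficients(f1, f2)
    return a * p1_m - b * p2_m


def noise_amplification(f1=F_L1, f2=F_L2):
    """Noise scale factor, assuming equal uncorrelated noise on P1 and P2."""
    a, b = if_coefficients(f1, f2)
    return math.hypot(a, b)


# Synthetic check: range plus a 1/f^2 ionospheric term on each frequency
rho, i1 = 2.2e7, 6.0
gamma = (F_L1 / F_L2) ** 2
p_if = iono_free(rho + i1, rho + gamma * i1)

print("Coefficients:", if_coefficients())
print("Residual after combination (m):", p_if - rho)
print("Noise amplification factor:", round(noise_amplification(), 2))
```

The coefficients are roughly 2.55 and 1.55, so removing the ionosphere comes at the cost of code noise that is about three times larger than on C1 alone.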


In [None]:
# GPS Frequencies (Hz)
f1 = 1575.42e6
f2 = 1227.60e6

c1 = obs["C1"]
c2 = obs["C2"]

In [None]:
p_if = (f1**2 * c1 - f2**2 * c2) / (f1**2 - f2**2)

# Compute mean IF value per epoch
p_if_mean = p_if.mean(dim="sv")

In [None]:
plt.figure()
p_if_mean.plot()
plt.title("Ionosphere-Free Pseudorange (Mean Across Satellites)")
plt.ylabel("Range (m)")
plt.show()

In [None]:
raw_mean = c1.mean(dim="sv")

plt.figure()
raw_mean.plot(label="Raw C1")
p_if_mean.plot(label="Iono-Free")
plt.legend()
plt.title("Raw vs Ionosphere-Free Pseudorange")
plt.show()

In [None]:
raw_std = c1.std(dim="sv")
if_std = p_if.std(dim="sv")

plt.figure()
raw_std.plot(label="Raw Std Dev")
if_std.plot(label="IF Std Dev")
plt.legend()
plt.title("Measurement Variability: Raw vs Iono-Free")
plt.show()

# Measurement Noise Estimation

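To model GNSS signal instability, we define an epoch-level
noise metric as the standard deviation of C1 pseudorange
across all visible satellites. Because ranges to different
satellites differ by thousands of kilometres for purely geometric
reasons, this spread is a coarse proxy for instability rather
than a direct measure of receiver noise.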

Higher variability may indicate:

- Multipath effects
- Poor satellite geometry
- Atmospheric disturbances
- Signal degradation

This will be our machine learning target variable.
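
For comparison, a per-satellite quantity that removes geometry, clocks and the first-order ionosphere is the code multipath combination, often called MP1. The sketch below is an alternative to the cross-satellite spread, not the metric used in this notebook. It takes C1 in metres and L1/L2 carrier phase in cycles; names are illustrative.

```python
C = 299_792_458.0   # speed of light (m/s)
F_L1 = 1575.42e6    # Hz
F_L2 = 1227.60e6    # Hz


def mp1(p1_m, l1_cycles, l2_cycles):
    """Code multipath combination on L1 (metres).

    The result contains code multipath and noise plus a constant offset
    from the carrier-phase ambiguities, so only its variation within a
    continuous tracking arc is meaningful.
    """
    alpha = (F_L1 / F_L2) ** 2
    k = 2.0 / (alpha - 1.0)
    phi1_m = l1_cycles * C / F_L1     # carrier phase in metres
    phi2_m = l2_cycles * C / F_L2
    return p1_m - (1.0 + k) * phi1_m + k * phi2_m


# Synthetic check: range, ionosphere and a 0.7 m code error, zero ambiguities
rho, i1, code_error = 2.2e7, 4.0, 0.7
alpha = (F_L1 / F_L2) ** 2
p1 = rho + i1 + code_error
l1 = (rho - i1) * F_L1 / C            # phase is advanced by the ionosphere
l2 = (rho - alpha * i1) * F_L2 / C

print("Recovered code error (m):", round(mp1(p1, l1, l2), 3))
```

In practice the mean of MP1 over each tracking arc would be subtracted, and cycle slips in L1 or L2 would start a new arc.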

In [None]:
# Compute pseudorange variability per epoch
noise_metric = obs["C1"].std(dim="sv")

plt.figure()
noise_metric.plot()
plt.title("Epoch-Level Pseudorange Variability (Noise Metric)")
plt.ylabel("Standard Deviation (m)")
plt.show()

print("Average Noise Level (m):", float(noise_metric.mean()))

In [None]:
sat_count = obs["C1"].notnull().sum(dim="sv")
mean_snr = obs["S1"].mean(dim="sv")
iono_diff = (obs["C1"] - obs["C2"]).mean(dim="sv")

df_ml = pd.DataFrame({
    "time": sat_count.time.values,
    "satellite_count": sat_count.values,
    "mean_snr": mean_snr.values,
    "iono_diff": iono_diff.values,
    "noise": noise_metric.values
})

df_ml["time_sec"] = (
    pd.to_datetime(df_ml["time"]) - pd.to_datetime(df_ml["time"]).min()
).dt.total_seconds()

df_ml.head()

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score

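# Drop epochs with missing values; the regressor may reject NaNs
feature_cols = ["satellite_count", "mean_snr", "iono_diff", "time_sec"]
df_clean = df_ml.dropna(subset=feature_cols + ["noise"])

X = df_clean[feature_cols]
y = df_clean["noise"]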

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

In [None]:
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2 = r2_score(y_test, y_pred)

print("RMSE:", rmse)
print("R²:", r2)

In [None]:
plt.figure()
plt.scatter(y_test, y_pred, alpha=0.5)
plt.xlabel("Actual Noise")
plt.ylabel("Predicted Noise")
plt.title("Actual vs Predicted GNSS Noise")
plt.show()

In [None]:
importances = model.feature_importances_

feature_names = X.columns

plt.figure()
plt.bar(feature_names, importances)
plt.title("Feature Importance in Noise Prediction")
plt.show()

# Interpretation of AI Results

The model predicts GNSS pseudorange variability using:

- Satellite availability
- Signal strength
- Dual-frequency behaviour
- Temporal variation

Feature importance indicates which inputs the model relies on
most when predicting measurement instability; it shows
association, not physical causation.

This demonstrates how AI can support GNSS signal quality assessment
without explicitly solving positioning equations.
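
When judging such a model, it is worth remembering that consecutive GNSS epochs are strongly correlated, so a random train/test split can let near-duplicate epochs land on both sides. A chronological hold-out is one simple alternative. The helper below is a hypothetical sketch that assumes the rows are already sorted by time.

```python
def chronological_split(n_rows, test_fraction=0.2):
    """Return (train_idx, test_idx) with the last rows held out for testing.

    Assumes rows are sorted by time, so every training epoch
    precedes every test epoch.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    cut = int(round(n_rows * (1 - test_fraction)))
    return list(range(cut)), list(range(cut, n_rows))


train_idx, test_idx = chronological_split(10, test_fraction=0.2)
print("Train rows:", train_idx)
print("Test rows:", test_idx)
```

The index lists can then be used with positional indexing, for example `X.iloc[train_idx]` and `X.iloc[test_idx]`.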

Such approaches are valuable in:

- Urban GNSS quality monitoring
- Autonomous navigation
- Sensor fusion systems
- Intelligent positioning algorithms