## Import libraries and dataset

In [2]:
from pathlib import Path
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# project root = two levels up from this notebook
PROC_DIR = Path.cwd().parents[1] / "data" / "01_processed" / "elset_history_aodr"
print(PROC_DIR)  # sanity check

c:\Users\ash\Desktop\wid-datathon\data\01_processed\elset_history_aodr


In [3]:
day = "2025-08-01"
day_dir = PROC_DIR / f"epoch_date={day}"

# read the whole partition (all files in that directory)
df_day = pd.read_parquet(day_dir)
df_day.head()

Unnamed: 0,algorithm,apogee,argOfPerigee,bStar,classificationMarking,createdAt,createdBy,eccentricity,source,semiMajorAxis,...,perigee,origNetwork,meanMotionDot,meanMotionDDot,meanMotion,meanAnomaly,inclination,idOnOrbit,idElset,epoch
0,SGP4,10199.975,135.3433,1.4e-05,U,2025-08-01 08:07:46.375000+00:00,system.ob-ingest,0.184124,18th SPCS,8613.943,...,7027.911,OPS1,4e-08,0.0,10.859248,241.1088,34.2411,5,,2025-08-01 02:34:42.703680+00:00
1,SGP4,9278.216,36.5756,0.00052,U,2025-08-01 08:07:46.376000+00:00,system.ob-ingest,0.144852,18th SPCS,8104.295,...,6930.374,OPS1,9.81e-06,0.0,11.899532,332.5492,32.8632,11,,2025-08-01 04:45:56.765952+00:00
2,SGP4,9278.219,38.0078,0.000535,U,2025-08-01 20:07:49.125000+00:00,system.ob-ingest,0.144852,18th SPCS,8104.292,...,6930.365,OPS1,1.01e-05,0.0,11.899538,331.4339,32.8632,11,,2025-08-01 10:48:36.631008+00:00
3,SGP4,9669.19,212.4916,0.000684,U,2025-08-01 15:07:53.223000+00:00,system.ob-ingest,0.16477,18th SPCS,8301.373,...,6933.557,OPS1,1.187e-05,0.0,11.478307,136.2848,32.9035,12,,2025-08-01 10:57:46.225728+00:00
4,SGP4,9669.191,212.9515,0.00068,U,2025-08-01 20:07:49.130000+00:00,system.ob-ingest,0.16477,18th SPCS,8301.373,...,6933.555,OPS1,1.181e-05,0.0,11.478309,135.6898,32.9035,12,,2025-08-01 13:03:01.027296+00:00


In [4]:
df_day.info()
# The entire idElset column is null. A time-series dashboard can be keyed on satNo + epoch (or satNo + createdAt) instead.

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 38508 entries, 0 to 38507
Data columns (total 24 columns):
 #   Column                 Non-Null Count  Dtype              
---  ------                 --------------  -----              
 0   algorithm              38508 non-null  object             
 1   apogee                 38508 non-null  float64            
 2   argOfPerigee           38508 non-null  float64            
 3   bStar                  38508 non-null  float64            
 4   classificationMarking  38508 non-null  string             
 5   createdAt              38508 non-null  datetime64[us, UTC]
 6   createdBy              38508 non-null  string             
 7   eccentricity           38508 non-null  float64            
 8   source                 38508 non-null  string             
 9   semiMajorAxis          38508 non-null  float64            
 10  satNo                  38508 non-null  Int64              
 11  revNo                  38508 non-null  Int64          

## Orbit Size and Shape

Types

1. Low Earth Orbit (LEO) - comms and remote sensing systems

2. Medium Earth Orbit (MEO) - navigation systems, U.S. GPS

3. Geosynchronous (GSO) & Geostationary (GEO) Orbits - telecom and Earth observation

    The orbital period of a GSO object matches the Earth's rotation period. A GEO object is a GSO object in a circular orbit above the equator.

4. Highly Elliptical Orbit (HEO) - comms, radio
    
    An HEO is oblong, with one end near the Earth and the other much more distant.
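
The four regimes above can be turned into a rough labeling rule. The sketch below is one possible version: the function name `classify_regime` and every threshold (2,000 km LEO ceiling, a 200 km band around 35,786 km for GSO, eccentricity of 0.25 or more for HEO) are illustrative assumptions, not values taken from this notebook or from an official standard.

```python
import numpy as np

EARTH_RADIUS_KM = 6378.0  # same radius the notebook uses

def classify_regime(sma_km, ecc):
    """Label orbits LEO / MEO / GSO / HEO from semi-major axis and eccentricity.

    All thresholds are assumptions chosen for illustration.
    """
    sma_km = np.asarray(sma_km, dtype=float)
    ecc = np.asarray(ecc, dtype=float)
    alt_km = sma_km - EARTH_RADIUS_KM  # mean altitude

    # np.select takes the first condition that is True for each element
    conditions = [
        ecc >= 0.25,                         # assumed cutoff for "highly elliptical"
        alt_km < 2000.0,                     # LEO
        np.abs(alt_km - 35786.0) <= 200.0,   # near geosynchronous altitude
        alt_km < 35786.0,                    # between LEO and GSO
    ]
    labels = ["HEO", "LEO", "GSO", "MEO"]
    return np.select(conditions, labels, default="other")

# Four made-up orbits: (semi-major axis in km, eccentricity)
demo = classify_regime([6878.0, 26560.0, 42164.0, 26600.0],
                       [0.001, 0.01, 0.0002, 0.7])
print(demo)
```

On the notebook's data this could be applied as `classify_regime(df_day["semiMajorAxis"], df_day["eccentricity"])`; GEO cannot be separated from GSO without also checking that the inclination is near zero.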

In [5]:
# Mean altitude = semi-major axis minus Earth's radius (6378 km)

df_day["altitude"] = df_day["semiMajorAxis"] - 6378

df_day.loc[:,["altitude","eccentricity"]].describe()

Unnamed: 0,altitude,eccentricity
count,38508.0,38508.0
mean,2763.405184,0.026428
std,7905.456463,0.113314
min,199.692,4e-06
25%,498.24975,0.000153
50%,612.441,0.000779
75%,972.09325,0.004763
max,265569.319,0.900796


### Semi-Major Axis

Based on the IQR, the middle 50% of element sets have altitudes between 498 and 972 km, which places them in LEO. The mean (2,763 km) is far above the median (612 km), indicating a strong right skew driven by high-altitude objects such as those in HEO; the maximum altitude is 265,569 km.
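
Because `altitude` is derived from the semi-major axis, it is a mean altitude; for eccentric orbits the perigee and apogee altitudes differ a lot from it. A minimal sketch of that relationship follows, using the standard relations r_p = a(1 - e) and r_a = a(1 + e). The function name is hypothetical, and the input values are copied from the first row of `df_day.head()` above.

```python
EARTH_RADIUS_KM = 6378.0  # same radius the notebook uses

def perigee_apogee_altitude(sma_km, ecc):
    """Return (perigee altitude, apogee altitude) in km above Earth's surface."""
    r_perigee = sma_km * (1.0 - ecc)  # closest distance from Earth's center
    r_apogee = sma_km * (1.0 + ecc)   # farthest distance from Earth's center
    return r_perigee - EARTH_RADIUS_KM, r_apogee - EARTH_RADIUS_KM

# First row of df_day.head(): semiMajorAxis=8613.943, eccentricity=0.184124
peri_alt, apo_alt = perigee_apogee_altitude(8613.943, 0.184124)
print(f"perigee altitude ~ {peri_alt:.0f} km, apogee altitude ~ {apo_alt:.0f} km")
```

Adding 6378 km back to these results reproduces that row's `perigee` and `apogee` values to within rounding, which suggests those two columns are distances from Earth's center rather than altitudes.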

In [5]:
# Share of element sets at each eccentricity threshold

# e <= 0.01
print(f"{df_day['eccentricity'].le(0.01).mean()*100:.0f}%") # 84%

# e <= 0.05
print(f"{df_day['eccentricity'].le(0.05).mean()*100:.0f}%") # 94%

# e >= 0.5
print(f"{df_day['eccentricity'].ge(0.5).mean()*100:.0f}%") # 3%

84%
94%
3%


### Eccentricity
94% of the element sets in the dataset have eccentricity at or below 0.05, so most orbits are nearly circular. Only about 3% have eccentricity of 0.5 or more.

## Next: Try deterministic “bin + split” to assign sky-lane IDs

1) Assign sky-lane IDs (tune tolerances to your dataset)
labeled = assign_sky_lanes(
    df,
    inc_tol_deg=0.15,   # how tight to group inclination
    raan_tol_deg=0.6,   # max RAAN gap inside a plane
    a_tol_km=15.0       # semi-major-axis shell width
)

2) “Coplanar constellations” = satellites sharing the same lane_id
lane_stats = summarize_lanes(labeled)
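
`assign_sky_lanes` and `summarize_lanes` are not defined anywhere in this notebook. The sketch below is one possible "bin + split" implementation, written as an assumption about what they might do: bin on inclination and semi-major axis, then split each bin wherever the sorted RAAN values jump by more than `raan_tol_deg`. It assumes the columns `satNo`, `inclination`, `raan`, and `semiMajorAxis` exist and contain no nulls, and it does not handle RAAN wrap-around at 0/360 degrees.

```python
import numpy as np
import pandas as pd

def assign_sky_lanes(df, inc_tol_deg=0.15, raan_tol_deg=0.6, a_tol_km=15.0):
    """Hypothetical sketch: add a lane_id column via fixed bins plus RAAN gap splits."""
    out = df.copy()
    # "bin": quantize inclination and semi-major axis to the tolerance width
    out["inc_bin"] = np.floor(out["inclination"] / inc_tol_deg).astype(int)
    out["a_bin"] = np.floor(out["semiMajorAxis"] / a_tol_km).astype(int)

    # "split": inside each (inc_bin, a_bin) cell, sort by RAAN and start a new
    # lane wherever the gap to the previous RAAN exceeds raan_tol_deg
    out = out.sort_values(["inc_bin", "a_bin", "raan"]).reset_index(drop=True)
    gap = out.groupby(["inc_bin", "a_bin"])["raan"].diff()
    starts_new_lane = gap.isna() | (gap > raan_tol_deg)  # NaN marks the first row of a cell
    out["lane_id"] = starts_new_lane.cumsum() - 1
    return out

def summarize_lanes(labeled):
    """Hypothetical sketch: one row of summary statistics per lane_id."""
    return (
        labeled.groupby("lane_id")
        .agg(
            n_rows=("satNo", "size"),
            n_sats=("satNo", "nunique"),
            inc_mean=("inclination", "mean"),
            raan_min=("raan", "min"),
            raan_max=("raan", "max"),
            a_mean=("semiMajorAxis", "mean"),
        )
        .reset_index()
    )

# Made-up example: three satellites in one plane, one in a distant plane of the
# same shell, and one in a different shell altogether
demo = pd.DataFrame({
    "satNo": [1, 2, 3, 4, 5],
    "inclination": [53.00, 53.05, 53.02, 53.01, 97.60],
    "raan": [10.0, 10.3, 10.5, 120.0, 200.0],
    "semiMajorAxis": [6928.0, 6929.0, 6927.0, 6928.5, 7078.0],
})
labeled = assign_sky_lanes(demo)
lane_stats = summarize_lanes(labeled)
print(lane_stats)
```

Fixed-width bins are simple and deterministic, but two nearly identical orbits that straddle a bin edge end up in different lanes; widening the tolerances or merging adjacent bins are possible workarounds.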

## Next: Plot sky lanes/orbits in 3D

The dataset contains all the orbital elements needed to reconstruct and plot each satellite's orbit in 3D. To do: read up on the Keplerian orbital element set.

Convert orbital elements to position (x, y, z) at a given epoch:

1. Solve Kepler’s equation for the eccentric anomaly from meanAnomaly and eccentricity, then convert it to the true anomaly.

2. Compute the satellite’s position in its orbital plane.

3. Rotate into Earth-Centered Inertial (ECI) coordinates using raan, inclination, and argOfPerigee.
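
The conversion steps above can be sketched as follows. This is a plain two-body (Keplerian) conversion with hypothetical function names; since the records are labeled SGP4, their elements are mean elements, so this is only an approximation and a dedicated SGP4 propagator would be more faithful. Angles are assumed to be in degrees and the semi-major axis in km.

```python
import numpy as np

def solve_kepler(mean_anom_rad, ecc, tol=1e-12, max_iter=50):
    """Newton iteration for the eccentric anomaly E in M = E - e*sin(E)."""
    E = mean_anom_rad if ecc < 0.8 else np.pi  # common starting guess
    for _ in range(max_iter):
        dE = (E - ecc * np.sin(E) - mean_anom_rad) / (1.0 - ecc * np.cos(E))
        E -= dE
        if abs(dE) < tol:
            break
    return E

def rot_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def elements_to_eci(a_km, ecc, inc_deg, raan_deg, argp_deg, mean_anom_deg):
    """Return the inertial position vector (km) for one set of Keplerian elements."""
    # Step 1: mean anomaly -> eccentric anomaly -> true anomaly
    M = np.radians(mean_anom_deg % 360.0)
    E = solve_kepler(M, ecc)
    nu = 2.0 * np.arctan2(np.sqrt(1.0 + ecc) * np.sin(E / 2.0),
                          np.sqrt(1.0 - ecc) * np.cos(E / 2.0))
    # Step 2: position in the orbital plane (x axis points at perigee)
    r = a_km * (1.0 - ecc * np.cos(E))
    in_plane = np.array([r * np.cos(nu), r * np.sin(nu), 0.0])
    # Step 3: rotate by argument of perigee, inclination, then RAAN
    inc, raan, argp = np.radians([inc_deg, raan_deg, argp_deg])
    return rot_z(raan) @ rot_x(inc) @ rot_z(argp) @ in_plane

# Made-up elements for illustration
pos = elements_to_eci(8000.0, 0.1, 30.0, 40.0, 50.0, 120.0)
print(pos, np.linalg.norm(pos))
```

A quick sanity check is that the distance from Earth's center always falls between a(1 - e) and a(1 + e), and equals a(1 - e) when the mean anomaly is zero.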

Propagate over time: use meanMotion to advance the mean anomaly and recompute the position at future times. Sampling one full orbital period this way traces the complete trajectory around Earth.
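
A minimal sketch of the mean-anomaly update follows. It assumes `meanMotion` is in revolutions per day, which is consistent with the sample rows (about 11.9 for a semi-major axis near 8,104 km), and it ignores `meanMotionDot` and all perturbations. The function name is hypothetical.

```python
import numpy as np

SECONDS_PER_DAY = 86400.0

def propagate_mean_anomaly(mean_anom_deg, mean_motion_rev_per_day, dt_seconds):
    """Advance the mean anomaly (degrees) by dt_seconds at constant mean motion."""
    deg_per_second = mean_motion_rev_per_day * 360.0 / SECONDS_PER_DAY
    dt = np.asarray(dt_seconds, dtype=float)
    return (mean_anom_deg + deg_per_second * dt) % 360.0

# Values from the first row of df_day.head(): meanAnomaly=241.1088, meanMotion=10.859248
period_s = SECONDS_PER_DAY / 10.859248     # one orbital period in seconds
times = np.linspace(0.0, period_s, 200)    # 200 samples across one orbit
anomalies = propagate_mean_anomaly(241.1088, 10.859248, times)
print(f"period ~ {period_s / 60:.1f} min")
```

Each value in `anomalies` can then be converted to a position with the element-to-ECI steps described above, keeping the other elements fixed.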

Plot the orbit: draw it in 3D with matplotlib or plotly, showing Earth as a sphere and the satellite’s orbit around it.
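
One way to draw this with matplotlib is sketched below. `plot_orbit` is a hypothetical helper that takes an (N, 3) array of positions in km; the demo orbit is a synthetic circle at 6,778 km radius and 51.6 degrees inclination, not data from this notebook.

```python
import numpy as np
import matplotlib.pyplot as plt

EARTH_RADIUS_KM = 6378.0  # same radius the notebook uses

def plot_orbit(xyz_km, ax=None):
    """Draw Earth as a sphere plus one orbit given as an (N, 3) array in km."""
    xyz_km = np.asarray(xyz_km, dtype=float)
    if ax is None:
        fig = plt.figure(figsize=(6, 6))
        ax = fig.add_subplot(projection="3d")

    # Earth: parametric sphere surface
    u, v = np.meshgrid(np.linspace(0.0, 2.0 * np.pi, 40), np.linspace(0.0, np.pi, 20))
    ax.plot_surface(
        EARTH_RADIUS_KM * np.cos(u) * np.sin(v),
        EARTH_RADIUS_KM * np.sin(u) * np.sin(v),
        EARTH_RADIUS_KM * np.cos(v),
        color="lightblue", alpha=0.5,
    )

    # Orbit track
    ax.plot(xyz_km[:, 0], xyz_km[:, 1], xyz_km[:, 2], color="crimson")

    # Equal limits on all axes so the sphere is not distorted
    lim = max(np.abs(xyz_km).max(), EARTH_RADIUS_KM) * 1.1
    ax.set_xlim(-lim, lim)
    ax.set_ylim(-lim, lim)
    ax.set_zlim(-lim, lim)
    ax.set_box_aspect((1, 1, 1))
    ax.set_xlabel("x (km)")
    ax.set_ylabel("y (km)")
    ax.set_zlabel("z (km)")
    return ax

# Synthetic circular orbit for the demo
theta = np.linspace(0.0, 2.0 * np.pi, 200)
inc = np.radians(51.6)
radius = 6778.0
orbit = np.column_stack([
    radius * np.cos(theta),
    radius * np.sin(theta) * np.cos(inc),
    radius * np.sin(theta) * np.sin(inc),
])
ax = plot_orbit(orbit)
```

Passing the returned `ax` back into `plot_orbit` overlays further orbits on the same figure, although the Earth surface is redrawn on each call.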
