# Input Data Processing

This notebook serves the following purposes:

1. Read XML data from space-track.org
2. Read the CSV SATCAT catalog from CelesTrak
3. Estimate mass and radius (characteristic length) from the above data and determine the activity state
4. Propagate all the satellites to the same point in time
5. Inspect the data, remove unwanted entries, and store the result
6. (Deprecated) Old code used for generating test data

## Input files
- CSV data from the [CelesTrak SATCAT catalog](https://celestrak.com/pub/satcat.csv) following this [format](https://celestrak.com/satcat/satcat-format.php)
- XML 3LE data from the [Space-Track.org catalog](https://www.space-track.org/) following their [format](https://www.space-track.org/documentation#/tle)

## Output files

- Satellite data in CSV format with data on Satellite ID, Position, Velocity, Mass, Radius (characteristic length) and Activity State

In [None]:
### Imports
%load_ext autoreload
%autoreload 2

# Append main folder
import sys
sys.path.append("../")

import pykep as pk
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from tqdm import tqdm 

starting_t = pk.epoch_from_string('2022-01-01 00:00:00.000')
lower_cutoff_in_km = 6371 + 200 # Earth radius + 200 km altitude
higher_cutoff_in_km = 6371 + 2000 # Earth radius + 2000 km altitude

## 1. Read XML data
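
As a minimal sketch of the structure this step assumes, the fragment below mimics one `segment` of an OMM-style XML export and flattens it into a dictionary. The sample values and the helper name `flatten_segment` are made up for illustration, and the sketch looks up the `parameter` attribute by name, which is an assumption about how user-defined fields are labelled. Real Space-Track files contain many more fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened sample; all values are illustrative only.
SAMPLE_XML = """<ndm>
  <omm>
    <body>
      <segment>
        <metadata>
          <OBJECT_NAME>EXAMPLE SAT</OBJECT_NAME>
          <OBJECT_ID>2000-001A</OBJECT_ID>
        </metadata>
        <data>
          <meanElements>
            <EPOCH>2022-01-01T00:00:00.000000</EPOCH>
            <ECCENTRICITY>0.001</ECCENTRICITY>
          </meanElements>
          <tleParameters>
            <NORAD_CAT_ID>99999</NORAD_CAT_ID>
          </tleParameters>
          <userDefinedParameters>
            <USER_DEFINED parameter="SEMIMAJOR_AXIS">7000.0</USER_DEFINED>
            <USER_DEFINED parameter="RCS_SIZE">MEDIUM</USER_DEFINED>
          </userDefinedParameters>
        </data>
      </segment>
    </body>
  </omm>
</ndm>"""


def flatten_segment(segment):
    """Collect all leaf fields of one segment into a flat dict."""
    data = segment.find("data")
    groups = (
        segment.find("metadata"),
        data.find("meanElements"),
        data.find("tleParameters"),
        data.find("userDefinedParameters"),
    )
    fields = {}
    for group in groups:
        for field in group:
            # User-defined fields store their name in the "parameter" attribute;
            # all other fields are identified by their tag.
            key = field.get("parameter", field.tag)
            fields[key] = field.text
    return fields


root = ET.fromstring(SAMPLE_XML)
records = [flatten_segment(seg) for seg in root.findall(".//segment")]
print(records[0]["OBJECT_NAME"], records[0]["SEMIMAJOR_AXIS"])
```

All values come out as strings, which is why later cells convert them with `float(...)` before use.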

In [None]:
import xml.etree.ElementTree as ET
def parse_xml(file):
    """Parse spacetrack xml. 
    This function is inspired by https://github.com/brandon-rhodes/python-sgp4/blob/master/sgp4/omm.py , MIT Licensed
    """
    root = ET.parse(file).getroot()
    for segment in root.findall('.//segment'):
        metadata = segment.find('metadata')
        data = segment.find('data')
        meanElements = data.find('meanElements')
        tleParameters = data.find('tleParameters')
        userParameters = data.find('userDefinedParameters')
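        fields = {}
        for element in metadata, meanElements, tleParameters, userParameters:
            for field in element:
                # Fields with attributes (user-defined parameters) are keyed by
                # their first attribute value; all others by their tag.
                attributes = field.items()
                key = attributes[0][1] if len(attributes) > 0 else field.tag
                fields[key] = field.text
        yield fields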

In [None]:
# Create a generator to iterate over the data
fields = parse_xml("../data/spacetrack.xml")

## 2. Read SATCAT data

In [None]:
# Load all the XML data from space-track
satellites = []
for sat_fields in fields:
    satellites.append(sat_fields)
    if len(satellites) % 5000 == 0:
        print("Loaded ",len(satellites), "sats...")
print("Loaded ",len(satellites), "sats...Done")

In [None]:
# Read satcat data from celestrak
satcat = pd.read_csv("../data/satcat.csv")
satcat

## 3. Compute mass and radius (characteristic length) and get status

This follows the formulas from Nicholas L. Johnson, Paula H. Krisko, J.-C. Liou, and Phillip D. Anz-Meador, "NASA's new breakup model of EVOLVE 4.0," Advances in Space Research, 28(9):1377–1384, 2001.

According to Space-Track, the RCS size classes small, medium, and large correspond to RCS < 0.1 m², 0.1 m² < RCS < 1.0 m², and RCS > 1.0 m², respectively. For simplicity, when using the above formulas we convert these classes to 15 cm, 55 cm, and 200 cm, respectively.
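
The size-to-mass conversion can be sketched as follows. This is an illustrative stand-alone version of the density relation used in the notebook, with hypothetical function names. It treats the characteristic length as the diameter of a sphere, in metres.

```python
import math

# Characteristic lengths (m) assigned to the Space-Track RCS size classes.
RCS_CLASS_TO_LENGTH_M = {"SMALL": 0.15, "MEDIUM": 0.55, "LARGE": 2.0}


def density_kg_m3(l_c):
    """Density (kg/m^3) as a function of characteristic length l_c (m)."""
    if l_c > 0.01:
        return 92.937 * l_c ** (-0.74)
    return 2698.9  # constant density for objects of 1 cm and smaller


def mass_kg(l_c):
    """Mass (kg) of a sphere with diameter l_c (m)."""
    volume = 4.0 / 3.0 * math.pi * (l_c / 2.0) ** 3
    return volume * density_kg_m3(l_c)


for size_class, l_c in RCS_CLASS_TO_LENGTH_M.items():
    print(f"{size_class}: L_c = {l_c} m, mass = {mass_kg(l_c):.2f} kg")
```

Because the density falls off with size, the mass grows more slowly than the volume for objects larger than 1 cm.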

We get activity status from the celestrak data following https://celestrak.com/satcat/status.php
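
The mapping from operational status code to activity state can be sketched like this. The helper name `classify_activity` is hypothetical; the set of codes treated as active is the one used in the cell below.

```python
# Status codes treated as active, i.e. satellites assumed able to manoeuvre.
ACTIVE_STATUS_CODES = {"+", "P", "B", "S", "X"}


def classify_activity(ops_status_code):
    """Return "evasive" or "passive", or None for decayed objects."""
    if ops_status_code == "D":
        return None  # decayed objects are skipped entirely
    if ops_status_code in ACTIVE_STATUS_CODES:
        return "evasive"
    return "passive"


for code in ["+", "-", "D"]:
    print(code, classify_activity(code))
```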

In [None]:
sats_with_info = []
for sat in tqdm(satellites):
    
    satcat_sat = satcat[satcat["OBJECT_ID"] == sat["OBJECT_ID"]]
    
    # Skip objects that are not in the CelesTrak catalog
    if len(satcat_sat) == 0:
        continue
    
    # Read scalar values from the first matching catalog row
    rcs = satcat_sat["RCS"].iloc[0]
    
    # Skip decayed objects
    if satcat_sat["OPS_STATUS_CODE"].iloc[0] == "D":
        continue
    
    # Determine L_C
    if not np.isnan(rcs):
        sat["RADIUS"] = np.sqrt(float(rcs) / np.pi)
    else:
        if sat["RCS_SIZE"] == "SMALL":
            sat["RADIUS"] = 0.15
        elif sat["RCS_SIZE"] == "MEDIUM":
            sat["RADIUS"] = 0.55
        elif sat["RCS_SIZE"] == "LARGE":
            sat["RADIUS"] = 2.0
        else:
            # skip if no info was found
            continue
            
    # Determine Mass
    if sat["RADIUS"] > 0.01:
        sat["MASS"] = 4 / 3 * np.pi *(sat["RADIUS"] / 2)**3 * 92.937 * sat["RADIUS"]**(-0.74)
    else:
        sat["MASS"] = 4 / 3 * np.pi *(sat["RADIUS"] / 2)**3 * 2698.9
        
        
    # Determine if active satellite
    if satcat_sat["OPS_STATUS_CODE"].iloc[0] in ["+","P","B","S","X"]:
        sat["TYPE"] = "evasive"
    else:
        sat["TYPE"] = "passive"
    
    # Add planet
    t0 = pk.epoch_from_string(sat["EPOCH"].replace("T"," "))
    elements = [float(sat["SEMIMAJOR_AXIS"]) * 1000.,
                float(sat["ECCENTRICITY"]),
                float(sat["INCLINATION"]) * pk.DEG2RAD,
                float(sat["RA_OF_ASC_NODE"]) * pk.DEG2RAD,
                float(sat["ARG_OF_PERICENTER"]) * pk.DEG2RAD,
                float(sat["MEAN_ANOMALY"]) * pk.DEG2RAD,
               ]
    planet = pk.planet.keplerian(t0,elements,pk.MU_EARTH,6.67430e-11*sat["MASS"],sat["RADIUS"] / 2,sat["RADIUS"] / 2)
    sat["PLANET"] = planet
    
    sats_with_info.append(sat)
    
print("Now we have a total of ",len(sats_with_info), "sats.")

### Plot some examples

In [None]:
fig = plt.figure(figsize=(6,6),dpi=100)
ax = plt.axes(projection='3d');
for i in range (10):
    pk.orbit_plots.plot_planet(sats_with_info[i]["PLANET"],axes=ax)

## 4. Propagate all objects to t and discard too low and high ones
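
A minimal sketch of the filtering rule applied in this step, using only the standard library. The function name `classify_orbit` is hypothetical. The cut-offs mirror the constants defined in the imports cell and are measured from Earth's center, so they are compared against radial distance rather than altitude above the surface.

```python
import math

EARTH_RADIUS_KM = 6371.0
LOWER_CUTOFF_KM = EARTH_RADIUS_KM + 200.0
HIGHER_CUTOFF_KM = EARTH_RADIUS_KM + 2000.0


def classify_orbit(position_km, semi_major_axis_km):
    """Return "too_low", "too_high" or "keep" for one propagated object."""
    radial_distance_km = math.sqrt(sum(c * c for c in position_km))
    if radial_distance_km < LOWER_CUTOFF_KM:
        return "too_low"
    # An object is also discarded if its orbit as a whole is too large,
    # even when it currently sits inside the shell.
    if semi_major_axis_km > HIGHER_CUTOFF_KM or radial_distance_km > HIGHER_CUTOFF_KM:
        return "too_high"
    return "keep"


print(classify_orbit((6771.0, 0.0, 0.0), 6771.0))
```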

In [None]:
objects = []
count_too_low = 0
count_too_high = 0

for sat in sats_with_info:
    try:
        planet = sat["PLANET"]
        pos,v = planet.eph(starting_t)
        
        # convert to km and numpy
        pos = np.asarray(pos) / 1000.0 
        v = np.asarray(v) / 1000.0
        sma,_,_,_,_,_ = pk.ic2par(pos * 1000,v *1000,mu=pk.MU_EARTH)
        
        # Distance from Earth's center in km (the cutoffs include Earth's radius)
        altitude = np.linalg.norm(pos)
        if altitude < lower_cutoff_in_km:
            count_too_low += 1
            continue
        if sma / 1000. > higher_cutoff_in_km or altitude > higher_cutoff_in_km:
            count_too_high += 1
            continue
        
        objects.append({"ID": sat["OBJECT_NAME"],
                        "R": tuple(pos),
                        "V": tuple(v),
                        "M": sat["MASS"],
                        "RADIUS": sat["RADIUS"],
                        "TYPE": sat["TYPE"]
                       })
    except RuntimeError as e:
        print(e, " propagating ",planet.name)
        
print("Successfully propagated ",len(objects)," objects.")
print(count_too_low," had a too small altitude")
print(count_too_high," had a too high altitude")

## 5. Plot, clean up and store results
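
One pitfall in this step is index alignment: after rows are dropped, the dataframe index has gaps, and a freshly built dataframe with a default index no longer lines up with it in `pd.concat(..., axis=1)`. The sketch below uses made-up toy data to show how a tuple column can be split while reusing the original index.

```python
import pandas as pd

# Toy data standing in for the propagated objects.
df = pd.DataFrame(
    {
        "ID": ["A", "B", "C"],
        "R": [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)],
    }
)
df = df.drop(index=1)  # leaves index labels 0 and 2

# Reusing df.index keeps each component on the row it came from.
split_df = pd.DataFrame(
    df["R"].tolist(), columns=["r_x", "r_y", "r_z"], index=df.index
)
df = pd.concat([df.drop(columns="R"), split_df], axis=1)
print(df)
```

Calling `reset_index(drop=True)` after the last row removal is an alternative that achieves the same alignment.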

In [None]:
fig = plt.figure(figsize=(6,6),dpi=100)
ax = plt.axes(projection='3d');

positions = np.array([obj["R"] for obj in objects])
velocities = np.array([obj["V"] for obj in objects])

ax.scatter(positions[:,0],positions[:,1],positions[:,2],marker=".",alpha=0.25)

In [None]:
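# Convert to pandas dataframe and drop the ISS, Starlink and OneWeb satellites, and any duplicate entries.
df = pd.DataFrame(objects)
# Boolean filtering is a no-op if the ISS is absent (argmax would drop row 0 instead)
df = df[df["ID"] != "ISS (ZARYA)"]
df = df.drop(df[df.ID.str.startswith('STARLINK')].index)
df = df.drop(df[df.ID.str.startswith('ONEWEB')].index)
# Reset the index so that later column-wise concatenation lines up row by row
df = df.drop_duplicates(subset=['R']).reset_index(drop=True)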
df

In [None]:
# Split the position tuples into separate columns (reuse df's index so rows stay aligned)
split_df = pd.DataFrame(df['R'].tolist(), columns=['r_x', 'r_y', 'r_z'], index=df.index)
df = pd.concat([df, split_df], axis=1)

# Same for the velocity tuples
split_df = pd.DataFrame(df['V'].tolist(), columns=['v_x', 'v_y', 'v_z'], index=df.index)
df = pd.concat([df, split_df], axis=1)

df = df.drop(columns="R")
df = df.drop(columns="V")

# display df
df

In [None]:
# Write to csv
df.to_csv("../data/initial_population.csv")

## 6. (deprecated) Propagate test set by some time

In [None]:
# Load test data
pos = np.loadtxt("../data/pos.csv",delimiter=",")
v = np.loadtxt("../data/v.csv",delimiter=",")

In [None]:
# Propagate by t seconds
t = 10
objects = []
t_end = pk.epoch(starting_t.mjd + t * pk.SEC2DAY,"mjd")
for pos_i,v_i in zip(pos,v):
    try:
        p = pk.planet.keplerian(starting_t,pos_i * 1000.0,v_i * 1000.0,pk.MU_EARTH,1.,1.,1.)
        # Use new names so the loaded velocity array `v` is not overwritten
        r_new,v_new = p.eph(t_end)
        
        objects.append((np.array(r_new) / 1000., np.array(v_new) / 1000.))
        
    except RuntimeError as e:
        print(e, " propagating ",p.name)

In [None]:
# unpack
positions = np.array([r for r,_ in objects])
velocities = np.array([v_i for _,v_i in objects])

In [None]:
#look at them
pos

In [None]:
positions

In [None]:
# Save
np.savetxt("../data/pos_test_10s.csv",positions,delimiter=",")
np.savetxt("../data/v_test_10s.csv",velocities,delimiter=",")