This is a second dataset, the CelesTrak satellite catalog (SATCAT), that I would like to combine with the UCS dataset.

First, we need some imports and utility functions for classifying objects from this dataset, so that its object types at least loosely match the object types in the UCS dataset and we can compare apples to apples.

The classify_orbit function acts as a translator. It converts the time an object takes to orbit Earth (Period) into an altitude category (Class). The period thresholds it uses follow from the laws of physics (specifically Kepler's Third Law), which tie an orbit's period to its size.
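
As a rough check on where those thresholds come from, here is a minimal sketch of Kepler's Third Law for circular orbits. The constants and both helper names are assumptions made for illustration (standard published values for Earth, not taken from this notebook). For an elliptical orbit the same formula applies to the semi-major axis rather than to a single altitude.

```python
import math

# Assumed constants (standard published values, not taken from the notebook).
MU_EARTH_KM3_S2 = 398600.4418   # Earth's gravitational parameter GM, km^3/s^2
EARTH_RADIUS_KM = 6378.137      # Earth's equatorial radius, km


def altitude_to_period_minutes(altitude_km):
    """Period of a circular orbit at the given altitude above the surface."""
    a = EARTH_RADIUS_KM + altitude_km  # orbit radius measured from Earth's center
    return 2 * math.pi * math.sqrt(a ** 3 / MU_EARTH_KM3_S2) / 60


def period_to_altitude_km(period_minutes):
    """Altitude of the circular orbit that has the given period."""
    seconds = period_minutes * 60
    a = (MU_EARTH_KM3_S2 * (seconds / (2 * math.pi)) ** 2) ** (1 / 3)
    return a - EARTH_RADIUS_KM


# Hypothetical illustration: the two altitudes quoted in classify_orbit's comments.
leo_ceiling_minutes = altitude_to_period_minutes(2000)
geo_minutes = altitude_to_period_minutes(35786)
print(round(leo_ceiling_minutes, 1), round(geo_minutes, 1))
```

The 2,000 km LEO ceiling works out to a period of about 127 minutes, and 35,786 km to about 1,436 minutes, which is why the cutoffs sit at 128 minutes and in a window around 1,436 minutes.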

In [1]:
import pandas as pd
df_debris = pd.read_csv('../data/original/satcat.csv')

def classify_orbit(row):
    # look at a single row in the dataframe and grab the value in period_minutes
    period = row['period_minutes']
    
    # classify the object based on the period of time, in minutes, it takes to orbit the earth.
    if pd.isnull(period) or period == 0:
        # filter for missing/blank (nan) data. Period of 0 is physically impossible.
        return 'Unknown'
    elif period < 128:
        # Low Earth Orbit (LEO) - Crowded City
        # An object in LEO has an altitude of up to 2,000 km, which corresponds to an orbital period of up to about 127 minutes.
        return 'LEO'
    elif 1400 <= period <= 1460:
        # Geostationary Orbit (GEO) - Single Precise Lane
        # An object in GEO sits about 35,786 km above the equator and orbits once per sidereal day (about 1,436 minutes).
        # To stay over the same spot on Earth it must hold that altitude: lower orbits circle faster than Earth rotates, higher ones fall behind.
        # The 1400-1460 minute window leaves some tolerance around that value.
        return 'GEO'
    elif 128 <= period < 1400:
        # Medium Earth Orbit (MEO) - Wide Open Highway
        # An object in MEO orbits between 2,000 km and 35,786 km and can operate at any altitude
        # in between. Popular uses for this orbit include navigation satellites.
        # The orbital period is between about 2 hours and 24 hours.
        # MEO suits satellites that can't be in LEO, because that would be too low and would require larger constellations like Starlink,
        # and can't be in GEO, because that would be too high, weaken the signal, and leave the satellite stuck over one spot.
        # For navigation satellites MEO is just right: it allows a smaller constellation to circle the Earth every 12 hours.
        return 'MEO'
    elif period > 1460:
        # Anything slower than GEO: very high orbits or extreme, long loops (ellipses).
        # An object on an elliptical orbit moves very fast near perigee and very slowly near apogee,
        # so it spends a long time over a single, high location and much less time elsewhere.
        # A well-known example of that behavior is the Molniya orbit, used by Soviet communication satellites to linger over
        # the USSR for most of each orbit. Note that its period is only about 12 hours, so a period-based test like this one
        # files Molniya-type objects under MEO, not here.
        # This category is also a catch-all for objects boosted well above GEO. Satellites are often raised into higher
        # "graveyard" orbits to get them out of the way, while objects in LEO gradually lose altitude to atmospheric drag,
        # deorbit naturally, and burn up in the atmosphere.
        return 'High Elliptical / Deep'
    else:
        # Defensive fallback; the branches above should already cover every numeric period.
        return 'Unknown'
    
def categorize_object(row):
    if row['source'] == 'both':
        # The object appears in both datasets, so it is an active satellite: it is listed in the UCS active-satellite
        # catalog (the "then") and in the CelesTrak SATCAT, which lists ALL tracked satellites, debris, unknowns, etc. to date (the "now").
        return 'Active Satellite'
    elif row['object_type'] == 'PAY':
        return 'Inactive Satellite' # Payload but not in UCS = Dead
    elif row['object_type'] == 'R/B':
        return 'Rocket Body' # Spent boosters
    elif row['object_type'] == 'DEB':
        return 'Debris' # Fragments/Shrapnel
    else:
        # Any other object_type value falls through to 'Unknown'.
        return 'Unknown'
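
The period thresholds in classify_orbit can also be written without a row-by-row `apply`. The following is a sketch of one possible vectorized equivalent using `numpy.select`. The function name `classify_orbit_vectorized` and the sample periods are hypothetical and not part of the original notebook.

```python
import numpy as np
import pandas as pd


def classify_orbit_vectorized(period):
    """Classify a Series of orbital periods (minutes) with classify_orbit's thresholds."""
    # Conditions are checked in order; the first one that matches wins.
    conditions = [
        period.isna() | (period == 0),   # missing or physically impossible
        period < 128,                    # LEO
        period.between(1400, 1460),      # GEO window (inclusive on both ends)
        period < 1400,                   # MEO: everything left below the GEO window
        period > 1460,                   # slower than GEO
    ]
    labels = ['Unknown', 'LEO', 'GEO', 'MEO', 'High Elliptical / Deep']
    classes = np.select(conditions, labels, default='Unknown')
    return pd.Series(classes, index=period.index)


# Hypothetical sample periods, chosen to exercise the boundaries.
sample = pd.Series([None, 0, 95.0, 127.9, 128, 718, 1436.1, 1460, 1500])
result = classify_orbit_vectorized(sample)
print(result.tolist())
```

On a catalog of roughly 30,000 rows either approach is fast enough, so the row-wise version is arguably easier to read. The vectorized form mainly matters if the table grows much larger.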

In [None]:
import pandas as pd
df_debris = pd.read_csv('../data/original/satcat.csv')

# Rename every column now so we don't have to deal with it later.
debris_mapping = {
    'OBJECT_NAME': 'object_name',
    'OBJECT_ID': 'object_id',          
    'NORAD_CAT_ID': 'norad_id',        # The Join Key "Primary Key"
    'OBJECT_TYPE': 'object_type',      # PAY (Payload), R/B (Rocket Body), DEB (Debris)
    'OPS_STATUS_CODE': 'ops_status',   
    'OWNER': 'owner',
    'LAUNCH_DATE': 'launch_date',
    'LAUNCH_SITE': 'launch_site',
    'DECAY_DATE': 'decay_date',
    'PERIOD': 'period_minutes',
    'INCLINATION': 'inclination_degrees',
    'APOGEE': 'apogee_km',
    'PERIGEE': 'perigee_km',
    'RCS': 'rcs',                      # Radar Cross Section (Size)
    'DATA_STATUS_CODE': 'data_status', 
    'ORBIT_CENTER': 'orbit_center',    
    'ORBIT_TYPE': 'orbit_type'         # Life Status (ORB, IMP, etc.)
}

df_debris.rename(columns=debris_mapping, inplace=True)

# Filter out all objects that have already decayed. Decayed objects no longer present any form of clutter danger.
# If the object is still in space (it hasn't decayed yet), decay_date will be null/NaN/None, so we
# copy the objects that have not decayed into a new clean dataframe, current_junk.
current_junk = df_debris[df_debris['decay_date'].isnull()].copy()

# classify orbit_class for all junk still in space.
current_junk['orbit_class'] = current_junk.apply(classify_orbit, axis=1)

# load our clean ucs dataset
ucs_data = pd.read_csv('../data/clean/ucs_cleaned.csv')

# Merge to see which objects match the UCS active satellite list
merged_data = current_junk.merge(
    ucs_data[['norad_id', 'users', 'purpose']], 
    on='norad_id', 
    how='left', 
    indicator='source'
)

# Assign a category to each object to determine whether it is an active satellite or debris/junk/etc.
merged_data['category'] = merged_data.apply(categorize_object, axis=1)

print("Total Objects in Orbit:", len(merged_data)) # total number of tracked objects still in orbit

# Breakdown of the objects by category and by orbit class.
print("\nComposition of our Skies:")
print(merged_data['category'].value_counts())
print("\nWhere is it located?")
print(merged_data['orbit_class'].value_counts())

Total Objects in Orbit: 32695

Composition of our Skies:
category
Debris                12662
Inactive Satellite    11978
Active Satellite       5610
Rocket Body            2397
Unknown                  48
Name: count, dtype: int64

Where is it located?
orbit_class
LEO                       26616
MEO                        3603
GEO                        1545
Unknown                     615
High Elliptical / Deep      316
Name: count, dtype: int64
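
To make the role of `indicator='source'` in the merge concrete, here is a small sketch on made-up data. The two toy frames and their values are hypothetical. With `how='left'`, every catalog row is kept and the `source` column is `'both'` only where the `norad_id` also exists in the second frame. The `validate='many_to_one'` argument is an optional addition, not used in the notebook, that makes pandas raise an error if the right-hand frame contains duplicate keys, which would otherwise silently duplicate catalog rows.

```python
import pandas as pd

# Hypothetical stand-ins for the SATCAT rows still in orbit and the UCS active list.
catalog = pd.DataFrame({
    'norad_id': [1, 2, 3],
    'object_type': ['PAY', 'PAY', 'DEB'],
})
active = pd.DataFrame({
    'norad_id': [2],
    'purpose': ['Communications'],
})

demo = catalog.merge(
    active,
    on='norad_id',
    how='left',              # keep every catalog row
    indicator='source',      # adds a column with 'left_only' or 'both'
    validate='many_to_one',  # assumption: the active list should have unique norad_id values
)

sources = demo['source'].astype(str).tolist()
print(demo[['norad_id', 'object_type', 'source']])
```

Only the row with `norad_id` 2 is marked `'both'`, so categorize_object would label it an active satellite. The other payload would be labeled inactive and the `DEB` row debris.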


In [3]:
print(merged_data.info())
print()
print(merged_data['source'].value_counts())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32695 entries, 0 to 32694
Data columns (total 22 columns):
 #   Column               Non-Null Count  Dtype   
---  ------               --------------  -----   
 0   object_name          32695 non-null  object  
 1   object_id            32695 non-null  object  
 2   norad_id             32695 non-null  int64   
 3   object_type          32695 non-null  object  
 4   ops_status           16024 non-null  object  
 5   owner                32695 non-null  object  
 6   launch_date          32695 non-null  object  
 7   launch_site          32695 non-null  object  
 8   decay_date           0 non-null      object  
 9   period_minutes       32080 non-null  float64 
 10  inclination_degrees  32080 non-null  float64 
 11  apogee_km            32080 non-null  float64 
 12  perigee_km           32080 non-null  float64 
 13  rcs                  14799 non-null  float64 
 14  data_status          940 non-null    object  
 15  orbit_center       

In [4]:
# Save the newly cleaned dataset for visualization in a meaningful way!
merged_data.to_csv('../data/clean/orbital_clutter_cleaned.csv', index=False)

print("File saved successfully!")

File saved successfully!


Future considerations: Consider finding orbital data for objects that do not originate from Earth (naturally occurring rather than man-made).
This includes asteroids, comets, and other 'space rocks' that could be hazardous to satellites orbiting at high speeds.