Dataset Source: Union of Concerned Scientists, https://www.ucsusa.org/resources/satellite-database

Dashboard is published at: https://public.tableau.com/app/profile/pavel7982/viz/ActiveSatellites/Dashboard1?publish=yes

In [1]:
import pandas as pd
import plotly.express as px
import numpy as np

df = pd.read_csv("./satellites.csv", sep=";")

df = df.rename(columns={
    "Perigee (km)": 'perigee',
    "Apogee (km)": "apogee",
    "Current Official Name of Satellite": "name",
    "Name of Satellite, Alternate Names": "name2",
    "Country of Operator/Owner": "country",
    "Operator/Owner": "company",
    "Class of Orbit": "orbit_class",
    "Type of Orbit": "orbit_type"
})

# The altitude columns are read as strings containing spaces; strip them before casting to float
df["perigee"] = df["perigee"].str.replace(" ", "").astype("float")
df["apogee"] = df["apogee"].str.replace(" ", "").astype("float")


df = df[["name", "name2", "country", "company", "orbit_class", "orbit_type", "perigee", "apogee"]]
df.columns


Index(['name', 'name2', 'country', 'company', 'orbit_class', 'orbit_type',
       'perigee', 'apogee'],
      dtype='object')
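
The cleaning step above removes spaces from the altitude columns before casting them. As a minimal sketch of the same idea on a self-contained sample, assuming the raw values look like `35 786` (the two-row CSV below is hypothetical and is not taken from the UCS file), `pd.to_numeric` with `errors="coerce"` turns any cell that still fails to parse into NaN instead of raising:

```python
import io

import pandas as pd

# Hypothetical two-row sample mimicking the assumed "35 786" formatting.
raw = io.StringIO("name;Perigee (km)\nA;35 786\nB;548\n")
sample = pd.read_csv(raw, sep=";", dtype={"Perigee (km)": str})

# Remove every whitespace character, then convert to numbers.
# Cells that still cannot be parsed become NaN rather than raising an error.
cleaned = sample["Perigee (km)"].str.replace(r"\s+", "", regex=True)
sample["perigee"] = pd.to_numeric(cleaned, errors="coerce")

print(sample["perigee"].tolist())
```

Whether coercing to NaN or failing loudly is preferable depends on how much the input file is trusted; the notebook's `astype("float")` takes the stricter route.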

# Look for outliers

In [2]:
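def variancestats(df, var):
    """Return the mean, standard deviation and quartiles of column `var`."""
    d = dict()
    d["mean"] = df[var].mean()
    d["std"] = df[var].std()  # sample standard deviation (pandas default, ddof=1)
    d["q25th"] = df[var].quantile(0.25)
    d["q75th"] = df[var].quantile(0.75)

    return d

def zscore(df, var):
    """Add `<var>_zscore` and `<var>_zscore_outlier` columns to `df` in place and return it."""
    r = variancestats(df, var)
    mean = r["mean"]
    std = r["std"]

    zscorevar = "{}_zscore".format(var)
    outliervar = "{}_zscore_outlier".format(var)

    df[zscorevar] = (df[var] - mean) / std
    # Flag rows that lie at least 2 standard deviations away from the mean
    df[outliervar] = 0
    df.loc[np.abs(df[zscorevar]) >= 2, outliervar] = 1

    return df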
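
To make the z-score rule concrete, here is a minimal sketch on a toy column. The ten altitudes are hypothetical values chosen for illustration, not rows from the dataset. It applies the same formula and the same threshold of 2 as `zscore` above:

```python
import numpy as np
import pandas as pd

# Hypothetical altitudes (km): nine low orbits and one very high one.
toy = pd.DataFrame({"perigee": [500.0] * 9 + [36000.0]})

mean = toy["perigee"].mean()
std = toy["perigee"].std()  # sample standard deviation (ddof=1)

# Distance from the mean, expressed in standard deviations.
toy["perigee_zscore"] = (toy["perigee"] - mean) / std

# 1 where the value is at least 2 standard deviations from the mean, else 0.
toy["perigee_zscore_outlier"] = (np.abs(toy["perigee_zscore"]) >= 2).astype(int)

print(toy)
```

Only the last row is flagged. Note that a single extreme value also inflates the mean and the standard deviation it is compared against, which is one reason a z-score threshold can under-report outliers in small or heavily skewed samples.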

In [3]:
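def zscore_analysis(df, var):
    # Work on a copy so the caller's DataFrame is left untouched
    dt = df.copy()
    dt = zscore(dt, var)

    zscorevar = "{}_zscore".format(var)
    outliervar = "{}_zscore_outlier".format(var)

    # Count satellites for each distinct value of `var`
    dt = dt.groupby([var, zscorevar, outliervar]).agg({
        "name": "count"
    })

    dt.reset_index(inplace=True)

    # x: value of `var`, y: number of satellites with that value, color: z-score
    fig = px.scatter(dt, x=var, y="name", color=zscorevar)
    fig.show()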

zscore_analysis(df, "perigee")
zscore_analysis(df, "apogee")

In [4]:
dq = zscore(df, "perigee")
dt = dq[dq["perigee_zscore_outlier"] == 1]
dt = dt.sort_values(by=["perigee"], ascending=False)

dt.head()

| | name | name2 | country | company | orbit_class | orbit_type | perigee | apogee | perigee_zscore | perigee_zscore_outlier |
|---|---|---|---|---|---|---|---|---|---|---|
| 1161 | Interstellar Boundary Explorer | Interstellar Boundary EXplorer (IBEX) | USA | National Aeronautics and Space Administration ... | Elliptical | Deep Highly Eccentric | 62200.0 | 268679.0 | 5.235524 | 1 |
| 866 | Geomagnetic Tail Laboratory (Geotail) | Geotail (Geomagnetic Tail Laboratory) | Multinational | Institute of Space and Astronautical Science (... | Elliptical | Deep Highly Eccentric | 49551.0 | 191451.0 | 4.083549 | 1 |
| 987 | GSAT-29 | GSAT-29 | India | Indian Space Research Organization (ISRO) | GEO | | 37782.0 | 37807.0 | 3.011718 | 1 |
| 5131 | TianLian 2 | TianLian 2 (TL-1-02, CTDRS) | China | China Aerospace Science and Technology Corp. (... | GEO | | 37778.0 | 37794.0 | 3.011353 | 1 |
| 5024 | STPSat-6 | STPSat-6 | USA | Atlas 5 | GEO | | 36097.0 | 36110.0 | 2.85826 | 1 |


Insight: What makes the Interstellar Boundary Explorer (IBEX) stand out? It is a NASA satellite in an elliptical, deep highly eccentric orbit with a perigee of 62,200 km, about 25% higher than the next-highest entry, the multinational Geomagnetic Tail Laboratory (Geotail), at 49,551 km.

In [5]:
stats = variancestats(dt, "perigee")
stats

{'mean': 35806.03380782918,
 'std': 1302.837668300367,
 'q25th': 35768.0,
 'q75th': 35782.0}

In [6]:
dq = dt[dt["perigee"] > stats["q75th"]]
dq.count()

name                      131
name2                     131
country                   131
company                   131
orbit_class               131
orbit_type                  2
perigee                   131
apogee                    131
perigee_zscore            131
perigee_zscore_outlier    131
dtype: int64

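Insight: the statistics above are computed on the z-score outlier subset `dt`, not on the full dataset. Within that subset, three quarters of the satellites have a perigee at or below 35,782 km, close to the geostationary altitude of about 35,786 km, while 131 satellites have a higher perigee. Only 2 satellites have a perigee above 40,000 km.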
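
`variancestats` already returns the 25th and 75th percentiles. One possible use for them, which the notebook itself does not apply, is the interquartile-range (IQR) rule as a second outlier check alongside the z-score. The sketch below is only an illustration under stated assumptions: the helper name `iqr_outliers`, the conventional multiplier `k = 1.5`, and the toy values are all hypothetical.

```python
import pandas as pd

def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return a boolean mask of values outside [q25 - k*IQR, q75 + k*IQR]."""
    q25 = values.quantile(0.25)
    q75 = values.quantile(0.75)
    iqr = q75 - q25
    return (values < q25 - k * iqr) | (values > q75 + k * iqr)

# Hypothetical altitudes (km): seven similar values and one far larger one.
toy = pd.Series([500.0, 520.0, 550.0, 560.0, 600.0, 620.0, 650.0, 36000.0])
flags = iqr_outliers(toy)

print(toy[flags])
```

Because the quartiles are barely affected by a few extreme values, this rule can flag points that a z-score threshold misses. Which rule suits a given column is a judgment call.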

# Relations

In [7]:
fig = px.scatter(df, x="apogee", y="perigee", trendline="ols")
fig.show()

As the plot above shows, there is a linear correlation between a satellite's perigee and apogee, with an R² of 0.63.
However, all satellites with an apogee above 44,660 km lie below the trendline.
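
As a rough illustration of how an R² value and the "below the trendline" observation can be checked numerically, the sketch below fits a straight line with NumPy and inspects the residuals. It does not reuse Plotly's fitted trendline, and the seven (apogee, perigee) pairs are hypothetical values, not rows from the dataset.

```python
import numpy as np

# Hypothetical (apogee, perigee) pairs in km: five near-circular orbits
# followed by two elongated ones with a high apogee and a low perigee.
apogee = np.array([550.0, 800.0, 1200.0, 20200.0, 35800.0, 39000.0, 47000.0])
perigee = np.array([540.0, 780.0, 1150.0, 20100.0, 35780.0, 1000.0, 600.0])

# Ordinary least-squares fit of perigee = slope * apogee + intercept.
slope, intercept = np.polyfit(apogee, perigee, deg=1)
predicted = slope * apogee + intercept
residuals = perigee - predicted

# R² = 1 - (residual sum of squares) / (total sum of squares).
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((perigee - perigee.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

# A negative residual means the point lies below the fitted line.
below_line = residuals < 0

print(round(r_squared, 3), below_line)
```

In this toy data the two elongated orbits fall below the line, mirroring the pattern described above: a high apogee paired with a low perigee pulls the point under a trend dominated by near-circular orbits.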

Such an orbit is called a Highly Elliptical Orbit (HEO), of which the Molniya orbit is a well-known example. It is characterized by a low altitude (about 1,000 km) at perigee and a very high altitude (greater than 35,786 km) at apogee. These very elongated orbits have the advantage of long dwell times at a point in the sky during the approach to and descent from apogee.

Source: William Emery, Adriano Camps, in Introduction to Satellite Remote Sensing (https://www.sciencedirect.com/book/9780128092545/introduction-to-satellite-remote-sensing), 2017
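
How elongated an orbit is can be quantified from the two columns used throughout this notebook. The sketch below is a hedged illustration, not part of the original analysis. It assumes that perigee and apogee are altitudes above the Earth's surface, uses a mean Earth radius of 6,371 km, and applies the standard two-body formulas for eccentricity and Kepler's third law for the period. The helper name `orbit_shape` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0       # mean Earth radius (assumption: altitudes are above the surface)
MU_EARTH_KM3_S2 = 398600.4418  # Earth's standard gravitational parameter, km^3/s^2

def orbit_shape(perigee_km, apogee_km):
    """Return (eccentricity, period in hours) from perigee and apogee altitudes."""
    r_p = EARTH_RADIUS_KM + perigee_km  # perigee distance from Earth's center
    r_a = EARTH_RADIUS_KM + apogee_km   # apogee distance from Earth's center
    eccentricity = (r_a - r_p) / (r_a + r_p)
    semi_major_axis = (r_a + r_p) / 2.0
    # Kepler's third law: T = 2 * pi * sqrt(a^3 / mu)
    period_s = 2.0 * math.pi * math.sqrt(semi_major_axis ** 3 / MU_EARTH_KM3_S2)
    return eccentricity, period_s / 3600.0

# Interstellar Boundary Explorer, using the perigee and apogee from the outlier table.
ibex_e, ibex_period_h = orbit_shape(62200.0, 268679.0)

# A circular orbit at geostationary altitude for comparison.
geo_e, geo_period_h = orbit_shape(35786.0, 35786.0)

print(round(ibex_e, 3), round(geo_e, 3), round(geo_period_h, 2))
```

An eccentricity of 0 corresponds to a circular orbit, and values approaching 1 indicate increasingly elongated ones. Adding such a column to `df` would be one way to separate highly elliptical orbits from near-circular ones without relying on the trendline.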