In [None]:
import numpy as np
import babypandas as bpd
import pandas as pd

import matplotlib.pyplot as plt
from matplotlib_inline.backend_inline import set_matplotlib_formats
set_matplotlib_formats("svg")
plt.style.use('ggplot')
plt.rcParams['figure.figsize'] = (10, 5)

Data comes from here: https://www.kaggle.com/datasets/adityamishraml/nasaexoplanets

Original source: https://exoplanetarchive.ipac.caltech.edu/

Also useful data source: https://exoplanets.nasa.gov/discovery/exoplanet-catalog/

First, convert mass and radius to their multipliers relative to Earth. Information used for that conversion comes from here: https://nssdc.gsfc.nasa.gov/planetary/factsheet/jupiterfact.html

Now the `mass` column represents the ratio of each exoplanet's mass to Earth's mass (how many times more massive it is than Earth), and the `radius` column is likewise a ratio relative to Earth's radius.

In [None]:
exo = pd.read_csv('exoplanets.csv').get(['name', 'distance', 'stellar_magnitude', 'planet_type',
       'discovery_year', 'detection_method', 'mass_multiplier', 'mass_wrt', 'radius_multiplier',
       'radius_wrt'])

exo.loc[exo['mass_wrt'] == 'Jupiter', 'mass_wrt'] = 317.83
exo.loc[exo['mass_wrt'] == 'Earth', 'mass_wrt'] = 1

# Approximate: the data does not say whether the radius given is polar or equatorial
exo.loc[exo['radius_wrt'] == 'Jupiter', 'radius_wrt'] = 11
exo.loc[exo['radius_wrt'] == 'Earth', 'radius_wrt'] = 1

exo = exo.assign(mass=pd.to_numeric(exo.get('mass_multiplier')*exo.get('mass_wrt')))
exo = exo.assign(radius=pd.to_numeric(exo.get('radius_multiplier')*exo.get('radius_wrt')))
exo = exo.drop(columns=['mass_multiplier', 'mass_wrt', 'radius_multiplier', 'radius_wrt'])
exo = exo.dropna()
exo
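
The same conversion can also be sketched with `Series.map`, which leaves the label columns untouched and produces numeric columns directly. This is only an illustrative alternative on a made-up three-row table: the `EARTH_MASSES` and `EARTH_RADII` dictionaries and the toy planet names are assumptions, not part of the dataset.

```python
import pandas as pd

# Conversion factors to Earth units. The Jupiter mass factor comes from the
# NASA fact sheet linked above; the radius factor is the same rough value of 11.
EARTH_MASSES = {'Jupiter': 317.83, 'Earth': 1}
EARTH_RADII = {'Jupiter': 11, 'Earth': 1}

# Made-up rows, for illustration only
toy = pd.DataFrame({
    'name': ['toy b', 'toy c', 'toy d'],
    'mass_multiplier': [2.0, 5.0, 0.5],
    'mass_wrt': ['Jupiter', 'Earth', 'Jupiter'],
    'radius_multiplier': [1.0, 2.0, 0.5],
    'radius_wrt': ['Jupiter', 'Earth', 'Jupiter'],
})

# Look up each label's factor, then scale the multiplier by it
toy = toy.assign(
    mass=toy['mass_multiplier'] * toy['mass_wrt'].map(EARTH_MASSES),
    radius=toy['radius_multiplier'] * toy['radius_wrt'].map(EARTH_RADII),
)
print(toy[['name', 'mass', 'radius']])
```

With `map`, a label that is missing from the dictionary becomes `NaN`, so such a row would be removed by a later `dropna()` rather than raising an error.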

Sirius is the brightest star in the night sky. Its apparent magnitude is -1.46. Using the formula

$$10^{-0.4 \times (\text{stellar magnitude} + 1.46)}$$

we can get each star's brightness relative to Sirius (what fraction of Sirius' brightness this star has). For example, 0.5 means the star is half as bright as Sirius. Because these exoplanets, and therefore their host stars, are far away, the fractions should be less than 1.

This follows the example calculation here: https://en.wikipedia.org/wiki/Apparent_magnitude#Example:_Sun_and_Moon
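
As a quick sanity check of the formula, the sketch below evaluates it for a few magnitudes. The helper name `brightness_vs_sirius` and the sample magnitudes are illustrative assumptions, not values from the dataset.

```python
SIRIUS_MAGNITUDE = -1.46  # apparent magnitude of Sirius, as quoted above

def brightness_vs_sirius(stellar_magnitude):
    """Return brightness as a fraction of Sirius' brightness."""
    return 10 ** (-0.4 * (stellar_magnitude - SIRIUS_MAGNITUDE))

# Sample magnitudes chosen 5 magnitudes apart
for magnitude in [-1.46, 3.54, 8.54]:
    print(magnitude, brightness_vs_sirius(magnitude))
```

Sirius itself comes out as 1, and each step of 5 magnitudes divides the brightness by 100, so a star of magnitude 3.54 is about 0.01 times as bright as Sirius.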

In [None]:
exo.assign(brightness = 10 ** (-0.4 * (exo.get('stellar_magnitude') + 1.46)))


This does not help the plots, so I am abandoning it and will stick with the original stellar magnitude.

In [None]:
exo.to_csv('../data/exoplanets.csv', index=False)

Another thing we can do is separate the name of the star that the planet orbits from the order in which the planet was discovered in that planetary system. Most names follow the [standard naming convention](https://en.wikipedia.org/wiki/Exoplanet_naming_convention#:~:text=Following%20an%20extension%20of%20the,planets%20are%20given%20subsequent%20letters), but a few had to be handled manually.

In [None]:
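def get_star(planet_name):
    # Drop the trailing designation (" b", or "02" in names like "EPIC 201170410.02"),
    # then strip the "." that the numeric form leaves behind
    return planet_name[:-2].rstrip('.')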

get_star("1RXS J160929.1-210524 b")

In [None]:
letter_to_number = {"b": 1, "c": 2, "d": 3, "e": 4, "f": 5, "g": 6, "h": 7, "i": 8, "j": 9, "1": 1, "2": 2}

def get_order(planet_name):
    # manually handle planets that don't follow the convention
    if planet_name == "EPIC 201170410.02" or planet_name == "EPIC 201757695.02":
        return 1
    return letter_to_number[planet_name[-1]]

get_order("1RXS J160929.1-210524 b")
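
Slicing by position assumes every name ends in a one-character designation. A hypothetical alternative is to split at the last separator instead. In the sketch below, `split_planet_name` is an illustrative name, and the two patterns it recognises (a single letter after the last space, or digits after the last dot) are assumptions based on the names that appear above.

```python
def split_planet_name(planet_name):
    """Split a planet name into (star, designation).

    Assumes the designation is either a single letter after the last space
    ("Star b") or digits after the last dot ("Star.02").
    """
    star, sep, designation = planet_name.rpartition(' ')
    if sep and len(designation) == 1 and designation.isalpha():
        return star, designation
    star, sep, designation = planet_name.rpartition('.')
    if sep and designation.isdigit():
        return star, designation
    # Name matches neither pattern: return it whole with no designation
    return planet_name, ''

for name in ["1RXS J160929.1-210524 b", "EPIC 201170410.02"]:
    print(split_planet_name(name))
```

Names that fit neither pattern come back unchanged with an empty designation, which makes them easy to find and handle by hand.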

In [None]:
exo

In [None]:
exo = exo.assign(Order=exo.get('name').apply(get_order),
                 Star=exo.get('name').apply(get_star))
exo

I also wound up not using this, but it's here if we ever want to use those columns.