# Exoplanet Discovery
For the past 25+ years, NASA has used ground- and space-based methods to [identify exoplanets](https://exoplanets.nasa.gov/exep/about/missions-instruments) (planets outside of our solar system). In the past ten years in particular, campaigns like Kepler, K2, and TESS have produced an explosion of results. To date, approximately 4,400 exoplanets have been identified, and over 3,000 potential exoplanet candidates have been discovered. In this notebook, we will use HoloViews and Panel together with Astropy to visualize exoplanet discovery since 1996.


In [11]:
import pandas as pd
import holoviews as hv
import hvplot.pandas  # registers the .hvplot accessor used on dataframes below
import panel as pn
from colorcet import fire

pn.extension()

# Loading data
For this notebook, we will be loading our exoplanet data from three different CSV files: [stars](data/stars.csv), a [dataset of 257,000 stars](https://www.kaggle.com/solorzano/257k-gaia-dr2-stars?select=257k-gaiadr2-sources-with-photometry.csv) identified by the European Gaia space mission; [exoplanets](data/exoplanets.csv), a collection of 480 exoplanets obtained from the [NASA Exoplanet Archive](https://exoplanetarchive.ipac.caltech.edu/); and [candidates](data/candidates.csv), a collection of approximately 3,000 candidate exoplanets collated from the [Kepler](http://exoplanets.org/table?datasets=kepler) and [TESS](https://exofop.ipac.caltech.edu/tess/view_toi.php) campaigns. Note that the 480 exoplanets used in this example are the only ones of the 4,400 identified exoplanets to have both mass and radius recorded.

In [12]:
stars = pd.read_csv("data/stars.csv")
exoplanets = pd.read_csv("data/exoplanets.csv")
candidates = pd.read_csv("data/candidates.csv")

If we were to use the entire ``stars`` dataframe in our plot, this program would run very slowly. Thus, we keep only the stars whose Gaia G-band mean magnitude (column ``phot_g_mean_mag``) is over 11, and of those, we sample 10% using ``.sample(frac=0.1)``. Note that on the astronomical magnitude scale, larger values correspond to fainter stars. Similarly, we sample 10% of the ``candidates`` dataframe, since it is much larger than the ``exoplanets`` dataframe and plotting all of it would overshadow the confirmed points.

In [13]:
stars = stars[stars["phot_g_mean_mag"]>11].sample(frac=0.1)
candidates = candidates.sample(frac=0.1)
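
The sampling above draws a different random subset on every run. As a minimal sketch, assuming a small made-up dataframe in place of ``stars``, passing ``random_state`` to ``.sample`` makes the subset reproducible. The seed value 42 and the toy magnitudes are arbitrary choices for illustration, not part of the original notebook.

```python
import pandas as pd

# Toy stand-in for the stars dataframe (magnitudes are made up for illustration)
toy_stars = pd.DataFrame({"phot_g_mean_mag": [9.5, 11.2, 12.8, 10.1, 13.4, 11.9,
                                              12.1, 14.0, 11.5, 12.6, 13.0, 15.2]})

# Same magnitude filter as above, followed by a seeded 50% sample
selected = toy_stars[toy_stars["phot_g_mean_mag"] > 11]
sample_a = selected.sample(frac=0.5, random_state=42)
sample_b = selected.sample(frac=0.5, random_state=42)

print(len(selected), len(sample_a))
print(sample_a.index.equals(sample_b.index))  # identical rows for identical seeds
```

Note that the sampled dataframe keeps the row labels of the original, so its index is no longer a contiguous 0..n-1 range.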

# Converting coordinates
Because our goal is to generate a map of the exoplanets and stars, we need a standardized coordinate system for all three of our dataframes. Here, we'll use the [Astropy](https://www.astropy.org/) package to perform coordinate transformations. The original datasets use an equatorial coordinate system, given by ``ra`` (right ascension) and ``dec`` (declination), but the specific notation varies among the datasets, and equatorial coordinates are less commonly used to visualize space. We will convert to galactic coordinates, a spherical coordinate system centered at the sun. Points in the galactic coordinate system are represented by two values: longitude (abbreviated "l") and latitude (abbreviated "b").

Using the Astropy ``SkyCoord`` class, we define two functions, ``eqtogalL`` and ``eqtogalB``, which convert equatorial coordinates to galactic coordinates. ``eqtogalL`` takes (``ra``, ``dec``) as arguments and produces the longitude ``l``, while ``eqtogalB`` takes (``ra``, ``dec``) and produces the latitude ``b``.

After converting, we create new columns in each of our dataframes for latitude and longitude. Note that the size of the ``stars`` dataset means that converting to galactic coordinates can take up to a minute or two. To speed up the process, you can sample a smaller fraction of the stars in the previous step.

In [14]:
from astropy import units as u
from astropy.coordinates import SkyCoord

def eqtogalL(a,b):
    "Convert right ascension and declination (degrees) to galactic longitude (degrees)"
    ret = SkyCoord(ra=a*u.degree,dec=b*u.degree,frame='icrs').galactic
    return float(ret.l.degree)

def eqtogalB(a,b):
    "Convert right ascension and declination (degrees) to galactic latitude (degrees)"
    ret = SkyCoord(ra=a*u.degree,dec=b*u.degree,frame='icrs').galactic
    return float(ret.b.degree)

# Assign plain lists rather than a new pd.Series: the filtered and sampled dataframes
# no longer have a 0..n-1 index, so a fresh Series would be misaligned and leave NaNs.
stars['l'] = [eqtogalL(a,b) for (a,b) in zip(stars["ra"],stars["dec"])]
stars['b'] = [eqtogalB(a,b) for (a,b) in zip(stars["ra"],stars["dec"])]

exoplanets['l'] = [eqtogalL(a,b) for (a,b) in zip(exoplanets["ra"],exoplanets["dec"])]
exoplanets['b'] = [eqtogalB(a,b) for (a,b) in zip(exoplanets["ra"],exoplanets["dec"])]

candidates['l'] = [eqtogalL(a,b) for (a,b) in zip(candidates["ra"],candidates["dec"])]
candidates['b'] = [eqtogalB(a,b) for (a,b) in zip(candidates["ra"],candidates["dec"])]

# The Goldilocks zone and the Tsiolkovsky rocket equation
One of the methods used to determine which exoplanets could potentially support life is to check whether liquid water could exist there. For water to be present on the planet as a liquid, the planet's temperature must be within a fairly narrow range, and therefore the planet must orbit within a certain range of distances from its star. Exoplanets within this range are said to be in the "Goldilocks zone."

If intelligent life were to exist on one of these planets, would it be capable of space travel? If the hypothetical life forms used similar methods to humans — for example, hydrogen- and oxygen-powered chemical rockets — would they even be able to leave their planet? A heavier rocket requires exponentially more fuel, but more fuel means more mass. The Tsiolkovsky rocket equation makes this question more precise:

$$\Delta v = v_e\ln\left(\frac{m_0}{m_f}\right),$$

where $\Delta v$ is the [impulse per mass unit](https://en.wikipedia.org/wiki/Impulse_(physics)) required for the rocket to travel its course, $v_e$ is [effective exhaust velocity](https://en.wikipedia.org/wiki/Specific_impulse#Specific_impulse_as_effective_exhaust_velocity), $m_0$ is the initial mass of the rocket, and $m_f$ is the final mass of the rocket (here, equal to $m_0$ minus the mass of the fuel spent on the flight).

To see the rocket equation in action, consider a planet of the same density as Earth with radius $R$ double Earth's and thus mass $M$ eight times Earth's. For the purposes of this example, we'll assume that $$\Delta v = \sqrt{\frac{GM}{R}},$$ where $G\approx 6.67\cdot 10^{-11}$ in SI units (in reality, some complicating factors exist, but our formula works as an approximation at relatively low altitudes$^*$). With $M\approx 4.78\cdot10^{25}$ kg and $R\approx 1.27\cdot10^7$ m, we get

$$\Delta v = \sqrt{\frac{6.67\cdot 10^{-11}\cdot 4.78\cdot10^{25}}{1.27\cdot10^7}}\approx 15844 \frac{\text{m}}{\text{s}}.$$

Using the [highest recorded exhaust velocity of a chemical rocket](https://en.wikipedia.org/wiki/Tripropellant_rocket#:~:text=In%20the%201960s%2C%20Rocketdyne%20fired,for%20a%20chemical%20rocket%20motor.), $5320\frac{\text{m}}{\text{s}},$ we'll calculate the approximate percentage of the rocket's mass that would have to be fuel in order to propel the rocket to $250$ m$^*$:

$$15844= 5320 \ln\left(\frac{m_0}{m_f}\right),$$

so

$$\frac{m_0}{m_f}\approx 19.7.$$

In other words, about $94.9\%$ of the rocket's initial mass must be fuel. For comparison, the rocket with the highest initial-to-final mass ratio ever built was the [Soyuz-FG](https://en.wikipedia.org/wiki/Soyuz-FG) rocket, which was $91\%$ fuel by mass. Moreover, we were very generous with the conditions used to compute the mass ratio to escape our imaginary planet. The exhaust velocity we used was only ever recorded for a highly corrosive, dangerous, expensive propellant that, with the current state of technology, is not feasible for use in space travel.

$^*$We won't go into detail here, but the $\Delta v$ calculation for $250$ m is derived from the [vis-viva equation](https://en.wikipedia.org/wiki/Vis-viva_equation).
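
To experiment with the rocket equation, it can be rearranged as $m_0/m_f = e^{\Delta v/v_e}$, which gives the fuel fraction $1 - m_f/m_0$. The sketch below evaluates this for Earth itself, using the same approximation $\Delta v \approx \sqrt{GM/R}$. The function names and the rounded Earth constants are assumptions chosen for illustration.

```python
import math

G = 6.67e-11           # gravitational constant, m^3 kg^-1 s^-2
EARTH_MASS = 5.97e24   # kg (rounded)
EARTH_RADIUS = 6.37e6  # m (rounded)

def approx_delta_v(mass_kg, radius_m):
    """Approximate delta-v near the surface: sqrt(G*M/R), in m/s."""
    return math.sqrt(G * mass_kg / radius_m)

def mass_ratio(delta_v, exhaust_velocity):
    """Rocket equation solved for the initial-to-final mass ratio m_0/m_f."""
    return math.exp(delta_v / exhaust_velocity)

def fuel_fraction(delta_v, exhaust_velocity):
    """Fraction of the initial mass that must be fuel: 1 - m_f/m_0."""
    return 1 - 1 / mass_ratio(delta_v, exhaust_velocity)

dv_earth = approx_delta_v(EARTH_MASS, EARTH_RADIUS)
print(f"delta-v: {dv_earth:.0f} m/s")
print(f"mass ratio: {mass_ratio(dv_earth, 5320):.2f}")
print(f"fuel fraction: {fuel_fraction(dv_earth, 5320):.1%}")
```

Because the mass ratio is exponential in $\Delta v$, even modest increases in planet size raise the required fuel fraction quickly.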

# Filtering by feasibility of space travel

We can use the rocket equation to get an idea of which exoplanets might be the right size to allow for space travel. Let's assume that the hypothetical life forms on an exoplanet can make a chemical rocket with exhaust velocity at most $5320\frac{\text{m}}{\text{s}}.$ Let's also say that they've figured out how to make rockets that are up to $95\%$ fuel by mass (so $\frac{m_0}{m_f}=20$). These two assumptions will allow us to make an educated guess of whether the mass and radius of the exoplanet would allow for space travel with these rockets:

$$\sqrt{\frac{GM}{R}}\approx \Delta v \leq 5320\ln{20}.$$

We can now define a function ``deltav`` that approximates $\Delta v$ for each exoplanet and returns ``True`` only if that value is small enough and the planet is flagged as potentially habitable, and ``False`` otherwise. We'll then add a corresponding column ``escapable`` to our dataframe.

In [44]:
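import math

G = 6.67e-11           # gravitational constant, m^3 kg^-1 s^-2
EARTH_MASS = 5.97e24   # kg
EARTH_RADIUS = 6.37e6  # m

def deltav(m,r,h):
    "Determine whether delta-v is sufficiently small for feasible space travel with chemical rockets"
    # Assumption: "mass" and "radius" are in Earth masses and Earth radii (as the widget
    # labels suggest), so they are converted to SI units before applying the formula
    dv = math.sqrt(G*m*EARTH_MASS/(r*EARTH_RADIUS))
    return bool(h) and dv <= 5320*math.log(20)

exoplanets['escapable'] = [deltav(m,r,h) for (m,r,h) in zip(exoplanets["mass"],exoplanets["radius"],
                                                            exoplanets['habitable'])]

# need to find some way to visualize this -- could color points or could have a filter option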
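
Because a planet of fixed density has $M \propto R^3$, the approximation $\sqrt{GM/R}$ grows linearly with $R$. Under the assumption of Earth-like density (which real exoplanets need not satisfy), this gives a rough upper bound on the radius of a planet that passes the test above. The constants and names below are assumptions for illustration.

```python
import math

G = 6.67e-11           # gravitational constant, m^3 kg^-1 s^-2
EARTH_MASS = 5.97e24   # kg (rounded)
EARTH_RADIUS = 6.37e6  # m (rounded)
MAX_DELTA_V = 5320 * math.log(20)  # limit from the two rocket assumptions, m/s

def delta_v_earth_density(radius_in_earth_radii):
    """Approximate delta-v (m/s) for a planet with Earth's density and the given radius."""
    mass = EARTH_MASS * radius_in_earth_radii ** 3
    radius = EARTH_RADIUS * radius_in_earth_radii
    return math.sqrt(G * mass / radius)

# delta-v is proportional to radius at fixed density, so the bound follows by division
max_radius = MAX_DELTA_V / delta_v_earth_density(1)
print(f"Largest escapable Earth-density planet: about {max_radius:.2f} Earth radii")
```

Planets that are less dense than Earth can be larger than this bound and still pass, which is why the notebook applies the test to the recorded mass and radius of each exoplanet rather than to radius alone.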

# Defining widgets

We will use Panel to define widgets for our dashboard: a slider representing discovery year, a checkbox determining whether to show unconfirmed exoplanets, a second checkbox determining whether to display only planets in the potentially habitable zone, and two dropdown menus to determine what the size and color of the points on the plot will represent.

In [None]:
year_slider = pn.widgets.RangeSlider(name='Discovery year range', start=1996, end=2021)
checkbox_candidates = pn.widgets.Checkbox(name='Show sampling of unconfirmed exoplanets')
checkbox_habitable = pn.widgets.Checkbox(name='Show only planets in potentially habitable zone')
select_size = pn.widgets.Select(name='Size points by:', options={"Earth radius":"radius", "Earth mass":"mass"})
select_color = pn.widgets.Select(name='Color points by:', options={"Earth radius":"radius", "Earth mass":"mass",
                                                                   "Temperature": "temperature"})

We'll also create a point representing the sun to orient users.

In [None]:
d = {'b':[0],'l':[0]}
origin = pd.DataFrame(data=d)

# Filtering and plotting points
To generate our plot, we'll need a function ``filter_df`` that takes the values of our widgets as input, uses them to filter the data, and outputs a plot of the relative positions of the exoplanets (and candidates, depending on whether the corresponding checkbox is selected) with the data points from ``stars`` as the background and a yellow point ``sun`` at (0,0) representing the sun.

Note that when "mass" is selected to determine the size of the points, we scale the points to 1% of the mass using ``size_scale``; this way, planets with large masses do not overwhelm the plot but the relative size of the points retains its meaning.

In [None]:
@pn.depends(year_slider, checkbox_candidates, checkbox_habitable, select_size, select_color)
def filter_df(year_range, checkbox_candidates, checkbox_habitable, select_size, select_color):
    # Keep confirmed exoplanets discovered within the selected range of years
    exo_lower = exoplanets.disc_year>=year_range[0]
    exo_upper = exoplanets.disc_year<=year_range[1]
    hab = exoplanets.habitable == True
    exo_filter = exo_lower & exo_upper
    if checkbox_habitable:
        exo_filter = exo_filter & hab
    filtered_exoplanets = exoplanets[exo_filter]
    # Longitude 'l' goes on the x axis and latitude 'b' on the y axis, matching the axis labels
    star_background = (stars.hvplot.scatter(x='l',y='b',datashade=True,
                                               color="phot_g_mean_mag",cmap=fire,
                                               colorbar=True))
    # Scale mass-based sizes to 1% so that massive planets do not overwhelm the plot
    # TODO: scaling by mass may not take effect, and selecting a different option
    # for coloring can make the sizing disappear
    size_scale = 0.01 if select_size == "mass" else 1
    overlay_points = (filtered_exoplanets.hvplot.scatter(x='l',y='b',color=select_color,
                                               xlabel='longitude (deg)',
                                               ylabel='latitude (deg)',
                                               clabel=select_color).opts(cmap='blues',
                                               size=hv.dim(select_size)*size_scale))
    sun = origin.hvplot.scatter(x='l',y='b',size=60,color="yellow")
    layers = [star_background, sun, overlay_points]
    if checkbox_candidates:
        # Apply the same year range to the sampled candidates
        can_lower = candidates.year>=year_range[0]
        can_upper = candidates.year<=year_range[1]
        can_mask = can_lower & can_upper
        filtered_candidates = candidates[can_mask]
        candidate_points = (filtered_candidates.hvplot.scatter(x='l',y='b',
                                                       size=30,color="#33FF36",alpha=0.5).opts(cmap='greens',
                                                                                               cnorm='log'))
        layers.append(candidate_points)
    return hv.Overlay(layers).collate().opts(bgcolor="black")
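
The core of ``filter_df`` is ordinary boolean masking. The sketch below isolates that step on a made-up dataframe so it can be checked without any plotting. The helper name ``filter_by_year`` and the toy rows are assumptions for illustration.

```python
import pandas as pd

def filter_by_year(df, year_range, habitable_only=False, year_column="disc_year"):
    """Keep rows whose year lies in the inclusive range, optionally only habitable ones."""
    mask = df[year_column].between(year_range[0], year_range[1])
    if habitable_only:
        mask &= df["habitable"]
    return df[mask]

# Toy stand-in for the exoplanets dataframe
toy_exoplanets = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "disc_year": [1999, 2009, 2014, 2020],
    "habitable": [False, True, False, True],
})

print(filter_by_year(toy_exoplanets, (2005, 2020))["name"].tolist())
print(filter_by_year(toy_exoplanets, (2005, 2020), habitable_only=True)["name"].tolist())
```

``Series.between`` is inclusive on both ends by default, which matches the ``>=`` and ``<=`` comparisons used in ``filter_df``.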
    

Then we'll define a function ``radius_mass`` that outputs a scatterplot of mass compared to radius of confirmed exoplanets, with points colored according to habitability.

In [None]:
# would like some other simple plots -- ideas?
def radius_mass():
    habitable = exoplanets[exoplanets['habitable']==True]
    uninhabitable = exoplanets[exoplanets['habitable']==False]
    habitable_points = habitable.hvplot.scatter(x='mass',y='radius',color="red",
                                                label="Potentially habitable",size=30)
    uninhabitable_points = uninhabitable.hvplot.scatter(x='mass',y='radius',
                                                        color="blue",alpha=0.5,
                                                        label="Uninhabitable",size=10)
    return uninhabitable_points*habitable_points

# Putting it all together
Finally, we create a panel from our widgets and plots to display the final dashboard.

In [None]:
filtered_view = pn.Row(
    pn.Column(year_slider, select_size, select_color, checkbox_candidates, checkbox_habitable,
              pn.panel(filter_df, width=800),pn.Row(pn.panel(radius_mass,width=400))))

filtered_view