# Exoplanet Discovery
For more than 25 years, NASA has used ground- and space-based methods to [identify exoplanets](https://exoplanets.nasa.gov/exep/about/missions-instruments) (planets outside of our solar system). In the past ten years in particular, campaigns like Kepler, K2, and TESS have produced an explosion of results. To date, approximately 4,400 exoplanets have been identified, and over 3,000 exoplanet candidates have been discovered. In this notebook, we will use hvPlot, HoloViews, and Panel together with Astropy to visualize exoplanet discovery since 1996.


In [None]:
import pandas as pd
import hvplot.pandas
import holoviews as hv
import panel as pn
from colorcet import fire

pn.extension()

# Loading data
For this notebook, we will be loading our exoplanet data from three different CSV files: [stars](data/stars.csv), a [dataset of 257,000 stars](https://www.kaggle.com/solorzano/257k-gaia-dr2-stars?select=257k-gaiadr2-sources-with-photometry.csv) identified by the European Gaia space mission; [exoplanets](data/exoplanets.csv), a collection of 480 exoplanets obtained from the [NASA Exoplanet Archive](https://exoplanetarchive.ipac.caltech.edu/); and [candidates](data/candidates.csv), a collection of approximately 3,000 candidate exoplanets collated from the [Kepler](http://exoplanets.org/table?datasets=kepler) and [TESS](https://exofop.ipac.caltech.edu/tess/view_toi.php) campaigns. Note that the 480 exoplanets used in this example are the only ones of the 4,400 identified exoplanets to have both mass and radius recorded.

In [None]:
stars = pd.read_csv("data/stars.csv")
exoplanets = pd.read_csv("data/exoplanets.csv")
candidates = pd.read_csv("data/candidates.csv")

If we were to use the entire ``stars`` dataframe in our plot, this program would run very slowly. Thus, we keep only the stars whose G magnitude (the ``phot_g_mean_mag`` column) is over 11, and of those, we sample 10% using ``.sample(frac=0.1)``. Because magnitude is an inverse scale, a value over 11 selects the fainter stars in the dataset. Similarly, we sample 10% of the ``candidates`` dataframe, since it is much larger than ``exoplanets``, and plotting the entire ``candidates`` dataframe would overshadow the confirmed points.

In [None]:
stars = stars[stars["phot_g_mean_mag"]>11].sample(frac=0.1)
candidates = candidates.sample(frac=0.1)
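
Because ``.sample`` draws a different random subset on every run, the plot changes each time the notebook is executed. The following is a minimal sketch of a reproducible variant, assuming a fixed seed is acceptable. The ``random_state`` value and the small stand-in dataframe are illustrative and not part of the original notebook.

```python
import pandas as pd

# Hypothetical stand-in for the stars table; the real data comes from data/stars.csv
demo_stars = pd.DataFrame({
    "phot_g_mean_mag": [9.5, 10.8, 11.2, 11.9, 12.4, 13.0,
                        11.1, 12.7, 10.2, 11.6, 12.0, 13.3, 11.4],
})

# Same magnitude filter as above, followed by a seeded sample.
# frac=0.5 is used only because the demo table is tiny.
filtered = demo_stars[demo_stars["phot_g_mean_mag"] > 11]
sample_a = filtered.sample(frac=0.5, random_state=42)
sample_b = filtered.sample(frac=0.5, random_state=42)

# Both draws select the same rows because the seed is fixed
print(sample_a.index.equals(sample_b.index))
```

Passing the same ``random_state`` to the ``stars`` and ``candidates`` samples would make the dashboard show the same points on every run.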

# Converting coordinates
Because our goal is to generate a map of the exoplanets and stars, we need a standardized coordinate system for all three of our dataframes. Here, we'll use the [Astropy](https://www.astropy.org/) package to perform coordinate transformations. The original datasets use an equatorial coordinate system, given by ``ra`` (right ascension) and ``dec`` (declination), but the specific notation varies among the datasets, and equatorial coordinates are less commonly used to visualize space. We will convert to galactic coordinates, a spherical coordinate system centered at the Sun. Points in the galactic coordinate system are represented by two values: longitude (abbreviated "l") and latitude (abbreviated "b").
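
To make the transformation concrete, here is a rough sketch of the underlying spherical trigonometry using only the standard library. It is not how the notebook performs the conversion (Astropy does that below), and the J2000 pole constants and the ``eq_to_gal`` name are assumptions introduced for illustration; results may differ from Astropy's by a very small amount.

```python
import math

# J2000 orientation of the galactic frame (assumed constants for this sketch)
RA_NGP = math.radians(192.85948)   # right ascension of the north galactic pole
DEC_NGP = math.radians(27.12825)   # declination of the north galactic pole
L_NCP = math.radians(122.93192)    # galactic longitude of the north celestial pole

def eq_to_gal(ra_deg, dec_deg):
    """Return galactic (l, b) in degrees for equatorial (ra, dec) in degrees."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    d_ra = ra - RA_NGP
    # Latitude from the angular distance to the north galactic pole
    sin_b = (math.sin(dec) * math.sin(DEC_NGP)
             + math.cos(dec) * math.cos(DEC_NGP) * math.cos(d_ra))
    b = math.asin(max(-1.0, min(1.0, sin_b)))  # clamp against rounding error
    # Longitude measured relative to the north celestial pole
    y = math.cos(dec) * math.sin(d_ra)
    x = (math.sin(dec) * math.cos(DEC_NGP)
         - math.cos(dec) * math.sin(DEC_NGP) * math.cos(d_ra))
    l = (L_NCP - math.atan2(y, x)) % (2 * math.pi)
    return math.degrees(l), math.degrees(b)

# Approximate J2000 direction of the Galactic Center, which should map
# to roughly l = 0 (or equivalently 360) and b = 0
l_gc, b_gc = eq_to_gal(266.40499, -28.93617)
```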

Using the Astropy ``SkyCoord`` class, we define two functions, ``eqtogalL`` and ``eqtogalB``, which convert equatorial coordinates to galactic coordinates. ``eqtogalL`` takes (``ra``, ``dec``) as arguments and produces the longitude ``l``, while ``eqtogalB`` takes (``ra``, ``dec``) and produces the latitude ``b``.

After converting, we create new columns in each of our dataframes for latitude and longitude. Note that the size of the ``stars`` dataset means that converting to galactic coordinates can take up to a minute or two. To speed up the process, you can sample a smaller fraction of the stars in the previous step.

In [None]:
from astropy import units as u
from astropy.coordinates import SkyCoord

def eqtogalL(a, b):
    "Convert right ascension and declination (degrees) to galactic longitude"
    ret = SkyCoord(ra=a*u.degree, dec=b*u.degree, frame='icrs').galactic
    # to_string("decimal") yields "l b"; the first field is the longitude
    return float(ret.to_string("decimal").split()[0])

def eqtogalB(a, b):
    "Convert right ascension and declination (degrees) to galactic latitude"
    ret = SkyCoord(ra=a*u.degree, dec=b*u.degree, frame='icrs').galactic
    # The second field of the "l b" string is the latitude
    return float(ret.to_string("decimal").split()[1])

# Assign plain lists rather than pd.Series: a new Series gets a fresh 0..n-1 index,
# which would not align with the filtered and sampled dataframes and would leave NaNs.
stars['l'] = [eqtogalL(a, b) for (a, b) in zip(stars["ra"], stars["dec"])]
stars['b'] = [eqtogalB(a, b) for (a, b) in zip(stars["ra"], stars["dec"])]

exoplanets['l'] = [eqtogalL(a, b) for (a, b) in zip(exoplanets["ra"], exoplanets["dec"])]
exoplanets['b'] = [eqtogalB(a, b) for (a, b) in zip(exoplanets["ra"], exoplanets["dec"])]

candidates['l'] = [eqtogalL(a, b) for (a, b) in zip(candidates["ra"], candidates["dec"])]
candidates['b'] = [eqtogalB(a, b) for (a, b) in zip(candidates["ra"], candidates["dec"])]

# Defining widgets

We will use Panel to define widgets for our dashboard: a slider representing discovery year, a checkbox determining whether to show unconfirmed exoplanets, a second checkbox determining whether to display only planets in the potentially habitable zone (those whose distance from a star would allow for liquid water on their surface), and two dropdown menus to determine what the size and color of the points on the plot will represent.

In [None]:
year_slider = pn.widgets.RangeSlider(name='Discovery year range', start=1996, end=2021)
checkbox_candidates = pn.widgets.Checkbox(name='Show sampling of unconfirmed exoplanets')
checkbox_habitable = pn.widgets.Checkbox(name='Show only planets in potentially habitable zone')
select_size = pn.widgets.Select(name='Size points by:', options={"Earth radius":"radius", "Earth mass":"mass"})
select_color = pn.widgets.Select(name='Color points by:', options={"Earth radius":"radius", "Earth mass":"mass",
                                                                   "Temperature": "temperature"})

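We'll also create a point at the origin of the map, (l, b) = (0, 0), to orient users. The code calls this point ``sun``; note that on a sky map in galactic coordinates, (0, 0) is the direction from the Sun toward the Galactic Center.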

In [None]:
d = {'b':[0],'l':[0]}
origin = pd.DataFrame(data=d)

# Filtering and plotting points
To generate our plot, we'll need a function ``filter_df`` that takes the values of our widgets as input, uses them to filter the data, and outputs a plot of the relative positions of the exoplanets (and candidates, depending on whether the corresponding checkbox is selected) with the data points from ``stars`` as the background and a yellow point ``sun`` at (0,0) representing the sun.

Note that when "mass" is selected to determine the size of the points, we scale the points to 1% of the mass using ``size_scale``; this way, planets with large masses do not overwhelm the plot, but the relative size of the points retains its meaning.
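
The effect of the scaling can be sketched outside the plot. The ``point_sizes`` helper and the two demo rows (roughly Earth and Jupiter, in Earth units) are illustrative assumptions and are not taken from the notebook's data.

```python
import pandas as pd

def point_sizes(df, column):
    """Hypothetical helper: marker sizes for `column`, with masses scaled to 1%."""
    size_scale = 0.01 if column == "mass" else 1
    return size_scale * df[column]

# Illustrative values in Earth radii and Earth masses
demo = pd.DataFrame({"radius": [1.0, 11.2], "mass": [1.0, 317.8]})

mass_sizes = point_sizes(demo, "mass")      # Jupiter-like row shrinks to about 3.2
radius_sizes = point_sizes(demo, "radius")  # radii are used unchanged
```

Without the factor, the Jupiter-like row would request a marker more than 300 units across, while the ratio between the two rows is the same with or without scaling.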

In [None]:
@pn.depends(year_slider, checkbox_candidates, checkbox_habitable, select_size, select_color)
def filter_df(year_range, checkbox_candidates, checkbox_habitable, select_size, select_color):
    # Keep confirmed exoplanets discovered within the selected year range
    exo_lower = exoplanets.disc_year >= year_range[0]
    exo_upper = exoplanets.disc_year <= year_range[1]
    exo_filter = exo_lower & exo_upper
    if checkbox_habitable:
        exo_filter = exo_filter & (exoplanets.habitable == True)
    filtered_exoplanets = exoplanets[exo_filter]
    # Longitude (l) goes on x and latitude (b) on y so the axis labels match the data
    star_background = stars.hvplot.scatter(x='l', y='b', datashade=True,
                                           color="phot_g_mean_mag", cmap=fire,
                                           colorbar=True)
    # Scale masses to 1% so that massive planets do not overwhelm the plot.
    # Known issue: scaling by mass is not working anymore, and if you select a
    # different option for coloring, the sizing disappears
    size_scale = 0.01 if select_size == "mass" else 1
    overlay_points = filtered_exoplanets.hvplot.scatter(
        x='l', y='b', color=select_color,
        xlabel='longitude (deg)', ylabel='latitude (deg)',
        clabel=select_color).opts(cmap='blues', size=size_scale*hv.dim(select_size))
    sun = origin.hvplot.scatter(x='l', y='b', size=60, color="yellow")
    layers = [star_background, sun, overlay_points]
    if checkbox_candidates:
        # Apply the same year range to the sampled candidates
        can_lower = candidates.year >= year_range[0]
        can_upper = candidates.year <= year_range[1]
        filtered_candidates = candidates[can_lower & can_upper]
        candidate_points = filtered_candidates.hvplot.scatter(
            x='l', y='b', size=30, color="#33FF36",
            alpha=0.5).opts(cmap='greens', cnorm='log')
        layers.append(candidate_points)
    return hv.Overlay(layers).collate().opts(bgcolor="black")
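
The filtering step inside ``filter_df`` can also be written as a plain function, which makes it easy to check without any widgets or plots. This is only a sketch: ``filter_exoplanets`` is a hypothetical helper, the demo rows are made up, and it assumes ``habitable`` is a boolean column.

```python
import pandas as pd

def filter_exoplanets(df, year_range, habitable_only=False):
    """Hypothetical helper mirroring the boolean masks built in filter_df."""
    # between() is inclusive on both ends, matching the >= and <= comparisons
    mask = df["disc_year"].between(year_range[0], year_range[1])
    if habitable_only:
        mask = mask & df["habitable"]
    return df[mask]

# Made-up rows for illustration only
demo = pd.DataFrame({
    "disc_year": [1996, 2009, 2014, 2020],
    "habitable": [False, False, True, True],
})

recent = filter_exoplanets(demo, (2010, 2021))
recent_habitable = filter_exoplanets(demo, (2010, 2015), habitable_only=True)
```

Here ``recent`` keeps the 2014 and 2020 rows, and ``recent_habitable`` keeps only the 2014 row.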
    

Then we'll define a function ``radius_mass`` that outputs a scatterplot of mass compared to radius for the confirmed exoplanets, with points colored according to habitability.

In [None]:
# Mass vs. radius for confirmed exoplanets, split by habitability
def radius_mass():
    habitable = exoplanets[exoplanets['habitable']==True]
    uninhabitable = exoplanets[exoplanets['habitable']==False]
    habitable_points = habitable.hvplot.scatter(x='mass',y='radius',color="red",
                                                label="Potentially habitable",size=30)
    uninhabitable_points = uninhabitable.hvplot.scatter(x='mass',y='radius',
                                                        color="blue",alpha=0.5,
                                                        label="Uninhabitable",size=10)
    return uninhabitable_points*habitable_points

# Putting it all together
Finally, we create a panel from our widgets and plots to display the final dashboard.

In [None]:
filtered_view = pn.Row(
    pn.Column(year_slider, select_size, select_color, checkbox_candidates, checkbox_habitable,
              pn.panel(filter_df, width=800),pn.Row(pn.panel(radius_mass,width=400))))

filtered_view