# Satellite science data - Exoplanets

## Introduction - the importance of Data

Every satellite that we send into space has a $\textbf{mission}$. In the case of telecommunication satellites the mission is to provide communication services; GPS satellites are launched to provide navigation services; Earth observation satellites monitor our planet (climate change, ocean dynamics, deforestation, etc.); and space telescopes are used to study astrophysical phenomena.

In all cases, these satellites are not just flying around doing nothing. They are constantly generating $\textbf{data}$, and it is what we do with that data that justifies the mission in the first place. Therefore, $\textbf{data processing}$ is an integral part of any satellite mission, no matter its particular characteristics.

With this $\textbf{Hands-On Project}$ we will learn how scientists treat the data generated by satellites to produce some exciting scientific results. Specifically, we will look into the field of $\textbf{exoplanets}$.

## Exoplanet research

$\textbf{Exoplanets}$ are planets that orbit stars outside our Solar System. Nowadays we know that there are many of these planets orbiting stars in our Galaxy, and some of them are similar in size to our own home planet. This is a very exciting topic because some of these planets could be candidates for hosting some form of life. To find and characterize exoplanets, several space missions have already been launched and more are planned for the near future.

The most famous one is probably $\textbf{Kepler}$, a NASA mission launched in 2009 which managed to find over $\textbf{1200}$ exoplanets.

<img src="Kepler.jpg",width=1000,height=600>
$\textit{Kepler space observatory, a NASA mission to search for exoplanets}$


But how did Kepler manage to detect the presence of exoplanets around stars that are many light-years away from our Solar System?

## How do we find exoplanets?

There are several techniques that can be used to look for exoplanets. Here we focus on two of the most widely used:

(1) $\textbf{Transit method}$: this is the one Kepler used. When an exoplanet passes in front of its host star, it produces a mini-$\textit{eclipse}$, partially blocking the light. This creates a momentary decrease in brightness which is usually extremely small but detectable with sophisticated instruments. As the planet orbits the star with a fixed period, this 'dip' is observed at regular intervals and can be used to confirm the presence of the exoplanet.

<img src="Transit.jpeg",width=500,height=500>
$\textit{The transit method used to detect exoplanets}$

(2) $\textbf{Radial velocity method}$: this technique is slightly more complicated to understand, but it essentially boils down to measuring the Doppler shift that the exoplanet induces in the light of its host star.


## The exercise

We will look at simulated data of a $\textbf{transit}$ just like the ones Kepler observed. We will mimic the process scientists follow when analyzing this sort of data, and we will draw conclusions regarding the $\textbf{size}$ of the exoplanet.

Let's begin by taking a look at the data we will be using.

In [1]:
import numpy as np
import matplotlib.pyplot as plt

<img src="light.png",width=500,height=500>
<img src="transit.png",width=500,height=500>


As we can see, the light curve is approximately flat except for a periodic dip, which suggests the presence of an exoplanet transiting in front of the star.
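To see how such a light curve arises, and how the spacing of the dips reveals the orbital period, here is a minimal sketch that builds a synthetic, box-shaped light curve and measures the time between consecutive dips. All the numbers (period, transit duration, depth, noise level) and variable names are assumptions chosen for illustration; they are not the data shown in the figures above.

```python
import numpy as np

# Hypothetical transit parameters (assumptions for illustration only)
period_days = 4.25      # time between consecutive transits [days]
duration_days = 0.15    # duration of each transit [days]
depth = 0.023           # fractional drop in brightness during a transit

# Time stamps: 20 days of observations, one sample every 0.01 days
time = np.arange(0.0, 20.0, 0.01)

# Box-shaped model: brightness is 1.0 out of transit, 1 - depth in transit.
# The first transit is assumed to start at t = 1 day.
phase = (time - 1.0) % period_days
in_transit = phase < duration_days
brightness = np.where(in_transit, 1.0 - depth, 1.0)

# Add a little Gaussian noise to mimic a real instrument
rng = np.random.default_rng(seed=0)
brightness_noisy = brightness + rng.normal(0.0, 0.001, size=time.size)

# A dip starts where a sample falls below a threshold halfway between
# the two brightness levels while the previous sample was above it
below = brightness_noisy < 1.0 - depth / 2
starts = time[1:][below[1:] & ~below[:-1]]

# The orbital period is the average spacing between consecutive dips
estimated_period = np.mean(np.diff(starts))
print("Number of transits found:", len(starts))
print("Estimated period: %.2f days" % estimated_period)
```

Plotting `time` against `brightness_noisy` with `plt.plot` should give a curve qualitatively similar to the figures above.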

## (0) Estimate the orbital period of the exoplanet $P$

The orbital period $P$ can be read directly from the light curve as the time between two consecutive dips. The brightness $B$ here is defined as the ratio with respect to the nominal brightness of the star, such that when there is no eclipse the brightness is 1.0, and while the exoplanet is transiting the brightness is about 0.977 (around 97.7% of the nominal value).

\begin{equation}
B = \frac{B_{\text{observed}}}{B_{\text{star}}}
\end{equation}

As the brightness is proportional to the area of the star that we actually see, during the transit the brightness becomes:

\begin{equation}
B = \frac{A_{\text{star}} - A_{\text{planet}}}{A_{\text{star}}} = 1 - \frac{A_{\text{planet}}}{A_{\text{star}}} = 1 - \frac{\pi R_{\text{planet}}^2}{\pi R_{\text{star}}^2}
\end{equation}

This depends only on the ratio between the $\textbf{radius}$ of the exoplanet and the $\textbf{radius}$ of the star. Based on this we can conclude that if the planet is very large (like Jupiter) it will cover a significant portion of the star during the transit, and thus the dip in brightness will be larger.
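As a rough illustration of how strongly the dip depends on planet size, the sketch below evaluates the formula above for a Jupiter-sized and an Earth-sized planet crossing a Sun-sized star. The radii are rounded reference values and the function name is just a label chosen for this example.

```python
# Approximate mean radii in kilometres (rounded reference values)
R_SUN_KM = 695_700
R_JUPITER_KM = 69_911
R_EARTH_KM = 6_371


def transit_brightness(r_planet, r_star):
    """Relative brightness during transit: B = 1 - (R_planet / R_star)^2."""
    return 1.0 - (r_planet / r_star) ** 2


b_jupiter = transit_brightness(R_JUPITER_KM, R_SUN_KM)
b_earth = transit_brightness(R_EARTH_KM, R_SUN_KM)

print("Jupiter-sized planet: B = %.5f" % b_jupiter)
print("Earth-sized planet:   B = %.5f" % b_earth)
```

The Jupiter-sized planet produces a dip of about 1%, while the Earth-sized one dims the star by less than 0.01%, which is why small planets are so much harder to detect.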

## (1) Estimating the planet Radius $R_p$

We can assume that this star is similar to our own Sun; after all, the Sun is a pretty average star. This allows us to estimate the $\textbf{radius}$ of the exoplanet from the dip in brightness:

\begin{equation}
R_{\text{planet}}^2 = (1 - B) R_{\text{star}}^2
\end{equation}

\begin{equation}
R_{\text{planet}} = \sqrt{(1 - B)} R_{\text{star}}
\end{equation}

In our case, by looking at the light curve we can estimate that the brightness observed during the transit is around $\textbf{0.977}$ so:

\begin{equation}
R_{\text{planet}} = \sqrt{(1 - 0.977)} R_{\text{star}} = 0.15 R_{\text{star}}
\end{equation}

In exoplanet research, it is common to express the size of planets with respect to the size of Jupiter, instead of with respect to the Sun. The radius of the Sun is around 9.9484 times that of Jupiter, so:

\begin{equation}
R_{\text{planet}} = 0.15 R_{\text{star}} = 0.15 \cdot 9.9484  R_{\text{Jupiter}} = 1.49 R_{\text{Jupiter}}
\end{equation}

With this information we can conclude that the exoplanet we are looking at is approximately $\textbf{1.5}$ times the size of Jupiter; so a pretty large exoplanet.
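The same arithmetic can be sketched in a few lines of Python, which also shows how sensitive the result is to the brightness value we read off the plot. The variable and function names are illustrative; the Sun-to-Jupiter ratio is the one quoted above. Note that, without the intermediate rounding to 0.15, the result comes out slightly above 1.5 rather than 1.49.

```python
import math

B_TRANSIT = 0.977          # brightness read off the light curve during transit
R_SUN_IN_R_JUP = 9.9484    # Sun-to-Jupiter radius ratio quoted in the text


def radius_in_jupiter_radii(brightness):
    """Planet radius in Jupiter radii, assuming a Sun-sized host star."""
    return math.sqrt(1.0 - brightness) * R_SUN_IN_R_JUP


r_in_rjup = radius_in_jupiter_radii(B_TRANSIT)
print("B = %.3f -> R_planet = %.2f R_Jupiter" % (B_TRANSIT, r_in_rjup))

# A small reading error in B changes the radius only modestly
for b in (0.975, 0.979):
    print("B = %.3f -> R_planet = %.2f R_Jupiter" % (b, radius_in_jupiter_radii(b)))
```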

## (2) Estimating the planet Mass $M_p$

The Transit Method relies only on brightness measurements, which depend on the $\textbf{size}$ of the exoplanet. Therefore, it cannot provide any estimate of the $\textbf{mass}$, as this does not affect the light curve.

We need to use another method for that. The $\textbf{Radial Velocity}$ method is the most suitable for this task. In order to understand how it works we need to introduce some concepts first.

### Solar spectral lines

The image below shows the typical spectrum of sunlight, i.e. light decomposed into its different colours ($\textit{wavelengths}$). We can clearly see many dark lines, which correspond to $\textbf{absorption}$ by atoms and molecules in the solar atmosphere. This means that at certain wavelengths (which are very well known) much less light reaches us from the Sun.
<img src="sunx.jpg",width=1000,height=600>
Spectral Lines in the Solar Spectrum (National Optical Astronomy Observatory)

### But how does this help us?
Well, all stars including the ones hosting exoplanets have a similar behaviour and have spectral lines. However, because of the presence of exoplanets, stars move around ever so slightly. This means that depending on the orbital position of the exoplanet, stars will be moving $\textbf{towards us}$ or $\textbf{away from us}$ periodically. 
<img src="RV_concept.png",width=1000,height=600>

Consequently, we can detect the $\textbf{Doppler shift}$ of the star's light, because the spectral lines move back and forth with the same period as the exoplanet. These are called $\textbf{Radial Velocity}$ measurements, and they look very much like a periodic signal.

<img src="RV_plot1.png",width=1000,height=600>
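To get a feeling for how small these shifts are, the sketch below uses the non-relativistic Doppler relation $\Delta\lambda / \lambda = v / c$. The radial speed of 50 m/s and the choice of a hydrogen line near 656.3 nm are assumptions picked for illustration, not values taken from the plot above.

```python
C_LIGHT = 299_792_458.0     # speed of light [m/s]


def doppler_shift_nm(rest_wavelength_nm, radial_velocity_ms):
    """Wavelength shift for speeds much smaller than the speed of light."""
    return rest_wavelength_nm * radial_velocity_ms / C_LIGHT


rest_wavelength_nm = 656.3   # assumed: a hydrogen absorption line
radial_velocity_ms = 50.0    # assumed: star's speed along the line of sight

shift_nm = doppler_shift_nm(rest_wavelength_nm, radial_velocity_ms)
print("Shift of the spectral line: %.2e nm" % shift_nm)
```

The line moves by only about one ten-thousandth of a nanometre, which is why radial velocity measurements require extremely stable, high-resolution spectrographs.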

The semi-amplitude of the signal $K$ directly depends on the orbital parameters of the exoplanet:

\begin{equation}
K = \left( \frac{2 \pi G}{P} \right) ^{1/3} \frac{M_{\text{planet}} \sin i}{\left( M_{\text{planet}} + M_{\text{star}} \right)^{2/3}} \frac{1}{\sqrt{1 - e^2}}
\end{equation}

Where $G$ is Newton's gravitational constant, $P$ is the orbital period of the exoplanet, $i$ is the inclination of the orbit, and $e$ is the eccentricity. For simplicity we will assume that $\sin i=1$, $e=0$ and that $M_{\text{planet}} \ll M_{\text{star}}$, so that the formula simplifies to:

\begin{equation}
K = \left( \frac{2 \pi G}{P} \right) ^{1/3} \frac{M_{\text{planet}}}{\left( M_{\text{star}} \right)^{2/3}}
\end{equation}
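
Since we want the mass rather than $K$, it helps to rearrange the simplified formula. Solving for $M_{\text{planet}}$ under the same assumptions ($\sin i = 1$, $e = 0$, $M_{\text{planet}} \ll M_{\text{star}}$) gives:

\begin{equation}
M_{\text{planet}} = K \left( \frac{P}{2 \pi G} \right)^{1/3} M_{\text{star}}^{2/3}
\end{equation}

For the units to be consistent with $G$ in SI units, $K$ must be in m/s, $P$ in seconds and the masses in kg.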

In order to estimate the $\textbf{mass}$ of the exoplanet, simply estimate the semi-amplitude $K$ of the signal and use the cell below to take care of the operations.


In [16]:
G = 6.67e-11      # Gravitational Constant [m^3 kg^-1 s^-2]
M_star = 2e30     # Mass of the Star [kg]
M_jup = 1.89e27  # Mass of Jupiter [kg]

# User inputs
K = 56             # Semi-amplitude [m/s]
P_days = 4.25      # Orbital period [days]

# Outputs
P = P_days * 24 * 3600      # Orbital period converted to seconds [s]
# Simplified K formula solved for the planet mass [kg]
M_p = K * (2 * np.pi * G / P)**(-1/3) * (M_star)**(2/3)

print('Assuming a K of %.2f [m/s]' %K)
print('Assuming a period P of %.2f [days]' %P_days)
print('The estimated mass of the exoplanet is %.3f M_jupiter' %(M_p/M_jup))

Assuming a K of 56.00 [m/s]
Assuming a period P of 4.25 [days]
The estimated mass of the exoplanet is 0.450 M_jupiter


## Solving the Mystery

Let's make a summary of the exoplanet parameters we have estimated:

$\textbf{Period}$ - 4.25 days

$\textbf{Radius}$ - 1.49 radius of Jupiter

$\textbf{Mass}$ - 0.45 mass of Jupiter

### Based on this data can you estimate what kind of exoplanet this is?
Try to think about its size and mass, using Jupiter as a reference.
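One way to combine size and mass into a single number is the mean density, which scales as mass divided by radius cubed. The sketch below is only a hint under the estimates summarized above; the value used for Jupiter's mean density is a rounded reference figure, and the variable names are illustrative.

```python
# Estimates from the summary, expressed in Jupiter units
radius_rjup = 1.49
mass_mjup = 0.45

# Density scales as mass / radius^3, so in units of Jupiter's density:
density_ratio = mass_mjup / radius_rjup ** 3

# Jupiter's mean density is roughly 1.33 g/cm^3 (rounded reference value)
RHO_JUPITER = 1.33
density = density_ratio * RHO_JUPITER

print("Density relative to Jupiter: %.2f" % density_ratio)
print("Approximate mean density: %.2f g/cm^3" % density)
```

A planet that is larger than Jupiter but much less dense, on an orbit of only a few days, narrows down the possibilities considerably.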

NASA keeps an online database of all confirmed exoplanets with their measured properties. Let's try to identify our mysterious exoplanet in this database:

https://exoplanetarchive.ipac.caltech.edu/cgi-bin/TblView/nph-tblView?app=ExoTbls&config=planets

## Discussion - The problem of bias

The concept of $\textit{bias}$ in research essentially means that external factors can cause a discrepancy between what we observe and what is actually true. After all, research is done by humans, so one such factor can be $\textit{prejudice}$, which leads to $\textit{confirmation bias}$: the tendency to draw the conclusions that we initially believed to be true.

Even in a perfectly rational environment with no human bias, there are some intrinsic biases in the $\textit{methodology}$. To illustrate this issue let's think of the following analogy:

Imagine you are a very lazy $\textbf{marine biologist}$ and you want to study the populations of fish in the North Atlantic Ocean. You have decided to build an automated ship that goes around every morning catching fish with a $\textbf{net}$ of a certain hole size. Once it has finished fishing, a robot inside the ship classifies the fish of that day according to their size. At the end of the day, the ship sends you a report with the results of your research, including the number of fish caught, classified by size.

Your research is completely automated and is not influenced by human interaction. 

### Is your research unbiased? Discuss with your colleagues any possible source of bias



Your research is obviously biased because of your methodology. The net has a minimum hole size, so any fish smaller than that will simply escape. Any statistical analysis of the fish population will therefore miss a significant portion of it, and your conclusions will not hold unless you account for the bias.
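The effect of the net can be sketched with a small simulation. Everything here is hypothetical: the shape of the "true" fish population, the typical fish length and the hole size are assumptions made up purely to illustrate the selection effect.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical true population: fish lengths in cm, log-normally distributed
# around a typical length of 20 cm
true_lengths = rng.lognormal(mean=np.log(20.0), sigma=0.5, size=100_000)

NET_HOLE_CM = 15.0   # assumed: fish shorter than this slip through the net

# The ship only ever reports the fish that the net can hold
caught = true_lengths[true_lengths >= NET_HOLE_CM]

true_mean = true_lengths.mean()
caught_mean = caught.mean()
fraction_missed = 1.0 - caught.size / true_lengths.size

print("Mean length of all fish:    %.1f cm" % true_mean)
print("Mean length of caught fish: %.1f cm" % caught_mean)
print("Fraction of fish missed:    %.0f%%" % (100 * fraction_missed))
```

The report overestimates the typical fish size and says nothing about the small fish, even though no human ever touched the data.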

### Bias in exoplanet research

Just like the problem of the net, the methods we use for exoplanet research are biased. This doesn't mean they are "bad" or that we shouldn't use them. Think of it as "$\textit{limitations}$"; there is no perfect net to catch all kinds of fish, and there is no perfect method to detect all kinds of exoplanets.

### Can you think of reasons the Transit Method is biased? What kind of exoplanets can it detect?