# Project 5: How can we measure the masses of exoplanets?

Exoplanets -- planets orbiting stars other than our own Sun -- are a rapidly growing area of research in astronomy, with thousands of confirmed detections in recent years. One of the key methods used to discover and characterize these planets is the [**radial velocity (RV) method**](https://www.planetary.org/articles/color-shifting-stars-the-radial-velocity-method#:~:text=The%20radial%2Dvelocity%20method%20for,tug%20of%20its%20smaller%20companion.), which measures the small changes in the velocity of a host star due to the gravitational influence of an orbiting planet. By analyzing the star's velocity changes over time, we can estimate the planet’s orbital period and minimum mass.

In this project, you'll be assigned a specific exoplanet to study. You'll use observations of the planet's host star to make your own measurements of the planet's mass, and explore how the assumptions you make about the host star’s properties affect your mass estimate. 

---

## Data 

The radial velocity data for your assigned exoplanet system will come from the California Legacy Survey, which has aggregated over 100,000 RV measurements for 719 stars. You can download the full catalog from GitHub [here](https://github.com/leerosenthalj/CLSI/blob/master/legacy_tables/legacy_data.csv). The columns that we're interested in are:

1. `name`: The name of each star that was observed (excluding the "HD" prefix)
2. `jd`: The Julian date (in days) when a radial velocity measurement was taken (read more about Julian dates [here](https://aa.usno.navy.mil/data/JulianDate))
3. `mnvel`: The radial velocity measurement (in m/s)
4. `errvel`: The error on the radial velocity measurement (in m/s)

You will also need to access the [NASA Exoplanet Archive](https://exoplanetarchive.ipac.caltech.edu/index.html), an online portal that aggregates information about all confirmed (and candidate!) exoplanets that have been published in the astronomical literature. To find information for your system, just enter its name under **Explore the Archive** and click **Search**. You might see multiple entries appear; usually, these correspond to different papers that were written about your system. When retrieving planetary/stellar parameters, you should try to use the most recent reference that has the parameter you're looking for.

The first California Legacy Survey paper, [Rosenthal+2021](https://ui.adsabs.harvard.edu/abs/2021ApJS..255....8R/abstract), also contains information about your exoplanet system that might be helpful. To find your system, search for the name of its host star in Table 3 (which can be found in Appendix C). 

---

## Analysis tasks

### 1. Retrieve and organize your data

#### 1a. Get RV data

Download the California Legacy Survey database from [GitHub](https://github.com/leerosenthalj/CLSI/blob/master/legacy_tables/legacy_data.csv) and load it into your notebook in a format that's easy to work with. (Astropy's `Table` class is highly recommended. The `Table.read()` function can handle CSVs without any additional tweaking!)

Once you've read in the data, filter the table so that it only contains rows for which `name` is the same as the name of your exoplanet system. (Note that the table does not include the "HD" prefix in any of the star names, so you'll have to leave it off and search with only the string of numbers.) For help filtering tables, look back at the `intro_to_gaia_data` notebook from week 7.

Make a scatterplot with time (`jd`) on the x-axis and RV (`mnvel`) on the y-axis. Briefly describe any trends that you see in the plot. 

#### 1b. Get the mass of your host star and the period of your planet

Visit the [NASA Exoplanet Archive](https://exoplanetarchive.ipac.caltech.edu/index.html) and search for your exoplanet system. Under the **Stellar parameters** header on the results page, find the stellar mass of your system's host star. Under the **Planet parameters** header, find the orbital period and mass (in Jupiter masses, $M_{\text{Jup}}$) of your planet. Copy each of these values and store them in variables near the top of your notebook. Also store the uncertainties on these values, if available. (Though you're going to measure your own mass for the planet, you'll eventually want to compare to the literature value!)

*Note: In principle, the orbital period of the planet can also be estimated from RV data. Since this can be tricky for certain datasets, we're taking a shortcut for this project -- but reach out to me if you'd like to discuss this more!*

#### 1c. Get the eccentricity of your planet's orbit

Find your exoplanet system in Table 3 of [Rosenthal+2021](https://ui.adsabs.harvard.edu/abs/2021ApJS..255....8R/abstract). The column labeled `e` reports the eccentricity of your planet's orbit. (Eccentricity values range from 0 to 1, with 0 meaning "perfectly circular" and values close to 1 meaning "very elliptical.") Copy this value and store it in a variable alongside the information that you retrieved for 1b.

### 2. Convert the RV data to phase space

Your radial velocity (RV) data is recorded as a set of (time, RV) pairs. However, because telescope time is limited, you might notice that the observations of your system are irregularly spaced. This is a common problem, and it means that the full periodic behavior of a system's RVs may not be well-sampled in a single continuous dataset.

To better reveal the periodic signal, we can convert time data into **phase space** by folding all observations into a single cycle of the planet’s orbit. This conversion is done with the following formula:

$\text{phase} = \frac{t  \%  P}{P}$

where $P$ is the orbital period of the planet, $t$ is each point in time that you have an RV for, and $\%$ is the modulus operator (just like in Python). Use this formula and the period of your planet to convert your time data to phase space. Once you've done the conversion, make a scatterplot with phase on the x-axis and RV (`mnvel`) on the y-axis. Briefly describe any trends that you see, and discuss similarities or differences from the plot you made for 1a.

*Note: This transformation only affects the time (x) values of your data -- the RV values should stay the same!*
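The folding formula can be sketched in a few lines with NumPy. The function name `fold_to_phase` and the example times and period below are hypothetical and chosen only so the arithmetic is easy to check by hand.

```python
import numpy as np

def fold_to_phase(times, period):
    """Fold observation times onto a single orbital cycle.

    Returns phases in the interval [0, 1) for non-negative times.
    """
    times = np.asarray(times, dtype=float)
    return (times % period) / period

# Made-up example: a 5-day period and four observation times (in days).
example_times = np.array([0.0, 2.5, 5.0, 12.5])
example_period = 5.0
example_phase = fold_to_phase(example_times, example_period)

print(example_phase)
```

Here the observations at 0 and 5 days land on the same phase, as do those at 2.5 and 12.5 days, which is exactly the "stacking" of cycles that folding is meant to achieve.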

### 3. Measure the semi-amplitude of the RV curve

Define a function that represents a generic sinusoid of the form $y = K \sin\left(\phi x\right)$. Use `scipy.optimize.curve_fit` to fit a sinusoid to your phased data and report the best-fit value for $K$. This is the **semi-amplitude** of your RV curve, which you'll need to estimate your planet's mass. Replot your phased RV data and show the best-fit sinusoid on your plot.
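To show how `curve_fit` is typically called for a model of this form, here is a minimal sketch fitted to synthetic data. The "true" values (30 m/s and $2\pi$), the noise level, the initial guess, and all variable names are assumptions made for illustration, not properties of any real system.

```python
import numpy as np
from scipy.optimize import curve_fit

def sinusoid(x, K, phi):
    """Generic sinusoid y = K * sin(phi * x)."""
    return K * np.sin(phi * x)

# Synthetic phased data: one full cycle with small Gaussian noise (made up).
rng = np.random.default_rng(0)
phase = np.linspace(0.0, 1.0, 50, endpoint=False)
rv_err = np.full_like(phase, 0.5)  # pretend per-point errors, in m/s
rv = sinusoid(phase, 30.0, 2.0 * np.pi) + rng.normal(0.0, 0.5, phase.size)

# curve_fit needs a reasonable starting point for a periodic model.
initial_guess = [25.0, 6.0]
best_fit, covariance = curve_fit(
    sinusoid, phase, rv, p0=initial_guess, sigma=rv_err, absolute_sigma=True
)
K_fit, phi_fit = best_fit
K_unc = np.sqrt(covariance[0, 0])  # 1-sigma uncertainty on K from the fit

print(f"K = {K_fit:.2f} +/- {K_unc:.2f} m/s, phi = {phi_fit:.3f}")
```

Passing `sigma` weights each point by its measurement error. Note that this two-parameter form assumes the curve passes through zero at phase 0, so a poor initial guess or an offset in real data can lead to a poor fit.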

### 4. Compute the minimum mass of the exoplanet

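Using Kepler's laws and conservation of momentum, and assuming that the planet is much less massive than its star ($M_p \ll M_*$), we can relate the mass of the planet to the parameters derived from the RV curve: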

$K = \frac{(2\pi G)^{1/3}}{(1 - e^2)^{1/2}} \frac{M_p \sin i}{M_*^{2/3}} \frac{1}{P^{1/3}}$

where 

- $K$ is the semi-amplitude of the RV curve (measured in step 3), which is related to the maximum velocity of the star as the planet tugs on it
- $G$ is the gravitational constant
- $e$ is the orbital eccentricity (obtained from [Rosenthal+2021](https://ui.adsabs.harvard.edu/abs/2021ApJS..255....8R/abstract))
- $M_p$ is the mass of the planet (to be derived)
- $M_*$ is the mass of the host star (obtained from the NASA Exoplanet Archive)
- $P$ is the orbital period of the planet around the star (also obtained from the NASA Exoplanet Archive)
- $i$ is the inclination of the planet’s orbit, defined so that $i = 90^\circ$ corresponds to an orbit viewed edge-on along our line of sight (it is **unknown** for this project, but could be derived from a transit light curve)

For details of how this is derived, check out [this site](https://sites.astro.caltech.edu/~srk/BlackHoles/Literature/RV_Derivation.pdf), especially the section called **Radial Velocity Semi-Major Amplitude**. 

Since the inclination $i$ is unknown, we can only measure the quantity $M_p\sin i$, meaning that our result is a minimum mass. If the system is perfectly edge-on ($i = 90^\circ$), then $\sin i = 1$, and the minimum mass is equal to the true mass. However, if the system is inclined at a lower angle, the true mass of the planet is larger than the measured value.

Using this formula and the known/measured parameters of your planet, derive the minimum mass $M_p\sin i$. This is a great time to use the `astropy.units` module, since you'll have to make sure that all your quantities have matching units. Report the minimum mass in Jupiter masses ($M_{\text{Jup}}$) and comment on how it compares to the mass measurement that you retrieved from the NASA Exoplanet Archive.

Finally, choose a range of reasonable values for $i$ and create a scatterplot showing how the true mass of the exoplanet changes with $i$. (You should plot $i$ on the x-axis and the mass on the y-axis.)
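Because the measured quantity is $M_p \sin i$, the true mass for an assumed inclination is the minimum mass divided by $\sin i$. A minimal sketch of that calculation, with a hypothetical minimum mass of 0.39 $M_{\text{Jup}}$ and an arbitrary grid of inclinations, might look like this:

```python
import numpy as np

# Hypothetical minimum mass (M_p sin i) in Jupiter masses.
min_mass = 0.39

# Grid of inclinations in degrees; 0 is excluded because sin(0) = 0.
inclinations_deg = np.linspace(10.0, 90.0, 9)

# True mass implied by each inclination: M_p = (M_p sin i) / sin(i).
true_mass = min_mass / np.sin(np.radians(inclinations_deg))

for inc, mass in zip(inclinations_deg, true_mass):
    print(f"i = {inc:5.1f} deg -> M_p = {mass:.2f} M_Jup")
```

The arrays `inclinations_deg` and `true_mass` can then be passed directly to a scatterplot. Note that `np.sin` expects radians, which is why the angles are converted first.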

### 5. Investigate how host star uncertainties affect your mass estimate

Uncertainties in host star properties can significantly affect our understanding of their planets. Vary the value of $M_*$ within the bounds of its uncertainty (which you retrieved in 1b) and compute the resulting range of possible values for $M_p$. Report your final estimate for the mass of your planet as $M_p \pm \sigma$, where the uncertainty $\sigma$ is half of the range of possible $M_p$ values.
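In the formula from step 4, the planet mass scales as $M_*^{2/3}$ when everything else is held fixed, so one way to sketch this step is to rescale a nominal mass estimate to the lower and upper bounds on $M_*$. The helper `scale_min_mass` and all of the numbers below are hypothetical.

```python
def scale_min_mass(min_mass, m_star, m_star_new):
    """Rescale a planet mass estimate to a different assumed stellar mass.

    Uses the M_star^(2/3) dependence in the RV semi-amplitude formula,
    with K, P, and e held fixed.
    """
    return min_mass * (m_star_new / m_star) ** (2.0 / 3.0)

# Made-up nominal values: planet mass in M_Jup, stellar mass in M_sun.
min_mass = 1.0
m_star = 1.0
m_star_err = 0.1

# Planet mass at the lower and upper bounds of the stellar mass.
mass_low = scale_min_mass(min_mass, m_star, m_star - m_star_err)
mass_high = scale_min_mass(min_mass, m_star, m_star + m_star_err)

# Uncertainty defined as half of the full range of possible values.
sigma = 0.5 * (mass_high - mass_low)

print(f"M_p = {min_mass:.2f} +/- {sigma:.2f} M_Jup")
```

Recomputing the full formula at each bound gives the same result; the scaling shortcut simply makes the dependence on $M_*$ explicit. If the literature uncertainties on $M_*$ are asymmetric, the lower and upper bounds would use different error values.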


---

## Reflection

Write a brief (1-2 paragraphs) interpretation of the results you found above. Link it back to your original research question and key concepts from your literature review. (For this project in particular, you might consider thinking about why knowing the masses of exoplanets is interesting.)

Then, write a brief (1-2 paragraphs) reflection on the limitations of your analysis. Are there any caveats or assumptions in your analysis? Could more data or a different method provide more robust results?

---

## Extending your analysis (optional)

Are there additional aspects of the dataset that you’d like to explore? Do you have ideas for refining the methods used in this notebook? Or maybe you’ve noticed an interesting pattern in your results that raises new questions? If you answered yes to any of these questions, I encourage you to extend your analysis! Feel free to reach out to me via email or visit office hours to discuss your ideas. If you're interested in diving deeper but aren’t sure where to start, I’m also happy to brainstorm with you. This is a great opportunity to practice developing your own research questions and exploring a dataset in a way that interests you.

---