# Project 7: What affects the rotation speed of stars?

Stellar rotation is a fundamental property of stars that influences their structure, evolution, and magnetic activity. As stars age, their magnetized stellar winds carry away angular momentum in a process called **magnetic braking**, which slows their rotation. This link between rotation and age is the basis of **gyrochronology**, a technique that enables astronomers to estimate a star's age from its rotation period. But other factors can also influence rotation rates: massive stars tend to rotate more quickly than low-mass stars, and stars in binary systems may be "spun-up" by interactions with their companion. This means that the relationship between a star's rotation rate and other parameters might not be as straightforward as gyrochronology implies.

In this project, you'll investigate stellar rotation by analyzing light curves from the Kepler Space Telescope. By measuring the periodic brightness variations caused by starspots moving in and out of view, you'll determine the rotation periods of stars and explore how rotation rates vary with mass and age.

---

## Data

The [Kepler Space Telescope](https://science.nasa.gov/mission/kepler/) was a NASA mission that continuously monitored the brightness of over 150,000 stars in a fixed region of the sky. Kepler’s primary goal was to detect planetary transits: small dips in brightness caused by exoplanets passing in front of their host stars. However, Kepler’s long-term, high-precision observations also provide an excellent dataset for studying stellar variability, including rotation periods.

Kepler **light curves**, which track how the brightness of a star changes over time, are publicly available through the [Mikulski Archive for Space Telescopes (MAST)](https://archive.stsci.edu/). For this project, we'll use a Python package called `lightkurve` (documentation [here](https://lightkurve.github.io/lightkurve/)) to access Kepler data.

We'll also make use of the **Kepler Input Catalog (KIC)**, a database that contains detailed information about the stars observed by Kepler. Specifically, we'll use the `kepler_stellar_17` table from [this online archive](https://archive.stsci.edu/missions/kepler/catalogs/), which contains updated properties for ~200,000 stars from the 25th Kepler data release. The columns we're most interested in are:

- `kepid`: A unique identifier for each star in the Kepler field
- `teff`: The effective temperature of the star, in kelvin
- `logg`: The base-10 logarithm of the gravitational acceleration at the star's surface (in cgs units), also called "surface gravity"
- `st_radius`: The radius of the star in units of solar radii
- `mass`: An approximation of the star’s mass in units of solar mass
- `feh`: A measure of the star's [metallicity](https://icc.dur.ac.uk/~tt/Lectures/Galaxies/TeX/lec/node27.html): the abundance of elements heavier than helium, expressed on a logarithmic scale relative to the solar value (so the Sun has a value of 0)

A full description of the columns in this table can be found [here](https://archive.stsci.edu/search_fields.php?mission=kepler_stellar17).

---

## Analysis tasks

### 1. Obtain and clean the Kepler Input Catalog (KIC) 

Download `kepler_stellar_17.csv.gz` from [this online archive](https://archive.stsci.edu/missions/kepler/catalogs/) and uncompress the file. Load the catalog into your notebook in a format that's easy to work with. (Astropy's `Table` class is highly recommended. The `Table.read()` function can handle CSVs without any additional tweaking!)

Since we're primarily interested in measuring rotation periods, we want to remove objects from the catalog that might show variability for other reasons, like the presence of a binary companion or transiting exoplanet. A list of known binaries in the KIC was published in Kirk+2016. Download their catalog from [this FTP](https://cdsarc.cds.unistra.fr/viz-bin/cat/J/AJ/151/68#/browse) (you'll want `catalog.dat.gz`) and read it into your notebook. The first column contains Kepler IDs for each binary. Add these IDs to a list of objects to remove from the KIC.

A list of Kepler Objects of Interest (KOIs; KIC stars with suspected transiting exoplanets) can be found [here](https://exoplanetarchive.ipac.caltech.edu/cgi-bin/TblView/nph-tblView?app=ExoTbls&config=cumulative) in the NASA Exoplanet Archive. Download the table and read it into your notebook. The first column contains Kepler IDs for each KOI, which you should add to your list of objects to remove from the KIC.

Once your list contains the IDs of all known binaries and KOIs in the KIC, create a new table that excludes all of those objects. Astropy's `Table` class has a variety of methods that might be helpful for this task; check out the documentation [here](https://docs.astropy.org/en/stable/table/modify_table.html) for more information.

Finally, filter the table to remove all rows with invalid (negative) `teff` or `logg` values. For help filtering tables, look back at the `intro_to_gaia_data` notebook from week 7.

### 2. Measure the rotation period of KIC 1026146

Follow each of the steps below to measure the rotation period of a specific star, KIC 1026146. Note that you will reuse this code to measure the periods of other stars in step 3, so try to modularize your code in functions wherever possible!

#### 2a. Obtain and plot the light curve 

Use the provided code to retrieve the light curve for KIC 1026146 through the `lightkurve` Python package. The light curve will consist of three equal-length arrays: times, fluxes (brightness measurements), and errors on the fluxes. Plot the light curve, with time on the x-axis and flux on the y-axis. 

#### 2b. Clean the light curve

You'll notice that the light curve contains a few outliers -- single, isolated points that fall high above the rest. These points are almost certainly due to electronics issues and shouldn't be considered along with the other measurements. To remove them, perform one round of sigma-clipping on the flux values of the light curve: 

1. Compute the mean $\mu$ and standard deviation $\sigma$ of the list
2. Remove all members of the list that fall more than $x\sigma$ away from $\mu$, where $x$ is a user-defined parameter (usually chosen to be 3 or 5)

*Note: Usually sigma-clipping is an iterative process, with steps 1 and 2 repeated until no additional outliers are removed. Since we only want to catch the most deviant outliers here, we'll just perform one round.*

The light curve may also contain NaN values to indicate missing or invalid data points. Remove any values where either the flux or flux error is NaN. For both the NaN values and the outliers, make sure that you're removing the correct entry from all three arrays making up the light curve (the times, fluxes, and flux errors), not just the array you're using to reject data.

<font color='red'>Caution: Remember not to remove points directly from a list you're iterating over! Consider building a new list instead, or using filtering techniques that modify the whole list in place (see section 4.9 of the textbook).</font>

Finally, **normalize** the flux of the light curve by dividing each flux value by the median flux. This makes it easier to compare light curves from different stars by eliminating differences caused by absolute brightness levels. Make a new plot of your cleaned, normalized light curve.
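
One possible way to organize the cleaning steps is a single function that applies the same boolean mask to all three arrays. This is a sketch under a few assumptions: the function name `clean_light_curve` and the default `n_sigma=5` are illustrative choices, NaNs are dropped first so they don't contaminate the mean and standard deviation, and the flux errors are scaled by the same median so they stay consistent with the normalized flux.

```python
import numpy as np

def clean_light_curve(time, flux, flux_err, n_sigma=5.0):
    """Drop NaNs, apply one round of sigma-clipping, and normalize by the median."""
    time = np.asarray(time, dtype=float)
    flux = np.asarray(flux, dtype=float)
    flux_err = np.asarray(flux_err, dtype=float)

    # Remove points where the flux or its error is NaN (same mask for all arrays)
    finite = ~np.isnan(flux) & ~np.isnan(flux_err)
    time, flux, flux_err = time[finite], flux[finite], flux_err[finite]

    # One round of sigma-clipping on the flux values
    mu, sigma = flux.mean(), flux.std()
    inliers = np.abs(flux - mu) <= n_sigma * sigma
    time, flux, flux_err = time[inliers], flux[inliers], flux_err[inliers]

    # Normalize by the median flux (errors are scaled by the same factor)
    median = np.median(flux)
    return time, flux / median, flux_err / median

# Synthetic example: a smooth signal with one outlier and one missing value
t = np.arange(200, dtype=float)
f = 1000.0 + 2.0 * np.sin(2 * np.pi * t / 20.0)
f[50] = 1100.0   # isolated outlier
f[120] = np.nan  # missing measurement
e = np.ones_like(t)

t_c, f_c, e_c = clean_light_curve(t, f, e)
print(len(t), "->", len(t_c), "points")
```

Because the masks build new arrays instead of deleting elements one at a time, this approach avoids modifying a sequence while iterating over it.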

#### 2c. Determine the rotation period of the star with a Lomb-Scargle periodogram

To find the rotation period of a star, you'll use the **Lomb-Scargle periodogram**, a mathematical tool used to detect periodic signals in unevenly spaced data (like Kepler observations). The periodogram will reveal peaks at frequencies corresponding to the most significant periodic signals in the light curve. The inverse of the highest peak's frequency gives us the estimated rotation period of the star, since $f = \frac{1}{P}$.

In Python, we can compute a Lomb-Scargle periodogram using the `LombScargle` class from the `timeseries` subpackage of `astropy` (documentation [here](https://docs.astropy.org/en/latest/timeseries/lombscargle.html)). For measuring the rotation periods of stars, the recommended workflow is:

1. Define a [`LombScargle` object](https://docs.astropy.org/en/latest/api/astropy.timeseries.LombScargle.html) containing the clean time and flux data for your star.
2. Make a rough estimate of the star's rotation period based on the number of days it takes the light curve to complete one full sinusoidal cycle. You can estimate this by eye from a plot of the star's light curve.
3. Define an array of periods to search that covers a range of values around your estimated period, then convert it to an array of frequencies using $f = \frac{1}{P}$.
4. Pass your array of frequencies into the [`power()` method](https://docs.astropy.org/en/latest/api/astropy.timeseries.LombScargle.html#astropy.timeseries.LombScargle.power) of the LombScargle class to create the periodogram. The output will be an array containing the power at each provided frequency.
5. Plot the periodogram, with period (recommended) or frequency on the x-axis and power on the y-axis. Identify the highest peak as the likely rotation period of the star. 

Use the Lomb-Scargle periodogram to determine the most likely rotation period for KIC 1026146. Find the index that corresponds to the highest power and print out the corresponding period.

#### 2d. Test your rotation period by folding the light curve

**Folding** a light curve means plotting the brightness data as a function of phase instead of time, aligning repeated cycles on top of each other. This allows us to verify whether the detected period correctly represents a repeating pattern in the star’s brightness variations. If the light curve folds into a consistent shape, it confirms that we have correctly identified the rotation period.

Convert your time data to **phases** with the following formula:

$\text{phase} = \frac{t  \%  P}{P}$

where $P$ is the rotation period of the star, $t$ is each point in time that you have a flux for, and $\%$ is the modulus operator (just like in Python). Once you've done the conversion, make a scatterplot with phase on the x-axis and flux on the y-axis. Are all the measurements properly aligned to produce a smooth sinusoid? If not, the period you estimated may be wrong -- try modifying the period grid you provided to your periodogram.
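
As a small sketch of the phase formula, the helper below (the name `fold` is just an illustrative choice) converts an array of times to phases between 0 and 1 using NumPy's element-wise modulus.

```python
import numpy as np

def fold(time, period):
    """Return the phase (between 0 and 1) of each time stamp for a given period."""
    time = np.asarray(time, dtype=float)
    return (time % period) / period

# Times spaced by half of a 5-day period alternate between phase 0 and 0.5
t = np.array([0.0, 2.5, 5.0, 7.5, 10.0, 12.5])
phase = fold(t, period=5.0)
print(phase)
```

The resulting `phase` array has the same length and ordering as the time array, so it can be plotted directly against the corresponding flux values.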

### 3. Measure the rotation periods for a sample of other stars 

Repeat the process from step 2 to measure the rotation periods for a sample of at least 10 more KIC stars. Try to pick stars that have a range of `teff` and `logg` values (which you can retrieve from the KIC table that you set up in step 1). Store the KIC IDs, effective temperatures `teff`, surface gravities `logg`, and your estimated rotation periods in an Astropy `Table` (see [here](https://docs.astropy.org/en/stable/table/construct_table.html) for information on how to create your own).

Not all stars will have nice light curves, so you should be sure to examine the data first before trying to measure the rotation period! Look for light curves with smooth, sinusoidal shapes and few outliers. 

### 4. Investigate how rotation period depends on other stellar properties

Create a scatterplot with `teff` on the x-axis, `logg` on the y-axis, and your estimated rotation periods represented as the color of the points. (See the `intro_to_plotting` notebook from week 6 for examples of how to do this.) Comment on any trends you see in rotation period as a function of `teff` and/or `logg`. 

*Note: Plots that show effective temperature on the x-axis and surface gravity on the y-axis are called Kiel diagrams, and can be used to distinguish the evolutionary states of stars. Traditionally, both axes in a Kiel diagram are inverted, so that temperature increases towards the left and surface gravity increases towards the bottom of the plot. Feel free to implement this in your plot!*
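
A color-coded Kiel diagram with inverted axes might be set up as in the sketch below. The four data points are arbitrary placeholders, not real measurements, and the colormap and labels are assumptions that you should adapt to your own table.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Arbitrary placeholder values for illustration only (not real measurements)
teff = np.array([4500.0, 5200.0, 5800.0, 6300.0])
logg = np.array([4.6, 4.5, 4.4, 4.2])
period = np.array([18.0, 25.0, 9.0, 14.0])

fig, ax = plt.subplots()

# Color each point by its rotation period
sc = ax.scatter(teff, logg, c=period, cmap="viridis")
fig.colorbar(sc, ax=ax, label="Rotation period (days)")

ax.set_xlabel("Effective temperature (K)")
ax.set_ylabel("log g (cgs)")

# Kiel diagram convention: both axes inverted
ax.invert_xaxis()
ax.invert_yaxis()
```

Passing the scatter object to `fig.colorbar` ties the colorbar to the period values, so the color scale updates automatically when the data change.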

---

## Reflection

Write a brief (1-2 paragraphs) interpretation of the results you found above. Link it back to your original research question and key concepts from your literature review. (For this project in particular, you might consider thinking about what underlying physics could explain the trends you found in step 4.)

Then, write a brief (1-2 paragraphs) reflection on the limitations of your analysis. Are there any caveats or assumptions in your analysis? Could more data or a different method provide more robust results?

---

## Extending your analysis (optional)

Are there additional aspects of the dataset that you’d like to explore? Do you have ideas for refining the methods used in this notebook? Or maybe you’ve noticed an interesting pattern in your results that raises new questions? If you answered yes to any of these questions, I encourage you to extend your analysis! Feel free to reach out to me via email or visit office hours to discuss your ideas. If you're interested in diving deeper but aren’t sure where to start, I’m also happy to brainstorm with you. This is a great opportunity to practice developing your own research questions and exploring a dataset in a way that interests you.

---

In [None]:
import numpy as np
import lightkurve as lk

# Use the lightkurve package to download the light curve.
# This will retrieve ALL of the Kepler light curves for this object: one for each observation period ("quarter").
# The "KIC" prefix makes the target name unambiguous for the search.
kic_id = "1026146"
lcs = lk.search_lightcurve(f"KIC {kic_id}", mission='Kepler').download_all()

#Separating the lightcurve into three arrays for easy access
#Note that we're selecting the first light curve with lcs[0]
time = lcs[0].time.value
flux = np.array(lcs[0].flux.value.data)
flux_err = np.array(lcs[0].flux_err.value.data)