## Fitting real space telescope data

Okay!  So we've spent this entire week modifying and improving the way we make our `square_dip` function.  Now we get to use it to fit real Kepler Space Telescope data!

In [None]:
import numpy as np # math package
import matplotlib.pyplot as plt # plotting package
import os # talks to your computer
import pandas as pd # works with tables of data

For this final round of modifying our `square_dip` function, we can remove the `seed=88` optional input parameter because we'll be working with real data this time.  Also, we've commented out the line where we randomly set the starting value for the first dip and INSTEAD we've added a `start` input parameter to the function.

This will help us fine-tune where the first dip in our real data starts -- which is really important because that will set us up to fit the rest of the dips properly.

In [None]:
def square_dip(t,array,mass,depth,period,start):
    '''
    t: The array for our x values (representing time)
    array: Our y-values! This allows us to have more
           than one planet :)
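    mass: How large we want our planet to be, counted in data points!
          (Make this a whole number; between 4 and 12 is a good start)
    depth: The brightness value inside each dip.
           (For the real data below, this is a small negative number
           like -0.0016)
    period: How many data points from the start of one dip to the next?
            (Make this a whole number)
    start: The index where the first dip starts! (Also a whole number)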
    '''    
    
    transit_expected = array.copy()
    #start = round(np.random.uniform(1,20)) # instead of randomly choosing, we'll be specifying this
    stop = start + mass
    while stop < len(t):
        transit_expected[start:stop] = depth
        start += period
        stop = start + mass
    return transit_expected
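
Before moving on to real data, it can help to see what `start`, `period`, and `mass` do on a tiny made-up example.  The sketch below repeats the function so it runs on its own; the names `toy_time`, `toy_star`, and `toy_planet` and all of the numbers are invented just for illustration.

```python
import numpy as np

def square_dip(t, array, mass, depth, period, start):
    # same logic as the function above, repeated so this cell runs on its own
    transit_expected = array.copy()
    stop = start + mass
    while stop < len(t):
        transit_expected[start:stop] = depth
        start += period
        stop = start + mass
    return transit_expected

toy_time = np.arange(20)             # 20 made-up time steps
toy_star = np.zeros(len(toy_time))   # a flat star: no change in brightness

# dips that are 2 points wide, repeat every 6 points, and begin at index 3
toy_planet = square_dip(toy_time, toy_star, 2, -1.0, 6, 3)

# which indexes ended up inside a dip?
dip_indexes = np.where(toy_planet < 0)[0]
print(dip_indexes)
```

Notice that all three of `mass`, `period`, and `start` are counted in indexes (data points), not in days.  Also notice that `toy_star` itself is left untouched, because the function works on a copy.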


Okay!  To upload our Kepler data, follow these steps: 

1. Find the file called `KIC_7529266.tbl` in the Day 5 folder and download that file. 
1. Click on the little folder icon on the left side of this notebook.
1. On the panel that pops out, notice that there are three icons underneath the header "**Files**".  Click on the left icon that looks like a page with an arrow pointing up.
1. Navigate the file browser that pops up to where you downloaded the `KIC_7529266.tbl` file and double click on it to upload!

Once that's done, and you can see the `KIC_7529266.tbl` in the left panel, we can run the code below.

In [None]:
Kepler435b = pd.read_csv("KIC_7529266.tbl", skiprows=3, delimiter=r'\s+', names=['time','lightcurve'])
Kepler435b  # shows us what this table looks like
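
If you're curious what `skiprows`, `delimiter`, and `names` are doing in that `pd.read_csv` call, here is a small sketch that reads a made-up table from a string instead of a file.  The three header lines and the numbers in `fake_tbl` are invented for this example and are only assumed to resemble the layout of the real `.tbl` file.

```python
import io
import pandas as pd

# a made-up, tiny stand-in for a .tbl file: 3 header lines, then two columns
fake_tbl = io.StringIO(
    "# made-up header line 1\n"
    "# made-up header line 2\n"
    "# made-up header line 3\n"
    "131.51    0.0001\n"
    "131.53   -0.0016\n"
    "131.55   -0.0015\n"
)

toy_table = pd.read_csv(
    fake_tbl,
    skiprows=3,                    # skip the 3 header lines at the top
    delimiter=r'\s+',              # columns are separated by one or more spaces
    names=['time', 'lightcurve'],  # the column names we want to use
)
print(toy_table)
```

The `r` in front of `'\s+'` tells Python to leave the backslash alone, so pandas receives the pattern exactly as typed.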

In [None]:
time = Kepler435b.time        # writing column to a variable
data = Kepler435b.lightcurve  # writing column to a variable

In [None]:
plt.figure(figsize=(20,5))
plt.scatter(time,data) # telling it to plot time on the x-axis and data on the y-axis

# we're going to fit these dips using our "square_dip" function
star = np.zeros(len(time))

# play with the numerical inputs to the "square_dip" function to match the Kepler data!
planet = square_dip(time,star,8,-0.0016,207,45)
plt.plot(time,planet,color='r')

plt.xlabel('Time [days]') # add a name for the x-axis
plt.ylabel('Change in brightness') # add a name for the y-axis

# this zooms in on a smaller range of x-values (so we see fewer periods);
# comment this line out to see the full data set
plt.xlim(130,165) 

plt.show()

Play with modifying the 4 numerical inputs to our current `square_dip` function (mass, depth, period, and starting index) until the red fitting curve fits the real Kepler data really well!
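
Judging the fit by eye works well, but if you'd like a number to go with it, one common choice is to add up the squared differences between the data and the red curve.  The sketch below tries this on made-up data; the helper name `misfit` and the arrays `fake_time` and `fake_data` are invented for this example and are not part of the notebook's own method.

```python
import numpy as np

def square_dip(t, array, mass, depth, period, start):
    # same logic as the function above, repeated so this cell runs on its own
    transit_expected = array.copy()
    stop = start + mass
    while stop < len(t):
        transit_expected[start:stop] = depth
        start += period
        stop = start + mass
    return transit_expected

# made-up "observations": a flat star with dips every 50 points, plus a little noise
rng = np.random.default_rng(0)
fake_time = np.arange(300)
flat_star = np.zeros(len(fake_time))
fake_data = square_dip(fake_time, flat_star, 8, -0.0016, 50, 20)
fake_data = fake_data + rng.normal(0, 0.0002, size=len(fake_time))

def misfit(data, model):
    # sum of squared differences (ignoring any NaN points);
    # a smaller number means the curve sits closer to the data
    return np.nansum((data - model) ** 2)

good_guess = square_dip(fake_time, flat_star, 8, -0.0016, 50, 20)
bad_guess = square_dip(fake_time, flat_star, 8, -0.0016, 50, 35)  # wrong start

print("good guess:", misfit(fake_data, good_guess))
print("bad guess: ", misfit(fake_data, bad_guess))
```

To try this on the real data, you could call `misfit(data, planet)` after each change to the inputs and watch whether the number goes down.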

------------------
  
  

Advanced
--------------

The code below walks you through a more advanced way to access and play with Kepler Space Telescope data.  As part of this, we'll be downloading a FITS file from the Space Telescope Science Institute (STScI) archive for Kepler data.

In [None]:
from astropy.io import fits
from astropy.table import Table
from astropy.utils.data import download_file

In [None]:
# here we're downloading a FITS data file from the Space Telescope Science Institute's archive for Kepler
Kepler435b_fits = download_file('http://archive.stsci.edu/pub/kepler/lightcurves/0075/007529266/kplr007529266-2011053090032_slc.fits', cache=True)

In [None]:
hdu_list = fits.open(Kepler435b_fits, memmap=True)
hdu_list.info() # shows us what the FITS file contains

Great -- so it looks like the data we want is in the second layer (index 1, because Python starts counting at 0), labeled "LIGHTCURVE".  Let's extract that layer of this FITS file so we can plot the lightcurve!

In [None]:
# converting that slice to a machine-readable table of data
Kepler435b_data = Table(hdu_list[1].data)
Kepler435b_data # looking at the table!

Great -- it all looks good.  Now we can write out the columns we want from this table to variables (to make it easier to plot things cleanly).  ALSO, we're going to add in the `square_dip` fitting as well!

In [None]:
# writing the things we need to variables
time = Kepler435b_data['TIME']
data = Kepler435b_data['PDCSAP_FLUX'] - np.nanmean(Kepler435b_data['PDCSAP_FLUX'])
errors = Kepler435b_data['PDCSAP_FLUX_ERR']

# plotting the Kepler data!
plt.figure(figsize=(20,4.5))
plt.errorbar(time,data,errors)

# adding in our fit!
# we have to make a new time variable so we can cover the empty space where Kepler didn't take data
fit_time = np.linspace(time[0],time[-1],len(time)+1000)
star = np.zeros(len(fit_time))

# play with the numerical inputs to the "square_dip" function to match the Kepler data!
planet = square_dip(fit_time,star,300,-400,10810,570)
plt.plot(fit_time,planet,color='r',zorder=3)


plt.ylim(-500,500)

plt.show()

Play with the same numerical inputs to the `square_dip` function until it fits well.  Notice that for fitting real Kepler data, our input numbers get very large -- for example, using our code we have to set the "mass" of this exoplanet to be over 300!  

HOWEVER, this doesn't actually represent the mass of the planet.  This is where our `square_dip` code could use more refinement, because the way we've constructed it uses the *number of indexes* inside each dip as the "mass" of the planet, so it really sets how long each dip lasts.  This is why that number can get so large, because there are so many data points in this Kepler data!  

The Kepler space telescope looked at this star and exoplanet system a LOT in a short amount of time.  This is why the dips in this Kepler data have >300 points in them, and so our "mass" input in our `square_dip` function has to be larger than 300.  The same applies to the other input values!
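
One way to make these large numbers easier to think about is to convert between days and data points using the typical spacing between time stamps.  The sketch below does this on a made-up time array; the helper `days_to_points`, the 0.02-day spacing, and the 0.16-day and 4-day values are all invented for illustration and are not measurements of this exoplanet.

```python
import numpy as np

# made-up, evenly spaced time stamps in days (a stand-in for a real time array)
fake_time = np.arange(0, 10, 0.02)   # one point every 0.02 days

# typical spacing between neighboring points, in days (ignoring any NaNs)
cadence = np.nanmedian(np.diff(fake_time))

def days_to_points(days, cadence):
    # how many data points fit into this many days, rounded to a whole number
    return int(round(days / cadence))

duration_points = days_to_points(0.16, cadence)  # a dip lasting 0.16 days
period_points = days_to_points(4.0, cadence)     # dips repeating every 4 days
print(duration_points, period_points)
```

If you try this with real data, measure the spacing on the same time array that you hand to `square_dip` (for the advanced example that would be `fit_time`), since that is the array whose indexes the function counts.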

### WELL DONE!  You've successfully fit two different sets of Kepler transit data!!