**Tutorial 3a - Measuring the Properties of an Eclipsing Extrasolar Planet**

In this tutorial you will measure the parameters of the light curve of an eclipse of a star by a planet.  We will do this using Bayesian statistical inference.

We will first find the best-fit model parameters, which in this case are the maximum-likelihood values.  We will then find the conditional posteriors for three of the parameters.  Finally, we will find the two-dimensional posterior marginalized over a third parameter.

The data consist of measured fluxes of a star at different times.  As the planet passes in front of the star there is a dip in the flux.  The important quantities to measure are: 1) the baseline flux before and after the eclipse; 2) the depth of the dip, which is related to how large the planet is compared to the star; 3) the duration of the eclipse, which is related to the orbital speed and the sizes of the star and planet; 4) the steepness of the transitions into and out of the eclipse, which is related to the limb darkening of the star combined with the size and speed of the planet.  These are the parameters that will be found from the light curve and later interpreted in terms of physical quantities.


1) First you must import `numpy`, `pandas`, `matplotlib.pyplot` and `scipy.optimize`.

2) Now import the data file "eclipsing_planet_data.csv" into a dataframe.  This file has the measured fluxes, the time of each observation and the estimated errors in each observation.  Put these into separate vectors `t`, `f` and `sigma`.

3) Plot the flux as a function of time.  Time is in minutes and flux is in arbitrary relative units.

4) We must build a model for this light curve.  We will use the model :
    
$ f(t) = f_o - \delta F \exp\left[- \frac{|t - t_o|^n}{(\Delta t)^n}  \right]  $ 

The parameters are the baseline flux ($f_o$), the depth of the dip ($\delta F$), the time of the eclipse ($t_o$), the width of the eclipse ($\Delta t$) and the parameter $n$ which controls the speed of the drop into the eclipse.
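
To get a feel for how $n$ shapes the dip, the model can be evaluated at a few special times. The sketch below is only an illustration: the helper name `model_flux` and all the numerical values are assumptions, not fitted results. At $t = t_o$ the flux is $f_o - \delta F$, and at $|t - t_o| = \Delta t$ the dip has shrunk to $\delta F / e$ whatever the value of $n$. A larger $n$ makes the recovery beyond that point sharper.

```python
import numpy as np

def model_flux(t, f_o, delta_F, delta_t, t_o, n):
    """Hypothetical helper: f(t) = f_o - delta_F * exp(-|t - t_o|^n / delta_t^n)."""
    t = np.asarray(t, dtype=float)
    return f_o - delta_F * np.exp(-(np.abs(t - t_o) / delta_t) ** n)

# Illustrative values only (assumed, not taken from the data file)
f_o, delta_F, delta_t, t_o = 1.0, 0.15, 50.0, 275.0

# centre of the eclipse, one width away, two widths away
times = np.array([t_o, t_o + delta_t, t_o + 2.0 * delta_t])

flux_n2 = model_flux(times, f_o, delta_F, delta_t, t_o, n=2)
flux_n6 = model_flux(times, f_o, delta_F, delta_t, t_o, n=6)

print('n = 2 :', flux_n2)
print('n = 6 :', flux_n6)
```

Two widths away from the centre the $n = 6$ curve is already back at the baseline, while the $n = 2$ curve is still slightly depressed.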

Write a function that takes the time $t$ first, followed by the other parameters, and returns the predicted flux.

In [None]:
def light_curve(t,Fo,dF,dt,to,n) :
 

5) We will assume the noise is Gaussian distributed.  We need to make a $\chi^2$ function that we will minimize to find the best fit parameters.

$\chi^2(p) = \sum_i \frac{(f_i - L(t_i,p) )^2}{\sigma_i^2}$

where $L(t_i,p)$ is the model for the light curve.  Make a function that returns the $\chi^2$ and takes the parameters as a vector $p$.  These are the same parameters ($f_o$, $\delta F$, $\Delta t$, $t_o$, $n$), but they need to be encoded in a vector to work with the optimizer.  Since `t`, `f` and `sigma` are in global memory they can be used inside this function and don't need to be arguments.

Write this $\chi^2(p)$ function.
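
The structure of a $\chi^2$ function can be checked on a toy problem where the answer is known. The sketch below deliberately uses a straight line instead of the light curve model, and it passes the data in as arguments rather than using globals. The names `chi_squared`, `straight_line` and the `_demo` arrays are all assumptions made for this illustration. Every toy data point is placed exactly one $\sigma$ above the model, so each point contributes 1 to the sum.

```python
import numpy as np

def chi_squared(p, t, f, sigma, model):
    """Sum over data points of (data - model)^2 / sigma^2."""
    residuals = (f - model(t, *p)) / sigma
    return np.sum(residuals ** 2)

def straight_line(t, a, b):
    """Toy model used only for this check."""
    return a + b * t

t_demo = np.linspace(0.0, 10.0, 11)
sigma_demo = np.full_like(t_demo, 0.5)

# every toy point sits exactly one sigma above the true line
f_demo = straight_line(t_demo, 1.0, 2.0) + sigma_demo

chi2_true = chi_squared([1.0, 2.0], t_demo, f_demo, sigma_demo, straight_line)
print('chi2 at the true parameters :', chi2_true)
print('number of data points       :', len(t_demo))
```

For real data with correctly estimated errors, the minimum $\chi^2$ is expected to be of the order of the number of data points, which is a useful sanity check on a fit.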

In [None]:
def chi2(p):
    ...

6) Now we can find the set of parameters that minimises $\chi^2(p)$.  To do this we will use `scipy.optimize.minimize` with `method = 'Nelder-Mead'`.  You will need to make an initial guess for the parameters; look at the plot of the data to estimate them.  Check the output of the optimizer.  If it has not converged properly, your guess was not good enough (or there is some other problem) and you should try again.  Save the results in `best_fit`. 

Print out the best fit parameters with labels.
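
As a reminder of how the optimizer is called and how its output can be checked, here is a minimal sketch on a toy function with a known minimum at $(3, -1)$. The names `toy_objective`, `toy_guess` and `toy_fit` are assumptions made for this illustration. The same pattern should apply to the $\chi^2$ function.

```python
import numpy as np
from scipy.optimize import minimize

def toy_objective(p):
    """Toy function of a parameter vector with its minimum at (3, -1)."""
    return (p[0] - 3.0) ** 2 + 10.0 * (p[1] + 1.0) ** 2

toy_guess = np.array([0.0, 0.0])
toy_fit = minimize(toy_objective, toy_guess, method='Nelder-Mead')

# the returned object reports whether the search converged
print('converged  :', toy_fit.success)
print('message    :', toy_fit.message)
print('minimum at :', toy_fit.x)
print('value      :', toy_fit.fun)
```

If `success` is `False`, the `message` field usually says why, for example that the maximum number of iterations was reached.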

In [None]:
guess = ...
best_fit = ...

print('best fit model :')
print('    Fo = ',best_fit.x[0],'flux')
print('    dF = ',best_fit.x[1],' flux')
print('    dt  = ',best_fit.x[2],' mins')
print('    to = ',best_fit.x[3],'  mins')
print('    n = ',best_fit.x[4])

7) Make a new vector of times that is evenly spaced.  Use `light_curve()` to make a model lightcurve with the best-fit parameters.  Plot this and the data together.  Make sure it looks like a reasonable fit.

8) Now we want to find the *conditional* posterior distribution for the central time $t_o$ with a uniform prior distribution.  

Set all the other parameters to the best-fit values you have already found.  Loop through a range of possible values of $t_o$.  At each value calculate the $\chi^2$ and store it in an array.  Because the noise is Gaussian distributed, the log of the likelihood is

$\ln L (t_o,F_o,\delta F,\Delta t,n) = - \frac{1}{2} \chi^2(t_o,F_o,\delta F,\Delta t,n) + const.$

Use `np.trapz(L,x=tos)` to normalize the likelihood and plot it as a function of $t_o$.  Make sure you have used enough $t_o$s and a wide enough range that the integral is accurately calculated. 
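
One practical point: when $\chi^2$ is large, $\exp(-\chi^2/2)$ can underflow to zero in floating point. Because the constant in $\ln L$ is arbitrary, the maximum of $\ln L$ can be subtracted before exponentiating. The sketch below shows this on a toy $\chi^2$ curve. The offset of 5000, the width of 0.8 and the variable names are assumptions for illustration only. In NumPy 2.0 and later `np.trapz` is named `np.trapezoid`, so the sketch picks whichever one is available.

```python
import numpy as np

# use np.trapezoid where it exists, otherwise fall back to np.trapz
trapezoid = np.trapezoid if hasattr(np, "trapezoid") else np.trapz

x = np.arange(270.0, 280.0, 0.05)

# toy chi^2 curve: a parabola sitting on a large constant (assumed values)
chi2_toy = 5000.0 + ((x - 275.0) / 0.8) ** 2
lnL_toy = -0.5 * chi2_toy

# exponentiating directly underflows to zero everywhere
L_naive = np.exp(lnL_toy)

# subtracting the maximum only changes the arbitrary constant
L_toy = np.exp(lnL_toy - lnL_toy.max())
L_toy = L_toy / trapezoid(L_toy, x=x)

print('largest naive value  :', L_naive.max())
print('integral after norm. :', trapezoid(L_toy, x=x))
```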

In [None]:
p = best_fit.x.copy()   # copy, so that the loop does not overwrite the stored best fit
d = 0.25

tos = np.arange(270,280,d)
lnL = np.zeros(len(tos))
i=0
for tt in tos :
    p[3] = tt
    lnL[i] = ...
    i += 1

# normalize the likelihood to 1 using numpy.trapz() to do the integral 
....

plt.plot(tos,L)
plt.xlabel(r'$t_o$')
plt.ylabel(r'$P(t_o | \hat{F}_o,\hat{\delta F},\hat{\Delta t},\hat{n})$')
plt.show()

9) Calculate the mean and standard deviation of $P(t_o | \hat{F}_o,\hat{\delta F},\hat{\Delta t},\hat{n})$.  You should do this by approximating an integral over the pdf as a sum over the tabulated values of the pdf that you found in 8).
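
The sum approximation can be tested on a tabulated pdf whose moments are known. The sketch below uses a toy Gaussian with mean 275 and standard deviation 0.8 on a uniform grid. These numbers and the variable names are assumptions for illustration, not results from the data. On a uniform grid with spacing `d`, each integral becomes a sum multiplied by `d`.

```python
import numpy as np

d = 0.05
x = np.arange(270.0, 280.0, d)

# toy tabulated pdf (assumed Gaussian), normalized with the same sum rule
pdf = np.exp(-0.5 * ((x - 275.0) / 0.8) ** 2)
pdf = pdf / (np.sum(pdf) * d)

# mean      = integral of x p(x) dx
# variance  = integral of (x - mean)^2 p(x) dx
mean_x = np.sum(x * pdf) * d
var_x = np.sum((x - mean_x) ** 2 * pdf) * d

print('x = ', mean_x, ' +/- ', np.sqrt(var_x))
```

If the grid is too coarse or too narrow to contain the whole pdf, the recovered standard deviation will be biased, which is a quick way to check the grid.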

In [None]:
.
.
.

print('to = ',ave_to,' +/- ',np.sqrt(var_to))

10) Do the same for $P(\delta F | \hat{\Delta t}, \hat{F}_o,\hat{t_o},\hat{n})$

In [None]:

p = best_fit.x.copy()   # copy, so that the loop does not overwrite the stored best fit
d = 0.001
dFs = np.arange(0.12,0.18,d)

lnL = np.zeros(len(dFs))
i=0
for dF in dFs :
  ...

# normalize the likelihood to 1 using numpy.trapz() to do the integral 

.
.
.

plt.plot(dFs,L)
plt.xlabel(r'$ \delta F$')
plt.ylabel(r'$P(\delta F | \hat{\Delta t}, \hat{F}_o,\hat{t_o},\hat{n})$')
plt.show()

.
.
.

print('dF = ',ave_dF,' +/- ',np.sqrt(var_dF))


11) Do the same for $P(\Delta t | \hat{F}_o,\hat{\delta F},\hat{t_o},\hat{n})$.

In [None]:
p = best_fit.x.copy()   # copy, so that the loop does not overwrite the stored best fit
d = 0.1
dts = np.arange(45,55,d)
.
.
.

# normalize the likelihood to 1 using numpy.trapz() to do the integral 
.
.
.

plt.xlabel(r'$\Delta t$')
plt.ylabel(r'$P(\Delta t | \hat{F}_o,\hat{\delta F},\hat{t_o},\hat{n})$')
plt.show()

.
.
.

print('Delta t = ',ave_dt,' +/- ',np.sqrt(var_dt))

12) Now we want to find the posterior *marginalized over* $t_o$.  This involves integrating over $t_o$ for each value of $\delta F$ and $\Delta t$ :

$P(\delta F,\Delta t | \hat{F}_o,\hat{n}) = \int dt_o~ P(\delta F,\Delta t, t_o | \hat{F}_o,\hat{n})$

We will keep $F_o$ and $n$ fixed to their best fit values $\hat{F}_o$ and $\hat{n}$ to make things simpler.

Make two nested loops over $\delta F$ and $\Delta t$, within which the likelihood is integrated over $t_o$.  To do this we must define a function that can be integrated by `scipy.integrate.quad`.  The first argument of this function is the variable of integration and the second is the vector of parameters.

These likelihoods should be stored in a two-dimensional array `L[i,j]`.  Then make a contour plot of $L$.  This calculation might take a while.  If it takes too long you can reduce the resolution in $\delta F$ and $\Delta t$.
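
The pattern of integrating out one parameter with `quad` can be tried first on a toy likelihood with two parameters, where the marginal is known analytically. In the sketch below the functions `toy_likelihood` and `toy_integrand`, the widths 0.2 and 0.1 and the integration limits are all assumptions for illustration. Note that `quad` returns a pair (value, estimated error) and that extra arguments are passed through `args`.

```python
import numpy as np
from scipy.integrate import quad

def toy_likelihood(p):
    """Toy likelihood of two parameters a and b (assumed Gaussian form)."""
    a, b = p
    return np.exp(-0.5 * (((a - 1.0) / 0.2) ** 2 + ((b - a) / 0.1) ** 2))

def toy_integrand(b, p):
    """First argument is the integration variable, second the parameter vector."""
    q = np.array(p, dtype=float)   # work on a copy so the caller's p is unchanged
    q[1] = b
    return toy_likelihood(q)

a_values = np.linspace(0.6, 1.4, 9)
marginal = np.zeros(len(a_values))

for i, a in enumerate(a_values):
    p = np.array([a, 0.0])
    # integrate over b; the limits must be wide enough to contain the peak
    marginal[i], abs_err = quad(toy_integrand, 0.0, 2.0, args=(p,))

# analytic result of integrating the Gaussian in b
expected = np.sqrt(2.0 * np.pi) * 0.1 * np.exp(-0.5 * ((a_values - 1.0) / 0.2) ** 2)
print(np.max(np.abs(marginal - expected)))
```

If the integration limits are much wider than the peak, `quad` can miss it, so it is worth choosing the limits using the conditional posterior found earlier.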

In [None]:
from scipy.integrate import quad

def integrand3(t_o,p):
    p[3] = t_o
    return np.exp(-0.5*chi2(p))



p = ...
L = ...
j= ...
for dF in dFs :
    i= ...
    for dt in dts :
        p[1] = ..
        p[2] = ..
        L[i,j] = ..
        i += ..
    j += ..
    
plt.contour(dFs,dts,L)
plt.xlabel(r'$\delta F$')
plt.ylabel(r'$\Delta t$')
plt.show()

So far we have calculated the un-normalized posterior.  We need to find the contours in parameter space that contain 68%, 95% and 99% of the posterior probability.

Here is a function which, given a tabulated distribution `probability_table` and a fraction, returns the contour level within which that fraction of the distribution is contained.  For example, the contour level within which half of the probability is contained would be `find_level(probability_table,0.5)[0]`.  The function assumes that `probability_table` is evaluated on a grid that is uniform in prior probability.

In [None]:
def find_level(prob,fraction) :
    '''
    this function finds the level for a 
    contour that contains a fixed fraction  
    of the total sum of pixels

    returns the level and the fraction actually enclosed
    '''
    tot = np.sum(prob)
    # hi and lo bracket the level (avoids shadowing the built-in max and min)
    hi = np.max(prob)
    lo = np.min(prob)

    level = 0.5*(hi + lo)
    frac = np.sum( prob[ prob >= level ]  )/tot
    res = np.min( prob[ prob >= level ]  )/tot

    # bisect on the level until the enclosed fraction matches to within one pixel
    while( abs(frac - fraction) > res  ) :
        
        if( frac > fraction) :
            lo = level
        else :
            hi = level
            
        level = 0.5*(hi + lo)
            
        frac = np.sum( prob[ prob >= level ] )/tot
        res = np.min( prob[ prob >= level ]  )/tot

    return level,frac


13) Make a credibility plot for $\delta F$-$\Delta t$ with 68%, 95% and 99% contours.  The function above can be used to find the levels.
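
As an independent cross-check of the levels, a different technique can be used: sort the pixel values from highest to lowest and accumulate them until the requested fraction is reached. The sketch below is not the bisection used by `find_level`. The helper name `level_by_sorting` and the toy distribution are assumptions for illustration. For a two-dimensional unit Gaussian, the level enclosing a fraction $q$ is $1 - q$ times the peak value, which gives a known answer to compare with.

```python
import numpy as np

def level_by_sorting(prob, fraction):
    """Hypothetical cross-check: the value of the last pixel needed, taking
    pixels from highest to lowest, to reach `fraction` of the total sum."""
    flat = np.sort(np.ravel(prob))[::-1]
    cumulative = np.cumsum(flat) / np.sum(flat)
    index = min(np.searchsorted(cumulative, fraction), len(flat) - 1)
    return flat[index]

# toy distribution: unit Gaussian on a uniform grid, peak value 1
x = np.linspace(-5.0, 5.0, 201)
X, Y = np.meshgrid(x, x)
toy = np.exp(-0.5 * (X ** 2 + Y ** 2))

# ordered 99%, 95%, 68% so that the levels come out increasing
levs_toy = [level_by_sorting(toy, q) for q in (0.99, 0.95, 0.68)]
print(levs_toy)
```

The ordering matters because `contour` requires the levels to be increasing, and the contour that encloses the most probability has the lowest level.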

In [None]:
.
.
.

print(levs)

fig, ax = plt.subplots()
CS = ax.contour(dFs,dts,L,levs)
## the stupidly complicated way contours are labeled 
fmt = {}
strs = [ '99%', '95%', '68%']
for i,s in zip( CS.levels, strs ):
    fmt[i] = s
ax.clabel(CS, inline=True, fontsize=10,fmt=fmt)

plt.xlabel(r'$\delta F$')
plt.ylabel(r'$\Delta t$')
plt.title(r'$P(\delta F,\Delta t | \hat{F}_o,\hat{n})$')
plt.show()