# Theoretical modeling

To assess the effectiveness of each filtering method, we fit a theoretical transit model to the light curves in order to determine the following parameters of the exoplanets orbiting the stars involved: $a/R_*$ (where $a$ is the distance from the exoplanet to its host star and $R_*$ is the radius of the star), $R_p/R_*$ (where $R_p$ is the radius of the exoplanet), $b$ (the impact parameter of the exoplanet's transit across the stellar disk) and $P$ (the orbital period of the planet; in the code it is stored in `period`, while the variable `p` holds the radius ratio $R_p/R_*$). The modeling uses the formalism of [Mandel & Agol, *Analytic Light Curves for Planetary Transit Searches* (2002)](https://iopscience.iop.org/article/10.1086/345520/pdf), and all parameters are obtained, with their respective uncertainties, from the analysis of the $\chi^2$ values. Applying this modeling to each light curve with a confirmed exoplanet, before and after each type of filtering, makes it possible to evaluate how effectively the filtering improves data quality without compromising the data.
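To make the role of $a/R_*$, $b$ and $R_p/R_*$ concrete, here is a minimal sketch of the uniform-source case of the Mandel & Agol formalism (no limb darkening) for a circular orbit. The function names are illustrative and are not part of `imt_lightcurve`; the library's own model may include effects that this sketch omits.

```python
import numpy as np

def projected_separation(phase, a_over_rstar, b):
    """Sky-projected star-planet separation z, in stellar radii (circular orbit).

    `phase` is the orbital phase in cycles, with 0 at mid-transit. Only phases
    near 0 are meaningful here: the formula does not distinguish a transit
    from the planet passing behind the star at phase 0.5.
    """
    theta = 2.0 * np.pi * np.asarray(phase, dtype=float)
    return np.sqrt((a_over_rstar * np.sin(theta)) ** 2 + (b * np.cos(theta)) ** 2)

def uniform_transit_flux(z, p):
    """Relative flux of a uniformly bright star occulted by a planet.

    z: separation of the centres in stellar radii; p: radius ratio R_p/R_*.
    """
    z = np.abs(np.atleast_1d(np.asarray(z, dtype=float)))
    blocked = np.zeros_like(z)                      # hidden fraction of the disc

    full = z <= 1.0 - p                             # planet entirely on the disc
    partial = (z > abs(1.0 - p)) & (z < 1.0 + p)    # planet crossing the limb
    covered = z <= p - 1.0                          # only possible when p >= 1

    blocked[full] = p ** 2
    blocked[covered] = 1.0

    zp = z[partial]
    k0 = np.arccos(np.clip((p ** 2 + zp ** 2 - 1.0) / (2.0 * p * zp), -1.0, 1.0))
    k1 = np.arccos(np.clip((1.0 - p ** 2 + zp ** 2) / (2.0 * zp), -1.0, 1.0))
    root = np.sqrt(np.clip((4.0 * zp ** 2 - (1.0 + zp ** 2 - p ** 2) ** 2) / 4.0, 0.0, None))
    blocked[partial] = (p ** 2 * k0 + k1 - root) / np.pi

    return 1.0 - blocked

# Example with the reference values quoted later in this notebook
phase = np.linspace(-0.02, 0.02, 401)
z = projected_separation(phase, a_over_rstar=22.36, b=0.92)
flux = uniform_transit_flux(z, p=0.1483)
```

With these values $b + R_p/R_* > 1$, so in this simplified model the planet never lies fully inside the stellar disk and the transit is grazing: the minimum flux stays above $1 - (R_p/R_*)^2$.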


First, we need to compute the mean eclipse of each light curve, that is, the phase-folded curve. For this we follow the algorithm described in Chapter 12, "Variable Stars and Phase Diagrams", of the Hands-On Astrophysics Manual, available [on this website](https://www.aavso.org/education/vsa).



The next step is to implement the algorithm in Python, apply it to a single light curve, make a few observations, and then apply it to all raw and filtered curves. In this way we build a dataset containing the folded light curve for each combination of parameters of each filtering technique.
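As a rough illustration of phase folding, the sketch below converts each time stamp into an orbital phase (the fractional part of the number of cycles elapsed since a reference epoch), sorts by phase, and averages the flux in phase bins. The function names and the synthetic data are assumptions made for this example; `LightCurve.fold` may differ in its details.

```python
import numpy as np

def phase_fold(time, flux, period, epoch):
    """Fold a light curve on `period`, centring the transit at phase 0."""
    time = np.asarray(time, dtype=float)
    flux = np.asarray(flux, dtype=float)
    # Phase in cycles, wrapped so that the transit sits in the middle of the range
    phase = ((time - epoch) / period + 0.5) % 1.0 - 0.5
    order = np.argsort(phase)
    return phase[order], flux[order]

def bin_folded(phase, flux, n_bins=100):
    """Average the folded flux in equal-width phase bins (the mean eclipse)."""
    edges = np.linspace(-0.5, 0.5, n_bins + 1)
    index = np.clip(np.digitize(phase, edges) - 1, 0, n_bins - 1)
    counts = np.bincount(index, minlength=n_bins)
    sums = np.bincount(index, weights=flux, minlength=n_bins)
    centres = 0.5 * (edges[:-1] + edges[1:])
    filled = counts > 0                      # skip empty bins
    return centres[filled], sums[filled] / counts[filled]

# Synthetic example: a 1% box-shaped dip repeating every 2.5 days
t = np.arange(0.0, 50.0, 0.01)
f = np.ones_like(t)
f[np.abs(((t - 1.0) / 2.5 + 0.5) % 1.0 - 0.5) < 0.02] = 0.99

phase, folded = phase_fold(t, f, period=2.5, epoch=1.0)
centres, mean_flux = bin_folded(phase, folded, n_bins=50)
```

Binning trades phase resolution for noise reduction, so the number of bins should stay small enough that each bin contains several points.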

In [None]:
# !pip install /content/imt_lightcurve-1.2-py3-none-any.whl --force-reinstall

In [2]:
# Importing packages

from imt_lightcurve.models.lightcurve import LightCurve
from imt_lightcurve.visualization.data_viz import line_plot, multi_line_plot

import pandas as pd
import numpy as np

Loading a sample light curve

In [32]:
# Chosen lightcurve
CURVE_ID = '100725706'

# Importing lightcurve data from github
data = pd.read_csv('https://raw.githubusercontent.com/Guilherme-SSB/IC-CoRoT_Kepler/main/resampled_files/' + CURVE_ID + '.csv')
time = data.DATE.to_numpy()
flux = data.WHITEFLUX.to_numpy()

normalized_flux = flux/np.median(flux)

# Create the LightCurve object
curve = LightCurve(time=time, flux=normalized_flux)
curve.plot()

## Creating a *folded light curve* 

In [33]:
folded_curve = curve.fold(CURVE_ID)
folded_curve.plot()

# Simulation algorithm

In [34]:
from imt_lightcurve.simulation.simulation import Simulate

## 1. Simulate a *folded light curve*

In [35]:
SimulationObject = Simulate()

Loading the observed-folded curve

In [36]:
observed_curve_lc = folded_curve

Defining parameters

In [37]:
x_values = [1.5, 1.4, 1.3, 1.2, 1.1, 1.0 , 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0 , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0 , 1.1, 1.2, 1.3, 1.4, 1.5]

In [None]:
CoRoT    ; Run  ; Epoch         ; e_Epoch ; Per      ; e_Per   ; Depth; e_Depth; Dur  ; e_Dur; Rp/R* ; e_Rp/R*; a/R* ; e_a/R*; b   ; e_b ; rho*; e_rho*; rmag ; SpType; Flag; Nature; FU ; _RA        ; _DE
100725706; LRc01; 2454233.624803; 0.000982; 13.240160; 0.000160; 1.514; 0.023  ; 2.542; 0.042; 0.1483; 0.023  ; 22.36; 1.10  ; 0.92; 0.05; 0.85; 0.13  ; 14.76; G2    ; 1   ; P     ; yes; 291.0637200; +00.7461300

In [42]:
ID = int(CURVE_ID)

b_real = 0.92
p_real = 0.1483
period_real = 13.240160
adivR_real = 22.36

### Simulating curve... 

In [43]:
# Use the reference values defined in the previous cell
simulated_curve = SimulationObject.simulate_lightcurve(
    observed_curve=observed_curve_lc,
    b_impact=b_real,
    p=p_real,
    period=period_real,
    adivR=adivR_real,
    x_values=x_values)

simulated_curve.view_simulation_results()
chi2 = simulated_curve.compare_results(see_values=False)
print('\nChi2 =', chi2)

Building the light curve...
Plotting simulation results



Chi2 = 22.45810235235106
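The definition of $\chi^2$ used by `compare_results` is not shown in this notebook. A common choice, sketched here under that caveat, is the sum of squared residuals between the observed flux and the model, scaled by the flux uncertainty. The function names and the toy numbers are illustrative only.

```python
import numpy as np

def chi_square(obs_phase, obs_flux, model_phase, model_flux, sigma=1.0):
    """Chi-square between an observed folded curve and a model curve.

    The model is linearly interpolated onto the observed phases first, so the
    two curves do not need to share the same sampling. `model_phase` must be
    sorted in increasing order (a requirement of np.interp).
    """
    obs_flux = np.asarray(obs_flux, dtype=float)
    model_on_obs = np.interp(obs_phase, model_phase, model_flux)
    return float(np.sum(((obs_flux - model_on_obs) / sigma) ** 2))

def reduced_chi_square(chi2, n_points, n_free_params):
    """Chi-square per degree of freedom."""
    return chi2 / (n_points - n_free_params)

# Toy example: three observed points against a coarser triangular model
chi2_toy = chi_square(
    obs_phase=[-0.01, 0.0, 0.01],
    obs_flux=[1.0, 0.98, 1.0],
    model_phase=[-0.02, 0.0, 0.02],
    model_flux=[1.0, 0.99, 1.0],
    sigma=0.01,
)
```

When `sigma` is left at 1, the value is an unnormalized sum of squares: it can still rank models against each other, but differences in $\chi^2$ no longer map directly onto confidence levels.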


## 2. Simulate a grid of values

In [40]:
SimulationObject = Simulate()

Loading the observed-folded curve

In [41]:
observed_curve_lc = folded_curve

Defining parameters

In [44]:
# Load parameters

# Defining grid of parameters to search
period_values = LightCurve.define_interval_period(period_real)
p_values = LightCurve.define_interval_p(p_real)
adivR_values = LightCurve.define_interval_adivR(adivR_real)
b_values = LightCurve.define_interval_b(b_real)
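The `define_interval_*` helpers belong to `imt_lightcurve`, and their internals are not shown in this notebook. As a rough idea of what a search interval around a reference value can look like, the hypothetical helper below (not the library's implementation) builds evenly spaced trial values and clips them to physical bounds. The half-widths and point counts are arbitrary choices for the example.

```python
import numpy as np

def define_interval(centre, half_width, n_points, lower=None, upper=None):
    """Evenly spaced trial values around `centre`, clipped to [lower, upper]."""
    values = np.linspace(centre - half_width, centre + half_width, n_points)
    if lower is not None or upper is not None:
        values = np.clip(values, lower, upper)
    # Clipping can create duplicates at the bounds; keep each value once
    return np.unique(values)

# Impact parameter around the reference value, never below zero
b_values = define_interval(0.92, 0.05, 11, lower=0.0)

# A small radius ratio whose interval would otherwise extend below zero
p_values = define_interval(0.02, 0.05, 11, lower=0.0)
```

The total number of simulations is the product of the lengths of the four grids, so the point counts should be chosen with the run time in mind.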

### Simulating values ... 

In [51]:
final_table = SimulationObject.simulate_values(
    CoRoT_ID=ID,
    observed_curve=observed_curve_lc, 
    b_values=b_values, 
    p_values=p_values, 
    period_values=period_values, 
    adivR_values=adivR_values, 
    set_best_values=True,
    results_to_csv=False)

Starting simulation...


Simulating: 100%|██████████| 900/900 [00:00<00:00, 904.23it/s]
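Conceptually, a grid simulation of this kind evaluates the model for every combination of trial parameters and keeps the $\chi^2$ of each. The sketch below shows that brute-force loop with a toy box-shaped transit standing in for the real model. `grid_search` and `box_model` are illustrative names, not the `simulate_values` implementation.

```python
import itertools
import numpy as np

def grid_search(phase, observed_flux, model_fn, grids, sigma=1.0):
    """Evaluate chi-square for every combination of parameter values.

    `grids` maps parameter names to sequences of trial values.
    Returns a list of (params, chi2) tuples sorted by increasing chi2.
    """
    names = list(grids)
    results = []
    for values in itertools.product(*(grids[name] for name in names)):
        params = dict(zip(names, values))
        model = model_fn(phase, **params)
        chi2 = float(np.sum(((observed_flux - model) / sigma) ** 2))
        results.append((params, chi2))
    results.sort(key=lambda item: item[1])
    return results

def box_model(phase, depth, half_width):
    """Toy transit: a flat dip of `depth` for |phase| < half_width."""
    return np.where(np.abs(phase) < half_width, 1.0 - depth, 1.0)

# Noise-free "observation" generated from known parameters
phase = np.linspace(-0.1, 0.1, 201)
observed = box_model(phase, depth=0.02, half_width=0.03)

grids = {"depth": [0.01, 0.02, 0.03], "half_width": [0.02, 0.03, 0.04]}
ranked = grid_search(phase, observed, box_model, grids)
best_params, best_chi2 = ranked[0]
```

Here the search recovers the parameters used to generate the data. With real, noisy data several neighbouring grid points typically reach almost the same $\chi^2$, which is what the uncertainty analysis relies on.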


In [None]:
final_table.head()

In [49]:
## Simulate a light curve with the best-fitting values

best_simulated_curve = SimulationObject.simulate_lightcurve(observed_curve=observed_curve_lc, x_values=x_values)
best_simulated_curve.view_simulation_results()
chi2 = best_simulated_curve.compare_results(see_values=False)
print('\nChi2 =', chi2)

Building the light curve...

Using the best b_impact, computed earlier
Using the best p, computed earlier
Using the best period, computed earlier
Using the best adivR, computed earlier



TypeError: ignored

## Uncertainties

In [None]:
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
output_notebook()

def plot_histogram(data, bins=30):
    """Plot a density-normalized histogram of `data`."""
    hist, edges = np.histogram(data, density=True, bins=bins)

    # Note: recent Bokeh releases (3.x) expect width/height
    # instead of plot_width/plot_height
    p = figure(title='Histogram plot',
               plot_width=650, plot_height=400,
               background_fill_color='#fafafa')

    p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:],
           fill_color='navy', line_color='white', alpha=0.5)

    p.y_range.start = 0

    show(p)

def plot_gaussian(data, amplitude, mu, sigma, bins, factor=0.005):
    """Overlay a scaled normal PDF on the density histogram of `data`."""
    hist, edges = np.histogram(data, density=True, bins=bins)

    # Evaluate the PDF slightly beyond the data range so the curve is not cut off
    x = np.linspace(min(data) - factor, max(data) + factor, len(data))

    pdf = amplitude * (1 / (sigma * np.sqrt(2 * np.pi)) * np.exp(-(x - mu)**2 / (2 * sigma**2)))

    p = figure(plot_width=650, plot_height=400,
               background_fill_color='#fafafa',
               x_range=(min(data) - factor, max(data) + factor))

    p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:],
           fill_color='navy', line_color='white', alpha=0.5)

    p.line(x, pdf, line_color='#ff8888', line_width=4, alpha=0.7, legend_label='PDF')

    p.title.text = f'Normal Distribution Approximation (amp = {amplitude}, μ={round(mu, 12)}, σ={round(sigma, 12)})'
    p.y_range.start = 0
    p.legend.location = "center_right"

    show(p)

In [None]:
final_table = pd.read_csv('final_table.csv'); final_table

Unnamed: 0,b_impact,p,period,adivR,chi2
0,0.87,0.08,6.1,54.4,25.899242
1,0.87,0.08,6.2,54.6,25.899242
2,0.87,0.08,6.2,54.5,25.899242
3,0.87,0.08,6.2,54.4,25.899242
4,0.87,0.08,6.2,54.3,25.899242
...,...,...,...,...,...
295,0.80,0.08,6.2,54.7,92.431063
296,0.80,0.08,6.2,54.8,92.431063
297,0.80,0.08,6.2,54.9,92.431063
298,0.80,0.08,6.1,54.5,92.431063


In [None]:
len(final_table.b_impact[final_table.chi2 == final_table['chi2'].min()])

30

In [None]:
parameter = 'b_impact'

parameter_table = final_table[parameter]

In [None]:
tolerance = 0.95

min_error = final_table['chi2'].min()

# Keep every grid point whose chi2 lies within `tolerance` of the minimum
accepted = final_table['chi2'] < (min_error + tolerance)
data = parameter_table[accepted].tolist()
chi2_data = final_table.loc[accepted, 'chi2'].tolist()

print(pd.Series(data).value_counts())

0.87    30
0.86    30
dtype: int64
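The selection above keeps every grid point whose $\chi^2$ lies within a tolerance of the minimum. A minimal sketch of turning such a selection into an interval for one parameter follows. The helper name and the synthetic parabolic $\chi^2$ are assumptions for this example. The threshold `delta` is a choice: $\Delta\chi^2 = 1$ is the conventional 1σ value for a single parameter when $\chi^2$ is properly scaled by the measurement errors, while this notebook uses 0.95.

```python
import numpy as np

def delta_chi2_interval(values, chi2, delta=1.0):
    """Best value and range of a parameter over points with chi2 < min + delta."""
    values = np.asarray(values, dtype=float)
    chi2 = np.asarray(chi2, dtype=float)
    best = values[np.argmin(chi2)]
    accepted = values[chi2 < chi2.min() + delta]
    return best, accepted.min(), accepted.max()

# Synthetic example: a parabolic chi2 with its minimum at b = 0.5 and
# a curvature chosen so that chi2 rises by 1 at a distance of 0.1
b_grid = np.linspace(0.0, 1.0, 101)
chi2_grid = 10.0 + ((b_grid - 0.5) / 0.1) ** 2

best, low, high = delta_chi2_interval(b_grid, chi2_grid, delta=1.0)
print(f"b = {best:.2f} (+{high - best:.2f} / -{best - low:.2f})")
```

The interval can never be resolved more finely than the grid spacing. If only one or two distinct values survive the cut, as in the output above, a finer grid around the minimum would be needed for a meaningful estimate.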


In [None]:
# plot_histogram(data, bins=int(pd.Series(data).nunique()))
plot_gaussian(data, amplitude=1.25, mu=np.mean(data) , sigma=np.std(data), bins=int(pd.Series(data).nunique()))